Could a computer fitted with AI software take an IQ test?

And what would happen? What would it show? Clearly, AI is already being used in many applications. Is it possible to measure the strength of that intelligence using IQ tests? How do these two concepts, intelligence testing and artificial intelligence, go together?

IQ tests require too much general-purpose real world knowledge for any existing AI to perform well. There’s an article in the March 2017 Scientific American about improved versions of the Turing Test to distinguish AIs from humans. It mentions that experiments have been done where AIs have been given some of the same standardized written tests given to elementary school students. A system called Aristo can score about 75 percent (not a passing grade) on fourth-grade science tests, but it only does that well on multiple-choice questions without diagrams. The Aristo researchers say “no system to date comes even close to passing a full 4th grade science exam”. AIs would perform even more poorly on general IQ (not science) tests.

I’d suggest it would show an impressive advance in natural language and visual input processing.

IBM Watson is probably the closest system to being able to handle the textual elements of an IQ test. I think it would do fairly well. But it would completely fail at the visual test elements of a standard IQ test, or any non-textual reasoning.

Artificial Intelligence covers a wide range of computer technologies, and most don’t match with human expectations of intelligence. Real-world AI applications are narrowly focused to a single problem space with controlled input management (and can be very impressive in that space) but there are no general solutions in the way that human brains are a general problem solving engine.

So the answer is no - Artificial Intelligence cannot currently pass (or even take) an intelligence test.
Sometimes the AI application in a specific problem space can be compared to human performance in that space, but it is a very limited comparison.

Depends on what the test is.

Ah, excellent find. Thank you. Of course, now I’m stuck on this ridiculous side-trip:

Which is the odd one out? (i) calm, (ii) quiet, (iii) relaxed, (iv) serene, (v) unruffled.

Current AIs are not general-application devices. They can be trained on a certain problem space and will then perform with varying success within it.

IQ tests mix a wide variety of subtests for different aspects of how the brain can approach a problem. There might be a problem asking you to guess the next number in a sequence of numbers, or to count stacked blocks and guess how many are hidden behind those that are readily visible (assuming gravity), and so on.

You would need to train the AI to handle each of these types of problems separately, over a variety of examples of each. If you did that, then it should be able to perform fairly well.

I understand that all of the knowledge-type tests (“What language did they speak in ancient Rome?”) have been removed from modern tests, as that isn’t what an IQ test is meant to test for.

I’ll go with (ii) quiet as being the different one. I can be scared speechless, and thus technically ‘quiet’ – I don’t think the same can be said of the others.

Ditto for “quiet.” The others are emotional states.

…and to get back on topic, I’ll wonder out loud if an AI would easily determine that kind of subtlety.

The WAIS has a section on “information”, which is simply general cultural knowledge and is part of the “Verbal comprehension index”. For low-scoring people it’s more a test of whether they are able to comprehend what people are asking them about; for high scorers it measures how much random knowledge you’ve managed to absorb and are able to regurgitate. I managed to get up to some really tricky ones that I wouldn’t have gotten without the passion for learning that I have, but I couldn’t come up with a good estimate for a certain distance they wanted; I’m just not good at estimating distances at scales other than an inch or a foot.

In terms of gut feeling, I would go with “unruffled” because it’s the only one with an affix (“relaxed” is merely conjugated). I would stand by my answer as well, even when recognizing why “quiet” might be what they were looking for, because it wasn’t specified at all what sorts of things we were supposed to be considering about the words. If I choose to consider the linguistic construction of the words, that’s merely a different choice than considering the possible semantic meanings.

Yes, the same (or at least analogous) can be said for the others.

All words in English have a spectrum, wide or narrow, of multiple definitions, according to the context. It is an invalid question to ask “which word has the most blatantly conspicuous outliers of usages?” A bird can be ruffled, and still perfectly calm, quiet, serene, relaxed.

For the OP.

There is the truism: IQ tests test how good a person is at doing IQ tests.

There are IQ test primers, where you can learn how to do better at them. For some people this has been an important thing to do. This is all about learning the test, and not the ability.

For an AI program this is exactly the trick. Across the range of IQ tests, you could craft a program using well-understood traditional AI techniques targeted at each problem type. On some problems, like matching similar geometrical shapes, we could probably do a better job than most humans. Indeed, all of the “odd one out” problems could probably be managed with a not-too-difficult set of heuristics.
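As a toy illustration of that kind of heuristic, here is a minimal sketch in Python. The hand-coded attribute sets below are purely illustrative assumptions standing in for whatever features a real system would extract; the heuristic just picks the word sharing the fewest attributes with the rest.

```python
# Toy "odd one out" heuristic. The attribute sets are hand-coded
# assumptions for illustration, not output of any real knowledge base.
words = {
    "calm":      {"emotional_state", "adjective"},
    "quiet":     {"sound_level", "adjective"},
    "relaxed":   {"emotional_state", "adjective"},
    "serene":    {"emotional_state", "adjective"},
    "unruffled": {"emotional_state", "adjective"},
}

def odd_one_out(items):
    # Score each word by how many attributes it shares with all the
    # others; the lowest-scoring word is declared the outlier.
    def shared(word):
        return sum(len(items[word] & items[other])
                   for other in items if other != word)
    return min(items, key=shared)

print(odd_one_out(words))  # -> quiet
```

Of course, as the thread goes on to note, the answer you get depends entirely on which attributes you chose to encode; swap the semantic features for morphological ones and “unruffled” wins instead.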

All this really does is underline how poor conventional IQ tests are for measuring “IQ”.

Sorry to continue the hijack, but this is a great example (as others have said) of why IQ tests are a very narrow measure of intelligence. How can you argue against (iv) being the odd one out, on the basis that it is the only word to end in a vowel? Or what about (iii) being the odd one out because it is the only one whose first letter has an even-numbered position in the alphabet? I agree (ii) is probably the answer being sought, by the way.

Odd man out questions can be completely useless. One that comes to mind is: (i) camel, (ii) crocodile, (iii) coyote, (iv) hyena, (v) canary.

Possible answers:

Camel, the only domesticated one
Crocodile, the only cold-blooded one
Coyote, the only one not of Old World habitat
Hyena, the only one not starting with a C
Canary, the only one with two feet and wings.

A paper published a couple of years ago by Chinese and Microsoft researchers compared Human Performance (HP), obtained via Amazon’s Mechanical Turk, with their various AI models on verbal IQ questions. Their best model (RK) outdid the humans. See page 6 here. (PDF).

I think AI solving IQ questions involving visual material is still in its infancy.

IQ tests are timed, and a higher score can be attained by a subject who can solve the problems faster. In nearly all cases, AI can solve a problem faster than HP, even if HP has the ability to answer every such question correctly. For example, on a multiple-choice question asking which of five numbers is prime, AI can answer in milliseconds, where even the most intelligent (non-savant) human would have to go through the laborious process of testing each option for primality. This gives AI a huge scoring advantage, even though the human is perfectly capable of solving exactly the same problem.
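To make the speed asymmetry concrete, here is a minimal sketch: naive trial division, which a machine completes in microseconds while a human works through each candidate by hand. The five candidate numbers are made up for the example.

```python
# Brute-force primality by trial division: trivial for a machine,
# laborious for a human under time pressure.
def is_prime(n):
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

# Hypothetical multiple-choice options (only one is prime).
options = [221, 323, 437, 451, 467]
primes = [n for n in options if is_prime(n)]
print(primes)  # -> [467]
```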

No. This is what is wrong with IQ test style testing of school students too.

There are two cases for any one question.

  1. The test subject has been taught, via his schooling, how to solve the same sort of question before. The test is simply testing his speed at applying the solution, and his ability to think about abstract concepts without being confused.
    The variations the student can deal with are variations on properties of the items in the set where he is asked to find the odd one out, the missing one, or the next one. E.g. it’s a set of numbers, or a set of pictures with varying geometry, colour, and orientation.

or 2. They have not seen that type of question before, and have no ready solution method. The human may waste time trying to solve one sort of question when he should really move on to the next. The computer will just know it doesn’t know and move on, or somehow produce an answer that isn’t correct.
“Q. What is the next number in the series 1, 2, 3, 4, 5, 6, 7, 8?”
Preliminary answers in the AI’s set may include ideas like: “84% of phone numbers which have those digits in that order have the next digit as 9.”
This is the equivalent of Google-fu… Hey, we humans do it because we can solve NP-complete problems with intuition (… content-addressable memory… find me another memory with similar content…).
It’s not clear how an AI should convert its set of ideas, such as the phone-number answer, into a single numeric answer.

The computer’s answer reveals its failure to understand that the question wants the most likely answer. It may be programmed to ignore answers that score less than 90%, arbitrarily. Whereas the IQ test is often asking for an answer where five other possible answers could be worked out in ten minutes of human thought, and one more likely one could be decided upon because it took only 30 seconds to produce, whereas the others took about two minutes each. “What is the next number in this series?” actually has infinitely many answers, unless the question is further constrained by specifics about what the pattern should be. And that is how people who score high on IQ tests get through the different types of questions so fast… and how the question writers can pose questions they know there is no one answer for… they know what the simplest pattern would be and assume that is the pattern expected to be used.
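The “infinitely many answers” point can be demonstrated directly: for any target you like as the ninth term, Lagrange interpolation produces a polynomial that reproduces 1 through 8 exactly and then yields that target. This sketch checks it for the “obvious” answer 9 and the phone-number answer 84 (exact arithmetic via `fractions`).

```python
from fractions import Fraction

def lagrange_value(points, x):
    # Exact evaluation at x of the unique interpolating polynomial
    # through the given (xi, yi) points.
    total = Fraction(0)
    for xi, yi in points:
        term = Fraction(yi)
        for xj, _ in points:
            if xj != xi:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total

seq = [(n, n) for n in range(1, 9)]  # the given terms 1, 2, ..., 8
for target in (9, 84):               # "obvious" answer vs. phone-number answer
    pts = seq + [(9, target)]
    # one polynomial fits the whole sequence and produces `target` as term 9
    assert all(lagrange_value(pts, x) == y for x, y in pts)
```

So without a constraint like “use the lowest-degree pattern”, every candidate next term is equally defensible.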

And so to the crucial corollary of all that: what I am saying is that if someone had an AI trained up to solve IQ tests, I could take these odd-one-out, next-in-the-series, what’s-missing questions and rephrase them in a way the AI hadn’t been trained to solve.

A person would easily catch on to what I am up to. Probably this is why IQ tests are trusted for use with humans… the habits of the question writer are picked up by the subject, and so it’s always just a speed-of-accurate-thinking test.

The computer would just be totally defeated, probably giving the answer as per what it was trained to assume the question to be, rather than working with the extra little complication.

The Mensa entry test is more a true test of the ability to think logically, and with more numerous factors, rather than simply running through similar questions in rapid fire… a simple speed test.

That’s why the two are at different ends of the spectrum.
School-age IQ tests are teaching the student how to write the expected answer (the teaching of the pattern is implied, not explicit, so it’s unfair to say it is a true constraint imposed on the series). That way the subject isn’t greatly affected by not having sat the same test last year, or by having been taught how to answer similar questions.
Mensa tests are so many different questions, with no such “teaching” in them; however, they can’t help but test the subject’s previous learning, such as the set of prime numbers (lowest first…), the more the better…
And it’s impossible to write a great long list of questions without duplicating the sorts of questions that someone else has already asked and had answered.
Back to the AI… if it provides an answer, it may apply the same algorithm it knew as a method for answering a similar question. It would then give the wrong answer because of some nuance of language or diagram understanding it wasn’t noticing as important.

Watson was extremely impressive on Jeopardy. However this was an illusion of intelligence, not the real thing. Watson’s capability (however useful) is narrow and “brittle” by human standards.

E.g., if you asked Watson “What is the freezing point of water?” it would accurately respond something like “32 F”, simply because those words are commonly found in proximity. It is like a super-version of Google.

But if you asked Watson “If a snowman melts and later refreezes, does that turn back into a snowman?”, it would never be able to answer that.

Another example: you give Watson a typical middle-school math problem: “If a large combine harvester can harvest 5,000 bushels of corn per hour, and a medium combine can harvest 2,500 bushels of corn per hour, how many bushels will they harvest working together for two hours?” It would probably state 15,000 bushels.

Now you ask Watson: “If a 17-yr-old boy can pick two gallons of berries per hour and a 16-yr-old girl can pick one gallon of berries per hour, if they both go into the berry patch together, how many gallons will they pick in two hours?”

These are not my examples – they were stated previously by other AI researchers.