Why is the Turing test considered to be a poor measure of artificial intelligence?

According to this article in Salon, leading AI researchers think the Turing test is not a particularly useful metric for investigating or measuring artificial intelligence.

The author of the article suggests this is because AI has been over-promising and under-delivering for a long time and has not, despite decades of effort, come even close to devising an AI construct that could pass a rigorous Turing test.

A Turing Test is essentially a test in which a human judge converses via written/typed words with someone/something they can neither see nor hear, and the quality of the communication has to be such that the entity being evaluated is indistinguishable from a human.

Why is the Turing test a poor metric of artificial intelligence?

The Turing Test isn’t truly a measure of the intelligence of anything. It is rather more a measure of how well a computer program can fool a human into believing they are talking to another human.

I saw a Turing Test being conducted once (on television) where they had a bunch of people and a bunch of computers in one room and ‘testers’ (humans) in another room. The testers would rotate from computer to computer and ask questions via typing. They had no idea which stations connected them to a human and which to a program.

None of the computers successfully passed the Turing Test, but a few did a good job. Part of the trick here is that each station was limited to a specific topic. One might be ‘random chit-chat’ while another might be only about chess (I can’t remember what all the categories were). IIRC the random chit-chat program actually did well, as it excelled at being vague. The more rigidly defined categories usually had the computer program coming off as too knowledgeable or too stilted.

Interestingly, one human was judged to be a computer. She was an expert in Shakespeare, and most testers found her knowledge so complete and deep that they figured only a computer could have so much detailed knowledge.

Besides, who is to say an intelligence has to be able to chat with a human to be intelligent? Dolphins are fairly intelligent. Do you think there is one out there that could convince a human of that fact by talking to them?

Perhaps I’m incorrect, but I thought the idea of the Turing Test was to determine whether or not an algorithm would terminate or run infinitely.

The Turing test is a poor metric of artificial intelligence because it places too much stock in natural intelligence.

When Eliza (the famous Rogerian therapy emulator) was created in the mid-sixties, it “fooled” some college-educated people.

I guess it was hard to distinguish Eliza’s responses from those of a real person – assuming the real person had just drunk two bottles of cough syrup. Nevertheless, her creator, Joseph Weizenbaum, reported that some of his colleagues insisted that they needed time on the computer with Eliza “to work through some things.”

The Turing Test was devised to establish what we now call “weak AI,” which is a system that gives the appearance of true intelligence. “Strong AI” would be a system that was truly intelligent (and, as you can imagine, what “true” intelligence involves is a widely debated topic).

The company that I work for develops commercial weak AI-based agents that allow you to communicate in natural language. Our toolkit allows you to develop content that can respond in a human-like manner, so that if you spent enough time and effort, you could create an agent that would be very difficult to distinguish from a human. Indeed, some of our Japanese clients have gone to great lengths to produce remarkably human-like agents (although still a far cry from being able to pass a rigorous Turing Test).

You’re thinking of Turing machines - same guy (Alan Turing), different concept. A Turing machine is a theoretical, idealized but simple computer that is used to reason about problems in computer science. One such classic problem is the Halting Problem - that of knowing whether, for a given program, a Turing Machine (maybe a Universal Turing Machine - can’t remember the difference, it’s been a while) will halt for a given input.

That’s the halting problem on a Turing Machine - an idealized theoretical computer composed of one read-write head and an infinite tape from which symbols can be read and onto which they can be written - and it has been proven insoluble: there is no way to prove, in the general case, whether a program will eventually make the head stop or whether it will make the head go on forever.
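The insolubility comes from a short self-referential argument. Here’s a minimal sketch in Python (the function names are made up for illustration): if a general `halts` oracle existed, you could build a program that contradicts it.

```python
def halts(program, data):
    """Hypothetical oracle: returns True iff program(data) eventually stops.
    Turing proved no such function can exist, so this is just a placeholder."""
    raise NotImplementedError("no general halting oracle can exist")

def paradox(program):
    # Loop forever exactly when the oracle says program(program) halts.
    if halts(program, program):
        while True:
            pass
    else:
        return

# Now ask: does paradox(paradox) halt? If the oracle says yes, paradox
# loops forever (so it doesn't halt); if the oracle says no, paradox
# returns immediately (so it does halt). Either answer is wrong, so
# halts() cannot exist.
```

The same contradiction goes through for any proposed halting-decider, which is the heart of Turing’s proof.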

Because no one has yet found me out.

Until now. Damn.

Wait, wait… so, because AI developers have yet to be able to develop an AI that can pass the Turing Test, the test itself is flawed?

Is it just me, or does it sound like some programmer has a wounded ego?

My difficulty is with the word intelligence. Better to say consciousness. My computer shows signs of intelligence, e.g. it can add a little faster than me (1,000,000 x faster, to be precise). The Turing test is a test of the strong AI thesis, i.e. whether it is possible for a machine to be conscious. The point is, I assume you are conscious too because you behave like me. If a machine were to behave like me then I would have to assume it was thinking too. This is too big a step for many philosophers and is basically untestable. There is probably no real test to see whether a computer could think (or whether you are thinking either). However, if a machine can make plans, decisions, analyse things etc., then it may well be thinking. A limited Turing test (e.g. conversing with a computer for 5 minutes) is of little use, but a more rigorous one may be.

My 2c:

How do we know if a computer is intelligent (whatever that is)? Turing pointed out that if it is indistinguishable from a human, it must be (assuming humans are intelligent, whatever that is).

However, if you point a random schmoe at a computer and it says ‘that’s right, joe. you rock’ he’ll say ‘whoa! intelligence.’

If you point an AI researcher at a computer, he’ll know exactly what to look for (prob. random factoids, or good guessing of context or something) and it’s almost impossible to pass.

So there’s no easy way to do it well.

I agree with most of what the respondents have said. Attempts to pass the Turing test that are limited in some way (e.g. narrow subject) and are tested against naive users can appear to be quite impressive. The trouble is that, at least for programs such as Eliza, they rely heavily on linguistic tricks.
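For a flavour of the linguistic tricks involved, here is a minimal Eliza-style sketch (these are not Weizenbaum’s actual rules; the patterns and names are invented for illustration): keyword matching plus pronoun “reflection”, with no understanding whatsoever.

```python
import re

# Swap first-person words for second-person ones so the user's own
# phrase can be echoed back at them.
REFLECTIONS = {"i": "you", "my": "your", "am": "are", "me": "you"}

# Ordered rules: first matching pattern wins; the last is a catch-all.
RULES = [
    (re.compile(r"i feel (.*)", re.I), "Why do you feel {0}?"),
    (re.compile(r"i am (.*)", re.I),   "How long have you been {0}?"),
    (re.compile(r".*"),                "Please tell me more."),
]

def reflect(fragment):
    return " ".join(REFLECTIONS.get(w.lower(), w) for w in fragment.split())

def respond(sentence):
    for pattern, template in RULES:
        m = pattern.match(sentence)
        if m:
            return template.format(*(reflect(g) for g in m.groups()))

print(respond("I feel trapped by my job"))
# -> Why do you feel trapped by your job?
```

A handful of rules like these is enough to sustain a vaguely therapeutic-sounding exchange, which is exactly why naive users were impressed and why the trick collapses under rigorous questioning.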

A huge stumbling block for any computer to pass the Turing test is that people have a vast amount of real-world knowledge that greatly exceeds the knowledge of even the most advanced computer system.

The Turing test in its original form is not unreasonable; it is just that by the time we have an AI system that can pass it, people will likely have long since accepted that artificial intelligence is possible from slightly lesser systems.

If scientists could produce a machine with the intelligence of a dog, I for one would be very, very impressed. To achieve the intelligence of a two-year-old child would be a staggering achievement. Neither of these would pass the Turing test.

I agree with G. Cornelius. I think that if a computer had the visual and auditory recognition capabilities of a small bird, it would be an astounding accomplishment.

The essential problem is this: intelligence cannot be well measured by how a *person* carries a conversation. Besides, how intelligent are most human conversations anyway?

Another point to consider: Turing never meant for his proposition to become such a Holy Grail for AI researchers. Because it was proposed by Alan Turing, though, it’s become very famous. The result is that there are people and companies (like Cerowyn’s) aiming to pass the Turing test as a specific goal, rather than treating it, as Turing proposed, as a feature of intelligent machines in general. No matter how well a pure conversational system is refined, it won’t be intelligent - but with a large enough vocabulary, it’s quite feasible that one of these systems might eventually pass a general Turing test.

Right, the Turing Test was designed not as the definition of ‘intelligent’ or the only test, but as sort of a far bound for an ‘intelligent’ computer. If it can pass the Turing Test (consistently) then, as long as we agree humans are ‘intelligent’ then the computer must be, too. But that doesn’t mean failing the test proves it’s not ‘intelligent’. In math words, passing the test is sufficient for proof, but not necessary.
I think perhaps scm1001 (hmm… suspiciously like the name of a program process?) is right in that ‘conscious’ is a better term, though.
I would guess current AI researchers aren’t so much saying it’s a non-valid test, as saying that they’re not working on passing it as a goal right now, rather focusing on more useful and achievable goals.

Some sort of AI consciousness would be so radically different from people in how it learns and thinks that it’s hard to imagine it’d be indistinguishable from a human. Even if it were somehow its own consciousness, demanding that it be human-like in its interaction as a way of certifying that is faulty.

Well, even an illiterate adult with only an elementary education can do something that is truly astounding: abstract thought.

Now, concrete abstractions, like “all these objects are blue”, can be handled by animals and computers. But truly idealistic abstractions, like “it is not fair and just that so few have so much and so many have so little”, are going to be harder to program.


Kill me, I’m too happy!

Norman, coordinate.

Dang liberal computers!


One of the reasons why it has become irrelevant is that we are no longer after human-mimicking AI; we are focused more on human-augmenting AI, that is, on how we can use the computer’s strengths to complement human strengths. The major fields in AI ATM - speech recognition, agents, data mining, etc. - all focus on helping humans, not mimicking them. The only notable exception would probably be game AI; in that field, I would say some applications are getting very close.

The other reason is that the Turing tests have become a series of cheap parlour tricks. They focus more on how humans work than on how the computer works. A lot of the work is on how humans can be fooled.