I need a combination programmer and semanticist

I don’t necessarily expect anyone to know the answer to this; maybe there isn’t one exactly. But I have a better chance here than anywhere else!

I love random poetry generators. Ever since the first computer-written poetry was publicized (back in the 80’s, I think) I have been fascinated with the idea. I like the ones that compose original poetry far more than the ones that simply take great poets’ work and cut-and-paste it.

My favorite, Random Verse Lab, has a lot of cool features. For one thing, it lets you modify the lexicon any way you like – add words, delete words, or keep separate subsets of words in their own word sets. The way it works is, the words are separated into parts of speech; you make a template in a grid where each word in each line is represented by a box into which you enter a part of speech. It even has a few blank part-of-speech categories, which is fortunate, because it works much better if you don’t have to lump the articles in with the adjectives.
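The template-and-lexicon mechanism described above can be sketched in a few lines of Python. This is a hedged illustration, not Random Verse Lab’s actual code; the lexicon, template, and function names here are all made up for the example:

```python
import random

# Hypothetical mini-lexicon, grouped by part of speech. Note the separate
# "article" category, rather than lumping articles in with the adjectives.
LEXICON = {
    "article": ["the", "a"],
    "adjective": ["pale", "hollow", "silent"],
    "noun": ["moon", "river", "window"],
    "verb": ["drifts", "burns", "waits"],
    "preposition": ["over", "beneath", "beyond"],
}

# A template: each line of the poem is a list of part-of-speech slots,
# like the grid of boxes described above.
TEMPLATE = [
    ["article", "adjective", "noun", "verb"],
    ["preposition", "article", "adjective", "noun"],
]

def generate_poem(template, lexicon):
    """Fill each part-of-speech slot with a random word from the lexicon."""
    lines = []
    for line in template:
        words = [random.choice(lexicon[pos]) for pos in line]
        lines.append(" ".join(words))
    return "\n".join(lines)

print(generate_poem(TEMPLATE, LEXICON))
```

Because each slot is filled independently, nothing ties the word choices together, which is exactly why the output only occasionally sounds like natural language.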

My question is this: It’s pretty much only by chance that an occasional poem will sound like natural language. The rest of the time, it sounds like phrasebook English. Are the rules of construction in English simply so complex and with so many exceptions that it’s just not practical to write so many rules into a program?

I’ve noticed that translation engines seem to have the same problem; it’s as if they can’t look at the input text as a whole and translate it, but rather do it one word at a time.

It’s a bit better than one word at a time, but yeah, that’s pretty much the approach. Computational linguistics in general, and natural language processing in particular, are very active research areas right now, so with any luck it’ll all get better in the next 5-10 years.

It’s a very hard problem. I notice that when I read papers by non-native English speakers, they sometimes pick the wrong word from a set in their dictionaries. It’s usually close enough that I can tell what they meant. Try doing a translation yourself, and you’ll see that you often use the sense of the sentence, or the entire passage, to choose the right word. Even when I took AI, mumble years ago, it was recognized that you really need to understand the meaning before doing the translation.

It’s difficult but possible to create sentences that sound like grammatical English. Even there it’s difficult to create sentences with complicated grammar. You can more easily create sentences with simple grammar. It’s much harder to create sentences that sound like plausible, logical sentences. To create sentences that bear some relation to the real world you would have to put a large amount of knowledge of the real world into the computer. This is barely possible if your sentences are just about some small set of topics. Creating sentences that are about pretty much anything that a person might talk about (and which make sense as comments about those subjects) is basically hopeless at the moment, since it would mean putting all the knowledge of a person in the machine.
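The point about simple grammar being the easy part can be made concrete with a toy context-free grammar. This is a sketch under my own assumptions (the grammar and symbol names are invented for the example): every sentence it produces is grammatical English, but nothing guarantees it makes sense:

```python
import random

# A toy context-free grammar mapping each nonterminal symbol to its
# possible expansions. Output is grammatical but may be nonsense,
# e.g. "a quiet piano admires the idea".
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Det", "Adj", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "Det": [["the"], ["a"]],
    "Adj": [["green"], ["quiet"]],
    "N": [["idea"], ["dog"], ["piano"]],
    "V": [["sleeps"], ["admires"]],
}

def expand(symbol):
    """Recursively expand a symbol; anything not in GRAMMAR is a word."""
    if symbol not in GRAMMAR:
        return [symbol]
    production = random.choice(GRAMMAR[symbol])
    words = []
    for part in production:
        words.extend(expand(part))
    return words

print(" ".join(expand("S")))
```

Making the grammar richer only gets you more elaborate nonsense; plausibility would require exactly the real-world knowledge the post describes.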

Paging Mr. Turing … paging Mr. Turing …

Why do you want to get hold of Alan Turing? Do you want to find out which SD posters are real people and which ones are just computer programs that create plausible-sounding posts? Which posters aren’t really people?

I have the SDMB to thank for knowing about this germane illustration:

So, is that what the Turing Test amounts to, finally? Not simulating human intelligence, but simulating human language? Or do they amount to the same thing?

I guess I had always assumed that the idea of the test implied that somehow we would be able to get computers to *talk* like humans long before we could get them to think like humans.

(Mods, if you think this is veering away from GQ, go ahead and move it.)

Here’s a hypothetical: Say Commander Data from Next Generation for some reason had to pose as human, so he couldn’t say things like, “I lack the capacity for emotion.” Maybe he was in a chat room, trying to pick up a girl online. :dubious:

If he could talk like humans, but could only think like a computer, could he pass? Or is there no such thing as think like a computer, because they “think” in whatever way humans program them to think?

I must leave for work soon. But I am looking forward to seeing what Dopers have to say about this when I get home. And thanks for the info thus far.

As of right now, even thinking like a mouse stretches the capacity of what we’ve got. It’s really not possible to do anything but speculate about human-like intelligence.

~ from The Cyberiad, originally written in Polish and translated by Michael Kandel into English

Just to agree with everyone else–the problem of a computer generating English prose is very deep (and poetry is even deeper). It is a current topic of research, but that has been the case for the last 50 years. Breakthroughs are regularly predicted 10 to 20 years in the future.

The Turing test measures language, since there is no known way of directly measuring intelligence, and the test is about whether the entity is indistinguishable from human. Probing the brain is as illegal as looking at the box the entity comes in.

What do you mean by “think like a human”? If our conscious mind is built on top of a subconscious mind, and the computer’s human simulation is built on a non-human program, how is that different? The Turing test isn’t just about talking like humans, it is about acting in a conversation just like a human. I don’t know if this ability counts as language or thought.

There’s not much of a way to test whether something “thinks” like a human. All you can do is observe how they react to things. If a machine were able to talk like a human successfully, that machine would likely be thinking like a human to the extent of our ability to test it.

Note that talking like a human requires a lot more than just natural language parsing and translation. But dealing with natural language is a necessary component of passing a Turing test.

As an example of the subtleties of human language (English in particular, since that’s the one I’m most familiar with), consider the order of adjectives. I can refer to a “red ball”, or a “rubber ball”, or a “small ball”. But what if the ball in question has all of those properties? I could call it a “rubber red small ball”, but that sounds completely unnatural to any English speaker. In fact, any combination other than “small red rubber ball” sounds wrong. Likewise, “large yellow wooden ball” sounds natural, but not any other permutation.

There’s apparently some sort of order relation on categories of adjectives, such that adjectives for size always come before adjectives for color, and that adjectives for color always come before adjectives for composition. Any computer which hoped to emulate natural English constructions would have to have a list of these categories, in order, with each adjective sorted into one of the categories. But now I ask: What are all of the categories which go on this order list? And what adjectives belong in each of them?
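Given such a list of categories, the ordering rule itself is trivial to code; the hard part, as the post says, is filling in the categories and the lexicon. Here is a minimal sketch with a hypothetical (and very incomplete) category table:

```python
# Hypothetical partial ordering of adjective categories. Real English has
# more categories (opinion, size, age, shape, color, origin, material,
# purpose, ...) and fuzzier boundaries; this table is just for illustration.
CATEGORY_ORDER = ["opinion", "size", "age", "color", "material"]

# A tiny, hand-built lexicon assigning each adjective to one category.
ADJECTIVE_CATEGORY = {
    "lovely": "opinion",
    "small": "size", "large": "size",
    "old": "age",
    "red": "color", "yellow": "color",
    "rubber": "material", "wooden": "material",
}

def order_adjectives(adjectives):
    """Sort adjectives by the position of their category in CATEGORY_ORDER."""
    return sorted(adjectives,
                  key=lambda a: CATEGORY_ORDER.index(ADJECTIVE_CATEGORY[a]))

print(" ".join(order_adjectives(["rubber", "red", "small"]) + ["ball"]))
# small red rubber ball
```

The sort is the easy ten lines; deciding what the full `CATEGORY_ORDER` list is, and which category every English adjective belongs to, is exactly the open question posed above.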

Anything that talks like a person (not just in its grammar, but in understanding what’s asked of it and replying in a sensible way) is a person. To talk like a person you have to think like a person, and thinking like a person is the definition of a person. If you could create a computer program that talked like a person (in the way defined above), you’d better be prepared for it to ask to be declared a citizen of its country and be given the vote.

Well, that’s a good question. But suppose you had an entity that could communicate to you in natural language, such that you couldn’t tell whether the entity was a human being or an AI.

You could claim that an AI capable of such a thing wasn’t thinking like a human, wasn’t conscious, and wasn’t a person, but on what basis would you make that assertion? And if you asserted that the AI wasn’t “really” conscious, and didn’t “really” think, how could you answer someone who asserted the same thing about a human being?

In other words, how do you know human beings have consciousness and think? Well, they act as if they do, and they are able to communicate with you in such a way that, if such a thing as consciousness exists, if such a thing as thinking exists, then human beings are conscious and think. But the AI can do the same thing! So to assert that humans can think but AI can’t would be perverse. If you assert that the AI isn’t “really” thinking, you’d have to also accept the possibility that the humans aren’t really thinking either, they just act “as if” they were thinking.

But what’s the difference between an entity that acts “as if” it were conscious, “as if” it were thinking, and an entity that really is conscious and really does think?

And in my opinion, such a distinction is incoherent. An AI that can pass a Turing test would be conscious. Either that or there is no such thing as consciousness.

Of course, the out is that as of today no such AI exists, and it might be true that no such AI will ever exist. But we have no good reason to suspect that such an AI is impossible; after all, our human brains are composed of ordinary matter arranged in ordinary ways, so to assert that AI is impossible would be to assert that there’s some sort of magic happening in our brains. That would be pretty surprising if true.

Seems like some AI software could be created that learned those rules over time, no? But I agree it would be very difficult, if not impossible, to get it right the first time.

I think I remember reading somewhere that learning grammars is NP-complete. I’ll have to dig around and see if I can find the source.

By “learning grammars” do you mean “deducing a grammar by observing instances of valid and invalid strings in that grammar”? If so, off the cuff, that sounds like a generalization of the halting problem.

Or maybe you mean something else.

Nevertheless, that doesn’t much impact the ability of an AI to correctly learn natural language. It’s perfectly possible to construct meaningless sentences in natural languages that pass any formally-defined grammar, and similarly possible to construct arbitrarily complicated sentences that actually do have a reasonable meaning, but which no human would ever parse.

Yes, that’s what I mean. I don’t completely understand all the details, but it looks like the problem of deducing a grammar from valid and invalid strings is at least as hard as several cryptographic problems that are widely believed to be intractable.

I’m almost embarrassed to mention it now, in view of the very erudite conversation preceding; but what I was thinking would distinguish an AI which had otherwise acquired a capacity for natural language from a human in a remote conversation is a certain lack of what they call “affect”. But I suppose that this notion is more romantic than rational, and anyway, many humans also demonstrate a lack of affect in varying degrees.

I actually like to use the RPG as a sort of oracle and so the mismatched verb tenses & gender pronouns just add to the moody cryptic-ness.

Thank you all for adding to my understanding of the scope & nature of the problem.

Just in the interest of giving credit where it’s due, the original URL of that cartoon is

(The strip also includes mouse-over comments. Sometimes they’re funnier than the cartoon. Sometimes Turing himself would miss the joke, to say nothing of an AI.)