Why ChatGPT doesn't understand

Is it possible to show this argument doesn’t apply to a human?

(I haven’t read the entire thread; apologies if already addressed.)

If you still think these LLMs are just stochastic word generators or complex ‘lookup tables’, you might want to read this:

Progress Measures for Grokking via Mechanistic Interpretability

‘Grokking’ is the name researchers have given to a neural network’s emergent ability to ‘understand’ things, as opposed to simple memorization or lookup. The term comes from Heinlein’s ‘Stranger in a Strange Land’, and it basically means to understand something fully. In practice, ‘grokking’ refers to a sudden phase change in behavior, an emergent capability.

The authors were looking for a way to understand why these models do what they do. So they set up a very simple model that could be reverse engineered, and started training it on samples of modulo arithmetic, waiting to see if it ‘grokked’ the process at some point.

At first, the model attempted to answer modulo arithmetic the ‘lookup’ way, by generating tokens based on similar operations it was trained on. Accuracy was low, much like the way we described ChatGPT’s ability to do math: it, too, was clearly treating math symbols like text and trying to build answers the same way it builds sentences.

But along the way, suddenly the thing ‘grokked’ modulo arithmetic, and could solve any of the problems perfectly. At that point, the researchers dove into the model to see what had changed, and what they found was pretty amazing:

The model knew no math and had no access to math references, but after enough data was ingested and its responses graded, an algorithm for calculating modulo arithmetic involving trig identities and Fourier transforms emerged inside the neural net, encoded in the connections and weights of a set of parameters.

It did this purely through the same gradient descent/loss function mechanism through which it did everything else. No one knew this ‘circuit’ in its neural network existed until they went looking for it. No one knows what other ‘circuits’ were evolved to enable theory of mind, translation, or other emergent capabilities in larger LLMs.

Apparently, once the model figured out the more efficient, universal algorithm for modulo arithmetic, it reset or pruned the part of the model that had been doing it the ‘memorization’ way.
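To make the reported algorithm a bit more concrete, here is a minimal numerical sketch of the Fourier/trig-identity trick the paper describes. This is my own toy reconstruction, not the trained network: the modulus matches the paper, but the particular frequencies and the three-frequency sum are illustrative choices on my part.

```python
import numpy as np

# Toy sketch of the Fourier / trig-identity algorithm for (a + b) mod p.
p = 113                                      # modulus used in the paper
a, b = 37, 95                                # example operands
ws = 2 * np.pi * np.array([1, 2, 5]) / p     # a few "key frequencies" (illustrative choice)

# Represent each operand by cosines/sines at the key frequencies, then use
# cos(x)cos(y) - sin(x)sin(y) = cos(x + y), etc., to get waves at frequency
# w*(a + b) without ever adding a and b directly.
cos_ab = np.cos(ws * a) * np.cos(ws * b) - np.sin(ws * a) * np.sin(ws * b)
sin_ab = np.sin(ws * a) * np.cos(ws * b) + np.cos(ws * a) * np.sin(ws * b)

# Score every candidate answer c by constructive interference:
# sum over frequencies of cos(w*(a + b - c)) peaks at c = (a + b) mod p.
c = np.arange(p)
scores = (cos_ab[:, None] * np.cos(ws[:, None] * c)
          + sin_ab[:, None] * np.sin(ws[:, None] * c)).sum(axis=0)

print(c[np.argmax(scores)], (a + b) % p)     # both print 19
```

The point of the sketch is just that a pile of trig operations on weighted sums, the only things a neural net can do, is enough to compute modular addition exactly once the right weights are in place.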

It’s clear that these trained models have a huge amount of complexity inside of them, including many evolved ‘circuits’ for solving complex problems.

It is way too premature to say that these models don’t ‘understand’. We know what happens when a model ‘groks’ modulo arithmetic. We have no visibility at all into the complexity that may have evolved in these models for grokking something way more complex like theory of mind. But we know it’s there, and it’s not simple ‘stochastic word generation’.

This is an argument about phenomenal consciousness, however. I don’t think anybody proposes ChatGPT to be conscious in this sense. What’s at issue here is intentionality—the property of symbols to be about, or refer to, something other than themselves.

Again, there’s a simple notion of ‘understanding’ under discussion here, which is broadly just that words mean things. If you want to discuss something else, you’re free to open up a thread about it, but if you want to continue here, I’d ask you to stop equivocating.

And again, as the argument in the OP establishes, when it talks about bottles, cups, and so on, it’s not talking about anything in particular, your impression notwithstanding.

It only applies to humans if all we have access to about the world is its abstract structure—so as soon as we don’t, it ceases to apply. More broadly, the question of how the symbols we use acquire meaning is the problem of intentionality, and it’s indeed an open problem, see the points made above to @Dr.Strangelove.

It also doesn’t help your argument at all, because the text input to ChatGPT was “caused” by humans (some of whom have actually seen Venus and understand it more deeply). If one can push understanding down the causal chain, then it should work for humans and computers.

But that’s specific to indistinguishable particles. And we can still distinguish electrons from photons.

How so? They’re still there. Anyone saying “Alice the Cat” might be referring to either one.

To be clear, I actually agree with this–the reason the copies can be purged is that no measurement can possibly distinguish them. But if they do have some underlying existence aside from their properties, then we can’t excise them.

Here’s another thought experiment. Don’t take the details too seriously–I’m mainly just trying to be illustrative here. But suppose you run a simulation, or multiplayer video game or the like. As the admin, you can instantiate objects and give them properties. So you create three generic objects, which you arbitrarily label just to keep track of things, and then give them the names Alice/Bob/Charlie, and also make Alice and Bob into cats.

Players or observers inside the game can only see the two properties of name and cat-status. They can’t see the labels, which are just implementation details not accessible to them.

As admin, you have quite a bit of control over these objects. For instance, you can swap two of them by simply reassigning the properties. From one point of view, you’ve permuted the objects. From another, you’ve permuted the properties. But they’re equivalent statements. For you, the objects do have a concrete nature; for instance, each one is backed by a specific region of memory to hold the state. But there’s no Alice-object; just an object that currently happens to have the name-property of Alice.

You can go further: suppose you want to put different players onto different servers for performance reasons. Everybody needs access to some common objects like Venus, but it’s too slow to constantly request data between servers. So you just duplicate Venus–i.e., you create a separate object on each server that each has identical properties.

A game player therefore can’t know what “physical” object they are referencing when they talk about Alice the Cat or anyone else. It might change moment to moment. It might at the same moment be different from what someone else means by Alice the Cat. But none of this matters, because no measurement can distinguish between these cases.
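As a hedged illustration of the swap point (my own toy code, with made-up names mirroring the example above): the admin sees which backing slot holds which bundle of properties, while players can query only the properties themselves.

```python
from dataclasses import dataclass

@dataclass
class Obj:
    name: str
    is_cat: bool

# Admin's view: backing slots (list indices stand in for memory addresses).
world = [Obj("Alice", True), Obj("Bob", True), Obj("Charlie", False)]

def player_view(world):
    # Players see only the bundles of properties, never the slots.
    return sorted((o.name, o.is_cat) for o in world)

before = player_view(world)

# Admin "swaps" two objects purely by reassigning their properties.
world[0].name, world[2].name = world[2].name, world[0].name
world[0].is_cat, world[2].is_cat = world[2].is_cat, world[0].is_cat

print(player_view(world) == before)   # True: no in-game measurement detects the swap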

Maybe you scoff a bit at the simulation hypothesis, and I won’t claim that it’s probable. But if Newman’s objection is to be generally applicable, it has to consider this case, or something equivalent. And it doesn’t seem to have an answer.

Furthermore, I see no reason why the world can’t work this way except without the simulation. There is no backing store for the objects; the collection of properties simply is. Nevertheless, the same basic problem is there. We can only permute artificially assigned labels, but as long as the bundles of properties remain stable, nothing changes.

Not to defend a theory I don’t think works, but the common response to that would probably be, that the symbol for ‘dog’ in a human being is causally related to dogs, while the cause of the symbol ‘dog’ for an AI that has only words to train on is just other words, and other words aren’t dogs.

On a conception of relational structure, everything is just indistinguishable particles, even cats are; it’s just that they stand in some relation ‘…is a cat’ that supplies their cat-identity. In reality, ‘…is a cat’ might just be a chunked large-scale description for a great many different relations, such as ‘…has four legs’, ‘…has two kidneys’, ‘…purrs’, and so on, but that doesn’t change the issue.

That isn’t what I’m saying. Any relation determines the minimal cardinality of the domain upon which it is defined. So, there will be Alice’ and so on, if and only if there is some relation in which they occur. But then, they’re just ordinary indistinguishable objects within the set of such objects, and don’t pose any special issues.

You can claim that there are more objects than are needed to fulfill the structure you intend the world to have, but this won’t cause any problems of reference, as you refer only to the number of objects that occur in the relation you claim exists.

Suppose the drawing in the OP were extended by an additional object in ‘the world’: then, you could build a different model in which that object is used, rather than one of the ones in the original domain. But still, the number of objects used to fix reference would be the same; so it’s not the case that there would be any uncertainty about whether one refers to Alice or Alice’. All that’s needed is that each term refers to one object, and that’s by construction.
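A toy enumeration may help make this concrete (my own illustrative code, not taken from the OP): with one extra object in the world, every injective assignment of the three terms yields a model of the same structure, and within each such model every term still picks out exactly one object.

```python
from itertools import permutations

terms = ["Alice", "Bob", "Charlie"]
cat_terms = {"Alice", "Bob"}          # the relation '...is a cat' over the terms

world = ["o1", "o2", "o3", "o4"]      # one more object than the structure needs

models = []
for objs in permutations(world, len(terms)):
    assignment = dict(zip(terms, objs))
    # Reinterpret the predicate as "image of a cat-term under this assignment";
    # every injective assignment then satisfies the structure.
    cat_extension = {assignment[t] for t in cat_terms}
    models.append((assignment, cat_extension))

print(len(models))    # 24 distinct but structurally indistinguishable models
```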

In your example, the admin and the players simply have access to different relational structure, leading them to associate a different cardinality to the domain. That isn’t surprising: the number of elements, in a purely Newmannian scheme, is an empirical datum—in fact, it’s the sole such datum. If you’re somehow prevented from probing some of the structure of the world, you’ll not be able to discover the correct cardinality.

No. This is not about “impressions”. These objects and the actions associated with them have certain specific qualities that a human understands as a matter of common sense, and knowing about those qualities is essential to be able to answer the questions correctly. ChatGPT did so.

This is an absolutely baffling comment. This thread is about whether ChatGPT understands the questions it is asked. You’ve created an argument attempting to show that it doesn’t. A number of us have provided multiple lines of evidence to show that it does, which lately you’ve been steadfastly ignoring. I will note, for instance, that you’ve completely ignored the success of AI on the Winograd schemas explicitly designed to test understanding, as well as @Sam_Stone’s informative post just above.

I am not “equivocating” – which incidentally is perilously close to being an accusation of lying. I’m pointing out the considerable successes that neural nets like ChatGPT have had with tests for understanding, while also honestly acknowledging the failures and the reasons for them.

I’ll summarize where we’re at here, in my view. It’s been demonstrated that ChatGPT can successfully solve many problems in logic, which requires both an understanding of the problem and, obviously, correctly parsing the natural-language question in the first place. The Turing test was proposed back around 1950 as a test for machine intelligence based on its responses to questions over a text terminal such as a teletype – i.e., based on the machine’s behaviour. More recently, the Winograd schema was proposed as a more rigorous approach to such an assessment.

ChatGPT has been remarkably successful in the Winograd schema tests I’ve tried it on, and in recent years neural nets like it have scored at close to human level in large-scale testing. When failures occur, they are almost invariably due to the AI’s lack of appropriate real-world understanding. It’s often surprising how much such real-world understanding actually exists – in the councilmen vs demonstrators question, for instance, it understood that councilmen would be the ones concerned about potential violence, and demonstrators the ones likely to cause it.

But fundamentally the AI lives in its own world of total isolation. It does not experience the world as we do, so many of its failures are due to lack of common-sense concepts that even a child would have. The key to a more successful level of true AI understanding is training in the realm of what has been called infant metaphysics – basic real-world knowledge. It’s a tough nut to crack, but we’re getting there – quite fast, actually.

Because those are not on topic for this thread—a topic which you have been steadfastly ignoring completely. All that’s said there is that it seems strange to imagine that ChatGPT could do these things without understanding—but if the argument in the OP is right, then that’s what happens.

Compare: I give an argument that the Earth, in fact, moves through space. You point out that it doesn’t seem like it’s moving at all. Does this pertain to the argument I made? No: not without an argument that things couldn’t seem like that if the Earth did move through space.

You’ve said yourself that there is a difference in the ‘engineer’ concept of understanding and the ‘philosopher’ concept of understanding (which, I think, is just ‘understanding’, but of course, you can come up with concepts interesting to you all day long). This thread is about the latter; it’s about the question whether, if ChatGPT is producing a sentence like ‘Bob is a cat’, that sentence refers to Bob, who is a cat.

We know that the ‘engineering’ conception of understanding doesn’t answer the question, by example—a random answerer can pass any test you propose for this sort of ‘engineering-understanding’ (with nonzero probability), while manifestly not possessing any understanding. I.e., it’s possible to show all of those capabilities you have been pointing to as explicable only by ‘understanding’, yet not understand a damn thing.
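For a concrete, back-of-the-envelope version of that ‘nonzero probability’ point (the test size and format here are my own illustrative assumptions): on a battery of binary Winograd-style items answered uniformly at random, the probability of passing everything is tiny but strictly positive.

```python
# Probability that a uniform random guesser answers all N binary items correctly.
N = 100
p_pass = 0.5 ** N
print(p_pass)   # ~7.9e-31: astronomically small, but not zero
```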

And that’s the question you keep begging. Does it require understanding to solve these problems? If the argument in the OP is right, then no. If you continue to point to such examples as demonstrating understanding, you’re simply not engaging with that argument and hence, with the thread topic.

OK, let me address your OP directly. :wink:

The argument presented here is about the limitations of Large Language Models (LLMs) in understanding the semantic value of the tokens they manipulate. The argument is based on Newman’s objection, which asserts that knowledge of a structure W is equivalent to knowledge of the cardinality of its domain D, i.e., the number of distinguishable objects within it. The objection is that knowledge of structure alone is not sufficient to know anything substantial about the world.

The argument concludes that the same applies to LLMs. Suppose that ChatGPT has learned from its training data that there are three terms in D, and they stand in a relation C = {, }. To correctly interpret this structure, we need to find a model that mirrors the relevant part of the world. However, according to Newman’s objection, knowledge of the structure alone does not tell us anything substantial about the world. Therefore, it is unlikely that ChatGPT can identify the “proper” model of the structure it has learned.

Overall, the argument challenges the notion that LLMs can understand the world in a meaningful way, as they operate solely based on structure and lack the ability to capture the semantic value of the tokens they manipulate. However, it is worth noting that this argument is limited to the domain of metaphysics and does not address the practical uses and limitations of LLMs in other areas such as language processing, text generation, and machine translation.

Note: I did not write the above. It came from ChatGPT, the thing that supposedly doesn’t understand anything. I just bolded that last sentence.

I wonder why, though, since I’d usually expect that this would indicate it to somehow be challenging my point. But it’s just appropriate (although one could quibble about the use of ‘metaphysical’)—I don’t disagree about the utility of LLMs, and the ‘metaphysical’ question of whether they understand is exactly the topic of the thread, so it’s not strange that the argument should be limited to that.

That last bit is not challenging your point, it’s highlighting the distinction between a practical (i.e., “engineering”) view of understanding versus a metaphysical one. ChatGPT has somehow seemed to side with my “engineering” side of the debate.

What I think is really challenging your point about an absence of understanding is ChatGPT’s summary and counterpoint to your OP.

And still, all you’re saying is ‘well it really looks like it actually understands things to me’. But your personal incredulity simply isn’t relevant as a counterpoint to the argument in the OP. You can’t imagine how ChatGPT could come up with such seemingly apropos answers to your prompts; but the limits of your imagination need not be the limits of possibility. The argument in the OP explicitly implies that this is possible, because it assumes that it can perfectly replicate the structure of language as present in its language corpus, i.e. that it will only assert those properties of the terms it uses that a natural speaker of language would. This is a significant idealization—its capabilities are not anywhere near that level—but even so: this simply doesn’t imply that there is any understanding.

Over in the other thread, I’ve linked to a series of videos that, I think, really help understand where its capabilities come from. In brief, what ChatGPT knows is what words typically cluster around other words, the position of words in an input string, and which other words to pay attention to when evaluating a word. Its building blocks are ultimately quite simple, and it strikes me as completely clear that there’s nothing in there that depends, in any way, on what the words actually mean—indeed, if they mean anything: it would be just as happy completing strings of meaningless tokens (and just as good at it).
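As a hedged sketch of what those ‘quite simple’ building blocks look like (toy dimensions and random weights, not ChatGPT’s actual parameters), here is the core attention arithmetic; nothing in it depends on what the tokens mean.

```python
import numpy as np

# Toy random values standing in for learned parameters.
np.random.seed(0)
vocab = ["Alice", "is", "a", "cat"]           # could be any meaningless symbols
d = 8
emb = np.random.randn(len(vocab), d)          # token embeddings ("which words cluster together")
pos = np.random.randn(len(vocab), d) * 0.1    # positional information
x = emb + pos

Wq, Wk, Wv = (np.random.randn(d, d) for _ in range(3))
q, k, v = x @ Wq, x @ Wk, x @ Wv

scores = q @ k.T / np.sqrt(d)                 # how strongly each token attends to each other token
weights = np.exp(scores) / np.exp(scores).sum(-1, keepdims=True)   # row-wise softmax
out = weights @ v                             # each token's new representation: a weighted mix

print(weights.round(2))                       # the attention pattern is just arithmetic on vectors
```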

Sorry, due to HTML issues the previous statement got screwed up and should have appeared as below:

Suppose that ChatGPT has learned from its training data that there are three terms in D, and they stand in a relation C = {<Alice>, <Bob>}.

I am indeed saying that behaviour is all that matters for practical considerations like the future utility of a particular AI implementation. I’ll take a look at the videos when I have a chance.

Fine. You’re welcome to consider that all day long. It’s just not what this thread is about.

The title of the thread says that it’s about the claim that ChatGPT doesn’t “understand”. It seems reasonable to assume that this invites discussion about whether or not this is true in any meaningful sense of the term. I know you would prefer to keep the discussion within your preferred metaphysical territory, but I don’t think it’s reasonable to object to having other perspectives introduced, such as asking what a behavioural view of ChatGPT’s performance tells us about its future potential.

Back in the days of Hubert Dreyfus expounding on the alleged limitations of AI, the murmurings in the AI community were along the lines of “the job of engineers is to build things; the job of philosophers is to explain why they’re no good”. :wink: As I said way back in the other thread, the only consideration that really matters in the real world is which one has the strongest predictive value for future performance.

You are still approaching this in the wrong way, IMO. You are neglecting to consider the complexity of the models and what we have already discovered about them.

For example, in a multimodal neural network trained on images paired with text, researchers discovered the same kinds of ‘multi-modal neurons’ that have been found in the human brain. A ‘Halle Berry’ neuron will fire when an image of Halle Berry is shown – but it will also fire when the words ‘Halle Berry’ appear.

As these LLMs evolve with training, the interiors of their neural nets get incredibly complex. Far too complex for us to understand. But they are full of ‘circuits’ implementing complex algorithms. Even more interesting is the concept of ‘universality’, in which different neural nets with different architectures, different functions and different training data still seem to evolve some of the same structures seen in other, very different neural nets - including the human brain.

I suggest reading this:

https://distill.pub/2020/circuits/zoom-in/

This is from a researcher at OpenAI, talking about their attempts to trace the ‘circuits’ being built inside the LLMs - a circuit being a collection of parameters, weights and connections which together implement an algorithm like a curve detector, a floppy ear detector for dogs, whatever. We can only see the simple ones (their max trace so far was 50,000 neurons). Imagine the stuff going on inside a 175 billion parameter model.

Making any kind of sweeping conclusion about what an LLM can or can’t do, can or can’t experience, is WAY premature. But so far, we’re seeing an awful lot in those networks that is similar to the networks in our own brains.

This is the way I look at it: We built something based on how our own brains work. We trained that something with human knowledge, and as we did, human-like capabilities emerged without our writing them or even knowing they were emerging. We see in the network the same kinds of structures we see in the human brain, independently evolved. That seems potentially profound to me.

We already know that finite-state automata can lead to incredible complexity from iterating very simple rules. We also know that nature uses this as part of its evolutionary process. We are now iterating incredibly complex networks with massive data, and apparent intelligence is emerging in unpredicted ways.

Rather than say this can’t lead to ‘understanding’ because it’s a machine, a better takeaway might be that if the machine can do this, then maybe all we are is biological machines doing exactly the same thing.

Thank you for another excellent post, Sam. I will say, however, that a primary distinction between the impressive achievements of contemporary AI and human intelligence is the deficiency in the former of knowledge about the properties and behaviours of entities in the real world. No matter how capable its cognitive skills, an AI is still hampered by being essentially a “brain in a jar”. Thus the key to enhancing its true understanding is building its knowledge of real-world relationships, as suggested in my previous link to the “infants’ metaphysics” paper. IMHO, this is not a fundamental and certainly not an insoluble problem, but merely an incremental one in the advancement of AI.

I would slightly quibble with this phrasing. The corpus used for ChatGPT includes plenty of real-world information. I do agree that it needs richer data (audiovisual, etc.) as well as a focus on more basic “infant metaphysics,” but I would call those improvements ones of degree, not of kind. They are still just a series of tokens presented to the network. It is already “accessing” the real world, just not as effectively as it could.

Isn’t that exactly what a multi-modal neuron is doing? In a multi-modal LLM, if you say ‘Halle Berry’, essentially the AI knows what she looks like as well. Or if you show it a picture of Halle Berry, it knows that this is Halle Berry and can write about it.

That’s a very simple example. We don’t yet know the extent of such capability and how it allows an AI to ‘understand’ things. But there could be millions of multi-modal circuits that together provide a lot of real ‘understanding’ about things. And maybe that’s exactly how understanding works with us as well - we develop circuits and specialized neurons that act as the ‘glue’ between characteristics or different items that goes beyond mere cardinality or ordering and starts to imbue meaning to various things.

I agree with that. We are in very early days here. But the real question is whether the difference between a conscious brain and an AI is one of kind, degree, or scale.

In other words, is an AI destined to only ever be at best a simulation of a mind, with no real consciousness? That would be a difference in kind. Or is the AI just destined to be a limited ‘mind’ because it lacks key functions yet to be developed? Or is the difference between an AI and a human brain just one of scale, such that we have evolved more complex structures that give us ‘consciousness’, which an AI won’t reach until it approaches the scale of a human brain? If so, we’ll know soon, because we are getting close. GPT-4 may be there.

Absolutely! There is no quibble here, really. I think we’re in complete agreement. ChatGPT demonstrably includes plenty of real-world concepts, just not enough of them to pass all tests of understanding. The observation that these improvements are what I called “incremental” and that you called “of degree, not of kind” is profoundly true.

I’m not sure what the distinction you’re making here is. Isn’t all knowledge a type of abstract structure? All the sensations one has are just their brain’s abstraction of the world around them.

I’m having trouble understanding what the problem of intentionality is. A thought is a state of the brain. That brain state may or may not be associated with any sensations of the world by that brain.