The title of the thread is “Why ChatGPT doesn’t understand”, indicating that a reason will be presented, and that this reason is the topic of the discussion. Furthermore, in the OP (which one generally assumes people read in addition to just the title), it is clarified that this thread isn’t concerned with whether arguments for understanding being present in ChatGPT succeed, but with the negative case as laid out there, by the presented argument. As that seemingly was too subtle a hint regarding the topic, I’ve patiently, if apparently fruitlessly, clarified it for you several times now.
If you want to discuss the topic of ChatGPT’s understanding from another perspective, you’re free to create a thread to do so, but please, stop hijacking this one.
And yet, I can draw innumerable such conclusions. LLMs can’t tell me whether a set of matrices, multiplied together in some order, can yield the zero matrix. LLMs can’t produce more than the initial few bits of an \Omega-number. An LLM can’t determine whether two context-free grammars produce the same sentences. Or determine whether a given strategy in Magic: The Gathering is a winning strategy. Or find a complete theory of its own function. And many more.
I know all these things without even having to take a look at the details of the LLM’s implementation, because they can be proven. Suppose you now tell me that some LLM produced, say, the first 10 bits of an \Omega-number. I would still know it couldn’t keep on producing them forever. Suppose you then came to me with, well, another 10 bits. Or 100. Or 1000: still, I would hold fast to my claim that eventually, the LLM won’t be able to correctly produce any more. Because I can prove it.
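For concreteness, here’s the reasoning behind the \Omega case, sketched with the standard definitions (the choice of prefix-free universal machine $U$ is arbitrary, and the only thing assumed about the LLM is that, with its decoding procedure fixed, it amounts to an algorithm):

$$\Omega_U \;=\; \sum_{p\,:\,U(p)\ \text{halts}} 2^{-|p|}$$

Chaitin showed that this number is algorithmically random and, in particular, not computable: no algorithm can go on emitting its binary digits correctly forever. So whatever finite prefix an LLM manages to reproduce, it must eventually output a wrong bit or fall silent, which is all my claim amounts to.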
The OP contains a purported proof that there can be no understanding in LLMs. The proof starts out with the assumption that the LLM has complete knowledge of the structure inherent in language. This will, automatically, enable it to pass any behavioral test with perfect accuracy: it will know exactly which words to use when, because there exists a model on which all the sentences it produces come out true. So none of these appeals to how much it looks like the system understands are going to hold water: that performance is exactly what I expect.
The trouble is that there is more than one model that makes all the sentences an LLM produces true, and that on the vast majority of these models, what we mean by a predicate such as ‘…is a cat’ won’t line up with what is picked out in that model by the respective relation. But there is no sense in which any of these models is preferred; so there is no determinate fact of the matter as to which one connects the words used by an LLM to the objects in the world. If there were some definite model, and we just didn’t know which one, then fine: ChatGPT would just use a language that has all of the vocabulary of English, but in which the words have a different meaning. But since there isn’t any particular model appropriate to its utterances, the words it uses don’t have any particular meaning.
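If it helps, the multiplicity can be made concrete via the standard permutation construction (just a sketch of the textbook argument; the model $M$ and permutation $\pi$ are arbitrary, and nothing specific to ChatGPT is assumed). Take any model $M = (D, I)$ on which all the sentences the LLM produces come out true, and any permutation $\pi$ of the domain $D$. Define a new model $M^\pi$ over the same domain by setting, for every $n$-place predicate $P$ and constant $c$,

$$I^\pi(P) = \{(\pi(a_1),\dots,\pi(a_n)) : (a_1,\dots,a_n) \in I(P)\}, \qquad I^\pi(c) = \pi(I(c)).$$

Since $\pi$ is then an isomorphism between $M$ and $M^\pi$, exactly the same sentences are true in both; but by choosing $\pi$ suitably, the extension of ‘…is a cat’ in $M^\pi$ need not contain a single cat. Nothing in the LLM’s linguistic behavior can tell the two models apart.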
Against this, to try and marshal arguments from various tests and tasks ChatGPT has fulfilled is simply to misunderstand the argument being offered: if the proof in the OP is right, then that performance is to be expected, but not indicative of any understanding whatever.
Well, it’s a tempting idea, but if that’s the case, then ‘abstract structure’ needs to be spelled out in some way different from what’s given in the OP. Because if you just tell me that the world has a particular abstract structure in that sense, then what you’re telling me is fully logically equivalent to telling me that there is a certain number of objects in the world. So if you believe that we know more than just how many things there are, then abstract structure won’t cut it.
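That equivalence is essentially Newman’s old observation; here’s a sketch, assuming ‘abstract structure’ means structure specified only up to isomorphism. To say ‘the world has abstract structure $W$’ is then to say that there exist relations on the domain $D$ of worldly objects arranged as $W$ prescribes, i.e. something of the form

$$\exists R_1 \dots \exists R_n\, \Phi_W(R_1,\dots,R_n).$$

But provided $\Phi_W$ is consistent at all, any domain $D$ of the right cardinality satisfies this: the required relations can simply be defined on $D$ by fiat, as arbitrary sets of tuples. So the only substantive constraint the claim places on the world is how many objects it contains.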
There have been other attempts to elucidate ‘structure’, most notably using Ramsey sentences, but they run into a different version of the same problem. I’m not aware of anybody having come up with a feasible solution that doesn’t in some way amount to a weakening of the claim that all we know is just abstract structure (Russell, for instance, arguably felt constrained to appeal to direct knowledge of at least some concrete structures to defend his theory).
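For reference, the Ramsey sentence of a theory $T(t_1,\dots,t_n;\,o_1,\dots,o_m)$, with theoretical terms $t_i$ and observational terms $o_j$, is obtained by existentially quantifying away the theoretical vocabulary:

$$T^R \;\equiv\; \exists X_1 \dots \exists X_n\, T(X_1,\dots,X_n;\,o_1,\dots,o_m).$$

The ‘different version of the same problem’ is, roughly, that $T^R$ comes out true whenever the theory’s observational consequences hold and the domain is large enough to supply extensions for the $X_i$; the exact statement depends on how the observational vocabulary is carved off, but the upshot is again that the theoretical content threatens to collapse into little more than a cardinality claim.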
But in what sense does ‘associated to’ imply ‘about’? Thunder is associated to lightning. If you hear thunder, you know there is lightning; thunder, in that sense, could be said to be ‘about’ lightning—it informs you that there’s lightning there, when you didn’t know that before. But to do so, you have to know how thunder is related to lightning. Somebody who’s never heard it might wonder what all that noise is about—they wouldn’t connect it to anything like huge electric discharges without some further knowledge. Thunder isn’t intrinsically about lightning.
But then, what is that knowledge but a thought about the fact that ‘thunder is associated to lightning’? Suppose you try to elucidate the intentionality of that thought by means of association: as we’ve seen, association isn’t enough; you need to know, i.e. have a thought that’s about, that association. So the whole thing just iterates, falling into what’s known as the homunculus regress (which I think is one of the most underappreciated problems in cognitive science, and which is why my avatar is a representation of it): trying to explain a capacity in terms of itself.