The next page in the book of AI evolution is here, powered by GPT-3.5, and I am very, nay, extremely impressed

No: the one who has understood the movie could just lie. There is no reliable way to gauge differences in understanding by observing behavior; hence, a behavioral concept of ‘understanding’ just fails to capture what it’s intended to.

But it is not the manner in which understanding was achieved that’s at issue here, but the question of whether it was achieved at all; and the behavioral approach fails to answer it.

By testing them. For instance, there is a group operation that cycles A->B->C->D->A. In fact, both the cyclic group and the Rubik’s group have one of those, so by itself it’s not distinguishing. But the Rubik’s group has five other operations that also produce distinct 4-cycles from the same starting point: A->E->F->G->A, A->H->I->J->A, etc. So as I explore the space of operators, I can discover things that distinguish between the structures.

And although it would be easier if I had free choice to examine the operations and how they permute the set, it’s probably not necessary. Just give me a sufficiently large corpus of operations and their effects, and I can sleuth my way to the answer. With a large enough data set, you can tell me less going in: perhaps the groups could be of any type or finite cardinality.
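To make that concrete, here’s a toy version of the test in code (my own sketch, using groups of order 8 rather than the cube group’s 43-quintillion elements): the cyclic group Z8 and the dihedral group D4 have the same number of elements, but simply tallying how many elements have each order already tells the two structures apart.

```python
# Toy version of "distinguish two groups of the same size by exploring their operations".
# Z8 (cyclic, order 8) vs D4 (symmetries of a square, also order 8): same cardinality,
# but the count of elements of each order differs, so the structures are distinguishable.
from itertools import product
from collections import Counter

def element_orders(elements, op, identity):
    orders = []
    for g in elements:
        x, k = g, 1
        while x != identity:
            x, k = op(x, g), k + 1
        orders.append(k)
    return Counter(orders)

# Z8: integers mod 8 under addition.
z8 = list(range(8))
print(element_orders(z8, lambda a, b: (a + b) % 8, 0))   # orders 1, 2, 4, 8 occur 1, 1, 2, 4 times

# D4: rotations/reflections of a square, encoded as (rotation mod 4, flipped?).
d4 = list(product(range(4), [0, 1]))
def d4_op(a, b):
    r1, f1 = a
    r2, f2 = b
    return ((r1 + (r2 if f1 == 0 else -r2)) % 4, f1 ^ f2)
print(element_orders(d4, d4_op, (0, 0)))                 # orders 1, 2, 4 occur 1, 5, 2 times
```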

The way I number my objects, and whatever special interpretation I put on them, are of course impossible to distinguish, but also useless.

Human language is not as logical as math, but nevertheless there are inferences we can make. For example, you cannot simply exchange “hates” for “loves” in the corpus of the English language and expect it to make sense. If Alice hates Bob, there will be other statements in different words: Alice loathes Bob, and so on. So we have to swap the meanings of those as well if we aren’t to contradict ourselves. But it gets harder, because if Alice fed Bob poison, we have to alter the meanings so that it now means Alice fed Bob a delicious cake. Or maybe “fed” is changed to mean “prevented the person from eating.” It just goes on and on.

I think you were arguing against this line of reasoning back here:

I’m not sure I can pinpoint exactly where this went wrong (though I guess I disagree with the “linear” part), but in any case I think it’s clearly false. If it were true, then we could swap any two words in the English language–across the entire corpus–and then with minimal effort (since N/N! is a tiny number), patch up the rest of the language for it to make sense again. But that just doesn’t happen; you obviously get a combinatorial explosion of required alterations, due to the relationships between relationships, and so on. It’s just clearly impossible to remap meanings without breaking absolutely everything.

That’s beside the point. I’m not saying these structures aren’t different, I’m saying there’s no sense in which there just is that structure, rather than this one. You have 43,252,003,274,489,856,000 objects. By virtue of that simple fact, you have relations there that instantiate the cyclic group, as well as the Rubik’s cube group. If you want to say that you instantiate the Rubik’s group, and not the cyclic group, you’ll have to point to something that distinguishes the appropriate relation. But, you can’t do it using structural knowledge: since then, you’ve just again given me a set of N objects. And, you can’t do it using non-structural knowledge, since you claim there’s no such thing. Hence, you can’t do it.

It’s not. The push-through construction gives you an explicit way, for any structure, to permute the elements of the domain and obtain an equivalent structure. Consider the way this is put here:

The Push-Through Construction. Let L be any signature, let ℳ be any L-structure with underlying domain M, and let π : M → N be any bijection. We can use π to induce another L-structure 𝒩 with underlying domain N, just by ‘pushing through’ the assignments in ℳ, i.e. by stipulating that s^𝒩 = π(s^ℳ) for each L-symbol s.

N may in particular just be a permutation of M, thus guaranteeing that if you say that the structural data of language pick out a certain structure ℳ over the things in the world (i.e. M), I can equally well say: no, it ‘fixes’ the structure 𝒩 over the permuted things N.
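In code, the construction really is that mechanical; here’s a minimal sketch (my own illustration, restricted to a structure with a single relation) of what ‘pushing through’ a bijection amounts to:

```python
# Push-through: given a structure (D, R) and a bijection pi out of D,
# the induced structure is just (pi[D], pi[R]), obtained by applying pi pointwise.

def push_through(domain, relation, pi):
    """relation: a set of tuples over domain; pi: a dict encoding a bijection on/out of domain."""
    new_domain = {pi[x] for x in domain}
    new_relation = {tuple(pi[x] for x in tup) for tup in relation}
    return new_domain, new_relation

# Example: a three-element domain with one binary relation, permuted by a 3-cycle.
D = {'a', 'b', 'c'}
R = {('a', 'b'), ('b', 'c')}
pi = {'a': 'b', 'b': 'c', 'c': 'a'}
print(push_through(D, R, pi))   # domain unchanged as a set; R becomes {('b','c'), ('c','a')}
```

By construction, the pushed-through structure satisfies exactly the same sentences as the original, which is the whole point.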

Sure, but that’s irrelevant to my argument for three reasons. First, it simply reflects the logical problem that in the general case it’s impossible to prove a negative of this kind, but fortunately we don’t need to. For instance, for everyone from first-graders to doctoral candidates, a series of tests is applied to see if they meet the criteria to pass through the gate. Do they have the knowledge, the skills, etc.? We test for the desired qualities. Testing for understanding is precisely in this category. If the person didn’t understand the movie, the right questions will immediately reveal this fact. If the person did understand it but chose to lie to us, then we’ll have bad information, but what’s the alternative?

Secondly, if proving a negative is going to be a requirement here, then the Turing test is useless as well. Surely any AI smart enough to pass the Turing test could well be smart enough to have a wicked sense of humour and decide to lie to us by deliberately sounding dumb. Follow this to its logical conclusion, and behavioural assessment is shown to be completely useless. Yet, as I just said, we use it all the time to assess things like knowledge, intelligence, and skills of all kinds.

Thirdly, when some information-processing entity, be it a human or an AI, is claimed to have certain capabilities, appropriate observation (i.e.- testing) for those capabilities is generally the only way to verify them. An analysis of its internal functioning is unlikely to be helpful because in the general case, a functional or behavioural assessment tells us the “what” of an automaton’s performance; what is it doing and, in particular, what is its level of skill? Subjectively, is it exhibiting intelligence and understanding?

The internal processes tell us the “how” of an automaton’s performance: how does it do what it does? In general, an examination of internal design tells us little or nothing about the “what”. You build an AI as best you can, subject it to the appropriate types of training, but ultimately the “what” can only be judged by running and testing it – i.e., doing a behavioural assessment.

Granted that in trivial cases an examination of internal design might reveal extreme limitations on the “what” that the machine seeks to accomplish; for example, the Eliza program produced its responses by retrieving a preset response template based on simple pattern matching. Its performance is correspondingly trivially limited and could not in any practical sense be significantly better using that model.

However, the operation of LLMs like ChatGPT is very far from trivial and their performance very far from predictable. Explaining the “how” in a simplistic one-liner – that it just “matches tokens to tokens based on the statistical regularity within a great corpus of tokens” – has essentially zero predictive value about the “what” that would be expected to emerge, and is therefore not a useful observation.

As I alluded to above, if presented with an information-processing entity with certain claimed capabilities, I know of only two ways of assessing it. One can observe its behaviours in response to inputs (i.e., treat it as a black box) or one can analyze its internal processes. But as I already suggested, in the general case the former is what yields information about the “what”, and the latter only technical information about the “how”. And those are the only two assessment methodologies I know of – the “what” and the “how”. If behaviour doesn’t yield adequate information about something like “understanding” (or equally abstract notions like “intelligence”), I’d sure be interested in your view of what does. Because it seems like you’re claiming that we have to conclude that things like intelligence and understanding are entirely un-assessable and unknowable.

These couple of sentences were not very clear and might be confusing, so let me rephrase:

An analysis of its internal functioning is unlikely to be helpful because in the general case, it is the functional or behavioural assessment that tells us the “what” of an automaton’s performance, whereas an internal process analysis only tells us the “how”. It’s the behaviour that answers crucial questions like what is it doing and, in particular, what is its level of skill? Subjectively, is it exhibiting intelligence and understanding?

@Half_Man_Half_Wit , you appear to be saying that if you ignore the structure, then you can’t see the structure. Which is true, but trivial: What if you don’t ignore the structure?

But in the Rubik’s cube group, those relations that instantiate the cyclic group don’t exist. And likewise, some of the relations in the Rubik’s cube group don’t exist in the cyclic group. And we can see which relations we actually have, because we don’t have to ignore the structure.

This isn’t getting us anywhere productive, I’m afraid. The point was that ChatGPT behaves as if it understands; the question is whether that implies that it actually understands. The answer that of course it does, because it behaves as if it understands, then just misses the point. Furthermore, there is clearly a difference between understanding and behavior, as it is possible to have differences in understanding without differences in behavior.

If all we had access to was behavior, then we might have to make do with that, but that’s not the case—we know both how ChatGPT works, and how computers as a whole work. As an analogy, computers are able to produce pseudorandom numbers that, in principle, satisfy every test for randomness for arbitrary lengths of sequences (i.e. they behave as if they could produce genuine randomness), but we know from purely a priori considerations that they aren’t able to generate genuinely random sequences.

I’m not sure where the disconnect happens, so let’s try and lay the argument out more carefully, perhaps you (and @Dr.Strangelove ) can then tell me where we part company. (Out of interest, do you think the Newman objection is making the same error as I do, or is the error just mine?)

So, suppose you give me a structure <R, D> consisting of a relation R defined on a domain (a set of distinct, but unspecified objects) D. You say that that’s what exists. Then, in particular, D exists[1]. Then, every subset of D likewise exists. Then, the powerset P(D), i.e. the set of all subsets of D, also exists. Then, every subset of P(D) exists. Consequently, every relation R’ over D exists, because all a relation over D is, is just a subset of the powerset P(D). Consequently, if <R, D> exists, every other structure <R’, D> exists just as well. But that already follows directly from the fact that D exists. But D, as a set of arbitrary, unspecified entities, is completely specified by its cardinality |D|. Consequently, all I know is that |D| objects exist.

So, if you want to single out <R, D> as somehow special, merely saying that ‘it exists’ doesn’t get you anywhere, because if it does, then so does every other structure <R’, D>, and <R, D> is just on the same footing with all the others. Thus, you’ll have to figure out some way to mark <R, D> as special in a way that any other <R’, D> isn’t. One way to do so is to tell me in virtue of what the relation R holds of its objects (i.e. specifying the relation’s intension). But this involves telling me the properties of the objects that make it so that they stand in the relation R. But this means you’ve given me non-structural knowledge of the objects in the domain D. If you’re claiming that we don’t have non-structural knowledge of the objects of D, that’s a no-go. Hence, you must appeal to some structural knowledge to single out <R, D>. But then, any attempt to mark <R, D> as special using structural knowledge must mean that you’re telling me, there exists some structure <X, Y>. But then, there exists… And so on.
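Just to see how little the bare existence claim pins down, here’s a toy enumeration (my own sketch) for a three-element domain: every subset of D is already ‘there’ as a candidate one-place relation, and every subset of D × D as a candidate two-place relation, the moment D is.

```python
# Once a bare domain D exists, every candidate relation over it exists too:
# the one-place relations are exactly the subsets of D (the elements of P(D)),
# and the two-place relations are the subsets of D x D.
from itertools import chain, combinations, product

D = {'a', 'b', 'c'}

def powerset(s):
    s = list(s)
    return [set(c) for c in chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

unary_relations = powerset(D)                    # 2^3 = 8 candidates for a one-place relation
binary_relations = powerset(set(product(D, D)))  # 2^9 = 512 candidates for a two-place relation

print(len(unary_relations), len(binary_relations))   # 8 512
```

Nothing about the bare set {a, b, c} privileges any one of these candidates over the others; that’s exactly what’s at issue.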

Where do you think the above goes wrong?



----------------------------------------------------------------------------
[1] Actually, there’s a way out of trouble here, known as ontic structural realism—the claim that we have only structural knowledge of things, because structure is all there is. There are only relations, no relata. This just runs into the problem of specifying what, exactly, is meant by a relation that exists, without any objects that actually stand to each other in that relation. I think this makes the problem harder, not easier; and at any rate, it’s not applicable to the present context, because we start with relations in which words stand.

Yes, all of those things exist, but only one of those things is the thing we’re studying at any given moment. Once you have the mathematical tools to do all of those things, you have the mathematical tools to show the existence of all sorts of other things, too. You say that all we can know is the cardinality of the domain-set, but heck, we can use those mathematical tools to show the existence of other sets, of different cardinality, so if that’s your standard, we can’t even know cardinality.

That should properly be ‘…a set of subsets of the powerset P(D).’.

Well, perhaps take it from the other end, then. Suppose that you know that |D| things exist. Then, somebody comes to you, telling you: I’ve discovered something! The structure <R, D> exists! That’ll leave you nonplussed: you already knew that, because you already knew that there are |D| things. Consequently, knowing that there is <R, D> tells you nothing new, doesn’t add any information whatsoever: all you know is still only that |D| things exist.

I’m not talking about what mathematical entities might exist, I’m talking about what we know about the things out there, in the world. That there exist |D| things doesn’t imply that any number of things greater than |D| exists. But it does imply that <R, D> exists, for any R; hence, there is no way to ‘study’ R, or to ‘pick out’ R, or to have R ‘really exist’ that doesn’t equally well apply to any other R.

Sure it does. You yourself used the argument that, if D exists, then so does the power set of D. The power set of any set is always larger than the original set.

Fine. But we’re taking the stuff of the world—the things out there—here to be the elements of D. That there’s a greater number of sets of those elements doesn’t imply that there’s a greater number of such elements. We can still say, ‘there are |D| things in the world’.

This is what I’ve been saying from the start. And, well, I certainly disagree that it doesn’t apply to the present context.

I do think there’s still a problem, though. You mention the issue of marking one set of relations as special. Let’s rewind a bit–I mentioned before the LLM that picked out the rules of Othello. The rules of Othello are indeed one particular set of relationships on an inferred board-state, and clearly account for the training games and will correctly predict moves on novel games. But there is another game, Othello-prime, that also works on the games fed to it, but for some games it gives different results. The rules booklet, as it were, contains the normal set of rules, plus an addendum with some special games with moves that would otherwise be illegal.

And of course there’s a very large number of these alternative games, since the training set was only an infinitesimal subset of the possible number of games, and we can have rulesets that give variations on all of these.

So how did the LLM pick out the actual game of Othello among all of these? If it’s not special in some way, there should have been an infinitesimal chance of happening on the right one. There was nothing else in the programming to favor it.

The answer, I’d suggest, relates to complexity. Kolmogorov complexity perhaps, or something else related to entropy. Some of the alternative rulesets could not even be stored in the LLM, for information reasons–it just doesn’t have enough bits to store them. But even among the rest, there seems to be a favoring of simpler rulesets, perhaps because it allows the NN weights to be more “relaxed” in some fashion, without having to spend information on long strings that can be “explained” with a simpler rule.
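If you want a crude, runnable proxy for that intuition (my own sketch; compressed length is only an upper bound on Kolmogorov complexity, but it illustrates the “fewer bits” point): every ad-hoc exception bolted onto a ruleset costs additional bits that a simpler ruleset doesn’t have to pay for.

```python
import zlib

# Hypothetical rule texts, purely for illustration.
othello_rules = (b"Players alternate placing discs; a legal move must bracket one or more "
                 b"opponent discs in a straight line, and all bracketed discs are flipped.")
othello_prime = (othello_rules
                 + b" Exception: in game 4812, move 31 may be played on an occupied square."
                 + b" Exception: in game 90211, black passes twice after move 17.")

def description_cost(rules: bytes) -> int:
    # Compressed size in bytes: a rough stand-in for the information needed to store the ruleset.
    return len(zlib.compress(rules, 9))

print(description_cost(othello_rules))   # smaller
print(description_cost(othello_prime))   # larger: the ad-hoc exceptions don't compress away
```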

Whether that’s actually the case could probably be determined experimentally. Perhaps starting with the observation that their LLM wasn’t perfect; it had something like a 0.1% error rate. So it actually implemented a very slight variation on the rules.

Probably true. I’m content to end my participation with this summary response to your summarization …

This is precisely the Chinese Room argument all over again. Searle tried to show that the man inside the room was just mindlessly manipulating symbols and clearly had no understanding of Chinese. The AI community largely took the behaviourist position that the overall system did understand Chinese even if some individual processing component (i.e.- the man inside) did not. It’s not a philosophical argument in which either side is ever going to score a clear victory, but it’s obvious which side I’m on. Searle’s argument is particularly ironic in light of Fodor’s contention that even human thought is a kind of symbol manipulation. I’ve never understood why philosophers seem so disproportionately inclined to be AI skeptics.

You seem hung up on this point. My answer is that you can assess understanding by observing the right kinds of behaviour.

Incidentally, I fail to see how one can accept the Turing test as a valid assessment of intelligence but reject my behaviour-based assessment of understanding. One can easily imagine an advanced AI that lies and mimics a poor AI, thus resulting in “differences in understanding without differences in behavior”.

Sometimes such an understanding is useful, but it can be deceptive (see: Chinese Room argument). I don’t know enough about the internals of ChatGPT to know whether this knowledge is sufficient to expose fundamental limitations. Maybe it is. What I do know is that “knowing how it works” has been simplistically and deceptively used by AI skeptics since forever to attempt to undermine AI, going all the way back to “computers can only do what they’re programmed to do”.

That’s merely a semantic argument that depends on picking the right definition of “random”. If one defines it as being a sequence that satisfies statistical tests for a random distribution, then it really is random. But the other definition is based on an assessment of the method by which the sequences are produced – if we agree that the process is truly random, then we must accept whatever it produces as being random by that definition, even if it’s an unlikely series of a hundred 1’s all in a row. Understanding the method is crucially important in applications like generating winning lottery numbers. In that case the argument about needing a priori knowledge of the internal mechanism is just a truism.

Computer-generated pseudorandom numbers do not satisfy all statistical tests for randomness. The good ones satisfy all of the common or easily-used tests. But there are guaranteed to be statistical tests which will correctly reveal them as nonrandom. Those tests might be devilishly complex, or impractically slow and inefficient, but they exist.
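To make this tangible with a deliberately weak generator (a toy sketch, not one of the devilishly complex tests a good PRNG would require): a classic power-of-two linear congruential generator has a least-significant bit that simply alternates, so a trivial test on that one bit already exposes it as nonrandom.

```python
# A classic LCG (textbook constants, modulus 2^31). Because the multiplier and increment
# are both odd, the low-order bit of successive outputs strictly alternates 1,0,1,0,...
# A trivial "statistical test" on that bit flags the stream as nonrandom immediately,
# even though the higher-order bits pass casual inspection.
def lcg(seed, n, a=1103515245, c=12345, m=2**31):
    x = seed
    for _ in range(n):
        x = (a * x + c) % m
        yield x

low_bits = [x & 1 for x in lcg(seed=42, n=20)]
print(low_bits)   # [1, 0, 1, 0, ...]
assert all(low_bits[i] != low_bits[i + 1] for i in range(len(low_bits) - 1))
```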

Well, I’m not sure what to tell you other than ‘no it doesn’t’, since you neglect to give an argument for your disagreement, but I’ll try anyway. The question at hand is whether, using the structure over the terms of a language implicit in the way these terms are used in large volumes of text, an AI can infer what those terms refer to—whether it can come to understand them. (It doesn’t play any role what sorts of things these terms refer to, whether those ultimately bottom out in fundamental entities or clusters of relations or Humean bundles of properties or concepts or processes or… I.e. the metaphysics we adopt doesn’t change the question.)

The counterpart to OSR here then would be to deny that there are terms that refer in any way. But there are: ‘dog’ is a term that refers to a dog. So the OSR move is just unintelligible here.

Besides, even the most brazen OSRist will acknowledge that there’s such a thing as a chair, and that it’s permissible to talk about this chair as if it were an object, because otherwise, all of everyday speech would just be unintelligible.

For Fodor, the symbols being manipulated already carried semantic content. In particular, he held that what a given symbol means was related to the causal antecedents of that symbol being ‘cued up’ in an agent. Getting the meaning of symbols from the manipulation of meaningless ones, as is claimed to be possible here, would be anathema to his position. That’s why Fodor championed the ‘robot reply’ to the Chinese Room: only if the system is causally connected to the environment in the right way—by, say, being instantiated in a robot body—do the symbols being manipulated actually acquire the semantics needed to act as representations.

But that’s explicitly question-begging. The Turing test works because intelligence plausibly is the ability to act intelligently, while whether understanding just boils down to the ability to ‘act understandingly’ is exactly the question we’re investigating. So proposing a method that from the outset presumes that the answer to this question is affirmative simply gets us nowhere.

No. There’s a mathematically exact notion of randomness, one of whose formulations is that none of the initial segments of a random sequence can be compressed significantly.
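For reference, one standard way of making that precise (my gloss, via the Levin–Schnorr theorem, with K the prefix-free Kolmogorov complexity):

```latex
% An infinite sequence X with bits x_1 x_2 x_3 ... is Martin-Löf random iff its
% initial segments are incompressible up to a fixed additive constant:
X \text{ is Martin-L\"of random} \iff \exists c \,\forall n : \; K(x_1 x_2 \dots x_n) \geq n - c
```

No output of a fixed finite program can have this property for unboundedly long segments, since the program itself is a short description of them.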

For any arbitrary length, I can give you a computer program that will output a random number of that length that passes all tests for randomness. All we know is that eventually, for any finite program, we must exhaust its ability to generate further randomness.

I feel like there is some kind of… bait-and-switch going on, or something. But I’ll be more generous and say that there’s probably some subtlety in the terms you’ve been using that I am missing. In any case, if you are willing to treat a cluster of relationships as equivalent to a given ‘relatum’, then I am willing to go along with that.

Yes, of course. A chair is that thing with a particular set of relationships to other things like floors and butts. Which themselves are a set of relationships to other things in a highly-connected graph.

The set of relationships that you label a chair is likely similar to the ones I label a chair. Not identical–for all I know, you have some connections I don’t, like a sense of nostalgia about your grandmother when you think about a rocking chair–but it’s close enough that we can label images as a chair or not, and describe a realistic scene with some humans and chairs and other things, and perhaps even make extrapolations like what a sci-fi antigravity chair might be. But I can never have access to your inner experience of a chair and match it against my own. And my suspicion is that such a comparison is nonsensical anyway.

The same is true of an AI like ChatGPT. The only thing I can do is talk to it about chairs and work out if the set of relationships that it calls a chair are similar to the ones that I call a chair. At least to the same degree of approximation and leeway that humans get.

Do actual, physical chairs exist? Only in a manner of speaking. I think that stable arrangements of matter exist. But a chair depends on some properties that are only relevant to humans–and, importantly, any intelligence able to understand humans. Which it does by incorporating the set of relationships that make up a chair into its own network of relationships. So while chairs do not exist on their own in isolation, neither do they require direct experience. Just a consistent network of relationships that can be connected to one’s existing network.

Perhaps this is impossible for extremely alien intelligences, like ones not based on matter or which experience the universe with a different dimensionality. But an LLM trained on a large corpus of human data does not fit in that category. It clearly understands human needs very well.

It seems to be getting better at math, IF you constrain it to specific tasks. There is a 5th grade math problem getting a lot of press right now for being difficult to solve:

“Klein read 30 pages of a book on Monday and one eighth of the book on Tuesday,” the question read.

“He completed the remaining quarter of the book on Wednesday. How many pages are there in the book?”

IMO what the question is really looking for is the student’s ability to turn the worded question into a solvable linear equation, so I did that part since it was really simple. I just wanted to see if ChatGPT could work the equation out and solve it:

Me: how would you solve 30 + (1/8)x + (1/4)x = x?

To solve the equation 30 + (1/8)x + (1/4)x = x, we need to isolate the variable x on one side of the equation.

We can start by combining the two terms that contain x on the right side of the equation:

30 + (1/8)x + (1/4)x = x

Combining the two terms with x, we get:

30 + (3/8)x = x

Next, we can isolate x on one side by subtracting (3/8)x from both sides:

30 + (3/8)x - (3/8)x = x - (3/8)x

Simplifying the left side, we get:

30 = (5/8)x

To isolate x, we can multiply both sides by the reciprocal of (5/8), which is (8/5):

30 * (8/5) = (5/8)x * (8/5)

Simplifying, we get: 48 = x

Therefore, the solution to the equation 30 + (1/8)x + (1/4)x = x is x = 48.

Pretty good. So of course I was curious if it could derive the equation from the word problem itself:

Me: Klein read 30 pages of a book on Monday and one eighth of the book on Tuesday. He completed the remaining quarter of the book on Wednesday. How many pages are there in the book?

Let’s first calculate the fraction of the book that Klein read on Tuesday. We know that he read one-eighth of the book, so if we let the total number of pages in the book be “x”, then he read:

1/8 x pages on Tuesday

On Monday, he read 30 pages, so the total number of pages he had left to read after Monday and Tuesday is:

x - 30 - (1/8)x

On Wednesday, he read the remaining quarter of the book, which is:

(1/4)x

We know that he completed the book on Wednesday, so the total number of pages in the book is equal to the number of pages he had left to read after Monday and Tuesday plus the number of pages he read on Wednesday. Therefore, we can set up an equation:

x - 30 - (1/8)x + (1/4)x = x

Simplifying this equation, we get:

(5/8)x - 30 = x

Subtracting (5/8)x from both sides, we get:

-30 = -(3/8)x

Multiplying both sides by -8/3, we get:

x = 80

Therefore, there are 80 pages in the book.

Whoops! It got confused with the equation setup, which should have been x - 30 - (1/8)x - (1/4)x = 0 if you approach it as counting down to zero pages remaining at the end.
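For what it’s worth, a quick sanity check (a sketch using sympy, not something ChatGPT produced) confirms that either correct setup gives the same answer:

```python
from sympy import symbols, Eq, solve

x = symbols('x')

# Pages read: 30 on Monday + x/8 on Tuesday + x/4 (the remaining quarter) on Wednesday = whole book
print(solve(Eq(30 + x/8 + x/4, x), x))       # [48]

# Equivalent "count down to zero pages remaining" form
print(solve(Eq(x - 30 - x/8 - x/4, 0), x))   # [48]
```

So the intended answer is 48 pages, not 80.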

Let’s back up here for a minute and take stock of where we are. We’ve been trying to find a handle on the issue of whether a piece of software understands language. That seems like a simple enough question. Isn’t it just a bit odd, then, that to answer it, you now find yourself having to reconsider the metaphysics of the entire universe? That whether a piece of software (or we, for that matter) actually understands text should determine that the world as such is, at the bottom, just relata-less relation, all the way down? Doesn’t that seem kind of unbalanced?

Personally, whenever I find myself wanting to overthrow centuries of metaphysical thought and question the foundations of the universe just to determine what some piece of code oodles of layers away from anything remotely fundamental does, I think it’s a good idea to take a long look in the mirror and ask myself if I haven’t just maybe taken a wrong turn somewhere. But anyhow.

That’s not what I meant. The issue is that no matter if the world is, at bottom, matter-points, events, relations or bundles of properties, at the relevant level, we clearly can think about it in terms of things standing in certain relations. And that’s what ChatGPT gets to work with: its input consists of sentences, and a sentence is a particular relation over its terms (recall sentence diagramming?). That’s where it gets its structure from: a set of terms with a particular relation to be abstracted away from the text inputs.

It doesn’t get its input in terms of pure relation. I mean, what would that even look like? What form does the relation embodied by a sentence take, if you take away the terms? So the OSR-manoeuvre just doesn’t apply: we don’t have to wonder whether there are, ultimately, deep down at some very, very deep bottom, just pure relations. The quotidian world is a world of things, and those things stand in relations, and the terms referring to those things stand in (roughly) conforming relations, and that’s that.

Consider ChatGPT getting its input instead in terms of colored pebbles variously arranged. The arrangement of the pebbles yields a structure, but it’s the pebbles that supply the structure by being arranged thus. You can’t claim that there’s only the relation the pebbles stand in, and the pebbles don’t exist, because without the pebbles, there wouldn’t be any relation. Whatever the pebbles may be, deep down, there is no arrangement of pebbles without them! It’s like you can say that Alice is taller than Bob by virtue of Alice being 1.78 m and Bob being 1.74 m, but saying that there’s some ‘tallerness’ without there being Alice and Bob is just meaningless.

Alice and Bob may, themselves, be just a particular relation over cells, which are just a particular relation over elementary particles, which are just a particular relation over spacetime events, and so on. And perhaps there’s just relations, that never bottom out into any relata (I don’t see how there could be, but I don’t have to understand everything). But that question is wholly immaterial to the fact that Alice and Bob are two concrete relata standing in some relation regarding their size.

‘There aren’t any relata’ only works (if it does) at some deep, deep bottom of the world, where we’re concerned with the fundamental substrate of reality. It doesn’t work at the everyday level of things being related every whichwise.

So. ChatGPT is given certain tokens, instantiating a certain relation. If this relation were enough to settle the things the tokens refer to, and the relation they stand in, then ChatGPT could learn to understand language from just that data. But the Newman argument says that there’s nothing there to fix any particular relation over the things in the world. Any given permutation of those things can have an appropriate structure ‘pushed through’.

There’s an explicit example in this book, which concerns the three terms ‘Ajax’, ‘Betty’, and ‘Chad’, as well as the one-place relation ‘is a cat’. The ‘text corpus’ of the example is given by the sentences ‘Ajax is a cat’, ‘Betty is a cat’, and ‘Chad is not a cat’.

So, in the above notation, we’d have our domain D = {Ajax, Betty, Chad}, and relation R = {<Ajax>, <Betty>}. (Of course, the ‘ordered tuple’ notation is redundant, but I’m using it for consistency.) If ChatGPT understands language, then the only model for that should be the one where ‘Ajax’ refers to Ajax, and ‘Betty’ refers to Betty, and Ajax and Betty are both cats, and ‘Chad’ refers to Chad, not a cat.

But using push-through, we can construct an equivalent model using a permutation h on D such that h(Ajax) = Betty, h(Betty) = Chad, and h(Chad) = Ajax. Then, we have the relation h(R) = {<Betty>, <Chad>}. This model makes the same sentences true as before. The name ‘Betty’, which in the original model refers to Betty (an element of the extension of R, and hence a cat), in the pushed-through model refers to Chad, which is in the extension of the relation h(R). But Chad is not a cat. So while the relation R does pick out cats, the relation h(R) fails to. But we (and ChatGPT) have no grounds on which to say that R is the right relation, and h(R) isn’t.
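To make the permutation argument fully concrete, here’s a minimal script (my own toy sketch, not from the book) that checks both models against the same three sentences:

```python
# Push-through on the Ajax/Betty/Chad example: both models make exactly the same
# sentences true, yet the second model's 'is a cat' extension contains Chad, who isn't a cat.

corpus = [('Ajax', True), ('Betty', True), ('Chad', False)]  # (name, "... is a cat" asserted?)

# Intended model: each name refers to its bearer; R is the set of actual cats.
ref = {'Ajax': 'Ajax', 'Betty': 'Betty', 'Chad': 'Chad'}
R = {'Ajax', 'Betty'}

# Push-through via the permutation h: Ajax -> Betty -> Chad -> Ajax.
h = {'Ajax': 'Betty', 'Betty': 'Chad', 'Chad': 'Ajax'}
ref_h = {name: h[obj] for name, obj in ref.items()}  # permuted reference assignment
R_h = {h[obj] for obj in R}                          # h(R) = {Betty, Chad}

def satisfies_corpus(reference, extension):
    return all((reference[name] in extension) == is_cat for name, is_cat in corpus)

print(satisfies_corpus(ref, R))      # True
print(satisfies_corpus(ref_h, R_h))  # True -- same truth values, 'wrong' reference and extension
```

Nothing in the corpus itself distinguishes the two models; that’s the indeterminacy.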

As it’s put in the book:

The issue generalises rapidly. By Pushing-Through, we can see that any name could be taken to refer to anything, that any one-place predicate could be taken to pick out any collection of things (provided only that there are enough of them), and similarly for all the other expressions of our language. We will stare into the abyss of radical referential indeterminacy, where every word refers equally to everything, which is just to say that nothing refers at all.

This is the world of ChatGPT. It can produce the sentence ‘Betty is a cat’, but it simply has no way to anchor it to the right structure that makes it mean that Betty is, in fact, a cat (and not, for instance, that Chad is whatever sort of thing is picked out by h(R)).

These AIs are illuminating something important about intelligence. Metaphysical thinkers hundreds of years ago didn’t know the brain was a neural net, didn’t know what a neural net was, had no examples of machines beating humans in a number of cognitive tasks, etc. Why we should think of them as still relevant isn’t clear, especially since their metaphysics didn’t really go anywhere or illuminate anything anyway.

Everyone is talking about how the AIs are not like brains because they are ‘software’ or ‘mechanistic’ or whatever. They are just probability machines, so the explanation goes. No one seems to consider that maybe that’s all we are, too. Especially our ‘system 1’ processes.

Again, focusing on the statistical word choices in the transformer ignores the construction of the list of probabilities for the next word in the first place. That happens somewhere in the hundreds of billions of connections of a trained neural net, not algorithms written by humans. We understand very little of how it’s doing what it’s doing. We know that capabilities emerge, and are not designed. We have seen capabilities emerge that we had no idea the LLM would be able to do at all.

That kind of uncertainty is not compatible with definitive statements regarding what the LLMs lack that consciousness requires.

That was specifically in the context of @Dr.Strangelove proposing that a speculative metaphysical picture known as ontic structural realism, in which the world fundamentally consists of relations that somehow exist without any things standing to each other in those relations, was necessary to elucidate how an AI like ChatGPT could come to actually understand words. That just strikes me as too far-reaching a conclusion to be warranted.

And yet, there are countless statements that can be made about the limits of LLMs despite this uncertainty. They can’t solve the halting problem. They can’t produce truly random sequences. They can’t decide whether a Diophantine equation has a solution over the integers. And while I acknowledge that it’s highly speculative and just about the opposite of well accepted, I have at least a theory of how consciousness works under which it involves a problem of just this kind. Hence, that there are certain surprising capacities that emerge in LLMs does not in principle prevent us from drawing conclusions about their being conscious.