ChatGPT et al, the creators don't know how it works?

I was listening to the Lex Fridman podcast where he interviews Sam Altman, the OpenAI CEO, and I have heard mention of this before regarding machine learning, but apparently they don't really know how it works. I assume it is just a case of economics, where it is not economical or beneficial to step through each step, or millions of steps, to see how each decision point is weighed, because really only the outcome matters. But just in case I don't understand: is it really impossible to know, or just too expensive?

It is of course possible to know. It's only regular CPU instructions at the bottom.

I think it's just way too many steps for it to be in any way practical. And it's not enough to just know the steps; you also need to put them together and make sense of them.

I guess what I'm asking is: is it really just semi-informed trial and error development at this point? If it is impossible, or just not economically feasible, to know the outcome of your design, how could it not be?

Depends what you mean by “know how it works”.
Of course the creators know how the code operates and how the system was trained.
What they don't understand is how any specific input/response occurs. That lives in the emergent character of the trained system. You could take the same raw system and train it with different data. It would behave differently. Yet the underlying system is identical. Somewhere in the interplay of trained weighted connections, behaviour emerges. Nobody has much idea of how these things happen at all. You could follow some specific input and watch how the response is triggered, but you would learn almost nothing about how the next response would occur.
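A toy sketch of that point (my own illustration, nothing to do with OpenAI's actual code): the program is literally identical in both cases; only the numbers in the weight matrices differ, and that is where all of the behaviour lives.

```python
# Same tiny "architecture" run twice with different weights (stand-ins for
# two different trainings). The code is identical; the behaviour is not.
import numpy as np

def tiny_net(x, W1, W2):
    """Identical structure every time: linear -> ReLU -> linear."""
    hidden = np.maximum(0, W1 @ x)      # ReLU activation
    return W2 @ hidden

x = np.array([1.0, -0.5])                # one fixed input

rng_a = np.random.default_rng(0)         # weights "learned" from data set A (here: just seed 0)
rng_b = np.random.default_rng(1)         # weights "learned" from data set B (seed 1)
weights_a = (rng_a.normal(size=(3, 2)), rng_a.normal(size=(1, 3)))
weights_b = (rng_b.normal(size=(3, 2)), rng_b.normal(size=(1, 3)))

print(tiny_net(x, *weights_a))           # one behaviour
print(tiny_net(x, *weights_b))           # a different behaviour from identical code
```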

To make a (bad) analogy: we have an extraordinarily good understanding of the way chemistry operates at the QED level, yet we struggle to predict how molecules of more than a few dozen atoms will behave, let alone how human brains operate. But the laws of QED govern how it all works.

Here is a very long, very complicated breakdown of things that are known (and not known) about how neural nets work. It is worth at least a partial read.

If they mean that they don't know why a particular input generates a particular output, then that's correct. Non-trivial (i.e., large) neural networks are black boxes. You can unroll the neural network to see the math (a rough sense of the scale is sketched below), but it is too complex to really deduce what is happening, due to the sheer number of variables, especially since most of the variables have little to no effect. They certainly do not mean that they do not know how LLMs work in general.
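A crude way to get a feel for the scale problem, pretending the network is a plain dense stack of layers (a transformer isn't, so the numbers are purely illustrative): once you multiply the layers out, each output is a sum with one product-of-weights term per path through the hidden layers, and the path count explodes.

```python
# Path count through a dense network, ignoring the nonlinearities (which only
# make the unrolled expression messier, not simpler).
def paths_per_output(width, hidden_layers):
    return width ** hidden_layers

# A toy net 4 units wide with 2 hidden layers: 16 weight-product terms
# per input/output pair -- still easy to write down and stare at.
print(paths_per_output(4, 2))

# Borrow GPT-3-ish dimensions (about 12,288 wide, ~96 layers) as a stand-in:
big = paths_per_output(12_288, 95)
print(f"~10^{len(str(big)) - 1} terms")   # astronomically many
```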

And that’s really the conceptual leap here.

For the high-level layman's conceptual view: it isn't a "machine" in the sense of gears and shafts and belts, or even a traditional computer program where the relationship between input and output, or stimulus and response, is clear-cut and pre-determined by human decisions, only to be repeated by the machine.

The output of a GPT is much more like the way the weather is an output of sun & water & sky. It "just happens" as a result of thousands or millions of things jostling against one another in the presence of certain fixed rules, while the total system state evolves over time. It can be broadly predictable in the macro, but not at all in the micro.

And strict attribution, AKA establishing why specifically this output occurred given that input, is practically impossible, and nearly conceptually impossible as well.

LLMs are trained, which means that no one specifically tells ChatGPT what to do.

ChatGPT is a neural network. That means it measures a bunch of inputs (in ChatGPT's specific case the inputs are words and sentence structure; for a robot learning to walk, the inputs would be sensors throughout its body; etc.) and then constructs a model by connecting different measurements together and running calculations based on their results.

When you train a model, you run it on examples, see how wrong its answers are, and make small adjustments to the weights that reduce the error, then run it again and adjust again. Each adjustment is tiny and individually meaningless (a bit like a single mutation in evolution), but the changes that improve the model's fitness accumulate.

Repeat a few billion times, and you have a machine that can have a conversation with you. We know the principles it operates under (i.e., we understand neural networks and how they are trained), but no one knows exactly what a specific trained network has ended up doing inside.
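Here is a minimal sketch of that training loop on a toy problem (my own example, obviously not OpenAI's recipe): guess some weights, measure the error, nudge the weights so the error shrinks, and repeat.

```python
import random

data = [(x, 2 * x + 1) for x in range(-5, 6)]    # examples of the target rule y = 2x + 1
w, b = random.random(), random.random()           # start from random weights
lr = 0.01                                         # learning rate: how big each nudge is

for _ in range(2000):
    for x, y in data:
        err = (w * x + b) - y        # how wrong is the current model on this example?
        w -= lr * err * x            # gradient step: adjust w and b so err shrinks
        b -= lr * err

print(f"learned w={w:.3f}, b={b:.3f} (target was w=2, b=1)")
```

Nothing in that loop says what the weights mean; they simply end up at whatever values make the error small. Scale the same idea up to roughly 175 billion weights and trillions of "predict the next token" examples and you get the interpretability problem everyone in this thread is describing.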

See the obligatory CGP Grey video:

Yeah, it's a system composed of a really, really huge number of small parts. The small parts can be perfectly understood and predicted; the way in which they are connected together can also be understood. But the process of configuring them to do a thing is a lot like teaching: at the end, you have a system where all of the parts are doing exactly what they should. There are no mysteries there, but the overall activity is a complex combination of what all of the parts are doing together. When you try to look at the individual parts, you just see parts, doing their thing. It's an issue of scale: if you zoom out, you see the behaviour, but the parts are too small and numerous to make any sense of; if you zoom in close enough to see some of the parts, you are at best looking at a very, very tiny subset of them, and you can't see the rest of the action.

In this sense, it's very much like what would happen if you were able to examine individual neurons in a human brain. You would just observe them to be connected with each other and exchanging electrochemical signals in a way that could be modelled and understood, and that perhaps would look rather ordered and predictable. It would not look as if this mechanism could give rise to the weird and varied and complex set of behaviours of the brain as a whole. But when you zoom out and look at the brain as a whole, it just appears to be a big wobbly lump of fat, which also doesn't tell you how it works.

(I think in a sense the human brain might be slightly more amenable to this sort of examination because of the way different parts perform different sets of functions, having developed and evolved as an incremental process)

Add a few zeros to that. From the article I posted above:

If one looks at the longest path through ChatGPT, there are about 400 (core) layers involved—in some ways not a huge number. But there are millions of neurons—with a total of 175 billion connections and therefore 175 billion weights. And one thing to realize is that every time ChatGPT generates a new token, it has to do a calculation involving every single one of these weights. Implementationally these calculations can be somewhat organized “by layer” into highly parallel array operations that can conveniently be done on GPUs. But for each token that’s produced, there still have to be 175 billion calculations done (and in the end a bit more)—so that, yes, it’s not surprising that it can take a while to generate a long piece of text with ChatGPT.
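As a rough back-of-the-envelope check on that (using only the numbers quoted above): each layer is essentially a big matrix multiply, and a matrix multiply touches every entry of its weight matrix once, so the multiply-adds per token roughly equal the total weight count.

```python
total_weights = 175_000_000_000      # ~175 billion weights
core_layers   = 400                  # "about 400 (core) layers"

per_layer = total_weights / core_layers
print(f"~{per_layer:.2e} multiply-adds per layer per token")
print(f"~{total_weights:.2e} multiply-adds per token in total")
```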

I don't know how ChatGPT works, but I do know a bit about how neural networks work.

If you simulate a handful of neurons, the overall simulation is simple enough that you can fairly easily figure out what everything is doing. But the simulation also won’t do anything useful or interesting.

If you increase the number of neurons, the interconnections and interactions become so complex that the whole system is far beyond human understanding. You can easily focus on what an individual simulated neuron is doing, or what is happening along a particular simulated path, but the sheer number of interactions makes it impossible for a human to fully understand what the system is doing.

And it is beyond that point that interesting things start happening. The neural network starts to behave like a real brain. The simulated signals along simulated paths start to look a lot like actual biological brain waves. The neural network can also be "trained", by feeding it particular stimuli, after which it will respond according to how it has been trained.

But we don’t really know exactly what it’s doing, because the interactions are so complex.

A classic (probably apocryphal, but illustrative) example is the story of the neural network trained to recognize enemy tanks in photographs. They kept training the network with pictures of tanks, and sure enough the network would give a response when they showed it new pictures of tanks. But after some experimentation, they figured out that the network hadn't actually learned to recognize tanks. It had been trained with pictures that happened to have more shadows when a tank was in the picture, so it had accidentally learned to recognize shadows, not tanks. Oops.

But that is the inherent problem with neural networks. Even though the individual pieces are fairly simple, the way that they interconnect and interact with each other is so complex that we don't really know how they work. We never really know what the network is doing inside.

Wow. I hadn’t realized they’d gotten up to 175 gigaconnections quite yet. Color me impressed.

That also gets within a couple-few orders of magnitude of a human brain, depending on how you keep score.

That's GPT-3.5. The parameter count for GPT-4 hasn't been disclosed, but is thought to be at least a trillion. (Wild rumors run up to 10 or even 100 trillion, but that's unlikely.)

One method for understanding what’s going on in neural nets is called “Mechanistic Interpretability”, and it involves attempting to reverse engineer the parameters of a neural net to discover what circuits, algorithms or other structures have evolved.

It was this process that discovered that when a small (single-layer, roughly 50,000-parameter) neural net suddenly learned how to do modular addition in a generalized way, it had evolved its own algorithm built out of trig identities and discrete Fourier transforms.
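For flavour, here is the mathematical trick at the heart of that solution, written out directly (this is just the trig identity, not the network's actual learned weights): represent each number as a point on the unit circle, and addition mod p turns into rotation.

```python
import math

p = 113    # a prime modulus (the grokking work used 113, if memory serves)

def embed(a):
    """Map a residue to a point on the unit circle."""
    theta = 2 * math.pi * a / p
    return math.cos(theta), math.sin(theta)

def add_mod_p(a, b):
    (ca, sa), (cb, sb) = embed(a), embed(b)
    # angle-addition (trig) identities for cos(x+y) and sin(x+y)
    c = ca * cb - sa * sb
    s = sa * cb + ca * sb
    theta = math.atan2(s, c) % (2 * math.pi)
    return round(theta * p / (2 * math.pi)) % p

assert all(add_mod_p(a, b) == (a + b) % p for a in range(p) for b in range(p))
print("angle addition reproduces addition mod", p)
```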

This article specifically discusses ‘grokking’, or emergent capability:

Even without considering that, the number of calculations per word is substantially higher than the number of weights.

We know from various sources that an A100 GPU costs about $3/hr and that ChatGPT costs OpenAI about $0.0003/word. That’s 0.0001 GPU-hours/word, or 0.36 GPU-seconds (they run on 8 GPUs at a time, so the actual output rate is faster).

We don’t know all the inner details about their system, but an A100 gets 312 TFLOPS on FP16 data. That would come to 112 trillion operations per output word.

That's just a ballpark: the real number could be higher (if they use 8- or 4-bit math, for instance) or lower (if they aren't using the GPUs efficiently, or are bounded by memory, or something else), but in any case the number of operations is far higher than the number of weights.
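Spelling that arithmetic out with the same rough inputs:

```python
gpu_cost_per_hour = 3.00       # $/hr for an A100 (rough cloud price)
cost_per_word     = 0.0003     # $/word, rough estimate of OpenAI's cost
a100_fp16_flops   = 312e12     # A100 peak FP16 tensor throughput, FLOP/s

gpu_hours_per_word   = cost_per_word / gpu_cost_per_hour        # 0.0001
gpu_seconds_per_word = gpu_hours_per_word * 3600                # 0.36
ops_per_word         = gpu_seconds_per_word * a100_fp16_flops   # ~1.1e14

print(f"{gpu_seconds_per_word:.2f} GPU-seconds/word, ~{ops_per_word:.3g} operations/word")
```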

It’s an absolute, fundamental fact that no brain can fully understand a system more complicated than itself. In practice, no brain can understand a system that’s even anywhere near as complicated as itself. These AIs are at least approaching the same level of complexity as our own brains, and therefore, our brains can’t understand them.

Let’s face it—it is not that what ChatGPT does is so complex, more like much of what humans do turns out to be simple…

Are you basing this on something like the pigeonhole principle? (The number of states in a more complex system being greater than in a simpler system, so the simpler system cannot represent/match all of the states of the more complex one.)