The Miselucidation of Whack-a-Mole

I totally concur with what Wolfpup said.

One of the pieces missing from LLMs is recurrency. They are a one-shot brain - ask a question, and it starts at the top and runs through to the bottom, spits out the answer, and it’s done. It doesn’t self-check, because it can’t. Occasionally, because of randomness intentionally built into the output, it chooses a wrong word, which leads to another wrong word and eventually a full-blown hallucination. But it can’t go back and choose a different word, or even check.

In human terms, what the AI lacks is executive function. Instead of thinking of it as an alien, maybe think of it as a brilliant human with Tourette’s syndrome - it blurts out what comes to mind, with no self-control.

There are techniques to improve this, though. Few-shot prompting, for example. Also, the new generation of AIs apparently use Q-learning, which allows for a limited amount of back-and-forth tree traversal to examine the effects of an earlier parameter change and find better local optima. This will supposedly reduce ‘hallucination’ significantly.
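To make ‘few-shot prompting’ concrete: it just means packing a handful of worked examples into the prompt ahead of the real question, so the model can pattern-match on the format and the reasoning. A minimal sketch in Python (the commented-out ask_llm call is a placeholder for whatever API you actually use, not a real library function):

```python
def build_few_shot_prompt(examples, question):
    """Prepend worked Q/A examples so the model imitates their format and steps."""
    parts = [f"Q: {q}\nA: {a}" for q, a in examples]
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

examples = [
    ("A car accelerates from rest at 2 m/s^2. How fast is it going after 5 s?",
     "v = a * t = 2 * 5 = 10 m/s"),
    ("A car accelerates from rest at 3 m/s^2. How fast is it going after 4 s?",
     "v = a * t = 3 * 4 = 12 m/s"),
]
prompt = build_few_shot_prompt(
    examples, "A car accelerates from rest at 9.8 m/s^2. How fast is it going after 10 s?")
# answer = ask_llm(prompt)   # hypothetical model call
print(prompt)
```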

The other way we can get there is with agents. An LLM can have its output checked by another LLM before being given to humans. Since hallucination is random, it’s unlikely the two would hallucinate the same thing.
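One crude way to wire that up looks something like this (a sketch only; ask_llm is again a stand-in for a real model call, and the PASS/FAIL protocol is just something I made up for illustration):

```python
def answer_with_review(question, ask_llm, max_tries=3):
    """Draft an answer with one model call, have a second call act as a critic,
    and only return a draft that the critic accepts."""
    for _ in range(max_tries):
        draft = ask_llm(f"Answer this question:\n{question}")
        verdict = ask_llm(
            "You are a strict reviewer. Reply PASS if the answer below is correct "
            "and well supported, otherwise reply FAIL.\n\n"
            f"Question: {question}\nAnswer: {draft}"
        )
        if verdict.strip().upper().startswith("PASS"):
            return draft
    return "No answer passed review."
```

Since the two calls sample independently, a hallucinated ‘fact’ in the draft is unlikely to be hallucinated again by the reviewer.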

As for arithmetic, I believe ChatGPT is now flawless at adding two- and three-digit numbers, as its own addition engine emerged after a certain parameter/training threshold was reached. But for other math, it still tries to treat numbers as regular old tokens, and that just doesn’t work well. I assume that as these things get better, higher-level math structures will emerge. Or, you can just plug Wolfram Alpha into GPT-4, and its math will then be reliable. That’s available to end users now.
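The idea behind that kind of plugin is straightforward, even if the actual Wolfram Alpha integration is more elaborate: let the LLM produce the expression and let a deterministic evaluator do the number-crunching. A toy sketch (the ask_llm call in the comment is hypothetical; the evaluator itself is plain Python):

```python
import ast
import operator as op

# Tiny, safe arithmetic evaluator: the model only has to emit an expression
# string; the arithmetic itself is computed exactly, not 'predicted'.
OPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul,
       ast.Div: op.truediv, ast.Pow: op.pow, ast.USub: op.neg}

def evaluate(expr: str) -> float:
    def walk(node):
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.BinOp):
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp):
            return OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval").body)

# expr = ask_llm("Write the arithmetic for this word problem as a single expression")
print(evaluate("348 + 677"))       # 1025
print(evaluate("12 * (34 - 5)"))   # 348
```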

The “arithmetic capabilities” of sophisticated LLMs are an emergent property because, as any computational linguist can tell you, there is logic intrinsically embedded in language; consequently, if you feed enough arithmetic and algebra problems into an LLM it will ‘deduce’ the operational rules and be able to functionally emulate them, albeit with unexpected conceptual gaps. This does not, however, mean that there are cognitive processes that actually link mathematical operations to real-world situations or concepts; it is just answering prompts about physics questions with textbook-like responses. That it is able to do this is unsurprising because textbooks are one of the densest sources of validated information in a natural language format, and there are tens of thousands of texts covering all manner of subjects freely available online to be easily fed into a training set.

That more advanced versions of GPT and other LLMs can ‘ace’ standardized tests such as the Uniform Bar Exam, GRE, Medical Licensing Exam, college entrance exams, et cetera is equally unsurprising because there are copious examples of past tests and example problems available online free or at marginal cost, and one ‘intellectual’ task that LLMs would be expected to be really good at is coming up with the statistically most likely answer on multiple choice or short answer tests, because this is literally what it is doing any time it takes a prompt and provides a syntactically correct and stochastically most probable response. This does not mean that an LLM would be a good legal counsellor, or capable of correctly diagnosing medical conditions from ambiguous prompts or non-language examination, or would be a good business analyst, because it lacks any real-world context, and its entire scope of knowledge is limited to media input rather than direct interaction with the world.

Even with a greater scope of knowledge, LLMs will still make bizarre, nonsensical errors and have great lapses in critical judgement because they lack the fundamental, comprehensive model-building cognitive processes that a human (or any complex animal) brain has, which underlie direct analytical processes. It is fair to say that it has a degree of “intelligence” insofar as LLMs have an ability to respond to variable input with some degree of appropriate answer, but they aren’t what a computational neuroscientist would recognize as truly integrated, cognitive processes, nor (despite the fanciful claims of enthusiasts and even some researchers who should know better) an indication of some underlying consciousness. To the extent that we understand consciousness in human brains, there are not processes going on in an LLM that could even provide a basis for the ongoing layers of cognition, emotion, and perceptual inference that produce the emergent property of consciousness in humans (and likely many other animals along a spectrum of sapience and sentience).

Well, sorta. For the record, here was the response:

So, it did formulate the logic of the problem correctly in terms of velocity being acceleration times time in response to a very textbook-question-like prompt. For the stated acceleration at two significant figures it should have been 21.92 mph/s, rounded to 22 mph/s, but it is clear that the LLM used the more accurate 9.81 m/s², giving what should be 21.94 mph/s, rounded to 21.9 mph/s. (The units are…odd, but numerically consistent.) It’s a quibble, but when performing engineering and science calculations significant figures are important. It does get the algebraic transition in the second step correct, although this is a trivial rearrangement that is commonly done in textbooks and explanations. In the final step it correctly calculates the right value for time (again, with caveats about significant figures) but then peculiarly fails to stick the landing by incorrectly claiming to round the result to “two decimal places”, and then somehow rounding 22.79 to 22.6 seconds, which is “approximately correct” but actually an error. (The LLM does know how to use emoji like a true GenZ-er, though, so it at least has that down to a T.)
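For anyone who wants to check the numbers themselves, here is the whole quibble in a few lines of Python (the 0.44704 m/s-per-mph conversion factor is exact by definition):

```python
MPH_TO_MS = 0.44704            # 1 mph = 0.44704 m/s, exactly

# 1 g expressed in the response's odd-but-consistent mph/s units
print(9.8  / MPH_TO_MS)        # 21.92... -> 22 mph/s at two significant figures
print(9.81 / MPH_TO_MS)        # 21.94... -> 21.9 mph/s at three significant figures

# time to reach 500 mph at a constant 1 g
print(500 * MPH_TO_MS / 9.81)  # 22.79 s, which does not round to 22.6 s
```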

If this were a 5-point problem on a quiz I was grading, I’d knock off one point for the rounding error, and probably another for bad handling of sig figs. It isn’t badly wrong, but it is wrong enough to make clear that GPT-4 is doing some kind of rote stochastic ‘interpretation’ rather than actually comprehending rules, and it comes far enough off the correct answer that it isn’t any kind of issue with interpretation; it is just wrong, albeit close enough that if you didn’t check the work it would pass the smell test, unlike the original responses posted by @Whack-a-Mole, which were not only wrong but actually self-contradictory. Now, it would be pretty ‘easy’ (in conceptual terms; I don’t know about executing it functionally within the framework of GPT) to make an LLM that recognizes that it has a mathematical problem and dumps it off to a purpose-trained math-parsing subsystem to actually break the problem down in a way that is checked by an explicit algorithm rather than relying on a purely stochastic engine to divine and apply the rules correctly. I suspect the next evolution in generative interactive models will include some kind of corrections like this, because they are already used in robotics to prevent ‘impossible’ or undesirable kinematics. But LLMs on their own do not ‘understand’ a problem except in the context of providing a probabilistically appropriate response to a well-formulated prompt.

I’ll add an anecdote from recent experience to highlight how this implicit trust in LLMs and unvalidated generative AI is a genuine safety and reliability problem in the real world. I received an analysis of a heat transfer simulation which was intended to explain why the simulation varied significantly from the actual flight data it was supposed to model. This report had already presumably passed through several rounds of review and approval with the contractor and was provided to my group for ‘final’ review. I did what I thought would be a cursory review before assigning it to an actual subject matter expert (SME) just to make sure that the report was complete, and found that it was unexpectedly full of explanation delving deeply into statistical mechanics derivations and computational fluid dynamics even though the model used in analysis was a pretty straightforward two-mode networked node heat transfer model.

I’m not an expert in statistical mechanics, having had one course partially covering the essentials almost three decades ago, but the derivation and application didn’t make any sense to me, so I cracked a book and did some sleuthing, because I like to understand things that I sign off on. After spending the better part of an hour, I came to the realization that not only were there multiple bad assumptions and errors, but that in fact the entire analysis was complete gibberish concealed by a bunch of sophisticated-seeming nonsense probably cribbed from a Wikipedia page, which was good enough to cause everybody who had previously reviewed it to glaze their eyes and pass it up the chain without (I assume) asking questions or demanding clarity. When I finally got the chance to press the author for an answer, he first hemmed and then admitted to feeding a prompt with his desired conclusion into a chatbot and getting piecewise answers, which he inartfully stitched together into an authoritative-seeming pile of complete end-to-end bullshit that literally had nothing to do with actually validating the analysis or explaining the discrepancy between simulation and test data.

This is the danger of getting complacent in trusting LLM-based chatbots to answer factual questions. Getting the wrong answer on a message board has no real consequences (not that this is an excuse for not checking the result or acknowledging that it is an obvious error that should have been caught), but it normalizes the use of these tools for purposes for which they have not been validated and at which they are completely unreliable. When a monumental dunce like Michael Cohen uses Bing to generate citations that turn out to be fraudulent to support his early release, it’s a funny story that sounds like a rejected plotline from Arrested Development that somehow prophetically came true. But when supposedly intelligent people start relying on chatbots for authoritative(ly wrong) answers or to give them a persuasive rationalization (or at least a good hornswoggle) to justify some arbitrary claim, they can quickly become actual physical hazards to public safety and well-being, notwithstanding how such tools can be used by malicious actors to do intentional damage.

These are not just cute toys, or passing novelties. They are tools with the capacity to actually do genuine harm, and the more sophisticated they get, and the more normalized their use becomes in ways that they are not suited for, the more likely these hypothetical hazards are to being realized.

Stranger

This is probably not correct. Mechanistic interpretability studies of LLMs have found rich models and structures. I’ve mentioned the ‘Halle Berry neurons’ in people that fire to associate one thing or concept with another: think of Catwoman, and the Halle Berry neuron fires. Think about berries, and maybe it fires. These types of associative neurons help build world models. And LLMs have evolved them.

Here’s another paper looking at LLMs ability to model the world, and in particular they seem to understand time and space:

The abstract:

As another example, an experiment was done to teach an LLM Othello without giving it any information about the game at all except a collection of moves and their results. The LLM learned the rules, and learned how to play well. What they found through mechanistic interpretability was that it had built a representation of an Othello board as part of learning to solve the game. In other words, it built a world model for its tasks.

I don’t have the energy to respond point by point to your lengthy response, but a few salient comments …

You may profess to find it “unsurprising”, but in general these sorts of emergent properties were often surprising to the actual researchers, and are broadly considered to be reliable markers of intelligence.

Uncharacteristically for you, this is unmitigated bullshit, and there’s no need for scare quotes around the word “ace”. OpenAI made a point of the fact that GPT had never been exposed to any of the tests that were used in its performance scoring. That it may have seen similar tests is irrelevant, as it boggles the mind how a putative “token prediction” machine could use that information to improve its performance on completely different questions. Humans can benefit from seeing prior test materials because it familiarizes them with the methodology and allows them to practice and hone their skills with it. OpenAI was very explicit that GPT received no such specialized training.

No, I’ve already addressed that. The errors seem “bizarre” and “nonsensical” to us because the intelligence is not human. It’s alien intelligence with different properties, not necessarily inferior ones. In fact many of the most important intellectual capabilities are distinctly superior to most humans.

The point of this anecdote seems to be that LLM-based AIs (and all AIs, in general) can give incorrect answers or unwise responses and can potentially be abused. None of this is contentious, and none was disputed by me.

Wow! I was going to agree with you when I thought the statement was going to be “they are tools with the capacity to actually do a great many good things for humanity”, but then you went off the cliff on AI denialism. The standard argumentative approach of AI “skeptics” ever since the earliest formative days of the field has been to (a) try to gain a superficial understanding of some basic principle of how it works, (b) claim that it’s not “real” intelligence but just a computational “trick”, and (c) seek out an example of the AI’s failure at some task, and cite that as “proof” of point “b”.

This argumentative approach is becoming less and less tenable as the best LLM-based AIs become literally more intelligent than most people, as measured by all the tests we know how to devise, and yet we still see this fallacious argument all over the place, with all its fallacious assumptions.


Just for the record and full disclosure, this is the ChatGPT 3.5 session regarding the acceleration question. I’m not sure I see much to criticize, though it’s a very simple problem.

A vehicle is capable of an acceleration of 1g. If it can theoretically maintain this acceleration indefinitely, how long would it take it to reach 500 mph?

The acceleration due to gravity on Earth is approximately 9.8 meters per second squared, which is equivalent to 1g. When a vehicle accelerates at 1g, it experiences an acceleration of 9.8 m/s².

To calculate the time it takes for the vehicle to reach a certain speed, you can use the kinematic equation:

v = u+at

where:

v is the final velocity,
u is the initial velocity,
a is the acceleration,
t is the time.

Assuming the initial velocity (u) is 0 (starting from rest), and the acceleration (a) is 9.8 m/s², and the final velocity (v) is 500 mph, we need to convert 500 mph to meters per second:

[Bunch of equations omitted because I’m not about to convert them to MathJax]

Now solve for t:

t ≈ 22.8 s

Therefore, it would take approximately 22.8 seconds for the vehicle to reach 500 mph with a constant acceleration of 1g.

Pretty good for a “stochastic parrot”, no? I’ve seen parrots, and none of them could do that, let alone ace an SAT, AP, or GRE exam.

This has been an interesting discussion, but the bottom line for this message board is that LLMs aren’t reliable: it’s reassuring that this is a point of broad consensus. Insofar as this message board is concerned, they should be judged on whether their output is helpful: I say the false authoritativeness is typically the opposite.

No one is saying you may never use a search engine, or a bot, or Wolfram Alpha, or what your buddy told you, or anything else; the point is that one needs to be able to use one’s brain to figure out whether the premises, reasoning, and answer are all correct.

Someone is also vastly overstating the comfort level of a 1g acceleration.

That’s always true, and always will be. That line of reasoning, however, assumes that the individual brains are actually capable of adequately evaluating said premises, reasoning, and conclusion. Given the results of the 2016 presidential election, the insurrection of January 2021, and the likely outcome of 2024, I’d rather have the nation, if not the world, in control of well-vetted AIs with actual cognitive abilities than the low-information self-serving humans responsible for said fiascos. Wouldn’t you? Could it possibly be any worse?

Saying that “[m]echanistic interpretability studies of LLMs have found rich models and structures” is misleading at best. Mechanistic interpretability is an inference approach to identifying useful and controlling parameters of a heuristic neural network with the intent of assessing alignment, but the strong statement that researchers have “found rich models and structures” suggests that they have directly examined network coefficients and activation functions to map actual neural-like structures correlated to algorithms, which just isn’t true. That large neural networks with a vast dataset can make associative relationships between clusters of associated data items is hardly surprising, and in fact is completely necessary for the processing of non-trivial natural language. That this carries over to something approximating geographical associations is interesting but also not surprising, because whenever you mention Boston it is also somewhat likely to occur in conjunction with Massachusetts, Cambridge, the River Charles, Fenway Park, MIT, Beacon Hill, et cetera.

There is a lot of nonsense thrown about with regard to so-called “Halle Berry” (also “Grandmother”, “Jennifer Aniston”, “Bill Clinton”, et cetera) neurons even though no one has ever actually identified a single-neuron association or how that would even work functionally; what they have found through direct monitoring of individual neurons is that such associations tend to stimulate the same neuron repeatedly. What the principle does suggest is that particular objects or conditions can be correlated with the activation of very sparse arrays of neurons that form dedicated ad hoc circuits. To the extent that digital neural networks function as a reasonable analog to neurons in a brain, this would be useful because it would suggest that there is a path to reduced complexity by being able to narrow the activity involved in specific functions to discrete networks, thus reducing unnecessary computational activity. But again, researchers don’t really have a way to interrogate large neural networks to identify distinct functional patterns, and all associations are being made by inference.

The paper you link to, by the way, is authored by Max Tegmark (a physicist who is an enthusiast for the “mathematical universe” hypothesis and AI research at a high level) and Wes Gurnee, a graduate researcher at MIT in Operations Research who from his Institute webpage is interested in:

studying large-scale optimization, broadly applied. I am especially interested in understanding and reducing the risks posed by advanced artificial intelligence systems. Currently, this involves deeply understanding the learned representations and circuits of LLMs in the hope of maintaining oversight and diagnosing failures of increasingly capable models. In the past, I have done research on dynamical systems modeling, fair algorithms for congressional redistricting, and natural language processing.

So neither of them is an expert working on the details of implementation of large language models, and the paper is a pre-print that has not been subject to peer review by actual experts in the field. The actual details of the paper are just basic regression and sensitivity analysis and don’t discuss anything about the actual neural structure of the model except a brief three paragraphs in Section 5 and a brief discussion about “neural ablations” in Appendix B that really isn’t very enlightening. As an argument for “rich models and structures” it is pretty underwhelming, to say the least.

Before throwing around accusations of “bullshit” you may want to check yourself for your own fecal aromas. I did not say that GPT-3.5 was “exposed to any of the tests that were used in its performance scoring”; what I specifically said was:

I don’t know if sample tests were specifically used for training the LLM because OpenAI has been coy about specifics, but it seems quite reasonable that they would be because there is plenty of material available cheap or free on the internet for prepping for bar exams, and it is an excellent source of ‘legalistic’ jargon that you’d like a general-purpose LLM to be able to handle because the use of language in a legal context is different from colloquial usage.

Now you are just slurring me with the claim that I “went off a cliff with AI denialism” because I made some specific criticisms about the potential for misuse and provided a specific example of how the misuse could have real consequences, branding my points as a “fallacious argument…with all of its fallacious assumptions” instead of substantively addressing any of the actual arguments. For what it is worth (although you don’t seem interested in addressing what I’ve actually said), I think various types of AI can be beneficial and even crucial to solving certain types of problems, particularly those that are too expansive for a single individual to get their intellectual arms around and that are not amenable to explicit algorithmic approaches to “Big Data” problem solving. LLMs offer the potential to automate a lot of drudgerous, non-creative work in generating quasi-boilerplate text such as informational blurbs, or drafting simple legal briefs, or other things that are easily checked and do not require deep intellectual inference. And as they become more sophisticated and the ability to validate their reliability becomes more assured, those use cases and others can be expanded.

But I also think that there is an enormous potential for people to just lazily assume that they can perform more complex tasks, put heedless trust in the authenticity of a seemingly authoritative response, or overestimate the basic capability and factualness of these systems because of their low use cost and ability to cut out problematic and expensive human labor. And, like the example I posted, they can easily, even unintentionally, be used to assemble a line of nearly impenetrable but expert-sounding nonsense to rationalize pretty much any opinion. They can and will certainly be used for nefarious purposes by bad actors, just as still-image and video generative systems are being used now. And I think we ought to have a discussion about what kind of restrictions, informed consent, and functional guardrails such systems should have, insofar as we can impose controls at all.

Sorry if that hurts your feelings about how we should all be ambitiously optimistic about the unending rainbow of bountiful benefits for humanity, but it is difficult to point to any technological innovation (save, perhaps, for vaccination) that is an absolute benefit with essentially no downside, and AI in particular has numerous potential risks and adverse use cases that we ought to be concerned about now, because if it goes off the rails after we implement such systems broadly throughout our society and infrastructure, it isn’t as if we can just turn it all off.

Stranger

If I may recast this objection in simple terms, you seem to be saying that GPT may have been aided in scoring well on the bar exams because it (gasp!) learned legal terminology! Gee, maybe GPT was also aided in part in passing the medical exams because it understood medical terminology!

All that aside, most of the other tests were generic tests of critical thinking, problem solving skills, and general knowledge. That it aced them all is not something that can be easily rationalized away as trivial.

I consider you one of our most well-informed and valuable posters and was not trying to “slur” you. We first of all need to distinguish between the argument that GPT isn’t “real” intelligence and doesn’t actually know anything, and the argument that it could be misused to the detriment of society. You appear to be making both arguments. I vehemently disagree with the first, and have some sympathy with the second.

On the first argument, it frankly bugs me when intelligent people like you, obviously very knowledgeable in engineering disciplines, effectively align themselves with dimwitted insubstantial prognosticators like the late Hubert Dreyfus or the still-living but brain-dead John Searle in dismissing the extraordinary successes of modern AI technology. To imply that the almost miraculous emergent intelligence we’re seeing birthing in these neural nets is somehow frivolous is so unjustly dismissive that the word “denialism” seemed to fit. Yes, I’m enthusiastic and optimistic about the technology, but no, it’s not even close to being “conscious” or “sentient”. It’s achievable, but still a long way away.

I fully agree with all of that. Except that AI is just a subset of the issue; the larger issue is automation, which started to take over in the 60s. I’m sure that many of us, like me, have had conversations with friendly customer service reps who wanted to give you that discount or that extra perk and then, after a bunch of tapping on a keyboard, had to say, “I’m sorry, but the computer won’t let me”. What I’m trying to say here is that that ship has sailed. It sailed off into the sunset half a century ago.

I believe that AI is going to introduce incremental quantitative changes in how we conduct our lives, but not – for now, at least – fundamentally qualitative ones. And there are potentially many benefits along with the potential downsides. But as with old-style automation, it endows tremendous power to our institutions, both public and private. From my perspective, this is not an argument for or against powerful AI; it’s an argument as old as civilization itself: it’s an argument for responsible governance of our society.

I believe that AI will become the convenient way to answer that kind of factual question, just as calculators and spreadsheet software are currently the convenient way to do long division. But it’s not there yet. It’s still highly unreliable. It’s not currently appropriate to use it to answer FQ questions. Not unless you are already knowledgeable enough to vet its answer.

And right now it’s very dangerous, because it gives really plausible false answers. Like the legal briefs full of references to made-up precedents.

One might say that AI is uppity.
Not that one should.

Absolutely. I saw on another thread that someone was apparently asking ChatGPT about diagnosing a faulty electrical outlet! Seriously… you’re going to take advice from a chatbot about something as potentially lethal as house wiring!!!

These things are fascinating research projects, but I think unleashing them on the general public at their current stage of development is highly irresponsible.

This all reminds me of a software product my company used waaaay back in the early 80s. It was called Intellect, and you could ask it English language questions, and it would provide answers. Sounds great, but our users did not have enough knowledge to ask the right questions to get the specific answers they needed. You had to know the data – inside and out – to know if you were getting anything close to a correct answer. Fun times.

42, I assume?

Sometimes. :wink:

Asking the right question is key, of course, but as I said earlier, one of the strengths of GPT is the ability to have an interactive conversation with it in which it remembers the context of prior questions and responses, so that the information it provided before and your subsequent questions can iterate towards a genuinely productive and informative discussion.
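Mechanically there is nothing mysterious about that ‘memory’, by the way: the client just resends the entire running transcript with every request, and the model conditions on all of it. A rough sketch against the OpenAI Python client (parameter and model names are from memory and may have drifted; treat it as illustrative):

```python
from openai import OpenAI

client = OpenAI()   # assumes an OPENAI_API_KEY in the environment
history = [{"role": "system", "content": "You are a helpful physics tutor."}]

def chat(user_message: str) -> str:
    """Append the user's turn, resend the whole transcript, record the reply.
    The model's 'memory' of earlier turns is just this growing list."""
    history.append({"role": "user", "content": user_message})
    response = client.chat.completions.create(model="gpt-4", messages=history)
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply

# chat("How long does it take to reach 500 mph at a constant 1 g?")
# chat("Now redo that to three significant figures.")  # "that" resolves via the history
```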

That’s just one small example of the fact that GPT is literally at least 12 orders of magnitude more complex and powerful than its simple question-answering predecessors. That arguably places it closer to the scale of human cognition than traditional computational methods, but in any case it’s such a vast quantitative increase in capability that it creates fundamental qualitative differences and fundamentally new behaviours.

No, that wasn’t what I said, and I think you are framing my argument in a way that makes it seem as if I am trying to play down the achievement of an AI tool in being able to interpret and ‘ace’ (perform at the top end of the range), which is not the case. What I did say is:

Note that I did not say or imply that this capacity is “trivial”; what I did say is that the ability to apply knowledge that it was trained upon does not demonstrate that it is capable of functioning reliably in a real-world context with inputs that are more ambiguous than a test question and often do not have answers as definitive as a multiple choice result. That an LLM has the ability to take an arbitrary prompt and generally provide a syntactically correct and at least generally applicable response based upon a fairly general heuristic algorithm is quite impressive; it is not, however, an indication of deep knowledge that can be broadly applied or of any cognitive processes underlying anything like consciousness. The basic logic that, as anyone versed in computational linguistics will tell you, is embedded in language is sufficient for a sufficiently powerful and well-organized stochastic ‘machine’ to come to an approximately correct response. It does not show powerful “critical thinking” abilities, and in fact even the most advanced LLM-powered chatbots currently demonstrate a distinct lack of critical faculties, repeatedly making basic errors that a human—at least, one that is paying attention—would catch upon even a cursory read.

I didn’t make either of the arguments you ascribed to me, and in fact you are using rhetorical chicanery here to try to cast me as disingenuous, ignorant, or both, including declaring that I have aligned myself with “Hubert Dreyfus or the still-living but brain-dead John Searle” even though I haven’t referenced either of them or their theses, and continuing to brand me as a “denialist”, which is a pejorative term that brings up associations with anti-vaxxers and flat-earthers. You are not even clear about what that means, except that by disagreeing with your view of how accomplished current generative AI is I am somehow disputing established truths backed by broadly accepted evidence, which is not the case. This is The BBQ Pit so I won’t complain, but I’m also not interested in an interaction where I have to spend more of my effort defending my character or basic right to participate in the discussion than addressing actual claims, notwithstanding how you are erroneously restating my arguments and ascribing philosophical alignments to me.

What we are facing now with a wide and essentially unregulated adoption of LLMs isn’t just ‘automation’ in the traditional sense; automated systems of the past, both mechanical and computational, have used systems whose behavior was well-defined within specific parameters that could be readily measured and controlled. With LLMs, and AI in general, there has been little discussion among the people promoting it about any means to validate or place controls on these systems, and many argue that to do so would so limit the ‘progress’ of development and deny us all the manifest benefits that it would be a virtual crime to do so. This is despite the fact that in any significant interaction with such tools, the flaws in their ability to reason through even basic problems or to regulate themselves for the factuality of the information they provide become readily apparent.

And this is unsurprising because the fundamental goal of the developers currently isn’t to make these systems ‘truthful’ above all other objectives, or even to ensure that they are particularly reliable; it is to make them seem authoritative and useful, and to give users confidence in whatever results the machine produces. I’m sure that as development of these systems progresses they will generally become more reliable, because that is necessary for broad utilization, particularly in technical fields such as medicine or engineering, but without establishing some kind of standardized way of validating them or a framework for responsibility when the systems make an error, there will remain the perverse incentive to get a service to market soonest rather than make sure it is truly reliable. “Move fast and break things” is a cromulent philosophy when making non-safety-critical smartphone apps or the latest electronic gizmo, but it is a completely irresponsible way of deploying products or services that people may depend upon for crucial decisions involving safety or factuality, notwithstanding what deliberate bad actors can do with such a capability that has no safeguards in use.

Stranger

You really seem to like GPT. It is based on a transformer model, so it has multiple attention heads and other features that make it a lot more fun to play with than ELIZA in 1964, but is there really anything specific about the architecture you can point to that makes you so enthusiastic about it, and why?

Here’s a good article about examining hidden layers in LLMs to try to understand how they do in-context learning. They shouldn’t be able to, because model weights are not updated by the context. But it can be shown that you can give new problems to ChatGPT in its context, and it will learn how to solve similar new problems.

It turns out one of the possibilities is that LLMs create their own mini-LLMs inside them, then train them on the context. Wild.

https://www.alignmentforum.org/posts/N6WM6hs7RQMKDhYjB/a-mechanistic-interpretability-analysis-of-grokking

This has all kinds of data drawn from having “directly examined network coefficients and activation functions to map actual neural-like structures correlated to algorithms”.

In particular, have a look at the section on modular arithmetic, where they reverse-engineered the circuits that execute Fourier transforms and trig identities to solve modular addition - a capability that emerged. Before this, the model was trying to solve arithmetic like words - using tokens and a huge storage map of tokens and results. After a while, though, it not only evolved the algorithm for a general solution, it then pruned its own memory of all the old tokens it no longer needed.
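To make that concrete, the algorithm described in that write-up (for addition mod a prime, which was the task in the grokking experiments) can be written out by hand in a few lines of numpy. This is obviously a hand-coded illustration of the reverse-engineered algorithm, not the trained network itself:

```python
import numpy as np

p = 113  # the modulus used in the grokking experiments

def modadd_via_trig(a: int, b: int) -> int:
    """Compute (a + b) mod p using only cosines/sines and the angle-addition
    identity - the same trick the reverse-engineered circuits implement."""
    k = np.arange(1, p)                      # frequencies k = 1 .. p-1
    w = 2 * np.pi * k / p
    # angle addition: cos/sin of w*(a+b) from cos/sin of w*a and w*b
    cos_ab = np.cos(w * a) * np.cos(w * b) - np.sin(w * a) * np.sin(w * b)
    sin_ab = np.sin(w * a) * np.cos(w * b) + np.cos(w * a) * np.sin(w * b)
    # the 'logit' for each candidate c is sum_k cos(w*(a + b - c)),
    # which peaks exactly at c == (a + b) mod p
    c = np.arange(p)
    logits = cos_ab @ np.cos(np.outer(w, c)) + sin_ab @ np.sin(np.outer(w, c))
    return int(np.argmax(logits))

assert all(modadd_via_trig(a, b) == (a + b) % p
           for a, b in [(5, 7), (100, 50), (112, 112)])
```

(The trained model reportedly gets by with just a handful of key frequencies; using all of them here just keeps the toy demo unambiguous.)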

https://openai.com/research/language-models-can-explain-neurons-in-language-models (Using GPT-4 to examine the network of smaller LLMs.)

This research very specifically identifies individual neurons and their firing patterns.

Would you accept a quote by Ilya Sutskever, OpenAI’s chief scientist?

You can’t do next word prediction without having a model of the world, people, their hopes and dreams, relationships, etc. Sutskever’s opinion is that the text they are being trained on is actually a projection of the world, and by learning enough of it they build rich representations of the world - just like we do.