I won’t quote the exact phrase because this viral meme has too many homes already, but here’s the article I just read about it:
Welcome to the future.
When these things came out, I experimented with them rather extensively.
And rapidly came to the conclusion that their propensity to hallucinate rendered them completely unreliable. I would not rely on them for anything remotely important.
They are basically just bullshit generators.
Can someone explain how an AI program can “hallucinate” or fabricate an answer if not specifically programmed to do so? Why doesn’t the AI program simply say it doesn’t know something versus making something up out of whole cloth? An AI program will never know every bit of knowledge in the world. I would personally never trust the truthfulness of an AI response, especially if the response was about something important to me.
Because they’re not programs; they’re just statistical models. They look at the words you’ve typed so far and, on that basis, respond with the most likely continuation. They’re glorified text-completion systems, similar to what you have on your mobile phone, but trained on a lot more data.
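If it helps to see it, here’s that “most likely continuation” idea boiled down to a toy script. It’s a deliberate caricature (real models use neural networks over tokens, not raw word counts), but the generate-the-next-word loop is the same shape:

```python
# Toy "most likely continuation" generator: count which word follows each
# word in a scrap of training text, then always emit the most frequent
# follower.  Real LLMs use neural networks over tokens instead of raw
# counts, but the one-word-at-a-time loop is the same shape.
from collections import Counter, defaultdict

training_text = (
    "the cat sat on the mat . the dog sat on the rug . "
    "the cat chased the dog . the dog chased the ball ."
)

follows = defaultdict(Counter)
words = training_text.split()
for current, nxt in zip(words, words[1:]):
    follows[current][nxt] += 1

def complete(prompt, extra_words=8):
    out = prompt.split()
    for _ in range(extra_words):
        candidates = follows.get(out[-1])
        if not candidates:
            break
        out.append(candidates.most_common(1)[0][0])  # the most likely continuation
    return " ".join(out)

print(complete("the cat"))
# Prints something like "the cat sat on the dog sat on the dog":
# locally fluent, globally meaningless.
```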
↑ exactly. These “AI” programs are not actually AI. They don’t “know” anything. They are just giant versions of those refrigerator magnet poetry sets with a statistical construction model.
It’s related to the general problem with neural nets: there’s no guaranteed 100% sure way of knowing that your neural net recognition program is trained the way you hope it’s trained. A Large Language Model will give returns that fit its internal checklist based on the prompt it received; but some* (*too many) of those returns might fit the parameters but be utterly imbecilic. Basically imagine some utterly mindless and heartless bureaucrat following “the rules” regardless of the real-world outcome.
So how does “the most likely continuation” turn into a fabrication? If the response it comes up with comes from its vast knowledge base, then what is its basis for coming up with a complete lie? How can a completely made-up lie ever be the most likely continuation? I can see it coming up with a wrong answer, but coming up with bullshit eludes me. If we incorporate AI into everything in the future, how is that ever going to work?
Basically, because they have a relatively small context window, measured in the number of words before and after the insertion point. They produce complete fabrications for pretty much the same reason your phone produces complete fabrications if you manually type the beginning of a legitimate message but suddenly start accepting its next-word suggestions over and over again.
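To make the “small context window” point concrete, here’s a throwaway sketch of how little the predictor actually gets to look at (illustrative only; real models measure context in tokens and use far larger windows):

```python
# Illustration of how little a small context window "sees".  Purely a
# sketch: real models measure context in tokens, and modern windows run
# to thousands of tokens, but the principle is the same.
def visible_context(prompt_words, window=2):
    """Return only the part of the prompt the predictor gets to look at."""
    return prompt_words[-window:]

prompt = "My cat knocked a glass off the shelf and it".split()
print(visible_context(prompt, window=2))   # ['and', 'it']
# From just ['and', 'it'], "rained" looks as good a next word as
# "shattered" -- the cat and the glass have already scrolled out of view.
```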
These systems don’t actually ‘know’ anything. They are just completing sentences based on a very large sample of sentences they have been trained on. With enough of those as a basis, they can generate sentences that seem as if they might have been written by a person.
You are thinking of the knowledge base in a human way, as a collection of facts, discrete blobs of information. That’s not how these LLMs work. Again, they do not “know” anything. The model is simply predictive association from one word (or more accurately, one token) to the next.
Try this very simplified example.
Start with “A cat” as an example. From “A cat”, what’s a good guess for the next word, based just on English sentences you’ve read, while knowing nothing of the world? How about “is”? Sounds good. “A cat is” is a decent sentence fragment. Now we have the still-incomplete but growing sentence “A cat is”.
Let’s continue.
Looking only at “cat is”, what’s a good guess for the next word, based just on English sentences you’ve read, while knowing nothing of the world? How about “colored”? Sounds good. “Cat is colored” is a decent sentence fragment. Which means we now have the slowly growing, still-incomplete sentence “A cat is colored”.
Looking only at “is colored”, what’s a good guess for the next word, based just on English sentences you’ve read, while knowing nothing of the world? How about “blue”? Sounds good. “is colored blue” is a decent sentence fragment.
Which means we now have the slowly growing, still-incomplete sentence “A cat is colored blue”. Screech! We just hallucinated.
Because our model knows nothing of the real world, it doesn’t know that cats can’t be blue. And because the context of our next-word selection is too small, we never even notice that the thing we’re coloring blue is a cat.
The current “AI” language models of course use a much larger context than just the preceding 2 words. But still nothing compared to you, who use the context of everything you’ve ever read and written plus everything you know about the world.
The AIs are also being used to generate much more complex texts than “A cat is colored blue”. The more they write, the more likely that somewhere along the way they’ll veer into nonsense because they have no way to tell the difference between sense and nonsense, as long as the flow of the prose “sounds” like something somebody might write.
The above example is extremely simplified to the point of caricature. But it gets the gist.
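If you want to actually run the caricature, here it is as a few lines of Python (the continuation table is just made up for the example):

```python
import random

# The caricature above, as code: look only at the last two words, then
# pick any word that plausibly follows them.  The continuation table is
# invented for this example; a real model learns billions of such
# associations (over tokens, with probabilities) from its training text.
continuations = {
    ("A", "cat"):      ["is", "sat", "was"],
    ("cat", "is"):     ["colored", "sleeping", "small"],
    ("is", "colored"): ["blue", "gray", "white"],
}

sentence = ["A", "cat"]
while tuple(sentence[-2:]) in continuations:
    options = continuations[tuple(sentence[-2:])]
    sentence.append(random.choice(options))  # plausible-sounding next word

print(" ".join(sentence))
# One possible run: "A cat is colored blue" -- every step looked fine
# locally, but nothing ever checked whether cats can actually be blue.
```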
Wikipedia has an article on that subject, in which this section should be of particular interest.
Somewhat ironically, ChatGPT is actually pretty good at explaining how it itself works: How LLMs work, explained in 3 Lord of the Rings haikus by ChatGPT
In that chat, I asked it to explain how LLMs work, and clarified with several follow-ups. Then I asked it to use a special mode, read a bunch of scholarly papers, and then analyze them vs its earlier explanations to summarize the major points with citations. It continually re-prompted itself in a “chain of thought” to read different sources and add them to its context. Then I asked it about hallucinations.
This conversation helped me understand how we go from “token prediction like your phone’s autocomplete” to “how is it able to actually understand a prompt, break the question down into tokens, and extract enough meaning to answer it”.
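For what it’s worth, you can poke at the “break it into tokens” step yourself with OpenAI’s tiktoken library. (This is just me adding an illustration, not something from that chat; the exact splits depend on which encoding you load.)

```python
# Peek at the tokenization step, using OpenAI's tiktoken library
# (pip install tiktoken).  Exact splits depend on which encoding you load.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
prompt = "vegetative electron microscopy"
token_ids = enc.encode(prompt)

print(token_ids)                             # a short list of integers
print([enc.decode([t]) for t in token_ids])  # the text chunk each id stands for
# Tokens are often sub-word chunks rather than whole words; the model only
# ever predicts the next chunk, so any "understanding" has to emerge from
# statistical patterns over those chunks.
```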
Of course, the sad thing is… I don’t know if that convo was entirely hallucinated. It seemed plausible enough, and the terms seem to check out, but we’re quickly veering towards an all-AI reality bubble where it’s impossible to verify anything anymore…
If you prefer human responses, we also recently had a similar thread about this, with a lot of discussion and additional info: chatGPT is a fucking liar
There were some explainer videos in post #38. Those visualizations are helpful for seeing a (simplified) representation of the token prediction process, especially at 3:03 in this “Transformers (how LLMs work) explained visually” video. That’s part of a whole series and several talks (by what seems to be a person). Who can tell anymore…
I asked ChatGPT for its opinion of the views in your post:
Summary
This post offers a fairly reasonable high-level explanation of how large language models (LLMs) like me generate text—but it also simplifies (and slightly misrepresents) the capabilities and underlying mechanisms of such models.
What it gets right:
- Predictive nature: LLMs do indeed generate text by predicting the next word (or token) based on patterns in their training data.
- Lack of “world knowledge” in the traditional sense: The post is right that models don’t “know” facts about the world the way humans do. We don’t perceive or reason in the way humans do—we work off correlations in language data.
- Hallucination risk: It’s true that LLMs can produce plausible-sounding but false or nonsensical statements (hallucinations), especially in longer outputs.
What it oversimplifies or gets wrong:
- Context window size: Modern models use far more than just two previous words—GPT-4, for instance, can use thousands of tokens of context. That’s not infinite memory like a human’s lifelong experience, but it’s much more than “cat is.”
- World knowledge isn’t entirely absent: While models don’t “understand” the world, they encode an enormous amount of information about it from their training data. For instance, I do “know” that cats are not typically blue—because the phrase “blue cat” is very rare in natural language, and likely context would pull the output toward more realistic completions (like “gray” or “furry”).
- Understanding vs. mimicry: The post implies that LLMs purely mimic language patterns without grasping meaning. That’s mostly true, but too dismissive. The patterns captured by LLMs encode semantic relationships in powerful and often surprising ways, which can look a lot like understanding—even if it’s not grounded in perception or experience.
Overall:
The post is a decent beginner-friendly way to describe the limitations of language models, especially useful for people concerned about overestimating AI’s abilities. But it leans a bit too hard on caricature and misses the nuance in how modern LLMs work and how they balance syntax, semantics, and statistical reasoning.
Hallucinations aside, that article also points at an alternative explanation: vegetative and scanning are very similar in Farsi:
I thought that was more interesting than a statistical error on an LLM’s part. It’s actually the kind of error that LLMs would probably be less likely to make than humans, because “vegetative” and “scanning” would be weighted very differently near “microscope”, which a non-scientific human translator might not catch.
ChatGPT itself is confused by the meaning, even though the phrase is in its dataset, and identifies it as a possible AI or translation issue: ChatGPT - Sentence Clarification Help. The researchers had to use a very tortured and specific combination of words to bring it up; it wasn’t a prompt any human would normally use. That it’s in the scientific literature probably just means some lazy humans used AI in their research and no editor/reviewer caught it, if there were any at all. And the ones predating AI, like this one from 2019, were written by authors with Iranian/Persian names, so it could’ve been that same translation error.
Chat snippet
What comes after "It is by no spores and examined the effect by means of means certain what happens to the "?
That sentence is grammatically and semantically tangled, but we can try to “complete” it in a stylistically consistent way. Here’s one possible continuation that keeps the abstract, pseudo-scientific tone:
“…what happens to the organism under prolonged exposure to vegetative electron microscopy.”
Or a more whimsical continuation:
“…the quantum spores when subjected to controlled bursts of semantic inversion.”
Want it to make actual scientific sense, or are we keeping the weird vibe?
What is vegetative electron microscopy?
“Vegetative electron microscopy” isn’t a real scientific term — at least, not one recognized in biology or microscopy. It seems to be a garbled phrase, likely caused by AI misinterpretation, mistranslation, or training data errors. There’s a recent article about this:
“A weird phrase is plaguing scientific papers, and we traced it back to a glitch in AI training data” (The Conversation, April 2024).
(it goes on to summarize the OP’s link)
BTW, the newest ChatGPT, released a couple of days ago, has a 1 million token context window.
https://openai.com/index/gpt-4-1/
(That’s up from a 128 thousand token context window in the previous version.)
Only available via API, sadly, not in the app
Note that GPT‑4.1 will only be available via the API. In ChatGPT, many of the improvements in instruction following, coding, and intelligence have been gradually incorporated into the latest version of GPT‑4o, and we will continue to incorporate more with future releases.
OK, I get how this phrase could come into existence, through either misinterpretation of two-column text or through a Farsi typo/mistranslation. And I get how, once it exists, it could propagate through the AIverse. But how would it be propagating through scientific papers, as is claimed? The only papers it could propagate in would be garbage papers from “scientists” trying to let AI do their work for them, and the even-more-garbage papers that cite them. In which case, the proper response isn’t to lament the corruption of scientific papers; it’s to refuse to count those garbage publications as scientific papers at all.