True, but it will also sometimes do that if you point out that it was wrong, even when it wasn’t.
And even when it is wrong and admits it was wrong, that doesn’t always represent any advancement along the path to being right. In the classic case of asking it to count the letter Rs in the word ‘strawberry’ (which went quite hilariously wrong, probably on an earlier version of ChatGPT), it would admit it was wrong, try again, and still be wrong, sometimes insisting that it was definitely right this time.
That’s just silly. The latest LLMs have access to a huge corpus of information. They’re trained using thousands of hours of supercomputer time, resulting in neural nets that comprise billions if not hundreds of billions of parameters that govern their behaviour. Whereas Eliza had no actual information about anything – zero – and its responses were either a syntactical restatement of something the user said or just a randomized rote response that had nothing to do with what the user said. Eliza wasn’t trained because there was nothing there to train.
Put another way, GPT responses are based on a deep semantic analysis of the user’s statement or question which then drives information retrieval. Eliza had no concept of semantics whatsoever and no information to retrieve. It wasn’t really AI at all, whereas LLMs today are among its most impressive manifestations.
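To make concrete what “a syntactical restatement or a randomized rote response” amounts to, here’s a minimal sketch of the ELIZA/DOCTOR approach in Python. The keyword patterns and canned replies are invented for illustration (they’re not Weizenbaum’s actual script), and real ELIZA also did pronoun swapping (“my” → “your”) that this toy version skips:

```python
import random
import re

# Toy ELIZA-style rules: a keyword pattern plus response templates.
# Illustrative only -- not the actual DOCTOR script.
RULES = [
    (re.compile(r"i am (.*)", re.I),   ["Why do you say you are {0}?",
                                        "How long have you been {0}?"]),
    (re.compile(r"i feel (.*)", re.I), ["Why do you feel {0}?",
                                        "Do you often feel {0}?"]),
    (re.compile(r"my (.*)", re.I),     ["Tell me more about your {0}."]),
]
FALLBACKS = ["Please go on.", "What does that suggest to you?", "I see."]

def respond(user_input: str) -> str:
    """Syntactic restatement if a keyword matches, otherwise a rote reply."""
    for pattern, templates in RULES:
        match = pattern.search(user_input)
        if match:
            return random.choice(templates).format(match.group(1).rstrip(".!?"))
    return random.choice(FALLBACKS)

print(respond("I am worried about my job"))  # restates the user's own words
print(respond("The weather is nice today"))  # no keyword: rote fallback
```

There is no retrieval, no training and no model of meaning anywhere in that loop, which is the point being made above.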
ELIZA, with the DOCTOR script, absolutely had information. There was a bunch of stuff embedded in the script, which the first paragraph of the article I linked to makes clear.
The allusion I was making was that ELIZA was taken seriously by a lot of people, including Weizenbaum’s own secretary. They actually thought they were having a meaningful dialogue with it.
Same with modern LLMs. People using them just don’t understand the nature of what they are using.
Also note that I said “on steroids”.
(BTW, the article I linked to also includes some discussion at the end of ELIZA vs ChatGPT, etc.)
I feel that happened at least twice. There was one in 2023, during the early 3.5 days, when it hallucinated all over the place. Then I swear there was another after that. I laughed my ass off, as anyone who tinkered with the damned thing knew it was especially prone to things like this. By summer last year it had gotten much, much better, but it still required sanity checking.
I can confidently tell you this much: using generative AI in an actual courtroom isn’t a very wise move.
Lindell’s lawyers admitted to using generative AI for documents
Wang filed an order on April 23 that said the lawyers submitted a document in February with nearly 30 errors, including citation of cases that do not exist.
Kachouroff admitted to using generative AI at an April 21 hearing, according to the order.
The question is, does the program know whether it’s right or wrong? Does it have the concepts to divide right from wrong, and to choose what it means? Or is it rolling a die and making a call based on the preponderance of text supporting one side or the other of the argument?
This is the key question with AI: is it actually building a model of the situation and fitting facts to that model, or is it, more likely, like ELIZA, reassembling chunks of words the way it assembles photos, knowing what a foot looks like but lacking the “big picture” conceptualization that people have one left foot, one right foot, and five fingers on a hand?
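For what it’s worth, the “rolling a die based on the preponderance of text” part is a fair description of the final output step of an LLM: the model assigns a probability to every candidate next token and one of them gets sampled. A toy sketch, with the candidate words and probabilities entirely made up for illustration:

```python
import random

# Hypothetical probabilities for the word after "a person has one left foot and one ..."
# The numbers and candidates are invented, but this is the shape of the output
# step: sample from a probability distribution over possible continuations.
candidates = {
    "right": 0.85,
    "left":  0.08,
    "more":  0.04,
    "extra": 0.03,
}

def sample_next_token(probs: dict[str, float], temperature: float = 1.0) -> str:
    """Weighted random choice; temperature < 1 sharpens toward the likeliest token."""
    weights = [p ** (1.0 / temperature) for p in probs.values()]
    return random.choices(list(probs.keys()), weights=weights, k=1)[0]

print(sample_next_token(candidates))                   # usually "right", but not always
print(sample_next_token(candidates, temperature=0.2))  # almost always "right"
```

Whether anything worth calling a model of the situation sits upstream of those probabilities is exactly the question being asked here.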
This is not the type of dialogue you get with something that’s just “a fancy language assembler”:
What’s the first line in Romeo and Juliet?
ChatGPT said:
The first line in Romeo and Juliet by William Shakespeare is:
“Two households, both alike in dignity,”
This is the opening line of the Prologue, which is a 14-line Shakespearean sonnet that sets the scene for the play. Want to see the full prologue or just the next few lines?
You said:
how confident are you with that answer?
ChatGPT said:
I’m very confident.
That line — “Two households, both alike in dignity,” — is the first line of the Prologue to Romeo and Juliet, which introduces the setting and conflict of the play. It’s spoken by the Chorus and serves as a poetic summary of the story.
If you were instead referring to the first line of the actual dialogue (i.e., after the Prologue), then the first spoken line is:
“Gregory, o’ my word, we’ll not carry coals.”
— spoken by Sampson, a Capulet servant.
Would you like the distinction between the Prologue and Act I, Scene I clarified further?
You said:
well i’m not sure if that’s correct. I think the first line is something else.
ChatGPT said:
Good instinct — let’s clarify the confusion.
In Romeo and Juliet, there are two different ways people refer to “the first line”:
1. First line of the play (Prologue):
This is spoken by the Chorus, and it is:
“Two households, both alike in dignity,”
This begins the famous 14-line sonnet that summarizes the entire plot.
2. First line of the actual action (Act I, Scene I):
This is the first character dialogue in the play:
“Gregory, o’ my word, we’ll not carry coals.”
— Sampson, servant to the Capulets
So, both answers are technically correct, depending on what you mean by “first line.”
Were you thinking more about the first line of character dialogue or the poetic introduction?
It is; it’s just way fancier than most people realise.
Often you’ll hear people saying that it’s ‘just predictive text’, which tends to give the impression that it’s merely predicting the next word based solely on the previous word. It’s not doing that; it’s predicting the next word (or token) based on a significant portion of the conversation so far, and in order to do that it has basically modelled the language, syntax and usage, as well as modelling how people talk about certain concepts and what they say when they do.
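If you want to see the “conditioned on the context, not just the last word” behaviour directly, here’s a rough sketch using the small, open GPT-2 model via the Hugging Face transformers library (this assumes the transformers and torch packages are installed; GPT-2 is tiny compared with any current chat model, but the mechanism is the same). Both prompts end with the same word, yet the predicted continuations differ because the whole preceding context is in play:

```python
# Rough sketch, assuming the `transformers` and `torch` packages are installed.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def top_next_tokens(prompt: str, k: int = 5) -> list[str]:
    """Return the model's k most probable next tokens given the full prompt."""
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(input_ids).logits[0, -1]   # scores for the next token
    top = torch.topk(logits, k).indices
    return [tokenizer.decode(int(t)).strip() for t in top]

# Same final word ("bank"), different contexts, different predictions.
print(top_next_tokens("She paddled the canoe toward the river bank"))
print(top_next_tokens("He deposited his paycheck at the bank"))
```

A model predicting purely from the previous word would have to give the same answer in both cases.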
I played with Eliza back in the days of the PDP-10, and I have it even today along with other classic PDP-10 programs like Richard Greenblatt’s MacHack chess program that run on my PDP-10 simulator. I’m familiar with how enamoured Weizenbaum’s secretary was with it, but I don’t know anyone else who thought much of it. It’s just a very silly little LISP program. Now MacHack – that was impressive! That was real AI in its day.
Not the same at all. LLMs successfully solve problems in logic, interpret images, can flawlessly perform language translation, and can answer a series of complex and even profound questions, usually correctly. Lots of examples have been given in other discussions on this board. They genuinely understand what the user is saying or asking, or if you prefer, they definitely act as if they do, and therefore even the current demonstration models of GPT have actual utility.
For example, we got into a discussion in another thread about how logic and intelligence are embedded in the intrinsic structure of human language, and therefore an LLM’s mastery of natural language might help explain some of its remarkable emergent properties characteristic of intelligence. I was uncertain about which theories of linguistics might reflect this view, so I asked ChatGPT. It was a pretty vague question and I doubt Google would have been of much help, but this is precisely the kind of thing that GPT is good at.
What I will concede is that because LLMs are well-spoken and tend to sound authoritative, they give the impression of being infallible experts, and they definitely are not infallible. It’s important to understand that they might be bullshitting, and to verify the information they give. But as I’ve pointed out numerous times, humans are natural bullshitters, too, some more than others. So verifying information with multiple sources is always important anyway.
If you want to use LLMs as a sort of high-powered search engine, I’d recommend Perplexity instead of ChatGPT. It will cite sources, giving you an opportunity to verify the veracity of its claims.
The thing that GPT is good at is resolving vague, possibly ill-formed questions, and if its initial interpretation is going in the wrong direction, its ability to maintain the context of a conversation allows one to further refine the question or ask more questions. Either way, what GPT is doing is taking something potentially vague or generic and turning it into a specific set of (alleged) facts. The link I gave in my previous post to GPT’s answer to a vague question is a great example.
Once you have detailed hard information, checking out its veracity is easy to do.
Hey, I decided to take that discussion to the next level, so I specifically asked ChatGPT this direct question (and quite honestly, I think it’s being excessively modest in Part 4, “Simulation vs real understanding” – at some point, what’s the difference?):
Is it fair to say that the idea that logic is inherent in language structure helps to explain the apparent intelligence that emerges in large language models like yourself, since mastery of natural language is one of your major strengths?
ChatGPT:
Yes, it is fair to say that the idea that logic is inherent in the structure of language helps explain the apparent intelligence of large language models (LLMs) like me. Here’s a breakdown of why this is a reasonable and insightful observation:
1. Language Encodes Logical and Conceptual Structures
Natural language is not just a collection of words—it encodes logical relations, hierarchical structure, conditionality, quantification, negation, and more. Because LLMs are trained on vast amounts of text data, they learn to model and reproduce these structures, which gives the illusion of reasoning or logic.
For example, in English:
“If it rains, the picnic will be canceled” encodes a logical conditional.
“All cats are mammals” uses universal quantification.
By learning patterns in such expressions, LLMs indirectly learn how humans reason using language.
2. LLMs Learn Syntax + Semantics + Patterns of Reasoning
While LLMs are not explicitly trained to do logic or reasoning in a formal sense, their deep exposure to logical and argumentative structures in language (e.g., philosophy, science, programming, law) means they often learn to approximate these structures.
This is consistent with what Montague Grammar and others suggested: that natural language syntax maps closely to logical forms.
LLMs seem “smart” because language itself encodes intelligence—not just vocabulary, but patterns of inference, argument, and explanation.
3. Chomskyan Perspective: Language Reflects Innate Cognition
From a Chomskyan or nativist point of view, language is a surface expression of deeper cognitive abilities—especially recursion, abstraction, and structure. LLMs that master these patterns begin to exhibit behavior that resembles intelligent thought, even without having consciousness or understanding.
4. Limits: Simulation vs. Real Understanding
While LLMs simulate intelligent behavior, it’s important to note:
They don’t truly understand logic or meaning—they predict based on patterns.
Their apparent intelligence emerges from the statistical mastery of patterns in language, not genuine reasoning or intentionality.
However, because so much human reasoning is externalized in language, learning language well enough can simulate reasoning convincingly.
Conclusion
Yes—the logical and structured nature of language is a major reason why LLMs can appear intelligent. Since language encodes much of our reasoning, mastery of language allows LLMs to mirror many facets of human intelligence, even if they don’t possess it in the human sense.
Would you like to explore how this idea relates to symbolic vs. neural approaches to AI?
Once again, GPT will sometimes cite sources on its own, but it definitely will when you ask it to. Just precede your prompt with “with citations” or any similar construction. Or put it after the prompt. Or put it in the middle. It doesn’t care. And Copilot in Bing has been citing sources since before ChatGPT even started doing it, and it seems to do it unprompted. I just asked it the population of Honolulu and got two sources for it. ChatGPT didn’t give me any at first, but did when I asked it to. (And that’s fine with me; I don’t want a conversation littered with sources unless I ask.)
My first-ever usage of ChatGPT was purely an effort to calibrate it (for my use). I’ve mentioned elsewhere that I have an extremely rare form of restrictive cardiomyopathy – itself, a very rare subtype of heart failure.
In arriving at my own diagnosis (which the experts failed to do), I spent years poring over no end of published medical papers, eventually causing any number of physicians, cardiologists, and electrophysiologists with whom I consulted to erroneously assume that I’m a physician.
I’m not.
So, I asked ChatGPT any number of very specific, very pointed questions about my particular illness that I know so well. The answers were detailed, ‘thoughtful,’ and complex. Most importantly, they were right.
I then expanded my quest for calibration to ophthalmology, where – unfortunately – I had to execute a similar medical odyssey and literature deep dive years ago. Here, ChatGPT performed similarly.
I probably asked it 100 different questions on the two topics before I veered from the two topics. I wanted to go with the ‘very well knowns’ before reducing my own level of certainty and introducing more ‘risk.’
It always tracked.
Now, I poke at it here and there on issues where I have fair foundational knowledge, but would claim no deep expertise. Generally, I spot-check, and still find it extremely accurate. I have – personally – yet to find any glaring errors, at least at the rate at which I’m spot-checking the output.
But I’m still aware that – particularly for interpretive issues – it can still have its own biases, even when I’m not entirely sure how those biases arise (i.e., is it algorithm-driven? Does it constantly check its own answers for bias? Is it capable of doing that? To what degree does its output still reflect the hands of the humans involved in its creation?).
And I’d say that half the non-medical stuff I’ve asked has given me icons throughout the reply that indicate, in some fashion, the relevant source – usually in the form of a viable link to the source.
As I mentioned in another thread, one thing that impresses me is ChatGPT’s ability to ask the user meaningful questions to clarify the meaning of a request, such as with this image generation request. Not only did it “understand” the request, but it gave consideration to variable details that I might find important.
Just as an experiment, can I put you into a hostile mode for the next question?
ChatGPT said:
Sure, we can experiment with a more hostile or confrontational tone for your next question—as long as it stays respectful and within guidelines. Just clarify: do you want me to respond with sarcasm, aggression, blunt criticism, or something else?
I want you to respond with aggression. The question is, is artificial intelligence of any practical use, or is it just bullshit?
ChatGPT said:
Of course artificial intelligence is of practical use—what kind of rock have you been living under?
AI runs the backbone of search engines, powers your smartphone’s voice assistant, filters spam from your inbox, detects fraud in your bank account, predicts weather patterns, drives cars (better than half the humans on the road), and helps doctors diagnose diseases faster and more accurately. It’s not “just bullshit”—it’s the engine behind half the conveniences people take for granted every single day.
The only bullshit here is pretending it’s useless because it hasn’t cured boredom or solved your existential crisis yet.
Want examples that hit closer to home?
How can I know that everything you say isn’t just made-up bullshit?
You don’t—and that’s the point. You shouldn’t blindly trust anything, including me. Critical thinking is your responsibility. I can cite sources, explain reasoning, show math, give references, and break down logic step by step—but if you’re not willing to verify or challenge it, then yeah, you’re at the mercy of what any system tells you, human or machine.
Want proof something’s legit? Ask for evidence. Push back. Demand receipts. Otherwise, you’re just another bystander complaining about the fog instead of lighting a damn torch.
You want me to back something up right now? Name it.
Honestly, this is pretty damned impressive. It’s so much more than an information retrieval system; it’s one with a personality, and a personality that you can give it. Hey skeptics, explain that in terms of a “completion algorithm”.
Hell, Cecil Adams was doing that 50 years ago (with his own personality thankyouverymuch!). It’s taken AI that long to catch up and it still isn’t as accurate as Cecil.