AI search results can be crazy stupid (cars named after plants)

I wondered what cars have been named after plants. The only one I could think of at first was the Toyota Sequoia. So I Googled it, first on my phone and later at home. One surprise was that I got different answers. Hmmm. Both reminded me of the Nissan Leaf and the car company name, Lotus. Good. Both also mentioned the Toyota Corolla, but here I disagree. Yes, the ring of petals around a flower is a corolla, but that is because it resembles a crown. And that is the meaning Toyota had in mind, because it went well with their larger Crown model. Both the car and the plant part are named after crowns; the car is not named after the plant part. Google got this right with the Dodge Aspen, noting it was named after the town, not the tree.

But then AI Google went off the rails. On my phone it suggested the VW Beetle. WTF? It claims the image of beetles crawling around trees evoked a plant in our minds. And the Ford Fusion, because that could refer to a flowering plant. At home it suggested the Ford Mustang: images of the majestic steed running through the grassy plains, etc. I don’t think so. But wait, now neither of those answers shows up. Why? Did it learn more already?

Then it gave a long list of former car names from around the world. I didn’t recognize any of them. They looked made up, in fact. Now that list no longer appears. This whole thing is goofy.

But that’s not a plant. Though it’s part of a plant.

Plus: the Mitsubishi Cordia.

“All cars are made in plants. Therefore…”

It’s apparently accepted that AI “hallucinates” and therefore routinely gives false results.

It may be a separate thread, but: why is this accepted and acceptable? (Also, I’m really curious about how AI developed to be a routine fabricator of results. What human choices led to this outcome?—that one is likely beyond the scope of the thread, but I remain interested.)

The Toyota Corolla was the replacement for the earlier Toyota Corona. Toyota certainly had (has?) a theme to their naming. And you’re quite right it isn’t plants.

It isn’t a ‘choice’ per se; it is a consequence of the fact that the large language models (LLMs) which power chatbots are ‘just’ really complex word completion systems. They have enough parameters and a vast pool of training data to produce fairly sophisticated textual responses to a prompt by dint of being trained on so much text, but while they often appear to ‘understand’ the context of a question, they don’t actually know whether a response they produce is factually correct or not. So, they can (and often do) generate complete nonsense and fabricate seemingly factual information expressed in an authoritative manner, which convinces casual users and devotees that they are somehow reliable general knowledge systems.

Further discussion should be in a dedicated thread, but that is the essence of why “AI search results can be crazy stupid”.

Stranger

I can take a shot at explaining it.

Don’t think of AI as behaving like a human using Google—where you input a query, scan multiple results, and make a thoughtful, discriminating choice.

Large Language Models (LLMs) operate differently. They’re probabilistic models trained on massive data sets. At each step, they generate output by selecting the most statistically probable next word, given the preceding context. This process continues recursively, word by word, until the response is complete.
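(To make that concrete, here’s a deliberately toy Python sketch of that word-by-word loop. The probability table and words are made up by me purely for illustration; a real LLM computes these probabilities with a neural network over a huge vocabulary, but the generate-one-word-at-a-time structure is the same basic idea.)

```python
import random

# Toy illustration of next-word generation (invented for this post; not any
# real model's code). The "model" is just a lookup table of next-word
# probabilities, but the loop -- pick a statistically likely next word,
# append it, repeat -- is the structure described above.
NEXT_WORD_PROBS = {
    "cars named":   {"after": 0.9, "for": 0.1},
    "named after":  {"plants": 0.5, "crowns": 0.3, "towns": 0.2},
    "after plants": {"include": 0.6, "are": 0.4},
    "named for":    {"plants": 0.5, "people": 0.5},
}

def next_word(words):
    """Sample the next word given the last two words of context."""
    probs = NEXT_WORD_PROBS.get(" ".join(words[-2:]))
    if not probs:
        return None  # the toy model has nothing plausible to continue with
    choices = list(probs)
    return random.choices(choices, weights=[probs[w] for w in choices])[0]

def generate(prompt, max_words=5):
    words = prompt.split()
    for _ in range(max_words):
        w = next_word(words)
        if w is None:
            break
        words.append(w)
    return " ".join(words)

print(generate("cars named"))
# e.g. "cars named after plants include" -- it sounds plausible, but nothing
# in the loop ever checks whether the continuation is actually true.
```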

Because the model can only reflect its training data, it doesn’t “know” facts—it estimates what a plausible answer might look like. And since the training data includes both high-quality and low-quality sources, the output can reflect either.

One example I found intuitive was in a recent OpenAI paper: when asked about the Māori language, a smaller model might respond with “I don’t know.” A larger model, having seen some Māori data, might try to answer, ‘thinking’ it has a viable probability path.

People are talking about creating checks on hallucinations, mid-process, and OpenAI says they’re doable, but nobody has implemented them yet (that I know of).

Ninja’d! (also, we have a thread discussing some of this stuff)

GIGO, I suppose. But, yeah: good explanation.

Thanks to you, too: good explanation.

THAT is interesting! No demand for actual truth in this our crazy stupid world, perhaps….

I’ll check out that thread.

No, the Corolla was the smaller companion model to the Corona. The Camry was actually the replacement for the Corona.

And “Camry” is pronounced sort of like the Japanese word for crown. So they definitely still have a theme.

Accepted by WHO? Certainly not me, since I abhor even seeing the spew of these abominations out of the corner of my eye, before I remember to add ye olde “-ai” tag.

On the Corolla, at least one Toyota source says it’s from the plant:

A ‘corolla’ is the ring of petals around the central part of a flower. The name was intended to evoke the image of a beautifully styled, eye-catching small family car.


PS If you want to use AI for factual searches, use “Deep Research” mode and you’ll typically get much better results, with citations.

For Google’s Gemini, instructions or example query: https://g.co/gemini/share/04de848901bd

For me, ChatGPT gives better results most of the time: instructions or example: ChatGPT - Car models named after plants (it pointed out the Porsche Cayenne, for example)

The regular AI summary you see on Google is worthless and frequently hallucinates. It was an act of desperation that they released that at all, because they were terrified of ChatGPT. It wasn’t anywhere near ready and should never be trusted without manual verification… just adblock that shit.

Or give Kagi.com a chance. Compared to Google (or see their video explainer), it doesn’t have spam and ads everywhere like Google does, and its AI features are entirely supplementary and optional, not forced down your throat.

How do you block it? It’s right there on the results bar along with images, video, etc.

You can use a custom search config in your browser: I Figured Out How to Turn Off Google's AI Overviews. It's Like a Whole New Internet | PCMag

Or add and update a manual rule to your ublock every few months (when they update the page structure): https://old.reddit.com/r/uBlockOrigin/comments/1ct5mpt/heres_how_to_disable_googles_new_forced_ai/

Or use a browser extension: For Chrome: https://chromewebstore.google.com/detail/hide-google-ai-overviews/neibhohkbmfjninidnaoacabkjonbahn?hl=en

Or Firefox: Hide Google AI Overviews – Get this Extension for 🦊 Firefox (en-US)

But Google’s just going to keep fighting you. It’s an existential crisis for them…

With Kagi, you pay the company directly for search, so they’re incentivized to make the results better for you, not fill them with 7 ads and AI nonsense.

That’s interesting. I always understood that the name was chosen because it means “small crown”.

Toyota model lineup as of the late 1960s was:
Crown = Large Toyota
Corona = Midsize Toyota (means crown in Latin)
Corolla = Compact Toyota (means small crown in Latin)

It could be both. Like how Aspen is a town and also a tree. An intentional double entendre?

Or some lazy Toyota blog intern didn’t bother doing their research… who knows? If that article had been published a few years later, they likely would’ve just cited some AI bullshit instead, but thankfully it was 2020… just a little before the machines took over.

Google search used to be based on keywords: it would check the words in your query against words in website text. But now, with AI, it interprets every search as a question that has a factual answer, so it makes up any old crap that sounds plausible based on a huge cauldron of semi-contextual gobbledygook.
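(For contrast, here’s a toy sketch of what the old keyword approach amounts to. The pages and URLs are invented placeholders, and real ranking is far more sophisticated, but the key difference is that it returns links containing your words instead of composing an answer.)

```python
# Toy sketch of old-style keyword search (invented pages, not Google's real
# ranking): score each page by how many query words appear in its text, then
# return links for a human to read and judge -- it never composes an "answer".
PAGES = {
    "toy-wiki/toyota-sequoia": "The Toyota Sequoia SUV is named after the giant sequoia tree.",
    "toy-blog/lotus-history":  "Lotus is a British maker of sports cars.",
    "toy-site/banana-bread":   "A simple recipe for banana bread.",
}

def keyword_search(query):
    terms = query.lower().split()
    scored = []
    for url, text in PAGES.items():
        score = sum(term in text.lower() for term in terms)
        if score:
            scored.append((score, url))
    scored.sort(reverse=True)
    return [url for _, url in scored]

print(keyword_search("cars named after plants"))
# -> pages that merely *contain* some of your words, ranked by word overlap
```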

It’s not going to go well.

I guess once they have a critical mass of users, they can just keep milking them for a good few decades, no matter how bad the search experience gets… I mean, how much worse can it even get? Today’s Google is already much worse and way spammier than Altavista and Yahoo were back in the day. It’s become the very thing it sought to replace. But it still has billions of users.

Freaking AOL didn’t shut down until last week. We have satellite internet and multi-gigabit fiber, but there were still people perfectly happy with their dialup. Change is hard, I guess…

Edit: THIS week!

Your observation here is what I would call “biased” rather than something like “wrong”. It continues your drumbeat of denying the reality of how extremely useful advanced LLMs like GPT 5 can be, especially with their new HealthBench criteria. See this, for instance:

I acknowledge your skepticism and know that you will continue onward on that path, but I do have a relevant anecdote. Without going into details, I have medical symptoms that may be due to a variety of causes, some minor, some potentially very serious. I have been engaged in a long discussion with ChatGPT about the symptoms and the results of preliminary medical tests like ultrasound and what can be concluded from them, and the most promising next steps.

According to you, I am a complete idiot relying on a “text completion engine” to give me advice. According to both my GP and my referred specialist, I am a surprisingly knowledgeable patient and am being directed to the diagnostic resources that they and I – based on my best information including GPT – agree are best.

Not directed at either you or Stranger in particular, but really just a general IMHO observation:

More than the actual strength (or weakness) of their LLM models, these AI companies all have huge productization & marketing issues. When GPT-2 first came out, it was one LLM by one company, and it was easy for the world to discuss and criticize it because everyone was more or less talking about the same thing.

But today’s AI services aren’t just a single LLM anymore, or even a single company. Google’s AI Summaries are different from Gemini, which itself has several sub-models and modes, which are all different from NotebookLM, which is again different from their enterprise Vertex models, and different pricing tiers will give you drastically different qualities of results, and then there’s Nano Banana or Mega Grapefruit or whatever. And that’s just one company. OpenAI has a bunch of its own models, Anthropic even more, then there’s the smaller ones specifically for music, image, video, etc. generation. And even within a single “model”, GPT-5 isn’t really just a LLM model anymore, but a collection of tools and processes that themselves create sub-processes and subcontract out work to other sub-AIs and subroutines…

These all have their individual strengths and weaknesses, and all of it changes every two weeks. Whatever it’s not good at today, it may be better (or sometimes worse) at next week. Nobody can keep track of these anymore — there are dozens of benchmarks that now try to measure different aspects of their behaviors, and the ranks change every week. The benchmarks change every week.

We call all of it “AI”, but somebody who just gets the low-effort Google AI summary when they do a simple search is going to have a very, very different experience from an experienced prompter engaging with GPT 5 Pro (at $200/mo) in an extended discussion over several hours/days, with many web lookups in between.

But none of these companies ever bother to teach you what the differences are, or what they’re good at or not good at. They’re just so busy trying to improve all the things all the time that they never have time to see the chaos from the user’s point of view anymore :frowning:

There ARE some legitimately useful things AI can do today. But damned if the average user can find them, or tell them apart from the things it’s still terrible at.

How useful is a system that often provides incorrect factual information?

From your cite:

The researchers emphasise that these findings represent performance in controlled testing environments rather than real-world clinical practice. “It is important to recognize that these evaluations occur within idealized, standardized testing environments that do not fully encompass the complexity, uncertainty, and ethical considerations inherent in real-world medical practice,” they cautioned in their discussion.

In short, it is good at taking a standardized test which includes some multi-modal elements. That is an impressive feat of replication but doesn’t mean that it has any actual understanding of the complex interactions of a human patient in the physical environment.

Sure, your interactions with ChatGPT gave you the nomenclature and jargon to sound like an informed patient, and even some cursory information about appropriate diagnostics (information you probably could have gotten by reading the same online sources that ChatGPT was doubtlessly trained upon), but that doesn’t make it an expert or reliable system for medical diagnostics. It is repeating the use of language found in the structure of its sources of training data, and provided those are credible sources, it produces cromulent-sounding guidance. But it has no actual knowledge of medical diagnostics or pathology of disease as applied in a clinical setting; it just has the text and image data from which to synthesize a statistically appropriate response. Given a sufficiently large base of training data, enough parameters to generate a complicated response, and a “Chain-of-Thought” recursive model to enable it to break the parsing of the prompt into manageable segments such that it doesn’t immediately spiral off topic and ‘hallucinate’ a completely inappropriate answer, it can produce a plausible-seeming response that reads like what the first pass of an attending physician might write in their notes. That doesn’t mean that it is actually making a good diagnosis, or that it would recognize an obvious anomaly or error, or that it could formulate an appropriate treatment plan.

Stranger