AI is wonderful and will make your life better! (not)

That’s good. Let’s identify what it is good for; perhaps that will help us understand and better prepare for what is bad about it.

We are facing the onslaught of a disruptive technology and an insidious societal fad. It likely will not be disruptive in the ways initially assumed. Having it constantly shoved in our faces, and watching it largely ruin previously useful tools like Google, is currently the main fallout for me personally, but I am expecting something more profound to manifest.

To be fair, Google search was being steadily enshittified long before AI, what with its insistence on placing paid ads before relevant search results, social media content before scholarly sources, and more. As noted before, AI can be a better search tool than Google search.

It’s not that I disagree, it’s that I think this is an incredible understatement. Google, for all of its alleged sophistication, is really just doing word-matching, whereas GPT (for example) is doing deep semantic analysis, and furthermore, maintains a context-aware conversation where you can ask more questions and get more refined answers. This is a profoundly different paradigm than a simple search. GPT also has access to a lot more information than just the public internet.

And yet it still frequently gives confidently wrong and misleading information, something enthusiasts continue to gloss over as if getting wrong answers were somehow a nonissue. A general knowledge system is worse than useless if it does not consistently provide factually correct responses, or cue the user when a response may be ambiguous or its source of information questionable.

Stranger

Yes. I don’t trust this so-called AI in the slightest; therefore, I consider any results it presents as worse than worthless. It’s deceptive.

Taco Bell rethinks AI drive-through after man orders 18,000 waters

He said the firm was “learning a lot” - but he would now think carefully about where to use AI going forwards, including not using it at drive-throughs.

In particular, Mr Matthews said, there are times when humans are better placed to take orders, especially when the restaurants get busy.

“We’ll help coach teams on when to use voice AI and when it’s better to monitor or step in,” he said.

Definitely gives off “safety driver in the autonomous vehicle” vibes.

I tried, for instance, asking “AI search” for a reference to something, and it came up with a chapter in a multi-volume work. The book happens to exist, but it was the wrong volume (and chapter) even as it offered to tell me exactly which page to look at.

“Everybody” knows the models in question are bullshit generators, though, right? Or are they not being marketed that way? Because even various current-gen tools are “wonderful” and useful in certain lines of work.

There is the distressing realization that passing the Turing test is seen as the holy grail of general intelligence, but it turns out that the test is passable by imitating an overconfident and dishonest human with no shame or self-awareness.

“But but humans make mistakes too!” Well, OK, but the whole point of computers is that they don’t do that! If we’re putting 10 million “experts in people’s pockets,” then they need to be experts, not a flaky app that misses half the time and doesn’t bother checking its work.

Disclaimer: I work for a company that’s bet the farm on an AI product, and I use AI every day. When you understand its shortcomings and how to get the most out of it, it can be magical. But to get there you have to slog through a long journey of learning that the thing is often full of shit, which tarnishes the experience significantly, and it fills me with dread to think of how people may be acting on incorrect information that harms their interests.

Our disagreement essentially centers on whether or not the adverb “frequently” is really justified here.

There is no knowledge source on the planet that is “consistently … factually accurate”. None. If, for instance, you have the opportunity to ask questions of the most famed award-winning researcher in a particular discipline, you’re probably going to learn a lot, but you may also get answers biased by that individual’s particular pet theories.

As I noted before, OpenAI is making special efforts with their HealthBench initiative to better curate responses related to health care. I’m sure it’s far from perfect, and as I also noted earlier, even the IBM Health initiative with the Watson technology was not deemed reliable enough to be a physician’s research assistant. But at a more general level of knowledge, these tools can be very, very useful, and incomparably better than Google.

That is true for some information and not others. I have expertise in a language: if you ask me about that language, I either have or can get you information that is consistently factually accurate. In my academic field, yes, I’m swayed by theories, but I can get you information that is consistently factually accurate about history and bibliography and current consensus.

I don’t think there can be any dispute that the best, most knowledgeable human experts are generally the best sources of information. When I have a medical problem, I go to, and trust, my GP, and I may then be referred to a specialist.

But all of us of a certain age know that not all GPs, and certainly not all specialists, are equal. They will not all have the same opinions, diagnoses, and recommended treatment options, and certainly not all have the same level of expertise. Where there is uncertainty in complex systems like the human body, knowledge is not an absolute but a continuum.

In this continuum of health care, AI ranks the lowest and least dependable – that is not in dispute. What I would argue is that it’s likely far better than “nursing help lines” that are not actually intended to provide guidance (due to liability, you see) but only to distinguish between the need to call 911 and the need to stop bothering them. AI, by contrast, is genuinely intended to be informative and helpful, even if it of course needs to be interpreted with the appropriate expectations of accuracy. It can be an amazing resource.

It looks like a dealer was kicked out of Dragon Con yesterday for trying to pass off AI art as his own. Police Called On Artist Accused Of Selling A.I. Art At Dragon Con

What does that even mean, though? Did he type “generate a kick-ass poster with, like, dragons and shit” into ChatGPT, or did he spend 2 weeks painstakingly tweaking the source code, input, and output of various bespoke image processing models?

AI stethoscope can detect heart conditions in just 15 seconds, UK doctors find | Euronews

(

About two-thirds of patients who were flagged by the AI stethoscope as potentially having heart failure did not actually have it.

However, it’s unclear whether doctors find the tool useful. A year after being given the AI stethoscopes, 70 per cent of GP offices stopped using them regularly, the trial found.

)
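(To put that figure in screening terms: if roughly two-thirds of flagged patients did not actually have heart failure, the positive predictive value works out to about TP / (TP + FP) ≈ 1/3, i.e. two false alarms for every real catch. Whether that is acceptable depends on the base rate of heart failure in the screened population and on what each false alarm costs to rule out.)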

I was just in line at Wendy’s.

AI: Welcome! Would you like a Meal of Misfortune?

Me: Um….no?

AI: What can I get you?

Me: A large Diet Dr. Pepper.

AI: Okay. Would you like (screen changes, some sort of meal deal pops up)?

Me: Hell no.

AI: Hello! What can I get you today?

It sent me back to the beginning of the script!

Ya shoulda tried to order 18,000 waters.

In retrospect, it should have been obvious that the system that would first ‘pass’ the Turing test would have a primary capability of deception.

A recent study by the Tow Center for Digital Journalism unveils a startling inefficiency among AI chatbots like ChatGPT and Gemini, which provide incorrect information over 60% of the time when sourcing news content. These inaccuracies, including fabricated headlines and misattributions, pose significant threats to the reputation and revenue of news publishers. The study advocates for AI companies to enhance transparency, accuracy, and ethical practices.

I think most people would consider an error rate of 60%, or indeed any percentage in the double digits, to be “frequently” enough to be unreliable.

The reason that “nursing help lines” do not provide diagnoses or recommend treatment plans is not just “due to liability, you see” but because, without performing the appropriate examination of the patient, a diagnosis is likely to be in error or incomplete, and a treatment plan may exacerbate the issue or create entirely new problems. A chatbot, of course, cannot perform any kind of examination; it can just take a prompt and render a syntactically correct response that statistically matches the data on which it has been trained. Are any of these systems validated against a baseline of different maladies and the range of symptoms that a patient typically reports? Do they understand enough to ask follow-up questions to refine a diagnosis? Do they understand when a condition is beyond their ‘scope of practice’ and refer the case to the appropriate medical professional? If they don’t and can’t, they aren’t “an amazing resource” or “genuinely intended to be informative and helpful”; they are worse than useless even when “interpreted within the appropriate expectations of accuracy” by a layperson with little or no medical knowledge.

Try this:

Stranger

An error rate of 60% on what, exactly? That’s a very misleading statistic once you dig down and discover what it really means.

I asked ChatGPT where that 60% figure came from, and it immediately identified the Columbia Journalism Review’s Tow Center for Digital Journalism. But as GPT correctly pointed out, notwithstanding the sensationalized headlines of secondary articles, the title of the original study was “AI search has a citation problem”, published on March 6, 2025, analyzing how well various AI search–capable chatbots (e.g., ChatGPT Search, Perplexity, Gemini, Grok) handled citation accuracy.

To quote GPT:

The Columbia/Tow Center study really tells us something specific about citation reliability in AI search/chatbots, not about the overall dependability or usefulness of generative AI in every context …

… Generative AI is not inherently unreliable—its reliability depends on task, context, and safeguards.

… The Tow Center study should be read as a caution about AI as a source of citations in news/media research—not as a blanket statement that generative AI is unreliable in all uses. It’s a reminder that AI output often requires fact-checking and human oversight when precision matters.

Bottom line: Citations are a worst-case scenario for generative AI because they demand exact recall, zero tolerance for error, and real-time alignment with external sources — all things current models are weakest at. For more open-ended or approximate tasks, AI is much more dependable.

I wanted to try something specific and asked ChatGPT to read this thread. My instructions were “read this thread and acknowledge that you understand it.”

It thought for about a minute and said that it had.

I looked at its thought process (cool feature) and saw that it was having trouble accessing the entire thread. My suspicion was that Discourse’s dynamic loading was giving it fits.

I asked if it was sure that it had read it. It thought (read: tried to access the thread again) for another minute, then said nope, it only read the first 19 posts and was unable to access the rest of the thread. Its thought process that time correctly pinned the infinite scroll as a culprit.

I asked it why it had been inaccurate. It didn’t answer (which is okay because AI isn’t actually capable of saying why it did something in the past) and instead acknowledged its mistake and said that now it had definitely read the whole thing.

I checked its thought process again and, clever bot, it was individually loading each page to read twenty posts at a time. Except that it still hadn’t loaded every page in the thread. It was also looking at other SDMB threads about AI and reddit threads about the SDMB.

I again asked if it was certain and this time it thought for a solid two minutes before acknowledging that it had only read some of it. I’m sure I could have kept it in this loop for as long as I wanted to.

I’ll have to manually create and upload a text file if I want its insight on this thread (or on any website with dynamic loading, which is a lot of pages these days).
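
If anyone wants to skip the manual copy-paste: Discourse boards also expose each thread as JSON, so a short script can pull every post without ever fighting the infinite scroll. Here’s a rough sketch in Python; the board URL and topic number are placeholders, and it assumes the board is public and doesn’t throttle anonymous JSON requests.

```python
# Rough sketch: pull every post in a Discourse topic through its JSON API
# instead of scraping the infinitely-scrolling page. The board URL and
# topic ID below are placeholders, not a real thread.
import requests

BASE = "https://boards.example.com"  # hypothetical Discourse board
TOPIC_ID = 12345                     # hypothetical topic number


def fetch_all_posts(base: str, topic_id: int) -> list[tuple[str, str]]:
    # /t/{id}.json returns topic metadata, the first batch of posts,
    # and the full ordered list of post IDs in post_stream["stream"].
    topic = requests.get(f"{base}/t/{topic_id}.json").json()
    all_ids = topic["post_stream"]["stream"]
    posts = {p["id"]: p for p in topic["post_stream"]["posts"]}

    # Fetch the remaining posts in small batches via /t/{id}/posts.json.
    missing = [pid for pid in all_ids if pid not in posts]
    for start in range(0, len(missing), 20):
        batch = requests.get(
            f"{base}/t/{topic_id}/posts.json",
            params={"post_ids[]": missing[start:start + 20]},
        ).json()
        posts.update({p["id"]: p for p in batch["post_stream"]["posts"]})

    # Return (username, rendered-HTML) pairs in thread order.
    return [(posts[pid]["username"], posts[pid]["cooked"]) for pid in all_ids]


if __name__ == "__main__":
    for user, html in fetch_all_posts(BASE, TOPIC_ID):
        print(user, html[:80])
```

Dump the output to a text file and the model gets the whole thread in one shot, no infinite scroll involved.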

It’s really funny that you did this, because a few days ago I decided to grab the “ads you hate” thread - with almost 7,000 posts - and ask Copilot to tally up every company mentioned outside of quoted sections. It thought for a while, and gave me a list with Liberty Mutual at the top with 12 mentions.

I pointed out that was woefully low, so it thought about it, and told me it had only looked at the first 100 posts.

I then went through EXACTLY the same loop process you did, asking and then getting nonsense, and never got to a realistic answer.