ChatGPT Breaking up marriages

Uh… I’m not sure that I know what you’re saying, but I’m pretty sure I know what you’re saying. :grinning_face_with_smiling_eyes:

You won’t do, Mr. Fish

Do for Who?

Really? You’re going to compare ChatGPT’s specifically curated medical guidance under OpenAI’s HealthBench program with the con of a fake psychic? You’re going to compare its vast medical resources to a carnival fortune-teller? If that’s your take then there’s no getting through to you.

As I said earlier, either in this thread or some other one, I am at an age where I invariably have various medical symptoms. Discussing those issues with ChatGPT has been enormously helpful and has enabled me to have a constructive dialog on a professional level both with my PCP and a relevant specialist, both of whom consider me to be a very knowledgeable patient for whom they don’t have to “dumb down” information. To compare that kind of informed advice to a carnival fortune-teller is just nonsensical mendacity.

Here:

Stranger

I know we’re getting sorta repetitive here, but I just want to highlight what I regard as a profoundly important point. By persistently focusing on what you regard as the core operational feature of LLMs (as mere “completion engines” lacking any actual knowledge) you’re completely ignoring the emergent properties that manifest at the unimaginably large scale of many billions of parameters shaping the behaviour of their artificial neural nets. What, after all, exactly is “knowledge” other than the ability to correctly answer questions or, in medicine, develop accurate diagnoses based on a description of symptoms and medical test results? And how is an AI’s knowledge objectively inferior to that of a physician if it reaches the same conclusions based on the same evidence?

10/10, no notes

ChatGPT may not be hurting my marriage, but it’s certainly impacting my D&D game.

The guy currently serving as DM, who has a bit of an AI obsession, has quite obviously and openly designed the adventure using ChatGPT. And that’s fine, I guess. Not everyone can be creative. He’s a nice guy and a good storyteller, even though the stories he tells are hopelessly derivative. But just this Sunday, he had to roll something like 10d6 for damage against me, but instead of looking at the dice he rolled, he picked up his phone, pointed it at the dice tray and said, “Let’s have ChatGPT tell us how much that was”.

I couldn’t help myself. I stood up and shouted, “Just add up the damn dice! Jesus fucking Christ. ChatGPT, seriously!?”

I apologized, of course, but it did put a strain on our game.

And this wasn’t the first time that damn AI has intruded into our sessions. A couple months ago, I got into an argument with a player about a rule. He asked ChatGPT and showed me the answer; in response, I opened the paper book I had on the table and showed him the actual, correct rule. And he kept on arguing with me! Stupid artificial intelligence. Some of us here have natural intelligence, you know?

So back in the day, when we invented writing, there was undoubtedly a core of intellectual stalwarts pointing out that this new-fangled technology would weaken our ability to do the absolutely fundamental work of memorising tens of thousands of lines of epic poetry. And if so, they were right! We can’t do that anymore. And ditto for spellcheckers, spreadsheets, etc. - all taking on some of the intellectual muscle work and consequently allowing us to let some stuff atrophy.

Even so, it is still surprising, and kind of appalling, to see the scale and speed at which people are outsourcing some quite basic elements of thinking. Like counting up dice, or understanding an email. How dumb do you need to make yourself? We’re in real danger of ending up like these guys.

The September 29 New Yorker has an article on AI diagnostics. One study quoted in the article (which is by a doctor) found that ChatGPT answered open-ended medical questions incorrectly two-thirds of the time. The article discusses a diagnostic tool called CaBot which does much better, but it is not available to civilians. The hospital the author works at has banned ChatGPT, I assume to keep doctors from using it to save time.

Technically, a therapist might encourage a person toward a divorce by agreeing with that person’s complaints, and/or by sensing that they’re never going to let those complaints go.

That a robot did the same doesn’t, by pure virtue of being a machine, make that result an unreasonable outcome. I’m sure human therapists, too, have said just the right thing to spur someone to kill themselves, kill others, abandon their family, etc. So it’s not inherently worse for an AI to cause the same result.

The questions would be 1) whether the AI has a better or worse success rate at healing relationships and mental issues than a human therapist, 2) whether the AI is becoming better on that front over time, and 3) whether (in the specific case of healing a marriage) the people are actually happier afterwards (being forced to continue an unhappy marriage isn’t necessarily the best outcome).

Anecdote isn’t data. Picking the one case where things went bad doesn’t indict the whole methodology. The question is how it compares to the control case and to the standard interventions.

Yeah, Reddit’s default answer to any relationship question is ‘DUMP!’ - although a lot of the stories on there do warrant it (and some or most of those probably aren’t real).

It depends what we understand is involved in “correctly answering questions”, right? I mean, if somebody has blindly copied their answers on a test from a more knowledgeable classmate, they may have answered the questions correctly but we wouldn’t say that that reflects any meaningful knowledge of the subject.

And that is, fundamentally, the issue with LLMs as a source of information. They are performing a very complicated version of copying answers from other sources without any awareness of what the answers mean. They don’t have an analytical understanding of even the simplest logical deductions, such as determining that 2+2=4.

That doesn’t mean they aren’t going to come up with answers that agree with reliable sources in very many cases. To the extent that there’s a significant consensus in online data about some basic information, an LLM will on average do quite a good job of accurately reflecting that consensus version of the information.

But accurately copying knowledge is not identical to actually possessing knowledge, or being able to generate knowledge.
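If it helps make the “copying, not computing” point concrete, here’s a deliberately crude toy sketch (purely hypothetical, and nothing like a real LLM’s internals): a “model” that answers “2+2=” simply by looking up which continuation most often followed that prompt in its training text. It gets the right answer without ever performing an addition.

```python
# Toy illustration only: answer by lookup over training text, not by arithmetic.
from collections import Counter

training_text = [
    "2+2=4", "2+2=4", "2+2=4", "2+2=5",  # a mostly-correct "consensus"
    "3+3=6", "3+3=6",
]

def toy_answer(prompt: str) -> str:
    # Tally which continuations followed this prompt in the training data.
    continuations = Counter(
        line[len(prompt):] for line in training_text if line.startswith(prompt)
    )
    # Return the most common continuation, or a shrug if the prompt was never seen.
    return continuations.most_common(1)[0][0] if continuations else "?"

print(toy_answer("2+2="))  # prints "4" -- correct, yet no addition ever happened
```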

That said, I propose a possible solution to the OP’s concerns about the involvement of chatbots in relationship conflicts: Let both parties team up with a chatbot, and let the chatbots argue with each other while the humans have to communicate in person. It’ll be interesting to see if the chatbots end up working out a compromise or getting stuck in an endless duel. :grin:

As much as I generally respect the New Yorker, they are a literary and general-news publication and are not immune to sensationalism or sometimes getting science wrong. I hear statistics like this all the time, and I always wonder why they don’t reflect my experience, although certainly there have been cases of GPT being wrong. But “wrong two-thirds of the time” makes me wonder if there is some unexplained nuance in how the doctor judges “wrongness”; also, if this was a formal study, it’s probably already outdated because AI tech is advancing so fast. It probably predates GPT-5 and the HealthBench initiative.

Obviously they do have the ability to reason and synthesize conclusions about novel situations, because they do this all the time, like solving logical puzzles that they have never seen before. And let’s not forget the impressive battery of professional competency tests that GPT has successfully passed.

Having “knowledge” about a subject is a broad continuum and the definition is intrinsically subjective. GPT does a pretty good job of acting like it possesses understanding and analytical capabilities. If someone asked me to explain how the sun creates energy, I could do a reasonable job of explaining it based on things I’ve read (or mentally “copied”, if you will) and not because I’ve personally been there. GPT can do exactly the same, and probably more thoroughly.

I think people would have a lot more respect for LLMs if they took a behavioural approach and stopped obsessing over how they think the technology works, or, put another way, if they had a better appreciation for the emergent properties that occur at very large scales of computational complexity.

I think the fundamental difference between asking an LLM-based chatbot and asking another human is not necessarily the veracity of the answer, but the confidence.

When I ask another human a question they cannot answer properly because they don’t know, I generally get a sense that the answer isn’t going to be worth anything - I can perceive their inability to answer in the form of a lack of confidence.

With an LLM, it’s always ‘Sure! I can help you with that!…’ followed by (sometimes, or often) absolute bullshit.

Humans are capable of that sort of confident incorrectness too, of course, but it’s the exception, not the rule. LLMs do it pretty much as standard.

You should read the article. It is not about how AI is evil. The other tool I mentioned is AI also, and was able to match an expert diagnostician on some hard-to-diagnose benchmark cases, and do it much faster. The article is about how AI diagnosis is different from human diagnosis, and about not assuming the output of any one AI is as good as that of any other.

Maybe someone knows the study that was referenced, but it was not just the opinion of the doctor who wrote the article.

Nope. (Unless there’s a specifically deterministic algorithm for parsing a particular kind of puzzle that is invoked by the LLM under certain circumstances, but I have not seen such hybrid methods discussed, although of course that doesn’t mean they can’t exist. If you’ve got a specific example you’re thinking of, do please link to it.)

The LLM is just comparing probability values for different possible choices of words to put in its response, based on information it’s stored about the frequency and proximity of words it’s encountered in its training data (and some random-number generator input to shake things up a bit), millions and millions of times. It is not reasoning from anything like general cognitive understanding of what the puzzle “means”. Not even at the level of understanding why 2+2 = 4. It’s crunching numbers, that’s all.
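For what it’s worth, here’s a minimal sketch of that last step, the picking-words-by-probability part (heavily simplified, with made-up candidate words and scores; real models work over tokens and billions of learned parameters): the model scores each candidate next word, the scores are converted into probabilities, and a random draw picks one.

```python
# Minimal sketch of next-word sampling: scores -> probabilities -> random draw.
# The candidate words and scores below are invented purely for illustration.
import math
import random

candidate_scores = {"four": 6.0, "5": 2.5, "fish": 0.1}  # hypothetical raw scores (logits)
temperature = 0.8  # lower = more predictable output, higher = more random

# Softmax with temperature turns the raw scores into a probability distribution.
exps = {word: math.exp(score / temperature) for word, score in candidate_scores.items()}
total = sum(exps.values())
probs = {word: e / total for word, e in exps.items()}

# The random-number-generator input that "shakes things up a bit":
next_word = random.choices(list(probs), weights=list(probs.values()))[0]
print(probs, "->", next_word)
```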

Yes it does, and the programmers who created the tokenization algorithms and backpropagation models that the LLM is using (as well as handling all its vast training data) have done an admirable job in producing that result. But that doesn’t translate into the LLM actually knowing what it’s talking about, as we generally understand the term “knowing”.

Does that make the LLM’s answers automatically wrong? No, of course not. But it means that the correctness and complexity of the LLM’s answers are not a reliable guide to whether or how it’s “knowing” them.

Just setting the scene, did you yell that in English or Hebrew? (I suppose Yiddish is too much to hope for.)

Socrates was no fan of writing.

Is there even a Yiddish version of “Jesus fucking Christ”? I suspect that that phrase in English may now have become one of the “universal loanwords” in multiple languages. Like “tofu” and “cafe”.