Request: don't put ChatGPT or other AI-generated content in non-AI threads

I think the Winograd schema challenge is a great test of a system's ability to perform natural language processing. I am not at all convinced that it is a useful assessment of general non-linguistic knowledge. People are really impressed with the ability of an LLM to process natural language because language is the thing that distinguishes us from other animals (well, at least most animals…I’m not convinced that some cetaceans, and maybe some corvids, don’t have real language), but as linguists will tell you, there is a lot of logic built into the use of language. Systems that can actually process natural language are impressive because it is a capability that researchers have striven to develop for decades, but it doesn’t necessarily indicate anything resembling actual cognition; it is just complex symbolic manipulation using really sophisticated statistical models, without any actual evidence of semantic comprehension.

I’ll admit a bias in knowing some neuroscientists and not anyone at the forefront of AI research (although I’ve personally done some work with heuristic neural networks). But I’ll say this: while OpenAI was going through a meltdown that attracted attention generally reserved for Taylor Swift or Kanye West, and for weeks you couldn’t walk through a cable news studio swinging a cat without hitting half a dozen evangelists for or against AI research, you had to really search through your newsfeed or podcast list to find someone doing active research in neuroscience, and inevitably they were explaining how little we actually understand about how the brain functions.

I’m sure that there is a “distinguished cognitive scientist with a strong and extensive background in AI” (although I’ll note that “cognitive science” is often codespeak for “psychologist with a passing interest in brain function”, whereas neuroscience is actually focused on finding correlates between some aspect of neural function and behavior or other measurable aspects of response), but I think there is a lot of special pleading regarding “computational intelligence” being some real analogue of human cognition. This is not to say that AI systems are not very impressive in their capabilities, or that you cannot legitimately refer to them as “intelligence” in at least some narrowly defined way that represents the ability to respond correctly to stimuli; but they aren’t functioning in a way that is really similar to human (or complex animal) cognition.

Stranger

We’re unfortunately getting way off the intended topic, which is too bad because these are interesting digressions. I’ll just say here that while the natural language analytical capabilities of LLMs are amazing, that isn’t the point here. Proponents of the Winograd schemas – which superficially resemble natural language comprehension tasks – will argue that they are explicitly designed so that the correct inference is impossible solely from the semantic context. For example, given the statement “I spread the cloth on the table in order to protect it”, it’s not possible to determine what the pronoun “it” refers to without an understanding of how tablecloths are used and that we try to avoid marring table tops. The first such test, proposed by Terry Winograd himself, was “The city councilmen refused the demonstrators a permit because they feared violence”. The question of who the pronoun “they” refers to requires an understanding of the typical behaviours and motivations of city councilmen and demonstrators.

Instances where such inferences can fairly trivially be made from the semantics without deeper real-world understanding – like when the arguments that a predicate like “drank” can take are restricted (linguistic selection) – are disqualified from the schemas.
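To make the shape of these tests concrete, here is a minimal sketch of how a single schema item could be represented and scored. This is purely illustrative: the data class, field names, and the `resolve` callback are assumptions for the example, not taken from any actual benchmark code.

```python
from dataclasses import dataclass

@dataclass
class WinogradSchema:
    sentence: str          # the statement containing an ambiguous pronoun
    pronoun: str           # the pronoun to be resolved
    candidates: tuple      # the two possible antecedents
    answer: str            # the antecedent that real-world knowledge picks out

# The two examples quoted in the discussion above.
SCHEMAS = [
    WinogradSchema(
        sentence="I spread the cloth on the table in order to protect it.",
        pronoun="it",
        candidates=("the cloth", "the table"),
        answer="the table",            # we protect table tops, not tablecloths
    ),
    WinogradSchema(
        sentence=("The city councilmen refused the demonstrators a permit "
                  "because they feared violence."),
        pronoun="they",
        candidates=("the city councilmen", "the demonstrators"),
        answer="the city councilmen",  # it is the councilmen who fear violence
    ),
]

def score(resolve, schemas=SCHEMAS):
    """Return the fraction of schemas a resolver answers correctly.

    `resolve(sentence, pronoun, candidates)` stands in for whatever system
    is under test (an LLM, a coreference model, a person) and should return
    one of the candidate strings.
    """
    correct = sum(
        resolve(s.sentence, s.pronoun, s.candidates) == s.answer
        for s in schemas
    )
    return correct / len(schemas)
```

The point the schema designers emphasize is that nothing inside the sentence itself tells a resolver which candidate is right; the answer encodes the bit of real-world knowledge being tested.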

Exactly - that’s probably the reason they went to ChatGPT in the first place - because they didn’t know anything about the question, but for some reason, felt compelled to answer it.

It’s like a worse version of answering a complex question by just copy-pasting whatever Google spat out. Google doesn’t always present the best answer at the top of the results, but at least with googling, the answer represents more or less exactly something someone wrote, somewhere, that was out in the open and potentially subject to the scrutiny and criticism of people who do know.

There are times when people might want others to answer with results they found in a web search; questions of the kind ‘I am having trouble finding…’ or ‘I can’t understand…’ or ‘I am having trouble getting started with…’ - in those cases, someone posting ‘here’s a thing I found on Google that I think might help’ is actually… helpful.
But in a lot of other cases, like when someone wants a legal opinion, or wants advice from people with actual real-world experience of a topic, a hastily googled answer is far less likely to be useful.

And a GPT-generated answer is just that problem taken to a further extreme. Oh, you have a question? OK, here’s something that I don’t even know is true! But at least we got there fast, right?

It would be better just not to answer.

I do think there is some utility in putting an inquiry into ChatGPT and then using the output as a basis for further research and reading. It’s lazy, but it does give you a head start on some things to look into, the process of which tells you which of the machine’s assertions are legitimate and which are fabricated through confident-sounding but arbitrary predictive text. If you take the output and edit it judiciously and build your own reply from it thoughtfully and conscientiously, you can, potentially, produce something useful.

But to say “here’s what the AI engine says in reply to your question” and then simply copy and paste the output into the post, that might as well be taking a big steaming shit in the thread.

In theory, I think it sounds OK to use it as a starting point. In practice, are very many people actually doing that? I feel like the consumer base for GPT is predominantly the people who don’t want to do any legwork, either before or after.

Well, first of all, using a term like “consumer base” makes it seem like there’s an epidemic of posters using AI bots to make their posts. This is not the case here. The idea that posters “for some reason, [feel] compelled to answer” questions that they know nothing about, and are using GPT to do it without any further research, seems more like a hypothetical fiction than any sort of troublesome reality.

As I said much earlier, the OP does have some merit, and no one can argue with the fact that these systems do get things wrong sometimes, the extent varying with subject matter and among the different iterations of the technology. I just object to the idea that LLMs do nothing more than just “string words together” and, by implication, are producing garbage that some people are stupid enough to think has value. LLMs are becoming important and practical tools for information processing and retrieval but should be used judiciously, with all the appropriate caveats already mentioned with respect to proper vetting and attribution when making posts.

“Bullshit machine” might be more apt, then.

Which is probably why they’re catching on in many workplaces. Bullshit is currency.

(I wrote the Generative AI usage policy at my company. The TLDR is “you can use it as a crutch but you remain accountable and you cannot blame the machine if/when it fucks up.”)

Me too. The argument that (for example) it’s ‘nothing more than an elaborate autocomplete’ is facile. In practice, LLMs can do surprisingly complex tasks and display features that resemble comprehension.

However, there are many documented cases where the output of LLMs contains plausibly written misinformation, and the very fact that someone consults an LLM intending to relay its output as an answer in a message board post probably correlates with that person not having the capacity to check the veracity of what the LLM tells them; if they knew the answer, they would probably just post the answer.

There are other ways an LLM can be useful here. For example, if I find a very interesting but long scientific paper I want to share, would it be wrong to ask ChatGPT to write a summary of it so I can post it here after suitably checking it?

Or, if an LLM can do something easily that would be really tedious for me to do, such as creating a bullet list of concepts discussed in a book or something, should it not be posted here? Or posted without attributing it to an LLM?

If there’s a difficult concept you understand but can’t get across to people, you can ask an LLM to do a “tell me like I’m five” description of it. Since you know the subject, you know whether the description is correct or not, but it saves you the hassle of writing it yourself. Would it not be allowable to post that here if someone says they’re having trouble with a concept, so long as it was attributed? I’ll grant this is an edge case though, and there are arguments for and against.

I mean, I’d rewrite it using the GPT answer as a skeleton. I think it’s fantastic for helping me organize my thoughts/to give me a framework for an answer in my own words, or modifying its words as I deem appropriate. I think that’s perfectly fine and a great way to use an AI tool like this. I don’t think copy-and-pasting verbatim even with attribution is necessarily a good idea. I know I will place less stock in it than in something mildly rewritten by a human poster who I know has vetted the knowledge in it. Hopefully that makes some sort of sense – I don’t have everything worked out in my head quite yet.

Those all sound like pretty legitimate use cases to me; the middle one might be problematic if it’s a book you are not familiar with, but otherwise, you’re describing using it as a labour saving device rather than a magic oracle.

I find even this use case suspect. Learning how to take disparate thoughts and organize them into a coherent narrative framework is a critical thinking skill, one that atrophies without use and that will not develop in someone who is still in the phase of building their writing skills. Using a chatbot to perform ‘rote’ tasks such as generating boilerplate text appropriate to the application is one thing, but using it to actually do some significant element of mental labor is subordinating your own cognitive maintenance and contribution.

I’m sure many people will disagree with that statement and think it is just fine to let a bot do the grunt work of generating an outline and populating it with text so the user can focus on ‘high level’ thinking, but I think this is an insidious path to go down, far more so than the mechanization of human labor; with the latter, we have let our physical strength and skills atrophy, but then, those were very limiting. Granting away primacy over intellectual labor is literally forgoing what actually makes humans distinct and has given us control of our environment.

Stranger

Yeah we definitely disagree on that.

You may (or may not) be right about that. It remains to be seen.

An analogy I’ve made before is about the impact of photography on the creative and intellectual arts of painting and drawing. Painters produced landscapes and other true-to-life depictions because they were attractive; portrait painters applied their skills to giving those wealthy enough to afford it a picture of their likeness. The invention of photography rendered all that theoretically obsolete, especially as the technology improved.

One might have expressed a similar fear then that photography, if allowed to flourish, would kill the visual arts, and that the associated skills would atrophy from disuse. What actually happened was that the visual arts were not lost, but evolved along a few different trajectories. Traditional painting styles continued because they had merits that photographs did not, and they also morphed into more abstract, impressionistic styles which might not have happened without photography’s impact on representational painting, while for many, photography itself spawned a new art form.

There may be merit in applying this analogy to emerging AI systems, and seeing them as tools that will augment our creative and intellectual endeavors, not stifle them.

In short, I’m much more optimistic that we can use AI as a tool to help us find information, help us understand and organize it, and generally use it as an assistant to help us communicate information and ideas, without risk of losing our critical thinking skills.

But … Artists were always a small subset of humans doing a highly specialized thing. The artists who specialized in the more drudgerous works (yet another painting of a fat society matron or her fancy house. Sigh.) lost out, but the truly creative artists, not merely skilled paint-crafters, moved on to invent new art and new aesthetics beyond where early photography could go. And humanity as a whole gained even as those paint-crafters lost their careers.

OTOH …

If the impact of AI on the general public’s ability to think is anything like the impact of cheap 4-banger hand calculators on the general public’s ability to perform basic arithmetic, we are sooo screwed.


My WAG du jour and worth every penny you’ve paid for it:
As you say, it remains to be seen, but I foresee AI both enhancing the best and brightest minds amongst us and turning the majority of minds into useless porridge. Or rather, porridge even more useless than the Great Unwashed already are.

Making the more-porridge-heads also economically useless at the same time suggests the imminent formation of a real dystopia along the lines of Soylent Green or any number of other Hell-on-Earth novels, where ordinary humans outside the elite 0.1% are either prevented from breeding, or most of their offspring are simply ground up for oil or food.

I think there are probably too many posters who think they are well-equipped enough on a subject to let an LLM generate a summary and screen it to make sure it’s accurate. This will happen especially when the subject is adjacent (or in many cases not that adjacent) to what the poster believes is their area of expertise. In this case the LLM will act as a terrific Dunning-Kruger magnifier.

I’m so not sure … very seriously.

I think back on my career in medicine.

By the time I came along most of us were much poorer at diagnosis of heart conditions by auscultation than the generation before us. They had to make major clinical decisions by what they heard; I just needed to decide if it was odd enough to warrant ordering the echo.

I had the same thoughts you had regarding the rise of what got labeled “evidence based medicine” but what in reality was “follow the guideline” medicine. To no small degree, many felt: why bother with critical analysis of the literature when you’re going to be judged more by guideline compliance than anything else? Let the guideline makers do the critical thinking. My mantra was that evidence based medicine does not mean compliance with guidelines. Guidelines are a default and a start.

And to some degree it has played out like that: reliance on guidelines rather than on literature evaluation. But not much. New people coming out of training still think critically and well.

Not sure how this tool will impact the ways in which the average person thinks. I am fairly sure it will, and I share your fears, but most of the time we just end up using the same amount of brain effort, heck, any effort, in other ways. It will be a process.

Any other SDMB old-timers pleased to see the SDMB has lasted long enough that this has become an issue? I’m looking forward to someday seeing a Great Debate thread on whether The Technological Singularity was a good thing or not.

What about using AI for a math calculation?

For example, maybe I want to figure out the time dilation that would occur on a constant 1g acceleration trip from Earth to Alpha Centauri.

I am not qualified to do that calculation. I could probably figure it out but it would require way more work than I am generally willing to put into an internet thread.

If I can ask an AI to do it, is that OK? We’ve had Wolfram Alpha, which kinda did this, for some time now. The newer AI models just seem better at understanding the question.

(NOTE: I recently did this and I posted the wrong answer.)
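For what it’s worth, this particular calculation can also be sanity-checked without an AI, using the standard relativistic rocket formulas for a constant 1g burn with a flip at the midpoint. Here is a rough sketch in Python; the 4.37 light-year distance and the exact constant values are assumptions, and the numbers are approximate:

```python
import math

# Assumed values; Alpha Centauri is roughly 4.37 light-years away.
C = 299_792_458.0        # speed of light, m/s
G = 9.81                 # 1 g proper acceleration, m/s^2
LY = 9.4607e15           # one light-year, m
YEAR = 3.15576e7         # one Julian year, s
D = 4.37 * LY            # Earth to Alpha Centauri, m

# Relativistic rocket: accelerate at 1 g to the midpoint, then decelerate.
# For each half-trip of distance D/2:
#   ship (proper) time:      tau = (c/a) * acosh(a*(D/2)/c^2 + 1)
#   Earth (coordinate) time: t   = (c/a) * sinh(a*tau/c)
half = D / 2
tau_half = (C / G) * math.acosh(G * half / C**2 + 1)
t_half = (C / G) * math.sinh(G * tau_half / C)

ship_years = 2 * tau_half / YEAR
earth_years = 2 * t_half / YEAR

print(f"Ship (proper) time: {ship_years:.2f} years")   # roughly 3.6 years
print(f"Earth-frame time:   {earth_years:.2f} years")  # roughly 6.0 years
```

With those assumed values this gives roughly 3.6 years of ship time against about 6 years of Earth time, which is exactly the kind of answer worth checking by hand before posting it, AI or no AI.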