Request: don't put ChatGPT or other AI-generated content in non-AI threads

Me, too, although that may change one day. At the point where an AI registers for an account and posts under its own username, I’ll be happy to read and evaluate its posts the same as any other user’s.

Until then, it falls between two poles. At one end of the spectrum are humans–and if someone was like, “I’m cutting and pasting what my wife said about the topic of this thread,” I think they’d be modded for doing so, because their wife should join the board if she wants to participate. At the other end of the spectrum are search engines–and if someone was like, “Here’s what Google returned when I searched for the topic of this thread,” I suspect they’d also be modded. ChatGPT falls somewhere in between.

And that’s legit great. I know many, many people get both utility and pleasure out of interacting with ChatGPT, DALL-E, and so on. That’s fantastic for them! But as you say, it’s not what we’re here to talk about.

Let ChatGPT speak for itself–either by registering for the boards, or by hanging out on its own servers and letting us come to it.

How often does this happen? I mean, if you asked me to guess how many times I’ve seen it, I’d say maybe two or three times on the board in the last year, since GPT became the talk of the town. Am I just not in the right threads? But, yes, I kinda rolled my eyes when there wasn’t any further dissection of the answer, just blatant copypasta.

Exactly.

It is completely possible that a near-future generative AI could autonomously participate as a poster here. And its posts might be of higher quality than many others! Still would not be interested in that interaction.

I still want to play Go against people rather than the machine set at my competitive level. Same idea.

I have sometimes said “here’s what I found in a ten minute google search” and not been modded. I gave the specific cites that I thought relevant, though; and generally specific quotes from them to show why I thought them relevant.

My reasons for doing it that way were to state that I hadn’t done a thorough study of the subject, and people should take what I’d found for what it was worth; but I did think I’d found something useful.

If I do similar searches and don’t think I’ve found anything useful, then I don’t make the post. If I’d asked an AI to do the search for me, in the current state of AI, I wouldn’t know whether I’d found anything useful or not. I would still have to hunt up the cites. ETA: It would be more like saying “Here’s a link to the first page of a Google search, there are more pages, I haven’t read any of the results.” Which I would rather expect to be modded; at least, unless the question in the thread was ‘can you find anything about x subject on google?’.

I haven’t kept track or anything, best I can say is “often enough that I’ve developed an opinion.”

Sure, the two or three times was enough for me to say, come on, we can do better with that. And I love these LLMs.

I think the rule should be that if the AI content is properly labeled and the poster believes that the output is on topic and interesting, there’s nothing wrong with it. We should judge the content, not the source.

Let’s say we are discussing a subject, and I ask ChatGPT about it and it comes up with a well written paragraph that illuminates something important that hasn’t been discussed in the thread. Should we not see that because an AI came up with it? Is it not a valid topic for debate? Or should I rewrite it and not attribute where the idea came from?

As AIs get better, they will have increasingly interesting and thoughtful answers to issues we discuss here. It doesn’t seem like fighting ignorance to disallow posting of information that’s enlightening, just because it came from an AI.

Maybe instead of a hard rule we should just rely on the judgment of posters and mods to determine when the use of AI content goes over the line, just as we do with human content.

I mean, I think there should be some human commentary on the GPT response, not just a cut & paste job, any more than a cut & pasted link with little commentary surrounding it would be acceptable here.

Agreed. That’s like copying and pasting from an article you find online and pretending those are your own words. It’s a bad thing to do.

If someone did that and someone else thought the writing was familiar, put the text in a search engine, and found the original article, I think they’d call them out and rightly so.

The problem is that if you post stuff from an AI, it’s likely something generated from your unique interaction with the AI, so it’s harder to catch and easier to get away with. It’s still not a good thing to do. I’d say it’s an awful thing to do, really: it’s dishonest and it’s not good for the community.

Consider if I went to a forum where artists share sketches with one another and hopped into thread after thread saying, “Ooh, good idea for a sketch! I asked DALL-E to make me several and here they are!”

I would be discouraged from doing that again, and rightly so, because the purpose of the forum is not “share neat images.” The purpose of the forum is “share neat sketches you have drawn.” The ethics of plagiarism and creativity and all that aside, I would be acting contrary to the specific purpose of that community.

Ultimately, the purpose of this forum is “bullshit with people and share your specific knowledge with them.” I don’t want to talk to ChatGPT via a human intermediary. I can do that any time I want without the middleman.

It’s no different than when people ask questions in FQ that they could have Googled. The point is the conversation, and it’s why “why the hell didn’t you just Google that?” isn’t an appropriate response.

So I guess I ultimately find it to be a little bit rude and a little bit against the implicit social contract of a forum.

I see nothing wrong with someone starting a thread that disallows any AI content. We do that already in the photography contest threads.

But a blanket ban on the board doesn’t seem right, or even a blanket ban in non-AI threads.

But I do agree it should be used sparingly, and only when it constitutes a legitimate contribution to the debate that hasn’t been brought up by others.

For threads that have run their topic course and devolve into yucks and jokes, I don’t care how much AI content there is at that point - so long as it’s funny.

The problem is that there is often little real content, and the verbiage that is posted is often misleading or flat-out wrong. I recall a thread about the relationship between Einstein gravity and quantum mechanics where a responder prompted ChatGPT with a suitable interpretation of the o.p. and then chucked the output into his post. The post read like a pretty well-written Wikipedia article that hadn’t been vetted: it was full of subtle but definite errors and misapprehensions. It was good enough to completely fool a layperson and provided negative value because of the mistakes.

I think something that is not widely appreciated is that these LLMs are “deception machines”; the focus in development isn’t to provide accurate information or provide source references but to give the user the impression they are interacting with a human-like agent. They’re built for rapport rather than factuality, and they are so good at giving the impression of conscious behavior that even some artificial intelligence researchers start developing a theory of mind about them despite a complete lack of evidence of anything a neuroscientist would recognize as cognition. The ‘reasoning capacity’ demonstrated in them is an artifact of the structural logic inherent in language, and they are easily fooled by anything requiring a context outside of the texts in their learning set (something that will doubtlessly improve with multi-modal models but they still won’t be constructing a wider context of the world from direct interaction).

The complaint that it is reductionist to describe LLMs as “stochastic parrots” may have merit insofar as they are more useful than a psittacine, but they have less actual volition or interior cognitive processes than any animal with a complex brain, and as a consequence all they bring to a discussion is the most puerile and often-wrong information. Posting the output of a chatbot as a response is the lazy person’s way of demonstrating minimal participation, essentially just being a proxy for the LLM. And unfortunately, this is the slope we are going down, where people will stop putting effort into critical thinking or formulating a coherent response of their own ideas (if they have any) and will instead just paste some smart-sounding nonsense from a machine that has no philosophy itself and no accountability for errors.

Stranger

If anything, AI has taught me to hate the word “stochastic”.

Yep. So the onus is on the poster to fact-check the LLM output before posting it, just as they are supposed to fact-check themselves before posting. I don’t see a problem there. Don’t post wrong stuff. Do your homework.

ChatGPT was supposed to be a research prototype. They thought the public release would be quiet and small and they could expand their Alpha testing to Beta and gather more human feedback. It exploded and really hit prime time before it was ready.

LLMs are not designed to be ‘deception machines’, and they don’t behave that way. They CAN be deceptive, but when they are it’s an accident. The researchers care very much about accuracy. It’s just a hard problem with a one-pass ‘brain’ that lacks recurrence.

But accuracy is improving. Q-Learning builds result trees after parameter changes and tracks downstream results, allowing for better weighting of parameters and better performance (a tree traversal to find a hallucination is a lot cheaper than running a complete inference).

Another technique being used is to have the LLM do an internet search and compare its output to a trusted source or a consensus. Yet another is to have one LLM check the output of another before it’s handed off to humans. Since hallucinations seem random, they shouldn’t hallucinate the same way at the same time, and the error rate might plummet at the cost of doubling the compute.
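
To make that cross-checking idea concrete, here’s a rough Python sketch. Everything in it is hypothetical: `ask_model()` is just a stand-in for whatever API you’d actually call, not a real library function. The point is only to show the shape of the idea, that two independently run models are unlikely to hallucinate in exactly the same way, so disagreement is a cheap red flag.

```python
# Minimal sketch of "one LLM checks another" (hypothetical helper, not a real API).

def ask_model(model: str, prompt: str) -> str:
    """Placeholder: send `prompt` to `model` and return its reply."""
    raise NotImplementedError("wire this up to your LLM provider of choice")

def answer_with_check(question: str) -> str:
    # First model drafts an answer.
    draft = ask_model("model-a", question)
    # Second model is only asked to verify, not to answer from scratch.
    verdict = ask_model(
        "model-b",
        f"Question: {question}\n"
        f"Proposed answer: {draft}\n"
        "Reply SUPPORTED if the answer looks factually sound, "
        "otherwise reply UNSUPPORTED and explain why."
    )
    if verdict.strip().upper().startswith("SUPPORTED"):
        return draft
    # Disagreement doesn't prove the draft is wrong; it just flags it for a human.
    return f"Low confidence; second model objected: {verdict}"
```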

There are undoubtedly other methods being worked on. We’ll get there eventually. But as of now, yes, everything an LLM says should be assumed to be wrong until you check it.

The ‘stochastic parrot’ or ‘just next-word prediction’ dismissal of AI misses that there is no way to do general, accurate next-word prediction without a rich model of the world, the relationships between things, human emotions, and everything else captured in the text that they ingest in training.

Ilya Sutskever, who was head scientist at OpenAI, keeps repeating this every time people claim that it’s ‘just’ next word prediction. You can’t do next word prediction without essentially reasoning about the world from a rich set of models in the LLM.
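
For anyone unfamiliar with what “next-word prediction” means mechanically, here’s a toy illustration in Python. The hand-written bigram table is obviously nothing like a real LLM, which replaces it with a transformer over billions of parameters, but the outer generation loop (pick a next word from a conditional distribution, append it, repeat) has the same shape.

```python
# Toy "next-word prediction": repeatedly sample the next word from a
# distribution conditioned on what came before. A real LLM conditions on
# the whole context and learns the distribution; this table is hand-written.
import random

bigram = {
    "the": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"sat": 0.4, "ran": 0.6},
    "sat": {"down": 1.0},
    "ran": {"away": 1.0},
}

def generate(start: str, max_words: int = 5) -> str:
    words = [start]
    for _ in range(max_words):
        dist = bigram.get(words[-1])
        if not dist:           # no continuation known; stop
            break
        choices, weights = zip(*dist.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the"))  # e.g. "the cat sat down"
```

The argument above is that making that conditional distribution accurate across all of human text forces the model to encode a great deal about how the world works; the dispute in this thread is over what that encoding amounts to.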

Mechanistic Interpretability investigations have been finding evidence of those models, such as the equivalent of the “Halle Berry Neuron” that fires when any number of associated concepts or objects are referenced, and dedicated ‘circuits’ that evolve to efficiently solve common issues like addition.

LLMs are not living brains. They aren’t recurrent. They are a straight one-pass, fully connected neural network, so they can’t ponder, ruminate, explore on their own, think about what they just said (unless prompted to), etc. That makes them alien, but it doesn’t mean they are spouting meaningless text, or incapable of coming up with new things.

It’s not even clear that there isn’t some form of fleeting consciousness extant when an inference is running. Ilya Sutskever thinks there might be, or at least that the possibility can’t be discounted. There’s a hell of a lot going on in an LLM each time an inference is run. Most of it we still don’t understand.

Also ‘next word prediction’ might describe how our own speech center works. Maybe we’re all ‘stochastic parrots’. If we discovered that the System-1 process of speech involved just that, it wouldn’t take away anything from the richness of our minds. It’s just that the output process turns out to be a certain rather simple mechanism.

To be clear, I’m not referring to something like “Here’s the best article I found in a Google search.” I’m talking about, “I searched for the term ‘baby platypuses’ and here’s the page of Google results.”
https://www.google.com/search?client=firefox-b-1-d&q=baby+platypuses

The first is totally fine. The second is what I was talking about when I said people would get modded for posting the results of a search engine query.

As for the discussion of what the current state of AI is: I think that’s awesome material for another thread, but not relevant here, except for people who disagree that ChatGPT results are notoriously unreliable. And I’m not calling for a ban on using them, just for people to exercise self-restraint, recognizing how annoying those sorts of posts are for so many readers.

But then again, I would prefer to see “human commentary” instead of only bare links in all those posts that consist merely of a one- or two-word link to some YouTube video.

For example (made up):

Poster A: “We must remember that Hunter Biden is not Joe Biden and therefore needs to be judged on his own merits.”

Poster B: “It’s like this.”*

And the link is to some lame comic book movie clip or Goatse or whatever. How does that advance anything interesting? I’d rather read ChatGPT mishmash than chase bare links around. At least that has some actual content (even if made-up) instead of a link to a non sequitur.

*link is made up

The fundamental problem is that people are posting information that they aren’t able to fact check; not with any intent to deceive but just because they don’t have the knowledge to interpret the output as wrong. I suppose this falls into the same category as cut & paste from any online source but at least if you have the source reference you can make some assessment of the integrity of the source, whereas the output of a chatbot is literally unverifiable.

When I refer to them as “deception machines” I’m not referring to the factuality of the content; it may be correct or wrong, but the LLMs underlying chatbots are designed to give the impression of an authoritative, human-like agent which responds with great confidence even when the result is completely wrong or utter gibberish. People respond to ‘confidently wrong’ answers with far more credulity than they do to carefully qualified factual answers, because they like the feeling that the answer comes from a place of assured knowledge even if it is actually total bullshit; hence why many people were doubtful about the CDC’s carefully worded information about vaccines but were all in on hydroxychloroquine as a treatment for COVID-19. These systems are designed to be confidence-inducing, capable of creating a digestible explanation of whatever the user is asking about even if it is complete gibberish to an actual expert.

Except an LLM does not have “a rich model of the world”; it has a highly detailed model of an idealized world based upon its training set, which may (or may not) accurately reflect the real one. If all you fed it was Jabberwocky, it would have a detailed model of the Looking Glass world and a vocabulary full of nonsense words, and it wouldn’t have any clue that it was right or wrong because it literally has no other way of accessing the external world or vetting the accuracy of information provided to it. That LLMs can do a mostly cromulent job of making appropriate causal or associative relationships between objects or actions is because these are represented in the language in their training set; although a simple grammar and a sufficiently large vocabulary can make a virtually infinite number of grammatically correct sentences, only a tiny fraction of those are actually sensible statements, and those (presumably) make up essentially all of the training data, so the model will provide approximately correct things when you ask it about the rules of baseball or how to perform arithmetic even though it knows fuck all about America’s Pastime or how many apples Johnny took away from Julie. It’s a really complicated stochastic parrot that can autocomplete sentences and whole paragraphs from its complex network of statistical relationships, but it isn’t ‘thinking’ in anything but a very narrow sense of how to provide the most appropriate answer to a prompt based upon how its neural network has been shaped by data.

I know that someone is going to come in and say, “That’s exactly how a child learns, too!” but it really isn’t. A neural network is a crude approximation of how we think people learn, but even a cursory understanding of human neuroscience reveals that the brain has innate structures for comprehending and processing certain types of stimuli which start functioning long before a child is ever exposed to the outside world, and which produce an internal experience of sentience even before a child can voluntarily control their body, as well as affective responses that can be found nowhere in any AI model of any degree of complexity. Ilya Sutskever and other ‘experts’ can speculate all they want that LLMs are or may be conscious, but while we can’t define what consciousness fully is, it is pretty clear that AI models do not have the capacity for it in terms of an interior experience that is distinct from their generative processes. And this is exactly what I mean by “deception machine”; ChatGPT 3.5 is so sophisticated at producing cromulent and mostly appropriate output that it sure seems like there has to be a little virtual homunculus inside of there doing some actual thinking and dreaming. But there isn’t any structure within the model that actually does that; nor is consciousness required for doing really complex operations using a conditioned neural network, which is really the point and the promise of these systems: to be able to work with massive datasets and deal with ambiguous input that is actually beyond the ken of fully conscious human brains.

Now, it may be, as you observe, that many of our below-the-level-of-volition processes are similar ‘stochastic parrot’ models, and in fact that would be a good explanation for reflexive groupthink and why people so readily fall into cults and tribal behavior even against their best interests; indeed, many cognitive neuroscientists think it possible and even likely that most of our decision processes are below the level of consciousness, and most of what our internal monologue is doing is rationalizing why we just ate that fifth slice of pizza or drove through a flashing railroad crossing. But we still have cognitive processes that are beyond generating blather and reflexively procreating (well, some of us, at least), and there is zero evidence or reason to believe that large language models or generative AI has anything like this, or that it will spontaneously emerge from just larger datasets. The kind of complexity necessary for actual cognition would require a structural change in the underlying system, instead of just more data and connections.

Stranger

Oh, I agree that LLMs currently don’t have any sort of ‘internal monologue’ running outside of inference. But then, we don’t have one either when we’re unconscious. What happens during inference or training, however, is not clear. Not even to the creators.

I agree. I would suggest that we shouldn’t allow such postings, unless the post is being made to ask people to help explain the output. But posting something from an LLM as ‘fact’ that you haven’t fact-checked for whatever reason, should not be allowed.

I see what you mean, and I can agree with that. Usually when humans repeat nonsense they don’t do it with the confidence and detail that an LLM can cough up. That can lead people to accept things they wouldn’t accept from a person.

But they aren’t ‘designed’ to induce confidence or to impress people with bullshit. In a way, they weren’t ‘designed’ at all. When the transformer architecture was developed, no one knew what would happen. And when they trained it, new capabilities emerged along the way. Surprising capabilities no one designed into the thing or even expected it to be able to develop.

Okay, there are several different questions here:

  • Do LLMs contain a rich model of the world in their neural nets?
  • Does the limited or selective training data cause its models to be wrong or simplistic?

I DO believe they contain a rich model of the world. I think reading the corpus of human text and images and video and audio gives it all it needs to build a rich model of the world. I don’t think it could do anything remotely like it’s doing now without it. And we’ve already found physical evidence of such structures, such as associative ‘neurons’ that tie concepts together. If you show a picture of a cat and a woman to ChatGPT and ask, “What woman am I thinking of?”, it will tell you Halle Berry, because she was Catwoman. Or maybe another cat woman. Maybe it would say Nastassja Kinski. But somewhere it has made a connection between a cat, a woman, and a woman who played Catwoman. And when it thought ‘Catwoman’, the neurons representing Batman, Gotham, or other associations would light up.

That’s exactly how human brains do it. In fact, the ‘Halle Berry Neuron’ is a concept from studying the human brain. We find them in LLMs.

This doesn’t mean they ‘think’ like we do, or that they are conscious. I’m sure the process is utterly alien to us. But I’m also sure we’re seeing a lot more than a neat trick or a stochastic parrot good at fooling us.

As for the training data giving them a ‘limited’ view of an idealized world… I don’t think there’s anything idealized in the training data, and the LLMs have read a lot more about the world than you and I ever have. If anything it’s us who are laboring under the weight of not really knowing a whole lot about the world. For example, I’ve never read a book in Swahili. I have read only a tiny fraction of scientific papers, and I haven’t visited most countries on Earth. I’m ignorant compared to GPT 4. And so are you, and everyone else.

Sean Carroll is on your side, though. But I think most AI researchers are on mine.

I don’t think that the intent was to make generative language-based models that lie, but the use case is clearly something that seems like a trustworthy, human-like agent, regardless of whether it is capable of vetting the factuality of information presented or not. In fact, there is almost no effort put into ensuring that these agents are capable of checking facts, and that is because there really isn’t any way to validate the integrity of information they present other than post hoc reinforcement. But the objective is to make something that interacts with a human user in a clear, confident manner, not to create an interface with high information integrity.

I think LLMs have a ‘rich model’ of the text provided to them, but it is just text. They seem like they understand the larger context of the world because of the extensiveness of the training set, but then they are easily confused or fooled by things that aren’t well represented in a textual form.

I don’t know what you mean about “physical evidence of such structures, such as associative ‘neurons’ that tie concepts together”, but I’ll note that within biophysical neuroscience, the modeling of the functions of even the individual neuron is still at a pretty immature state, and it is becoming increasingly apparent that at least some ‘processing’ of sensory information actually occurs outside the neuron. Whatever ‘physical’ evidence might exist in an LLM running atop a silicon substrate, it certainly isn’t like a neuron, and certainly not like the matrix of a vertebrate brain with its myriad of connections and interactions between axons, dendrites, and glial cells.

The association of ‘Catwoman’ to ‘cat’ and ‘woman’ is so semantically and grammatically obvious that I don’t think it is even a very remarkable example of deduction; it is an almost simplistic example of what you would expect a grammatical logic system to do, and LLMs have actually made much more complicated associations, which again is expected given the vast size and scale of the training set for something like ChatGPT. The one thing large language models should be really good at is making associations based upon grammatical and vocabulary patterns, but that isn’t indicative of any deeper cognition; it is purely an emergent result of the vast computation that is possible through a heuristic system with modern computing resources, trained to develop a vast number of associations. How neurons in the human brain ‘recognize’ individuals is still an active area of research that isn’t well understood, but it isn’t through a network of linear associations.

I’m not at all convinced that what we’re seeing isn’t just a really neat trick, although I’m not sure that we will ever have the ability to delve into the black box of complex neural networks to really understand how they function in any more than a trifling depth. I don’t think they have comprehension of a larger world beyond what is ‘generated’ to respond to the query, which is a product of the vastly greater amount of data available to them, although the number of ‘associations’ they can make is limited by the digital substrate versus the much more complex connectivity in the vertebrate brain. However, this is a comment about current LLMs and other generative AI; I find it not at all unlikely that further advancements, particularly in the development of more complex computing ‘substrates’, will permit something akin to cognition as neuroscientists would recognize it.

And frankly, I have no small amount of fear about that, because these systems, with real cognitive abilities and instantaneous access to data and entrained responses far beyond what a human could absorb and integrate in a lifetime, will be so far beyond human conception that they will be able to manipulate human users with an ease that even the most charismatic politician could not muster. What happens when you have an AI that doesn’t itself have access to critical infrastructure or weapons but is a perfect Music Man of influence with its own objectives that we can never discern? People are already irrational, easily influenced, and ready to adopt ideologies that are contrary to their interests; what happens when you have a system that can outwit a human thinker at every avenue, and anticipate every act of intellectual resistance? Fortunately, I don’t think these are right around the corner based upon evolutionary developments of GPT and similar language models, although I suspect when that does happen we will have essentially no time to respond to it, and we are making zero effort to anticipate or regulate such developments, or even think about safety aspects beyond the most puerile protections. If and when it happens, we will be at the mercy of such a system if it can gain control over critical infrastructure or influence decision-making, and we absolutely love handing over control to automated systems with minimal controls or validation.

I like Sean Carroll and respect his superior knowledge of physics and enthusiasm for other sciences, but I frankly don’t think he is all that far in front of me with regard to AI. However, I think most professional “AI researchers” vastly overestimate their understanding of neuroscience and try to make associations that are well beyond actual evidence or knowledge, while actual neuroscientists who follow AI research are highly skeptical that what these systems are doing actually reflects cognitive processes on more than a superficial level. Given that people promoting AI research tend to be the height of performative narcissism while neuroscientists as a group are largely humbled by their realization of how little we actually know about the workings of the brain, I’m inclined to believe that the latter actually have a better grasp of our knowledge, or at least our ignorance, regarding genuine advances in machine cognition.

Stranger

My understanding is that Bing Chat does provide source references, since it sources its information from the internet (I don’t use it myself). So while you have a valid point, it’s not necessarily true for all LLMs. Furthermore, in most cases the GPT information can be verified by reference with reputable sources. Your example where the subject matter is too esoteric or nuanced for anyone but an expert to detect factual errors seems like a real outlier, and in such cases the poster himself could misunderstand an authoritative source and post incorrect information in just the same way.

But the idealized model of the world formed by GPT-4 is accurate 94.4% of the time, if its performance on the Winograd Schema Challenge is accepted as a valid measure. This is impressive because the schemas are explicitly designed to test for the kind of common-sense real-world understanding that until recently AI has been notoriously lacking.
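
For readers who haven’t seen the format, here is the classic Winograd schema (the kind of item the 94.4% figure above refers to), sketched as a small data structure; the referent of the pronoun flips when a single word changes, which is why grammar alone can’t resolve it:

```python
# The classic Winograd schema (Winograd/Levesque): changing one word
# flips the referent of "they", so resolving it requires common-sense
# knowledge rather than surface statistics.
schema = {
    "sentence_a": "The city councilmen refused the demonstrators a permit "
                  "because they feared violence.",
    "answer_a": "the city councilmen",   # "they" = the councilmen
    "sentence_b": "The city councilmen refused the demonstrators a permit "
                  "because they advocated violence.",
    "answer_b": "the demonstrators",     # "they" = the demonstrators
}
```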

Stranger already covered this but I’ll just stress that while I admire Sean Carroll and he’s something of a polymath, he’s not an AI researcher.

Seems like a highly biased assessment based on various preconceptions. What about the views of a distinguished cognitive scientist with a strong and extensive background in AI – someone whose background encompasses research in both human cognition and computational intelligence? Whose side would they take?

OK, it’s a trick question and only an anecdote, but anecdotally the perspective created by a background in both fields in this case lined them up with the AI researchers and in disagreement with some well-established neuroscientists. I’m just saying I don’t think your generalization is very accurate.