Because the organizations are using it for simpler things. Suppose a company decides to use AI for answering customer service questions: it just needs to select the answer from a few hundred or few thousand pages of documentation. But suppose you or I ask AI a question: it gets the answer from billions of pages or more of training data, much of which is inaccurate.
New York or New Yorker? The New Yorker ran an essay on exactly that, which doesn’t mean New York didn’t either.
Sales of blue books for exams are increasing as professors are making their students write essays in class rather than at home. So, handwriting may be coming back.
A rosy article in the Times Magazine (I think) about how AI won’t destroy our jobs gave checking AI output as the job of the future. Those lawyers who gave the judge hallucinated cites could have used that.
But I wonder how many people are going to love a job where they go to college or get specialized educations in order to proofread what the AI produces.
New York.
No worries, once the AI has access to the source code, the whip factories will be bug free.
In case that’s too subtle: this is a social media site. Posters are not here to simply get questions answered, but to engage with others about various topics. Until AI does unprompted socialization, it’s not a useful replacement for posters on this site.
I can think of several posters where I might prefer an AI make their post instead of them.
I personally haven’t messed w AI at all. At least not beyond reading Google & Bing “search results” that have become AI-written synopses instead. I mostly skip over those, but read them a bit. Punchline being I’m not competent to put my next paragraph into action. But I have thought about the idea. To wit …
I wonder if there is a format of interaction with current online AIs akin to a chat or threaded dialog where you or I could feed it e.g. this thread of 45 posts and ask it to write a coherent next post, or a reply to some or all of an upthread post of its choice, and just let it go and do its thing?
And get a useful result that the humans would be interested in continuing to interact with as just one more poster in the crowd.
In another AI thread recently I made joking mention of the “md-2000 Test” replacing the old Turing Test, based on a witty comment @md-2000 had made about AI capabilities.
I propose that my idea in this post be named the “Doper Test”. If an AI can pass that, it’s doing well for itself and for us.
The flip side of that is this is not a useful site for those who just want answers and not engagement. No younger users will look at our structure and decide that’s what they want. Take a look around. These are all the names you will ever see here.
They’ve been doing that literally for decades. I got an automated email response from my useless cable company more than 25 years ago. What’s new with technologies like LLMs is a much deeper level of engagement, with much greater opportunities (from the C-suite point of view) for displacement of low-level clerical jobs.

But suppose you or I ask AI a question: it gets the answer from billions of pages or more of training data, much of which is inaccurate.
Then you might get a useless, inaccurate response. Remember that systems like ChatGPT at this stage are experimental, essentially a proof-of-concept. The corpus on which they’ve been trained hasn’t been curated for accuracy, or indeed, AFAIK, curated at all.

I wonder if there is a format of interaction with current online AIs akin to a chat or threaded dialog where you or I could feed it e.g. this thread of 45 posts and ask it to write a coherent next post, or a reply to some or all of an upthread post of its choice, and just let it go and do its thing?
And get a useful result that the humans would be interested in continuing to interact with as just one more poster in the crowd.
Much as I think that an appropriately tweaked LLM could easily pass the Turing test, I suspect that the answer to this challenge is “yes, an LLM absolutely could”. However, I’m not gonna do it, because (a) the current model of ChatGPT isn’t really tweaked for that purpose, and (b) I’m already at risk of getting an unwanted reputation for quoting ChatGPT too often.

The flip side of that is this is not a useful site for those who just want answers and not engagement. No younger users will look at our structure and decide that’s what they want. Take a look around. These are all the names you will ever see here.
I don’t think that’s exactly true. The real problem is that the nature of the world is such that the majority of new joins don’t really fit in. But we do get them. Why they don’t fit in is a different discussion. In some cases it may just be that they’re not acclimatized to this environment, which is definitely not Xitter-ish or Facebook-ish.

No younger users will look at our structure and decide that’s what they want. Take a look around. These are all the names you will ever see here.
So? All things must pass. We’ve had a good run.

There’s a problem with absolutist positions like “LLMs just parrot gibberish” (your statement from another thread) because such observations focus on some narrow fault(s) that may or may not even exist any longer, are not necessarily indicative of any intrinsic problems with the technology, and miss the big picture of AI’s broad utility. Of course there’s also the danger of being an overly optimistic evangelist, like many AI researchers were in the 60s. But we’re in an entirely new era now, and I think it’s the naysayers who will be proven wrong.
To be clear, I used the term “gibberish” to describe the specific response in this post as it was, in fact, totally wrong. I upgraded my evaluation of @Sam_Stone’s response from GPT-4 to “reasonably talented magpie” because it came closer to the correct answer even though it had a couple of errors in describing the algebra and badly fumbled the rounding to significant figures. Chatbots are certainly capable of providing readable and almost invariably grammatically and syntactically correct responses, although the reliability of those responses is highly variable depending on the specificity of the prompt and just how much ‘inference’ the chatbot has to do to produce a response. They can more or less be treated as Wikipedia articles written by Cliff Clavin after spending an afternoon at the dentist’s office coincidentally reading an article on the topic at hand, delivering some mixture of half-remembered fact and whatever filler seems likely to roll well off the tongue.
We are in “an entirely new era now”, and specifically one in which the confluence of computational power and storage, access to enormous masses of digitized data, and the ability to manage large-scale neural networks via a transformer architecture creates emergent capabilities not available to researchers in the 'Sixties, or for that matter the 'Nineties and 'Aughts. The coincidental development of highly optimized vector processors for interactive graphics displays, and of large-scale cluster management systems for parallel processing like CUDA, also made for an ideal computing architecture to handle vectorizable data such as natural language text. But if anything, current AI evangelists are even more “overly optimistic” in their projections of what LLM-based systems will do in terms of developing artificial general intelligence (AGI), motivated not only by academic enthusiasm but by the purported fiscal potential (and, for some, by the prospect of eliminating squishy human work units from the work ecology, to be replaced by easily controlled machines not prone to illness, inattention, or demanding labor rights). This has led to the same kind of hype that Silicon Valley technology companies apply to every area of innovation, from biotech to space technologies to virtual reality: taking an impressive but fundamentally grounded innovation and blowing it up into the next revolutionary, world-changing development. A critical eye to this behavior observes that there is always a cycle of increasingly hyperbolic evangelism, with any grounded criticism brushed aside as obsolete thinking, followed by a massive deflation or, in many cases, the total collapse of the nascent industry.
There are enormous amounts of capital and resources (including hydrocarbon-generated power and precious potable water) being put into training these models and running the data centers necessary to make them usable for pseudo-‘agentic AI’ (more on this later), despite the fact that they are not only not making anything like a profit but also lack a substantial use case that would justify the investment. Advocates often claim that these systems are ‘getting better’ as transformer architectures evolve and as systems for retrieval augmented generation and post hoc filtering of output are developed, and to the extent that these systems give more coherent responses that don’t completely go off the rails or immediately start ‘hallucinating’, this is true, albeit at the expense of running into limits of training data and ‘compute’. But these systems still make basic errors in comprehension and contextualization, and still produce erroneous results with fabricated citations or facts. This occurs because they don’t have the fundamental capacity to create accurate contextual models of the real world independent of the text (or, in the case of hybrid models, still images and video) in their training set, because that is literally the only information they have available; so even if these systems were capable of human-level conceptualization (which they aren’t, but we’ll set that discussion aside for the moment), they would be fundamentally limited in what information they have access to about the world. There is no clear path to making these systems reliable in distinguishing established fact from speculation or error, and not even a really good way to prevent them from producing fabricated citations or presenting ad hoc explanations that aren’t based upon verified information, short of explicit post hoc filtering rules.

Here’s an example where I’d welcome your criticism of an LLM response and how this comports with your position of generative AI being useless. A knowledgeable poster on this board, who I believe is a professor of computer science, in a discussion about AI and specifically about its level of confidence, wrote a fairly detailed analysis about the inherent weaknesses of LLMs. Given the subject matter, I thought there would be a whimsical irony in having ChatGPT itself provide a response.
Just to be clear, I’ve never stated a “position of generative AI being useless”; I have said that it is unreliable, and that there is no clear path to making it reliable enough to undertake ‘mission-critical’ responsibilities or produce work without human oversight and review. The responses that the chatbot produced to @Francis_Vaughan’s criticism were clear and at least mostly correct in their details about how the system works internally (and were doubtless reflected in some subset of the training set, albeit restated in a succinct form), but they missed responding to the more fundamental issues touched on in the post, specifically the issue of scope for inference. The post uses the example of asking the LLM to perform a mathematical operation, and then asking it to describe how to perform the operation, which an LLM will treat as distinct and unrelated queries in how it tokenizes the prompt and responds. I’ve actually seen that independently, where a chatbot will give a (mostly) correct answer but a completely wrong explanation of how to perform the calculation, or conversely give a wrong answer but present reasonably clear instruction on how it should be performed. While the chatbot’s responses regarding “implicit statistical inference” and “simulating inference” are true in themselves, they miss the point that a person would understand the inherent relationship between those two questions but an LLM does not, because they are essentially serial queries.
I think it is important to understand the difference between how an LLM performs inference and how a person does it. LLMs are, fundamentally, neural networks which have been trained on enormous amounts of textual data (far more than a human would ever be able to consume), using backpropagation (BP) to refine the parameter weights of an elaborate network of many layers of nodes so that it produces a statistically ‘likely’ string of text; the weights and their processing functions create implicit algorithms that produce the response. Once trained, the system takes a prompt, tokenizes it into vectors, and using the computational brute force of the transformer architecture churns out a ‘cromulent’ bit of text consistent with what is statistically represented in text and what it has produced before in the current context frame. A human brain, on the other hand, builds and constantly updates and refines conceptual models of the world, which allow it to be trained upon a tiny fraction of total human knowledge and yet process nearly any text (in a language it is familiar with) on about 25 watts of power, and with the capacity to reference previous discussions or seemingly unrelated concepts that are linked in a contextual frame. Humans don’t use backpropagation for learning and don’t really have anything like a transformer architecture; to the extent that we understand attentional systems in human cognition, their flow is quite non-linear and non-sequential.
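To make that loop concrete, here is a deliberately toy sketch in Python (random numbers stand in for a real trained network, and the tiny vocabulary is invented purely for illustration) of the tokenize, score, and sample cycle described above:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "stress", "is", "a", "cromulent", "answer", "."]  # invented toy vocabulary

def fake_logits(context):
    # Stand-in for the transformer forward pass; a real LLM computes these
    # scores with attention over the whole tokenized context window.
    return rng.normal(size=len(vocab))

def sample_next(context, temperature=0.8):
    logits = fake_logits(context) / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                   # softmax: scores -> probabilities
    return rng.choice(vocab, p=probs)      # sample a statistically 'likely' token

context = ["the"]
for _ in range(6):
    context.append(sample_next(context))
print(" ".join(context))
```

Nothing in that loop checks whether the sampled continuation is true; it only checks that it is statistically plausible given what came before.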
It seems amazing to most people, and even to people working in the field of generative AI, that a system designed to process and generate natural language text can ‘reason’, but in fact there is an implicit logic in the structure of how language is used, and if you map a sufficiently large corpus of examples of textual language into a BP-modified weighted artificial neural network (ANN) you are going to end up with something that can use language to ‘deduce’ correct answers to at least common questions and ‘simulate inference’. (That these systems can also perform mathematical calculations is also unsurprising, because algebra is essentially a grammar for doing various mathematical operations and is actually pretty straightforward, so deriving its rules implicitly from examples is an expected ‘emergent property’ of an artificial neural network.) This is an impressive but not a surprising result, and I find it a little shocking that people working on LLMs are not more conversant with neurolinguistics so as to understand this. However, if such a system were fed a corpus of grammatically consistent nonsense pseudo-text a la Lewis Carroll’s poem Jabberwocky, it would also produce a set of rules for manipulation and ‘inference’, even though there is no real-world meaning or context for the things and actions described within. It would confidently provide a definition for “galumphing” or “borogoves” consistent with use in the training text, without any comprehension that it is total nonsense.
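As a toy illustration of the Jabberwocky point (a crude bigram chain, nothing remotely like a real LLM, but the principle is the same at scale): train only on nonsense verse and the output is fluent-looking nonsense.

```python
from collections import defaultdict
import random

# The entire "training corpus": a scrap of grammatical nonsense.
corpus = ("twas brillig and the slithy toves did gyre and gimble in the wabe "
          "all mimsy were the borogoves and the mome raths outgrabe").split()

# Record which word follows which: the crudest possible statistical model of the text.
follows = defaultdict(list)
for a, b in zip(corpus, corpus[1:]):
    follows[a].append(b)

random.seed(1)
word, out = "the", ["the"]
for _ in range(10):
    word = random.choice(follows[word]) if follows[word] else random.choice(corpus)
    out.append(word)

# Fluent-looking, rule-following, and entirely meaningless, by construction.
print(" ".join(out))
```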
I’ve been using ANN-based tools for machine learning for about the last fifteen years, and the kinds of patterns they can tease out of large datasets, or the emergent capabilities they can create in terms of filtering algorithms, are very impressive. To be clear, I think it is amazing that LLMs can accurately manipulate a wide variety of natural language text and produce a pertinent response, something which symbolic attempts at machine cognition consistently failed at above some modest ceiling of complexity for decades. Because language is a uniquely human capability (as far as we know, although I reserve judgement about cetaceans), we tend to view it as fundamentally reflective of intelligence rather than a tool of it, and so when something talks it gives the impression of being smart, self-aware, and capable of independent thought even when that is shown not to be the case. I think people—including many researchers—are impressed with these systems because they are a mirror reflecting what people project upon them and the tone of language in their training corpus, rather than critically evaluating if and how they could actually be ‘thinking’ or developing volition.
This does represent a pretty revolutionary jump in the capability of this kind of ‘AI’, and there are real use cases for it, albeit probably not the kind of trillion-dollar industries that would justify all of the investment in AI research and training, and certainly not matching the breathless exuberance of what evangelists have been proclaiming it will be capable of “next year” for about the last five years. I don’t believe for a minute that these systems have a ‘spark of consciousness’ within them, and I am doubtful to the point of near-certainty that this is a path to AGI or even true machine cognition without some kind of radical change in transformer architecture that actually allows the LLM to be a quasi-self-aware, introspective, constantly self-updating system. I don’t believe they are going to start intentionally distracting us and building giant robot factories that will displace human workers en masse (although I’m sure many employers and CEOs are hoping and betting that will come to pass so they can eliminate the human element of ‘Human Resources’); the real danger isn’t that they’ll intentionally take over but that we will assign responsibility to systems that aren’t actually capable of any kind of reliable control or decision-making.
Stranger

But I wonder how many people are going to love a job where they go to college or get specialized educations in order to proofread what the AI produces.
Given how much time I already spend having to read through and correct chatbot-generated drivel, not very many. But even worse is that most people without experience are just going to assume the AI is correct. We already have the problem that engineers often become so reliant upon their tools that they don’t actually grasp fundamentals in a real-world context, and I’ve lost count of the number of instances this year where someone has called for creating a finite element model and performing detailed analysis to answer a question that could be solved with a closed-form hand calculation or a simple numerical routine for an iterative stress calculation. If making an FEM just required giving some vague statements to an LLM and getting a stress image back a few minutes later, almost nobody would even think about pulling out a Roark’s or Bruhn and paging through to find a pertinent formulation, and they probably aren’t going to realize when they get a completely wrong answer, either.
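As a purely hypothetical illustration of the kind of closed-form hand calculation I mean (the load and section values below are made up), the textbook check for peak bending stress in a simply supported, centre-loaded beam, sigma = M*c/I, fits in a few lines and needs no FEM:

```python
# Hypothetical numbers, purely for illustration: a 1 kN centre load on a 2 m
# simply supported beam with a 50 mm x 100 mm rectangular section.
P = 1000.0         # applied load, N
L = 2.0            # span, m
b, h = 0.05, 0.10  # section width and depth, m

M = P * L / 4.0        # peak bending moment for a centre point load, N*m
I = b * h**3 / 12.0    # second moment of area of the rectangle, m^4
c = h / 2.0            # distance from neutral axis to outer fibre, m
sigma = M * c / I      # peak bending stress, Pa

print(f"peak bending stress ~ {sigma / 1e6:.1f} MPa")  # ~6.0 MPa
```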
‘Efficiency’ is the watchword according to AI evangelists who don’t realize that the inefficiency of having to figure things out and learn is what makes for technical expertise and the ability to cope with novel problems by applying fundamental knowledge.
Stranger
I’d say Great Debates is in more danger. In my experience AI is not significantly better than regular Google at finding factual answers, and frequently just hallucinates. Summarizing and writing convincing-sounding verbiage is where it’s pretty impressive.

To be clear, I used the term “gibberish” to describe the specific response in this post as it was, in fact, totally wrong. I upgraded my evaluation of @Sam_Stone’s response from GPT-4 to “reasonably talented magpie” because it came closer to the correct answer even though it had a couple of errors in describing the algebra and badly fumbled the rounding to significant figures. Chatbots are certainly capable of providing readable and almost invariably grammatically and syntactically correct responses, although the reliability of those responses is highly variable …
I’ve never disputed the fact that LLM responses can be partly or even entirely wrong. So can human responses. And so can responses from IBM’s DeepQA engine, Watson, despite its confidence-rating heuristics. Yet Watson is already finding commercial applications. And so are fallible humans. As for the rounding problem, LLMs are still pretty bad at arithmetic in many cases, though generally pretty good with using algebra to solve problems.

They can more or less be treated as Wikipedia articles written by Cliff Clavin after spending an afternoon at the dentist’s office coincidentally reading an article on the topic at hand, delivering some mixture of half-remembered fact and whatever filler seems likely to roll well off the tongue.
I pointed you to that particular GPT response because it typifies the kinds of responses that I most often see from it. In this case, it exhibited an apparent understanding of a complex technical topic and responded to it with an apparent deep knowledge of how LLMs work. I see nothing in it that was significantly inaccurate. To describe this response as “Cliff Clavin … reading an article on the topic at hand, delivering some mixture of half-remembered fact and whatever filler seems likely to roll well off the tongue” is just preposterous and continues your overly dismissive stance.

Just to be clear, I’ve never stated a “position of generative AI being useless”; I have said that it is unreliable, and that there is no clear path to making it reliable enough to undertake ‘mission-critical’ responsibilities or produce work without human oversight and review.
I somewhat agree with this, but your previous comments go far, far beyond just “unreliable” and run to sarcastic dismissiveness. Nothing in this world is 100% reliable – not LLMs, not DeepQA, and not humans – and that shouldn’t be the primary criterion by which intelligence is judged, whether human or artificial. The key question should be, not “will it ever make a mistake?”, but “is it useful?”

Once trained, the system takes a prompt, tokenizes it into vectors, and using the computational brute force of the transformer architecture churns out a ‘cromulent’ bit of text consistent with what is statistically represented in text and what it has produced before in the current context frame.
This continues the dismissiveness by oversimplifying the way LLMs work, such as undervaluing their ability to learn attention patterns and compositional semantics (inferring the meaning of complex phrases and sentences from their component parts) and the emergent properties of absolutely massive neural nets that are capable of generalized reasoning, not just memorizing patterns. You are absolutely undervaluing the emergent properties that manifest at massive scale.

… and with the capacity to reference previous discussions or seemingly unrelated concepts that are linked in a contextual frame.
No one claims that LLMs exactly mimic human cognition, but it’s noteworthy that GPT certainly maintains context across earlier tokens in the same session, and there are signs of the ability to do a similar kind of contextual linking of seemingly unrelated concepts.

It seems amazing to most people, and even to people working in the field of generative AI, that a system designed to process and generate natural language text can ‘reason’, but in fact there is an implicit logic in the structure of how language is used, and if you map a sufficiently large corpus of examples of textual language into a BP-modified weighted artificial neural network (ANN) you are going to end up with something that can use language to ‘deduce’ correct answers to at least common questions and ‘simulate inference’.
This is a profoundly important point.

In my experience AI is not significantly better than regular Google at finding factual answers
It depends on the particular content you’re looking for, the specificity of your search terms or query, and a whole variety of just random factors. GPT can be significantly better than Google if your search criteria are vague, as it can help you refine your request and iterate to a more accurate question. It’s also helpful in providing a concise but generally detailed response directly tailored to your query, whereas the sites that Google points you to may ramble on about stuff you’re not interested in.
The one big advantage of using Google is that you can usually tell if a site is trustworthy (e.g., a university or reputable government agency) whereas you don’t necessarily know where GPT is getting its information or whether it’s processed it correctly (but you can ask it for citations). Basically ChatGPT is an experimental proof of concept, as I said before, and isn’t meant to be relied on for solid information, although it can do a lot more than just retrieve information – it can summarize long articles, for instance, usually quite well. Google is simply a search engine, and its information is no more or less reliable than the internet as a whole if the user doesn’t discriminate between trustworthy sources and bullshit.
Yep, browsing through my questions to ChatGPT, they have been 80% accurate, 10% GPT saying “I don’t know” but in a flowery way, and 10% where there was a significant inaccuracy. Which is pretty amazing, as I only ask it open-ended questions, and often after trying and failing to find relevant answers on Google.
(On that point: Google has gone down a lot IME. It’s not just the pile of sponsored links at the top; even the non-sponsored links seem heavily weighted towards an interpretation of the query that will drive commerce. I also have a thread on Google Maps starting to show cracks.)
I don’t say this to get into a back and forth of anecdotal data, just to say “YMMV” to the caricature some are making that LLMs are largely inaccurate.

you don’t necessarily know where GPT is getting its information or whether it’s processed it correctly (but you can ask it for citations)
And even then you don’t know, because quite often the AI response doesn’t match what is in the cites that it gives you.
And basically, let’s all keep in mind that the OP is recommending getting factual information from systems that until recently couldn’t even correctly count the number of Rs in the word “strawberry”.
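(For perspective, the counting task itself is a trivially deterministic computation in any programming language; a Python illustration:)

```python
# Counting letters is a deterministic computation; there is nothing to hallucinate.
print("strawberry".count("r"))  # 3
```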
But more importantly, the SDMB is a discussion board. Its entire purpose is for real live human beings (and the occasional dog) to discuss things. The purpose of GQ/FQ isn’t to provide google-able answers. It’s to provide factual knowledge and discussion from something a lot smarter than AI or Google and to have interesting discussions about such things.
The SDMB is not an AI repository and we don’t want AI answers here. AI isn’t a discussion among people. If all you have to offer is an AI generated post, don’t bother. You’re not part of the discussion. You’re just parroting a machine’s output. We want actual discussion here. That’s kind of the whole point behind message boards.
If all you want is an AI response, then go ask an AI and leave the SDMB out of it.
That said, I think AI has its uses, and I have no problem with people posting AI responses here as long as their post isn’t 100 percent AI generated, and the AI is used as a data point for further investigation/discussion and is not given as a factual response.

I’d say Great Debates is in more danger. In my experience AI is not significantly better than regular Google at finding factual answers, and frequently just hallucinates. Summarizing and writing convincing-sounding verbiage is where it’s pretty impressive.
I think it’s worse than what regular Google used to be. I’m extremely frustrated with the drop in quality of Google search results since they’ve been replaced with AI-generated answers.

And even then you don’t know, because quite often the AI response doesn’t match what is in the cites that it gives you
This is the problem I’ve had with search engine AI summaries. They include links. But when I look at the links, they don’t say what the AI answer says. It’s extremely frustrating and I feel I’m wasting my time even googling stuff these days.
Have I mentioned how frustrated I am at losing the fabulous tool that Google used to provide for free? I guess, easy come, easy go.
I actually think AI will be extremely useful in a bunch of well-defined tasks. I advise young professionals to learn to use it. But it’s not ready for prime time in lots of ways it’s currently being used.

The SDMB is not an AI repository and we don’t want AI answers here. AI isn’t a discussion among people. If all you have to offer is an AI generated post, don’t bother. You’re not part of the discussion. You’re just parroting a machine’s output. We want actual discussion here. That’s kind of the whole point behind message boards.
To be clear, I was not suggesting anyone post AI answers here*.
I am just speculating on a near-future where, I dunno, a question about what kind of laptop can do X gets responses of “Why didn’t you 'PT-it?”
* Indeed I share the annoyance many have with this practice. Mainly because AI answers are too verbose, so any thread with an AI quote quickly gets really bloated.
(And yes: I am aware I posted an excerpt of an AI conversation upthread, but obviously this thread is a special case.)
Since this is a question about the future I propose we consult another widely recognized and long-standing oracle of inscrutable knowledge.
Outcome uncertain; ask again.
Something became apparent in some of the other threads on AI, along with the amusing critique of my summary opinion:
As noted above as well, there remains a total disconnect between the LLMs providing an answer and telling you how to get the answer. In one example, when asked why it gave the wrong answer, the LLM blathered a totally ridiculous story that made no sense and had nothing to do with how an LLM actually works.
The problem we have is that the LLMs have been trained on a huge corpus of text, much vaster than just what is accessible on the Internet. It has been clear from the outset that all of the AI companies only have a passing acquaintance with intellectual property laws, and are quite happy to train on copyright works. Consider the huge holdings Google has of digitised texts. So assume the LLMs have been trained on as many textbooks, example exam papers and solutions, essays, research papers and the like as the trainers can lay their hands on. Every time you ask one a question, ask yourself: is this an original question, or is this likely a question that someone has asked sometime before in history? If the latter, there is every chance the LLM has ingested it and is going to regurgitate an answer or solution. Worse, your question may be close enough to an existing worked solution that it gives you that solution, and without working it through yourself you may never realise it isn’t quite the correct solution.
This leads me to why I am really worried about the impact on education. Students are already using LLMs to write assignments. There is a fine line between getting the LLM to tidy up the prose and getting it to construct the entire assignment. Sadly, for the vast majority of subjects taught, the amount of material out there that the LLMs have already been trained on makes it very difficult to set topics that are novel enough that the LLM won’t just reconstruct an assignment answer. Written assignments are almost hopelessly compromised already. Coding assignments in computer science are not far behind. It was bad enough in the past, when there was a thriving trade in assignment cheating.
The answer is inevitably going to be a reversion to long invigilated exams and supervised practical sessions. This adds significantly to the workload, and not all students do well under such pressure.
One of the pernicious problems has always been students that run the averages in assignments and exams. Rote learning a huge bunch of stuff, and just keyword matching on the question, then regurgitating everything they can think of. The logic being that, sure, the answer is wrong - but there is enough that is correct in there that they won’t get zero marks. And they will argue this with you. It might get them over the line. AI is more of the same, but worse.
I had a student who had actually written part of a coding assignment himself, but it didn’t work properly. So he asked ChatGPT why. He really didn’t want to accept that the answer he was given was wrong and had nothing to do with why his program actually didn’t work. The LLM was likely just regurgitating text that had appeared on a forum where someone was having an issue with a similar assignment topic.
Which gets us to the SDMB and GQ. The LLMs might provide plausible and authoritative answers. But there is nothing to say they are correct. They have a habit of delivering lots of text with that same scattergun, play-the-odds approach. They might be wrong, but there is likely enough correct in the answer that they are not totally wrong. Just enough to get a passing mark and scrape through. If a conceded pass grade is all you ever want in life, well, maybe an AI answerbot is all you ever need.

Something became apparent in some of the other threads on AI, along with the amusing critique of my summary opinion …
Just as an aside, I should mention that ChatGPT’s critique was in two parts, the first part listing all the points which it deemed accurate, and the second part listing the points it deemed “partially misleading or oversimplified”. So it fully agreed with many of your points and overall it was quite complimentary, but since I was asking for a critical analysis, I only posted the second part. I mean, a whole bunch of bullet points saying “I fully agree with this” is no fun!