Overnight, Google Gemini becomes a know-it-all a-hole

This is verifiably not true:

Chinchilla [1] is a paper that came out in 2022 from DeepMind. The authors found that current large language models are under-trained because of the focus on scaling model size while keeping the amount of training data constant. They in fact trained over 400 language models ranging from 70 million to over 16 billion parameters on 5 to 500 billion tokens, and concluded that for compute-optimal training, both model size and the number of training tokens should be scaled equally.
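To make ‘scaled equally’ concrete, here is a minimal sketch of what that rule works out to, assuming the commonly used approximation that training compute is about 6 × N × D FLOPs and the roughly 20-tokens-per-parameter ratio usually quoted from the Chinchilla run (70B parameters, 1.4T tokens); the function name and the round numbers are my own illustration, not the paper’s fitted coefficients.

```python
# Minimal sketch of the Chinchilla compute-optimal rule of thumb.
# Assumptions (mine, not the paper's exact fit): training compute
# C ~ 6 * N * D FLOPs, and a fixed ~20 tokens per parameter, which is
# equivalent to scaling N and D each as roughly C**0.5.

def compute_optimal_allocation(flops_budget: float, tokens_per_param: float = 20.0):
    """Split a FLOPs budget between parameter count N and training tokens D."""
    # C ~ 6 * N * D and D ~ tokens_per_param * N
    # => C ~ 6 * tokens_per_param * N**2  =>  N ~ sqrt(C / (6 * tokens_per_param))
    n_params = (flops_budget / (6.0 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

if __name__ == "__main__":
    # Doubling the compute budget does not double N or D alone;
    # each grows by ~sqrt(2), i.e. they are scaled equally.
    for budget in (1e21, 1e23, 1e25):
        n, d = compute_optimal_allocation(budget)
        print(f"C={budget:.0e} FLOPs -> ~{n:.2e} params, ~{d:.2e} tokens")
```

Plugging in a budget comparable to what Chinchilla itself reportedly used lands near 70 billion parameters and 1.4 trillion tokens, which is the configuration the paper actually trained.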

The emergence of contextual ‘knowledge’ from LLMs arises from the logic built into the usage structure and metasemantics of language, and LLMs do not function or develop in the manner that cognition in the human brain does, but we’ve discussed this elsewhere, so there is no need to rehash it here.

Chatbots are intrinsically optimized for user engagement, because keeping the user coming back to the chatbot is an inherent goal when arguing to investors about how profitable it is all going to be if they just plow another XXX billion dollars into more and more ‘compute’. Post-training optimizations such as parameter fine-tuning, reasoning models, retrieval-augmented generation frameworks, scaling efficiencies, explicit rules to enforce alignment with a set of safety or other guidelines, et cetera are focused on consistency, which, insofar as the desire is for the LLM to provide correct summaries and factual information, is correlated with accuracy. But nobody is interested in a chatbot that spits out correct information in a hostile or off-putting manner, and proving that LLMs are worth developing means showing that the ‘beta’ chatbots are something the general public will embrace enthusiastically.

I agree that the real strength of an LLM is “its ability to transform the user’s context [natural language] into another form,” and it is a really impressive, if computationally (and in many other ways) costly, capability. The deep learning approach to building models capable of high-fidelity natural language processing is orders of magnitude beyond the previous symbolic approaches, and nobody can dispute that current chatbots are capable of essentially human-like conversation, at least at a superficial level. However, these systems don’t understand ‘truth’, and their accuracy is a question of how well the consistency of their responses happens to correlate with it. Of course, a malicious developer could make a model that is not oriented toward accuracy and intentionally produces convincing falsehoods for a variety of purposes, and chatbots have already demonstrated the ability to mislead people even without any intentional malice. The faith that people put in the output of an LLM (either because they are uninformed about how these systems work and how unreliable they are, or because they are too lazy to fact-check even basic things like citations or easily calculated objective results) is pretty shocking so far, and it will only get worse as these tools are promoted and pushed upon the public as essential things they must use in the name of efficiency.

For sure, and a lot of effort has been put into ‘aligning’ LLMs so that they don’t spew that prejudice, conspiranoia, and bias (even though these are definitely things that encourage engagement, so developers are having to manage objectives that are at cross purposes). In the case of Grok, however, Elon Musk announced the specific intent to remove “woke ideology” and “cancel culture” from Grok’s responses. The result?

This week, Grok’s responses praised Hitler and espoused antisemitic views, prompting xAI to temporarily take it offline. Two months ago, Grok offered unprompted mentions of “white genocide” in South Africa and Holocaust denialism. In February, X users discovered that Grok’s responses about purveyors of misinformation had been manipulated so the chatbot wouldn’t name Musk.

Why does this keep happening? It has to do with Grok’s training material and instructions.

For weeks, Musk has promised to overhaul Grok, which he accused of “parroting legacy media.” The most recent incident of hate speech followed Musk’s July 4 announcement that xAI had “improved @Grok significantly” and that users would notice a difference in Grok’s instantaneous answers.

Over that holiday weekend, xAI updated Grok’s publicly available instructions — the system prompts that tell the chatbot how to respond — telling Grok to “assume subjective viewpoints sourced from the media are biased” and “not shy away from making claims which are politically incorrect,” The Verge reported. Grok’s antisemitic comments and invocation of Hitler followed.

The manipulation of parameters in Grok is an example of what intentional and focused malfeasance can produce in chatbot behavior. It was fortunately egregiously offensive enough to be immediately obvious, but a more subtle manipulation of parameters to produce plausibly ‘truth-like’ falsehoods could do enormous damage to public trust and a shared, fact-based reality. The problem with trying to discern truth in the way that ‘Ground News’ does with media is that there is no truth in intentional deceptions meant to manipulate and deceive, and the constantly shifting baseline of a system designed to dynamically produce individually targeted fabrications pretty much defies any broad attempt to filter out falsehoods.

Stranger