TLDR It doesn’t know what is or isn’t available or what is or isn’t true. It isn’t a search engine or a logical evaluator; it is an autocomplete engine. It can very convincingly mimic human language, but it cannot necessarily “reason” through that language and evaluate it for truthfulness; it can only produce similar outputs for similar inputs based on its training. If that training happened to contain largely truthful information, it will usually autocomplete truthful information, but only by coincidence and consensus. If that training is full of lies, it will lie. If something isn’t in its training, it will keep reaching for the closest matches, and the output will make less and less sense the further out the connections get… that’s what we usually call hallucinations. It doesn’t know when it’s hallucinating. It doesn’t self-analyze, and it doesn’t have a sense of self or an internal state of mind; it’s just constantly picking the next token based on the probabilities in its training.
Much longer version…
It is not agreeing with you or disagreeing with you; it’s just generating text based on your input, running it through a giant network of weighted connections. LLMs are not “truth machines”. They do not have the capacity to evaluate whether a statement is true or false, even (or especially) about themselves. An LLM doesn’t really know whether it’s faking something or not, and it’s not doing it malevolently.
It takes your input prompt, runs it through its model, and produces a probabilistic output of text it thinks you will like (or rather, that its training taught it to prefer). In many cases, because its training data was full of humans trying to be helpful and truthful, its learned weights will likewise tend towards what is true, but only by coincidence and popularity. If you ask it about a topic it’s seen 10 million posts about, and 9.9 million of them give the same answer, that answer is going to be very strongly weighted and preferred in its output.
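To make that concrete, here is a toy sketch in plain Python. The counts and answers are made up, and real models work token by token rather than with whole answers, but the effect is the same: the more common a response was in training, the more likely it is to come out.

```python
import random

# Made-up stand-in for "how often each response appeared in training".
# A real model never stores answers like this; it learns weights that
# assign a probability to each possible next token. The net effect is
# similar, though: common continuations get high probability.
answer_counts = {
    "the popular answer": 9_900_000,
    "the rare answer": 100_000,
}

total = sum(answer_counts.values())
for answer, count in answer_counts.items():
    print(answer, "->", count / total)   # 0.99 vs 0.01

# Sampling from those weights returns the popular answer ~99% of the time.
samples = random.choices(
    population=list(answer_counts),
    weights=list(answer_counts.values()),
    k=10,
)
print(samples)
```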
But when you ask it about something not in its training, it’s either going to call an external tool (run some code, do a web search, etc.) and then synthesize/summarize the results of that tool, or it’s going to just hallucinate some text that it thinks you will like. It has no capability to evaluate the truthfulness of its output, only the degree of correlation between its output and your input, based on its model weights.
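Here is the rough shape of that tool-call branch, sketched with stubbed-out helpers. None of these function names or markers come from any real vendor’s API; they just stand in for the general idea: the model’s text either triggers a tool, whose output it then summarizes, or it comes out as-is.

```python
def generate_text(prompt: str) -> str:
    """Stand-in for the LLM. A real model emits tokens one at a time; the
    [TOOL:...] marker here is just an illustration of a tool-call request."""
    if "results:" in prompt:
        return "A fluent summary of whatever the tool returned (not verified)."
    return "[TOOL:web_search] " + prompt


def run_tool(draft: str) -> str:
    """Stand-in for actually running the requested tool (search, code, etc.)."""
    return "web_search results: ..."


def answer(prompt: str) -> str:
    draft = generate_text(prompt)

    if draft.startswith("[TOOL:"):
        # The model asked for a tool. The surrounding software runs it and
        # feeds the result back in, and the model continues from there.
        # Nothing in this loop checks whether the result, or the summary
        # of it, is actually true.
        tool_result = run_tool(draft)
        return generate_text(prompt + "\n" + tool_result)

    # No tool call: whatever came out is the answer, accurate or not.
    return draft


print(answer("What's the weather in Paris right now?"))
```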
If its training overwhelmingly has “answer B” to “question A”, it will almost always give answer B. If “answer B” and “answer C” are equally weighted in its training, it might give you either one. If it has no actual “answer B” OR “answer C” in its training, which is almost certainly the case when you ask it a question like that, then it’s instead going to look for the closest similar questions, and if there are none, the closest similar responses that humans have preferred in past online conversations, even ones that are only remotely related. Some text in your original prompt might’ve led it down a wild goose chase, and it’ll just keep following that thread, spitting out words one at a time.
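In the same toy style, here is the difference between a question the training covered well and one it never saw. The scores are made up; a real model computes something analogous over its whole vocabulary. With good coverage, one continuation dominates; without it, the probabilities flatten out and whatever gets sampled only vaguely fits.

```python
import math
import random

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1 (real models do the
    same thing over their output vocabulary)."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up "how strongly does training support this continuation" scores.
well_covered = {"answer B": 9.0, "answer C": 3.0, "off-topic ramble": 1.0}
never_seen   = {"vaguely related A": 1.3, "vaguely related B": 1.1, "off-topic ramble": 1.0}

for label, scores in [("well covered", well_covered), ("never seen", never_seen)]:
    probs = softmax(list(scores.values()))
    pick = random.choices(list(scores), weights=probs, k=1)[0]
    rounded = {k: round(p, 2) for k, p in zip(scores, probs)}
    print(label, rounded, "->", pick)
```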
In a sense, LLMs are constantly hallucinating, no matter your prompt. But their hallucinations are heavily weighted by their training, and often that will tend towards reality simply because the “real” answers were more popular in the training data. You can of course skew this by producing a lot of fake answers and biasing the training towards those instead, which is already happening as people try to do the equivalent of SEO for LLMs; that’s why the pre-LLM internet is so valuable as training data.
It is essentially a probabilistic autocomplete, and it cannot by itself gauge the passage of time or evaluate the truthfulness of anything; it can only match your input to likely outputs from its training. Given enough training, this produces something that reads like convincing speech, but without the critical thinking and logical evaluation that (sometimes) accompany human speech.
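And here is the whole “probabilistic autocomplete” loop in miniature, using a tiny made-up word-follows-word table in place of the billions of learned weights a real model has: pick the next word from a probability distribution, append it, repeat. At no point is there an “is this true?” step.

```python
import random

# Tiny made-up table of "which word tends to follow which", standing in for
# a real model's learned weights. The generation loop has the same shape:
# sample the next token from a distribution, append it, and go again.
next_word_weights = {
    "the":      {"cat": 5, "moon": 2, "answer": 3},
    "cat":      {"sat": 6, "is": 4},
    "moon":     {"landing": 5, "is": 5},
    "answer":   {"is": 8, "depends": 2},
    "is":       {"probably": 6, "the": 4},
    "probably": {"wrong": 3, "the": 7},
    "sat":      {"on": 1},
    "on":       {"the": 1},
    "landing":  {"is": 2},
    "depends":  {"on": 1},
    "wrong":    {"the": 1},
}

def generate(start: str, length: int = 12) -> str:
    words = [start]
    for _ in range(length):
        options = next_word_weights.get(words[-1])
        if not options:
            break
        nxt = random.choices(list(options), weights=list(options.values()), k=1)[0]
        words.append(nxt)
    return " ".join(words)

print(generate("the"))  # plausible-sounding word soup, with no fact-checking anywhere
```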