I appreciate the comprehensive and erudite analysis of the issues as you see them. I agree with much of it, but I have to take issue with the quoted part and some of the conclusions that follow.
I agree that absorbing the structure and logic of natural language is an important contributor to an LLM’s capabilities, but it surely cannot be the whole story. For example, as @Babale said, it cannot explain multi-modal LLMs built on essentially the same technology, such as those capable of powerful image processing. Even with more conventional text-based LLMs, it’s hard to see how language competency translates into mathematical competency, especially at higher levels. The two skills are different enough that many humans, including quite intelligent ones, have strong language competency but poor mathematical ability, or vice versa.
The article below is part of a series about LLMs and is a fair and well-balanced treatment of the subject of emergent properties. It cites a famous paper on the subject from 2022, then cites an attempted refutation. The refutation is, at best, controversial, and centers on the claim that when emergent properties appear suddenly and unexpectedly at particular levels of scale, this is merely an artifact of the evaluation methodologies. The authors argue that the artifact, or “mirage”, arises because “almost-right” LLM responses are scored as simply wrong, so the latent capabilities aren’t observed or acknowledged until larger network scales and parameter counts are reached. I think in most cases this is simply not true (the capabilities just are not there in any meaningful way), but there it is: a competing hypothesis for your interest.
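To make the mirage argument concrete, here’s a toy numerical sketch (my own illustration with made-up constants, not data from either paper): if per-token accuracy improves smoothly with scale but the benchmark only awards credit for an exactly right multi-token answer, the scored accuracy can sit near zero for a long time and then appear to jump.

```python
# Toy illustration of the "mirage" argument (hypothetical numbers, not from the papers).
# Assume per-token accuracy p improves smoothly with parameter count, but the task is
# scored by exact match over a k-token answer, so scored accuracy is roughly p**k.

def per_token_accuracy(n_params):
    # Made-up smooth scaling curve: accuracy rises gradually with model size.
    return 1.0 - 0.9 * (n_params / 1e7) ** -0.3

k = 20  # number of tokens that must all be correct to get any credit
for n_params in (1e7, 1e8, 1e9, 1e10, 1e11, 1e12):
    p = per_token_accuracy(n_params)
    exact_match = p ** k  # harsh metric: smooth improvement looks like a sudden jump
    print(f"{n_params:.0e} params: per-token={p:.2f}, exact-match={exact_match:.3f}")
```

Plotted against log(parameters), the per-token curve is a smooth ramp while the exact-match curve hugs zero and then shoots up, which is exactly the pattern the refutation attributes to the metric rather than to a genuinely new capability.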
In short, whether true emergence exists in LLMs is the subject of debate, but I’m firmly in the “yes” camp. As the article says, quoting UC Berkeley professor Jacob Steinhardt, “emergence” can be defined as “qualitative changes that arise from quantitative increases in scale.”