Does written language retard vocabulary change?

Although methods for dating prehistoric linguistic changes are generally regarded as unreliable, there are several recent papers that attempt to do just this. I’m less interested in general comment on these methods than in one specific question.

To me, it seems common-sense that a body of written work would retard vocabulary change. The Indo-Aryan languages derived from Sanskrit have evolved, of course, but it would seem logical that the existence of old Sanskrit texts would decrease the chance that an ancient word would be replaced (e.g. with a borrowing from another language) and then forgotten. Similarly, one might guess that the permanence of written Latin has increased the stability of the vocabulary of Romance languages.

Is this correct? I’ve not seen this point made, but have seen prehistoric linguistic date estimates which ignore this effect, if it exists.

In my utterly uninformed opinion, it does seem to be common sense that written language would slow down language change. My question is whether or not it would be significant.

For most of human history since the development of writing, most human beings haven’t had access to said writings. On the other hand, they tend to be able to speak. The academic types are the ones who are usually picky about changes to what they learned, so I would assume that most of language change occurs at the level of the uninformed people who have less or no access to the written word.

Just looking at English over the past 100 years, there are plenty of words that are “out of style” now. They’re still in our dictionaries and we can still understand them, but they aren’t necessarily in use. In a society without dictionaries and widespread formal education, they might not be understood at all. I expect that universal access to books, mass media, and the Internet will slow down the rate at which things change even more. But I would think that it’s more about the universal access than it is about the media itself.

Something to consider, at any rate.

Agreed.

I’m wondering two things here. On one hand, how much difference is there between the effect produced by written works directly and indirectly? (Most Europeans in the Middle Ages could not read the Vulgate, but they had fragments of it read to them as frequently as they wanted or could afford to.) On the other, how much of an effect does a body of oral tradition have? How hard would a storyteller or history teacher who knows the stories she’s telling are centuries old try to tell them as she herself was told them, thus preserving forms and words which might otherwise have been forgotten? It’s quite common for old legends and stories to include definitions of terms which may be unusual, at least in Spain: “… and in the palace there was a snowhouse, a building where snow brought from the mountains was stored and used to cool drinks and fruits…” “turns out that the woman was not a Muslim as he had thought, but a Mozárabe, a Christian who lived in Muslim lands…”

There are several countervailing trends at work.

Vocabulary change can be interpreted two ways: changes in the meaning of existing words, and the coinage of new words. Why do these changes happen? That needs to be answered before looking at writing. Language changes most rapidly when the speakers change or the environment changes. Isolated tribes in New Guinea, where the jungle stays more or less the same over thousands of years, have little need for new vocabulary. The tribes - the others - they encounter also remain the same. Both of these are entirely untrue for Europe. (Same for Asia, but I just don’t know enough to talk about it.)

Europeans moved around constantly, first spreading from various areas in the Middle East and the Caucasus and then invading or traveling to other areas on the continent time and time again. Written language was invented at some point during these movements, but while we obviously know of changes only from writing, it seems clear that the changes themselves stemmed from the intermixture of peoples and languages. When peoples speaking two different languages come together, both languages change, although one may change more than the other. Writing itself will change to reflect this. Does it slow down the changes? Two effects again. One is that writing is inherently conservative, in that it is always behind the changes in oral language. The other is that there are many more oral speakers than writers and many more opportunities for change. The second effect seems to take precedence.

Changes in the world - the rise of technologies, new foods and animals encountered, the needs of armies vs. navies, the spread of religions - also required vocabulary change, and this is usually reflected in coinages. Some may be adapted from other languages, some may be brand new. But these are oral changes that wouldn’t be impeded by writing. Writers would also have to strive to keep up with the speakers. If these new words lasted in the language they would eventually make their way into print, but we can’t know today when they first appeared in the oral language.

One great example of this is spelling. Spelling would seem to be something that would standardize quickly after the introduction of writing. But we know that’s not true, at least in English and other European languages. Shakespeare is famous for never having written his name the same way twice. Spelling did not standardize until after the introduction of spellers (books of difficult vocabulary words) and dictionaries.

It’s not until literacy becomes fairly close to universal and a set of tools to impart literacy - primers and spellers and dictionaries and teachers and books - becomes close to universal that some aspects of language change slow. Formal grammar has changed little since print became widespread. But technology also increases the rate at which people encounter different speakers and new environments. So meaning changes and coinages have increased steadily since the Industrial Revolution. We’re probably in the era of greatest rate of change in history.

This is all a heavy-handed way of saying that writing doesn’t have a simple effect on language, and that the effect changes over time and place and that some aspects of language are easier to pin down than others. There’s no universal answer.

Here’s why I pose the question:

I’ve no background in anthropology or linguistics, but in retirement I am broadening my interests and one topic that fascinates me is human prehistory or, more specifically, the development of neolithic Europe. The expansion of the Indo-European language family is one of the pieces of this puzzle.

Scholars like Gimbutas and Mallory place the I-E Homeland in the steppes of Eastern Europe, with the main branches diverging 4500 - 2500 BC. This model maps very nicely to archaeological facts and many scientists (as well as this layman) believe it is almost certainly correct.

However, there is a strong minority opinion which places the I-E breakout about two millennia earlier; this is no minor difference but has a huge effect on any reconstruction. Some recent journal articles have supported this minority opinion by constructing a chronology of the Indo-European language family using computer analyses of vocabulary change.

I think these analyses must be wrong, but wonder what the flaw(s) is/are. The studies do extrapolate change rates from historic fact and assume constancy so their dates will end up too early if change was indeed faster before the invention of writing.
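As a rough illustration of that worry, here is a minimal sketch using the old Swadesh-style glottochronology formula, t = ln(c) / (2 ln r), rather than the tree-building methods the cited papers actually use. The retention rates (0.80 per millennium unwritten, 0.86 written) and the 6-millennium split are invented for illustration only; nothing here is taken from the papers.

```python
import math

def shared_fraction(t_millennia, retention):
    """Expected fraction of basic vocabulary two sister languages still share
    after t millennia, if each independently retains `retention` per millennium."""
    return retention ** (2 * t_millennia)

def estimated_split_age(shared, assumed_retention):
    """Classic glottochronology estimate: t = ln(c) / (2 * ln(r))."""
    return math.log(shared) / (2 * math.log(assumed_retention))

# Hypothetical scenario: the true split was 6 millennia ago.
# First 3 millennia unwritten (faster change), last 3 written (slower change).
true_age = 6.0
shared = shared_fraction(3.0, 0.80) * shared_fraction(3.0, 0.86)

# The analyst calibrates on the written period only and assumes constancy.
estimate = estimated_split_age(shared, assumed_retention=0.86)

print(f"true age: {true_age:.1f} millennia, estimate: {estimate:.1f} millennia")
```

With these made-up numbers the estimate comes out at roughly 7.4 millennia, i.e. too early, while a split that falls entirely inside the written period is recovered exactly. How large the real effect would be, if it exists at all, depends entirely on rates nobody in this thread has measured.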

I think we’d need to know the assumptions and parameters built into those computer analyses before we could form an opinion on whether or not they’re adequately accounting for the effects of literacy, if any, on the speed of vocabulary change.

Do you have a cite with any details about how these analyses are constructed?

Great question in any case, and I happen to agree with you about the seemingly most plausible interpretation of the currently available data on I-E spread.

My personal opinion from long reading on the subject is that Gimbutas is dead wrong about her general body of work and has been definitively refuted since her death. As for Mallory,

I’m on Renfrew’s side on this. I am no expert, though, just an interested reader.

However, all of their primary research was conducted decades ago and that is a long time in modern debates. The trend of late is to push all seminal events back farther and farther. Whether or not the exact details are correct, the trend appears to be well-established.

None of which has anything to do with written languages that I can see.

I think it’s more likely to produce diglossia.

Here is a famous paper published in Nature several years ago. Here is a recent thesis paper which I’ve not finished reading. AFAICT neither incorporates literacy-dependent time rates.

I’m curious about your Gimbutas comment.

As for “all seminal events [pushed] back farther and farther,” this certainly does not apply to the R1 Y-chromosome haplogroup, which, according to Wikipedia, was conjectured to be associated with the Cro-Magnon invasion a decade ago, then associated with Europe’s repopulation ca 10,000 BC, but now is believed to have not arrived in Europe until about the Chalcolithic Age. (I mention this incidentally, claiming no necessary relation between that gene and any language.)

As to the PIE Homeland question, I’ve also done much reading and have formed a strong opinion that the Gimbutas-Mallory interpretation is correct. This might lead to a fun discussion, but is not the question I ask in this thread.

:confused: IMO, the papers above give incorrect dating. I wonder why. If languages change faster when there are no written works, that might be the explanation.

How would we know? I mean, how could you do a long-term study on how a non-written language changed?

Besides that, there are more variables in play than just whether or not the language was written. It seems that language colonizers tend to be more conservative than their homeland-staying relatives (witness Icelandic).

Languages also change a lot more when the populations are in intense contact with other language groups. And not just languages that are very different: it is theorized that the reason English lost lots of its inflection was the intermingling of groups who spoke different, but very similar, languages (i.e., Anglo-Saxon and Old Norse).

Interesting, septimus! I looked at the cited Nature paper, and the first thing that struck me was that according to that reconstruction, the approximate date of the divergence of “the Italic, Celtic, Balto-Slavic and perhaps Indo-Iranian” language groups is pretty nearly the same as the date estimated by the more mainstream model.

That is, everybody seems to pretty much agree that those language families emerged and began diversifying within about the last 5000 years: i.e., later than about 3000 BCE (they propose Greco-Armenian languages becoming distinct perhaps two millennia previously).

In other words, this model’s dating of the language evolution process involved seems less radically divergent from the more mainstream view than you suggest.

AFAICT, where the one major difference of opinion comes in (and it’s admittedly a huge difference) is in identifying the approximate date where the breakup of Proto-IE into separate language families starts. The “mainstream” view holds that PIE began branching only around 5000 BCE (plus or minus some centuries), while the Nature authors’ model pushes it back to around 7000 BCE, with Hittite, Tocharian and Greco-Armenian diverging early.

If some “literacy effect” had caused the authors’ fundamental assumptions about the intrinsic speed of vocabulary change to be incorrect by some constant “stretch factor”, I would have expected their conclusions to differ from the mainstream model consistently across the board. Wouldn’t you?

That is, wouldn’t we expect a “vocabulary change clock running too slow” to give inflated estimates for all the time periods of linguistic change that it covers? Shouldn’t we see, say, the emergence of Indo-Iranian, and then its split into Iranian and Indo-Aryan, etc., pushed back in time as well, when compared to the mainstream model?

But that’s not what the paper appears to show. So I think we can infer that whatever the merits or faults of this linguistic evolution reconstruction model, the problem, if there is one, is not as simple as just using a too-slow base rate for vocabulary change.

And, supposedly, Americans have preserved words or forms that are considered archaic in the UK, such as “fall” (the season) and “gotten” as the past participle of get.

There would seem to be a few good controlled experiments, involving languages that became stabilised because they were not only written but continued in use as languages of various religions – such as Latin, Greek, Arabic and Sanskrit. Latin and Sanskrit are very good examples, because the religious languages remained stable, while the spoken languages diverged and became multiple languages today. The religious use may make a language particularly stable, because of the perceived need to preserve sacred texts and liturgy, but we can certainly see how the spoken languages changed relative to the religious language in these cases.

I’d also speculate that the wide use of sound recordings of language over the last century or so will tend to slow down the rate of change in language.

I don’t think the liturgical nature of Latin and Sanskrit makes them good analogs to vernaculars. An interesting data point, but I can’t see that it would tell us much about how a language like English changes today. Latin is, like, TOTALLY not open to change, yo!

You’d think so, but regional distinctions in language, even in the US, continue to develop.

As with almost anything else that involves testing the past, you look for divergence in current conditions.

One way would be to look at the degree of divergence between languages in New Zealand or North America, places where they didn’t have writing, and compare it to the divergence in Europe and Asia.

For example, Polynesia was settled between 1,500 and 500 years ago and had no written language. So we should be able to compare the languages of Polynesia to get a baseline for how much a non-written language diverges in 1,500 years. Then we can compare how much various French dialects have changed in that same period.

Alternatively, both Indo-European and American languages have a recent common root. So we could look at how much diversity there is in American languages, which were largely unwritten, and compare it to the degree of diversity in Indo-European languages, which were mostly written.

And so on and so forth. Of course there will be other confounding factors in every case, but we should also be able to find yet more cases that control for those factors. With a detailed enough study over enough instances, we should be able to falsify the hypothesis. If it’s true, then we would expect European and Asian languages to have diverged less in the last 4,000 years than languages in the Americas or Oceania.
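The comparison proposed above could be sketched roughly as follows, assuming each language is represented by a list of cognate-class IDs for the same set of basic meanings (same ID means the two words share an origin). The function names, language names and numbers are all placeholders invented for the sketch, not real word lists, and they say nothing about which way a real comparison would come out.

```python
from itertools import combinations

def shared_cognate_fraction(list_a, list_b):
    """Fraction of meaning slots where both languages use the same cognate class.
    Slots where either language has no data (None) are skipped."""
    pairs = [(a, b) for a, b in zip(list_a, list_b)
             if a is not None and b is not None]
    if not pairs:
        raise ValueError("no comparable meaning slots")
    return sum(a == b for a, b in pairs) / len(pairs)

def mean_pairwise_divergence(family):
    """Average of (1 - shared fraction) over every pair of languages in a family."""
    scores = [1 - shared_cognate_fraction(family[x], family[y])
              for x, y in combinations(family, 2)]
    return sum(scores) / len(scores)

# Placeholder data: cognate-class IDs for the same five meanings per language.
unwritten_family = {
    "lang_u1": [1, 1, 1, 1, 1],
    "lang_u2": [1, 2, 1, 2, 1],
    "lang_u3": [1, 2, 3, 1, 2],
}
written_family = {
    "lang_w1": [1, 1, 1, 1, 1],
    "lang_w2": [1, 1, 1, 2, 1],
    "lang_w3": [1, 1, 2, 1, 1],
}

print("unwritten:", round(mean_pairwise_divergence(unwritten_family), 3))
print("written:  ", round(mean_pairwise_divergence(written_family), 3))
```

A real study would also need the two families to have comparable ages and comparable amounts of outside contact, which is exactly the confounding problem mentioned above.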

Both papers need to “calibrate” the dating scale using information from written languages, so the effect I’m asking about, if it exists, will increase with earliness. To clarify, here are dates with the 1st column roughly that of Gray-Atkinson and Fig. 5.1 in the 2nd paper, the 2nd column “mainstream” guesstimates.

            Gray-Atkinson (approx.)   Mainstream
Sanskrit    3000 BP                   3000 BP
Ind-Iran    4500 BP                   4000 BP
II-Greek    7200 BP                   5800 BP

See how the dates’ “wrongness” increases with earliness? The most recent 3000 years are dated correctly because the calibration (via Sanskrit) forces this.
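To make the "constant stretch factor" question concrete, here is a small sketch that takes the three rough date pairs quoted above and computes the ratio of the model date to the mainstream date for each node. The dates are only the approximate figures given in this thread; the variable names are arbitrary.

```python
# (model date, mainstream date) in years BP, as quoted in the thread.
dates_bp = {
    "Sanskrit": (3000, 3000),
    "Ind-Iran": (4500, 4000),
    "II-Greek": (7200, 5800),
}

# Ratio > 1 means the model puts the node earlier than the mainstream view.
ratios = {name: model / mainstream
          for name, (model, mainstream) in dates_bp.items()}

for name, ratio in ratios.items():
    print(f"{name:10s} model/mainstream = {ratio:.2f}")
```

A clock that was simply too slow by one fixed factor would give the same ratio at every node. Here the ratio is 1.00 at the calibration point and grows for the older nodes, which is the pattern being pointed at; whether a literacy effect or something else produces it is a separate question.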

(BTW, I think there may be other factors adversely affecting the early date estimates in these models.)

Why? Sanskrit didn’t become a written language until about 2000 years BP.

I’m not asserting you’re wrong, I just don’t really understand your argument.

And I don’t understand your misunderstanding. :cool: Is it just the difference between 2000 BP and 3000 BP estimates for written Sanskrit? If so, three comments:

  1. In OP I should have mentioned that memorized poetry can play the role of “written language” for this discussion.
  2. 2000 BP for written Sanskrit seems very late, unless we’re conflating BP with BC. I was expecting any objection to be in the opposite direction.
  3. Clay inscriptions dated 3380 BP from the Mitanni Kingdom include fragments of proto-Sanskrit.

It ain’t, though. As far as Indo-Aryan languages are concerned, writing seems to have been first used to represent a form of early Middle Indic in the Asokan period, which is indeed about 2000 years BP. Literacy in Sanskrit did indeed develop quite late compared to some other ancient languages.

Yes, there are some loanwords from an early Indo-Aryan or Indo-Iranian language in cuneiform Mitanni texts, but that’s not the same as a tradition of literacy in Sanskrit itself.

And since Sanskrit was not a written language (nor were any of its daughter tongues) till about 2000 BP, I’m still having difficulty seeing how “calibration using information from written languages”, specifically “via Sanskrit”, would “force” correct dating from 3000 BP onward.

Wikipedia shows

But I don’t pursue this because it’s inessential to my argument. I hope it’s clear that a religious text, rather than more general literacy, may be enough to retard vocabulary change.

Even taking your date instead of Wikipedia’s and even ignoring my comment about memorized poetry, your focus on the difference between 2000 BP and 3000 BP seems to miss the point that my argument would apply (though with different numbers) with a different “calibration” for Sanskrit.

(BTW, my mention of Mitanni inscriptions in proto-Sanskrit from ca 3380 BP ignored the Kikkuli Hittite horse-training texts whose source is estimated to be somewhat earlier.)