Indo-European Language Substratums

Is it known if (or thought that) any of the Indo-European languages had a non-Indo-European substratum? I’ve read a bit about questionable theories about this involving Germanic languages, but if anyone can expand on this subject, it would be appreciated.

All Northern Europe, including the entire Scandinavian Peninsula as well as areas south of the Baltic, was Finnic-speaking before the Indo-Europeans came. So runs the Finnic-Substrate-to-Germanic theory. In this theory, the Germanic languages arose as a creolized pidgin Sprache of Finnic peoples learning to speak Indo-European. I don’t have a link to the theory at present; it’s something I came across at work (I’m writing from home now) while googling some odd archaic Finnish words while teaching myself Finnish by reading the Kalevala and listening to Värttinä. Hyvää iltaa.

Spain and southeastern France have Basque substratum words

And I believe Old Indo-Aryan (i.e., Vedic Sanskrit) has some substratum words (or maybe just plain loan words) from non-IE South Asian language families such as Dravidian and/or Munda. It’s also got some general non-IE linguistic features (e.g., retroflex consonants) derived from those other language families.

Kimstu is certainly right about the Dravidian substratum to Indo-Aryan. There have been oodles of Dravidian loanwords identified in Sanskrit, even in the Vedic vocabulary. How the retroflex consonants got into Indo-Aryan (and its neighbor Pashto as well) is a curious topic. It seems that Dravidian did not originate them; they were picked up from the indigenous Munda languages, which are the real autochthonous languages of India. How and why they spread across language families as such a strong areal feature is something I’d like to understand better. Once the slots for the retroflex phonemes were set in Sanskrit phonology, it began generating its own due to the presence or assimilation of /r/.

Gymnopithys is right too about the Basque/Aquitanian substrate in Iberia and southern France. But can anyone explain what it is about the Romance languages in those areas that differs from other Romance languages and that came from Basque influence?

The non-IE substrate underlying Greek was called Pelasgian by the ancient Greeks. It may account for many non-IE words in Greek vocabulary, but whatever language or group of languages it was is unknown and lost. The ending of names like Korinthos or Hyakinthos has been attributed to Pelasgian.

The non-IE substrate underlying Persian in Iran was Elamite, the language of a kingdom in southwestern Iran that was in contact with Mesopotamia. The Persians were relative latecomers to the Middle East. There are plenty of records of the Elamite language on cuneiform tablets, but it still isn’t very well understood. I suspect that many Persian words of non-IE origin are the only survivals of Elamite words, but there’s no way to establish that. Elamite seems to be part of the same language family as Dravidian. It’s been named Elamo-Dravidian. Since the pre-Aryan extent of the Dravidian languages covered North India, Pakistan, part of southern Afghanistan, and southeast Iran, I hypothesize that the Elamo-Dravidian family once reached from Khuzestan to Cape Comorin.

I forgot to mention above: what is now Russia was all Finno-Ugric speaking before the Slavs arrived there. While the presence of Finnic speakers in, say, Germany is conjectural, it is known that Finns were all over Russia (several Finnic languages are still spoken there, like Mari, Mordvin, Udmurt, Komi…). Also, internal classification within IE postulates that Germanic and Balto-Slavic form a node within IE taxonomy, i.e. Germanic and Balto-Slavic are more closely related to each other than to the other IE branches. If this is true, I think it would be because they share the Finnic substrate. Whether or not Finns lived in Germany, at any rate we know they were all over Scandinavia, and the earliest origins of Germanic as an identifiable branch of IE point to Scandinavia. Sweden, I think, is where Gothic came from.

The Greek word thalassa ‘sea’ is thought to be a loan from Pelasgian.

In Anatolia, the non-IE substrate to IE Hittite was a language called Hattic. The Hittite Empire preserved Hattic as a religious language, and there are cuneiform tablets of it from the Hittite imperial library at Bogazköy. Some people have speculated that Hattic or something related to it was the language of the earliest Anatolian towns like Çatalhöyük. There has also been speculation that Hattic may have been related to Caucasian languages like Chechen.

The substrate of Kurdish was an ancient language called Hurrian, of unknown affiliation. Hurrian (also preserved for religious use by the Hittites) was the language of an area called Urartu, which is pretty much the same area as modern Kurdistan, in southeastern Anatolia and northwestern Iran.

Regardless of whether we’ve determined or even arbitrarily named certain languages as being the substrate of present-day Indo-European languages, it seems pretty certain that they all had some non-Indo-European substrate. None of the present-day speakers of Indo-European languages are simply descendants of the speakers of Proto-Indo-European. Determining what the substrates are is probably impossible in some cases, because the substrate languages have no relatives spoken today, their speakers left no ancient manuscripts or record in history, and they were so thoroughly assimilated that it’s difficult to tell which of the differences between their present-day languages and other Indo-European languages were caused by the substrate languages.

Can someone define substratum here? I mean in a precise linguistic sense.

Here’s the definition I was using, from Wikipedia:

In that case, Wendell’s got it right. Modern humans colonized Europe 30-40,000 years ago. I-E is about 6-8,000 years old, probably originating somewhere around the Black Sea. But it doesn’t really matter exactly where, it almost certainly was a small geographic area. Hence, almost all the modern I-E languages would fall in the category of the OP.

I don’t know of any other features, but it’s supposed that the loss of initial /f/ in Castilian comes from Basque influence. Thus Latin ferrum -> Spanish hierro, Spanish horno vs. Italian forno, Spanish hijo vs. Portuguese filho. Only thing that springs to mind.

The Celtic substrate in much of Iberia gave Spanish some words not found in other Romance languages, such as “vega”, which means “riverbank” or “floodplain” (as in “Las Vegas”), a word descending from the Proto-Celtic word which gave rise to modern Irish “abhainn” = “river” (this appears on the map of the Burren region of Ireland on my wall), and to the British river toponym “Avon”.

I’ve been told, by a reasonably reliable source, that there’s some suspicion that there was a Semitic substrate to proto-Germanic. Apparently if you look at the words in proto-Germanic that don’t trace back to proto-Indo-European there are some resemblances to Semitic languages. Perhaps before 1000 B.C. there was a group of Semitic sea-traders who traveled around the coast of Europe and settled in Scandinavia before the proto-Germanic people reached there and their language influenced proto-Germanic.

JKellyMap writes:

> The Celtic substrate in much of Iberia gave Spanish some words not found in
> other Romance languages, such as “vega”, which means “riverbank”
> or “floodplain” (as in “Las Vegas”), a word descending from the Proto-Celtic
> word which gave rise to modern Irish “abhainn” = “river” (this appears on the
> map of the Burren region of Ireland on my wall), and to the British river
> toponym “Avon”.

Well, yes . . . and no. Celtic isn’t a non-Indo-European substrate for Spanish in any case. This is a situation where one branch of Indo-European serves as the substrate for another branch. Furthermore, proto-Celtic was spread out over much of Europe. It was spoken in regions where French, English, German, and Spanish are now spoken. There are more words in English derived from Celtic sources than in Spanish, and there are even more Celtic-derived words in French than in English.

That’s a fascinating idea - but obviously deeply speculative. Is it even widely agreed that Proto-Germanic had a substrate? Forgive my ignorance if this is something commonly believed now, but last I heard it was more of an interesting hypothesis than a substantiated theory.

It’s certainly an interesting notion. Did the Germanic people settle somewhere with a Semitic population at some point?

Interesting hypothesis is a good description for this theory. What’s known is that the proto-Germanic peoples were living in the area of either present-day southern Sweden or Denmark around 1000 B.C. If you look at the words in proto-Germanic that don’t derive from proto-Indo-European (and that’s about a third of them), there are some resemblances with Semitic languages. For that reason, some speculate that a Semitic group of sea-traders just before 1000 B.C. traveled around the coast of Europe and some of them settled in Scandinavia. These Semitic sea-traders were thus there before the proto-Germanic peoples arrived and were conquered by the proto-Germanic peoples. Hence, there was a group of Semitic peoples learning proto-Germanic and thus serving as the substrate for proto-Germanic.

I learned this recently while listening to one of the courses on tape offered by The Teaching Company, The History of Human Languages, which is taught by John McWhorter. McWhorter is a linguist, so this isn’t just a random nut theory. It is rather speculative, though.

I should have made clear that I was responding to Johanna’s question:

and not to the OP. Sorry.

It is clear that all Indo-European languages have at least some non-IE substratum, several in fact, but it’s much harder to tell where these originally come from. By using solid historical evidence, we can conclude what’s already been presented in this thread: Finno-Ugric people originally inhabited most of what is now Russia (European part), Basque lands were much larger in ancient times, and there were no Indo-Europeans in the whole Indian Subcontinent before the Aryan invasion. In each of these cases, one or more IE languages formed or expanded in an originally non-IE area. But we also have evidence that while these expansions may have been destructive to the original cultures, the original people weren’t destroyed on a major scale. Especially when looking at Aryans in India, we see that the IE newcomers are a relatively small population compared to the already existing folks there. Now, when we fast forward a few generations, the language situation has become favourable for IE people, but most of these new speakers are people who have changed their language. They have been assimilated to an IE culture, but features of their old language can still be heard. This has probably been a common cause for substrata all over the world.

The same thing happened when Greek supplanted Pelasgian. Greeks were a militant, barbarian (can one use the word barbaros when referring to Greeks?) Balkan tribe, while Pelasgians apparently had a sophisticated Mediterranean civilization. Yet they were defeated, but not without a trace. The Pelasgian substratum seems to include a vast number of words related to Mediterranean nature and also Bronze Age technology and culture, not to mention major place names and mythical characters. This substrate has since spread to other languages, and somewhat ironically these words of non-IE origin are often those that we think of as the most characteristic Greek loan words. The story of Latin and Etruscan is similar, but Etruscan substrates in Latin appear to be fewer, at least in vocabulary.

Determining the whole history of humankind in Europe is obviously a bigger problem. Proto-Indo-European had divided into different language groups long before written history, and it’s not at all clear where these groups got their differences. Unfortunately these origins are also always a subject of national pride, racism and even massive racial and mystical theories, which have troubled this science so much in the last centuries. For a long time linguistics was also practically the only way to “see” past historical records, but recently genetic studies have come to its aid.

I know nothing about linguistics, but it’s my understanding that during the past century opinions there have ranged from the quite extreme race-nation-and-language-are-closely-linked to much milder and neutral views. According to the extreme viewpoint, when a new nation arrives, it physically displaces older inhabitants, and no language change happens, keeping any substratum minimal. The more modern theories claim that language can also be acquired through interaction, just like in the historical examples above. This brings us to an interesting mix, Paleolithic continuity theory, which partly has its origins in theories of Uralic continuity. They challenge the traditional concept of Indo-European “conquerors” storming Europe in the Bronze Age, and go for much earlier and continuous populating histories.

The Uralic continuity theory is several decades old now, and has generally replaced the old wisdom, which basically insisted that Finns came from the Volga area as a single migration flow little more than 2000 years ago, a viewpoint undoubtedly influenced by real historical mass migrations of Huns, Germanic tribes and Hungarians, among others. Here is a nice compact FAQ of Finno-Ugric languages, and what the majority of Finnish scholars now think. It’s notable that while all of these languages are relatively close to each other linguistically, their speakers aren’t necessarily ethnically close. Genetic studies have shown that Finns and Hungarians are less related to each other than to Germanic and Slavic people as a whole, respectively. The reason is that these two nations come from the opposite ends of the Finno-Ugric spectrum, and each has intermarried with its neighbouring nations, but some see more to it.

The theory that Johanna referred to has been very popular here for some years now. If you check the above link, it says that one of the best-known supporters is Kalevi Wiik (site in Finnish, but click the links around there to see some interesting maps). A few years ago he published a book called Eurooppalaisten juuret (Roots of Europeans), which was a hit, and both it and his newer book have won prizes. But for many old-fashioned linguists and other scholars (“Indo-Europeanists”) this was far too revolutionary, and they accused the suddenly popularized minority of nationalism, racism and whatnot. However, the accusers seem to have calmed down after they actually read the book. Personally I like the theory (though again, I know nothing about linguistics, nor genetics for that matter); let’s see why.

The model arises from Uralic continuity, but it’s more comprehensive. Basically, during the late (and coldest) times of the latest Ice Age, the remaining population of Europe had retreated to three different refugia around the southern parts of the continent. These were roughly in Ukraine, the Balkans and Iberia (Spain and Portugal). Each of these areas had a distinct culture, ethnicity, language and, due to climatic differences, food gathering methods. The Iberian refugium especially was well isolated from the others, since the proximity of both the Nordic and Alpine ice sheets had turned Central Europe into a polar wasteland. Ukrainians lived on eastern mammoth steppes and were big game hunters, Balkan people had some wee forests and went after small game, while Iberians were something in between. Then the Ice Age officially ended and the massive ice sheets started to retreat northwards. Climate changes then moved other environmental zones accordingly. Very quickly this gave an advantage to those big-time hunters, since the game now roamed freely in the whole northern half of Europe. Those from the Ukrainian refugium expanded to a massive area. People from the Iberian refugium also made major gains in the west, but the Balkan people in their woodlands were more modest and only expanded gradually. Nothing lasts forever, though, and the great hunters see mammoths become extinct and woodlands spreading north more and more rapidly. And then some Middle Eastern folks give the Balkan people agriculture. This enables the smallest refugium to rise and become the greatest western culture and take over lands west and north via (you guessed it) language replacements.

It’s clear that the Ukrainian refugium had a Finno-Ugric protolanguage. The Balkan people might either have been the proto-IEs, or IE people came to Europe from Anatolia along with agriculture and thoroughly replaced the original Balkan population. The Iberian refugium was inhabited by those called Franco-Cantabrians, or Iberians, Aquitanians and Basques. It appears that they expanded to the whole of western Europe and even all the way to Norway, before being almost completely converted to other languages during different historical phases. It’s possible that there was another smaller refugium in Southern Italy and maybe the Caucasus, but these aren’t crucial. Anyway the theory then goes on to explain the different Indo-European language groups. Apparently both the Italic and Celtic groups are descendants of Basque folks with switched languages, while Germans, Balts and Slavs all were originally Finns but, again, switched languages due to certain circumstances. It gets even more complicated, since there are other substrata, like certain Basque influence in Old Germanic, but it actually makes sense. Even the possible existence of certain unknown and unrelated languages is acknowledged.

The above claims are generally backed up by both linguistic and genetic evidence whenever possible. The work of Cavalli-Sforza and Piazza especially is often mentioned. There’s a nice presentation (Warning: PDF) on Wiik’s site with some of the most interesting maps. They’ve found correlations between having light-coloured eyes and descending from the Ukrainian refugium, for example. I have no idea whether any even relatively similar theories have emerged in other countries, or if this truly is an overt example of Finnish nationalism and delusions of grandeur. But just by looking at post-Ice Age Europe, for example this site about ancient Ireland (be sure to check the external references), one can certainly see why the idea of refugia and expansion from those, leading to continentwide language changes, looks so appealing.

Hmm. Since my awfully long post seems to be wholly a defense of a unique theory which is, at best, disputed, I can probably go on a bit.

The Finnish minority theory also agrees that there is a Semitic influence in Germanic, quite strong in fact. However what you just described is a bit off. That would require the whole of southern Scandinavia to have been essentially an empty land until around 1500 BC. Just look at any good atlas of old history and you’ll see how wrong that premise must be. In fact, the pre-neolithic Maglemose culture existed in Denmark and surrounding areas before the year 6000 BC, and the region was relatively densely populated even back then. People had come there from the east as soon as the area was fertile enough for their hunting. And why wouldn’t they? Scandinavia had everything: steppes, water, game, fish, nearby ice shelf, beautiful women, weird politics… Ehm, and it wasn’t so far away that you couldn’t get there in less than a few hundred years even across the whole continent. No, around 1000 BC those Semitic sea-traders would’ve been nothing more than a drop of water in a lake.

On the other hand, the theory I’ve cited above links a Semitic-Hamitic language to the Megalith culture. Looking around in Western Europe, you’ll see huge stone monuments everywhere. It must have taken huge amounts of labour back when those were built. The easiest way to do that was religion. And it turns out that it was, indeed; those are churches, and they’ve been built by people who believed in a religion that came from the Middle East, and along with it spread a lot of Latin and some Greek influence in local languages. You’ll see other, older, huge stone monuments, too. Those are megaliths. Maybe the whole megalith culture was a religion, which utilized these structures to build cultural fellowship among its adherents. Now the majority of currently known megalithic structures are in the British Isles, Western France, the coasts of the Iberian Peninsula, Denmark and Southern Sweden, but there are also some around Northern Africa. What if the religion started in Africa and then entered Europe, spreading as far as Scandinavia just before 3000 BC? And there were important centers like Carnac and maybe Stonehenge, and what if the priests or whatevers of the religion used some Semitic or Hamitic dialect as their language? The megalithic period lasted for many hundreds of years, enough time for any local language to get more than enough influence (mostly new vocabulary) from this “higher status” language. This kind of influence is probably more correctly called a superstratum, and it resembles the Greek influence on Latin, or the French influence on English.

Again, sorry for a long post with low information value. I’ve probably made some obvious errors there, not to mention my awful English grammar :o

There are several vocabulary words in Spanish that come from Basque, but the only one I can remember off the top of my head is izquierda.

It’s still unclear whether Pictish was Indo-European or not, but it certainly left behind a bunch of place-names in Gaelic.

It’s speculated that the Dravidian languages gave retroflex consonants to the various Indian languages. And I may start a separate thread on this, but I have noticed that in most Indo-Iranian languages, unlike almost every other Indo-European language, the word for “one” is “yak/yek/ek/ak/ik” or some variant ending in or containing a “k”. The Burushaski word for “one” is “hik” or “hek”. I have always wondered if there is a link there.

From what I’ve read, the Dravidian languages in turn inherited them from the aboriginal Munda languages.

Actually, “riverbank” is usually “ribera”. Vega refers to “fertile plain” or “meadow”. Las Vegas was named such probably because of the oasis near a spring that was encountered there.

According to Ralph Penny, “vega” comes from a Basque word, not Celtic.