What is the largest language?

First of all, I apologize in advance if this has been brought up before. I was unable to find a topic devoted to this using the search engine, and in my humble opinion, this question is worthy of its own topic. I have frequently heard it said that English is the largest language in terms of vocabulary. I have also come across several sources that suggest the number of words is somewhere around one million (a bit of a side question: does this figure count each tense of a word separately, and how is the figure arrived at?). Other people have told me that Russian has the largest vocabulary…and I have caught wind of rumors that Chinese holds the title of largest language. So, I would be obliged if my ignorance of this question’s answer could be remedied by you, the members of the Straight Dope Message Board, and the following information provided: the largest language in the world, in terms of vocabulary, and the top few languages in this regard along with their respective number of words. I realize that this question may carry inner complexities for anyone attempting to answer it, but I can think of no other way to state it and would indeed like to know its answer, if one can be determined.

I am quite certain it’s English, although it depends on how you count it. It’s probably not Russian either way. For more info go to the Oxford English Dictionary Website (www.oed.com).

The largest language hmm… that would be Danish, my illustrious mother tongue. Its vocabulary is quite literally infinite, since in Danish you’re able to indiscriminately create new words by combining old ones. E.g. car door -> cardoor; car door handle -> cardoorhandle etc. (there was a famous (in Denmark) businessman who insisted it was his inalienable right always to have a couple morgenbolledamer (literally: morning fucking women) at his disposal). I must regrettably admit that many other (albeit inferior) languages also have that ability, as has English but only sometimes; paper clip -> paperclip – as in: how the goddamn hell do I get rid of this Microsoft Office paper-bleeding-clip?

Ps. I wonder, can we use profanity on these message boards?

From my encyclopedia:

“English has a larger vocabulary than any other language, and because it is the world language, words newly coined or in vogue in one language are very often added to English as well.”

So there ya go for that one.

Finding out how many words a language has is accomplished by the ultra-accurate method of ‘making shit up.’ There never has been, is not now, and never will be an accurate accounting of how many words exist in the English language, or any other living language.

First of all, what is English? The BBC’s official language? The Northern American Midwestern dialect used on CNN? Not the same thing, even though all dialect, jargon, and slang have been excised.

Does English officially include common ‘nonstandard’ terms like ‘ain’t’ (now commonly accepted, BTW) and the alternate senses of ‘cool’?

Does the Oxford English Dictionary, the single largest multivolume dictionary of the language, have a monopoly on The English Language?

How much jargon is part of English? Does the word ‘CD-ROM’ exist? Does ‘megabyte’ or ‘gigabyte’? How about ‘LAN,’ ‘ethernet,’ ‘token,’ ‘grep,’ and ‘foo’?

OK, enough. It’s established that English is a messy, undefined thing. But it’s true that English is the only language that needs thesauruses, and English speakers have historically been willing to accept vast quantities of foreign words as native language. ‘Buffet,’ ‘smorgasbord,’ ‘tattoo,’ ‘tattoo,’ (yes, there are two words spelled and pronounced the same, but with distinct meanings) ‘schmaltz,’ ‘cavalier,’ and ‘blitz’ are all foreign words that have been completely accepted as Standard English. Part of this is the worldliness of the British Navy, part of this is the intrepidness of the American Colonials, and part of this is the fact that the British Isles have been conquered by pretty much all of Western Europe over the millennia.

[hijack]
English is an oddball language, as far as European tongues go: It’s a Germanic language with no genders and little inflection, and it has as much to do with Old Norse and Norman French as it does with Ancient Germanic.

The Germanic comes from two broad groupings of Germanic tribes that colonized the British Isles in prehistory: The Angles and the Saxons, which intermarried to form the Anglo-Saxons. The Anglo-Saxons were a diverse bunch, speaking a variety of basically similar Germanic languages. Because they had to trade and interact in general, the common language gradually simplified to the point where it had no grammatical gender and much less inflection than usual. All of our basic words, like ‘father,’ ‘son,’ ‘water,’ and so on, come from Germanic, as does English’s basic grammar.

The Romans came in later, but their Latin left remarkably little impact on the English language. The biggest thing they did was Hadrian’s Wall, which isolated Scotland to the point where Celtic tongues could survive there long after they had been killed off in Continental Europe. Gaelic owes its current existence to that, as did Manx, before it died off in the 1800s.

Around 1000 AD, Vikings, speaking Old Norse, swept down on the mud huts and Catholic churches of the Green and Pleasant Land. The Vikings, being Vikings, didn’t colonize the Islands, but they left their mark just the same. Old Norse gave us a lot of words, but didn’t completely displace Germanic or remove the Germanic grammar.

Old Norse plus Germanic gives you Old English, the language Beowulf was written in. Old English is not comprehensible to the average English speaker, but it can be learned as a foreign language.

In 1066, William the Conqueror, descended from a group of French-speaking Vikings called Normans, swept to victory at Hastings and established a French-speaking monarchy in England. This gave English 99% of all of its Latin-derived words, such as ‘cavalry’ and ‘beef’ and ‘veal’ and ‘regal’ and other words that haven’t quite shaken off their ‘royal’ feel. Interestingly, these French words didn’t replace their Germanic- and Norse-descended counterparts. Instead, they became synonyms for them, bulking up our vocabulary to astounding degrees. (Try to find synonyms for the words I listed. You’ll probably do pretty well.)

With the kings (aka royals :)) speaking Norman French, the peasants could mutilate English all they chose for all the literate elite cared (another few Frenchy words there :)). This further simplified English.

Old English with a good dose of Norman French gives you Middle English, the language of Chaucer’s Canterbury Tales. Middle English is more or less comprehensible to modern readers, if there is a running translation within easy reach. It’s to the point where footnotes alone really don’t cut it anymore, IMHO.

After a few centuries of Middle English, something really odd happened: The Great Vowel Shift. Around the mid-1600s, in just a few decades, English vowels changed from being French-like to being Modern English-like. One reason our spelling is so screwed up is that this is when William Caxton was establishing his influential printing presses in London, so he preserved Middle English spellings of words that gained Modern English pronunciations. ‘Knight’ and ‘aisle’ were once pronounced as spelled, but are now curious fossils of an earlier age.

The Great Vowel Shift created Modern English, the language of William Shakespeare. Modern English, even Early Modern, is more or less instantly comprehensible now, especially with footnotes.
[/hijack]

I’d say, given the language’s unusually cosmopolitan history, English is certainly in contention for the world’s largest language.

Based on my limited knowledge of English history I will dare to call you a damn liar, or rather to mildly suggest you didn’t get it quite right there.
First came the prehistoric people, who might have talked Chinese for all we know (might also have been Picts), then came the Celts, then came the Romans, then came the Angles and Saxons, then came the Vikings, who being Viking did indeed colonize (after a fair bit of raping and pillaging naturally – hah! We got you there damn Brits!) – especially in what is now Scotland, Ireland, Isle of Man as well as north-east of England-proper (Danelaw), then came the William. The rest is pretty much history.

I would guess there isn’t much difference in size among languages. What you’re doing is naming things, or expressing concepts. Anything can be named or expressed in any language, once it’s known.

I don’t think combining old words to make new ones results in a larger language, since you’re just recycling already existing words.

Plus, while some languages are more economical than others, the number of words that get economized isn’t great. For instance, Russian generally doesn’t use definite or indefinite articles such as “the” and “a.” So, the total number of words in a book might be lower than in the same book in English, but the total number of words in the Russian language is only reduced by two.

What would tip the scale would be who has the most stuff to name? That would probably be modernized developed cultures, but not any one language.

It’s good you didn’t insult me, because I apparently know more than you.

The prehistoric people may have been six-assed monkeys, too, but logic and history suggest that they spoke a group of languages known as Germanic. I don’t know about the Picts, or what they have to do with British linguistic history.

No, the Romans came after the Anglo-Saxons. Give me a cite if you expect me to think otherwise.

I don’t think the Vikings stayed very long. They did burn and rape and pillage, but they did that throughout Europe without necessarily taking the place over. AFAIK, the Vikings were more concerned with getting the stuff back home than with establishing themselves as local lords. Hence Danegeld, an early protection money, and Kipling’s famous ‘Once you pay the Danegeld, you never get rid of the Dane.’

I might be wrong about early British history, but I’ve done more than a little research into this at the layman’s level in pursuit of my linguistic knowledge.

Umm, Derleth, what is your problem? Your history was completely out of order, a perfectly correct ordering was posted and now you demand cites. Yours is the one completely out of whack, it is you that needs to provide cites.

After all, you’re the one that apparently never heard of the Danelaw!

Derleth, I enjoyed your post and obviously you have done a lot of study on the subject. However, I wondered about one thing on my first reading and then it came up again.

If the Vikings did not stay around very long, how come:

Did the Vikings not stay in England, but did stay in France? If so was it because of the food? ;)

How could this be possible? Are you saying that English is the only language with synonyms?

Very well said post, Derleth, I’m impressed. But WinstonSmith does have a point; there were Vikings prior to the Norman conquest who established colonies, even kingdoms, in England. Sweyn Forkbeard, son of Harold Bluetooth (I swear I’m not making these up) conquered England in 1013. His son Canute was king of England, Denmark, and Norway, and was eventually accepted by the English nobility as equal to a Christian king, the first Viking Chieftain to do so. The fact that the Vikings eventually “blended in” muddies the picture a bit, at least until the Normans started throwing their weight around.

For sheer number of characters, it is probably unsimplified Chinese (or whatever comes after simplified Chinese), which uses about 80,000 characters, with 35,000 apparently needed for a good understanding of the language.
I don’t know about vocabulary.

I once read English has ~400,000 words, and the other major languages (French/Spanish) have ~100-150k words. On top of that the average vocabulary was ~10,000 words, and the average person only uses a couple thousand a day. No cite though, sorry.

As some have it, all languages, or at least those we’ve taken a good look at, are infinite. If you were to collect a corpus of every distinct word used in the New York Times over the course of, say, 5 years, you’d probably have a million words on your hands. Have a look through the paper the day after you stop and you’ll find about half a dozen new words. Point being, languages create new words every day and the lexicon continues to grow. Finding an upper limit is going to be tough.
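For anyone who wants to try the newspaper experiment at home, here is a minimal sketch of the counting involved. It is an illustration only: the function names `word_types` and `new_words` are made up for this example, the sample sentences stand in for real newspaper text, and the tokenizer is deliberately crude (lowercase ASCII letters, with straight apostrophes and hyphens allowed inside a word), so it treats "run" and "runs" as two different words.

```python
import re

def word_types(text):
    """Return the set of distinct lowercase word forms found in text.

    Assumption: a "word" is a run of ASCII letters, optionally joined by
    straight apostrophes or hyphens. Inflected forms count separately.
    """
    return set(re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower()))

def new_words(corpus_texts, next_text):
    """Return the words in next_text that never appeared in corpus_texts."""
    seen = set()
    for text in corpus_texts:
        seen |= word_types(text)  # grow the lexicon one document at a time
    return word_types(next_text) - seen

# Toy stand-ins for "five years of the paper" and "the day after".
archive = ["The cat sat on the mat.", "The dog sat too."]
next_day = "The cat blogged about the dog."
unseen = new_words(archive, next_day)
```

How you define a word (case, inflections, hyphenated compounds, proper names) changes the totals a great deal, which is the same problem raised earlier in the thread about counting dictionary entries.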

Which language has the largest lexicon at present? Dunno, but my WAG is that it would correlate with the age of the language and number of speakers. The longer it’s been around, the longer it’s had to grow. Likewise, the more people there are speaking the language, the greater probability of developing specialised words.

I was apparently wrong. Thanks to pravnik, who explained it to me in a reasonable tone, and everyone else should probably calm down.

I was wrong, as I’ve said, so the issue some people had with my apparent contradiction should be resolved as ‘Derleth was wrong on the first count, so alter it so it makes sense with the second. pravnik will show you how.’ :)

I have heard of the Danelaw, but it never connected with anything for me until now. Oh well, that’s why I come here. :)

But wait…

…I still want to know about this comment. Is this really true, or were you exaggerating for effect?

Well, if it makes you feel any better, Derleth, I still believe your basic premise is correct: English can be considered as having the largest vocabulary due to intermixing and borrowed words that affected the language far more than any of the other Germanic languages. “Detente” can be considered an English word, even though of French origin, for example. That may seem like cheating, but that’s exactly how many words we think of as standard English entered the language, like “senior” and “editor” (it helps if you say them with the Monty Python French accent). Norman words like veal, beef, mutton and pork took their place alongside the corresponding Anglo-Saxon calf, ox, sheep and swine in one language. From there it just snowballed, and common English words are borrowed from all over the place, like “raccoon” from Algonquin.

Answer Kyomara already! I want to know too. :)

Correct order (cites on request, though any google search on British history should back me up - I hope ;) ):

1.) Pre-historic - Unknown linguistic group. The folks we eventually came to know as the Picts ( in Scotland ) and Cruithni ( in Ireland ) may have been the first settlers or may not. However by historical times they had been fairly thoroughly “Celticized” ( the Picts that were merged with the Scots in a united kingdom under Kenneth MacAlpin spoke a mostly Brythonic language, while the Dal Riata dynasty that established the “Scot” foothold of “Dalriada” to begin with, may have been a Gaelicized Pictish dynasty ).

2.) The Celts - At least two distinct waves ( quite probably more and continuous, like the later Anglo-Saxon migrations ), from which we get the Goidelic ( Q-Celtic - Gaelic, Manx ) and Brythonic ( P-Celtic - Welsh, Cornish, Breton ) branches of the Celtic linguistic group in the British Isles. These two separate branches divided on the continent by the way ( so the Celtiberians of Spain spoke a Q-Celtic language, the Gauls of the Swiss region a P-Celtic tongue ). This group pushed the “Picts/Cruithni” north and eventually absorbed/merged with them. There may have been some minor Germanic intrusion at this time during one of the later migrations - The Belgae who had a presence on both sides of the channel and are usually referred to as Celts, may have been a mixed confederacy of Germanic and Celtic tribes.

3.) The Romans - First Latin influence.

4.) The Anglo-Saxons - Which included not only the traditional Angles, Saxons, and Jutes ( who settled in Kent ), but Germanic peoples from all over northern Germania/Scandinavia. The famous king buried at Sutton Hoo appears to have been one Raedwald, who was apparently a scion of a Swedish dynasty.

5.) The Vikings - Norwegians in the north, mostly, Danes more in the south. It is worth noting that the differences between the Norse invaders and the earlier Anglo-Saxons, both culturally and linguistically, were probably not very profound ( though by then the Saxon English were perhaps a bit more settled ). Canute’s Danish was apparently at least roughly mutually intelligible with Anglo-Saxon English, though of course as has been pointed out, there were plenty of Danes already settled in England before Canute even took the throne, so the transition was not so difficult. In fact Canute by most accounts appears to have been a reasonably popular monarch in England.

6.) The Normans - Thoroughly Latinized ( Francophone-style ) Norse. The last major pre-modern piece in the puzzle.

  • Tamerlane