What written language conveys the most information with the fewest characters?

I have noticed that with any text in both English and Español, the one in English takes up less room on the page. Deutsch seems bulkier than English, and Тексты на русском are fairly bloated as well. I would assume that any ideographic text would be much more compressed.

Has there ever been a study of which written language conveys the most information per page?

Chinese. Unfortunately, it comes at the cost of ease of learning: memorizing several thousand ideographs and the sets of ideas they can represent is a multi-year process at best.

Latin’s syntax and grammar combine to make it theoretically and potentially one of the best languages for a combination of precision and conciseness – unfortunately, the majority of Latin writers didn’t avail themselves of that capability! :wink:

Hey cool! I have a Google ad in Chinese on this thread! I’ve never seen that before… :slight_smile:

Classical Chinese wins, I would think. To give an extreme, but concrete example, here is the introduction to Sun Tzu’s Art of War as translated in English by Lionel Giles:

Here’s the original text:

Giles’ translation is actually on the terse side and mirrors quite precisely the Chinese text. Nevertheless, it’s more than twice as long.

[nitpick]Chinese characters are often called “ideographs,” but they do not in fact represent ideas, but rather, words and other morphemes. Some discussion can be found at the Wikipedia article on ideographs.[/nitpick]


Given all the commas and semicolons in the Chinese text, I’m personally guessing that the writing is skipping a lot of assumed words.

Walk,dog,tuesday;god,punish,not. -> Make sure to walk the dog on Tuesday, or else god will punish you.

If you’re going to define a very complex Chinese character (isn’t the preferred term now logograph?) as just one character, I’d say you’re kind of gaming the question. Wouldn’t it be better to ask which written language conveys the most information in the fewest strokes? As for translations: one can translate with simple-to-understand language or high-falutin’ language and get wildly different lengths for the translation.

And the ads? I get (from left to right): one in Korean and three in English.

Aren’t translations usually longer than the original text, regardless of what language it’s translated to and from? Or is a Chinese translation of an English novel actually thinner than the original?

Yeah, something like that.

孙子曰 (Sun Tzu said):兵者 (war),国之 (nation’s) 大事 (important task).
-> Sun Tzu said: The art of war is of vital importance to the State.

死 (death) 生 (life) 之 ('s) 地 (grounds?)
-> It is a matter of life and death

存 (safety) 亡 (ruin) 之 ('s) 道 (road)
-> a road either to safety or to ruin

etc., etc. The translation is a rough one - it’s been a while since my high school classical Chinese classes.
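
For anyone who wants to put a number on that, here is a rough Python sketch that counts characters for the three phrases glossed above. It uses only the Chinese and the Giles wording quoted in this post; the variable and function names are mine, spaces and punctuation are dropped from both sides, and the UTF-8 byte count is just one alternative yardstick, not a claim about space on the page.

```python
# Character counts for the three phrases glossed above (text taken from this thread).
chinese = "孙子曰兵者国之大事死生之地存亡之道"
english = (
    "Sun Tzu said: The art of war is of vital importance to the State. "
    "It is a matter of life and death, "
    "a road either to safety or to ruin"
)

def letters_only(text):
    """Keep letters and Chinese characters; drop spaces and punctuation."""
    return "".join(ch for ch in text if ch.isalnum())

zh = letters_only(chinese)
en = letters_only(english)

print(len(zh), "Chinese characters")
print(len(en), "English letters")
print(len(zh.encode("utf-8")), "bytes for the Chinese in UTF-8")
print(len(en.encode("utf-8")), "bytes for the English in UTF-8")
```

Counting bytes narrows the gap, since each of these Chinese characters takes three bytes in UTF-8 while each English letter takes one.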

Purely anecdotal, but when I worked as a translator (English-Korean), the Korean translations were usually shorter than the English originals, while the English translations were almost always longer than the Korean originals. English just takes up so much space.

The arrangement of hangul characters into syllabic blocks rather than stringing letters out in a line has to compact things. At that, you will notice that the OP observed that English tends to take up less space than some other European languages like Spanish or German. I would guess that part of the effect there is that English has a very large vocabulary. Other languages might have to use modifiers to cover something English has (or more likely, stole) a succinct word for. I remember observing “There’s a word for that …” concerning some topic or another when speaking to a friend whose native tongue was Farsi. His response was “English has a word for everything. There’s probably a word for a guy who masturbates on Tuesdays.”

For German, it can look even worse than it is for text printed in small areas because of all the long words (really, hyphenated words without the hyphens). It folds badly. And capitalizing all the nouns leads to more use of wide letters in variable width fonts.

The semi-colons are there to make the text easier to parse. It’s not skipping assumed words; classical Chinese is extremely compact. Not just the writing, but the grammar also.

And while I’m willing to accept that translations might generally tend to make texts longer, this is certainly not always the case. As a matter of fact, texts translated from English to Chinese would still end up shorter in Chinese.

Take the same excerpt from the Universal Declaration of Human Rights:

In French:

In English:

In Japanese:

In Chinese:

Again, Chinese is more compact than the other languages, at comparable font sizes, due to its compact script, grammar and vocabulary. I singled out classical Chinese because it’s even more compact than modern Chinese (as in the UDHR).

What about ‘Ogham’ then?


I’m at work so haven’t read the entire Wiki entry, but I remember thinking ages ago that given that the writing is largely made up of vertical lines crossing the main horizontal one, you’d be able to write a great deal in very little space. The picture of that olde parchment looks pretty compact…

Nope, my experience translating from Spanish to English and vice versa shows that, in general, the English version is shorter. I’ve been known to say “ohmyGod, I got a paragraph that’s shorter in Spanish!” and do a little victory dance.

A lot of it is the verbs; many actions which in English can be expressed with a single (often short) word require a phrasal verb in Spanish. Another factor is the completely different phraseology. If I translated any text “brick by brick,” Babelfish-style, the length wouldn’t vary as much, but my selling point as a translator is that I can often come up with 2nd-language versions that leave people wondering which one is the original. That requires rephrasing, changing similes, etc.

I’m not sure I agree that your post has comparable font sizes, in relation to the OP. I don’t think it’s a valid comparison putting one very complex Chinese character (that needs to be distinguished from thousands of others) in the same width as one very simple roman letter. While I don’t read either the Chinese or Japanese scripts, it seems that one would have to look close and squint a little bit to read the ones you posted, while the roman-letter scripts are very clear and readable (on my fairly high res LCD screen at least).

For information density comparisons, we should compare texts with either similar levels of readability, or more objectively, compare minimum pixel counts. Roman letters are legible at 8x8 pixels. How many pixels does it take to make distinguishable classical Chinese characters?

(I’m not saying classical Chinese wouldn’t still win, but we should make reasonable comparisons).

The hardback copies of Hobbit + LOTR that I got in China, in Chinese, are MUCH thinner than any copies I’ve ever had in English, and the font is large and readable. I can’t vouch for the quality of the translation, since my Chinese still sucks. :slight_smile:

People I know in China can read characters in amazingly small fonts. They say you don’t have to be able to see every stroke to know what the character is.

As a slight tangent in the conversation, my Chinese teacher told me that, in her experience, not only will the Chinese version be more compact, but people with equal facility in reading will be able to read the Chinese in a shorter time. She claimed this to be true in both the Chinese -> English and English -> Chinese translation scenarios.

Oh, and note that the original Art of War has no punctuation at all, IIRC.

I agree that it’s sort of silly to compare English, which uses a 26-character alphabet, with Chinese, which uses a set of 1000+ symbols, unless the question the OP wanted answered was “which language has the most distinct characters.” Whichever language has more characters is likely to be able to convey more information with fewer of them, simply because there are many more words that need no more than one character. It’s interesting that, even with 50 times as many symbols, and other space-saving techniques like not using spaces, Chinese is only about one tenth the size of English in terms of characters, and only one third in terms of space taken up. This suggests that English is actually more efficient, given its limited alphabet.
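
To make the alphabet-size argument a bit more concrete, here is a back-of-the-envelope Python sketch. It assumes every symbol is used equally often, which no real language does, so the figures are only a ceiling on information per symbol. The 26 and the "50 times as many" come from the paragraph above; the function name is made up for illustration.

```python
import math

def max_bits_per_symbol(alphabet_size):
    """Upper bound on information per symbol if every symbol were equally likely."""
    return math.log2(alphabet_size)

latin = max_bits_per_symbol(26)         # 26-letter alphabet
chinese = max_bits_per_symbol(26 * 50)  # "50 times as many symbols", per the paragraph above

print(round(latin, 2), "bits per letter at most")
print(round(chinese, 2), "bits per character at most")
print(round(chinese / latin, 2), "letters' worth of information per character at most")
```

Under that uniform-usage assumption the ceiling grows only with the logarithm of the symbol count, so 50 times as many symbols does not mean 50 times as much information per symbol.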

A more interesting way to approach this question, in my opinion, is to ask which language is the most concise, given its set of characters. To answer this, you’d want to get a statistical distribution of the most commonly used 10000 or so words, and see how well-correlated usage is with size.
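
A rough Python sketch of that kind of check, under heavy assumptions: the sample sentence is a toy I made up, not a real corpus, and instead of a proper correlation it just compares the average length of distinct words with the average length weighted by how often each word is used. If common words tend to be short, the second number comes out smaller.

```python
from collections import Counter

def mean_word_lengths(text):
    """Return (mean length over distinct words, mean length over all word uses)."""
    words = text.lower().split()
    counts = Counter(words)
    by_distinct_word = sum(len(w) for w in counts) / len(counts)
    by_usage = sum(len(w) * n for w, n in counts.items()) / len(words)
    return by_distinct_word, by_usage

# Toy sample only; a real test would need the 10000 or so most common words.
sample = "the dog saw the cat and the cat saw the extraordinarily patient dog"
by_distinct_word, by_usage = mean_word_lengths(sample)

print(round(by_distinct_word, 2), "letters per distinct word")
print(round(by_usage, 2), "letters per word as actually used")
```

Running the same function over comparable word lists in several languages would be one way to compare them, though the result would depend heavily on the corpus chosen.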

Actually, I think that is precisely what the OP is asking: which language contains the most information in the least space (or characters, in his words). To that end, the logographic languages win.

It’s certainly valid to point out that the higher density comes at the price of complexity, and if you counted the actual number of strokes it takes to create the text, I don’t think Chinese would come out ahead. Nonetheless, it’s fair to answer the OP’s question by saying “Chinese.”

Good point. Imagine what this text would look like if it were written out entirely in binary. The strings of 0s and 1s would take up much more space. This space could be compacted down by notating it in hexadecimal, which would reduce each 8-digit string to two characters, but adds the complexity of having 16 symbols (0-F) instead of 0 and 1. Each hexadecimal pair can be represented by a single character, thereby reducing the space consumed but also increasing the complexity of the text (256 characters, theoretically, though most likely only the printable ASCII characters, 32-126, would appear).
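
Here is a small Python sketch of that trade-off, assuming plain ASCII text with one byte per character. The function name and the sample word are just for illustration.

```python
def as_binary_and_hex(text):
    """Return the same ASCII text written out in binary digits and in hex digits."""
    data = text.encode("ascii")
    binary = "".join(format(byte, "08b") for byte in data)  # 8 symbols per character
    hexadecimal = data.hex()                                 # 2 symbols per character
    return binary, hexadecimal

word = "dog"
binary, hexadecimal = as_binary_and_hex(word)

print(binary, len(binary))            # alphabet of 2 symbols
print(hexadecimal, len(hexadecimal))  # alphabet of 16 symbols
print(word, len(word))                # alphabet of up to 256 symbols
```

The same three letters take 24, 6, or 3 symbols depending on how large a symbol set the reader is expected to know.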

No, because one could easily invent a script where all glyphs were single strokes of varying lengths and/or orientations.