Being a linguist, I must report that, sadly, we don’t have that kind of power over any language’s orthography. What I think you’re referring to, though, is transliteration. That’s where a word in one language is written in another so as to give a sense of how it’s pronounced in the first language. As languages have different sound inventories, this is of necessity an approximation at best. There is also the added problem that some languages use the same symbols to represent different sounds, and so sometimes a word is just “lifted” entirely as written in the first language.
Well, the lexicon of any language is independent of its orthography, so this doesn’t have that much to do with spelling. Korean, as an example, has a fair number of words borrowed from Chinese, Japanese, English, and French. As the Korean alphabet is not based on Latin’s and Korean has a different phonetic inventory than those languages, the transliterations are done in various ways; one could even say irregularly.
Actually, Polish (and Gaelic, for that matter) are very good at being easy to pronounce from the written word. The word in your OP, for example, should actually be spelled (I’m assuming, from the way you described its pronunciation) “Kołłątaj”. (If you can’t see the characters, the “l” has a little crossbar in the middle, and the “a” has a little hook at the bottom). In Polish, “ł” is pronounced like an English “w” or like the “l” in “hull”, depending on dialect, and “ą” is a nasalized “a”, like in French “blanc”. Gaelic spelling, which I’m not going to go into here, is equally logical – there is a reason why “sh” and “th” are pronounced as “h”, because of the way consonants in the language change their sound. My point is that with either language, if you know how the spelling works, you can look at a word and virtually every time come up with how it is pronounced.
Here’s the thing – most of these nations didn’t have their own system of writing before the Latin alphabet dominated, and those that did usually had alphabets that could handle the non-Latin sounds of their language. There’s nothing sacred about the ways that we in English use to represent these sounds; heck, if you go back to Old English you’ll notice that we represented them much differently than we do today (“cg” for “dj” or “dg”, “sc” for “sh”, “þ” and “ð” for “th”). Each region that received the Latin alphabet had to work out for itself how to represent non-Latin sounds. For what we spell as “sh”, the Poles chose “sz”, the Hungarians chose “s”, the French developed “ch”, the Italians “sc”, the Spanish “x” (and then lost the sound and turned it into “j”), the Croatians and Czechs and Slovaks “š” (“s” with a caron), and so forth. There’s no consistency, because there is no “right” way to do it, but rather each set of scribes had to work out something for themselves.
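To make the "no right way" point concrete, here is a rough sketch of that list as a lookup table. The names `SH_SPELLINGS`, `spell_sh` and `languages_using` are made up for illustration, the entries are only the ones mentioned in this post, and Spanish is left out because it no longer has the sound.

```python
# Spellings of the "sh" sound, taken from the list in the post above.
# This is an illustrative table, not a complete survey.
SH_SPELLINGS = {
    "English": "sh",
    "Polish": "sz",
    "Hungarian": "s",
    "French": "ch",
    "Italian": "sc",
    "Croatian": "š",
    "Czech": "š",
    "Slovak": "š",
}


def spell_sh(language):
    """Return the letter(s) the given language uses for the "sh" sound."""
    return SH_SPELLINGS[language]


def languages_using(grapheme):
    """Return, in alphabetical order, the languages that chose this spelling."""
    return sorted(lang for lang, g in SH_SPELLINGS.items() if g == grapheme)
```

The same sound maps to six different spellings across eight languages here, which is about what you would expect when each set of scribes solved the problem independently.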
People have given examples that show how English doesn’t keep strict Latin pronunciation. So why do you seem to think that English’s pronunciation (with regards to the Latin alphabet) is the only “true”, correct one?
Going by the spellings in the OED etymology of folk, it seems the original pronunciation was /f/. That’s the case with both Old English and Old High German. In Middle High German, the spelling is with a V, so perhaps the pronunciation went from /f/ to /v/ and back to /f/ in German. However, I’m not an expert on German, so this may be wrong. Also, I’m not sure when this pronunciation shift took place, just some time after spellings were standardized which was several hundred years ago.
I think the OP is wondering why they are used differently, or how did this happen in the first place? What’s the point of having a fairly common set of symbols if they are used in a different manner everywhere?
Can you imagine if some culture decided that “4” represents six things?
This has been alluded to in many of the messages above, but basically it’s because each language has a different set of phonemes (i.e., sounds that make a difference as to what a word means). Sometimes it’s because a language has a sound that doesn’t appear in other languages – e.g., English has two “th” sounds, and these are not found in many other languages; Gaelic and German have a “ch” sound which only appears in English in words derived from languages like Gaelic and German, e.g., “loch”. And sometimes it’s because languages distinguish two sounds which correspond to just one sound in another, e.g., English distinguishes “l” and “r”, but these are effectively one phoneme in Chinese and Japanese, so that to them “long” and “wrong” sound the same.
Letters correspond with sounds, and digits correspond with numbers. Numbers are the same across all cultures, so each culture has a “4” concept which is the same (though speakers of Arabic, Chinese and Hindi will write it differently – the Chinese form being unrelated in origin to our “4”, and the Arabic and Hindi forms being derived from the same origin.)
But using that argument, any language that uses the Latin alphabet should use the same letter for the same phoneme, provided it’s common amongst those languages. So are you saying all languages that have what English calls a “w” phoneme use the letter ‘w’?
If a 4 is a 4 is a 4, then a “wuh” is a “wuh” is a “wuh” and all should use a ‘w’.
Since that’s not so, the comparison to numbers is not valid.
I think it’s accidental, rather than planned by linguists. It all depends on when spelling reforms are instituted (see below).
Perhaps because the invaders stayed in the case of England, but “marched right through” as you said about Germany. But Germany instituted a series of spelling reforms, whereas English hasn’t (or at least hasn’t for a long time). It’s really just that simple. In fact, there was a spelling reform for German as recently as 1996. When was the last time the English-speaking countries got together to rationalize the spelling system for that language?
The concepts of particular numbers are fixed, and don’t slide into each other the way sounds do. The number 4 doesn’t slide into the number 5 the way that “s” can slide into “z”, or the different vowels can slide into each other.
The other problem is that the Latin alphabet had only 23 letters. English has 26, because it separated I/J and U/V, and added W, but that’s not enough for all the phonemes of English. So for a sound like the “sh” in English, you can:
[ul]
[li]Create a two- or three-letter combination (like “sh” in English, or “sch” in German)[/li]
[li]Use a diacritic, like “ş” or “š”[/li]
[li]Create a special character, like the IPA esh “ʃ”, which resembles a long s[/li]
[/ul]
Since each language has its own history, its own set of phonemes, and its traditions of spelling, each language has its own solutions to the problem. And they can change, too: even English has lost a few letters over the centuries, including “æ”, “þ” and “ð”, and has lost phonemes, such as the sound once represented by “gh”.
As has been stated already, Polish orthography is more phonetic than English orthography. Few languages are as un-phonetic as English.
Actually, you might ask, why does English use the letter ‘J’ for the wrong sound?
No, they didn’t.
That’s because you’ve been taught Anglicised Latin. You’re not pronouncing things the way that the actual ancient Romans would have.
Yes, Yoolioos Keyezahr.
Cicero was Keekayrow.
The Romans didn’t invent their alphabet. They took it from the Greeks and changed a few things around – modified letter shapes, added some, dropped some, and used others for sounds that they weren’t used for in Greek. The Greeks had done the same thing with the Phoenician alphabet.
So, really, adopting an alphabet and changing it around for a particular language’s purposes is a long tradition.
Living in a college town, I encounter a lot of people for whom English is a second language. The one thing I hear repeatedly is how English is a very easy language to learn to speak but a royal bitch to learn to write. Why? Because of our wacky pronunciations and their inconsistencies. A teacher I once had, a Greek from Cyprus, used to laugh at the way we pronounced the island’s name. We say “Sigh-pruss” and native Greek speakers say “Kee-proos.” It’s all relative. English isn’t the foundation from which we should measure all other languages.
Minor nitpick: Are you sure about that? Mandarin seems to have a clear distinction between ‘R’ and ‘L’ to me. They even have it in pinyin, so I figured it’s definitely two sounds to them. My girlfriend, who’s a Chinese mainlander, always says ‘tomollow’ instead of ‘tomorrow,’ but I always figured she was being cute. My Japanese friends do all have the R/L problem when it comes to English, though.
Cantonese speakers do seem to have a bigger problem with the R/L sound. I can’t think of any Cantonese word that starts with a definite ‘R’ sound, so I guess that’s why.
Just an anecdote. I speak native Cantonese and have been trying to pick up Mandarin. One of the biggest problems for me is that so many of the initial sounds are so difficult for me to distinguish. I can barely tell “X” from “SH”, and “CH”, “ZH”, and even “Z” and “J” can all blend together to my ears. One day in class someone asked my instructor what a southern Chinese accent is. She replied, “They make everything sound the same.”
For all this is repeated in discussions like this, it isn’t really true. No native English speaker who had never heard of this supposed linguistic titbit would ever pronounce “ghoti” as “fish”. In English spelling, the “gh” is only pronounced as an “f” at the end of a morpheme such as “tough” or “laugh”, never at the beginning of a word. Likewise, “ti” is only pronounced as “sh” when in a certain combination with other letters, such as “<vowel> + tion” or “<vowel> + tia” (like in “negotiation”). I confess I do not know the exact reason why “o” is pronounced the way it is in “women”; if I had to hazard a WAG I’d say it’s some sort of umlaut going on, or else a remnant of the original Anglo-Saxon pronunciation “wifman”, but it is clearly a special case, and not applicable to English orthography in general.
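Here is a rough sketch of those positional rules, just to show why “ghoti” fails them. The function names are invented for this post, and the rules are deliberately crude: I treat “end of a morpheme” as “end of the word” (so “laughter” would be missed), and I only check the two “ti” patterns mentioned above.

```python
VOWELS = set("aeiou")


def gh_can_be_f(word, i):
    """True if the "gh" starting at index i ends the word (e.g. "tough").

    Assumption: word-final position stands in for morpheme-final position.
    """
    return word[i:i + 2] == "gh" and i + 2 == len(word)


def ti_can_be_sh(word, i):
    """True if the "ti" at index i sits in <vowel> + "tion" or <vowel> + "tia"."""
    if word[i:i + 2] != "ti" or i == 0:
        return False
    preceded_by_vowel = word[i - 1] in VOWELS
    followed_by_on_or_a = word[i + 2:i + 4] == "on" or word[i + 2:i + 3] == "a"
    return preceded_by_vowel and followed_by_on_or_a
```

With these checks, `gh_can_be_f("tough", 3)` and `ti_can_be_sh("nation", 2)` both pass, while `gh_can_be_f("ghoti", 0)` and `ti_can_be_sh("ghoti", 3)` both fail: the “gh” is in the wrong place and the “ti” has nothing after it.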
Oh, and even Church Latin doesn’t really count if we’re pronouncing Ivlivs Cæsar in Classical Latin. Church Latin is pronounced like the “Vulgar Latin” dialect in Italy as it started evolving into Italian.
It’s a limited set of characters, and we apply it to languages whose phonetics won’t stand still, even though the written word does. In Eastern Europe, for Cyrillic script the missionaries started with the basic Greek character set and did some serious modifications and additions to optimize it for Slavic languages – yet you still get characters that do not really map from Bulgarian to Russian to Ukrainian to Serbian, after lo these many centuries.
Now, sure, the European missionaries could have done like was done in the case of Cherokee, where Sequoyah made up a character set – some of which look like Roman fonts – and applied them to Cherokee syllables, without worrying what their lookalike character would be pronounced as in European languages. The result is quite phonetic… if you’re pronouncing it in Cherokee! But the thing is, in Europe the literate classes’ primary scholarly and official communications were in Latin and Greek anyway for centuries, so they used what they knew.
Gaelic spelling is highly regular and rational; it’s just a complex system that’s tough for non-speakers to puzzle out. Part of the reason is that it has two parallel sets of consonants, and words starting with one can change to the corresponding member of the other set according to regular grammatical rules. Quite reasonably, it was decided to make that grammatical process regular in spelling, too - by adding an “h” to signal that a letter had undergone the process. That resulted in a lot of strange spellings. “mh” certainly doesn’t occur in most writing systems - but it makes perfect sense in the context of Gaelic.
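As a very simplified sketch of that spelling convention: the helper below (the name `lenite` and the consonant list are my own choices for illustration) just inserts an “h” after an initial consonant that can undergo the change. It ignores the grammatical conditions that trigger the change and the exceptions that real Gaelic grammars list, so treat it as a picture of the spelling idea only.

```python
# Initial consonants whose changed form is written by adding "h".
# Assumption: this is the usual list; exceptions such as certain
# "s" clusters are deliberately not handled here.
LENITABLE = set("bcdfgmpst")


def lenite(word):
    """Write the changed form of a word by adding "h" after its first letter."""
    if not word or word[0].lower() not in LENITABLE:
        return word  # vowels and l, n, r are left unmarked in spelling
    if word[1:2].lower() == "h":
        return word  # already marked as changed
    return word[0] + "h" + word[1:]
```

So “máthair” (mother) comes out as “mháthair” and “bean” (woman) as “bhean”, which is how the odd-looking “mh” and “bh” end up at the start of words.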
That’s an example of how a writing system might try to do more than simply reflect the precise pronunciation of a word. By the same token, the English words “photograph” and “photography” have highly different pronunciations. In my dialect, using the SAMPA standard for the International Phonetic Alphabet, they would be /fou 4@ gr{f/ and /f@ ta gr@ fi/ - there are actually more sounds that change than stay the same. But the processes that reduce unstressed vowels to schwa sounds or cause the “t” to turn into an alveolar flap are entirely predictable - unconscious, even - to a native English speaker, so it makes more sense to use similar spellings to clue a reader in to the fact that they come from the same root.
Very few languages have writing systems that even approach being phonetic, due to matters of how the language works, its history, and so forth. It’s more surprising when a language does have an entirely rationalized writing system - but even then, it wouldn’t be expected for it to be clear and understandable to those who don’t speak it.