Evolution of East Asian spoken languages

Does anybody have any information about the evolution of spoken Japanese, Chinese and Korean languages, where they came from, when they split? Or anything like that.

I’m looking for information on spoken languages, not written languages.

The best online source (not perfect, but the best one) for the relationships between languages is Ethnologue. It’s a website with information about all of the world’s languages. Start on the following webpage:


For Japanese, click on the link that says “Japonic”. For Korean, click on the link that says “Language isolate” and then on the link that says “Korean”. For Chinese, click on the link that says “Sino-Tibetan”.

As you can see, Japanese is in a small language family. Korean is a language isolate with no known related languages. Chinese is in a large language family. As you can also see, Japanese, Korean, and Chinese are unrelated. Note that “unrelated” doesn’t truly mean that linguists think that these languages (or any others) originated separately tens or hundreds of thousands of years ago. It’s thought that language originated only once 100,000 (or so) years ago. It’s not possible to discover any relationships before about 10,000 years ago though, so languages with no relationships within the past 10,000 years are called “unrelated.”

Thank you for the informative citation.

I believe this overstates the case by several thousand years.

The first known written languages, Sumerian and Egyptian, both first attested
ca. 5500 years ago, belong to different language families. There is no consensus
on the relationship between any language families beyond the bare fact that they
must have split from each other at some time very much earlier than the beginning
of recorded history:

Language Families

(from link):

Umm… Wendell is summarizing the findings of glottochronology and comparative linguistics, which seek to establish the relationships, and hence the probable origins, of languages, based on structural and core-vocabulary similarities. And there does exist consensus in broad terms as to, e.g., the Indo-European and Malayo-Polynesian ‘phylums’ of languages. I.e., virtually every expert in the field, without significant exception, concurs that Tadzhik, Rajasthani, Ukrainian, Greek, Latvian, Swedish, Portuguese, Armenian, and Afrikaans are related, with a probable time of divergence somewhere between 4000 and 2000 BC. Other, more speculative relationships between the ‘consensus’ groups and between one of them and small families or language isolates have also been postulated by some scholars in the field, but have not attracted a consensus.

The interrelationships of the ‘Han’ (Chinese) languages with each other, and more distantly to Tibetan, Burman, and some related languages, in a Sino-Tibetan phylum is among those for which consensus exists.

Another consensus group is the Altaic phylum, which includes the Turkic languages, spoken across Central Asia from Turkey to Sinkiang; the Mongol languages, spoken in Inner and Outer Mongolia and in Siberia north of Mongolia; and the near-extinct Manchu-Tungus group, spoken in Manchuria and northeastern Siberia. Plausible hypotheses that have not gained consensus (or in one case had consensus in the past and then lost it) link the Altaic group to one, some, or all of the following:

• the Finno-Ugric group, which includes the Magyar of Hungary; Khanty and Mansi, spoken in northwest Siberia; the Finnic tongues including Suomi (Finnish), Saami (Lappish), Eesti (Estonian), and a group of tongues spoken by minorities within Russia; and the Samoyed languages of the Russian Arctic.

• the Japonic languages, including Japanese and the Ryukyuan dialects

• Korean

• the language of the Ainu people of Hokkaido

These hypotheses are individually regarded with varying degrees of acceptance and skepticism by scholars in the field.

colonial, your post is really confusing. All I said was that there are no generally recognized relationships if the resulting languages split up more than about 10,000 years ago. There are some speculative classifications where the proto-language was more than 10,000 years ago, but those aren’t generally accepted. Do you agree or disagree with that statement? Do you have a figure of more or less than about 10,000 years as being the boundary? Please explain what you’re trying to say and don’t just post some citations.

There are some people who believe that Japanese is related to Korean. There are some who believe that Korean is related to Altaic.

For non-linguists, the idea that Japanese and Korean are related is quite compelling because they just sound so similar. But linguists have identified certain apparent cognates (word forms in common) between the two languages.

The theory goes that Japanese is related to a different ancient Korean dialect from the one that gave rise to modern Korean, at a time when the Korean peninsula was much more linguistically diverse than it is now. In the intervening time, the dialect most closely related to Japanese was either displaced or lost its distinct identity through convergent evolution, with the result that the relationship became less obvious.

One problem with identifying cognates is the large amount of shared (borrowed) vocabulary (much of it of Chinese origin) in Korean and Japanese.

(post #4)

OK; there may be no expert dissent at all.

Latvian and Armenian may have originated 2000-4000BC, and Tadzhik I do not know about.


• Rajasthani would have diverged from Sanskrit after the Aryan invasion of India,
which took place after 2000BC.

• Portuguese came into existence with the other Romance languages after the fall
of the Roman Empire ca. 450AD ff.,

• Ukrainian came into existence after the origin of the Slavic languages about the
same time,

• Swedish after the breakup of Norse after 1000,

• and Afrikaans from Dutch since 1700.

Also, the literate and urban Hittite civilization was flourishing before 1500BC.
Since it was an Indo-European subfamily member (Anatolian) its subfamily
must certainly have split from all other IE subfamilies much earlier than 2000BC,
otherwise there would not have been enough time for Hittite to first subdivide
and then reach a separate advanced stage of development. Consequently there
are unlikely to be any experts who support divergence from Proto-Indo-European
anywhere near 2000BC, although some may support a 4000BC divergence.

Although it may be what you meant to say, it is not what you did say in post #2:
"It’s not possible to discover any relationships before about 10,000 years ago”.
That means beginning 10000 years ago relationships are discoverable. In the
present state of scientific linguistics they are not.

I did explain what I was trying to say immediately before the citation:

Writing is now the only way to confidently establish a chronological sequence
in linguistic relationships, and the sequence boundary now begins with the first
known written languages: Sumerian and Egyptian ca. 5500 years ago.

If Sumerian and Egyptian had belonged to the same language family, then they
would have been the earliest languages for which a relationship could be established.

However, since they belong to different language families (Sumerian and Afro-asiatic),
then relationship between them cannot be established because no relationship between
any family is yet established.

That pushes establishing any relationship between any languages to dates more
recent than 5500 years ago, and further away from 10000 years ago.

I was going to ask about this. My understanding was that there was strong archaeological and genetic evidence that the modern-day Japanese were largely descendants of a relatively recent (after 500 B.C. or so) migration from the mainland, either China or Korea. I’d think that would leave some linguistic evidence as well, assuming the invaders/migrants/whatever brought their language with them.

I remember having read something about that theory more than a decade ago. It correlates well with the belief that the ancestors of the modern Japanese came from Korea and migrated northwards, displacing the Ainu. The article specifically mentions how political considerations make that theory significantly less popular in Japan and Korea than it is anywhere else.

Most likely this article by Jared Diamond, or something similar. It’s the most comprehensive writeup I’ve seen, and discusses some of the political and ideological blinders the Japanese have about their history. For example, virtually every exhibit I’ve seen in Japan related to Jomon people depicts them as looking like modern Japanese even though going by archeological evidence Jomon people probably looked a lot like modern Ainu; their culture was rather obviously an offshoot of the same culture as the branch that produced the Ainu.

The very good Ethnologue citation Wendell Wagner posted earlier is misleading in the “Japonic” category since they list everything in that category without drawing relationships or timelines. The Ryukyu Islands were a separate kingdom from Japan until the late 1800s, and were under loose Chinese control for roughly 500 years before that. The Ryukyu language has more or less been replaced by Japanese, so people there speak a dialect of Japanese with heavy influence from their original language, but the Ryukyu language is completely unrelated to Japanese. Similarly, all of the other languages or dialects (the relationship isn’t clearly drawn on the site) are languages that have been subsumed by Japanese in very recent times by linguistic standards.

This is not accurate. The Ryukyuan dialects form the larger of two insular subfamilies of Japonic.
They split some time over 1000 years ago. Modern subsumption by Japanese is irrelevant.

colonial, it’s not true that we can only estimate the time of divergence of two (or more) languages when there are written records. There are glottochronological methods by which we can make a reasonable estimate of the time of divergence for spoken languages. I also disagree that we can only say that two (or more) languages are related if they diverged less than 5,500 years ago. There are language families where the proto-language diverged into daughter languages more than 5,500 years ago. 10,000 years isn’t a precise bound to how far back we can reconstruct proto-languages, but it’s closer than 5,500 years.
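As a rough illustration of the kind of estimate meant here, the classic glottochronological formula can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the function name is made up for the example, the inputs are hypothetical rather than measured data, and the default retention rate of 0.86 per millennium is the conventional figure for a 100-item core word list, which is itself disputed. The estimate is t = ln(c) / (2 ln(r)), where c is the fraction of core vocabulary that two languages still share as cognates and r is the retention rate.

```python
import math


def divergence_time_millennia(shared_cognate_fraction, retention_rate=0.86):
    """Estimate time since two languages split, in millennia.

    Uses the classic glottochronology formula t = ln(c) / (2 * ln(r)).
    The default retention_rate (0.86 per millennium) is an assumed
    conventional value for a 100-word core list, not a settled constant.
    """
    if not 0.0 < shared_cognate_fraction <= 1.0:
        raise ValueError("shared_cognate_fraction must be in (0, 1]")
    if not 0.0 < retention_rate < 1.0:
        raise ValueError("retention_rate must be in (0, 1)")
    return math.log(shared_cognate_fraction) / (2.0 * math.log(retention_rate))


if __name__ == "__main__":
    # Hypothetical cognate fractions, chosen only to show the trend.
    for c in (0.9, 0.5, 0.2):
        t = divergence_time_millennia(c)
        print(f"shared cognates {c:.0%}: about {t:.1f} thousand years")
```

Under these assumptions, two languages that split 10,000 years ago would be expected to share only about 5% of the core list (0.86 raised to the 20th power), which is close to what chance resemblance alone could produce. That is one way to see why a bound of roughly 10,000 years gets quoted, and also why the resulting dates should be treated as loose estimates.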

Once again, a thread with a question that can be answered with a reasonably nontechnical answer has broken down into an argument between experts where the posts are irrelevant to the question in the OP. The answer to the question that BowlOfDucks asked has the following simple answers:

  1. No, Japanese, Korean, and Chinese are not related (using “related” in the usual way that it’s used in linguistics discussions).
  2. In Ethnologue you can find a tree diagram of the relationships between all the large number of languages that are related to Chinese. (Incidentally, Chinese is not a single language but a group of closely related languages.)
  3. You can also see that Korean is unrelated to any other language.
  4. You can see that Japanese is considered by some linguists to be related to several other languages, although, as Sleel points out, this is disputed. Other linguists think that those other languages have merely borrowed a lot of Japanese words, or that Japanese replaced the original languages while borrowing a lot of the older languages’ words.

We have probably driven BowlOfDucks away now because he (or she) is convinced we can’t give a simple answer.

I doubt authoritative consensus claims to be able to confidently
date any pre-3500BC language.

For example, the earliest known written Indo-European languages
are Hittite, and Greek (as Linear B) and Sanskrit, all flourishing by

The Anatolian/Proto-Hittite family is thought to have split from
Proto-IE before any other, but estimates of the timing range from
7000BC to 4000BC.

Proto-Greek is nebulously thought to have existed as of ca. 3000BC
and nebulously to have entered what is now Greece 1700-2100BC.

The date of divergence of Sanskrit from the rest of its subfamily
(Indo-Iranian) is not known other than that it preceded IE entry into
India, ca. 1500BC.

With so much uncertainty more recently than 3500BC how can there
be any confidence for what was taking place earlier than 3500BC,
much less as far back as 10,000 years ago?

I think it is possible that all languages attested by, say, 1AD diverged
before 3500BC/5500 years ago. What I am trying to say is that little
to nothing is known of the timing before they enter the written record.

I did not say otherwise. Semitic Egyptian for example diverged
from Afro-Asiatic well before written Egyptian appeared ca. 3500BC.
Speaking of which would any expert hazard more than a guess as
to when this divergence took place? How about the timing of divergence
by Babylonian, Assyrian, Arabic and Hebrew? Any idea?

For which language families? Not Proto-Indo-European, that’s for sure!:
Per Wiki expert opinion on the date of Proto-Indo-European origin is so
divided that estimates range from ca. 4000BC to before 10,000BC. I doubt
there is a much narrower range for any other language family prototype.

BowlOfDucks has his answer, although he could have obtained it himself
per Google.

More specifically, that Korean is related to Tungus-Manchu. This hypothesis leaves aside the open question of the latter’s relation to Altaic.

colonial, you are an expert within the context of answering BowlOfDucks’s question, or else you have no business answering it at all. I’m not setting a particularly high bar for being an expert. I have a master’s degree in linguistics, but I did syntax/semantics and not historical linguistics when I was in the field, and it’s been thirty-four years since I’ve been in the field. “Expert” in this context just means that you know enough to answer the question in the OP.

The fact that BowlOfDucks could have looked up the answer with Google is irrelevant (and probably not true). If the question was so utterly trivial that anybody could have looked it up, then just say that in reply to the OP. In answering the question at all, we’re saying that the question in the OP is interesting and significant. What bothers me about threads like this is that a question that could be answered in a way that genuinely informs the OP and leads them on to learning more on their own often descends into a battle between experts on nitpicking points that don’t help the OP at all and possibly drives them off the SDMB.

Thanks for finding that, Sleel. The point I was trying to make is made far more articulately in Diamond’s article:

It’s true that the Ryukyuan languages are near extinction, but it’s not true that they are unrelated to Japanese. They are very closely related (but distinct and not mutually intelligible) languages. Having said that, it’s probably a common misunderstanding in Japan that “Okinawan language” refers to the hard-to-understand variety of Japanese spoken in Okinawa, so I can understand the desire to correct this misapprehension by emphasising that Okinawan languages really are different.

I didn’t know that was an open question. But I have no special expertise in linguistics, only a layman’s interest.

In his most recent reply below Wendell Wagner evades answering any
of my points. I take that to mean that MS or no MS he sees that he has
backed a losing proposition and has decided that the best way to save face
is to pull rank.

Nonsense. Internet chat rooms are the realm of the informed amateur,
many of whom know quite enough to keep the experts on their toes.

In that case I am an expert because I could have informed OP that Chinese,
Japanese and Korean belong to three different language families, that no one
has any idea when the three families divided from a common ancestor, that
Chinese is thought to have originated within the confines of the modern nation,
and that the geographical origin of Japonic is unknown (I am not sure about
where Korean is thought to have originated if not the Korean peninsula itself).

Alas for you and the other real experts, Wiki definitely provides the same
information you did, and in its article on “Japonic languages” provides
a link to the same Ethnologue site you provided.

Also, the following excellent article on Japonic comes up on only the second page
of Google hits:

The Origin of the Japanese Language

The Internet is the realm of those who enjoy displaying knowledge.

OP was answered in full, the only inaccuracy being your wildly mistaken comments
on dating.

An error of several thousand years is not a nitpick. If OP is so easily upset by debate
then Internet chat rooms are the wrong place for him to be hanging out!

colonial, do you get some thrill in misinterpreting what other people say? You have consistently failed to understand every single post of mine. It appears that you go through posts sentence by sentence (and sometimes word by word) trying to find the misunderstanding of them which would make the least sense and then base your comments on that misinterpretation. I see no point in even trying to reply to your posts. I could spend the rest of eternity replying point by point to your posts, and each time you would just create further misinterpretations for those new posts and then post those new misunderstandings.