Let’s say the police find someone walking around with amnesia.
And they were speaking a foreign language. Could a trained linguist figure out what language the person was speaking, even if the linguist didn’t speak it?
Let’s say it was a small language, spoken by like 50,000 or so people. I thought about this question while I was reading how Romansh is spoken by about 50,000 in Switzerland.
How would the linguist, in general, go about figuring it out?
I think if the person was literate and could write down their language it would be an immense help. I’m only a linguistics student, but I think I could identify written samples to broad language families or areas with a non-random degree of success.
A linguist could probably nail it down pretty close if he/she were able to question the person and ask him or her how to say certain words. The linguist would probably pick words from the Swadesh list.
According to Ethnologue, there are 6,912 languages spoken in the world today. (Yes, I’m aware of the problems with Ethnologue.) I did a really quick survey of a random set of the languages listed there, and I would guess that no more than 25% of them have more than 50,000 speakers. So let’s say that about 1500 languages have more than 50,000 speakers. If an individual linguist was asked which language this speaker used (and he didn’t have any help from other linguists), he couldn’t identify the language.
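To make the back-of-the-envelope arithmetic above explicit, here is a minimal sketch. The total is the Ethnologue figure quoted in the post; the 25% share is only the poster's rough guess from a quick survey, so the result is an upper bound under that assumption, not a measured count. The function and constant names are made up for illustration.

```python
# Figures quoted in the post. The 25% share is a rough guess, not a measurement.
ETHNOLOGUE_TOTAL = 6912
SHARE_ABOVE_50K = 0.25  # "no more than 25%", so treat it as an upper bound


def upper_bound_languages(total: int, share: float) -> int:
    """Largest whole number of languages consistent with the given share."""
    return int(total * share)


upper_bound = upper_bound_languages(ETHNOLOGUE_TOTAL, SHARE_ABOVE_50K)
print(upper_bound)
```

The ceiling comes out a little above the "about 1500" used in the post, which fits with 25% being stated as a maximum rather than an estimate.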
Of course, Romansh is not a random selection. It’s a Romance language, and there’s a good chance that anybody who calls themself a linguist will know a Romance language. A linguist could not identify without help all of the 1500 languages with more than 50,000 speakers. (Also, these are really not small languages. A small language would have fewer than about a thousand speakers. These are medium-sized languages.) A linguist with a few connections, though, would know who to ask among other linguists. They could quickly eliminate many of the other medium to large languages. In many cases they would be told things like “Well, I don’t know what language it is, but I think it comes from family X or area Y.”
This isn’t really a realistic scenario. It would be possible to look at the person and narrow down the possible languages in most cases. In any case, the answer is that a linguist couldn’t do this without a lot of help over several weeks, assuming that the language is really randomly selected from all the languages with more than 50,000 speakers.
It would also depend on whether or not the linguist could see the person. For example, some of the most obscure languages are found in Papua New Guinea, which has something like 800 different languages. But people from Papua New Guinea are fairly ethnically distinct, so the linguist would probably know right away to narrow down his search considerably.
There are a few easily recognizable features shared by some languages that could help to narrow it down. You can listen to a language and, without picking apart any words, figure out if it is a tone language or a stress language, and what kinds of tones it has, and even some things about the stress patterns.
A linguist with a very good ear could also pick out the phones being used; this would of course be affected by the speaker himself (lisps and other speech impediments, as well as acceptable individual variation), but if an inventory of phones was built up, this could go a very long way towards identifying the language.
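One way to picture how a phone inventory could narrow things down is to score the overlap between the phones heard and the inventories of candidate languages. This is only a sketch: Jaccard similarity is my choice of measure, not something from the post, and the candidate inventories below are invented toy data (`candidate_A` and so on), not real languages.

```python
def jaccard(a: set, b: set) -> float:
    """Size of the overlap divided by size of the union (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_candidates(observed: set, inventories: dict) -> list:
    """Return (name, score) pairs, best match first."""
    scores = {name: jaccard(observed, inv) for name, inv in inventories.items()}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


# Phones the listener thinks they heard (hypothetical).
observed = {"p", "t", "k", "s", "m", "n", "a", "i", "u"}

# Invented inventories, purely for illustration.
toy_inventories = {
    "candidate_A": {"p", "t", "k", "s", "m", "n", "a", "i", "u", "e", "o"},
    "candidate_B": {"p", "t", "k", "m", "n", "a", "i", "u", "ʃ", "ŋ", "l", "r"},
    "candidate_C": {"b", "d", "g", "z", "a", "e"},
}

ranking = rank_candidates(observed, toy_inventories)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

A short sample will miss rarer phones, and individual variation adds noise, so a ranking like this would at best suggest where to look next.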
As has been said, too, there are some families of languages that are just easy to pick out because they’re spoken by large numbers of people and are therefore familiar to many people. Romance languages were given as an example. I figured out that the people behind me on a bus, yesterday, were speaking Portuguese, even though I haven’t studied it–but I do know Italian and French.
I can do it from sound alone as long as I know some information about the language already.
Just now I was in the public library and overheard some Asian people speaking to one another. (And I completely suck at identifying the nationality of an Asian face. I just can’t do it.) I ran a quick process of elimination… 1. Asia. 2. Non-tonal, therefore ruling out Sino-Tibetan, Thai, Vietnamese, Hmong, etc. 3. Polysyllabic, pointing to an Altaic language which was not Japanese; I have studied Japanese and this wasn’t it. Now, it might have been Mongolian or Tungusic, but I doubt it, as I have never heard of any Mongol or Tungus immigrants in my area.
Therefore I identified the language as Korean with near-100% certainty, even though I don’t know Korean.
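The process of elimination above can be sketched as a series of filters over a shortlist. Everything here is an assumption made for illustration: the shortlist itself, the field names, and especially the `community_nearby` flags, which just encode the poster's guess about local immigrant communities. "Tonal" is used in the loose sense of the post, and the polysyllabic step is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    tonal: bool              # tone language, in the sense used in the post
    listener_knows_it: bool  # the listener has studied it and would recognise it
    community_nearby: bool   # assumed presence of speakers in the area


# Hypothetical shortlist; real elimination would start from far more languages.
SHORTLIST = [
    Candidate("Mandarin", tonal=True, listener_knows_it=False, community_nearby=True),
    Candidate("Thai", tonal=True, listener_knows_it=False, community_nearby=True),
    Candidate("Vietnamese", tonal=True, listener_knows_it=False, community_nearby=True),
    Candidate("Hmong", tonal=True, listener_knows_it=False, community_nearby=True),
    Candidate("Japanese", tonal=False, listener_knows_it=True, community_nearby=True),
    Candidate("Korean", tonal=False, listener_knows_it=False, community_nearby=True),
    Candidate("Mongolian", tonal=False, listener_knows_it=False, community_nearby=False),
]


def eliminate(shortlist, heard_tone: bool, recognised: bool) -> list:
    """Apply the three checks in order and return the surviving names."""
    # Step 1: keep only languages whose tonality matches what was heard.
    remaining = [c for c in shortlist if c.tonal == heard_tone]
    # Step 2: if the listener did not recognise it, drop languages they know.
    if not recognised:
        remaining = [c for c in remaining if not c.listener_knows_it]
    # Step 3: drop languages with no assumed local community.
    remaining = [c for c in remaining if c.community_nearby]
    return [c.name for c in remaining]


result = eliminate(SHORTLIST, heard_tone=False, recognised=False)
print(result)
```

The filters can only pick among languages already on the shortlist, so the answer is only as good as the list you start with.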
As with any scientific field, linguists tend to specialize rather than generalize. While I’m sure there are many linguists out there who have superb language recognition skills, there are also plenty of linguists who have spent the last 30 years studying (for example) the syntax of English relative clauses. The latter type wouldn’t be much help for the kind of task you present, but s/he would certainly have the resources to point you to someone who would.
Wow, you had to go thru that much analysis to identify Korean?
I know some Japanese, and whenever I hear something that sounds a bit like Japanese, but I don’t recognize any words, it’s almost always Korean. I don’t speak any Korean, but I have heard it a lot.
Anyway, I think I share your enjoyment of playing “guess the language”. It’s like solving a puzzle. It can be kind of intimidating at first, but if you keep a level head (as per your analysis, above) it really isn’t that hard as long as we’re not talking about those obscure 50,000 speaker languages…
Lots of sh-sh-sh sounds that are not typical of South American Spanish. It could have been Catalan, but there is a native Catalan speaker in my department and she really doesn’t sound like these people did (so I wasn’t completely certain, but felt pretty sure). Furthermore, I was pretty sure it was European Portuguese rather than Brazilian, the sound of which I am also familiar with.
I asked the people. I was right.
So aside from picking up some Romance-sounding vocabulary, my guess was based on being able to rule things out based on personal experience. None of my training in linguistics really helped me, except so far as it has exposed me to hearing these languages being spoken.
If I was faced with some Indian-looking people, for instance, I would be–by myself and without access to references–completely helpless in figuring out which of the hundreds of languages of India they were speaking.
I can’t remember the name of the book now; however, I saw it in the Linguistics reference section of my university library (Shields Library, UC-Davis) when I was a student there. The book gave guidance on how to determine what a particular language was from certain identifying characteristics. If any SDMBers happen to be in that area, feel free to check those shelves and let me know what the book’s title is.
How could you tell the language was polysyllabic based on just a few minutes of exposure to its phonetics? People don’t speak with pauses between words.
But you didn’t ask them? I understand if you didn’t, but if you don’t ask you really can’t be sure (hence the near-100% certainty, I suppose).
I think this is the sort of problem that pops up a lot when trying to identify a language you don’t understand: you can’t really be sure you’ve got it right unless you confirm it by asking or getting someone who does speak the language to confirm it.
As sundog66 said, most linguists don’t really study that. In fact, I don’t really think there is a specialization in Linguistics that focuses on identifying unknown languages. Someone who specialized in language classification and who had good exposure to a number of languages might be pretty good at this. I think that many linguists who had a good broad education in phonetics, phonology and language classification would do better than your average Joe at this, but if it were obscure or the language happened to be in a family they knew little about they might have some trouble.
I overheard a triplet of Quebecois speaking their dialect of French and thought they were speaking Swedish. What do I know?
But seriously, it had some strange “Northern Clipping”, really reminded me of a Germanic language. I have a bit of an intuition with language and can usually tell instinctively where the person is from, not necessarily the specific language.
Um, most linguists I know don’t have the Swadesh words memorized in every language. Or even in a substantial number of languages. A lot of linguistics is mostly done in English - a syntactician hardly needs familiarity with any other languages (though I think this is a mistake.) Other linguists are likely to specialize in a particular language family - I doubt my old linguistics 101 professor would be much help outside of his field of expertise - the Ethiopic languages. So I don’t see how this would help. Particularly since it’s very hard to make much sense at all out of speech in a foreign language you don’t speak - when I hear a language that I don’t speak at all, I can’t even puzzle out individual words because doing that requires some implicit knowledge of a language’s phonology.
The more limited scenarios offered are easier - Johanna’s example with Korean is one I could probably do, but Korean is a major language and I’d still be playing the odds in assuming that it’s not some minor language that I might never have heard before. I’ve never heard Naxi or She or Kusunda or any of the many other minor languages of East Asia, so process of elimination like this will work most of the time but if you run into a speaker of some language you’ve never heard of, it won’t work.
For languages as closely related as they are, Spanish and Portuguese sound remarkably different. But I say that as a relatively capable Spanish speaker.
I think it’d be pretty hard to mistake Catalan for Portuguese. They don’t really sound much alike at all to me. And I still have yet to run into anyone speaking Catalan on the street, though I’d love an opportunity to show off my l33t Catalan knowledge, which is enough to, say, order off a menu in a restaurant.
Um, I’m not. Aside from the assorted errors that happen in any large project like this, I’m not aware of any relevant criticisms of their linguistic work.
This depends on many, many variables. Does the linguist have access to the person so that he can be questioned? If not, how much recorded material is available? If it’s just a few words, then it might be impossible to identify with certainty. (There are many languages which have just a handful of remaining speakers—sometimes only one!—and little or no scholarly documentation.) Also, is the linguist allowed to refer to books and other resources, or is he supposed to come up with a determination off the top of his head? If the latter, it’s doubtful that any linguist could possibly identify a language with certainty given that there are several thousand of them. Also, what kind of linguist are we talking about here? Linguists don’t necessarily know how to speak, or even very much about, foreign languages. Some practicing linguists, such as computational linguists, might not have any undergraduate linguistics training at all. Such linguists could probably write you a very elegant part-of-speech tagger for their native language but utterly fail to tell the difference between spoken French and Russian.
Generally speaking, someone who studied general theoretical linguistics in an English-language undergraduate curriculum would probably at least learn the names and geography of the major language families, with somewhat deeper study of the Indo-European family. Therefore such a person, provided they also had some modest real-world experience, could reasonably be expected to place a large sample of an unknown spoken language from the IE family in one of its subgroups (Germanic, Slavic, Celtic, etc.) on the fly. For a more detailed classification within an IE subgroup, or to identify a language outside IE, he would probably need to consult references or colleagues, and access to the speaker would be a great help.