Are some languages better for speech recognition than others?

drew870mitchell · November 23, 2012, 11:45pm

Title sums up my question.

Tim_T-Bonham.net · November 24, 2012, 2:10am

Don’t know about the recognition part, but for producing written words from speech, any language with a regularized spelling would be better than languages like English, where spelling is very irregular, with many competing rules and lots of exceptions.

clairobscur · November 24, 2012, 4:40am

Spanish is quite straightforward. I’m sure there are some exceptions, but a word is normally pronounced the way it’s written (as opposed to English, French, Russian…)

Mississippienne · November 24, 2012, 4:44am

Turkish in both grammar and spelling is very regular.

Ximenean · November 24, 2012, 11:21am

I’m not sure that spelling irregularities are a major obstacle to speech recognition, since spelling is an aspect of written language, not spoken. Besides, even in English, computers are actually pretty good at detecting and correcting spelling mistakes.
I would think that languages with more phonemes would be harder for computers to decipher, because the sounds that needed to be distinguished would tend to be closer to each other.

Johanna · November 24, 2012, 7:08pm

Good call. Tibetan orthography, to go to the other extreme, diverges so far from pronunciation that one wonders how Tibetan speech recognition will ever be achieved without driving the programmers crazy. Burmese is pretty far out there too. One wonders about Scottish Gaelic too…

Freakenstein · November 26, 2012, 6:49pm

Finnish. Every letter is pronounced in only one way. The accent is always on the first syllable and the language has only 21 letters.
( A, D, E, G, H, I, J, K, L, M, N, O, P, R, S, T, U, V, Y, Ä, Ö - the only ones that could be confused are I and J )

Even uneducated people tend to make only two mistakes: if ‘N’ is followed by ‘P’, some people write it as ‘M’, and if the word has ‘IJ’, they drop the ‘J’. ( But even those are very rare now 'cause teachers have focused on them ).

Johanna · November 27, 2012, 9:43am

I agree. Finnish is among the best candidates in the world for this. Korean and Spanish are too.

Floater · November 27, 2012, 11:10am

No, not really. They can be pronounced short or long, which is indicated in the spelling: a single letter=short, double letters=long, although a Finnish friend of mine once remarked that perkele should have three r's.

Nava · November 27, 2012, 11:47am

But that’s not a different sound, it’s the same sound several times. Like the fs in different… they don’t sound different from the f in fake, but there’s two of them and both get pronounced (well, they both get pronounced in some dialects, I’ve known people who only pronounced one f in different).

Floater · November 27, 2012, 11:54am

The distinction is very important in Finnish, that’s why I pointed it out. For example, a person called Vesa does not want his name pronounced as Vessa, which means loo.

Nava · November 27, 2012, 11:57am

Yeah, but it’s still the same sound twice. A Spanish rr is a different phoneme than a Spanish r; a Finnish rr is the same phoneme as a Finnish r, twice.

Freakenstein · November 27, 2012, 1:39pm

I’m not sure what you’re saying here. Short and long are not in any way ambiguous in Finnish. If you pronounce it short you write one letter, if long then you write two letters there, no exceptions.
Tuli ( fire ) is clearly a different word than Tuuli ( wind ) or Tulli ( customs ) - the language is full of these and no Finn would confuse them.

Hermitian · November 27, 2012, 10:06pm

Why is everyone talking about spelling?

Maybe I’m the only one here, but I assumed speech recognition had nothing to do with spelling. I don’t think the program is trying to hear each letter, just certain combinations of adjacent phonemes that will translate to certain words or phrases.

If it hears an “f” sound, followed by an “ox” sound, it will use its algorithms to determine that that is equivalent to “fox” in English. It is not worried about the “f” sound being a “ph” because “phox” isn’t a word.
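
A minimal sketch of that idea, assuming a toy pronunciation lexicon. The phoneme labels are ARPAbet-style and the three entries are made up for illustration; a real recognizer would score many candidate sequences rather than do one exact lookup. The point is only that the spelling comes out of the lexicon entry, not out of the sound itself.

```python
# Hypothetical pronunciation lexicon: phoneme sequence -> written word.
# Labels are ARPAbet-style and chosen for illustration only.
LEXICON = {
    ("F", "AA", "K", "S"): "fox",
    ("F", "OW", "N"): "phone",
    ("F", "EY", "K"): "fake",
}


def lookup(phonemes):
    """Return the written word for a phoneme sequence, or None if unknown."""
    return LEXICON.get(tuple(phonemes))


# The same "F" sound maps to "f" in one word and "ph" in another,
# because the spelling is stored with the word, not derived per sound.
print(lookup(["F", "AA", "K", "S"]))
print(lookup(["F", "OW", "N"]))
```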

stui_magpie · November 27, 2012, 10:14pm

With you there. Speech recognition is surely as much about the accent as the language and would depend on who designed the software.

Speech recognition software designed in the USA can really struggle with an Australian or New Zealand accent, even though (theoretically) we’re all speaking the same language.

I’m not sure if there’s a language that is so simple to speak that it removes accents from the equation altogether.

Bosstone · November 27, 2012, 10:54pm

Binary?

Quartz · November 27, 2012, 10:58pm

How about an artificial regular language like Esperanto?

Huvudtvatt · November 27, 2012, 11:46pm

I agree that there is too much focus on the spelling here. Finnish spelling is very close to phonetic, which means that it is easy to build a speech synthesizer for Finnish, but it does not matter much for speech recognition.

One of the problems with speech recognition in Finnish is that it is a compound language where it is possible to build thousands of words from one root. Speech recognition software typically matches each word against a vocabulary. For a language like English very few words will fall outside of the vocabulary if it is big enough, but Finnish speech will have many “out of vocabulary” words, which means that you may have to match against morphemes and word segments instead.
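
To make the fallback concrete, here is a rough sketch, not how any particular recognizer does it. The word list, the sub-word units and the function names are all made up for illustration, and it works on written strings rather than audio. It tries the whole-word vocabulary first and, for an out-of-vocabulary word, looks for the split into known units that uses the fewest pieces.

```python
# Toy whole-word vocabulary and sub-word units (hypothetical, for illustration).
WORD_VOCAB = {"talo", "talossa"}            # "house", "in the house"
SUBWORD_UNITS = {"talo", "ssa", "ni", "kin"}


def segment(word, units):
    """Split `word` into known units using the fewest pieces, or return None."""
    # best[i] holds the shortest known segmentation of word[:i], if any.
    best = [None] * (len(word) + 1)
    best[0] = []
    for end in range(1, len(word) + 1):
        for start in range(end):
            piece = word[start:end]
            if best[start] is not None and piece in units:
                candidate = best[start] + [piece]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[len(word)]


def recognize(word):
    """Match against the word vocabulary, falling back to sub-word units."""
    if word in WORD_VOCAB:
        return [word]
    return segment(word, SUBWORD_UNITS)


print(recognize("talossa"))       # in vocabulary, matched as one word
print(recognize("talossani"))     # out of vocabulary, matched as segments
```

With a big enough word list English rarely reaches the fallback, while an inflected form like "talossani" ("in my house") does so immediately unless every form is listed.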

Ximenean · November 28, 2012, 12:31am

Yes, it seems people are including computer transcription under “speech recognition”, in which case I guess regular spelling would help. But if we’re just talking about the Star Trek, “Computer: identify enemy vessel” kind of speech recognition, spelling is irrelevant. The computer must analyse the sound phonologically and try to identify the words, and then the meaning, which is probably the really hard part.

Ximenean · November 28, 2012, 1:01am

Esperanto is maybe a bit complex, phonetically. It has nasty phonotactics (rules about which sequences of sounds are allowed), or to be more accurate it doesn’t have any particular rules because the guy who invented the language never thought of it. All sorts of horrid consonant clusters crop up. It doesn’t have the smallest phoneme inventory, either. Japanese, for example, has many fewer consonant sounds than Esperanto.

If we’re talking about artificial languages, how about Toki Pona? Nine consonants, five vowels, no consonant clusters.
