Are some languages better for speech recognition than others?

Title sums up my question.

Don’t know about the recognition part, but for producing written words from speech, any language with regularized spelling would be better than a language like English, where spelling is very irregular, with many competing rules and lots of exceptions.

Spanish is quite straightforward. I’m sure there are some exceptions, but a word is normally pronounced the way it’s written (as opposed to English, French, Russian…)

Turkish in both grammar and spelling is very regular.

I’m not sure that spelling irregularities are a major obstacle to speech recognition, since spelling is an aspect of written language, not spoken. Besides, even in English, computers are actually pretty good at detecting and correcting spelling mistakes.
I would think that languages with more phonemes would be harder for computers to decipher, because the sounds that need to be distinguished would tend to be closer to each other.

Good call. Tibetan orthography, to go to the other extreme, diverges so far from pronunciation that one wonders how Tibetan speech recognition will ever be achieved without driving the programmers crazy. Burmese is pretty far out there too. One wonders about Scottish Gaelic too…

Finnish. Every letter is pronounced in only one way. The stress is always on the first syllable and the language has only 21 letters.
( A, D, E, G, H, I, J, K, L, M, N, O, P, R, S, T, U, V, Y, Ä, Ö - the only ones that could be confused are I and J )

Even uneducated people tend to make only two mistakes: if ‘N’ is followed by ‘P’, some people write it as ‘M’, and if the word has ‘IJ’, they drop the ‘J’. (But even those are very rare now 'cause teachers have focused on them.)

I agree. Finnish is among the best candidates in the world for this. Korean and Spanish are too.

No, not really. They can be pronounced short or long, which is indicated in the spelling: a single letter = short, double letters = long, although a Finnish friend of mine once remarked that perkele should have three r-s.

But that’s not a different sound, it’s the same sound several times. Like the fs in different… they don’t sound different from the f in fake, but there’s two of them and both get pronounced (well, they both get pronounced in some dialects, I’ve known people who only pronounced one f in different).

The distinction is very important in Finnish, that’s why I pointed it out. For example, a person called Vesa does not want his name pronounced as Vessa, which means loo.

Yeah, but it’s still the same sound twice. A Spanish rr is a different phoneme than a Spanish r; a Finnish rr is the same phoneme as a Finnish r, twice.

I’m not sure what you’re saying here. Short and long are not in any way ambiguous in Finnish. If you pronounce it short you write one letter, if long you write two letters, no exceptions.
Tuli ( fire ) is clearly a different word than Tuuli ( wind ) or Tulli ( customs ) - the language is full of these and no Finn would confuse them.
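To make the tuli / tuuli / tulli point concrete, here is a rough sketch of how the spelling already carries the length information. It assumes the simplification described in this thread (a doubled letter means a long sound, a single letter a short one) and ignores everything else about Finnish; `length_marked` is a made-up helper name.

```python
from itertools import groupby

def length_marked(word):
    """Collapse runs of the same letter into (letter, 'short' | 'long') pairs.

    Assumption for illustration: one letter = short sound, doubled = long.
    """
    return [
        (letter, "long" if len(list(run)) > 1 else "short")
        for letter, run in groupby(word.lower())
    ]

for word in ("tuli", "tuuli", "tulli"):
    print(word, length_marked(word))
```

All three words come out as the same four sounds, and only the length mark on the u or the l tells them apart.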

Why is everyone talking about spelling?

Maybe I’m the only one here, but I assumed speech recognition had nothing to do with spelling. I don’t think the program is trying to hear each letter, just certain combinations of adjacent phonemes that will translate to certain words or phrases.

If it hears an “f” sound, followed by an “ox” sound, it will use its algorithms to determine that that is equivalent to “fox” in English. It is not worried about the “f” sound being a “ph” because “phox” isn’t a word.
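A toy version of that idea, just to illustrate: the recognizer looks up a sequence of sounds in a pronunciation dictionary and gets the spelling back as a whole, so it never has to decide between “f” and “ph”. The phoneme symbols and the three-word `LEXICON` are made up for the example, and real systems score many candidates instead of doing an exact lookup.

```python
# Hypothetical pronunciation lexicon: phoneme sequence -> spelling.
# The symbols are ARPAbet-style labels picked for illustration only.
LEXICON = {
    ("F", "AA", "K", "S"): "fox",
    ("F", "OW", "N"): "phone",
    ("B", "AA", "K", "S"): "box",
}

def recognize(phonemes):
    """Return the spelling for a phoneme sequence, or None if no word matches."""
    return LEXICON.get(tuple(phonemes))

print(recognize(["F", "AA", "K", "S"]))  # fox
print(recognize(["F", "OW", "N"]))       # phone
```

Both words start with the same “F” sound; the spelling difference lives entirely in the dictionary entry.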

With you there. Speech recognition is surely as much about the accent as the language, and would depend on who designed the software.

Speech recognition software designed in the USA can really struggle with an Australian or New Zealand accent, even though (theoretically) we’re all speaking the same language.

I’m not sure if there’s a language that is so simple to speak that it removes accents from the equation altogether.

Binary?

How about an artificial regular language like Esperanto?

I agree that there is too much focus on the spelling here. Finnish spelling is very close to phonetic, which means that it is easy to build a speech synthesizer for Finnish, but it does not matter much for speech recognition.

One of the problems with speech recognition in Finnish is that it is a compound language where it is possible to build thousands of words from one root. Speech recognition software typically matches each word against a vocabulary. For a language like English very few words will fall outside of the vocabulary if it is big enough, but Finnish speech will have many “out of vocabulary” words, which means that you may have to match against morphemes and word segments instead.
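A minimal sketch of the difference, assuming a tiny hand-picked vocabulary: “talossani” (talo + ssa + ni, “in my house”) is not in the word list, but it can still be covered by splitting it into known pieces. `WORDS`, `MORPHEMES` and `segment` are invented for the example and are not how any particular recognizer does it.

```python
# Hypothetical word-level vocabulary and morpheme inventory (illustration only).
WORDS = {"talo", "auto"}
MORPHEMES = {"talo", "auto", "ssa", "sta", "ni", "i"}

def segment(word, units):
    """Return one split of `word` into pieces from `units`, or None if none exists."""
    if not word:
        return []
    # Try the longest prefix first, backtracking if the rest cannot be split.
    for end in range(len(word), 0, -1):
        head = word[:end]
        if head in units:
            rest = segment(word[end:], units)
            if rest is not None:
                return [head] + rest
    return None

word = "talossani"
print(word in WORDS)             # False: out of vocabulary at the word level
print(segment(word, MORPHEMES))  # ['talo', 'ssa', 'ni']
```

With six pieces the matcher already covers forms like talossa, talosta, autossani and so on, which is the reason for matching against segments rather than whole words.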

Yes, it seems people are including computer transcription under “speech recognition”, in which case I guess regular spelling would help. But if we’re just talking about the Star Trek, “Computer: identify enemy vessel” kind of speech recognition, spelling is irrelevant. The computer must analyse the sound phonologically and try to identify the words, and then the meaning, which is probably the really hard part.

Esperanto is maybe a bit complex, phonetically. It has nasty phonotactics (rules about which sequences of sounds are allowed), or to be more accurate it doesn’t have any particular rules because the guy who invented the language never thought of it. All sorts of horrid consonant clusters crop up. It doesn’t have the smallest phoneme inventory, either. Japanese, for example, has many fewer consonant sounds than Esperanto.

If we’re talking about artificial languages, how about Toki Pona? Nine consonants, five vowels, no consonant clusters. :)