About 15 years ago I had a brief stint in a job where, among other things, I tried to sell Dragon NaturallySpeaking software to law firms. I never sold a single unit, and in fact I wasn’t even impressed with the product – it was always substituting homophones or near-homophones, especially for proper names, and the results came out like Anguish Languish.
Now I work for a lawyer who uses Dragon all the time to dictate memos and emails to me, and – after 15 years – the results are no different.
Is improvement here simply impossible without the invention of strong AI?
I think the biggest constraint voice recognition software is up against is the dialectal variation of speech within a single language. Sure, written language has variation. The Brits like to stick superfluous Us in their words (honour, colour), for instance, but even misspellings have a pattern. Trying to shoehorn the speech patterns of someone from the deep hollers of Appalachia into the same category as a Cockney is just asking for heartache and failure. Heck, even fellow speakers of the “same” language can be completely confounded by each other.
Not only would you need an incredibly powerful AI, you’d need more speech pattern records than the oral history archives of the Library of Congress.