Voice Recognition Software

Two or three years ago I purchased L&H Voice Xpress Plus 1.1 and, after a brief test, it did not seem to work too well so I uninstalled it. I thought maybe the computer was too slow. Now that I have a faster computer I have installed it again. What a PITA. In order to learn your voice it makes you read from the screen for about an hour. So, I finished the tedious process and proceeded to test the voice recognition. It is pure, worthless crap. It cannot get the simplest text right and correcting it takes much longer than just typing it in the first place. The product as it is performing now is worthless and I do not understand how anyone would find it acceptable.

Can anyone tell me if there is any way to improve the performance? I really do not need voice recognition so it’s not like I’m going to spend more money buying another package but I just can’t believe this is so utterly useless.

Any advice or useful information about voice recognition is welcome.

First, L&H bought Dragon and then promptly went out of business. I’ve had some experience with Dragon software for various languages. For Italian, for example, the software works almost perfectly – it’s very impressive.

Voice recognition in English, however, is always tough because English has so many homonyms. I know Dragon, at least, tries to intelligently determine which words to use based on the “context,” e.g. to/two/too coming before an adjective is probably “too,” unless followed by a noun, etc.
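To make the "context" idea concrete, here is a rough sketch of that to/two/too rule. It is only an illustration: the word list, the part-of-speech tags and the function name `pick_homophone` are all made up for this example, and I have no idea how Dragon actually does it (presumably with a statistical language model rather than hand-written rules).

```python
# Hypothetical, tiny part-of-speech lexicon (an assumption for this sketch).
POS = {
    "tired": "ADJ", "late": "ADJ", "big": "ADJ",
    "dogs": "NOUN", "cats": "NOUN", "hours": "NOUN",
    "go": "VERB", "run": "VERB", "sleep": "VERB",
}


def pick_homophone(next_word, following_word=None):
    """Choose among to/two/too by looking at the one or two words that follow."""
    tag = POS.get(next_word)
    after = POS.get(following_word) if following_word else None
    if tag == "VERB":
        return "to"    # "to go"
    if tag == "NOUN":
        return "two"   # "two hours"
    if tag == "ADJ":
        # Before an adjective it is probably "too" ("too tired"),
        # unless a noun follows the adjective ("two big dogs").
        return "two" if after == "NOUN" else "too"
    return "to"        # unknown context: fall back to the most common word


print(pick_homophone("tired"))        # adjective alone
print(pick_homophone("big", "dogs"))  # adjective followed by a noun
print(pick_homophone("go"))           # verb
```

Even this toy version shows why the software needs to see a word or two ahead before it can commit to a spelling.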

Oh, yes, you asked an actual question, didn’t you? One of the big sources of bad voice recognition is bad input quality. This can be caused by several things: 1) a cheap or bad microphone, 2) bad microphone placement (too far from or too close to the mouth, too far to the side, too far in front, etc.; sometimes even a change of a few millimeters makes quite a noticeable difference), 3) a cheap or bad sound card, 4) too much background noise.

Another thing that seems to make a difference is the amount of memory you have. These things are extremely memory intensive. A faster processor might not improve performance that much but doubling your memory makes a dramatic difference.

At least I have the satisfaction of knowing the founders and owners of Lernout & Hauspie went to jail for fraud. http://www.crmdaily.com/perl/story/10138.html
http://www.itworld.com/AppDev/985/IDG010427lh/
But I guess I am out of luck if I expected to get my software to work reasonably well.

What is the state of the art? Are there any products which work reasonably well? How much do they cost?

Truth Seeker, I have 512MB RAM so I do not think memory would be a problem. I’ll follow your advice and try other microphones etc. But I hate the thought of having to do the whole training thing again. That is just too boring.

All the ones I’ve tried, ViaVoice and Dragon, had voice training. It was fun to mess with a bit, but I found speech recognition to be too much trouble. I suspect it’ll be at least five more years before it starts to be ubiquitous.

Teebone

I’ve tried L&H Voice Xpress myself and just about wanted to throw it out the window after that learning session. I’ve heard there is a product (or company) called Nuance that is supposed to be a lot better than the others, but I can’t confirm it actually works. After dealing with L&H I swore I’d never touch the voice recognition software again.

Does the suite that Microsoft provides with OfficeXP work any better? I had some mild success with it, mostly in controlling the menu, not in text. But then, hitting CTRL+S is almost as effortless as saying “File Save”…

I’ve found it to be pretty useless in general - there is still so much you have to do with the keyboard, not to mention time spent proofreading and correcting, that it almost always makes the work harder, and MUCH more noisy, which is bad in the environments computers are typically used in (offices, living areas, rooms with music playing, libraries)… it just isn’t a useful technology right now. I’m sure there are some situations where it helps, such as dictating a letter, but you can still probably do it faster typing.

The only area where I find it useful is for accessibility for disabled persons, and it still has leaps and bounds to go for that. I’ve seen people type with a pen in their mouths several times faster.

That’s exactly what I thought five years ago! :smiley: OS/2 Warp came with an IBM speech recognition program, I believe. I was impressed, but it wasn’t quite good enough to be useful.

I think voice recognition can be useful in specialized electronics. GPS systems for cars, perhaps. Or wearable computers for technicians so they can bring up schematics without using their hands, that sort of thing.

Specialized voice recognition tools are already on the market. It isn’t hard to develop a device to respond to a limited number of keywords - many work on the level of hundreds, from robotic toys to navigation systems in luxury cars. These have been featured in advertisements recently.

The difficulty with the technology is in general use speech recognition, where instead of hundreds of commands the software is “listening” for, there are hundreds of thousands of words that it has to parse and interpret, then print or act on.

This is why saying “File Save” works so well to make it save a file (it only has so many options, and “File” is pretty hard to mishear as “Edit” or the like), but making it TYPE the line “This is why saying file save works so well” can result in “They say skiing fills Av. slow well.” Hence the need to speak slowly and pronounce each word, syllable, and space exactly.
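A quick way to see the small-vocabulary effect is to snap whatever was "heard" to the nearest entry in a short command list. This is just a sketch using Python's standard `difflib`; the command list and the name `recognize_command` are invented for the example, and comparing spellings is only a stand-in for comparing sounds, which is what a real recognizer does.

```python
import difflib

# Hypothetical command vocabulary for a menu-driven program.
COMMANDS = ["file save", "file open", "edit copy", "edit paste"]


def recognize_command(heard, commands=COMMANDS, cutoff=0.6):
    """Return the closest known command, or None if nothing is close enough."""
    matches = difflib.get_close_matches(heard, commands, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# A slightly garbled command still lands on the right menu entry...
print(recognize_command("fill save"))
# ...while free-form speech matches no command at all.
print(recognize_command("this is why"))
```

With four candidates, a sloppy "fill save" has only one plausible neighbour. With hundreds of thousands of dictionary words, the same sloppy input has many neighbours, and that is where dictation falls apart.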

And God forbid that you have a comically stereotypical accent.

I read recently that speech recognition for English is proving problematic due to the similarity of certain sounds. For example, the software has a hard time telling the difference between “dark” and “park” since the “p” and “d” sounds are so alike.
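For what it's worth, you can get a feel for how many of these near-collisions a vocabulary contains with a few lines of code. This is a crude sketch: it treats words of the same length that differ in exactly one letter as confusable, which is an assumption standing in for real phonetic similarity, and the word list and the name `minimal_pairs` are made up for the example.

```python
from itertools import combinations


def minimal_pairs(vocabulary):
    """List pairs of same-length words that differ in exactly one position."""
    pairs = []
    for a, b in combinations(vocabulary, 2):
        if len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1:
            pairs.append((a, b))
    return pairs


# A handful of dictation words already collide with each other...
print(minimal_pairs(["dark", "park", "bark", "file", "edit", "save"]))
# ...while a short, well-chosen command set has no collisions.
print(minimal_pairs(["file", "edit", "save"]))
```

The more words the software has to listen for, the more pairs like "dark" and "park" it has to tell apart.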

Some suggested creating a specialised patois for voice recognition software that used more words that were easily recognisable by the computer. I have my suspicions that this will not catch on.

Must dash. My locument is ready at the crinter.

I have used Dragon 5.something with relatively good results. Read, read, read to it. It works well if you put in the effort to teach it. The point of the software is to be used by people for whom that effort is worth the result, such as people with injuries or other problems that reduce their ability to use a keyboard. If it is down to “just for fun”, most people don’t have the time, patience or system resources to use it to its full ability. I can run mine fine, but I would rather use that percentage of my system resources for something else.

As a non-American English speaker I have found that systems that do not learn are more or less useless for me. An example is Nitrous Voice Flux. I can use it well enough if I put on a ridiculously stereotypical American accent (my Yosemite Sam impression runs it quite well), but with my own pronunciation I get no result.

ScanSoft purchased L&H’s voice technologies, including Dragon, after the scandal, but I haven’t had any experience with it. However, I have a quadriplegic friend who swears by Dragon 5. He says that it took a while, but after about 3 weeks, he almost never got misinterpretations unless he forgot and tried to do something like issue a command with his mouth full of food or something like that :).

According to ScanSoft’s website, version 6 of Dragon is out, with many different flavors to choose from.

BTW, the software that came with OS/2 was ViaVoice (IBM makes ViaVoice, go figure :slight_smile: ).

critter42

I think in the end it’s gonna be a videophone-esque technology, one that we COULD do, and everyone assumes that it would be super cool, but no one really likes as much as they think they would.

According to this, speech recognition software is included in Windows XP; however, I could not access it using these instructions.