Strictly speaking, it’s speech recognition. However, I see the two terms used interchangeably in some places. For example, speech system designers are called VUI designers, not SUI designers. Also, the “box” that contains the speech recognition engine and other associated hardware and software is often called a VRU (voice recognition unit) and not an SRU. To confuse matters, the “press 1 to continue”-type keypad systems are generally called IVRs, “Interactive Voice Response” because what responds is a voice. The XML-based language for controlling VRUs is called VoiceXML.
InvisibleWombat has echoed my complaint, in that IW says the problem is the menus, not the recognition.
Ugh! I can’t stand the speech recognition menus. I just don’t like talking to something that is not a real person. I don’t mind a touchtone menu that sends you to the correct department but that’s about as far as it goes. For me voice recognition menus are a deal breaker for any relationship I may have with a company (Telstra, I’m looking at you!)
What’s the alternative, a live voice for everything you want to do?
I guess I feel the opposite. If I can get it done without human intervention, then I’d rather do that. I find that it’s quicker and sometimes more accurate. This of course depends on the thing I want to do.
The touchtone menus are a hassle. What happens if your choice isn’t there? I know I can construct a speech recognition menu that is more flexible.
Sure, we’d all like to talk with a live person. Sometimes I do, too. As I said previously, a well-designed, customer-friendly system helps the company focus its people on the questions that really need thought and response from a real person.
You want a live person every time you call? Be ready to pay for it.
I wish the voice mail on my cell could use voice recognition. I just got one of those RoboCop Bluetooth headsets and it would be great to check my voice mail without opening up the phone. But no, I have to open the phone, type my PIN and negotiate the menus.
1920s Style “Death Ray”, people used to say the exact same thing about answering machines: “I don’t want to talk to no machine.” I don’t hear that complaint as much anymore. I imagine in about ten years most people will get over talking to machines or they will die off.
Some aspects of speech recognition do help when you’ve got your hands full, or in other situations where a keyboard either doesn’t work or is overkill.
I’ve often wondered if we can improve mobile phone technology to the point where your phone could contain the VRU. That way, you could train your phone to do very well with general parameters of your own voice, and then control the ASR with downloaded grammars. The best of both worlds; the recognition is more finely tuned, and at the same time it’s controlled by the expected responses.
I was actually pleasantly surprised when dealing with a voice/speech recognition system recently. I had to call my cell phone provider to see if my monthly payment was received - I had sent it directly through online banking, and then for some reason I couldn’t log on to their webpage to see if the payment went through.
So I tried the 1-800 number. Speech recognition. I had no problem whatsoever with the actual speech recognition part - it always took me to the next sub-menu I had asked for from their list of options. But for the life of me I couldn’t find any option to check if payments were received. And the menu they gave didn’t have an equivalent of “press 0” to talk to an operator.
I was getting irritated and sarcastically said “I want to talk to a real person.” And voila!
I figured out that their system has that as an option from pretty much anywhere in the menu tree. But they don’t actually tell you it’s there. I went all the way through some branching sub-menus to the bottom of the available options, and it never had a “speak to a customer service rep” option. But if you asked for it anyways, it worked.
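For anyone curious how a "hidden" option like that can work from anywhere in the tree, here is a minimal sketch. It assumes the recognizer has already turned speech into text, and the menu names, the `AGENT` state and the `next_state` helper are all made up for illustration, not taken from any real provider's system. The idea is simply that global commands get checked before the current menu's own options.

```python
# Hypothetical global commands: valid in every menu, even if never announced.
GLOBAL_COMMANDS = {"i want to talk to a real person": "AGENT"}

# Hypothetical menu tree: each state maps a spoken option to the next state.
MENU_TREE = {
    "main": {"billing": "billing", "plans": "plans"},
    "billing": {"make a payment": "payment"},
}


def next_state(current, utterance):
    """Return the next menu state for an already-transcribed utterance."""
    phrase = utterance.strip().lower().rstrip(".!")
    # Global commands win over whatever the current menu offers.
    if phrase in GLOBAL_COMMANDS:
        return GLOBAL_COMMANDS[phrase]
    # Otherwise look in the current menu; stay put (re-prompt) if no match.
    return MENU_TREE.get(current, {}).get(phrase, current)


print(next_state("billing", "I want to talk to a real person."))
```

In this sketch an unrecognized request just leaves you in the same menu, which matches the experience of never finding the option you want unless you happen to say the global phrase.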
This is a good example of why anecdotes don’t equal data. I’ve never heard anyone say a positive thing about these systems, but it seems that there are a lot of people out there who have had good experiences. Good to know.
And thanks also for the tips on how better to deal with speech recognition systems and how to circumvent them if need be.
[QUOTE=Chronos]
I’m puzzled. The OP wants to speak to a real live human instead of to a computer…[/QUOTE]
Where in my OP did I say that? I didn’t. You seem to be conflating my complaints about speech recognition systems with other people’s complaints about menus.
I like the automated systems. I’d actually prefer not to talk to a real live person if I can help it. And if talking to a real live person is unavoidable, I’d prefer to speak to a person in the right department for whatever it is that I want.
I usually prefer using a string of obscenities; there’s no way the IVR is going to pick those up (and it usually serves to blow off some steam quite nicely to boot :D)
Voice recognition is still highly in the academic research realm (hell, we haven’t moved much beyond Dragon-like voice training, which is clearly not feasible for a Fortune 500 company), so I’d say it’s mostly suits “jumping on the latest customer service paradigm” with no clue of its actual efficiency.
A touchtone menu that directs me to the correct department with a live person, or I’m quite happy doing things over the internet or via email.
I am and do. I would rather pay a little more and get better service. When I had problems with my old internet service and would ring the ISP, I would actually get the same tech each time (hi James :).)
There is only really one company here that I have to deal with that has voice menus, Telstra. They are also the same company that keep cold-calling me to try and get me to go back to them. If they can afford to have a real live person ring me up every other day, then they can afford to have a real live person answer when I have a problem with their service. Also, their voice menu only directs you to a live person anyway; it does nothing that a touchtone menu won’t do except it is in a format that I will not tolerate. That, and the cold-calling, is why I am not a customer and never will be regardless of their pricing structure.
Yes, except when you talk to an answer machine you are actually talking to the person you’re leaving the message for. When you talk to a voice recognition menu, you really are talking to a machine.
Although when I’ve rung Telstra’s directory service and had problems with it recognising the name of the place I wanted, it would send me to a real person who would ask me “You want the St Agnes Police Station?” Which means they’ve been listening to my feeble attempts at making a machine understand me, or they’ve listened to a recording of it. So now I get the eerie sense of talking to a machine but having someone else sitting there listening in.
I guess, to me, it is a technology that either needs to be very very good, or not used at all. At the moment it’s just not good enough to do anything other than put me in a bad mood for when I finally get to talk to someone.
That is not really true. I don’t train the voice recognition on the voice dialing of my cell phone. I just say the names as they appear in the list of remembered phone numbers. It does a really good job. I just wish the voice mail on the provider’s end would allow me to move through the menus with voice commands.
1920s Style “Death Ray”, you are sounding more and more like my grandmother when answering machines came out: “I don’t want to talk to no damn machine.”
If there was a speech-recognition-based phone menu system that (a) actually had choices representing what I wanted to do and (b) actually understood what I said, then I’d be happy to use it. If I’m checking the balance on an account, and all I have to do is recite my account number, such a system would be fine.
What torques me is when the system doesn’t have a selection for “I got billed twice for this event,” and it takes you ten minutes to find your way to a real person, or when it misinterprets what you say and there’s no easy way to back up and fix it.
People get all excited about 90% or 95% accuracy. That’s not even in the neighborhood of usable. I used a ViaVoice system that had 95% accuracy to dictate a magazine article. You know what 95% accurate means? It means one word in 20 is wrong. It took me longer to edit the article than it would have taken to just type it in the first place.
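To put rough numbers on that, here is a quick back-of-the-envelope sketch. The 2,000-word article length is an assumption for illustration (the actual length isn't given above), and it treats accuracy as a simple per-word rate, which real dictation errors may not follow.

```python
def words_per_error(accuracy):
    """On average, how many words go by for each wrong word."""
    return round(1 / (1 - accuracy))


def expected_errors(word_count, accuracy):
    """Expected number of wrong words in a text of the given length."""
    return round(word_count * (1 - accuracy))


print(words_per_error(0.95))        # one wrong word in this many
print(expected_errors(2000, 0.95))  # hypothetical 2,000-word article
```

Under those assumptions, a 2,000-word piece comes back with about 100 wrong words to hunt down and fix by hand.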
Dictation software is a completely different task than voice menu software. With dictation software, you could potentially be saying anything, and the program needs to recognize it. This is still not a mature technology (though it’s getting closer), and it’s not good enough for most folks.
Voice menu software, however, has a much easier task before it. It just has to be able to tell whether what you said is closer to “Account status” or “make a deposit” or “transfer funds”. Or, if what you said isn’t all that close to any of those, it transfers you to a human. Here, 95% accuracy generally means that 5% of the time, it can’t tell what you want, and transfers you to a human. This is a huge advantage, since most people will only have to deal with the machine, and even those who can’t, will still deal with a human. Meanwhile, the company can cut their customer service staff by a factor of 20.
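A minimal sketch of that "closest option or hand off to a human" logic, assuming the speech has already been turned into text. It uses plain string similarity from Python's standard `difflib` as a stand-in for whatever acoustic scoring a real recognizer does, and the 0.6 cutoff and the `route` function are arbitrary choices made up for illustration.

```python
from difflib import SequenceMatcher

MENU_OPTIONS = ["account status", "make a deposit", "transfer funds"]
HUMAN = "transfer to human"
THRESHOLD = 0.6  # assumed cutoff; a real system would tune this


def similarity(a, b):
    """String similarity in [0, 1]; a stand-in for a recognizer's score."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def route(heard, options=MENU_OPTIONS, threshold=THRESHOLD):
    """Pick the menu option closest to what was heard, else a human."""
    best = max(options, key=lambda opt: similarity(heard, opt))
    if similarity(heard, best) >= threshold:
        return best
    return HUMAN


print(route("acount status"))
print(route("I got billed twice for this event"))
```

The point of the small, fixed option list is that the system never has to work out what you said in general, only which of a handful of phrases it most resembles, and anything that resembles none of them falls through to a person.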
The big thing that bugs me is, if I can’t do it online, then I sure as hell can’t do it with a voice recognition system. But I guess I’m just a freak that tries to do things online first…