Command and Control Engines need exact commands

Before an application tells a command and control recognizer to start listening, it must first give the recognizer a "list" of commands to listen for. The list might include commands like "minimize window," "make the font bold," "call extension <digit> <digit> <digit>," and "send mail to <name>."

If users speak a command exactly as it is written, they will get very good accuracy. However, if they word the command differently (and the application hasn't provided the alternate wording), the recognizer will either recognize nothing or, even worse, recognize something completely different. So, if a user says "bold that" instead of "make the font bold," there's a pretty good chance the computer will hear "minimize window."
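The exact-wording requirement can be illustrated with a minimal text-level sketch. It is not a real recognizer: it matches typed text rather than audio, so a mismatch returns nothing instead of a wrong command. The command templates come from the examples above; the slot vocabularies and the names `SLOTS`, `compile_command`, and `match_command` are assumptions made for illustration.

```python
import re

# Command list taken from the examples in the text.
COMMANDS = [
    "minimize window",
    "make the font bold",
    "call extension <digit> <digit> <digit>",
    "send mail to <name>",
]

# Hypothetical slot vocabularies (assumed; a real application supplies its own).
SLOTS = {
    "digit": r"(?:zero|one|two|three|four|five|six|seven|eight|nine)",
    "name": r"(?:alice|bob)",
}


def compile_command(template):
    """Turn a template such as 'send mail to <name>' into a regular expression."""
    parts = []
    for token in template.split():
        slot = re.fullmatch(r"<(\w+)>", token)
        parts.append(SLOTS[slot.group(1)] if slot else re.escape(token))
    return re.compile(r"\s+".join(parts))


def normalize(utterance):
    """Lowercase the text, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", utterance.lower())
    return " ".join(cleaned.split())


def match_command(utterance, commands=COMMANDS):
    """Return the first template the whole utterance matches, else None."""
    text = normalize(utterance)
    for template in commands:
        if compile_command(template).fullmatch(text):
            return template
    return None
```

In this sketch, `match_command("Make the font bold")` finds its template, while `match_command("bold that")` returns `None` until the application adds the alternate wording, for example `match_command("bold that", COMMANDS + ["bold that"])`.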

Applications can work around this problem in several ways:

Make sure the command names are intuitive to users. For many operations, like minimizing a window, nine out of ten users will say "minimize" or "minimize window" without prompting.

Show the command on the screen. Sometimes an application will be able to display a list of commands on the screen. Users will naturally speak the same text they see. Microsoft Voice uses the application names shown on the taskbar for the "Switch to <application>" command.

Use word spotting. Many speech recognizers can be told to listen for just one keyword, like "mail." This way the user can say "Send mail" or "Mail a letter," and the recognizer will get it. Of course, the user might say "I don't want to send any mail," and the computer will still end up sending mail.

Have the computer verify every command with the user. The Microsoft Voice application will display the command that it heard and then in small text display, "Say 'Do it' to accept the command." The command is not actually acted upon unless the user says "Do it" within a few seconds of the command being spoken.
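The last two workarounds, word spotting and verification, can be sketched together. This is a text-level illustration under stated assumptions, not how Microsoft Voice is implemented: the keyword table, the names `spot_keyword` and `PendingCommand`, and the three-second timeout (standing in for "a few seconds") are all hypothetical.

```python
import time

# Hypothetical keyword table: spotted word -> command to run.
KEYWORDS = {"mail": "send mail", "minimize": "minimize window"}


def spot_keyword(utterance, keywords=KEYWORDS):
    """Return the command for the first keyword found in the utterance, else None."""
    for word in utterance.lower().split():
        word = word.strip(".,!?\"'")
        if word in keywords:
            return keywords[word]
    return None


class PendingCommand:
    """Holds a recognized command until the user confirms it or it expires."""

    def __init__(self, command, timeout=3.0, clock=time.monotonic):
        # timeout is an assumed value; the clock is injectable for testing.
        self.command = command
        self.clock = clock
        self.deadline = clock() + timeout

    def confirm(self, utterance):
        """Return the command if 'do it' is spoken before the deadline, else None."""
        in_time = self.clock() <= self.deadline
        said_do_it = utterance.strip().lower() == "do it"
        return self.command if (in_time and said_do_it) else None
```

Here `spot_keyword("I don't want to send any mail")` still yields the "send mail" command, which is the weakness noted above. Wrapping the result in a `PendingCommand` means nothing is acted upon unless `confirm("Do it")` is called before the deadline passes.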

Over time, speech recognizers will start applying natural language processing, and this problem will go away.