Tweaking the prompts

A limitation of current speech recognition technology is that accuracy falls unless the user says one of the phrases the computer expects. If the user says something the speech recognizer isn't expecting, the recognizer will either return an "unrecognized" response or, worse, think it heard another command and do something completely different from what the user wanted.

An application designer should pay close attention to the wording of questions, since the phrasing and vocabulary will significantly affect whether the user is likely to give one of the expected responses. For example, if the movie application wants to know what time the user wishes to see a movie, it could ask, "What time do you want to see the movie?" However, this can produce responses ranging from "This evening" to "7:00" to "Sometime tomorrow." If the question is reworded to, "Do you want to see an afternoon showing, evening showing, or late night showing?" then the user's responses will fall within a much narrower range.
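One way to keep a prompt's wording and the expected vocabulary in step is to generate the question from the same list of choices the recognizer is set up to accept. The following is only a minimal sketch of that idea; the function name build_prompt and the fixed sentence template are assumptions made for illustration and do not belong to any particular speech toolkit.

```python
def build_prompt(choices):
    """Build a closed question that names each expected choice aloud.

    `choices` is a hypothetical list of showing times the recognizer expects.
    """
    if not choices:
        raise ValueError("at least one choice is required")
    options = [f"{choice} showing" for choice in choices]
    # Pick "a" or "an" for the first option only, as in the spoken sentence.
    article = "an" if options[0][0].lower() in "aeiou" else "a"
    if len(options) == 1:
        listed = options[0]
    elif len(options) == 2:
        listed = " or ".join(options)
    else:
        listed = ", ".join(options[:-1]) + ", or " + options[-1]
    return f"Do you want to see {article} {listed}?"


print(build_prompt(["afternoon", "evening", "late night"]))
```

Because the question is built from the list of choices, adding or removing a showing time changes the prompt and the expected vocabulary together.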

An application should anticipate synonymous responses. Users will tend to use the same phrasing that they hear from the prompts. If the prompts don't hint at any vocabulary or phrasing, then the responses will be varied. In the case of the movie time, the application should expect responses like "In the afternoon," "afternoon," and "afternoon showing." Prototypes will show what kinds of responses are likely.
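Handling synonymous responses can be sketched as a lookup table that maps each expected phrasing to one canonical answer. The table contents and the names SYNONYMS and normalize_response below are hypothetical; a real application would fill the table with the phrasings observed in its own prototypes.

```python
import re

# Hypothetical table: each phrasing the application expects, mapped to
# the canonical showing time it stands for.
SYNONYMS = {
    "afternoon": "afternoon",
    "in the afternoon": "afternoon",
    "afternoon showing": "afternoon",
    "evening": "evening",
    "in the evening": "evening",
    "evening showing": "evening",
    "late night": "late night",
    "late night showing": "late night",
}


def normalize_response(utterance):
    """Return the canonical answer, or None if the phrasing is unexpected."""
    # Lowercase, drop punctuation, and collapse runs of whitespace.
    text = re.sub(r"[^a-z ]", "", utterance.lower())
    text = " ".join(text.split())
    return SYNONYMS.get(text)


print(normalize_response("In the afternoon,"))
print(normalize_response("Sometime tomorrow."))
```

A result of None corresponds to the "unrecognized" case, where the application would typically ask the question again.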

Word spotting might work well for some prompts if the recognizer is only looking for a keyword like "afternoon," "evening," or "late night." If more than a few keywords are possible, then accuracy decreases.
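The idea of word spotting can be illustrated on text alone. The sketch below assumes the recognizer has already produced a transcript string, which is a simplification, since real word spotting operates on the audio. The names KEYWORDS and spot_keyword are hypothetical.

```python
import re

# Hypothetical keyword list for the movie-time prompt.
KEYWORDS = ["afternoon", "evening", "late night"]


def spot_keyword(transcript):
    """Return the single keyword found in the transcript, else None."""
    # Lowercase and replace punctuation with spaces so "late-night"
    # and "evening," still match.
    text = " ".join(re.sub(r"[^a-z ]", " ", transcript.lower()).split())
    hits = [
        keyword
        for keyword in KEYWORDS
        if re.search(r"\b" + re.escape(keyword) + r"\b", text)
    ]
    # Treat zero hits or several competing hits as unrecognized.
    return hits[0] if len(hits) == 1 else None


print(spot_keyword("I'd like the evening one, please"))
print(spot_keyword("afternoon or evening, I don't mind"))
```

Returning None when more than one keyword appears is a design choice in this sketch: guessing between competing keywords risks acting on a command the user did not intend.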