Grammar-Specific Notifications

When the user begins speaking a command or phrase, the engine notifies the application by calling the ISRNotifySink::UtteranceBegin member function. When the user stops speaking, the engine calls the UtteranceEnd member function. The application should not use these notifications to influence recognition.
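These notifications are suited to simple feedback, such as a "listening" indicator. The following is a minimal sketch only: the member-function prototypes shown (QWORD time stamps) are assumptions to be checked against the SDK header, the remaining ISRNotifySink members are omitted, and ShowListeningIndicator is a hypothetical application helper.

    // Minimal sketch of an application's ISRNotifySink implementation.
    // Only the two members discussed above are shown; their prototypes
    // are assumptions, and ShowListeningIndicator is a hypothetical
    // UI helper.
    class CMyNotifySink : public ISRNotifySink
    {
    public:
        // IUnknown members (QueryInterface, AddRef, Release) omitted.

        STDMETHODIMP UtteranceBegin(QWORD qTimeStamp)
        {
            // The user has started speaking; show feedback only.
            ShowListeningIndicator(TRUE);
            return NOERROR;
        }

        STDMETHODIMP UtteranceEnd(QWORD qTimeBegin, QWORD qTimeEnd)
        {
            // The user has stopped speaking; any results arrive later
            // through the grammar object's ISRGramNotifySink.
            ShowListeningIndicator(FALSE);
            return NOERROR;
        }

        // ... remaining ISRNotifySink members ...
    };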

Unless the sound turns out to be just noise, some time after calling UtteranceBegin the engine calls the PhraseStart member function of the ISRGramNotifySink interface on each active grammar object, indicating that it has begun recognition processing on the audio. The notification gives the position in the audio stream (in bytes) at which the phrase began; an application can convert this position into hours, minutes, and seconds by calling the ISRCentral::ToFileTime member function.
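For example, a PhraseStart handler could log the time of day at which the phrase began. This is a sketch only: CMyGramNotifySink is an assumed application class implementing ISRGramNotifySink, m_pISRCentral is an assumed member holding the engine's ISRCentral pointer, and the exact prototype of ToFileTime (byte position in, FILETIME out, as assumed here) should be checked against the SDK header. FileTimeToSystemTime is the standard Win32 conversion to hours, minutes, and seconds.

    // Sketch: convert the byte position passed to PhraseStart into a
    // time of day. The ToFileTime prototype assumed here should be
    // verified against the SDK header.
    STDMETHODIMP CMyGramNotifySink::PhraseStart(QWORD qTimeBegin)
    {
        FILETIME   ft;
        SYSTEMTIME st;

        if (SUCCEEDED(m_pISRCentral->ToFileTime(&qTimeBegin, &ft)) &&
            FileTimeToSystemTime(&ft, &st))
        {
            // st.wHour, st.wMinute, and st.wSecond now give the time at
            // which the phrase began; log it or use it to synchronize
            // other events with the audio.
        }
        return NOERROR;
    }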

If an engine supports hypotheses, it periodically calls the ISRGramNotifySink::PhraseHypothesis member function after the initial call to the PhraseStart member function. Many applications ignore calls to PhraseHypothesis because they have no use for the information, but some display the hypothesis to the user or preprocess it.
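An application that does display hypotheses might handle the notification as sketched below. The parameter list assumed here follows the description in the next paragraph, and the variable-length SRPHRASE and SRWORD layout (field names dwSize, abWords, szWord) follows the description later in this section; the exact declarations are in the SDK header. ClearStatusLine and AppendToStatusLine are hypothetical UI helpers.

    // Sketch: show the engine's current best guess to the user. The
    // parameters and the SRPHRASE/SRWORD field names assumed here should
    // be checked against the SDK header.
    STDMETHODIMP CMyGramNotifySink::PhraseHypothesis(DWORD dwFlags,
        QWORD qTimeBegin, QWORD qTimeEnd, PSRPHRASE pSRPhrase,
        LPUNKNOWN pIUnknown)
    {
        if (pSRPhrase != NULL)
        {
            // Walk the packed, variable-length SRWORD structures that
            // follow the SRPHRASE header; each dwSize field gives the
            // size of one SRWORD in bytes.
            const BYTE *pb    = (const BYTE *)pSRPhrase->abWords;
            const BYTE *pbEnd = (const BYTE *)pSRPhrase + pSRPhrase->dwSize;

            ClearStatusLine();
            while (pb < pbEnd)
            {
                const SRWORD *pWord = (const SRWORD *)pb;
                AppendToStatusLine(pWord->szWord);
                pb += pWord->dwSize;
            }
        }
        return NOERROR;
    }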

When an engine has concluded its processing of the audio, it calls the PhraseFinish member function. The PhraseHypothesis and PhraseFinish member functions pass the following information to the application:

· The start and end times of the phrase. The application can use this information to isolate the audio or synchronize it with other events in the system.

· Two flags. One flag indicates whether the engine is confident enough about the recognition that the application should act on it, or whether the results are unclear and the application should ask the user for clarification (much as a person would say, "Could you repeat that, please?"). The other flag indicates whether the best recognition result was obtained from this grammar or from another active grammar. An application should act on a phrase recognition only if the result came from its own grammar, as illustrated in the sketch after this list.

· Address of an SRPHRASE structure. If the engine can recognize words from the speech, it passes the address of an SRPHRASE structure that contains a sequential list of SRWORD structures, each giving the text and word identifier of a recognized word. An application can display these words, parse them, or do whatever else it needs to do with the information. (If the grammar is context-free, the words must come from the grammar's set of words.)

· Address of a results object. Some engines support a speech-recognition results object that allows an application to get more information about what was recognized, such as accurate timing and alternative recognition possibilities. If a results object exists, the engine passes a non-NULL address of an IUnknown interface for the object. For more information about the speech-recognition results object, see "Speech-Recognition Results Objects" later in this section.
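Putting this together, a PhraseFinish handler typically checks both flags before acting, as in the sketch below. RECOGNIZED_FLAG and THISGRAMMAR_FLAG are placeholders for the actual flag names defined in the SDK header, and AskUserToRepeat and ExecuteCommand are hypothetical application helpers.

    // Sketch of a PhraseFinish handler; the flag names are placeholders
    // for the constants defined in the SDK header.
    STDMETHODIMP CMyGramNotifySink::PhraseFinish(DWORD dwFlags,
        QWORD qTimeBegin, QWORD qTimeEnd, PSRPHRASE pSRPhrase,
        LPUNKNOWN pIUnknown)
    {
        if (!(dwFlags & RECOGNIZED_FLAG))
        {
            // Recognition was unclear; ask the user to repeat the phrase.
            AskUserToRepeat();
            return NOERROR;
        }
        if (!(dwFlags & THISGRAMMAR_FLAG))
            return NOERROR;        // best result belongs to another grammar

        if (pSRPhrase != NULL)
        {
            // Walk the SRWORD list as in the PhraseHypothesis sketch and
            // carry out the recognized command.
            ExecuteCommand(pSRPhrase);
        }

        // The results object passed in pIUnknown is handled as described
        // in the following paragraphs.
        return NOERROR;
    }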

If the application needs to use the speech-recognition results object, it calls the IUnknown::QueryInterface member function to get an interface on the object (for example, ISRResBasic). In this case, it is the application's responsibility to release the results object. If the application doesn't use the results object, the engine calls the IUnknown::Release member function on the results object after the application returns from the PhraseFinish notification. The engine calls the Release member function so that an application that ignores the PhraseFinish notification does not inadvertently leave results objects in memory.
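The following is a minimal sketch of keeping the results object, assuming it is called from PhraseFinish with the pIUnknown parameter, that ISRResBasic and IID_ISRResBasic are declared in the SDK header, and that m_pResults is an application member used to hold the reference.

    // Sketch: hold on to the results object for later use. The reference
    // obtained here belongs to the application, which must release it.
    void CMyGramNotifySink::KeepResults(LPUNKNOWN pIUnknown)
    {
        if (pIUnknown == NULL)
            return;                        // engine supplied no results object

        ISRResBasic *pResBasic = NULL;
        if (SUCCEEDED(pIUnknown->QueryInterface(IID_ISRResBasic,
                                                (void **)&pResBasic)))
        {
            if (m_pResults != NULL)
                m_pResults->Release();     // drop any earlier results object
            m_pResults = pResBasic;        // release this when finished
        }
        // If the application never queries an interface on the results
        // object, the engine releases the object itself after the
        // application returns from PhraseFinish.
    }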

While training or double-checking its recognitions, an engine may change the result of a recognition. If so, the engine calls the ISRGramNotifySink::ReEvaluate member function to notify the application that the results object has changed and should be updated. Applications that use speech recognition for command and control typically ignore this notification, but those using it for data or text entry may use the information to improve performance.

The grammar object uses the notification object until the grammar object itself is freed, at which point it calls the ISRGramNotifySink::Release member function to release the notification sink. This may not occur immediately after the application releases the grammar object because of the asynchronous nature of speech recognition; that is, the engine may need to finish processing a certain amount of speech after the grammar object is released.