Site Attributes

Every voice-command site has a set of attributes that affect the interaction between the speech-recognition engine and the audio input device. An application can query and set a site's attributes by using the IVCmdAttributes interface provided by the voice-command object. To get a pointer to IVCmdAttributes, call the IVoiceCmd::QueryInterface member function with the IID_IVCmdAttributes interface identifier. All speech interfaces include the IUnknown functions QueryInterface, AddRef, and Release, so if you have a pointer to any interface for an object, you can call QueryInterface through that pointer to get a pointer to another of the object's interfaces; a separate IUnknown pointer is not necessary.
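The following sketch illustrates the QueryInterface pattern described above without depending on the Windows SDK or the speech headers. Every type in it (IUnknownLike, IVoiceCmdLike, IVCmdAttributesLike, VoiceCmdObject) is a hypothetical stand-in, interface identifiers are modeled as strings rather than IIDs, and the stored threshold of 50 is an arbitrary placeholder. Real code would use IUnknown, REFIID, and HRESULT instead.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical, simplified stand-in for IUnknown (not the real COM type).
struct IUnknownLike {
    virtual bool QueryInterface(const char* iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    ~IUnknownLike() = default;
};

// Hypothetical stand-ins for two interfaces of the voice-command object.
struct IVoiceCmdLike : IUnknownLike {};
struct IVCmdAttributesLike : IUnknownLike {
    virtual unsigned ThresholdGet() const = 0;
};

// One object that exposes both interfaces, as the voice-command object does.
class VoiceCmdObject : public IVoiceCmdLike, public IVCmdAttributesLike {
public:
    bool QueryInterface(const char* iid, void** out) override {
        assert(iid != nullptr && out != nullptr);
        if (std::strcmp(iid, "IVoiceCmd") == 0) {
            *out = static_cast<IVoiceCmdLike*>(this);
        } else if (std::strcmp(iid, "IVCmdAttributes") == 0) {
            *out = static_cast<IVCmdAttributesLike*>(this);
        } else {
            *out = nullptr;          // interface not supported
            return false;
        }
        AddRef();                    // the caller owns one reference on success
        return true;
    }
    unsigned long AddRef() override { return ++refs_; }
    // The sketch only counts references; a real COM object would destroy
    // itself when the count reaches zero.
    unsigned long Release() override { return --refs_; }
    unsigned ThresholdGet() const override { return threshold_; }

private:
    unsigned long refs_ = 1;
    unsigned threshold_ = 50;        // arbitrary placeholder value
};
```

Given an IVoiceCmdLike pointer, a caller asks that same pointer for "IVCmdAttributes", casts the result to IVCmdAttributesLike, uses it, and then calls Release on it. No separate IUnknown pointer is involved at any step.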

Because multiple applications can share the same voice-command site, changing a site attribute affects all applications registered to use the site, not just the application making the change. To receive notification of attribute changes, an application must specify the VCMDRF_ALLMESSAGES value when calling the IVoiceCmd::Register member function. Note that attribute settings are saved between uses of the site, even if the computer is shut off in the meantime.

Audio Input Device

An application can use the IVCmdAttributes::DeviceSet member function to change the audio input device for a voice-command site. To change the device, the application must specify the device identifier of the new device. An application can use the waveInGetNumDevs and waveInGetDevCaps multimedia functions to obtain the identifiers of the available audio devices in the system. The IVCmdAttributes::DeviceGet member function retrieves the device identifier of the wave-in audio device for the site.

Speech-Recognition Mode

A speech-recognition engine typically provides an assortment of modes that it can use to recognize speech in different languages and dialects. Each voice-command site uses a single speech-recognition mode. An application can change the mode for a voice-command site by specifying the GUID of the new mode in a call to the IVCmdAttributes::SRModeSet member function. To retrieve the GUID of the mode that the site is currently using, an application can call the IVCmdAttributes::SRModeGet member function.

Speaker Name

One way an application can improve the recognition accuracy of an engine is to train the engine to take into account the unique qualities of a user's voice. By using the IVCmdDialogs interface of the voice-command object, an application can direct the engine to display its training dialog box.

Typically, an engine's training dialog box displays a sequence of words and phrases that the user must speak into the audio input device. The engine processes the user's spoken input and saves information that helps the engine improve its recognition accuracy for the user. The engine saves the user's name (that is, the speaker name) along with the user's training information and uses the name to load the information whenever the user becomes the speaker for a site.

An application changes the speaker name for a voice-command site by using the IVCmdAttributes::SpeakerSet member function. Changing the speaker name unloads all training for the previous speaker and loads the training for the new speaker. If no training exists for the new speaker, the engine starts with its default training. The IVCmdAttributes::SpeakerGet member function retrieves the name of the current speaker for a site.

Microphone Name

Some speech-recognition engines can improve recognition accuracy by optimizing themselves for use with particular types of microphones. Setting a microphone name for a site allows the engine to identify the type of the microphone, so it can optimize itself accordingly. An application uses the IVCmdAttributes::MicrophoneSet and IVCmdAttributes::MicrophoneGet member functions to set and retrieve the microphone name.

Awake State

A voice-command site can be either awake or asleep. When a site is awake, the speech-recognition engine "listens" for commands for all active voice menus associated with the site. When a site is asleep, the engine listens for commands only from sleep menus. Commands from sleep menus become active only when the site is asleep, and they become inactive when the site is awake. An application uses the IVCmdAttributes::AwakeStateSet and IVCmdAttributes::AwakeStateGet member functions to set and retrieve the awake state.

An application can use AwakeStateSet to enable the user to briefly suspend voice commands for a site. For example, the user might want to suspend recognition from the computer microphone during a telephone conversation and resume recognition when the conversation is finished. A sleep menu typically contains a "Wake up!" command that resumes speech recognition, but it may contain other commands as well.

Enabled State

The IVCmdAttributes::EnabledSet member function enables or disables speech recognition for a voice-command site. Disabling a site completely turns off speech recognition for the site so that the engine recognizes no audio input, not even commands on sleep menus. For example, a user might want to disable speech recognition from the computer microphone during a meeting so that speech recognition will stay off, even if somebody inadvertently speaks a command on a sleep menu. The IVCmdAttributes::EnabledGet member function reports whether a site is enabled or disabled.
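The combined effect of the enabled state and the awake state can be summarized in a small model. This is only a sketch of the rules stated above; SiteState, MenuKind, and ListensFor are hypothetical names that do not exist in the speech API, and the model assumes the menu in question is otherwise active.

```cpp
// Hypothetical model types; these are not part of the speech API.
enum class MenuKind { Normal, Sleep };

struct SiteState {
    bool enabled;  // models the value set through EnabledSet
    bool awake;    // models the value set through AwakeStateSet
};

// Returns true if the engine would listen for commands from an active
// menu of the given kind, according to the rules described in the text.
bool ListensFor(const SiteState& site, MenuKind kind) {
    if (!site.enabled) {
        return false;                      // disabled: nothing is recognized
    }
    if (site.awake) {
        return kind == MenuKind::Normal;   // awake: sleep menus are inactive
    }
    return kind == MenuKind::Sleep;        // asleep: only sleep menus
}
```

In this model, an enabled but sleeping site still responds to a "Wake up!" command on a sleep menu, whereas a disabled site responds to nothing until the application enables it again.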

The user can use the Speech control panel application to turn the system's speech capabilities on and off. When the system's speech capabilities are turned on or off, the WM_SPEECHSTARTED or WM_SPEECHENDED message, respectively, is sent to all top-level windows in the system. An application can use these messages to determine when to enable or disable its voice-command capabilities.

Automatic Gain

Many speech-recognition engines can automatically adjust the gain of the incoming audio signal for a voice-command site (if the audio device supports it). Gain refers to the increase in signaling power, measured in decibels (dB), that occurs as a signal is boosted by an electronic device.

An application can use the IVCmdAttributes::AutoGainEnableSet member function to set the speed with which the engine adjusts the signaling power. When calling AutoGainEnableSet, the application specifies a value from 0 to 100. A value of 0 disables automatic gain, and a value of 100 causes the voice-command object to set the gain to the value for the previous utterance so that if the next utterance is spoken at the same level, the gain is set perfectly. A value between 0 and 100 moderates the automatic gain adjustments on a linear scale. For example, a value of 50 adjusts the gain to 50% of the level for the previous utterance.
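One plausible reading of the linear scale is that the automatic gain value selects how far the gain moves from its current setting toward the setting that would have suited the previous utterance. The sketch below works through that arithmetic. It is an assumption made for illustration, not the engine's documented algorithm, and NextGainDb and its parameters are hypothetical names.

```cpp
#include <cassert>

// Hypothetical helper, not part of the speech API.
//   currentGainDb: gain in effect during the previous utterance
//   idealGainDb:   gain that would have suited the previous utterance
//   autoGain:      the 0 to 100 value passed to AutoGainEnableSet
// Assumes the adjustment is a linear interpolation between the two gains.
double NextGainDb(double currentGainDb, double idealGainDb, unsigned autoGain) {
    assert(autoGain <= 100);
    const double fraction = autoGain / 100.0;
    return currentGainDb + fraction * (idealGainDb - currentGainDb);
}
```

Under this reading, with a current gain of 10 dB and an ideal gain of 20 dB, a value of 0 leaves the gain at 10 dB, a value of 100 moves it to 20 dB, and a value of 50 moves it halfway, to 15 dB.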

An application retrieves the current automatic gain value for a voice-command site by using the IVCmdAttributes::AutoGainEnableGet member function.

Threshold Level

An application sets and retrieves the threshold level of the speech-recognition engine used by a voice-command site by using the IVCmdAttributes::ThresholdSet and IVCmdAttributes::ThresholdGet member functions. The threshold level is a value from 0 to 100 that indicates the confidence level below which the engine rejects an utterance as unrecognized. A value of 0 indicates that the engine should match any utterance to the closest phrase. A value of 100 indicates that the engine should be absolutely certain that an utterance is the recognized phrase. For example, suppose the engine is expecting "What is the time?" If the threshold is 100 and the user mumbles "What'z tha time" or has a cold, the command may not be recognized. However, if the threshold is too low and the user says a similar-sounding phrase that is not being listened for, such as "What is mine?", the engine may recognize it as "What is the time?"
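The trade-off can be pictured as a comparison between the threshold and a confidence score for the closest phrase. The sketch below assumes the engine produces such a score on the same 0 to 100 scale. How an engine actually computes and scales its confidence is engine-specific, and AcceptsUtterance is a hypothetical name.

```cpp
#include <cassert>

// Hypothetical model, not part of the speech API.
//   confidence: assumed 0 to 100 score for the closest phrase match
//   threshold:  the 0 to 100 value passed to ThresholdSet
// Returns true if the utterance would be reported as a recognized command.
bool AcceptsUtterance(unsigned confidence, unsigned threshold) {
    assert(confidence <= 100 && threshold <= 100);
    return confidence >= threshold;
}
```

With a threshold of 0, every utterance is matched to its closest phrase, so a low-scoring "What is mine?" would still be accepted as "What is the time?". With a threshold of 100, only an utterance scored at 100 is accepted, so a mumbled "What'z tha time" would be rejected.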

If the command spoken by the user is not close enough to what the speech-recognition engine expects, the voice-command object notifies the application that the command was not recognized by calling the CommandOther member function of the application's IVCmdNotifySink notification interface with a NULL string. For more information about the voice-command notification interface, see "Notification Interface" later in this section.
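As a rough illustration, an application might track how often utterances are rejected so that it can, for example, suggest lowering the threshold. The class below is a simplified, hypothetical stand-in for a notification sink: the real CommandOther member function has a different signature, and the meaning of a non-NULL string is not covered here, so the sketch ignores that case.

```cpp
// Hypothetical, simplified stand-in for part of a notification sink.
class RejectionCounter {
public:
    // Models the CommandOther notification. A null pszCommand means the
    // utterance was not recognized; any other value is ignored here.
    void OnCommandOther(const char* pszCommand) {
        if (pszCommand == nullptr) {
            ++rejected_;
        }
    }

    // Number of unrecognized utterances seen so far.
    unsigned Rejected() const { return rejected_; }

private:
    unsigned rejected_ = 0;
};
```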