Voice-Command Site

A speech-recognition engine receives spoken audio input from the user through a voice-command site on the user's computer. A voice-command site consists of an audio input device, such as a microphone or telephone, and a speech-recognition mode. An engine typically provides several different modes for recognizing speech, each representing a different language or dialect.

An application must create a separate voice-command object for each voice-command site it needs to use. The default site is the computer microphone combined with the default mode of the engine. To use any other site, the application must obtain the identifier of the desired audio input device and the globally unique identifier (GUID) of the desired mode, and then pass both values to the voice-command object when registering.
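The relationship between a site, its device identifier, and its mode GUID can be sketched as a small data structure. This is an illustrative model only: SiteSpec, ModeGuid, defaultSite, customSite, deviceOf, and sameSite are hypothetical names, not part of the speech API, and the real structure passed at registration is not shown in this section.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 128-bit mode GUID (assumption: 16 raw bytes).
using ModeGuid = std::array<std::uint8_t, 16>;

// Hypothetical description of one voice-command site.
struct SiteSpec {
    bool          useDefault;  // true: computer microphone + engine default mode
    std::uint32_t deviceId;    // audio input device; ignored when useDefault is true
    ModeGuid      modeId;      // speech-recognition mode; ignored when useDefault is true
};

// The default site needs no device identifier or mode GUID.
SiteSpec defaultSite() {
    SiteSpec s;
    s.useDefault = true;
    s.deviceId = 0;
    s.modeId = ModeGuid();  // value-initialized to all zeros
    return s;
}

// Any other site must name both the device and the mode.
SiteSpec customSite(std::uint32_t deviceId, const ModeGuid& modeId) {
    SiteSpec s;
    s.useDefault = false;
    s.deviceId = deviceId;
    s.modeId = modeId;
    return s;
}

// Reading the device of the default site is treated as a programming error here.
std::uint32_t deviceOf(const SiteSpec& s) {
    assert(!s.useDefault);
    return s.deviceId;
}

// Two specs describe the same site if both are the default, or if both
// name the same device and mode. Distinct sites would each need their own
// voice-command object.
bool sameSite(const SiteSpec& a, const SiteSpec& b) {
    if (a.useDefault || b.useDefault) {
        return a.useDefault && b.useDefault;
    }
    return a.deviceId == b.deviceId && a.modeId == b.modeId;
}
```

In this model, an application that listens on both the default site and a telephone device would hold two specs for which sameSite returns false, and would therefore create two voice-command objects.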

An application can use the waveInGetNumDevs and waveInGetDevCaps multimedia functions to obtain the device identifier of an audio input device. To determine the available speech-recognition modes, an application creates a speech-recognition enumerator. The enumerator returns information about all modes provided by all engines in the user's system, including the GUID of each mode.
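As a minimal sketch of the device lookup described above, the following code walks the audio input devices with waveInGetNumDevs and waveInGetDevCaps and collects each identifier with its product name. The InputDevice structure and the listInputDevices and findDeviceByName helpers are hypothetical names introduced for illustration. The Win32 calls are compiled only on Windows (link with winmm.lib); on other platforms the list is simply empty. Enumerating speech-recognition modes is not shown, because the enumerator interface is covered in a different section.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#include <mmsystem.h>  // waveInGetNumDevs, waveInGetDevCaps
#endif

// Hypothetical record pairing a device identifier with its product name.
struct InputDevice {
    std::uint32_t id;
    std::string   name;
};

// Collects all waveform-audio input devices reported by the system.
std::vector<InputDevice> listInputDevices() {
    std::vector<InputDevice> devices;
#ifdef _WIN32
    const UINT count = waveInGetNumDevs();
    for (UINT id = 0; id < count; ++id) {
        WAVEINCAPSA caps;
        // Devices whose capabilities cannot be queried are skipped.
        if (waveInGetDevCapsA(id, &caps, sizeof(caps)) == MMSYSERR_NOERROR) {
            InputDevice dev;
            dev.id = static_cast<std::uint32_t>(id);
            dev.name = caps.szPname;
            devices.push_back(dev);
        }
    }
#endif
    return devices;
}

// Returns the index of the first device whose name contains the given
// fragment, or -1 if no device matches.
int findDeviceByName(const std::vector<InputDevice>& devices,
                     const std::string& fragment) {
    assert(!fragment.empty());  // an empty fragment would match every device
    for (std::size_t i = 0; i < devices.size(); ++i) {
        if (devices[i].name.find(fragment) != std::string::npos) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

An application could call listInputDevices once at startup, pick an entry with findDeviceByName, and keep the id field of that entry as the device identifier for the site it wants to register.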

For more information about multimedia and multimedia functions, see the Microsoft Win32 Software Development Kit (SDK). For information about enumerating the modes of a speech-recognition engine, see the "Low-Level Speech Recognition" section.