Initializing an Application for Speech Recognition

To implement speech-recognition features, you must initialize the OLE libraries and then create instances of at least three speech-recognition objects: the audio-source object, the speech-recognition engine object, and the grammar object. The audio-source object provides speech input to the engine object, which processes the speech. The grammar object provides the engine with the words and rules that determine what the engine can recognize. In addition, you can use the speech-recognition sharing object and the speech-recognition enumerator to find an engine that meets your application's requirements.

The following example demonstrates how to initialize an application for speech recognition by creating the necessary objects. The example accomplishes the following tasks:

· Uses the CoInitialize function to initialize the OLE libraries.

· Creates an instance of the multimedia audio-source object, which allows the engine to receive input from the multimedia wave-in driver.

· Creates a speech-recognition sharing object and a speech-recognition enumerator. The application-defined FindSuitableEngine function (described in the following section) uses the sharing object and enumerator to find an engine and audio source that meet the application's requirements.

· Creates an instance of an application-defined class that implements the ISRGramNotifySink interface (described in "Processing Recognition Notifications" later in this section). The engine calls this interface to notify the application of recognition events.

· Calls the application-defined InitializeGrammar function (described in "Loading a Grammar" later in this section) that loads a context-free grammar from a file and creates a grammar object.

· Calls the ISRGramCommon::Activate member function to activate the newly created grammar.

// BeginOLE - Initializes OLE, creates the speech-recognition objects,
// and activates a grammar.
// Returns TRUE if successful, or FALSE otherwise.
//
// Global variables:
// g_pIAMMD - address of the IAudioMultiMediaDevice interface
// g_pIEnumSRShare - address of the IEnumSRShare interface
// g_pGramNotifySink - address of the ISRGramNotifySink interface
// g_szGramFile - name of the grammar file
// g_hwndMain - handle of the application's main window

BOOL BeginOLE()
{
    HRESULT hRes;
    PISRFIND pISRFind = NULL;

    // Initialize OLE.
    if (FAILED(CoInitialize(NULL)))
        return FALSE;

    // Create the audio-source object and retrieve the address of the
    // object's IAudioMultiMediaDevice interface. By default, the object
    // uses the WAVE_MAPPER device.
    if (CoCreateInstance(CLSID_MMAudioSource, NULL, CLSCTX_ALL,
            IID_IAudioMultiMediaDevice, (LPVOID *) &g_pIAMMD) != S_OK)
        return FALSE;

    // Create a speech-recognition sharing object and retrieve the
    // address of the object's IEnumSRShare interface. Failure is not
    // fatal because sharing is optional.
    CoCreateInstance(CLSID_SRShare, NULL, CLSCTX_ALL, IID_IEnumSRShare,
        (LPVOID *) &g_pIEnumSRShare);

    // Create a speech-recognition enumerator and retrieve the address
    // of the enumerator's ISRFind interface.
    if (CoCreateInstance(CLSID_SREnumerator, NULL, CLSCTX_ALL,
            IID_ISRFind, (LPVOID *) &pISRFind) != S_OK)
        return FALSE;

    // Call an application-defined function that uses the speech-
    // recognition sharing object and enumerator to find and select a
    // suitable speech-recognition engine.
    hRes = FindSuitableEngine(pISRFind);
    pISRFind->Release();
    if (hRes != NOERROR)
        return FALSE;

    // Create the grammar notification interface based on an
    // application-defined CISRGramNotifySink class.
    if ((g_pGramNotifySink = new CISRGramNotifySink) == NULL)
        return FALSE;

    // Call an application-defined function that reads grammar data from
    // a file and then loads the grammar.
    if (!InitializeGrammar(g_pISRCentral, g_szGramFile))
        return FALSE;

    // Activate the grammar.
    if ((hRes = g_pISRGramCommon->Activate(g_hwndMain, FALSE, NULL))
            != NOERROR)
        return FALSE;

    return TRUE;
}