Engine Selection

After an application has identified an acceptable text-to-speech mode, it calls the Select member function to select the mode and create an engine object to represent the mode in the application. Both the ITTSFind and ITTSEnum interfaces include the Select member function, and the syntax is the same for both.

When calling Select, the application specifies a globally unique identifier (GUID) that uniquely identifies the mode to use, the address of a variable that receives the address of the ITTSCentral interface on the engine object, and the address of the IUnknown interface on the audio-destination object created earlier (for more information, see "Audio-Destination Object" earlier in this section).

When Select is called, it creates an engine object. As the engine object initializes, it first communicates with the audio-destination object to determine whether the two share an acceptable format for digital audio. If they do, the engine finishes initializing and Select returns the address of the ITTSCentral interface on the engine object. If they do not, the engine object frees itself and Select returns an error.
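The format check described above can be sketched in portable C++. This is only an illustration of the control flow, not the actual engine code: AudioFormat, MockAudioDest, MockEngine, SelectResult, and mockSelect are hypothetical stand-ins for the COM objects and the real Select member function, whose exact signature should be taken from the speech API header.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a digital-audio format description.
struct AudioFormat {
    int sampleRateHz;
    int bitsPerSample;
    int channels;
};

inline bool sameFormat(const AudioFormat& a, const AudioFormat& b) {
    return a.sampleRateHz == b.sampleRateHz &&
           a.bitsPerSample == b.bitsPerSample &&
           a.channels == b.channels;
}

// Hypothetical stand-in for the audio-destination object.
struct MockAudioDest {
    std::vector<AudioFormat> accepted;

    bool accepts(const AudioFormat& f) const {
        for (const AudioFormat& a : accepted) {
            if (sameFormat(a, f)) return true;
        }
        return false;
    }
};

// Hypothetical stand-in for the engine object.
struct MockEngine {
    std::vector<AudioFormat> supported;  // formats the engine can produce
    AudioFormat negotiated{0, 0, 0};     // set once a shared format is found

    // Returns true if the engine and the destination share a format.
    bool initialize(const MockAudioDest& dest) {
        for (const AudioFormat& f : supported) {
            if (dest.accepts(f)) {
                negotiated = f;
                return true;
            }
        }
        return false;
    }
};

enum class SelectResult { Ok, NoSharedFormat };

// Mirrors the described flow: create the engine, let it negotiate a format
// with the audio destination, and either hand it back or free it.
inline SelectResult mockSelect(const std::vector<AudioFormat>& engineFormats,
                               const MockAudioDest& dest,
                               std::unique_ptr<MockEngine>& outEngine) {
    std::unique_ptr<MockEngine> engine(new MockEngine());
    engine->supported = engineFormats;
    if (!engine->initialize(dest)) {
        outEngine.reset();
        return SelectResult::NoSharedFormat;  // `engine` is freed on return
    }
    outEngine = std::move(engine);
    return SelectResult::Ok;
}

// Usage check: both sides support 22050 Hz, 16-bit, mono.
inline void demoSelect() {
    MockAudioDest dest;
    dest.accepted.push_back(AudioFormat{22050, 16, 1});
    std::vector<AudioFormat> formats;
    formats.push_back(AudioFormat{11025, 8, 1});
    formats.push_back(AudioFormat{22050, 16, 1});
    std::unique_ptr<MockEngine> engine;
    assert(mockSelect(formats, dest, engine) == SelectResult::Ok);
    assert(engine && engine->negotiated.sampleRateHz == 22050);
}
```

In a real application the caller would test the result code returned by Select in the same way before using the returned ITTSCentral pointer.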

If Select succeeds, the application no longer needs the text-to-speech enumerator and should release it by calling the ITTSFind::Release or ITTSEnum::Release member function.

The ITTSCentral interface provides centralized control of a text-to-speech engine, such as sending text to the engine, injecting speech-inflection tags (control tags) into text as it is played, controlling the playing of the audio output, registering or releasing a notification interface, and so on. The engine uses the notification interface to notify the application of speech-related events. For more information about the notification interface, see "Engine Notifications" later in this section.

The ITTSCentral::ModeGet member function fills a TTSMODEINFO structure with information about the current text-to-speech mode.

The ITTSCentral::Phoneme member function converts text to its phonemic representation, which is the intermediate stage between Unicode text and digital-audio data. An application can use Phoneme to adjust the phonemic representation of text before the engine speaks it, for example, to correct mispronounced words.

An application can use the ITTSCentral::PosnGet member function to retrieve the exact byte in the audio stream that is currently being played, and then use this information to synchronize actions with events. For example, if an action should occur a certain interval after a particular byte is played, the application can use the position retrieved by PosnGet to determine the time at which that byte was played, add the interval to that time, and synchronize the action with the result.
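The arithmetic behind this kind of synchronization can be sketched as follows. The sketch assumes uncompressed PCM audio, so that a byte position maps linearly to elapsed time; PcmFormat, bytesPerSecond, byteToMs, and actionTimeMs are hypothetical helper names and are not part of the ITTSCentral interface.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of an uncompressed PCM stream.
struct PcmFormat {
    std::uint32_t samplesPerSec;
    std::uint16_t channels;
    std::uint16_t bitsPerSample;
};

// Number of audio bytes consumed per second of playback.
inline std::uint64_t bytesPerSecond(const PcmFormat& f) {
    return static_cast<std::uint64_t>(f.samplesPerSec) * f.channels *
           (f.bitsPerSample / 8u);
}

// Milliseconds from the start of the stream at which `bytePos` is played.
// The caller must pass a format with a nonzero byte rate.
inline std::uint64_t byteToMs(std::uint64_t bytePos, const PcmFormat& f) {
    return bytePos * 1000u / bytesPerSecond(f);
}

// Time at which an action should fire: the play time of the byte plus a
// fixed interval, both in milliseconds from the start of the stream.
inline std::uint64_t actionTimeMs(std::uint64_t bytePos, const PcmFormat& f,
                                  std::uint64_t intervalMs) {
    return byteToMs(bytePos, f) + intervalMs;
}

// Usage check: 22050 Hz, mono, 16-bit audio is 44100 bytes per second,
// so byte 88200 is played 2000 ms into the stream.
inline void demoPosition() {
    const PcmFormat fmt = {22050, 1, 16};
    assert(bytesPerSecond(fmt) == 44100u);
    assert(byteToMs(88200u, fmt) == 2000u);
    assert(actionTimeMs(88200u, fmt, 500u) == 2500u);
}
```

An application would substitute the byte position obtained from PosnGet for the literal position used here, and the format negotiated with the audio-destination object for fmt.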

The ITTSCentral::AudioPause member function pauses the engine so that it plays no more audio until the application calls the ITTSCentral::AudioResume or ITTSCentral::AudioReset member function. AudioReset stops all speech, releases all pending text buffers, and empties the text-to-speech engine's speaking queue.
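The difference between resuming and resetting can be illustrated with a small state model. MockTtsQueue and its member names are hypothetical; the model only reflects the behavior described above and assumes that a reset also ends the paused state, which should be confirmed against the engine documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical model of an engine's speaking queue.
class MockTtsQueue {
public:
    // Queues a text buffer to be spoken.
    void textData(const std::string& text) { pending_.push_back(text); }

    // Stops playback until audioResume or audioReset is called.
    void audioPause() { paused_ = true; }

    // Continues playback; queued text is kept.
    void audioResume() { paused_ = false; }

    // Stops speech and discards every pending text buffer.
    // Assumption: the paused state is cleared as well.
    void audioReset() {
        pending_.clear();
        paused_ = false;
    }

    // Speaks the next buffer unless paused or empty.
    // Returns true and fills `spoken` if a buffer was consumed.
    bool speakNext(std::string& spoken) {
        if (paused_ || pending_.empty()) return false;
        spoken = pending_.front();
        pending_.pop_front();
        return true;
    }

    bool paused() const { return paused_; }
    std::size_t pendingCount() const { return pending_.size(); }

private:
    std::deque<std::string> pending_;
    bool paused_ = false;
};

// Usage check: pausing keeps the queue, resuming continues it.
inline void demoPauseResume() {
    MockTtsQueue q;
    q.textData("Hello");
    q.textData("world");
    q.audioPause();
    std::string out;
    assert(!q.speakNext(out));        // nothing plays while paused
    assert(q.pendingCount() == 2u);   // queued text is still there
    q.audioResume();
    assert(q.speakNext(out) && out == "Hello");
}
```

After a pause, calling audioResume in this model continues with the queued text, whereas audioReset leaves nothing to speak.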

The ITTSCentral interface includes several other member functions, which are described later in this section.