Engine Notifications

Many engines provide notifications that are independent of the text buffer being spoken, although the engine sends them only while at least one buffer is in the queue. These notifications include the times at which the audio starts and finishes playing, and visual cues that correspond to the phonemes being spoken.

To receive engine notifications, the application must create a COM object that supports the ITTSNotifySink interface. Before the engine object can use the interface, the application must register it with the engine by calling the ITTSCentral::Register member function. When calling Register, the application specifies the address of its ITTSNotifySink interface and the IID_ITTSNotifySink interface identifier. If Register succeeds, the engine returns a notification identifier.

When the application no longer needs to receive engine notifications, it can call the ITTSCentral::UnRegister member function, specifying the notification identifier returned by Register. Calling UnRegister is optional, because the system automatically releases all of an application's notification interfaces when the application releases the engine object. However, the system may not release the notification interfaces immediately after the engine is released.
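
The registration lifecycle described above can be sketched in portable C++. This is a minimal illustration, not the real COM declarations: MockEngine, NotifySink, CountingSink, SimulateAudioStart, and RunRegistrationDemo are hypothetical stand-ins, and the parameter lists are assumptions chosen for brevity. The sketch only shows the pattern: register a sink, keep the returned identifier, and pass it back to stop receiving notifications.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical stand-in for a notification sink; NOT the real ITTSNotifySink declaration.
struct NotifySink {
    virtual ~NotifySink() = default;
    virtual void AudioStart(std::uint64_t timestamp) = 0;
    virtual void AudioStop(std::uint64_t timestamp) = 0;
};

// Hypothetical stand-in for the engine object; it models only Register and UnRegister.
class MockEngine {
public:
    // On success, stores a nonzero notification identifier in *key.
    bool Register(NotifySink* sink, std::uint32_t* key) {
        if (sink == nullptr || key == nullptr) return false;
        *key = next_key_++;
        sinks_[*key] = sink;
        return true;
    }

    // Removes the sink registered under key; returns false if the key is unknown.
    bool UnRegister(std::uint32_t key) { return sinks_.erase(key) == 1; }

    std::size_t SinkCount() const { return sinks_.size(); }

    // Test hook (assumption): pretends the engine has just started playing audio.
    void SimulateAudioStart(std::uint64_t timestamp) {
        for (auto& entry : sinks_) entry.second->AudioStart(timestamp);
    }

private:
    std::map<std::uint32_t, NotifySink*> sinks_;
    std::uint32_t next_key_ = 1;
};

// A sink that only counts the notifications it receives.
struct CountingSink : NotifySink {
    int starts = 0;
    int stops = 0;
    void AudioStart(std::uint64_t) override { ++starts; }
    void AudioStop(std::uint64_t) override { ++stops; }
};

// Registers a sink, receives one notification, unregisters, and returns
// the number of AudioStart calls the sink saw.
int RunRegistrationDemo() {
    MockEngine engine;
    CountingSink sink;
    std::uint32_t key = 0;
    if (!engine.Register(&sink, &key)) return -1;
    assert(key != 0);
    engine.SimulateAudioStart(100);  // delivered: the sink is registered
    engine.UnRegister(key);
    engine.SimulateAudioStart(200);  // not delivered: the sink was removed
    return sink.starts;
}
```

In a real application the sink would be a COM object and the engine would hold a reference to it, so the sink must outlive its registration.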

While the application has an ITTSNotifySink interface registered with the engine, it can receive several types of notifications. The engine calls the ITTSNotifySink::AudioStart member function as soon as it starts playing the audio data, and the ITTSNotifySink::AudioStop member function as soon as it stops.
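
As an illustration of how an application might use the audio start and stop notifications, the following sketch tracks whether audio is playing and how long the last utterance lasted. PlaybackTracker and DemoPlayback are hypothetical names, and the timestamp parameter and its units (treated here as milliseconds) are assumptions; consult the interface reference for the actual signatures.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical handler state; in a real sink these methods would be the
// bodies of the AudioStart and AudioStop member functions.
struct PlaybackTracker {
    bool playing = false;
    std::uint64_t started_at = 0;     // timestamp of the most recent start
    std::uint64_t last_duration = 0;  // length of the most recent playback

    // Called when the engine starts playing audio.
    void AudioStart(std::uint64_t timestamp) {
        playing = true;
        started_at = timestamp;
    }

    // Called when the engine stops playing audio.
    void AudioStop(std::uint64_t timestamp) {
        if (!playing) return;  // ignore a stop with no matching start
        playing = false;
        last_duration = timestamp - started_at;
    }
};

// Simulates one start/stop pair and returns the measured duration.
std::uint64_t DemoPlayback() {
    PlaybackTracker tracker;
    tracker.AudioStart(1000);
    assert(tracker.playing);
    tracker.AudioStop(4500);
    return tracker.last_duration;
}
```

An application could use the playing flag to enable or disable a "Stop" button, for example.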

The engine calls the ITTSNotifySink::Visual member function to indicate the phoneme that is currently being spoken and (if the engine supports it) mouth-position cues. An application can use this information to display animated faces. The engine calls Visual at the exact time that the mouth position is reached. If the application cannot render the animation frame quickly enough (within 1/30th of a second), it should implement a custom audio destination that forwards the notification to the application ahead of time. Not all engines call Visual.
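
The idea of forwarding a visual notification ahead of time can be sketched as a small queue that releases each cue a fixed lead time before it is due. This is not an audio destination implementation: VisualCue, LookaheadForwarder, Enqueue, and Advance are hypothetical names, and the millisecond time base is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical record of one visual cue: when it is due and which phoneme it marks.
struct VisualCue {
    std::uint64_t due;  // audio position (assumed milliseconds) at which the mouth shape is reached
    char phoneme;       // simplified phoneme label for the sketch
};

// Releases cues lead time units before they are due, so a slow renderer
// has time to prepare the animation frame.
class LookaheadForwarder {
public:
    explicit LookaheadForwarder(std::uint64_t lead) : lead_(lead) {}

    // Queues a cue; the sketch assumes cues arrive in non-decreasing time order.
    void Enqueue(const VisualCue& cue) {
        assert(pending_.empty() || pending_.back().due <= cue.due);
        pending_.push_back(cue);
    }

    // Given the current audio position, returns every cue that falls due
    // within the lead time and removes it from the queue.
    std::vector<VisualCue> Advance(std::uint64_t position) {
        std::vector<VisualCue> ready;
        while (!pending_.empty() && pending_.front().due <= position + lead_) {
            ready.push_back(pending_.front());
            pending_.pop_front();
        }
        return ready;
    }

private:
    std::uint64_t lead_;
    std::deque<VisualCue> pending_;
};
```

With a lead of 50, a cue due at position 100 is released once playback reaches position 50, which gives the application the extra time it needs to draw the frame.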