Distributing Processing Over a Network

An engine might want to do some of its processing on a different machine than the one the application is running on. While the Microsoft Speech API provides no official way to implement distributed processing, it is fairly easy for an engine to distribute its processing on its own.

To distribute processing, split the engine into two parts.

The first part is essentially a stub that receives all of the calls to the speech API and marshals the data over the network to the second part. The server portion performs the synthesis and sends digital-audio data back to the stub. The engine stub supports all of the interfaces that the full engine supports, although it does almost none of the processing itself. To reduce network usage, the stub might accept a compressed audio format from the server and decompress it on the host.
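The Speech API does not mandate any wire format, so the compression scheme is entirely up to the engine. As a purely illustrative sketch (a real engine would more likely use a standard codec such as ADPCM), the server could send each 16-bit sample as an 8-bit delta from the previous sample, and the stub could reconstruct the full samples like this; the function name is invented:

```cpp
#include <cstdint>
#include <vector>

// Illustrative only: reconstructs 16-bit audio samples on the host from a
// starting sample plus a stream of 8-bit signed deltas sent by the server.
std::vector<int16_t> DecompressDeltas(int16_t first,
                                      const std::vector<int8_t>& deltas)
{
    std::vector<int16_t> samples;
    samples.reserve(deltas.size() + 1);
    samples.push_back(first);
    int16_t prev = first;
    for (int8_t d : deltas) {
        prev = static_cast<int16_t>(prev + d);  // accumulate the delta
        samples.push_back(prev);
    }
    return samples;
}
```

This halves the bandwidth for the audio stream at the cost of a small amount of host CPU, which is exactly the trade-off the stub is in a position to make.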

The second part of the engine runs on a server. It accepts the marshaled data from the stub, processes it, and sends notifications back to the stub as digital audio is synthesized.
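The stub-and-server split can be sketched as follows. This is a minimal in-process illustration, not the actual Speech API engine interfaces: the function names (MarshalSpeakRequest, ServerSynthesize) and the wire format are invented, and a byte vector stands in for the network transport that a real engine would implement with sockets or RPC.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical wire format: the stub packs a text-to-speech request into a
// byte buffer that would travel over the network to the server.
std::vector<uint8_t> MarshalSpeakRequest(const std::string& text)
{
    std::vector<uint8_t> buf;
    uint32_t len = static_cast<uint32_t>(text.size());
    for (int i = 0; i < 4; ++i)                      // 4-byte little-endian length
        buf.push_back(static_cast<uint8_t>(len >> (8 * i)));
    buf.insert(buf.end(), text.begin(), text.end()); // then the text itself
    return buf;
}

// Server side: unmarshal the request and "synthesize" audio. A real server
// would run the full engine; here one silent sample per character stands in
// for the synthesized digital audio sent back to the stub.
std::vector<int16_t> ServerSynthesize(const std::vector<uint8_t>& request)
{
    uint32_t len = 0;
    for (int i = 0; i < 4; ++i)
        len |= static_cast<uint32_t>(request[i]) << (8 * i);
    std::string text(request.begin() + 4, request.begin() + 4 + len);
    return std::vector<int16_t>(text.size(), 0);     // placeholder audio data
}
```

The stub would then hand the returned audio to the application through the normal engine notification interfaces, so the application cannot tell that the synthesis happened on another machine.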

The engine might also support a custom interface, IMyDistributed. The IMyDistributed interface allows an application that knows about remote processing to control which machine the processing runs on, and to control exactly what kind of data is transmitted over the LAN (or other communication link) so that unneeded data (such as mouth-synchronization information) isn't transmitted.
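A sketch of what IMyDistributed might look like follows. The interface name comes from the text, but every method, flag, and class below is invented for illustration; a real engine would define the interface in IDL, derive it from IUnknown, and expose it through QueryInterface.

```cpp
#include <cstdint>
#include <string>

// Hypothetical flags naming the optional data categories that can travel
// over the network link between the stub and the server.
const uint32_t kSendMouthSync = 0x1;
const uint32_t kSendBookmarks = 0x2;

// Hypothetical custom interface, sketched as a plain abstract class rather
// than a true COM interface.
class IMyDistributed {
public:
    virtual ~IMyDistributed() = default;
    // Choose which machine performs the synthesis.
    virtual void SetServer(const std::string& machineName) = 0;
    // Select which optional data categories are transmitted over the link.
    virtual void SetTransmitMask(uint32_t mask) = 0;
    virtual uint32_t GetTransmitMask() const = 0;
};

// Trivial implementation showing how an engine stub might store the settings.
class MyDistributedEngine : public IMyDistributed {
public:
    void SetServer(const std::string& machineName) override { server_ = machineName; }
    void SetTransmitMask(uint32_t mask) override { mask_ = mask; }
    uint32_t GetTransmitMask() const override { return mask_; }
    const std::string& Server() const { return server_; }
private:
    std::string server_;
    uint32_t mask_ = kSendMouthSync | kSendBookmarks;  // default: send everything
};
```

An application that wanted to suppress mouth-synchronization traffic would query the engine for this interface and clear the corresponding flag before speaking; an application that knows nothing about the interface simply gets the default behavior.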