Speech Application Programming Interface

The Microsoft Speech Application Programming Interface (API) uses the OLE Component Object Model (COM) architecture under Win32® (Windows® 95 and Windows NT® 3.51). The component objects can be accessed from C/C++ or, through OLE Automation, from Visual Basic®. The speech architecture is divided into two levels: a high level designed for ease and speed of implementation, and a low level that gives applications complete control of the technology.

This article briefly describes the architecture involved and how to add simple speech recognition and text-to-speech to an application using C++. We begin with the high-level interfaces, known as Voice Commands and Voice Text.