Hardware and Software Requirements

Telephony applications use the same speech-recognition engines that are used for Command and Control speech recognition, and the same text-to-speech engines that are used on the PC.

These hardware and software requirements should be considered when designing a speech application:

Processor speed. The speech-recognition and text-to-speech engines currently on the market typically require a 486/66 or faster processor.

Memory. On average, the combination of speech recognition and text-to-speech uses about 2 megabytes (MB) of random-access memory (RAM) in addition to the memory required by the running application.

Telephony card. A number of telephony cards are on the market today. At the low end are cards that use fax/modem chips that have been augmented to handle speech. These cards are included in almost every new home PC. Higher-end cards include digital signal processors (DSPs) or support for multiple phone lines.

Operating system. The Microsoft Speech application programming interface (API) requires either Windows 95 or Windows NT version 3.5.

Speech-recognition and text-to-speech engine. Speech-recognition and text-to-speech software must be installed on the user's system. Many new audio-enabled computers and sound cards are bundled with speech-recognition and text-to-speech engines. As an alternative, many engine vendors offer retail packages for speech recognition or text-to-speech, and some license copies of their engines.

For a list of engine vendors that support the Speech API, see the ENGINE.DOC file included with the Speech Software Development Kit.
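
The requirements above can be collected into a simple preflight check that an installer or setup script runs before enabling speech features. The following is a minimal sketch, not part of the Speech API: the SystemInfo type, its fields, and the unmet_requirements function are hypothetical names chosen for illustration, and the caller is assumed to have gathered the values by some other means. Two simplifications are assumptions rather than statements from this document: processor speed is compared as a (family, MHz) pair, and Windows NT versions later than 3.5 are treated as acceptable.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Thresholds taken from the requirements listed in this section.
MIN_CPU = (4, 66)        # (family, MHz): a 486/66 or faster processor
SPEECH_RAM_MB = 2        # average extra RAM for recognition plus text-to-speech
MIN_NT_VERSION = (3, 5)  # Windows NT version 3.5


@dataclass
class SystemInfo:
    """Hypothetical description of the target machine (illustrative only)."""
    cpu_family: int              # 3 for 386, 4 for 486, 5 for Pentium
    cpu_mhz: int
    free_ram_mb: float           # RAM available before the application starts
    app_ram_mb: float            # RAM the application itself needs
    os_name: str                 # "Windows 95" or "Windows NT"
    os_version: Tuple[int, int]  # for example (3, 5) for Windows NT 3.5
    has_telephony_card: bool
    has_sr_engine: bool          # speech-recognition engine installed
    has_tts_engine: bool         # text-to-speech engine installed


def unmet_requirements(info: SystemInfo) -> List[str]:
    """Return a list of unmet requirements; an empty list means all are met."""
    problems = []

    # Crude speed proxy (assumption): compare family first, then clock rate.
    if (info.cpu_family, info.cpu_mhz) < MIN_CPU:
        problems.append("processor slower than 486/66")

    # Speech needs about 2 MB on top of the application's own memory.
    if info.free_ram_mb < info.app_ram_mb + SPEECH_RAM_MB:
        problems.append("not enough free RAM for application plus speech")

    if not info.has_telephony_card:
        problems.append("no telephony card")

    # Assumption: NT versions later than 3.5 are also accepted.
    if info.os_name == "Windows 95":
        os_ok = True
    elif info.os_name == "Windows NT":
        os_ok = info.os_version >= MIN_NT_VERSION
    else:
        os_ok = False
    if not os_ok:
        problems.append("unsupported operating system")

    if not info.has_sr_engine:
        problems.append("no speech-recognition engine installed")
    if not info.has_tts_engine:
        problems.append("no text-to-speech engine installed")

    return problems
```

An application could call unmet_requirements once at startup and fall back to touch-tone input or disable speech features when the returned list is not empty.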