Developing Speech Apps on Asterisk

Asterisk has native speech recognition support through the module res_speech.so. This module provides a number of dialplan applications that can be used for speech recognition. However, res_speech.so does not handle the communication between Asterisk and an automatic speech recognizer (ASR).

To get that communication working, a second Asterisk module must be loaded. The recommended method is to use the open-source UniMRCP-Asterisk module, which uses the Media Resource Control Protocol (MRCP) to send speech recognition requests from Asterisk to the Capacity Private Cloud Media Server. In this setup, a module called res_speech_unimrcp.so links the native Asterisk speech API to the platform, translating dialplan calls into MRCP requests.

One shortcoming of this approach is that Asterisk has no native text-to-speech (TTS) support. To help solve this, the UniMRCP project includes another Asterisk module called res_unimrcp.so that offers dialplan applications providing TTS functionality. The res_unimrcp.so module also includes ASR functions that can replace or complement the native Asterisk speech API.

In summary, there are two broad approaches to using speech software on Asterisk with UniMRCP:

- Use res_speech_unimrcp.so with the native Asterisk Generic Speech API (res_speech.so). This will not give you access to TTS.
- Use res_unimrcp.so for TTS and/or ASR functionality.

Developers may also choose to combine these approaches. One important benefit of using the Asterisk Generic Speech API is that recognition results are generally returned as simple strings. In contrast, the res_unimrcp.so interface returns results in the Natural Language Semantic Markup Language (NLSML) format, which represents a complex object as XML. Application developers therefore need to parse an NLSML-formatted object to make use of the results of an ASR interaction. See the guide to NLSML support for more information on building NLSML parsers.
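To illustrate what parsing an NLSML result can involve, here is a minimal Python sketch using only the standard library. The sample document, the grammar name session:yesno, and the function names local_name and parse_nlsml are all hypothetical and chosen for illustration; real results may carry an XML namespace, several interpretations, or nested semantic content inside the instance element, so treat this as a starting point rather than a complete parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical NLSML result, made up for illustration only.
# Real results from a recognizer may differ in structure and attributes.
SAMPLE_NLSML = """<?xml version="1.0"?>
<result>
  <interpretation grammar="session:yesno" confidence="0.87">
    <instance>yes</instance>
    <input mode="speech">yes please</input>
  </interpretation>
</result>"""


def local_name(tag):
    """Strip any '{namespace}' prefix so lookups work with or without xmlns."""
    return tag.rsplit("}", 1)[-1]


def parse_nlsml(xml_text):
    """Return a list of dicts, one per <interpretation> element."""
    root = ET.fromstring(xml_text)
    results = []
    for node in root.iter():
        if local_name(node.tag) != "interpretation":
            continue
        entry = {"confidence": None, "input": None, "instance": None}
        conf = node.get("confidence")
        if conf is not None:
            entry["confidence"] = float(conf)
        for child in node:
            name = local_name(child.tag)
            if name in ("input", "instance"):
                # Join all nested text so simple and structured instances both yield a string.
                entry[name] = "".join(child.itertext()).strip()
        results.append(entry)
    return results


if __name__ == "__main__":
    for interpretation in parse_nlsml(SAMPLE_NLSML):
        print(interpretation)
```

Matching on the local tag name instead of a fixed namespace is a deliberate simplification: it lets the same function handle results with or without an xmlns declaration, at the cost of ignoring which namespace an element belongs to.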

The following information on building speech applications for Asterisk is available:

res_unimrcp.so
- MRCPRecog() — Speech recognition over MRCP.
- MRCPSynth() — TTS over MRCP.
- SynthAndRecog() — Combined TTS output while listening for speech (ASR) input.
