Capturing DTMF with Asterisk and UniMRCP

A common question when getting started is how to capture DTMF key presses when using speech recognition with Asterisk. There are two ways to do this when working with the res_unimrcp.so module and the MRCPRecog() and SynthAndRecog() applications.

Using DTMF Grammars

The recommended approach is to let the platform parse DTMF key presses against DTMF grammars. This allows the dialplan code to handle speech and DTMF in the same way. A common requirement is to support menus like "Say yes or press 1, otherwise say no or press 2." In this case, the application should treat the spoken word "yes" as equivalent to the key press 1. This is accomplished by writing separate speech and DTMF grammars that return the same output, loading them both, and then performing a recognition with i=none. Setting i=none tells UniMRCP and Asterisk to send the DTMF digits to the platform for processing.

Here are the two grammars required for this application:

YesNoSpeech.gram

#ABNF 1.0 UTF-8;
language en-US;
mode voice;
tag-format <semantics/1.0>;

root $rootrule;
$yes = yes {out = "yes"};
$no = no {out = "no"};
$rootrule = $yes | $no;

YesNoDTMF.gram

#ABNF 1.0 UTF-8;
language en-US;
mode dtmf;
tag-format <semantics/1.0>;

root $rootrule;
$yes = 1 {out = "yes"};
$no = 2 {out = "no"};
$rootrule = $yes | $no;

Whether a user says "yes" or presses 1, the matching grammar returns the string "yes", so the dialplan can branch on a single value:

DTMF and Speech Dialplan

exten => s,n,MRCPRecog(YesNoSpeech.gram,YesNoDTMF.gram,p=default&i=none)
exten => s,n,GotoIf($[ "${RECOG_INSTANCE(0/0)}" = "yes"]?Yes:No)
exten => s,n(Yes),Verbose(1,The user said yes.)
exten => s,n,Hangup
exten => s,n(No),Verbose(1,The user said no.)
exten => s,n,Hangup

Now, whether the user says "yes" or presses 1, execution jumps to the priority labeled Yes. The same pattern works for any mixed DTMF/speech dialog.
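
The dialplan above assumes that every recognition succeeds. As a minimal sketch of handling the no-match and no-input cases, the fragment below checks the RECOG_COMPLETION_CAUSE channel variable before branching. It assumes the installed UniMRCP applications set that variable to the MRCP completion cause (000 for a successful recognition); the matched and failed labels are hypothetical names chosen for illustration.

DTMF and Speech Dialplan with Completion Check

exten => s,n,MRCPRecog(YesNoSpeech.gram,YesNoDTMF.gram,p=default&i=none)
; Assumption: RECOG_COMPLETION_CAUSE is 000 only when a grammar matched.
exten => s,n,GotoIf($[ "${RECOG_COMPLETION_CAUSE}" = "000"]?matched:failed)
exten => s,n(matched),GotoIf($[ "${RECOG_INSTANCE(0/0)}" = "yes"]?Yes:No)
exten => s,n(Yes),Verbose(1,The user said yes.)
exten => s,n,Hangup
exten => s,n(No),Verbose(1,The user said no.)
exten => s,n,Hangup
; Hypothetical label: reached on no-match, no-input timeout, or an error.
exten => s,n(failed),Verbose(1,No match. Completion cause: ${RECOG_COMPLETION_CAUSE})
exten => s,n,Hangup

Without a check like this, any failed recognition leaves RECOG_INSTANCE(0/0) empty and the caller falls through to the No branch.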

Using Built-in Dialplan DTMF Handling

Both MRCPRecog() and SynthAndRecog() support a parameter called i that specifies which DTMF digits, if any, the UniMRCP module hands to the dialplan instead of the speech platform. If i is set to either a string of digits or the word any, those digits are processed by the dialplan as normal, and the speech engine is not involved. Consider the following code:

exten => s,n,MRCPRecog(builtin:grammar/digits,p=default&i=any)
exten => s,n,Verbose(1,The user said ${RECOG_RESULT})
exten => 1,1,Verbose(1,The user pressed 1)

This allows a user to speak digits using the built-in digits grammar, but if the user presses 1, normal Asterisk dialplan functionality takes over, advancing the user to extension 1. However, there is a limitation in the current UniMRCP/Asterisk implementation that restricts it to collecting only one digit this way. For more than one digit, use the DTMF grammar approach described above.
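
As a rough sketch of that approach for multi-digit input, the speech digits grammar can be paired with a DTMF counterpart and run with i=none. The fragment assumes the speech platform provides a builtin:dtmf/digits grammar; that name follows the VoiceXML built-in grammar convention and is not confirmed here for any particular platform. A custom grammar declared with mode dtmf could be loaded the same way.

Multi-digit DTMF and Speech Dialplan

; Assumption: builtin:dtmf/digits is available on the speech platform.
exten => s,n,MRCPRecog(builtin:grammar/digits,builtin:dtmf/digits,p=default&i=none)
exten => s,n,Verbose(1,The user entered ${RECOG_INSTANCE(0/0)})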
