Vendor Specific Parameters

The MRCP specifications define a set of headers that allow the client to adjust vendor-specific parameters. These headers may be sent in the SET-PARAMS and GET-PARAMS methods.

The following parameters are platform-specific extensions to the MRCP specification. They can be controlled via the media_server.conf configuration file, located by default at mrcp-api/docker/lumenvox/media_server.conf.

They may also be set with the appropriate header as part of a RECOGNIZE or SET-PARAMS method; see Specifying Vendor-Specific Properties via MRCP Headers below.

See Configuration Parameters for more information about changing various MRCP parameters.

wind-back-time

The length of audio to wind back when the beginning of speech is detected.

This helps when speech onset is weak. The value is specified in milliseconds with a resolution of 40 ms, so it is rounded to the closest multiple of 40 ms: setting it to 139 ms is the same as setting it to 120 ms, and setting it to 141 ms is the same as setting it to 160 ms.
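The rounding rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the platform's implementation. The function name is hypothetical, and the handling of an exact midpoint such as 140 ms (rounded up here) is an assumption, since the text does not specify it.

```python
STEP_MS = 40  # resolution stated in the text


def effective_wind_back_ms(requested_ms: int) -> int:
    """Round a requested wind-back-time to the closest multiple of 40 ms.

    Assumption: exact midpoints (for example 140) round up. The
    documentation does not say how ties are resolved.
    """
    if requested_ms <= 0:
        raise ValueError("wind-back-time must be greater than 0")
    # Adding half a step before integer division rounds to the nearest step.
    return (requested_ms + STEP_MS // 2) // STEP_MS * STEP_MS


print(effective_wind_back_ms(139))  # 120
print(effective_wind_back_ms(141))  # 160
print(effective_wind_back_ms(480))  # 480 (the default is already a multiple of 40)
```

A helper like this can be useful during tuning to confirm that two candidate values are in fact equivalent before testing both.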

Range: >0

Default: 480

snr-sensitivity-lvl

This setting controls the minimum SNR that streamed audio must have before it is examined for speech. Audio below this threshold is automatically assumed to be silence or noise. The noise estimate for the calculation is obtained from the initial silence specified by STREAM_PARM_VAD_STREAM_INIT_DELAY.

The higher the value, the harder it is to barge in. The default value of 50 equals 5 dB SNR, and the parameter range maps to SNR values from 3.5 dB to 20 dB.

If the application is expected to run in a very noisy environment where speech is not much louder than the background, this setting may need to be lowered. If speech is expected to be much louder than the surrounding noise, raising this value allows the VAD to ignore lower-volume background speech or babble noise that might otherwise cause barge-in.

Note that this parameter can be set in the range 0-100, with higher values (closer to 100) being more sensitive to barge-in in noisy situations with low SNR (where speech and background noise are similar).

Range: 0-100

Note that this vendor-specific setting (where 100 is most sensitive) should not be confused with the similar MRCP Sensitivity-Level header setting, which affects the STREAM_PARM_VAD_VOLUME_SENSITIVITY setting in the API.

Default: 50

vad-stream-init-delay

The length of audio (in milliseconds) that the VAD module uses to estimate the acoustic environment.

Accurate VAD depends on good estimation of the acoustic environment. The VAD module uses the first couple of frames of audio to estimate the acoustic environment, such as noise level. The length of this period is defined by this parameter.

Range: >0

Default: 100

vad-bargein-threshold

VAD speech sensitivity setting.

A higher value makes the VAD stricter: it requires more certainty that the audio is speech before triggering barge-in. Raising the value rejects more false positives and noise, but it may also cause some borderline speech to be rejected. This value should not be changed from the default without significant tuning and verification.

Range: 0-100 (MRCPv1 and MRCPv2)

Default: 50

compatibility_mode

Enables compatibility encoding of results.

This option may need to be enabled to match the output of decodes with those of other vendors.

Contact Technical Support for more specific details.

Default: 0

end-of-speech-timeout

Controls the end-of-speech timeout setting.

This value sets the underlying STREAM_PARM_END_OF_SPEECH_TIMEOUT parameter of the speech port that is used in an MRCP ASR recognition session.

After barge-in, the streaming interface flags STREAM_STATUS_END_SPEECH_TIMEOUT if it has not detected end-of-speech within the time specified by this property. This is different from the end-of-speech delay: STREAM_PARM_END_OF_SPEECH_TIMEOUT is the total amount of time a caller has to speak after barge-in is detected.

Default: -1 (infinite)

secure_context

Enables suppression of potentially sensitive ASR data.

When enabled, this option prevents logging of any potentially sensitive data to either log files or call data files, which includes any associated audio segments. Where potentially sensitive data would have appeared, the word _SUPPRESSED will replace it to indicate that suppression occurred.

Possible Values:

  • 0 - Disabled. Normal logging will be performed.
  • 1 - Secure Context mode enabled. Sensitive data will be suppressed.

Default: 0

tts.secure_context

Enables suppression of potentially sensitive TTS data.

When enabled, this option prevents logging of any potentially sensitive data to either log files or call data files, which includes any associated audio segments. Where potentially sensitive data would have appeared, the word _SUPPRESSED will replace it to indicate that suppression occurred.

Possible Values:

  • 0 - Disabled. Normal logging will be performed.
  • 1 - Secure Context mode enabled. Sensitive data will be suppressed.

Default: 0

enable-sre-logging

This parameter has been deprecated. Please see data archiving documentation for current guidance.

callsre-prefix

Allows the addition of a custom string prefix to the beginning of the Response File filename for the current session.

When specified, this option adds the specified prefix to Response Files generated for the current session. This may be useful when identifying certain specific calls, such as those belonging to a particular application or customer-controlled category.

Note that the callsre-prefix and callsre-suffix options are independent of each other, so they can be used individually, together, or not at all, as needed.

Possible Values:

  • A string containing valid filename characters (avoid reserved characters)

Default: (none)

callsre-suffix

Allows the addition of a custom string suffix to the end of the Response File filename for the current session.

Similar to callsre-prefix, when specified, this option adds the specified suffix to Response Files generated for the current session. This may be useful when identifying certain specific calls, such as those belonging to a particular application or customer-controlled category.

Note that the callsre-prefix and callsre-suffix options are independent of each other, so they can be used individually, together, or not at all, as needed.

Possible Values:

  • A string containing valid filename characters (avoid reserved characters)

Default: (none)
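Because callsre-prefix and callsre-suffix become part of a filename, a client may want to check the strings before sending them. The following Python sketch shows one way to do that. The function name and the reserved-character set (characters reserved on Windows, path separators, and control characters) are assumptions made for illustration. The exact set of characters the platform accepts is not stated here.

```python
import re

# Assumption: treat characters reserved on common filesystems as unsafe.
# This set is illustrative and is not taken from the platform documentation.
_RESERVED = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def is_safe_filename_fragment(fragment: str) -> bool:
    """Return True if the string looks safe to use as a prefix or suffix."""
    # An empty string is rejected because it would add nothing to the filename.
    return bool(fragment) and _RESERVED.search(fragment) is None


print(is_safe_filename_fragment("billing_"))   # True
print(is_safe_filename_fragment("app/one"))    # False (contains a path separator)
```

A check of this kind belongs on the client side, before the value is placed in the MRCP header.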

logging-verbosity

Controls the logging verbosity within the current session.

When used, this option allows users to override the default LOGGING_VERBOSITY setting in client_property.conf, which controls the verbosity of log messages. This allows users to independently control the amount of logging generated for individual sessions. It can even be used to control interactions and requests within a single session so that, for example, only certain recognition requests are logged with a specified verbosity.

Possible Values (same as LOGGING_VERBOSITY setting):

  • 1 - Minimal Logging - only errors and critical issues
  • 2 - Medium Logging - all non-debug information as events occur
  • 3 - Debug Logging - all types of events, including information and debugging activity
  • 4 and higher - Additional diagnostic detail, typically used for troubleshooting with Technical Support.

Default: (as specified by LOGGING_VERBOSITY setting)

sticky-save-waveform

Allows overriding the platform's default SAVE-WAVEFORM setting.

Possible Values:

  • True - Regardless of the save-waveform header value, the save-waveform option will be set to true for the remainder of the MRCP session.
  • False - Regardless of the save-waveform header value, the save-waveform option will be set to false for the remainder of the MRCP session.

Default: (none)


Specifying Vendor-Specific Properties via MRCP Headers

The parameters above may be specified in the MRCP Vendor-Specific header, using the following format. Note that a semicolon (";") is used as the delimiter between parameters:

Vendor-Specific: com.lumenvox.wind-back-time=300;com.lumenvox.vad-stream-init-delay=200

This header field may be specified in RECOGNIZE, recognizer SET-PARAMS, or synthesizer SET-PARAMS methods during an MRCP session. The following parameter names may be used within it:

com.lumenvox.wind-back-time
com.lumenvox.snr-sensitivity-lvl
com.lumenvox.vad-stream-init-delay
com.lumenvox.vad-bargein-threshold
com.lumenvox.compatibility-mode
com.lumenvox.end-of-speech-timeout
com.lumenvox.secure_context
com.lumenvox.tts.secure_context
com.lumenvox.enable-sre-logging
com.lumenvox.callsre-prefix
com.lumenvox.callsre-suffix
com.lumenvox.logging-verbosity
com.lumenvox.sticky-save-waveform
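As an illustration of the format described above, the following Python sketch assembles a Vendor-Specific header line from a dictionary of settings. The function name and the validation rules are hypothetical. The parameter names are copied from the list above, and only the semicolon-delimited name=value layout comes from this article.

```python
from typing import Mapping

# Parameter names copied from the list above (without the com.lumenvox. prefix).
KNOWN_PARAMETERS = {
    "wind-back-time", "snr-sensitivity-lvl", "vad-stream-init-delay",
    "vad-bargein-threshold", "compatibility-mode", "end-of-speech-timeout",
    "secure_context", "tts.secure_context", "enable-sre-logging",
    "callsre-prefix", "callsre-suffix", "logging-verbosity",
    "sticky-save-waveform",
}


def build_vendor_specific_header(params: Mapping[str, object]) -> str:
    """Build a 'Vendor-Specific:' header line from short parameter names."""
    if not params:
        raise ValueError("at least one parameter is required")
    pairs = []
    for name, value in params.items():
        if name not in KNOWN_PARAMETERS:
            raise ValueError(f"unknown parameter: {name}")
        text = str(value)
        # Assumption: reject values that would break the delimiter or the line.
        if any(ch in text for ch in ";\r\n"):
            raise ValueError(f"invalid character in value for {name}")
        pairs.append(f"com.lumenvox.{name}={text}")
    # A semicolon separates the individual parameters.
    return "Vendor-Specific: " + ";".join(pairs)


print(build_vendor_specific_header(
    {"wind-back-time": 300, "vad-stream-init-delay": 200}
))
```

With the two settings shown, the sketch produces the same header line as the example earlier in this section. The resulting line would then be included among the headers of a RECOGNIZE or SET-PARAMS request by whatever MRCP client library is in use.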
