CPA & AMD Integration - gRPC

This article covers all gRPC-based integration scenarios for Capacity Private Cloud CPA and AMD. gRPC does not use grammars — settings are configured directly in protobuf messages. Message playback is handled separately by the application's telephony platform or via a separate TTS request to the gRPC API. For core concepts, see CPA & AMD Overview. For MRCP integration, see CPA & AMD: MRCP Integration.

gRPC session architecture: Audio is streamed into a session representing the phone call. CPA and AMD run as two parallel interactions on the same session, both processing the same audio stream. AudioPush begins before InteractionCreate requests are sent.
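The ordering described above can be sketched as follows. This is a minimal illustration, not the actual client API: `SessionSketch` and its methods are hypothetical stand-ins that only record the order of messages, and a real client would use the stubs generated from the product's protobuf definitions.

```python
from dataclasses import dataclass, field


@dataclass
class SessionSketch:
    """Hypothetical in-memory stand-in for a gRPC session; it only records message order."""
    sent: list = field(default_factory=list)
    audio_started: bool = False

    def session_create(self) -> None:
        self.sent.append("SessionCreate")

    def audio_push(self, chunk: bytes) -> None:
        self.audio_started = True
        self.sent.append("AudioPush")

    def interaction_create(self, kind: str) -> None:
        # Per the article, audio streaming begins before interactions are created.
        if not self.audio_started:
            raise RuntimeError("start AudioPush before InteractionCreate")
        self.sent.append(f"InteractionCreate({kind})")


def start_call(session: SessionSketch, first_chunk: bytes) -> list:
    """Open the session, start audio, then create the two parallel interactions."""
    session.session_create()
    session.audio_push(first_chunk)
    session.interaction_create("CPA")  # both interactions process
    session.interaction_create("AMD")  # the same audio stream
    return session.sent
```

Enforcing the ordering inside `interaction_create` is a design choice of this sketch; it makes an out-of-order request fail early in tests rather than silently during a call.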

gRPC vs MRCP — Key Differences

While MRCP and gRPC achieve the same detection outcomes, they differ significantly in session management, settings configuration, and how results are returned.

| Aspect | MRCP | gRPC |
|---|---|---|
| Grammar required | Yes: grammar <meta> tags configure CPA and AMD | No: 2 parallel interactions configured in protobuf fields |
| Message playback | MRCP SPEAK command on the same session | Separate: via telephony platform or a TTS request to the gRPC API |
| Session management | SIP session lifecycle | gRPC call session lifecycle (SessionCreate / SessionClose) |
| Parallel CPA + AMD | Both grammars in a single RECOGNIZE request | 2 separate InteractionCreate requests run in parallel on the same session |
| Real-time results | RECOGNITION-COMPLETE event | InteractionResult on gRPC stream |

Message Delivery

Purpose: Same as MRCP message delivery, but using the gRPC Interaction API. No grammars are needed; settings are configured directly in protobuf messages.

Requires: CPA + AMD
Protocol: gRPC (Capacity Private Cloud Interaction API)

Process Flow


gRPC Message Sequence

The following gRPC message sequence illustrates the interaction between the client application and the LumenVox gRPC service during a message delivery workflow. Unlike MRCP, audio streaming begins before interactions are created, and CPA and AMD run as separate parallel interactions rather than grammars within a single request.



gRPC Agent Connection

Purpose: Same as MRCP agent connection, but using the gRPC Interaction API. Connect a live agent to a human caller; hang up on machines.

Requires: CPA + AMD
Protocol: gRPC
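The purpose stated above (connect humans, hang up on machines) can be sketched as a small decision function. This is an illustration under assumptions: the function name and the `CONNECT_AGENT` / `HANG_UP` action labels are hypothetical, the result strings are the ones used elsewhere in this article, and treating every non-human result as a machine is a simplification.

```python
from typing import Optional

# CPA results that this article treats as a live human.
HUMAN_RESULTS = {"HUMAN RESIDENCE", "HUMAN BUSINESS"}


def agent_connection_action(cpa_result: Optional[str],
                            amd_result: Optional[str] = None) -> str:
    """Map the results of the two parallel interactions to an action (hypothetical labels)."""
    # Any AMD tone result (for example BEEP, FAX, SIT, BUSY) means no live human.
    if amd_result is not None:
        return "HANG_UP"
    if cpa_result in HUMAN_RESULTS:
        return "CONNECT_AGENT"
    # UNKNOWN SPEECH, UNKNOWN SILENCE, or no result: treated as a machine in this sketch.
    return "HANG_UP"
```

The telephony platform, not the gRPC API, would carry out the resulting action.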

Process Flow

gRPC Message Sequence — Agent Connection

The following gRPC message sequence illustrates the interaction between the client application and the LumenVox gRPC service during an agent connection workflow. Unlike MRCP, audio streaming begins before interactions are created, and CPA and AMD run as separate parallel interactions rather than grammars within a single request.


Apple Call Screening

Minimum version: Capacity Private Cloud 7.0

The logical flow for Apple Call Screening is identical to the MRCP version. The implementation uses the gRPC Interaction API with sessions and interactions instead of MRCP grammars.

Purpose: Detect Apple Call Screening on outbound calls using the gRPC API, deliver a screening payload, and handle the three possible outcomes: call dropped, voicemail, or human answered.

Requires: CPA + AMD (sequential interactions on a gRPC session)
Protocol: gRPC (Capacity Private Cloud Interaction API)

Audio is streamed into a session via AudioPush before interactions are created. Settings are configured directly in protobuf fields rather than grammar meta tags. Message playback is handled separately by the application's telephony platform or via a separate TTS request to the gRPC API.

gRPC vs MRCP — Apple Call Screening Differences

While MRCP and gRPC achieve the same detection outcomes, they differ significantly in how sessions are managed, how settings are configured, and how results are returned. The table below highlights the key differences between the two protocols for the Apple Call Screening workflow.

| Aspect | MRCP | gRPC |
|---|---|---|
| Screening announcement detection | CPA grammar returns UNKNOWN SPEECH | InteractionCreateRequest with detection_mode=CPA returns UNKNOWN SPEECH |
| Prompt End detection | cpa_prompt_end.grxml grammar | InteractionCreateRequest with cpa_prompt_end_enable=true. If enabled, all other CPA settings will be ignored. |
| Screening tone detection | amd_screening_tone.grxml grammar | InteractionCreateRequest with screening_tone_enable=true |
| Wait for answer | cpa_unknown_silence_timeout.grxml grammar | InteractionCreateRequest with cpa_unknown_silence_timeout=30000 |
| Message playback | MRCP SPEAK command on same session | Separate: via telephony platform or TTS request to gRPC API |
| Session management | SIP session lifecycle | gRPC call session lifecycle (SessionCreate / SessionClose) |
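The per-step settings from the gRPC column can be collected in one place, as in the sketch below. The setting names and values come from the table; the step keys, the function name, and the flat dictionary shape are assumptions for illustration, since the actual nesting of these fields inside the protobuf request is not shown in this article.

```python
def screening_step_settings(step: str) -> dict:
    """Return the settings this article associates with each Apple Call Screening step.

    Step keys are hypothetical labels; values mirror the gRPC column of the table.
    """
    steps = {
        "announcement": {"detection_mode": "CPA"},
        # With prompt end detection enabled, all other CPA settings are ignored,
        # so this sketch sends nothing else for that step.
        "prompt_end": {"cpa_prompt_end_enable": True},
        "screening_tone": {"screening_tone_enable": True},
        # 30000 ms = 30 seconds of silence before the CPA interaction gives up.
        "wait_for_answer": {"cpa_unknown_silence_timeout": 30000},
    }
    if step not in steps:
        raise ValueError(f"unknown step: {step!r}")
    return dict(steps[step])  # copy, so callers cannot mutate the defaults
```

A caller would copy these values into the corresponding fields of the generated request type when creating each interaction.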

Process Flow


gRPC Message Sequence — Apple Call Screening

The following gRPC message sequence illustrates the interaction between the client application and the LumenVox gRPC service during an Apple Call Screening workflow. Unlike MRCP, audio streaming begins before interactions are created, and each detection step is a separate interaction on the same session rather than a grammar within a RECOGNIZE request.

Decision Points

| Decision | Condition | Action |
|---|---|---|
| UNKNOWN SPEECH detected | CPA InteractionResult returns UNKNOWN SPEECH | Apple screening announcement detected. Create new CPA interaction with prompt end detection enabled. |
| PROMPT END detected | CPA InteractionResult returns PROMPT END | Announcement complete. Deliver call screening payload via telephony platform or TTS gRPC request. |
| SCREENING tone detected | AMD InteractionResult returns SCREENING | Apple warble tone confirmed. Create CPA interaction with 30000 ms silence timeout and AMD tone detection interaction in parallel. |
| Call Dropped (Result A) | VAD_EVENT_TYPE_BARGE_IN_TIMEOUT returned | Apple user declined. Close interactions and session. End call. |
| Voicemail (Result B) | CPA returns UNKNOWN SPEECH, then AMD returns BEEP | Call went to voicemail. Deliver message via telephony platform. Close interactions and session. |
| Human Answered (Result C) | CPA returns HUMAN RESIDENCE or HUMAN BUSINESS | Apple user accepted. Deliver message or bridge to agent via telephony platform. Close interactions and session. |
| Error | CPA returns UNKNOWN SILENCE or other unexpected result | Close interactions and session. Log error. End call. |
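The decision points above can be read as a small state machine. The sketch below is one possible encoding, not the product's implementation: the state names are hypothetical, the event strings are the result values from the table, and any transition not listed falls through to ERROR. That fallthrough is a simplification; for instance, a call that is not screened at all would reach ERROR here, whereas a real application would route it to its standard CPA and AMD handling.

```python
# (state, event) -> next state. State names are hypothetical labels for this sketch.
TRANSITIONS = {
    ("DETECT_ANNOUNCEMENT", "UNKNOWN SPEECH"): "WAIT_PROMPT_END",
    ("WAIT_PROMPT_END", "PROMPT END"): "WAIT_SCREENING_TONE",   # deliver payload here
    ("WAIT_SCREENING_TONE", "SCREENING"): "WAIT_FOR_ANSWER",    # CPA 30000 ms + AMD in parallel
    ("WAIT_FOR_ANSWER", "VAD_EVENT_TYPE_BARGE_IN_TIMEOUT"): "CALL_DROPPED",  # Result A
    ("WAIT_FOR_ANSWER", "UNKNOWN SPEECH"): "WAIT_FOR_BEEP",
    ("WAIT_FOR_BEEP", "BEEP"): "VOICEMAIL",                     # Result B
    ("WAIT_FOR_ANSWER", "HUMAN RESIDENCE"): "HUMAN_ANSWERED",   # Result C
    ("WAIT_FOR_ANSWER", "HUMAN BUSINESS"): "HUMAN_ANSWERED",    # Result C
}

TERMINAL = {"CALL_DROPPED", "VOICEMAIL", "HUMAN_ANSWERED", "ERROR"}


def next_state(state: str, event: str) -> str:
    """Return the next state; anything not in the table is treated as an error."""
    return TRANSITIONS.get((state, event), "ERROR")


def run(events, state: str = "DETECT_ANNOUNCEMENT") -> str:
    """Feed results in arrival order and stop at the first terminal state."""
    for event in events:
        state = next_state(state, event)
        if state in TERMINAL:
            break
    return state
```

For example, feeding `["UNKNOWN SPEECH", "PROMPT END", "SCREENING", "UNKNOWN SPEECH", "BEEP"]` to `run` walks the voicemail path. In every terminal state the application would close the interactions and the session, as the table describes.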

Timing Considerations

  • Initial CPA/AMD detection: The first interaction typically returns UNKNOWN SPEECH within 3–5 seconds, while the Apple screening announcement plays.
  • Prompt End detection: Returned promptly once the Apple announcement speech concludes.
  • Screening tone detection: The warble tone is typically detected within a few seconds of the payload completing.
  • Wait for answer: A cpa_unknown_silence_timeout of 30000ms (30 seconds) is recommended to allow sufficient time for the Apple user to decide. Adjust based on observed user response times for your campaign.



End-to-end campaign flow

This diagram combines all use cases into a single comprehensive flow showing the complete lifecycle of an outbound call — from pre-answer tone detection through post-answer classification to final call disposition.


Timing and Configuration Quick Reference

The table below provides a quick reference for how each detection feature behaves at runtime, including what triggers it to stop listening and when results are returned. This is useful for estimating detection latency and configuring appropriate timeouts.

| Feature | Trigger to Stop | Returns When |
|---|---|---|
| CPA | Speech ends + VAD_EOS_DELAY, or silence timeout | Speech classified OR timeout |
| AMD (Beep) | BEEP tone detected | Beep detected OR BARGE_IN_TIMEOUT |
| AMD (Fax) | FAX tone detected | Immediately on detection |
| AMD (SIT) | SIT tones detected | Immediately on detection (1–2 seconds) |
| AMD (Busy) | BUSY tone detected | Immediately on detection |
