CPA & AMD Integration Overview
This article introduces the core concepts behind Capacity Private Cloud Call Progress Analysis (CPA) and Answering Machine Detection (AMD) — two complementary technologies that work together to classify outbound call outcomes and optimize campaign performance. For implementation details, see the companion articles on MRCP Integration and gRPC Integration.
CPA vs AMD — What Each Does
CPA and AMD serve different but complementary roles in outbound call processing. CPA uses Voice Activity Detection (VAD) to measure speech duration and classify who answered the call. AMD uses Digital Signal Processing (DSP) to detect specific tones such as beeps, fax signals, and network errors.
| Feature | CPA (Call Progress Analysis) | AMD (Answering Machine Detection) |
| Detection Mode | DETECTION_MODE=CPA | DETECTION_MODE=Tone |
| Technology | VAD — measures speech duration | DSP — detects specific tones |
| What it detects | Human Residence, Human Business, Unknown Speech, Unknown Silence | Beep, Fax, SIT, Busy |
| When it's useful | Classifying who or what answered the call | Detecting tones before answer (SIT/Busy) and after answer (Beep/Fax) |
| Grammar example (MRCP) | CallProgressAnalysis.grxml | ToneDetection.grxml |
CPA Classifications
CPA classifies call outcomes based on how long the answering party speaks before pausing. The thresholds below determine which classification is returned. These values are configurable via grammar meta tags (MRCP) or protobuf fields (gRPC).
| Classification | Speech Duration | Meaning |
| HUMAN RESIDENCE | < 1800ms | Short greeting — "Hello?" |
| HUMAN BUSINESS | 1800ms – 3000ms | Longer greeting — "Thanks for calling XYZ, how may I help you?" |
| UNKNOWN SPEECH | > 3000ms | Likely answering machine or voicemail greeting |
| UNKNOWN SILENCE | No speech within 5000ms | No one speaking — possibly ringing with no answer |
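The threshold logic in the table above can be sketched as a small function. This is an illustrative sketch only, not the product's implementation: the function name `classify_cpa` is hypothetical, and treating exactly 3000ms as Human Business is an assumption, since the table does not say which side of the boundary that value falls on.

```python
from typing import Optional

# Default thresholds from the table above, in milliseconds.
CPA_HUMAN_RESIDENCE_TIME = 1800
CPA_HUMAN_BUSINESS_TIME = 3000


def classify_cpa(
    speech_ms: Optional[int],
    residence_ms: int = CPA_HUMAN_RESIDENCE_TIME,
    business_ms: int = CPA_HUMAN_BUSINESS_TIME,
) -> str:
    """Map a measured speech duration to a CPA classification.

    speech_ms is None when no speech was heard before the silence timeout.
    """
    if speech_ms is None:
        return "UNKNOWN SILENCE"
    if speech_ms < residence_ms:
        return "HUMAN RESIDENCE"
    if speech_ms <= business_ms:  # assumption: upper boundary is inclusive
        return "HUMAN BUSINESS"
    return "UNKNOWN SPEECH"


print(classify_cpa(900))    # short "Hello?"
print(classify_cpa(2400))   # longer business greeting
print(classify_cpa(4500))   # likely voicemail greeting
print(classify_cpa(None))   # no speech before the timeout
```

Passing different `residence_ms` and `business_ms` values mirrors tuning the thresholds per campaign.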
AMD Tone Types
AMD detects specific audio tones using digital signal processing. Each tone type indicates a different call outcome and determines the appropriate next action.
| Tone | Meaning | Timing |
| BEEP | Answering machine ready for message | Post-answer, after voicemail greeting ends |
| FAX | Fax machine tones | Post-answer — usually immediate after answer |
| SIT | Special Information Tones (7 subtypes) | Pre-answer — network error or disconnected number |
| BUSY | Busy signal (400–680 Hz) | Pre-answer — configurable timeout (BARGE_IN_TIMEOUT) |
Key Timing Parameters
The timing parameters below control how CPA and AMD behave during detection. These values can be tuned per campaign to balance detection speed against accuracy. All parameters are configurable via grammar meta tags (MRCP) or protobuf fields (gRPC).
| Parameter | Default | Purpose |
| VAD_EOS_DELAY | 1200ms | Silence after speech before end-of-speech is triggered |
| CPA_HUMAN_RESIDENCE_TIME | 1800ms | Speech duration threshold for Human Residence classification |
| CPA_HUMAN_BUSINESS_TIME | 3000ms | Upper speech duration threshold for Human Business; longer speech is returned as Unknown Speech (likely machine) |
| CPA_UNKNOWN_SILENCE_TIMEOUT | 5000ms | No-speech timeout before returning Unknown Silence |
| BARGE_IN_TIMEOUT | Auto (CPA) | Maximum listening duration before forced timeout |
| VAD_STREAM_INIT_DELAY | 0ms | Background noise calibration period |
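To reason about how these parameters interact, it can help to hold them in one object and estimate when CPA would report. The sketch below is a rough model under stated assumptions, not the engine's actual behavior: the class `CpaTiming` and the function `estimated_cpa_result_ms` are hypothetical names, the rule that the residence threshold must be below the business threshold is an assumed sanity check, and the estimate only models the "speech ends + VAD_EOS_DELAY" and silence-timeout paths.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CpaTiming:
    """Defaults copied from the parameter table; all values in milliseconds."""
    vad_eos_delay: int = 1200
    cpa_human_residence_time: int = 1800
    cpa_human_business_time: int = 3000
    cpa_unknown_silence_timeout: int = 5000
    vad_stream_init_delay: int = 0

    def __post_init__(self) -> None:
        values = (
            self.vad_eos_delay,
            self.cpa_human_residence_time,
            self.cpa_human_business_time,
            self.cpa_unknown_silence_timeout,
            self.vad_stream_init_delay,
        )
        if any(v < 0 for v in values):
            raise ValueError("timing values must be non-negative")
        # Assumed sanity rule: the residence band must end before the business band.
        if self.cpa_human_residence_time >= self.cpa_human_business_time:
            raise ValueError("residence threshold must be below business threshold")


def estimated_cpa_result_ms(
    timing: CpaTiming, speech_start_ms: Optional[int], speech_ms: int = 0
) -> int:
    """Estimate when CPA reports, measured from the start of listening."""
    no_speech = speech_start_ms is None
    if no_speech or speech_start_ms >= timing.cpa_unknown_silence_timeout:
        # Silence timeout fires before any speech is heard.
        return timing.cpa_unknown_silence_timeout
    # Speech ends, then the end-of-speech delay must elapse.
    return speech_start_ms + speech_ms + timing.vad_eos_delay


defaults = CpaTiming()
print(estimated_cpa_result_ms(defaults, speech_start_ms=400, speech_ms=900))  # 2500
print(estimated_cpa_result_ms(defaults, speech_start_ms=None))                # 5000
```

Under this model, lowering VAD_EOS_DELAY shortens the time to a result by the same amount, at the cost of a higher chance that a mid-greeting pause is mistaken for end of speech.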
Pre-Answer vs Post-Answer Detection
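Understanding the difference between pre-answer and post-answer detection is fundamental to designing effective outbound call flows. Pre-answer, before the call is connected, only AMD tone detection applies (SIT and Busy), because CPA requires speech. Post-answer, CPA classifies who answered while AMD listens for Beep and Fax tones.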
Parallel CPA + AMD Architecture
CPA and AMD are designed to run in parallel on the same audio stream. Running them sequentially creates detection gaps where tones or speech could be missed. Both MRCP and gRPC support parallel processing.
Best practice: When a call flow uses both CPA and AMD, run them in parallel rather than one after the other, so that there is no detection gap between tone analysis and speech classification.
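The idea of two detectors consuming the same audio at the same time can be illustrated with a toy simulation. Everything here is an assumption made for illustration: frames are plain labels ("speech", "silence", "beep") rather than audio, the 20ms frame size is invented, the detectors `toy_cpa` and `toy_amd` are stand-ins that ignore VAD_EOS_DELAY, and none of the names correspond to the MRCP or gRPC APIs.

```python
import asyncio
from typing import Dict, List, Optional

FRAME_MS = 20  # assumed frame duration for this toy example


async def fan_out(frames: List[str], queues: List[asyncio.Queue]) -> None:
    """Copy every frame to every detector queue, then signal end of stream."""
    for frame in frames:
        for queue in queues:
            queue.put_nowait(frame)
        await asyncio.sleep(0)  # yield so the detectors can run between frames
    for queue in queues:
        queue.put_nowait(None)


async def toy_cpa(queue: asyncio.Queue) -> str:
    """Stand-in CPA: measure the first utterance, then apply the thresholds."""
    speech_ms = 0
    while True:
        frame = await queue.get()
        if frame is None:
            break
        if frame == "speech":
            speech_ms += FRAME_MS
        elif speech_ms > 0:
            break  # simplification: any non-speech frame ends the utterance
    if speech_ms == 0:
        return "UNKNOWN SILENCE"
    if speech_ms < 1800:
        return "HUMAN RESIDENCE"
    if speech_ms <= 3000:
        return "HUMAN BUSINESS"
    return "UNKNOWN SPEECH"


async def toy_amd(queue: asyncio.Queue) -> Optional[str]:
    """Stand-in AMD: report BEEP if a beep frame appears, otherwise None."""
    while True:
        frame = await queue.get()
        if frame is None:
            return None
        if frame == "beep":
            return "BEEP"


async def detect(frames: List[str]) -> Dict[str, Optional[str]]:
    cpa_queue: asyncio.Queue = asyncio.Queue()
    amd_queue: asyncio.Queue = asyncio.Queue()
    # Both detectors and the audio feed run concurrently on one event loop.
    cpa_result, amd_result, _ = await asyncio.gather(
        toy_cpa(cpa_queue), toy_amd(amd_queue), fan_out(frames, [cpa_queue, amd_queue])
    )
    return {"cpa": cpa_result, "amd": amd_result}


# A 4-second greeting, a short pause, then a beep.
voicemail = ["speech"] * 200 + ["silence"] * 10 + ["beep"]
result = asyncio.run(detect(voicemail))
print(result)
```

Because the AMD stand-in is already listening while the greeting plays, the beep is caught even though the CPA stand-in has finished by then. A sequential design that starts tone detection only after classification would risk missing a beep that arrives in between.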
MRCP: Single Session with Both Grammars
In MRCP, both the CPA and AMD grammars are included in a single RECOGNIZE request. The media server analyzes the audio against both grammars simultaneously. Whichever grammar matches first produces the result, which is delivered as a RECOGNITION-COMPLETE event.
gRPC: Single Session with Parallel Interactions
In gRPC, the architecture uses a session-based model. Audio is streamed into a session representing the phone call. Two parallel interactions — one for CPA and one for AMD — are created on the same session, both processing the same audio stream simultaneously. Audio streaming (AudioPush) begins before the interaction requests are sent.
When to Use CPA Only, AMD Only, or Both
Not every outbound call scenario requires both CPA and AMD. The table below provides guidance on which detection mode to use based on common campaign objectives.
| Scenario | Recommended Mode | Reason |
| Pre-answer tone filtering | AMD only | Only tone detection works pre-answer. CPA requires speech. |
| Human vs machine classification | CPA only | CPA handles speech duration analysis. AMD detects tones, not speech patterns. |
| Message delivery to humans and machines | Both | CPA classifies who answered. AMD detects beep for voicemail message timing. |
| Agent connection (humans only) | Both | CPA identifies humans. AMD catches fax/SIT to avoid wasting agent time. |
| Beep detection only (call already classified as machine) | AMD only | If another system already classified the call as a machine, just wait for the beep. |
| Full outbound dialing pipeline | Both | Pre-answer: AMD for SIT/Busy. Post-answer: CPA for classification + AMD for beep. |
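For the full pipeline in the last row, the choice of detectors per call stage can be written as a small lookup. This is a sketch for orientation only: `CallStage` and `detectors_for` are hypothetical names, and the result lists simply restate which outcomes this article associates with each stage.

```python
from enum import Enum
from typing import Dict, List


class CallStage(Enum):
    PRE_ANSWER = "pre-answer"
    POST_ANSWER = "post-answer"


def detectors_for(stage: CallStage) -> Dict[str, List[str]]:
    """Return which detectors to run at a stage and the outcomes each can report."""
    if stage is CallStage.PRE_ANSWER:
        # Only tone detection works before the call is answered.
        return {"AMD": ["SIT", "BUSY"]}
    # After answer, CPA and AMD run in parallel on the same audio.
    return {
        "CPA": ["HUMAN RESIDENCE", "HUMAN BUSINESS", "UNKNOWN SPEECH", "UNKNOWN SILENCE"],
        "AMD": ["BEEP", "FAX"],
    }


for stage in CallStage:
    print(stage.value, "->", sorted(detectors_for(stage)))
```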
Detection Trigger Reference
Each detection feature has a specific trigger that causes it to return a result. The table below summarizes what causes each feature to stop listening and report its finding.
| Feature | Trigger to Stop | Returns When |
| CPA | Speech ends + VAD_EOS_DELAY, or silence timeout | Speech classified or timeout |
| AMD (Beep) | BEEP tone detected | Beep detected or BARGE_IN_TIMEOUT expires |
| AMD (Fax) | FAX tone detected | Immediately on detection |
| AMD (SIT) | SIT tones detected | Immediately on detection (1–2 seconds) |
| AMD (Busy) | BUSY tone detected | Immediately on detection |
