Prometheus metrics available
Observability is a cornerstone of any production-grade speech platform. Capacity Private Cloud exposes Prometheus metrics from every microservice, giving operations teams visibility into system health, resource utilization, and request-level performance. Whether you are building Grafana dashboards, configuring alerting rules, or planning capacity, these metrics provide the telemetry you need.
The metrics below follow Prometheus conventions: counters track cumulative totals that only increase, gauges represent point-in-time values that rise and fall, histograms capture request duration and size distributions across configurable buckets, and gauge vectors expose labeled gauge values partitioned by response type or status.
A downloadable PDF reference is also available: Prometheus Metrics Reference (PDF)
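Because counters only increase, a drop between two scrapes means the service restarted and the counter began again from zero. The sketch below shows one way to turn raw counter samples into an increase and an average per-second rate while tolerating such resets. It is a simplified illustration, not a reimplementation of PromQL's `rate()`, which also extrapolates to the window edges. The sample values and the 15-second scrape interval are assumptions made for the example.

```python
from typing import Sequence, Tuple

# (unix_timestamp_seconds, counter_value)
Sample = Tuple[float, float]


def counter_increase(samples: Sequence[Sample]) -> float:
    """Total increase across samples, treating any drop as a counter reset."""
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        # A counter never decreases on its own; a lower value means the
        # process restarted and the counter began again from zero.
        increase += curr - prev if curr >= prev else curr
    return increase


def per_second_rate(samples: Sequence[Sample]) -> float:
    """Average per-second rate over the sampled window."""
    if len(samples) < 2:
        return 0.0
    elapsed = samples[-1][0] - samples[0][0]
    return counter_increase(samples) / elapsed if elapsed > 0 else 0.0


# Hypothetical scrapes of asr_total_asr_requests, 15 seconds apart.
# The third sample is lower than the second, simulating a restart.
scrapes = [(0.0, 100.0), (15.0, 130.0), (30.0, 10.0), (45.0, 40.0)]
total = counter_increase(scrapes)
rate = per_second_rate(scrapes)
```

Gauges need no such treatment: their current value is read directly, since they are expected to rise and fall.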
Deployment
| Measurement | Description | Type |
|---|---|---|
| deployment_active_count | Number of active deployments | gauge |
| deployment_total_responses_returned | Number of responses returned by the container | gauge vector |
| deployment_active_requests | Total number of active requests | gauge |
| deployment_average_request_process_time | Distribution of average request processing time | histogram |
| deployment_max_requests | Maximum number of simultaneous deployment requests | gauge |
| deployment_total_requests | Total number of deployment requests | counter |
VAD (Voice Activity Detection)
| Measurement | Description | Type |
|---|---|---|
| vad_audio_streams_current | Number of currently active VAD streams | gauge |
| vad_audio_streams_max | Maximum number of concurrent VAD streams | gauge |
| vad_audio_streams_total | Total number of completed VAD streams | counter |
| vad_audio_timeout_total | Total number of VAD stream timeouts | counter |
| vad_active_requests | Total number of active requests | gauge |
| vad_total_cpa_requests | Total number of CPA requests received | counter |
| vad_active_cpa_requests | Total number of active CPA requests | gauge |
| vad_average_cpa_request_process_time_dist | Distribution of average CPA request processing time | histogram |
| vad_total_cpa_responses_returned | Number of CPA responses returned | gauge vector |
| vad_total_amd_requests | Total number of AMD requests received | counter |
| vad_active_amd_requests | Total number of active AMD requests | gauge |
| vad_average_amd_request_process_time_dist | Distribution of average AMD request processing time | histogram |
| vad_total_amd_responses_returned | Number of AMD responses returned | gauge vector |
| vad_total_asr_requests | Total number of ASR requests received | counter |
| vad_active_asr_requests | Total number of active ASR requests | gauge |
| vad_average_asr_request_process_time_dist | Distribution of average ASR request processing time | histogram |
| vad_total_responses_returned | Number of ASR responses returned | gauge vector |
| vad_total_transcription_requests | Total number of transcription requests received | counter |
| vad_active_transcription_requests | Total number of active transcription requests | gauge |
| vad_average_transcription_request_process_time_dist | Distribution of average transcription request processing time | histogram |
| vad_total_transcription_responses_returned | Number of transcription responses returned | gauge vector |
| vad_stream_subscribe_duration | Histogram of latencies for Redis stream subscription | histogram |
| vad_transcoding_duration | Histogram of latencies for transcoding audio chunks | histogram |
| vad_processing_duration | Histogram of latencies for engine processing of audio chunks | histogram |
Session
| Measurement | Description | Type |
|---|---|---|
| session_total_requests | Total number of requests received | counter |
| session_active_requests | Total number of active requests | gauge |
| session_average_request_process_time_dist | Distribution of average request processing time | histogram |
| session_total_responses_returned | Number of responses returned by the container | gauge vector |
ASR (Automatic Speech Recognition)
| Measurement | Description | Type |
|---|---|---|
| asr_total_asr_requests | Total number of requests received | counter |
| asr_active_asr_requests | Total number of active requests | gauge |
| asr_active_europa_requests | Total number of active backend engine requests | gauge |
| asr_average_asr_request_process_time_dist | Distribution of average request processing time | histogram |
| asr_max_asr_requests | Maximum number of simultaneous ASR requests | gauge |
| asr_total_asr_responses_returned | Number of responses returned by the container | gauge vector |
| asr_total_grammar_requests | Total number of grammar load requests received | counter |
| asr_active_grammar_load_requests | Total number of active grammar load requests | gauge |
| asr_average_asr_stream_request_process_time_dist | Distribution of average stream request processing time | histogram |
| asr_max_concurrent_grammar_load_requests | Maximum number of simultaneously active grammar load requests | gauge |
| asr_average_grammar_load_request_process_time_dist | Distribution of average grammar load request processing time | histogram |
| asr_total_grammar_load_responses_returned | Number of grammar load responses returned by the container | gauge vector |
| asr_total_transcription_requests | Total number of transcription requests received | counter |
| asr_active_transcription_requests | Total number of active transcription requests | gauge |
| asr_total_transcription_responses_returned | Number of transcription responses returned | gauge vector |
| asr_average_asr_batch_request_process_time_dist | Distribution of average batch request processing time | histogram |
| asr_average_transcription_batch_request_process_time_dist | Distribution of average transcription batch request processing time | histogram |
| asr_average_transcription_stream_request_process_time_dist | Distribution of average transcription stream request processing time | histogram |
| asr_max_transcription_requests | Maximum number of simultaneous transcription requests | gauge |
| asr_max_active_grammars | Maximum number of simultaneously active grammars | gauge |
| asr_max_active_parses | Maximum number of simultaneous active SISR parses | gauge |
| asr_active_decodes | Number of decodes currently being processed | gauge |
| asr_active_grammars | Number of grammars currently being processed | gauge |
| asr_active_parses | Number of SISR parses currently being processed | gauge |
| asr_average_sisr_parse_text_request_process_time_dist | Distribution of average SISR parse request processing time | histogram |
| asr_sisr_parse_text_requests_total | Total number of SISR parse requests received | counter |
| asr_total_sisr_parse_text_responses_returned | Number of SISR parse responses returned | gauge vector |
| asr_total_ms_audio_pushed | Total milliseconds of audio pushed into ASR | counter |
| asr_cache_entries | Number of entries currently present in the ASR grammar cache | counter |
| asr_cache_size_bytes | Current size in bytes of the grammar cache | counter |
| asr_active_ms_audio_processing | Total milliseconds of audio currently being processed by ASR | gauge |
| asr_fine_tuned_results | ASR fine-tuned model results (if enabled) | gauge vector |
TTS (Text-to-Speech)
| Measurement | Description | Type |
|---|---|---|
| tts_total_requests | Total number of requests received | counter |
| tts_active_requests | Total number of active requests | gauge |
| tts_average_request_process_time_dist | Distribution of average request processing time | histogram |
| tts_total_responses_returned | Number of responses returned by the container | gauge vector |
| tts_average_pending_queue_time | Average time requests spend queued for processing | histogram |
| tts_max_queue_size_synthesis_requests_tts1 | Maximum number of simultaneous TTS1 synthesis requests | gauge |
| tts_active_queue_size_synthesis_requests_tts1 | Current number of simultaneous TTS1 synthesis requests | gauge |
| tts_max_pending_requests_tts1 | Maximum number of pending TTS1 synthesis requests | counter |
| tts_preprocess_load_cache_results | Used internally for testing | gauge |
| tts_postprocess_load_cache_results | Used internally for testing | gauge |
| tts_max_queue_size_synthesis_requests | Maximum TTS requests per container | gauge |
| tts_first_result_time_max | Maximum time between client making synthesis request and receiving first audio packet | histogram |
| tts_first_result_time_min | Minimum time between client making synthesis request and receiving first audio packet | histogram |
Resource Manager
| Measurement | Description | Type |
|---|---|---|
| resource_active_asr_installs | Number of ASR packages currently being installed | gauge |
| resource_asr_download_attempts_counter_total | Total number of ASR download attempts | counter |
| resource_asr_download_failure_counter_total | Total number of failed ASR downloads | counter |
| resource_asr_download_success_counter_total | Total number of successful ASR downloads | counter |
| resource_asr_language_packages_configured | Number of ASR packages configured for the system | gauge |
| resource_tts_active_installs | Number of TTS packages currently being installed | gauge |
| resource_tts_download_attempts_counter_total | Total number of TTS download attempts | counter |
| resource_tts_download_failure_counter_total | Total number of failed TTS downloads | counter |
| resource_tts_download_success_counter_total | Total number of successful TTS downloads | counter |
| resource_tts_voice_packages_configured | Number of TTS packages configured for the system | gauge |
| resource_active_vb_active_installs | Number of VB-Active packages currently being installed | gauge |
| resource_vb_active_download_attempts_counter_total | Total number of VB-Active download attempts | counter |
| resource_vb_active_download_failure_counter_total | Total number of failed VB-Active downloads | counter |
| resource_vb_active_download_success_counter_total | Total number of successful VB-Active downloads | counter |
| resource_vb_active_language_packages_configured | Number of VB-Active packages configured for the system | gauge |
| resource_dnn_active_installs | Number of DNN packages currently being installed | gauge |
| resource_dnn_download_attempts_counter_total | Total number of DNN download attempts | counter |
| resource_dnn_download_failure_counter_total | Total number of failed DNN downloads | counter |
| resource_dnn_download_success_counter_total | Total number of successful DNN downloads | counter |
| resource_dnn_voice_packages_configured | Number of DNN packages configured for the system | gauge |
| resource_itn_active_installs | Total number of ITN resource installs | counter |
| resource_itn_download_attempts_counter_total | Total number of ITN download attempts | counter |
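The attempt and failure counters above can be combined into a download failure ratio per package kind. The sketch below works on two snapshots of counter values taken at different times. The snapshot dictionaries, their values, and the helper name `download_failure_ratios` are hypothetical. ITN is left out because only its attempts counter is listed here. Counter resets between the two snapshots are not handled.

```python
from typing import Dict, Optional

# Package kinds that list both an attempts and a failure counter.
PACKAGE_KINDS = ("asr", "tts", "vb_active", "dnn")


def download_failure_ratios(
    before: Dict[str, float], after: Dict[str, float]
) -> Dict[str, Optional[float]]:
    """Failure ratio per package kind between two counter snapshots.

    Returns None for a kind with no download attempts in the window.
    """
    ratios: Dict[str, Optional[float]] = {}
    for kind in PACKAGE_KINDS:
        attempts_name = f"resource_{kind}_download_attempts_counter_total"
        failures_name = f"resource_{kind}_download_failure_counter_total"
        # Counters are cumulative, so the activity in the window is the delta.
        attempts = after.get(attempts_name, 0.0) - before.get(attempts_name, 0.0)
        failures = after.get(failures_name, 0.0) - before.get(failures_name, 0.0)
        ratios[kind] = failures / attempts if attempts > 0 else None
    return ratios


# Hypothetical snapshots; real values would come from two scrapes.
before = {
    "resource_asr_download_attempts_counter_total": 10.0,
    "resource_asr_download_failure_counter_total": 1.0,
    "resource_tts_download_attempts_counter_total": 4.0,
    "resource_tts_download_failure_counter_total": 0.0,
}
after = {
    "resource_asr_download_attempts_counter_total": 14.0,
    "resource_asr_download_failure_counter_total": 2.0,
    "resource_tts_download_attempts_counter_total": 4.0,
    "resource_tts_download_failure_counter_total": 0.0,
}
ratios = download_failure_ratios(before, after)
```

In this example the ASR ratio is 0.25 (one failure in four new attempts), and kinds with no new attempts report `None` rather than a misleading zero.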
Licensing
| Measurement | Description | Type |
|---|---|---|
| license_invalid_check_ops_total | Total number of unsuccessful license check events | counter |
| license_sync_fail_ops_total | Total number of unsuccessful license sync events | counter |
| license_sync_ops_total | Total number of attempted license sync events | counter |
| license_sync_success_ops_total | Total number of successful license sync events | counter |
| license_valid_check_ops_total | Total number of successful license check events | counter |
| license_valid_licences | Number of valid license deployments | gauge |
| license_invalid_licences | Number of invalid license deployments | gauge |
Configuration
| Measurement | Description | Type |
|---|---|---|
| configuration_total_requests | Total number of requests received | counter |
| configuration_active_requests | Total number of active requests | gauge |
| configuration_max_requests | Maximum number of configuration requests | gauge |
| configuration_average_request_process_time_dist | Distribution of average request processing time | histogram |
| configuration_total_responses_returned | Number of responses returned | gauge vector |
Binary Storage
| Measurement | Description | Type |
|---|---|---|
| binary_storage_total_requests | Total number of requests received | counter |
| binary_storage_active_requests | Total number of active requests | gauge |
| binary_storage_max_requests | Maximum number of binary storage requests | gauge |
| binary_storage_average_request_process_time_dist | Distribution of average request processing time | histogram |
Admin Portal
| Measurement | Description | Type |
|---|---|---|
| admin_portal_total_requests | Total number of admin portal requests | counter |
| admin_portal_average_request_process_time_dist | Distribution of average request processing time | histogram |
Archive
| Measurement | Description | Type |
|---|---|---|
| archive_total_requests | Total number of requests received | counter |
| archive_active_requests | Total number of active requests | gauge |
| archive_average_request_process_time_dist | Distribution of average request processing time | histogram |
| archive_total_responses_returned | Number of responses returned | gauge vector |
| archive_active_execution | Total number of active archive requests currently being executed | gauge |
| archive_requests_max | Maximum number of archive requests received | gauge |
| archive_total_execute | Total number of archive requests executed | counter |
Deployment Portal
| Measurement | Description | Type |
|---|---|---|
| deployment_portal_total_requests | Total number of deployment portal requests | counter |
| deployment_portal_active_requests | Total number of active deployment portal requests | gauge |
| deployment_portal_requests_max | Maximum number of deployment portal requests | gauge |
| deployment_portal_average_request_process_time_dist | Distribution of average request processing time | histogram |
| deployment_portal_total_responses_returned | Number of responses returned | gauge vector |
Reporting
| Measurement | Description | Type |
|---|---|---|
| reporting_total_requests | Total number of requests received | counter |
| reporting_active_requests | Total number of active requests | gauge |
| reporting_average_request_process_time_dist | Distribution of average request processing time | histogram |
| reporting_requests_max | Maximum number of reporting requests | gauge |
LumenVox API
| Measurement | Description | Type |
|---|---|---|
| lumenvox_api_total_requests | Total number of initial requests received | counter |
| lumenvox_api_total_requests_within_sessions | Total number of API requests within sessions | counter |
| lumenvox_api_active_requests | Total number of active requests | gauge |
| lumenvox_api_total_responses_returned | Number of responses returned by the container | gauge vector |
| lumenvox_api_rmq_messages_received | Number of LumenVox API RabbitMQ messages received | counter |
| lumenvox_api_rmq_messages_sent | Number of LumenVox API RabbitMQ messages sent | counter |
ITN (Inverse Text Normalization)
| Measurement | Description | Type |
|---|---|---|
| itn_request_times | ITN request times | histogram |
| itn_requests_current | Active ITN requests | gauge |
| itn_requests_max | Maximum number of simultaneous ITN requests | gauge |
| itn_requests | Total ITN requests | counter |
NLU (Natural Language Understanding)
| Measurement | Description | Type |
|---|---|---|
| nlu_average_request_process_time_dist | Distribution of average request processing time | histogram |
| nlu_active_requests | Total number of active requests | gauge |
| nlu_total_requests | Total number of initial requests received | counter |
| nlu_total_responses_returned | Number of responses returned by the container | gauge vector |
MRCP
| Measurement | Description | Type |
|---|---|---|
| mrcp_total_requests | Total number of calls (sessions) received | counter |
| mrcp_active_requests | Total number of active requests | gauge |
| mrcp_average_request_process_time_dist | Distribution of average request processing time | histogram |
| mrcp_total_responses_returned | Number of responses returned | gauge vector |
| mrcp_max_calls | Maximum number of simultaneous calls processed | gauge |
| mrcp_sip_calls | Total number of SIP calls processed | counter |
| mrcp_sip_tcp_connections | Total number of SIP TCP calls processed | counter |
| mrcp_rtsp_calls | Total number of RTSP calls processed | counter |
| mrcp_garbage_collection_calls | Number of ended calls currently being garbage collected | gauge |
Diarization
| Measurement | Description | Type |
|---|---|---|
| diarization_average_request_process_time_dist | Distribution of average request processing time | histogram |
| diarization_active_requests | Total number of active requests | gauge |
| diarization_total_requests | Total number of initial requests received | counter |
Language ID
| Measurement | Description | Type |
|---|---|---|
| lid_average_request_process_time_dist | Distribution of average request processing time | histogram |
| lid_active_requests | Total number of active requests | gauge |
| lid_total_requests | Total number of initial requests received | counter |
Neuron
| Measurement | Description | Type |
|---|---|---|
| neuron_total_requests | Total number of Neuron requests received | counter |
| neuron_active_requests | Total number of active Neuron requests | gauge |
| neuron_active_requests_process_time_dist | Distribution of average Neuron request processing time | histogram |
| neuron_first_result_time_dist | Distribution of time to first audio byte in milliseconds | histogram |
| neuron_first_result_time_max | Maximum time to first audio byte in milliseconds | gauge |
| neuron_first_result_time_min | Minimum time to first audio byte in milliseconds | gauge |
| neuron_max_queue_size_requests | Maximum simultaneous Neuron requests at any one time since startup | gauge |
| neuron_total_responses_returned | Number of responses returned including type | gauge vector |
Filestore
| Measurement | Description | Type |
|---|---|---|
| filestore_active_deployments | Current number of active deployments managed by the file-store service | gauge |
| filestore_cache_memory_bytes | Current memory consumption of the file-store cache in bytes | gauge |
| filestore_cache_size | Current number of items in the file-store cache | gauge |
| filestore_cache_reconciliation | Total number of cache reconciliation events | counter |
| filestore_deployment_cache_size | Current number of cached deployment artifacts/entries | gauge |
| filestore_http_request_duration_seconds | Distribution of HTTP request latency in seconds | histogram |
| filestore_http_requests_total | Total number of HTTP requests received | counter |
| filestore_http_response_size_bytes | Distribution of HTTP response body sizes in bytes | histogram |
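Histograms such as `filestore_http_request_duration_seconds` are conventionally exposed by Prometheus clients as cumulative `_bucket` series labeled with an upper bound `le`, ending in a `+Inf` bucket. A latency percentile can then be estimated by linear interpolation inside the bucket that contains the target rank, which is the general approach PromQL's `histogram_quantile()` takes. The sketch below is a simplified version of that idea. The bucket boundaries and counts are assumptions, not the values this service actually exports.

```python
import math
from typing import List, Tuple

# (upper bound "le" in seconds, cumulative observation count)
Bucket = Tuple[float, float]


def histogram_quantile(q: float, buckets: List[Bucket]) -> float:
    """Estimate the q-quantile from cumulative histogram buckets."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    buckets = sorted(buckets)
    if not buckets or not math.isinf(buckets[-1][0]):
        raise ValueError("buckets must include the +Inf bucket")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations recorded
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound) or count == prev_count:
                # Beyond the highest finite bucket, or an empty bucket:
                # the best available estimate is the previous bound.
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical cumulative buckets for filestore_http_request_duration_seconds.
example_buckets = [(0.1, 50.0), (0.5, 90.0), (1.0, 100.0), (math.inf, 100.0)]
p50 = histogram_quantile(0.50, example_buckets)
p95 = histogram_quantile(0.95, example_buckets)
```

The accuracy of the estimate depends on bucket width: a percentile that falls in a wide bucket can only be located approximately, which is worth keeping in mind when setting alert thresholds.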
