LLM observability tools are blind to the voice layer. Here is what I checked 6 of them for.
LLM observability tools primarily trace model calls (prompts, completions, and latency), which is not enough for voice agents. Voice-agent failures often occur in the audio layer, in end-of-turn detection, ASR latency, and barge-in detection, none of which current LLM tracers capture. Tools built on OpenTelemetry offer a flexible canvas for instrumenting audio-layer spans alongside LLM metrics, but the instrumentation has to be custom-built; the other tools are more LLM-call-centric and need separate telemetry for audio insights.
IMPACT Highlights a gap in current LLM observability tools, pushing for better audio-layer tracing to improve voice agent performance and user experience.
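
To make the idea of audio-layer spans sitting next to the LLM span concrete, here is a minimal stdlib-only sketch in Python. It is not the API of any of the tools reviewed, and it does not use the OpenTelemetry SDK; the `Span` class stands in for whatever span object a tracer would provide. The event names (`user_speech_end`, `end_of_turn_detected`, and so on), the stage names, and the timings in the usage example are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """A named interval within one conversational turn (times in milliseconds)."""
    name: str
    start_ms: float
    end_ms: float
    attributes: dict = field(default_factory=dict)

    @property
    def duration_ms(self) -> float:
        return self.end_ms - self.start_ms


# Hypothetical stage definitions: (span name, start event, end event).
# A real voice pipeline would emit its own event names.
STAGES = [
    ("end_of_turn_detection", "user_speech_end", "end_of_turn_detected"),
    ("asr_finalization", "end_of_turn_detected", "asr_final_transcript"),
    ("llm_first_token", "asr_final_transcript", "llm_first_token"),
    ("tts_first_audio", "llm_first_token", "tts_first_audio"),
]


def build_turn_spans(events: dict[str, float], barge_in: bool = False) -> list[Span]:
    """Turn raw event timestamps into one parent 'turn' span plus one span per stage."""
    turn = Span(
        "turn",
        events["user_speech_end"],
        events["tts_first_audio"],
        {"barge_in": barge_in},
    )
    stages = []
    for name, start_key, end_key in STAGES:
        # Tag each stage with its layer so audio and LLM time can be separated later.
        layer = "llm" if name.startswith("llm") else "audio"
        stages.append(Span(name, events[start_key], events[end_key], {"layer": layer}))
    return [turn] + stages


def slowest_stage(spans: list[Span]) -> Span:
    """Return the stage span (excluding the parent turn) with the longest duration."""
    stages = [s for s in spans if s.name != "turn"]
    return max(stages, key=lambda s: s.duration_ms)


if __name__ == "__main__":
    # Illustrative timestamps only, not measurements.
    events = {
        "user_speech_end": 0.0,
        "end_of_turn_detected": 700.0,
        "asr_final_transcript": 850.0,
        "llm_first_token": 1250.0,
        "tts_first_audio": 1400.0,
    }
    for span in build_turn_spans(events):
        print(f"{span.name}: {span.duration_ms:.0f} ms {span.attributes}")
```

With the illustrative timings above, the LLM span accounts for 400 ms of a 1400 ms turn, and the largest single contributor is end-of-turn detection. A tracer that records only the model call would report the 400 ms and miss the rest, which is the gap the summary describes.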