τ-Voice Examples

τ-Voice extends τ-bench to live, full-duplex voice interactions — where both sides speak and listen at once, people interrupt, and calls happen in noisy environments. Rather than clean audio in a quiet room, τ-Voice simulates realistic conditions: accents, street noise, burst sounds, connection drops, and natural turn-taking dynamics.

A simulated τ-Voice call in the retail domain. The main timeline shows six minutes of overlapping speech, interruptions, and noise. Inset A decomposes the audio the agent receives; Inset B highlights turn-taking dynamics.

The examples below show how these conditions affect agent performance: the same agent, on the same task, can succeed under clean audio and fail under realistic conditions. A full blog post with detailed results is coming soon.

Explore full voice trajectories

Browse and replay complete voice simulations, with transcripts, tool calls, and turn-by-turn details, in the interactive trajectory visualizer.

🔊 Sample Conversations · Clean vs. Realistic

Same task, different conditions

Task 14 succeeds under clean audio but fails when realistic effects are applied — same task, same agent, different outcome.

Clean: Gemini, success

Realistic: Gemini, logical failure

Transcription failures

Both conversations fail due to transcription errors. In clean audio, characters that the user spells out verbally trip up the agent; in realistic audio, accents and noise compound the problem.

Clean: xAI, transcription failure

Realistic: xAI, transcription failure

Logical failures

Both conversations fail due to reasoning errors — wrong policy application or missed constraints — independent of audio quality.

Clean: OpenAI, logical failure

Realistic: Gemini, logical failure

Annotated Speech Activity Timeline

The interactive visualization below annotates the realistic Task 14 audio with speech-activity markers — user & agent speech, interruptions, noise effects, backchannels, and more. Press play to step through the conversation with a synchronized playhead.

[Figure: Speech Activity Timeline (Retail, Gemini). Tracks: User (Busy Street) and Agent. Horizontal axis: time in seconds.]
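One way to think about the overlap markers in a speech-activity timeline is as intersections between the user's and the agent's speech intervals. The sketch below illustrates that idea only; it is not τ-Voice's annotation code. The function name `find_overlaps`, the `(start, end)` interval format, and the segment times are all hypothetical.

```python
from typing import List, Tuple

# Assumed format: (start_seconds, end_seconds) for one stretch of speech.
Interval = Tuple[float, float]


def find_overlaps(user: List[Interval], agent: List[Interval]) -> List[Interval]:
    """Return the time spans where user and agent speech overlap.

    Each input list must be sorted by start time and must not overlap
    with itself.
    """
    overlaps: List[Interval] = []
    i = j = 0
    while i < len(user) and j < len(agent):
        # The intersection of two intervals starts at the later start
        # and ends at the earlier end.
        start = max(user[i][0], agent[j][0])
        end = min(user[i][1], agent[j][1])
        if start < end:
            overlaps.append((start, end))
        # Advance whichever interval finishes first.
        if user[i][1] <= agent[j][1]:
            i += 1
        else:
            j += 1
    return overlaps


# Hypothetical segment times, not taken from the Task 14 recording.
user_speech = [(0.0, 2.5), (6.0, 9.0)]
agent_speech = [(2.0, 5.0), (8.5, 12.0)]
overlaps = find_overlaps(user_speech, agent_speech)
print(overlaps)
```

Whether a given overlap counts as an interruption or a backchannel would need further rules, such as who was speaking first and how long the overlap lasts. Those rules are not specified here.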