τ-Voice extends τ-bench to live, full-duplex voice interactions — where both sides speak and listen at once, people interrupt, and calls happen in noisy environments. Rather than clean audio in a quiet room, τ-Voice simulates realistic conditions: accents, street noise, burst sounds, connection drops, and natural turn-taking dynamics.
The examples below demonstrate how these conditions affect agent performance. The same agent can succeed on a task under clean audio and fail on that same task under realistic conditions. A full blog post with detailed results is coming soon.
Annotated Speech Activity Timeline
The interactive visualization below annotates the realistic-conditions audio for Task 14 with speech-activity markers: user and agent speech, interruptions, noise effects, backchannels, and more. Press play to step through the conversation with a synchronized playhead.
(Busy Street)
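To make the idea of a speech-activity timeline concrete, here is a minimal sketch of how overlapping speech between the two sides could be found once each stretch of speech is stored as a time interval. The `Segment` type, the `find_overlaps` helper, and the example timings are all hypothetical and chosen for illustration; they are not τ-Voice's actual annotation format or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One stretch of speech on the timeline (hypothetical structure)."""
    speaker: str  # "user" or "agent"
    start: float  # seconds from the start of the call
    end: float    # seconds from the start of the call


def find_overlaps(segments):
    """Return (first, second, overlap_start, overlap_end) for every pair of
    segments from different speakers that overlap in time."""
    ordered = sorted(segments, key=lambda s: s.start)
    overlaps = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.start >= first.end:
                # Sorted by start time, so no later segment can overlap `first`.
                break
            if first.speaker != second.speaker:
                overlaps.append(
                    (first, second, second.start, min(first.end, second.end))
                )
    return overlaps


# Made-up timings: the user starts talking before the agent has finished.
timeline = [
    Segment("agent", 0.0, 4.0),
    Segment("user", 3.2, 5.0),
    Segment("agent", 5.5, 7.0),
]

for first, second, start, end in find_overlaps(timeline):
    print(f"{second.speaker} overlaps {first.speaker} from {start}s to {end}s")
```

An overlap on its own does not say whether the second speaker interrupted or only produced a backchannel such as "mm-hm"; telling those apart would need extra information (for example duration or transcript content), which this sketch does not attempt.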