PulseAugur

Thinking Machines unveils real-time interaction models with 200ms processing

Thinking Machines has unveiled a new class of "interaction models" designed for real-time conversational AI. These models process audio, video, and text in 200-millisecond intervals, which removes the need for a separate turn-detection component. Input and output run as continuous, interleaved streams, so the model can speak while it listens and react to visual cues without an explicit prompt. The system uses two co-trained models: a lightweight interaction model that handles the live conversation, and a background model that handles complex tasks such as planning and tool use, which keeps latency low for the user.
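
The interleaving described above can be illustrated with a toy loop. This is only a sketch under stated assumptions, not Thinking Machines' implementation: `ToyInteractionModel`, `toy_background_model`, `run`, and the `task:` prefix are hypothetical names invented for illustration, and only the 200 ms interval comes from the summary. Each tick consumes one input chunk and emits one output, so no separate turn detector is involved.

```python
from dataclasses import dataclass, field

TICK_MS = 200  # interval length taken from the summary; everything else is assumed


@dataclass
class ToyInteractionModel:
    """Hypothetical stand-in for the lightweight live-conversation model."""
    pending: list = field(default_factory=list)  # results returned by the background model

    def step(self, chunk: str) -> str:
        # Every tick yields an output (speech, silence, or a delegation),
        # so input and output interleave without explicit turn boundaries.
        if self.pending:
            return "say:" + self.pending.pop(0)
        if chunk == "":
            return "silence"
        if chunk.startswith("task:"):
            return "delegate:" + chunk[len("task:"):]
        return "say:ack " + chunk


def toy_background_model(task: str) -> str:
    """Hypothetical slow path standing in for planning or tool use."""
    return "done " + task


def run(chunks):
    """Feed one input chunk per tick and log (time_ms, input, output)."""
    model = ToyInteractionModel()
    log = []
    for i, chunk in enumerate(chunks):
        out = model.step(chunk)
        if out.startswith("delegate:"):
            # A real system would run this concurrently; in this sketch the
            # result is simply available by the next tick.
            model.pending.append(toy_background_model(out[len("delegate:"):]))
        log.append((i * TICK_MS, chunk, out))
    return log


if __name__ == "__main__":
    for row in run(["hello", "", "task:book", "still talking"]):
        print(row)
```

In the last tick of the usage example the model reports the background result while a new input chunk is still arriving, which is the "speaking while listening" behavior in miniature.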

Summary written by gemini-2.5-flash-lite from 1 source.

IMPACT Enables more natural, responsive conversational AI by integrating interactivity directly into model architecture.

RANK_REASON Research preview announcement of a new class of models with a novel architectural approach.


COVERAGE [1]

  1. dev.to — LLM tag TIER_1 · Thousand Miles AI ·

    How Thinking Machines built interactivity into the model

    A new release from Thinking Machines, dated May 11, 2026, lands at 0.40 seconds end-to-end on the FD-bench V1 turn-taking benchmark, about three times faster than GPT-realtime-2.0 (xhigh) and roughly half the latency of Gemini-3.1-flash-live (high). The latency number is the …