Thinking Machines has unveiled a new class of "interaction models" designed for real-time conversational AI. The models process audio, video, and text in 200-millisecond intervals, which removes the need for a separate turn-detection component. Input and output streams are continuous and interleaved, so the model can speak while listening and react to visual cues without an explicit prompt. The system uses two co-trained models: a lightweight interaction model that handles the live conversation, and a background model that handles complex tasks such as planning and tool use, which keeps user-facing latency low.
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT Enables more natural, responsive conversational AI by integrating interactivity directly into the model architecture.
RANK_REASON Research preview announcement of a new class of models with a novel architectural approach. [lever_c_demoted from research: ic=1 ai=1.0]
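
To make the summarized architecture concrete, here is a minimal toy sketch of a loop that consumes one input slice and emits one output slice per 200 ms tick, while handing slow work to a second model. Only the 200 ms interval and the interaction/background split come from the summary above. Every name here (`Chunk`, `InteractionModel`, `BackgroundModel`, `run`), the keyword trigger, and the three-tick background delay are hypothetical stand-ins, not Thinking Machines' API or implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

CHUNK_MS = 200  # interval taken from the summary; everything else is assumed


@dataclass
class Chunk:
    """One 200 ms slice of multimodal input (fields are illustrative strings)."""
    t_ms: int
    audio: str = ""
    video: str = ""
    text: str = ""


@dataclass
class BackgroundModel:
    """Hypothetical slow model: each task finishes a few ticks after submission."""
    latency_chunks: int = 3  # assumed delay, not a published figure
    _pending: List[Tuple[int, str]] = field(default_factory=list)

    def submit(self, step: int, task: str) -> None:
        self._pending.append((step + self.latency_chunks, f"result for '{task}'"))

    def poll(self, step: int) -> Optional[str]:
        """Return one finished result, if any, without blocking."""
        for item in self._pending:
            if item[0] <= step:
                self._pending.remove(item)
                return item[1]
        return None


@dataclass
class InteractionModel:
    """Hypothetical fast model: reads one chunk and emits one output every tick."""
    background: BackgroundModel

    def step(self, step: int, chunk: Chunk) -> str:
        finished = self.background.poll(step)
        if finished is not None:
            return f"say: {finished}"          # surface background work when ready
        if "plan" in chunk.audio:              # toy trigger for a complex task
            self.background.submit(step, chunk.audio)
            return "say: one moment"           # keep talking while work runs
        if chunk.video:
            return f"react: {chunk.video}"     # respond to a visual cue, no prompt
        return "listen"


def run(chunks: List[Chunk]) -> List[Tuple[int, str]]:
    """Interleave input and output: one output per input chunk, in order."""
    model = InteractionModel(BackgroundModel())
    return [(c.t_ms, model.step(i, c)) for i, c in enumerate(chunks)]
```

In this sketch there is no turn detector: the loop never waits for the user to finish, and each tick decides independently whether to listen, react, or speak. The background result arrives a few ticks later and is surfaced on the first tick where it is ready, which is one simple way the live stream could stay responsive while heavier work completes.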