Liberating LLM Capabilities in Full-Duplex Speech Models
Researchers have introduced Listen-Write-Speak (LWS), a new paradigm for large language models that interact through speech. The approach treats text as a primary output channel, so an LLM can produce visible free-form text and code and carry out complex reasoning in real time alongside its spoken responses. LWS requires no architectural changes to existing LLMs and is trained using a novel data pipeline. The system demonstrates strong performance in full-duplex interaction and maintains high consistency between its written and spoken outputs.
AI IMPACT Enables LLMs to offer richer, more interactive outputs beyond simple spoken replies, potentially improving user experience and task completion.