A new open-source voice model called Audio Interaction has been released. It processes audio in real time, without waiting for the input to finish. The model can translate, transcribe, and converse continuously, and it recognizes ambient sounds such as coughs. Its code and weights are available on GitHub under an open-source license, with training data to be released later.
AI IMPACT Enables continuous, real-time voice interaction and ambient sound recognition in open-source applications.
RANK_REASON Release of an open-source model with novel real-time processing capabilities.
AI-generated summary · Google Gemini · from 1 source.