Google DeepMind has released upgraded Gemini 2.5 audio models, enhancing capabilities for both live voice agents and text-to-speech generation. The Gemini 2.5 Flash Native Audio model now offers improved function calling, instruction following, and conversational context awareness, scoring 71.5% on the ComplexFuncBench Audio benchmark. Additionally, new live speech translation features are rolling out in the Google Translate app, enabling real-time speech-to-speech translation that preserves the speaker's intonation and pitch.
Summary written by gemini-2.5-flash-lite from 2 sources.