PulseAugur
frontier release · [2 sources]

Google DeepMind enhances Gemini audio models for natural voice interactions and translation

Google DeepMind has released upgraded Gemini 2.5 audio models that improve both live voice agents and text-to-speech generation. The Gemini 2.5 Flash Native Audio model now offers better function calling, instruction following, and conversational context awareness, and scores 71.5% on the ComplexFuncBench Audio benchmark. In addition, live speech translation is rolling out in the Google Translate app, enabling real-time speech-to-speech translation that preserves the speaker's intonation and pitch.

Summary written by gemini-2.5-flash-lite from 2 sources.

RANK_REASON Frontier-lab model release with system card.

COVERAGE [2]

  1. Google DeepMind TIER_1

    Improved Gemini audio models for powerful voice experiences

  2. Google DeepMind TIER_1

    Advanced audio dialog and generation with Gemini 2.5

    Gemini 2.5 has new capabilities in AI-powered audio dialog and generation.