Google DeepMind has released upgraded Gemini 2.5 audio models, enhancing capabilities for both live voice agents and text-to-speech generation. The Gemini 2.5 Flash Native Audio model now offers improved function calling, instruction following, and conversational context awareness, achieving a 71.5% score on the ComplexFuncBench Audio benchmark. Additionally, new live speech translation features are rolling out in the Google Translate app, enabling real-time speech-to-speech translation that preserves speaker intonation and pitch.
Ranking rationale: Frontier-lab model release with system card.
- ComplexFuncBench Audio
- Gemini 2.5 Flash
- Gemini 2.5 Pro
- Gemini Live
- Google AI Studio
- Google DeepMind
- Google Translate
- Newo.ai
- NotebookLM
- Search Live
- Shopify
- United Wholesale Mortgage
- Vertex AI
- Project Astra
AI-generated summary · Google Gemini · from 2 sources.