PulseAugur

Google AI launches Gemini 3.1 Flash TTS with new audio tag controls

Google AI has released Gemini 3.1 TTS and Gemini 3.1 Flash TTS, its newest text-to-speech models. The models are more expressive and controllable than their predecessors and introduce audio tags: inline natural-language commands that guide vocal style, pace, and delivery. Gemini 3.1 Flash TTS is rolling out in Google Vids and is available in preview through the Gemini API and Google AI Studio.

RANK_REASON Release of a new model version by a major AI lab, but not a frontier model release.

AI-generated summary · Google Gemini · from 3 sources.

COVERAGE [3]

  1. X — Google AI TIER_1 English(EN) · GoogleAI

    Last week, we launched Gemini 3.1 TTS, our latest and best text-to-speech model. This new model introduces [awe] audio tags, an intuitive way to guide vocal style, pace, and delivery. Here are some tips on the best ways to use audio tags in your prompts: 1. All inline tags must…

  2. X — Google AI TIER_1 English(EN) · GoogleAI

    Gemini 3.1 Flash TTS is rolling out in Google Vids and is available today in preview via the Gemini API and in @GoogleAIStudio. Whether you’re creating a pitch deck or recording a passion project, transform your scripts into studio-quality narration: https://t.co/MG2YIQwKb6

  3. X — Google AI TIER_1 English(EN) · GoogleAI

    Today we launched Gemini 3.1 Flash TTS, our most expressive and controllable text-to-speech model yet. This launch [excitement] includes audio tags! 🗣🏷 Audio tags [explanatory] are a seamless way to guide vocal style, pace, and delivery using natural language commands embedded h…
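
The posts above write audio tags inline as bracketed words, for example [awe], [excitement], and [explanatory]. As a rough illustration of that convention only, the sketch below lists the tags in a script and shows the plain text that remains once they are removed. It does not call the Gemini API. The helper names `extract_audio_tags` and `strip_audio_tags` are hypothetical, and the tag pattern (lowercase words inside square brackets) is an assumption inferred from the quoted examples, not a documented grammar.

```python
import re

# Assumption: an audio tag is a lowercase word or short phrase in square
# brackets, as in the examples quoted in the posts. The real grammar may differ.
TAG_PATTERN = re.compile(r"\[([a-z][a-z ]*)\]")


def extract_audio_tags(script: str) -> list[str]:
    """Return the bracketed tags found in a script, in order of appearance."""
    return TAG_PATTERN.findall(script)


def strip_audio_tags(script: str) -> str:
    """Return the script with tags removed and runs of whitespace collapsed."""
    without_tags = TAG_PATTERN.sub("", script)
    return " ".join(without_tags.split())


if __name__ == "__main__":
    script = (
        "This launch [excitement] includes audio tags! "
        "Audio tags [explanatory] are a seamless way to guide delivery."
    )
    print(extract_audio_tags(script))
    print(strip_audio_tags(script))
```

A check like this can be handy for reviewing which tags a script uses before it is sent to a text-to-speech model, whatever client is used to send it.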