PulseAugur

OpenAI enhances its voice AI with promptable prosody and SOTA ASR

OpenAI has updated its voice AI offering, introducing "promptable prosody," which lets developers steer the timing and emotion of generated speech through the prompt. The update also includes state-of-the-art automatic speech recognition (ASR) and semantic voice activity detection (VAD). Together, these changes aim to make AI-generated speech more natural and expressive.

Summary written by gemini-2.5-flash-lite from 1 source.

RANK_REASON OpenAI released updates to its voice AI capabilities, including new features and performance improvements.

COVERAGE [1]

  1. Smol AINews TIER_1

    Promptable Prosody, SOTA ASR, and Semantic VAD: OpenAI revamps Voice AI

    **OpenAI** has launched three new state-of-the-art audio models in their API, including **gpt-4o-transcribe**, a speech-to-text model outperforming Whisper, and **gpt-4o-mini-tts**, a text-to-speech model with promptable prosody allowing control over timing and emotion. The **Age…