PulseAugur

OpenAI enhances its voice AI with promptable prosody and SOTA ASR

OpenAI has updated its voice AI capabilities, introducing "promptable prosody," which lets a prompt control the timing and emotion of generated speech. The update also includes state-of-the-art automatic speech recognition (ASR) and semantic voice activity detection (VAD). These advances aim to make AI-generated speech more natural and expressive.

Rank reason: OpenAI released updates to its voice AI capabilities, including new features and performance improvements.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. Smol AINews · Tier 1 · English (EN)

    Promptable Prosody, SOTA ASR, and Semantic VAD: OpenAI revamps Voice AI

    **OpenAI** has launched three new state-of-the-art audio models in their API, including **gpt-4o-transcribe**, a speech-to-text model outperforming Whisper, and **gpt-4o-mini-tts**, a text-to-speech model with promptable prosody allowing control over timing and emotion. The **Age…