Promptable Prosody, SOTA ASR, and Semantic VAD: OpenAI revamps Voice AI

OpenAI enhances its voice AI with promptable prosody and SOTA ASR

By the PulseAugur editorial team · [1 source] · 2025-03-20 22:51

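OpenAI has significantly updated its voice AI capabilities, introducing "promptable prosody," which allows finer-grained control over speech generation. The update also includes state-of-the-art automatic speech recognition (ASR) and semantic voice activity detection (VAD). These advances aim to make AI-generated speech more natural and expressive.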

Ranking rationale: OpenAI released an update to its voice AI capabilities, including new features and performance improvements.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

Smol AINews · Tier 1 · English (EN) · 2025-03-20 22:51

Promptable Prosody, SOTA ASR, and Semantic VAD: OpenAI revamps Voice AI

**OpenAI** has launched three new state-of-the-art audio models in its API, including **gpt-4o-transcribe**, a speech-to-text model that outperforms Whisper, and **gpt-4o-mini-tts**, a text-to-speech model with promptable prosody that allows control over timing and emotion. The **Age…