PulseAugur

OpenAI releases GPT-5-class voice models with realtime reasoning, translation, and transcription

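OpenAI has released three new realtime voice models: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. The models offer stronger reasoning, live speech translation across more than 70 languages, and low-latency transcription. GPT-Realtime-2 in particular is described as having "GPT-5-class reasoning" and a substantially larger 128K-token context window, along with improved handling of interruptions and tool use.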

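Impact: Improved reasoning, translation, and transcription strengthen realtime voice agents and may accelerate adoption of voice-first interfaces.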

Ranking rationale: OpenAI released new realtime voice models with GPT-5-class reasoning.

Read on OpenAI News →

AI-generated summary · Google Gemini · from 45 sources.

Sources [45]

  1. OpenAI News TIER_1 English(EN) ·

    Advancing voice intelligence with new models in the API

    Explore new realtime voice models in the OpenAI API that can reason, translate, and transcribe speech, enabling more natural and intelligent voice experiences.

  2. OpenAI News TIER_1 English(EN) ·

    How OpenAI delivers low-latency voice AI at scale

    How OpenAI rebuilt its WebRTC stack to power real-time Voice AI with low latency, global scale, and seamless conversational turn-taking.

  3. Latent Space (swyx) TIER_1 English(EN) ·

    [AINews] Thinking Machines' native interaction model - TML-Interaction-Small 276B-A12B - advances SOTA realtime voice and makes standard VAD obsolete

    well done, Team Thinky.

  4. Latent Space (swyx) TIER_1 English(EN) ·

    [AINews] GPT-Realtime-2, -Translate, and -Whisper: the new SOTA realtime voice APIs

    OpenAI continues deploying GPT-5 everywhere

  5. Smol AINews TIER_1 English(EN) ·

    GPT-Realtime-2, -Translate, and -Whisper: the new SOTA realtime voice APIs

    **OpenAI** released **GPT-Realtime-2**, a voice model with **GPT-5-class reasoning**, tool use, interruption handling, and extended context windows up to **128K tokens**, achieving top scores on **Big Bench Audio** and **Conversational Dynamics** benchmarks. They also launched a …

  6. The Decoder TIER_1 English(EN) · Matthias Bastian ·

    OpenAI's new voice models bring GPT-5-class reasoning to realtime conversation

    OpenAI is shipping three new voice models: GPT-Realtime-2, GP…

  7. Hacker News — AI stories ≥50 points TIER_1 English(EN) · Sean-Der ·

    How OpenAI delivers low-latency voice AI at scale

  8. MarkTechPost TIER_1 English(EN) · Asif Razzaq ·

    OpenAI releases three realtime audio models: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper join the Realtime API

    Three purpose-built audio models expand what developers can build with live voice: reasoning agents, speech translation across 70+ languages, and streaming transcription.

  9. MarkTechPost TIER_1 English(EN) · Asif Razzaq ·

    Inworld AI launches Realtime TTS-2: a closed-loop voice model that adapts to how you actually talk

    Inworld AI's new model conditions on full audio context, not just transcripts, a meaningful architectural shift for voice-first AI agents.

  10. Email — Mindstream TIER_1 English(EN) ·

    ChatGPT voice can now do much more

  11. Mastodon — sigmoid.social TIER_1 English(EN) · [email protected] ·

    OpenAI's new voice models are quite impressive. Demo: https://x.com/OpenAI/status/2052438194625593804 #AI

    New voice models from #OpenAI are quite impressive. Demo: https://x.com/OpenAI/status/2052438194625593804 #AI

  12. Mastodon — sigmoid.social TIER_1 English(EN) · [email protected] ·

    OpenAI's new Realtime API trio: GPT-Realtime-2 brings GPT-5-class reasoning to live voice with 128K context, parallel tool calls, and configurable reasoning tiers

    OpenAI's new Realtime API trio: GPT-Realtime-2 brings GPT-5-class reasoning to live voice with 128K context, parallel tool calls, and configurable reasoning tiers. Zillow achieved 95% call success (up from 69%) on adversarial benchmarks. Plus real-time translation in 70+ language…

  13. Mastodon — sigmoid.social TIER_1 Suomi(FI) · [email protected] ·

    OpenAI releases three new realtime voice models; language models for realtime speech translation, transcription, and conversation are now available

    OpenAI released three new realtime voice models. Language models intended for realtime speech translation, transcription, and conversation are available to app developers immediately. https://dawn.fi/uutiset/2026/05/08/openai-reaaliaikaiset-aanimallit #OpenAI …

  14. Email — The Rundown AI TIER_1 English(EN) ·

    🗣️ OpenAI closes reasoning gap in voice agents

  15. Mastodon — sigmoid.social TIER_1 Türkçe(TR) · [email protected] ·

    OpenAI has upgraded its realtime voice and translation features with the new GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. Multilingual support and high accuracy…

    OpenAI has upgraded its realtime voice and translation capabilities with the new GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. With multilingual support and high-accuracy speech recognition, it brings every language together on a single platform. Already available in the APIs. 🚩 #AI #OpenA…

  16. TechCrunch AI TIER_1 English(EN) · Lucas Ropek ·

    OpenAI launches new voice intelligence features in its API

    The new features could be handy for customer service systems, but OpenAI says they have applications that work across a variety of other fields, including education and creator platforms.

  17. Mastodon — sigmoid.social TIER_1 English(EN) · [email protected] ·

    OpenAI has unveiled three new voice models, including GPT-Realtime-2 with GPT-5-class reasoning and GPT-Realtime-Translate supporting over 70 languages.

    OpenAI has unveiled three new voice models, including GPT-Realtime-2 with GPT-5-class reasoning and GPT-Realtime-Translate supporting over 70 languages. The company says it is responding to viral videos highlighting its voice technology's shortcomings. https://gizmodo.com/openai-…

  18. Mastodon — sigmoid.social TIER_1 English(EN) · [email protected] ·

    Inworld AI has launched Realtime TTS-2, a closed-loop voice model that hears the full audio of conversation turns to adapt to users' actual tone and…

    Inworld AI has launched Realtime TTS-2, a closed-loop voice model that hears the full audio of conversation turns to adapt its delivery to users' actual tone and emotional state. The model uses plain-language prompts like "[speak sadly, as if something bad just happened]" to steer…

  19. Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] ·

    OpenAI brings GPT-5-class reasoning to real-time voice - winbuzzer

    https://winbuzzer.com/2026/05/10/openai-brings-gpt-5-class-reasoning-to-real-time-v-xcxwbn/ OpenAI has launched a three-model real-time voice lineup that separates reasoning, translation, and transcription instead of treating voice as one bundled chat feature. #AI #OpenAI #G…

  20. Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] ·

    OpenAI releases new voice models that reason, translate, and transcribe in real time

    OpenAI has new voice models that reason, translate, and transcribe as you speak OpenAI has just released three new realtime voice models that it says will “unlock a new class of voice apps for developers.” Each new voice intelligence model has a unique speciality for different pu…

  21. Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] ·

    OpenAI is pushing deeper into voice. The company just launched three new realtime audio models in its API. GPT-Realtime-2 for conversational reasoning, GPT-Real…

    OpenAI is pushing deeper into voice. The company just launched three new realtime audio models in its API. GPT-Realtime-2 for conversational reasoning, GPT-Realtime-Translate for live multilingual translation, and GPT-Realtime-Whisper for streaming speech transcription. https://…

  22. Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] ·

    OpenAI has released new voice intelligence features in its API, aimed at customer service, education and creator platforms. The tools support real-time voice input…

    OpenAI has released new voice intelligence features in its API, aimed at customer service, education and creator platforms. The tools enable real-time spoken interaction beyond basic transcription. https://techcrunch.com/2026/05/07/openai-launches-new-voice-intelligence-feature…

  23. Mastodon — mastodon.social TIER_1 English(EN) · rhodzy ·

    New blog post: When AI Finally Starts Talking Back (Properly). OpenAI's move to low-latency voice AI isn't just a tech upgrade; it's a fundamental shift that makes…

    New blog post: When AI Finally Starts Talking Back (Properly). OpenAI's move to low-latency voice AI isn't just a tech upgrade; it's a fundamental shift that makes truly conversational AI a reality, with massive implications for everything from health tech to gaming. https://rhod…

  24. Mastodon — mastodon.social TIER_1 English(EN) · [email protected] ·

    Advancing voice intelligence with OpenAI's new models openai.com/index/advancin… #AI #voice #translation #OpenAI

    Advancing voice intelligence with new models in the API openai.com/index/advancin… #AI #voice #translation #OpenAI

  25. Mastodon — mastodon.social TIER_1 English(EN) · [email protected] ·

    Tracking AI developments in research libraries: findings from the ARL 2026 AI Quick Poll - Association of Research Libraries www.arl.org/blog/tracking-… #AI #lib…

    Tracking the AI Evolution in Research Libraries: Findings from ARL’s 2026 AI Quick Poll — Association of Research Libraries www.arl.org/blog/tracking-… #AI #libraries

  26. Mastodon — mastodon.social TIER_1 English(EN) · aihaberleri ·

    📰 Realtime Audio Models 2026: OpenAI Unveils GPT-Realtime-2, Translate & Whisper. OpenAI has introduced three new realtime audio models: GPT-Realtime-2, GPT-Realt…

    📰 Realtime Audio Models 2026: OpenAI Unveils GPT-Realtime-2, Translate & Whisper OpenAI has introduced three new realtime audio models—GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper—designed to transform live voice applications with reasoning, translation, and l…

  27. Mastodon — mastodon.social TIER_1 Türkçe(TR) · aihaberleri ·

    📰 OpenAI redefines conversational AI in 2026 with realtime voice models: GPT-4o, Whisper... OpenAI revolutionizes realtime voice capabilities…

    📰 OpenAI Redefines Conversational AI in 2026 with Realtime Voice Models: GPT-4o, Whisper ... OpenAI has announced three new voice models that will revolutionize realtime speech capabilities: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. These models …

  28. Mastodon — mastodon.social TIER_1 English(EN) · aihaberleri ·

    📰 GPT-Realtime-2 Launches in 2026: 15.2% Faster Voice AI with Realtime Translation & Transcription. OpenAI has unveiled GPT-Realtime-2, -Translate, and -Whisper…

    📰 GPT-Realtime-2 Launches in 2026: 15.2% Faster Voice AI with Realtime Translation & Transcription OpenAI has unveiled GPT-Realtime-2, -Translate, and -Whisper — a suite of next-generation realtime voice APIs that set new state-of-the-art benchmarks in speech understanding and tr…

  29. Mastodon — mastodon.social TIER_1 Türkçe(TR) · aihaberleri ·

    📰 GPT-Realtime-2 sets a new standard for realtime voice APIs in 2026. OpenAI, with GPT-Realtime-2, -Translate, and -Whisper, realtime voice processing…

    📰 With GPT-Realtime-2, Realtime Voice APIs Set a New Standard in 2026. OpenAI has revolutionized realtime voice processing with GPT-Realtime-2, -Translate, and -Whisper. These new APIs completely redefine the naturalness and speed of spoken dialogue.... #YapayZekaAraçl…

  30. Mastodon — mastodon.social TIER_1 日本語(JA) · [email protected] ·

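    OpenAI's new voice model "GPT-Realtime-2" supports instant translation and low-latency transcription

    OpenAI's new voice model "GPT-Realtime-2" also offers instant translation and low-latency transcription https://www.watch.impress.co.jp/docs/news/2107115.html #watch_impress #ChatGPT #テック #AI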

  31. Mastodon — mastodon.social TIER_1 English(EN) · aihaberleri ·

    📰 OpenAI Voice Models 2026: GPT-Realtime-2, Whisper & Translate Revolutionize Real-Time Transcription. OpenAI has introduced groundbreaking voice models including…

    📰 OpenAI Voice Models 2026: GPT-Realtime-2, Whisper & Translate Revolutionize Real-Time Transcription OpenAI has introduced groundbreaking voice models including GPT-Realtime-2, Translate, and Whisper, revolutionizing real-time speech processing. These models enhance transcriptio…

  32. Mastodon — mastodon.social TIER_1 Türkçe(TR) · aihaberleri ·

    📰 The voice communication revolution of 2026: realtime transcription with GPT-Realtime-2, Whisper, and Translate… OpenAI marks a milestone in voice AI

    📰 The Voice Communication Revolution of 2026: Realtime Transcrip... with GPT-Realtime-2, Whisper, and Translate. OpenAI has marked a turning point in voice AI: with GPT-Realtime-2, Translate, and Whisper, realtime speech, translation, and transcription are now, relative to human language, much more…

  33. Mastodon — mastodon.social TIER_1 English(EN) · aihaberleri ·

    📰 AlphaEvolve: How Google DeepMind's Gemini AI Transforms Algorithm Design in 2026. AlphaEvolve, a Gemini-powered coding agent developed by Google DeepMind, is…

    📰 AlphaEvolve: How Google DeepMind’s Gemini AI Transforms Algorithm Design in 2026 AlphaEvolve, a Gemini-powered coding agent developed by Google DeepMind, is reshaping how advanced algorithms are designed across scientific and engineering domains. By autonomously generating and …

  34. Mastodon — mastodon.social TIER_1 Türkçe(TR) · aihaberleri ·

    📰 AlphaEvolve generates code with Gemini AI: automating scientific discovery 2026. Google DeepMind, using Gemini AI to design algorithms automatically…

    📰 AlphaEvolve, Generating Code with Gemini AI: Automating Scientific Discovery 2026. Google DeepMind has introduced AlphaEvolve, which uses its Gemini AI to design algorithms automatically. The system does not stop at generating code; with its capacity to solve scientific problems, …

  35. Mastodon — mastodon.social TIER_1 English(EN) · aihaberleri ·

    📰 2026's Best Real-Time Speech Processing APIs: OpenAI Whisper, GPT-4o & More. OpenAI has unveiled a next-generation realtime voice API suite…

    📰 2026’s Best Real-Time Speech Processing APIs: OpenAI Whisper, GPT-4o & More OpenAI has unveiled a next-generation voice API suite capable of real-time speech processing, integrating advanced inference, translation, and transcription. This innovation aims to redefine human-AI in…

  36. Mastodon — mastodon.social TIER_1 Türkçe(TR) · aihaberleri ·

    📰 OpenAI Voice APIs 2026: the AI realtime voice processing revolution. OpenAI announces three new APIs that redefine the future of voice interaction: GPT-R…

    📰 OpenAI Voice APIs 2026: An AI Revolution Through Realtime Voice Processing. OpenAI has announced three new APIs that redefine the future of voice interaction: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. These technologies … the capacity of AI to communicate by voice…

  37. Mastodon — mastodon.social TIER_1 English(EN) · aihaberleri ·

    📰 Audio Models in API 2026: Build Voice Apps 70% Faster with OpenAI's New Tools. OpenAI has introduced three new audio models in its API, empowering developers to…

    📰 Audio Models in API 2026: Build Voice Apps 70% Faster with OpenAI’s New Tools OpenAI has introduced three new audio models in its API, empowering developers to build advanced voice applications. This move aligns with broader industry efforts to standardize AI transparency and b…

  38. Mastodon — mastodon.social TIER_1 Türkçe(TR) · aihaberleri ·

    📰 New voice models in 2026: a revolution in voice APIs and AI voice apps for developers. Three models will change the future of voice apps for developers…

    📰 New Voice Models 2026: A Revolution in Voice APIs and AI Voice Apps for Developers. Three new voice models that will change the future of voice-based applications for developers have been announced. This step profoundly redefines the capacity of AI to interact through voice...…

  39. Mastodon — mastodon.social TIER_1 English(EN) · aihaberleri ·

    📰 How OpenAI Codex Got Me Banned from Reddit (2026). Codex usage in development is sparking debate after a builder was banned from Reddit for disclosing AI tool use…

    📰 How OpenAI Codex Got Me Banned from Reddit (2026) Codex usage in development is sparking debate after a builder was banned from Reddit for disclosing AI tool use. The incident highlights growing tensions between AI efficiency and community transparency norms.... # AINews # AI #…

  40. Mastodon — mastodon.social TIER_1 Türkçe(TR) · aihaberleri ·

    📰 Does Reddit ban the use of OpenAI Codex in 2026? A critical warning for developers. OpenAI's Codex tools are very popular among developers…

    📰 Was OpenAI Codex Use Banned on Reddit in 2026? A Critical Warning for Developers. While OpenAI's Codex tools are gaining great popularity among developers, account bans over their use have begun to occur on Reddit. This is not just a technical problem, …

  41. Mastodon — mastodon.social TIER_1 Svenska(SV) · redaktionen ·

    OpenAI's new voice intelligence: a revolution for customer service and education https://redaktionen.net/artikel/985 #ai #svtech

    OpenAI's New Voice Intelligence: A Revolution for Customer Service and Education https://redaktionen.net/artikel/985 #ai #svtech

  42. Mastodon — mastodon.social TIER_1 English(EN) · sagalinked ·

    📰 OpenAI has introduced new voice intelligence features to its API, which could benefit customer service systems and have applications across various fields…

    📰 OpenAI has introduced new voice intelligence features to its API, which could be beneficial for customer service systems and have applications across various fields such as education and creator platforms. 🔗 https://techcrunch.com/2026/05/07/openai-launches-new-voice-intellig…

  43. Mastodon — mastodon.social TIER_1 English(EN) · [email protected] ·

    Inworld AI has launched Realtime TTS-2, a closed-loop voice model that adapts to how users actually talk. Unlike traditional text-to-speech systems, TTS-2 hears…

    Inworld AI has launched Realtime TTS-2, a closed-loop voice model that adapts to how users actually talk. Unlike traditional text-to-speech systems, TTS-2 hears the full audio context of each conversation - not just transcripts - allowing it to detect tone, pacing and emotion. ht…

  44. Mastodon — mastodon.social TIER_1 English(EN) · [email protected] ·

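    OpenAI serves 900M weekly users with voice AI, but traditional pipelines caused unacceptable latency. This deep dive reveals their ingenious solution: an "audio…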

    OpenAI serves 900M weekly users with voice AI, but traditional pipelines caused unacceptable latency. This deep dive reveals their ingenious solution: an "audio-native" architecture built on a re-engineered WebRTC stack. They tackled "one-port-per-session" and stateful protocol i…

  45. Mastodon — mastodon.social TIER_1 Svenska(SV) · redaktionen ·

    OpenAI's new voice AI: low latency, faster than ever

    OpenAI's New Voice AI: Faster Than Ever with Low Latency https://redaktionen.net/artikel/881 #ai #svtech