OpenAI ships GPT-5-class voice models for real-time reasoning, translation, and transcription
By PulseAugur Editorial
Summary by gemini-2.5-flash-lite
from 45 sources
OpenAI has released three new real-time voice models: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. These models offer enhanced reasoning capabilities, live speech translation for over 70 languages, and low-latency transcription. GPT-Realtime-2, in particular, is described as having "GPT-5-class reasoning" and features a significantly expanded context window of 128K tokens, alongside improved handling of interruptions and tool usage.
AI
Explore new realtime voice models in the OpenAI API that can reason, translate, and transcribe speech, enabling more natural and intelligent voice experiences.
**OpenAI** released **GPT-Realtime-2**, a voice model with **GPT-5-class reasoning**, tool use, interruption handling, and extended context windows up to **128K tokens**, achieving top scores on **Big Bench Audio** and **Conversational Dynamics** benchmarks. They also launched a …
Three purpose-built audio models expand what developers can build with live voice: reasoning agents, speech translation across 70+ languages, and streaming transcription. https://www.marktechpost.com/2026/05/08/openai-releases-three-realtime-audio-mode…
Inworld AI's new model conditions on full audio context, not just transcripts, a meaningful architectural shift for voice-first AI agents. https://www.marktechpost.com/2026/05/05/inworld-ai-launches-realtime-tts-2-a-closed-loop-voice-model-that-ada…
ChatGPT voice can now do much more
OpenAI's new Realtime API trio: GPT-Realtime-2 brings GPT-5-class reasoning to live voice with 128K context, parallel tool calls, and configurable reasoning tiers. Zillow achieved 95% call success (up from 69%) on adversarial benchmarks. Plus real-time translation in 70+ language…
OpenAI released three new real-time voice models. The language models, intended for real-time speech translation, transcription, and conversation, are available to app developers immediately. https://dawn.fi/uutiset/2026/05/08/openai-reaaliaikaiset-aanimallit # OpenAI …
Email — The Rundown AI
OpenAI has upgraded its real-time voice and translation capabilities with the new GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. With multilingual support and high-accuracy speech recognition, it brings all languages together on a single platform. Already available in the APIs. 🚩 # AI # OpenA…
The new features could be handy for customer service systems, but OpenAI says they have applications that work across a variety of other fields, including education and creator platforms.
OpenAI has unveiled three new voice models, including GPT-Realtime-2 with GPT-5-class reasoning and GPT-Realtime-Translate supporting over 70 languages. The company says it is responding to viral videos highlighting its voice technology's shortcomings. https://gizmodo.com/openai-…
Inworld AI has launched Realtime TTS-2, a closed-loop voice model that hears the full audio of conversation turns to adapt its delivery to users' actual tone and emotional state. The model uses plain-language prompts like "[speak sadly, as if something bad just happened]" to steer…
https://winbuzzer.com/2026/05/10/openai-brings-gpt-5-class-reasoning-to-real-time-v-xcxwbn/ OpenAI has launched a three-model real-time voice lineup that separates reasoning, translation, and transcription instead of treating voice as one bundled chat feature. # AI # OpenAI # G…
OpenAI has new voice models that reason, translate, and transcribe as you speak. OpenAI has just released three new realtime voice models that it says will “unlock a new class of voice apps for developers.” Each new voice intelligence model has a unique speciality for different pu…
OpenAI is pushing deeper into voice. The company just launched three new realtime audio models in its API: GPT-Realtime-2 for conversational reasoning, GPT-Realtime-Translate for live multilingual translation, and GPT-Realtime-Whisper for streaming speech transcription. https:// …
OpenAI has released new voice intelligence features in its API, aimed at customer service, education and creator platforms. The tools enable real-time spoken interaction beyond basic transcription. https://techcrunch.com/2026/05/07/openai-launches-new-voice-intelligence-feature…
New blog post: When AI Finally Starts Talking Back (Properly) OpenAI's move to low-latency voice AI isn't just a tech upgrade; it's a fundamental shift that makes truly conversational AI a reality, with massive implications for everything from health tech to gaming. https:// rhod…
Tracking the AI Evolution in Research Libraries: Findings from ARL’s 2026 AI Quick Poll — Association of Research Libraries www.arl.org/blog/tracking-… #AI #libraries
📰 Realtime Audio Models 2026: OpenAI Unveils GPT-Realtime-2, Translate & Whisper OpenAI has introduced three new realtime audio models—GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper—designed to transform live voice applications with reasoning, translation, and l…
📰 OpenAI Redefines Talking AI with Real-Time Voice Models in 2026: GPT-4o, Whisper ... OpenAI has announced three new voice models that will revolutionize real-time conversation capabilities: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. These models…
📰 GPT-Realtime-2 Launches in 2026: 15.2% Faster Voice AI with Realtime Translation & Transcription OpenAI has unveiled GPT-Realtime-2, -Translate, and -Whisper — a suite of next-generation realtime voice APIs that set new state-of-the-art benchmarks in speech understanding and tr…
📰 Real-Time Voice APIs Set a New Standard in 2026 with GPT-Realtime-2. OpenAI has revolutionized real-time audio processing with GPT-Realtime-2, -Translate, and -Whisper. These new APIs completely redefine the naturalness and speed of spoken dialogue.... # YapayZekaAraçl…
📰 The Voice Communication Revolution in 2026: Real-Time Transcrip... with GPT-Realtime-2, Whisper, and Translate. OpenAI has marked a turning point in voice AI: with GPT-Realtime-2, Translate, and Whisper, real-time conversation, translation, and transcription are now, relative to human language, much more…
📰 AlphaEvolve: How Google DeepMind’s Gemini AI Transforms Algorithm Design in 2026 AlphaEvolve, a Gemini-powered coding agent developed by Google DeepMind, is reshaping how advanced algorithms are designed across scientific and engineering domains. By autonomously generating and …
📰 AlphaEvolve, the Code Generator Built on Gemini AI: Automating Scientific Discovery 2026. Google DeepMind has introduced AlphaEvolve, which uses its Gemini AI to design algorithms automatically. The system does not just generate code; with its capacity to solve scientific problems, interdisciplin…
📰 2026’s Best Real-Time Speech Processing APIs: OpenAI Whisper, GPT-4o & More OpenAI has unveiled a next-generation voice API suite capable of real-time speech processing, integrating advanced inference, translation, and transcription. This innovation aims to redefine human-AI in…
📰 OpenAI Voice APIs 2026: An AI Revolution with Real-Time Audio Processing. OpenAI has announced three new APIs that redefine the future of voice interactions: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. These technologies, AI's capacity to communicate by voice…
📰 Audio Models in API 2026: Build Voice Apps 70% Faster with OpenAI’s New Tools OpenAI has introduced three new audio models in its API, empowering developers to build advanced voice applications. This move aligns with broader industry efforts to standardize AI transparency and b…
📰 New Voice Models 2026: A Revolution in Voice APIs and AI Voice Apps for Developers. Three new voice models that will change the future of voice-based applications for developers have been announced. This step profoundly redefines AI's capacity to interact through voice...…
📰 How OpenAI Codex Got Me Banned from Reddit (2026) Codex usage in development is sparking debate after a builder was banned from Reddit for disclosing AI tool use. The incident highlights growing tensions between AI efficiency and community transparency norms.... # AINews # AI #…
📰 Was OpenAI Codex Use Banned on Reddit in 2026? A Critical Warning for Developers. While OpenAI's Codex tools are gaining great popularity among developers, account bans over their use have begun to occur on Reddit. This is not just a technical issue, …
📰 OpenAI has introduced new voice intelligence features to its API, which could be beneficial for customer service systems and have applications across various fields such as education and creator platforms. 🔗 https://techcrunch.com/2026/05/07/openai-launches-new-voice-intellig…
Inworld AI has launched Realtime TTS-2, a closed-loop voice model that adapts to how users actually talk. Unlike traditional text-to-speech systems, TTS-2 hears the full audio context of each conversation - not just transcripts - allowing it to detect tone, pacing and emotion. ht…
OpenAI serves 900M weekly users with voice AI, but traditional pipelines caused unacceptable latency. This deep dive reveals their ingenious solution: an "audio-native" architecture built on a re-engineered WebRTC stack. They tackled "one-port-per-session" and stateful protocol i…