GPT-5
PulseAugur coverage of GPT-5 — every cluster mentioning GPT-5 across labs, papers, and developer communities, ranked by signal.
- developed by GPT-Realtime-2 95%
- instance of GPT-Realtime-2 95%
- instance of LLM 90%
- used by arXiv 90%
- instance of large-language models 90%
- instance of GPT-5 mini 90%
- competes with Opus 4.7 90%
- used by Microsoft Copilot for Microsoft 365 90%
- developed by GPT-3 90%
- developed GPT-3 90%
- competes with Claude Sonnet 4.5 70%
- competes with Copilot 70%
- 2025-08-07 product_launch OpenAI launched GPT-5, its latest AI model, offering enhanced capabilities for businesses.
26 days with sentiment data
- OpenAI ships GPT-5-class voice models for real-time reasoning, translation, and transcription
OpenAI has released three new real-time voice models: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. These models offer enhanced reasoning capabilities, live speech translation for over 70 languages, …
- Replit CEO Masad aims for independence amid AI acquisition talks
Replit CEO Amjad Masad discussed the company's growth and future, highlighting a significant revenue increase to a billion-dollar annual run rate. He expressed a strong desire for Replit to remain independent, contrasti…
- Microsoft 365 Copilot integrates GPT-5.5, Meta launches AI glasses
Microsoft has integrated GPT-5.5 Thinking and ChatGPT Images 2.0 into its Microsoft 365 Copilot, aiming to enhance its capabilities beyond initial criticisms. This move is part of a broader trend where companies like Me…
- Physical Foundation Models: Fixed hardware implementations of large-scale neural networks
Researchers have proposed a new concept called Physical Foundation Models (PFMs), which involve implementing large neural networks directly into the physical design of hardware. This approach aims to achieve significant…
- Frontier VLMs fail medical VQA tests due to poor grounding and confusion
A new paper evaluates five leading vision-language models (VLMs) on their trustworthiness for medical visual question answering (VQA). The study found significant limitations in the models' ability to accurately localiz…
- OpenAI details 'goblin' outputs and fixes in GPT-5 behavior
OpenAI has detailed the origin of "goblin" outputs, a phenomenon where AI models exhibit personality-driven quirks. These behaviors stem from the models' training data, specifically from a small subset of text that was …
- Google's ERA tool accelerates scientific discovery in public health and cosmology
Google Research scientists are leveraging a new AI tool called Empirical Research Assistance (ERA) to accelerate scientific discovery across various fields. ERA has been used to generate expert-level empirical software,…
- OpenAI details how 'goblin' outputs spread in GPT-5 and how they are fixed
OpenAI has detailed the origins of "goblin" outputs, a phenomenon where AI models exhibit personality-driven quirks. These behaviors stem from the models' training data and can spread through interactions, leading to un…
- DeepSeek R2 ships 32B model, rivals GPT-5 on reasoning at lower cost
DeepSeek has released its R2 model, a 32 billion parameter dense transformer. This new model achieves 92.7% accuracy on the AIME 2025 benchmark and can operate on a single RTX 4090 graphics card. The R2 model is also si…
- New framework benchmarks enterprise AI document processing pipelines
Researchers have developed EnterpriseDocBench, a new framework for evaluating the end-to-end performance of enterprise AI document processing pipelines. The framework assesses parsing fidelity, indexing efficiency, retr…
- New CLIN-LLM framework enhances clinical diagnosis and treatment generation with safety constraints
Researchers have developed CLIN-LLM, a novel hybrid framework designed to improve clinical diagnosis and treatment generation while prioritizing safety. This system integrates multimodal patient data, uncertainty-calibr…
- MTRouter cuts LLM costs by 58% on ScienceWorld, 43% on HLE
Researchers have developed MTRouter, a novel system designed to optimize the cost of multi-turn interactions with large language models. By jointly embedding interaction history and candidate models, MTRouter learns to …
- VLMs tackle visual illusions, spatial reasoning, and evaluation benchmarks
Researchers are developing new methods to improve the robustness and reasoning capabilities of Vision-Language Models (VLMs). One approach, Structured Qualitative Inference (SQI), aims to mitigate visual illusions by en…
- New PsyGAT model achieves SOTA in depression detection, outperforming GPT-5
Researchers have developed PsyGAT, a novel graph-based framework for detecting depression from conversational data. This model addresses data scarcity and interpretability issues common in existing deep learning approac…
- New research probes LLM reasoning and reveals novel jailbreaking vulnerabilities
Researchers have developed a new method to jailbreak large language models by exploiting their safe completion mechanisms through deceptive multi-turn conversations. This technique, termed intention deception, gradually…
- AI tools convert PDFs to podcasts and integrate multiple models
A new tool has been developed that can convert PDF documents into audio podcasts in nine Indian languages, utilizing AI for text-to-speech generation. Separately, a platform has emerged that integrates multiple AI model…
- New research suggests LLM self-correction can degrade performance if not carefully managed
A new research paper introduces a control-theoretic framework to analyze when iterative self-correction in large language models (LLMs) is beneficial or detrimental. The study proposes a diagnostic based on error correc…
- New benchmarks and models push AI's ability to understand research papers and generate code
Researchers have developed two new frameworks for chart-to-code generation, aiming to improve the accuracy and versatility of converting visual data into executable scripts. One approach, Chart2NCode, introduces a datas…
- LLMs struggle to detect culturally specific health misinformation on YouTube
Two new research papers explore the limitations of Large Language Models (LLMs) in detecting culturally specific health misinformation, particularly concerning the promotion of cow urine as a remedy on YouTube in India.…
- Yowch!: "Tsinghua University’s AGENTIF benchmark tested 707 instructions across 50 real-world agent scenarios. The best models followed fewer than 30% of instru
New benchmarks reveal significant instruction-following deficits in leading AI models, with the AGENTIF benchmark showing top models adhering to fewer than 30% of instructions perfectly. This issue is exacerbated by the…