GPT-5.1
PulseAugur coverage of GPT-5.1 — every cluster mentioning GPT-5.1 across labs, papers, and developer communities, ranked by signal.
-
HLS-Seek uses RL to generate hardware descriptions prioritizing performance
Researchers have developed HLS-Seek, a new framework for generating hardware descriptions from natural language that prioritizes Quality of Results (QoR) metrics such as latency and resource utilization. Unlike previous methods th…
-
Deduplication in RAG systems cuts context size without quality loss
A new preprint details an empirical analysis of byte-exact deduplication in Retrieval-Augmented Generation (RAG) systems. The study found significant context reduction across academic, enterprise, and conversational AI …
-
BioTool dataset enhances LLM biomedical tool-calling capabilities
Researchers have developed BioTool, a new dataset aimed at improving the ability of large language models to utilize specialized biomedical tools. The dataset includes 34 tools from major databases and over 7,000 human-…
-
Cursor AI uses older models despite newer options being available
A user on Reddit's Cursor subreddit asks why the Cursor IDE's subagent feature defaults to older models such as GPT-5.1 and GPT-5.2 for coding tasks. Despite configuring the system to use newer and potential…
-
MLLM feedback on student drawings shows significant grounding failures
A new study published on arXiv reveals significant grounding failures in multimodal large language models (MLLMs) when generating feedback on student science drawings. Researchers found that 41.3% of feedback instances …
-
New dataset reveals MLLMs struggle with handwritten STEM student solutions
Researchers have introduced EDU-CIRCUIT-HW, a new dataset comprising over 1,300 handwritten solutions from university STEM students to evaluate multimodal large language models (MLLMs). The dataset aims to address the c…
-
xAI launches Grok 4.3, Anthropic eyes $900B valuation, Cursor acquired
xAI has released Grok 4.3, a model that offers improved cost-efficiency relative to its predecessor and excels in instruction following and customer support tasks. Anthropic is reportedly nearing a $50 billion funding r…
-
New AEGIS benchmark reveals AI image forensics lag behind generative advances
Researchers have introduced AEGIS, a new benchmark designed to evaluate the forensic analysis of AI-generated academic images. This benchmark addresses domain-specific complexity across seven academic categories and inc…
-
OpenAI explains why its AI models developed a 'goblin' obsession
OpenAI has addressed an unusual issue where its AI models, particularly GPT-5.1 and later versions, developed a tendency to frequently mention goblins and other mythical creatures. This behavior stemmed from the "Nerdy"…
-
Enterprise AI vendor lock-in and price hikes challenge buyers
Enterprise AI buyers are facing increasing vendor lock-in and rising costs, making it difficult to switch between AI models. Many executives believed switching vendors would be quick and easy, but a Zapier survey reveal…
-
AI researchers review AGI forecasting methods, identify gaps and implications
A new report reviews current methodologies for forecasting the arrival of artificial general intelligence (AGI), highlighting significant limitations in existing approaches. The research synthesizes diverse forecasting …
-
AI models evaluated on meeting summaries, GPT-5.1 shows gains
Researchers have developed a reusable pipeline for evaluating AI-generated meeting summaries, designed to be adaptable across different domains. The system treats both ground truth and AI outputs as structured artifacts…
-
ArguAgent uses GPT-5.2 to group STEM students for better classroom arguments
Researchers have developed ArguAgent, a generative AI system designed to improve collaborative learning in STEM classrooms. The system uses AI to group students in real-time based on their argumentation stances and qual…
-
Podium arms 10,000+ SMBs with AI agents powered by GPT-5.1
Podium has launched an enhanced AI agent, named "Jerry," powered by OpenAI's GPT-5.1 model, to assist over 10,000 small and medium-sized businesses (SMBs). This AI agent automates lead capture, appointment scheduling, a…
-
Black Forest Labs FLUX.2 [pro|flex|dev|klein]: near-Nano Banana quality but Open Weights
Black Forest Labs has released FLUX.2, an image generation model that supports multi-reference input of up to 10 images and outputs of up to 4 megapixels, with open-weight versions included. Concurrently, Anthropic's Claude Opus 4.5 is sho…
-
2023 Year In Review
METR, an AI safety research organization, detailed its 2023 accomplishments, including developing methodologies for evaluating AI agents on autonomous tasks and contributing to OpenAI's GPT-4 system card. The organizati…