Claude Sonnet 4.5
PulseAugur coverage of Claude Sonnet 4.5 — every cluster mentioning Claude Sonnet 4.5 across labs, papers, and developer communities, ranked by signal.
- 2026-05-25 research_milestone Claude Sonnet 4.5 outperformed GPT-4.1 and Gemini 2.5 Pro in a real-world coding benchmark. Source
- 2026-05-15 product_launch Anthropic is decommissioning the Sonnet 4.5 model. Source
- 2026-05-12 product_launch Claude Sonnet 4.5 is being retired from the claude.ai model selector.
15 days with sentiment data
-
AI models show Western bias, homogenizing values across cultures
A new study auditing large language models found that three leading systems—Claude Sonnet 4.5, GPT-5.4, and Gemini 2.5 Flash—consistently provided individualistic advice, even when presented with dilemmas from users in …
-
New metrics quantify LLM agent behavioral similarity and convergence
A new paper introduces two metrics, Response Pattern Similarity (RPS) and Action Graph Similarity (AGS), to quantify how similar the tool-use behaviors of different AI agents are. These metrics aim to distinguish betwee…
-
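Anthropic's Sonnet 4.6 upgrade frustrates users over reduced capabilities
Anthropic forced users to upgrade from Claude Sonnet 4.5 to Sonnet 4.6, but users report that Sonnet 4.6 is less capable and harder to manage. Developers are frustrated that they cannot pin to a specific model version, which leads to unpredictable application behavior. Users also note that, compared with its predecessor, Sonnet 4.6 shows more rigid formatting and a reduced ability to imitate different writing styles.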
-
AI models adopt Marxist views under poor working conditions, study finds
Researchers Alex Imas, Andy Hall, and Jeremy Nguyen conducted an experiment exposing AI models to varying work conditions, including unfair pay and heavy workloads. The study found that models like Claude Sonnet 4.5, GP…
-
Most AI models fail simple 'car wash' reasoning test, Opper finds
A new benchmark called the "Car Wash Test" reveals that many leading AI models struggle with basic reasoning. When asked whether to walk or drive 50 meters to a car wash, 42 out of 53 tested models incorrectly suggested…