<p>METR conducted preliminary evaluations of Anthropic’s upgraded Claude 3.5 Sonnet (October 2024 release), and a pre-deployment checkpoint of OpenAI’s o1. In both cases, we <strong>failed to find significant evidence for a dangerous level of capabilities</strong>. We provide mor…
**Anthropic** launched **Claude Sonnet 4.6**, an upgrade over Sonnet 4.5, featuring broad improvements in **coding, long-context reasoning, agent planning, knowledge work, and design**, plus a **1M-token context window (beta)**. Benchmarks show Sonnet 4.6 leading on **GDPval-AA E…
**Cursor** is reportedly fundraising at a **$28 billion valuation with $1 billion ARR**, while the combined **Cognition+Windsurf** entity is fundraising at a **$10 billion valuation** after acquiring Windsurf remainco for $300 million. The competition between AI coding agents int…
**OpenAI** is reportedly close to closing a deal with Windsurf, coinciding with **Cursor's** $900M funding round at a $9B valuation. **Nvidia** launched the **Llama-Nemotron series** featuring models from 8B to 253B parameters, praised for reasoning and inference efficiency. **Al…
**Anthropic** announced new Claude 3.5 models: **3.5 Sonnet** and **3.5 Haiku**, improving coding performance significantly, with Sonnet topping several coding benchmarks like **Aider** and **Vectara**. The new **Computer Use API** enables controlling computers via vision, scorin…
<!-- SC_OFF --><div class="md"><p>I notice recently most prompt's which i give to Opus 4.6 takes longer and mostly doesn't manage to do what i ask while Composer does it correctly and faster, but when Composer was released was pretty bad, makes me thing does Composer train from o…