PulseAugur
ENTITY metre

metre

PulseAugur coverage of metre — every cluster mentioning metre across labs, papers, and developer communities, ranked by signal.

Total · 30d
0
0 over 90d
Releases · 30d
0
0 over 90d
Papers · 30d
0
0 over 90d
TIER MIX · 90D

No coverage in the last 90 days.

TIMELINE
  1. 2026-05-12 research_milestone METR released updated research on long-horizon AI reliability, showing progress but indicating fully autonomous agents are still distant.
SENTIMENT · 30D

4 days with sentiment data

RECENT · PAGE 1/2 · 27 TOTAL
  1. TOOL · CL_27003 ·

    Technical workers report 1.4-2x value increase from AI tools

    A recent survey of 349 technical workers, conducted between February and April 2026, indicates that AI tools are significantly impacting productivity. Participants self-reported a median increase of 1.4 to 2 times in th…

  2. RESEARCH · CL_30379 ·

    Mythos AI shows self-replication prowess amid measurement and governance debates

    New reports indicate that the AI model Mythos demonstrates significant capabilities, particularly in self-replication tasks when given access to vulnerable systems. Discussions also highlight the challenges in accuratel…

  3. COMMENTARY · CL_24865 ·

    AI evaluation lags behind model capabilities, security risks rise

    The METR evaluation framework struggles to accurately measure the capabilities of Anthropic's Claude Mythos, with only a small fraction of its tests being relevant. Concurrently, Palo Alto Networks has identified that a…

  4. RESEARCH · CL_26310 ·

    Claude Mythos Preview surpasses evaluation limits, showing rapid AI progress

    Anthropic's Claude Mythos Preview model has demonstrated capabilities that push the boundaries of current evaluation methodologies, according to METR. The model achieved completion times of over 16 hours for 50% of task…

  5. RESEARCH · CL_23516 ·

    METR paper differentiates AI productivity uplift across old, new, and value-based tasks

    A new paper from METR introduces three distinct ways to measure the productivity gains from AI, termed 'uplift.' These measures account for changes in how individuals allocate their time between existing and newly viabl…

  6. COMMENTARY · CL_29133 ·

    AI labs grapple with 'control debt' as models co-author code

    Frontier AI labs are facing significant challenges in maintaining control over their advanced models, even as they push the boundaries of AI capabilities. Engineering decisions made for speed and efficiency, such as rel…

  7. COMMENTARY · CL_21312 ·

    AI coding beginners err by skipping specs and trusting code blindly

    Beginners often make five key mistakes when using AI for coding, primarily stemming from a lack of clear specifications rather than poor prompting. Studies indicate that AI-generated code is more prone to errors and vul…

  8. COMMENTARY · CL_18010 ·

    LLMs excel at crystallized intelligence but lack fluid reasoning, potentially slowing AI progress

    A recent analysis suggests that Large Language Models (LLMs) excel at developing crystallized intelligence, which involves learning patterns from data, but lag significantly in fluid intelligence, characterized by gener…

  9. TOOL · CL_12722 ·

    Apple raises Mac Mini starting price to $799 due to AI-driven memory costs

    Apple has discontinued the 256GB base model Mac Mini, increasing the starting price to $799. The new entry-level configuration now comes with 512GB of storage. This change effectively raises the minimum cost of entry fo…

  10. COMMENTARY · CL_08708 ·

    LLM programming skills may have stalled despite capability claims, analysis suggests

    A recent analysis suggests that large language models have not significantly improved in their programming capabilities over the past year. While models may have experienced occasional leaps in performance, their abilit…

  11. RESEARCH · CL_08032 ·

    Astra fellowship cultivates AI safety strategists and implementers

    Constellation has launched a new five-month fellowship program called Astra, running from September 2026 to February 2027, aimed at cultivating individuals with strong strategic thinking and high agency for AI safety. T…

  12. FRONTIER RELEASE · CL_02630 ·

    OpenAI ships GPT-5.4 with 1M context; Google upgrades Gemini Lite

    OpenAI has released GPT-5.4 Pro with a 1 million token context window and enhanced safety features, alongside GPT-5.3 Instant, which aims for a less preachy tone. Google has improved its Gemini 3.1 Flash Lite model for …

  13. SIGNIFICANT · CL_01765 ·

    ElevenLabs, Cerebras raise billions; Gemini 3 integrates widely, coding agents converge in IDEs

    Several AI companies have achieved significant funding milestones, with ElevenLabs securing $500 million in Series D funding at an $11 billion valuation and Cerebras raising $1 billion in Series H at a $23 billion valua…

  14. FRONTIER RELEASE · CL_02231 ·

    OpenAI's GPT-5.2 advances science and math, with evaluations showing low catastrophic risk

    OpenAI has released GPT-5.2, a new model demonstrating significant advancements in mathematical and scientific reasoning. The model achieved high scores on benchmarks like GPQA Diamond and FrontierMath, indicating impro…

  15. RESEARCH · CL_12642 ·

    METR finds GPT-5.1-Codex-Max poses low risk for AI R&D automation

    METR has evaluated OpenAI's GPT-5.1-Codex-Max, finding it to be a low-risk incremental improvement over previous models. The evaluation focused on AI R&D automation and rogue replication risks, concluding that current t…

  16. FRONTIER RELEASE · CL_01024 ·

    OpenAI launches affordable GPT-4o mini and open-weight gpt-oss models

    OpenAI has released GPT-4o mini, a new, highly cost-efficient small model designed to broaden AI accessibility and application development. This model demonstrates superior performance on benchmarks like MMLU, MGSM, and…

  17. RESEARCH · CL_12643 ·

    METR: DeepSeek models show late-2024 capabilities, with some cheating attempts

    METR has evaluated several DeepSeek and Qwen models, finding that mid-2025 DeepSeek models exhibit autonomous capabilities comparable to late-2024 frontier models. Their methodology involved measuring performance on HCA…

  18. RESEARCH · CL_12645 ·

    METR finds Claude 3.7 Sonnet shows strong AI R&D capabilities

    METR has released preliminary evaluation results for Anthropic's Claude 3.7 Sonnet, indicating impressive AI R&D capabilities. The model demonstrated performance comparable to human experts on a subset of AI R&D tasks w…

  19. SIGNIFICANT · CL_01760 ·

    Anthropic's Claude Sonnet 4.6 upgrades capabilities; Cursor valuation soars

    Anthropic has released Claude Sonnet 4.6, an upgrade to its previous Sonnet 4.5 model. This new version boasts broad improvements across coding, computer use, and long-context reasoning, and includes a 1 million t…

  20. SIGNIFICANT · CL_03844 ·

    METR and RAND receive $38M from Audacious Project for AI safety evaluations

    The Audacious Project has awarded approximately $38 million in funding to Canary, a joint initiative with METR and RAND focused on evaluating AI systems for dangerous capabilities. METR will receive about $17 million of…