ENTITY AIME 2025

PulseAugur coverage of AIME 2025: every cluster that mentions it across labs, papers, and developer communities, ranked by signal.

Total · 30d: 16 (16 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 15 (15 over 90d)

TIER MIX · 90D

significant 1
research 8
tool 7

SENTIMENT · 30D

6 days with sentiment data

RECENT · PAGE 1/1 · 16 TOTAL

TOOL · CL_111725 · Jun 26 · 04:00

New method uses wrong drafts to boost LLM math capabilities

Researchers have developed a novel technique called "Weak-to-Strong Elicitation via Mismatched Wrong Drafts" to improve the capabilities of large language models. This method involves using mathematically incorrect draf…

TOOL · CL_107950 · Jun 24 · 04:00

New framework VeryTrace verifies and repairs LLM reasoning traces

Researchers have developed VeryTrace, a new framework designed to verify and repair reasoning traces generated by large language models (LLMs). This system formalizes natural language reasoning into a structured, compil…

TOOL · CL_106821 · Jun 22 · 15:07

New benchmark TriggerBench reveals prospective memory challenges for LLMs

Researchers have introduced TriggerBench, a new benchmark designed to evaluate prospective memory (PM) in large language models (LLMs). Unlike retrospective memory (RM), which relies on explicit queries, PM assesses an …

RESEARCH · CL_104687 · Jun 21 · 17:20

New framework unifies image generation capabilities; research tackles distillation challenges

Researchers have introduced DanceOPD, a novel on-policy generative field distillation framework designed to unify diverse image generation capabilities like text-to-image, local editing, and global editing within a sing…

TOOL · CL_106806 · Jun 17 · 00:00

New TAPO Method Enhances LLM Reasoning via Explicit Error Correction

Researchers have introduced Trajectory-Augmented Policy Optimization (TAPO), a novel method for enhancing large language model reasoning through self-distillation. Unlike traditional methods that implicitly align model …

RESEARCH · CL_98141 · Jun 17 · 00:00

New TAPO method enhances LLM self-distillation with explicit error correction · 4 sources tracked

Researchers have introduced Trajectory-Augmented Policy Optimization (TAPO), a novel method for self-distillation in large language models. Unlike traditional methods that implicitly align distributions, TAPO explicitly…

SIGNIFICANT · CL_70061 · Jun 4 · 03:24

Ideogram 4.0 leads open image model releases; Microsoft details MAI-Thinking-1

Ideogram has released version 4.0 of its open-source image generation model, which is now considered the best available in its category. This release, alongside Reve's advancements, highlights significant progress in AI…

RESEARCH · CL_61375 · May 27 · 18:09

NVIDIA quantizes Alibaba's Qwen3.6-35B model for efficient deployment

NVIDIA has released a quantized version of Alibaba's Qwen3.6-35B-A3B model, named nvidia/Qwen3.6-35B-A3B-NVFP4. This model utilizes the NVFP4 data type, reducing memory requirements by approximately 3.06x while maintain…

RESEARCH · CL_51260 · May 26 · 04:00

New methods optimize LLM inference by analyzing confidence dynamics

Two new research papers propose methods to optimize the inference time of large language models by analyzing their confidence levels during reasoning. The first paper, EAGer, uses token-wise entropy to dynamically alloc…

TOOL · CL_44850 · May 22 · 04:00

New benchmark reveals LLM reasoning failures and Claude's refusals

Researchers have developed the Robust Reasoning Benchmark (RRB), a new evaluation pipeline that tests large language models on mathematical problems with deliberate textual perturbations. The benchmark revealed that whi…

RESEARCH · CL_44784 · May 22 · 04:00

New methods enhance on-policy distillation for LLM training

Researchers have developed new methods to improve on-policy distillation (OPD), a technique for training smaller language models using larger ones. One approach, TIP, identifies informative tokens by analyzing student e…

RESEARCH · CL_24496 · May 9 · 22:24

NVIDIA Star Elastic embeds multiple reasoning models in one checkpoint

NVIDIA researchers have introduced Star Elastic, a novel post-training method that embeds multiple reasoning models of varying parameter sizes within a single checkpoint. This approach allows for the extraction of small…

TOOL · CL_20550 · May 7 · 04:00

New RLVR method enhances LLM reasoning with positive-negative prompt pairing

Researchers have developed a new method called prompt-efficient RLVR that improves the training of large language models for reasoning tasks. This technique focuses on selecting prompts that provide both positive anchor…

RESEARCH · CL_20477 · May 6 · 16:44

New RL method optimizes agent training by controlling rollout pass rates

Researchers have developed a new technique called Prefix Sampling (PS) to improve the efficiency of reinforcement learning (RL) for AI agents. This method addresses wasted compute on rollout groups with skewed pass rate…

RESEARCH · CL_02960 · Apr 23 · 12:36

Process Supervision via Verbal Critique Improves Reasoning in Large Language Models

Researchers have developed a new framework called Verbal Process Supervision (VPS) that enhances the reasoning capabilities of large language models without requiring gradient updates. This method utilizes structured na…

RESEARCH · CL_103038 · Jan 27 · 18:58

New research explores multilingual LLM scaling, knowledge integration, and specialized evaluation

Researchers are developing new methods and benchmarks to improve the capabilities and evaluation of large language models (LLMs). Google DeepMind has introduced ATLAS, a framework for optimizing multilingual LLM trainin…