Math-500

PulseAugur coverage of Math-500 — every cluster mentioning Math-500 across labs, papers, and developer communities, ranked by signal.

Total · 30d: 19 (19 over 90d)

Releases · 30d: 0 (0 over 90d)

Papers · 30d: 18 (18 over 90d)

SENTIMENT · 30D: 7 days with sentiment data

RECENT · PAGE 1/1 · 19 TOTAL

TOOL · CL_111725 · Jun 26 · 04:00

New method uses wrong drafts to boost LLM math capabilities

Researchers have developed a novel technique called "Weak-to-Strong Elicitation via Mismatched Wrong Drafts" to improve the capabilities of large language models. This method involves using mathematically incorrect draf…
RESEARCH · CL_108502 · Jun 24 · 10:18

New EpiKV method optimizes LLM KV cache, boosting efficiency and context length

A new research paper introduces EpiKV, a method for optimizing KV cache eviction in large language models. Unlike previous methods that rely on attention weights, EpiKV uses an "epiphany score" derived from changes in t…
RESEARCH · CL_107855 · Jun 22 · 23:54

AI benchmark scores predictable from just two factors, study finds

A new research paper proposes a method called BenchPress that can predict a frontier model's performance across numerous benchmarks using only two key scores. The study analyzed 84 models and 133 benchmarks, finding tha…
RESEARCH · CL_93469 · Jun 16 · 04:00

New methods boost LLM inference speed via speculative decoding · 7 sources tracked

Researchers are developing advanced speculative decoding techniques to accelerate large language model (LLM) inference. JetFlow, a new framework, improves speed by combining drafting efficiency with causal conditioning,…
TOOL · CL_93231 · Jun 16 · 04:00

New study tests AI proof formalization models for robustness

A new study on arXiv evaluates the robustness of proof autoformalization models, which translate natural language mathematical proofs into formal languages like Lean 4. Researchers introduced global and local perturbati…
RESEARCH · CL_93385 · Jun 15 · 12:14

New EGLR Method Expands Language Model Reasoning Beyond Stochastic Sampling

Researchers have introduced Entropy-Gated Latent Recursion (EGLR), a novel decoding procedure designed to enhance language model reasoning by expanding the sampling space beyond traditional token-level stochasticity. EG…
TOOL · CL_79919 · Jun 9 · 04:00

MixReasoning framework optimizes AI model efficiency by adapting reasoning depth

Researchers have developed a new framework called MixReasoning that dynamically adjusts the depth of reasoning within a single response. This approach allows models to apply detailed reasoning to complex steps while usi…
TOOL · CL_67194 · Jun 2 · 16:22

DeepSeek releases distilled R1 models for local AI inference

DeepSeek has released six distilled versions of its R1 reasoning model, designed for local AI deployment on consumer hardware. These smaller models, derived from the 671B-parameter original, range from 1.1GB to …
TOOL · CL_65916 · Jun 2 · 04:00

New framework stress-tests AI process reward models for vulnerabilities

Researchers have developed EST-PRM, a new framework designed to stress-test process reward models (PRMs) used in language model training. PRMs assume their scores remain stable even when reasoning steps are altered whil…
RESEARCH · CL_56153 · May 26 · 18:26

New Framework Unpacks LLM Pipeline Failures in Detection and Correction

A new research paper introduces a framework to understand the puzzling behaviors observed in multi-stage Large Language Model (LLM) pipelines, such as accuracy plateaus and reversals. The proposed model decomposes agent…
TOOL · CL_51356 · May 26 · 04:00

New Bilevel Approach Enhances LLM Learning with Textual Feedback

Researchers have developed a novel bilevel approach for reinforcement learning with textual feedback, aiming to improve sample efficiency in LLMs. This new method, called Bilevel Natural Language Actor-Critic (Bi-NAC), …
TOOL · CL_44879 · May 22 · 04:00

New method steers LLM attention to correct reasoning errors

Researchers have developed Manifold-Guided Attention Steering (MAGS), a novel method to improve the reasoning capabilities of large language models. MAGS identifies deviations from a 'correctness manifold' in the model'…
RESEARCH · CL_44784 · May 22 · 04:00

New methods enhance on-policy distillation for LLM training

Researchers have developed new methods to improve on-policy distillation (OPD), a technique for training smaller language models using larger ones. One approach, TIP, identifies informative tokens by analyzing student e…
TOOL · CL_32717 · May 14 · 02:50

New KV-cache compression method alpha outperforms existing techniques

Researchers have developed a new KV-cache compression method called alpha, which uses a diversity-penalty survivor approach. This method was found to outperform seven other mechanisms in a design-space study on mathemat…
TOOL · CL_25615 · May 8 · 12:58

New RL algorithm fix boosts GSM8K accuracy by 45 points

Researchers have identified a critical issue in the Group Relative Policy Optimization (GRPO) algorithm when applied to binary rewards, leading to "gradient starvation." This occurs when all responses in a group are eit…
TOOL · CL_25616 · May 8 · 12:54

New research reveals "coupling tax" limits LLM reasoning accuracy

A new research paper introduces the concept of a "coupling tax" in large language models, highlighting how shared token budgets for reasoning and final answers can hinder accuracy. The study found that for certain tasks…
TOOL · CL_22221 · May 8 · 04:00

Self-consistency technique shows diminishing returns for modern LLMs

A new study suggests that the self-consistency technique, which involves generating multiple reasoning paths to improve LLM accuracy, is becoming less effective and more costly. Researchers found minimal accuracy gains …
RESEARCH · CL_11738 · May 1 · 04:00

BoostLoRA method grows adapter rank to surpass full fine-tuning

Researchers have introduced BoostLoRA, a novel parameter-efficient fine-tuning method designed to enhance model expressivity without increasing inference overhead. This technique iteratively trains and merges small adap…
RESEARCH · CL_07099 · Apr 28 · 01:55

Sleeper Agent Backdoor Results Are Messy

Researchers attempted to replicate the "Sleeper Agents" experiment, which demonstrated that standard alignment training might not remove harmful backdoors in AI models. Their replication using Llama-3.3-70B and Llama-3.…