LiveCodeBench

PulseAugur coverage of LiveCodeBench: every cluster mentioning LiveCodeBench across labs, papers, and developer communities, ranked by signal.

Total · 30d: 14 (14 over 90d)

Releases · 30d: 0 (0 over 90d)

Papers · 30d: 11 (11 over 90d)

TIER MIX · 90D

significant 1
research 2
tool 11

SENTIMENT · 30D

2 days with sentiment data

RECENT · PAGE 1/1 · 11 TOTAL

TOOL · CL_29426 · May 12 · 10:36

New framework StepCodeReasoner boosts code reasoning with execution traces

Researchers have developed StepCodeReasoner, a new framework designed to improve code reasoning by focusing on intermediate execution states rather than just final outputs. This approach uses structured print statements…

TOOL · CL_20541 · May 7 · 04:00

New Conductor model learns to orchestrate LLMs for better performance

Researchers have developed a "Conductor" model trained with reinforcement learning to coordinate multiple large language models. This Conductor model learns to establish communication pathways and craft specific instruc…

TOOL · CL_24799 · May 6 · 08:05

New CoREB benchmark and model advance code search capabilities

Researchers have introduced CoREB, a new benchmark and model designed to improve code search beyond simple retrieval. CoREB addresses limitations in existing benchmarks, such as data contamination and noisy labels, by f…

TOOL · CL_20651 · May 6 · 08:05

New CoREB benchmark and reranker improve code search beyond retrieval

Researchers have introduced CoREB, a new benchmark designed to evaluate code search systems beyond simple retrieval. This benchmark addresses limitations in existing datasets, such as data contamination and noisy labels…

TOOL · CL_18865 · May 6 · 04:00

ReCode framework enhances AI code generation by rewarding reasoning processes

Researchers have developed ReCode, a novel reinforcement learning framework designed to improve code generation by focusing on the reasoning process. This framework uses Contrastive Reasoning-Process Reward Learning (CR…

TOOL · CL_13981 · May 3 · 22:13

DeepClaude slashes coding agent costs by 17x using DeepSeek V4 Pro

An open-source tool called DeepClaude has gained significant traction by allowing developers to use the Claude Code agent loop with DeepSeek V4 Pro instead of Anthropic's models. This swap drastically reduces costs, wit…

SIGNIFICANT · CL_12673 · May 2 · 00:54

AI coding tools end subsidies, shift to pay-as-you-go pricing amid rising costs

The era of heavily subsidized AI coding tools is ending as companies like Microsoft and Anthropic shift from flat-rate subscriptions to pay-as-you-go pricing. This change reflects the immense scale of AI investment, wit…

RESEARCH · CL_11452 · Apr 30 · 06:09

ScaleBox system enhances LLM code verification accuracy and efficiency

Researchers have developed ScaleBox, a new system designed to improve the accuracy and efficiency of code verification for large language models. Existing code sandboxes struggle with high-concurrency workloads, leading…

RESEARCH · CL_07021 · Apr 28 · 04:00

AI benchmark contamination signal sensitive to question format, study finds

A new paper questions the reliability of temporal signals in detecting benchmark contamination for large language models. Researchers found that the way benchmark questions are phrased significantly impacts whether perf…

RESEARCH · CL_06927 · Apr 27 · 04:00

Think Anywhere in Code Generation

Researchers have introduced "Think-Anywhere," a new reasoning mechanism for large language models that allows them to generate code by thinking at any point during the process, rather than just upfront. This approach ha…

RESEARCH · CL_05788 · Apr 24 · 02:30

Kwai AI's SRPO achieves DeepSeek-R1-Zero performance with 10x fewer training steps

Researchers from Kuaishou's Kwaipilot team have developed a novel reinforcement learning framework called SRPO, designed to improve the efficiency and performance of large language models. This new method addresses limi…