PulseAugur

New CoREB benchmark and reranker improve code search beyond retrieval

Researchers have introduced CoREB, a new benchmark designed to evaluate code search systems beyond simple retrieval. The benchmark addresses limitations in existing datasets, such as data contamination and noisy labels, by using counterfactually rewritten problems across five programming languages. Experiments on CoREB revealed that while code-specialized embeddings excel at code-to-code retrieval, short keyword queries significantly degrade performance for all models. The study also highlights the task-specific nature of off-the-shelf rerankers and introduces a fine-tuned reranker that shows consistent improvements across all evaluated tasks.
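The two-stage pipeline the paper evaluates (first-stage retrieval followed by reranking) can be sketched minimally. The code below is an illustrative toy, not the paper's method: the bag-of-words embedding stands in for a code-specialized embedding model, and the token-overlap scorer stands in for a fine-tuned cross-encoder reranker; all names and the tiny corpus are invented for the example.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy bag-of-words embedding over alphanumeric tokens; a stand-in
    # for a real code-specialized embedding model.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse Counter vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=3):
    # First stage: rank the whole corpus by embedding similarity.
    q = embed(query)
    return sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)[:k]

def rerank(query, candidates):
    # Hypothetical second stage: reorder candidates by exact token
    # overlap with the query, standing in for a learned reranker.
    q_tokens = set(re.findall(r"[a-z0-9]+", query.lower()))
    def score(doc):
        return len(q_tokens & set(re.findall(r"[a-z0-9]+", doc.lower())))
    return sorted(candidates, key=score, reverse=True)

corpus = [
    "def binary_search(arr, target): ...",
    "def quicksort(arr): ...",
    "def search_substring(text, pattern): ...",
]
query = "binary search in a sorted array"
top = rerank(query, retrieve(query, corpus))
```

Even in this toy form the benchmark's failure mode is visible: a short keyword query like "sort" shares too few tokens with any snippet for either stage to discriminate well, which mirrors the degradation the paper reports for keyword-style queries.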

Summary written by gemini-2.5-flash-lite from 1 source.

IMPACT Introduces a new benchmark and model to improve code search capabilities, potentially impacting developer productivity.

RANK_REASON This is a research paper introducing a new benchmark and model for code search.

Read on arXiv cs.AI →

COVERAGE [1]

  1. arXiv cs.AI TIER_1 · Hang Yu

    Beyond Retrieval: A Multitask Benchmark and Model for Code Search

    Code search has usually been evaluated as first-stage retrieval, even though production systems rely on broader pipelines with reranking and developer-style queries. Existing benchmarks also suffer from data contamination, label noise, and degenerate binary relevance. In this pap…