Researchers have introduced CoREB, a new benchmark designed to evaluate code search systems beyond simple retrieval. This benchmark addresses limitations in existing datasets, such as data contamination and noisy labels, by using counterfactually rewritten problems across five programming languages. Experiments on CoREB revealed that while code-specialized embeddings excel in code-to-code retrieval, short keyword queries significantly degrade performance for all models. The study also highlights the task-specific nature of off-the-shelf rerankers, and introduces a fine-tuned reranker that shows consistent improvements across all evaluated tasks.
Impact: Introduces a new benchmark and a fine-tuned reranker for code search, which could improve developer productivity.
Ranking rationale: This is a research paper introducing a new benchmark and model for code search.
AI-generated summary · Google Gemini · from 1 source.