Brief · PulseAugur

RESEARCH · arXiv cs.AI · 5d · [3 sources]

CoQuIR: A Comprehensive Benchmark for Code Quality-Aware Information Retrieval

Researchers have developed FASE, a new metric for evaluating code quality in multi-agent AI systems. FASE approximates functional correctness by analyzing code dissimilarity, offering a significant speed improvement over existing methods. Separately, a new benchmark called CoQuIR has been introduced to assess code retrieval systems on dimensions beyond functional relevance: correctness, efficiency, security, and maintainability. CoQuIR includes annotations for over 42,000 queries across 11 languages and highlights that current retrieval models often fail to distinguish between high- and low-quality code.

IMPACT These advancements in code quality evaluation could lead to more reliable AI-assisted software development and more trustworthy code retrieval systems.

Jiahui Geng
CoQuIR
HumanEval
BigCodeBench
Qwen3-Embedding-8B
FASE