CoQuIR: A Comprehensive Benchmark for Code Quality-Aware Information Retrieval

New Metric and Benchmark Advance AI Code Quality Evaluation

By the PulseAugur Editorial Team · [3 sources] · 2026-06-08 04:00

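Researchers have developed FASE, a new metric for evaluating code quality in multi-agent AI systems. FASE approximates functional correctness by analyzing code dissimilarity and runs significantly faster than existing methods. Separately, a new benchmark called CoQuIR has been introduced to evaluate code retrieval systems on dimensions beyond functional relevance, including correctness, efficiency, security, and maintainability. CoQuIR contains annotations for more than 42,000 queries across 11 languages and shows that current retrieval models often fail to distinguish high-quality code from low-quality code.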

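Impact: These advances in code quality evaluation could lead to more reliable AI-assisted software development and more trustworthy code retrieval systems.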

Ranking rationale: Two research papers introduce new methods and benchmarks for evaluating the quality of AI-generated code.


AI-generated summary · Google Gemini · from 3 sources.

Sources [3]

arXiv cs.AI TIER_1 English(EN) · Shizhe Lin, Ladan Tahvildari · 2026-06-09 04:00

FASE: Fast Adaptive Semantic Entropy for Code Quality

arXiv:2606.09800v1 Announce Type: cross Abstract: Multi-agent code generation offers a promising paradigm for autonomous software development by simulating the human software engineering lifecycle. However, system reliability remains hindered by LLM hallucinations and error propa…

arXiv cs.MA (Multiagent) TIER_1 English(EN) · Ladan Tahvildari · 2026-06-08 17:53

FASE: Fast Adaptive Semantic Entropy for Code Quality

Multi-agent code generation offers a promising paradigm for autonomous software development by simulating the human software engineering lifecycle. However, system reliability remains hindered by LLM hallucinations and error propagation across interacting agents. While semantic e…

arXiv cs.AI TIER_1 English(EN) · Jiahui Geng, Fengyu Cai, Shaobo Cui, Qing Li, Liangwei Chen, Chenyang Lyu, Haonan Li, Derui Zhu, Walter Pretschner, Heinz Koeppl, Fakhri Karray · 2026-06-08 04:00

CoQuIR: A Comprehensive Benchmark for Code Quality-Aware Information Retrieval

arXiv:2506.11066v3 Announce Type: replace-cross Abstract: Code retrieval is essential in modern software development, as it boosts code reuse and accelerates debugging. However, current benchmarks primarily emphasize functional relevance while neglecting critical dimensions of so…
