PulseAugur

Study finds simulated coding assistant data overestimates real-world performance

A new study on arXiv investigates the gap between simulated and real-world developer behavior for proactive coding assistants. Researchers collected data from 1,246 industry developers using a custom Visual Studio Code extension over three days. Their findings indicate that simulated traces do not accurately reflect real development patterns and may lead to overestimating the performance of current AI assistants. The study also introduces ProCodeBench, a benchmark based on real-world data, and suggests that simulated data can complement real data but cannot replace it for training and evaluation.

Impact: Highlights the need for real-world data in developing and evaluating AI coding assistants, which could influence future tool development.

Ranking rationale: The cluster contains an academic paper detailing an empirical study and a new benchmark for AI coding assistants.


AI-generated summary · Google Gemini · from 1 source.


Sources [1]

  1. arXiv cs.AI · English (EN) · Lehui Li, Ruixuan Jia, Guo-Ye Yang, Jia Li

    An Empirical Study of Proactive Coding Assistants in Real-World Software Development

    arXiv:2605.05700v1. Abstract: Large language model (LLM)-based coding assistants have made substantial progress, yet most systems remain reactive, requiring developers to explicitly formulate their needs. Proactive coding assistants aim to infer latent develop…