Brief · PulseAugur

TOOL · arXiv cs.AI · English (EN) · 1d

Pull Requests as a Training Signal for Repo-Level Code Editing

Researchers have introduced Clean Pull Request (Clean-PR), a method for training AI models on repository-level code editing. The approach converts real-world GitHub pull requests into a structured dataset of more than 2 million edits spanning 12 programming languages. Models trained on this data showed significant gains on the SWE-bench benchmark without relying on complex agent scaffolding at inference time.
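To make the idea of turning a pull request into structured edits concrete, here is a minimal Python sketch. It is not the Clean-PR pipeline: the brief does not describe the dataset's schema or cleaning steps, so the `EditRecord` type, the search/replace layout, and both function names are hypothetical. The sketch assumes the before and after contents of each changed file are already in hand.

```python
from __future__ import annotations

import difflib
from dataclasses import dataclass


@dataclass
class EditRecord:
    """Hypothetical record layout; the real Clean-PR schema is not given in the brief."""
    path: str     # file touched by the pull request
    search: str   # text in the pre-PR file that the edit replaces
    replace: str  # text that takes its place after the PR


def edits_from_file_pair(path: str, before: str, after: str) -> list[EditRecord]:
    """Turn one file's before/after contents into search/replace records."""
    old = before.splitlines(keepends=True)
    new = after.splitlines(keepends=True)
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    records = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged regions carry no edit
        records.append(EditRecord(path, "".join(old[i1:i2]), "".join(new[j1:j2])))
    return records


def edits_from_pull_request(files: dict[str, tuple[str, str]]) -> list[EditRecord]:
    """`files` maps each changed path to its (before, after) contents."""
    records = []
    for path, (before, after) in sorted(files.items()):
        records.extend(edits_from_file_pair(path, before, after))
    return records
```

Each record pairs a file path with the text to find and the text that replaces it, so a single pull request touching several files becomes one list of edits that a model could be trained to produce.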

IMPACT Enhances AI's ability to perform complex, multi-file code modifications, potentially streamlining software development workflows.

GitHub
SWE-bench
Qinglin Zhu
Clean Pull Request