PulseAugur
EN
LIVE 10:01:02

New Benchmark for Detecting AI-Written Code Lines Released

Researchers have introduced HybridCodeAuthorship, a new benchmark dataset designed to evaluate AI-generated code detection at a line-by-line level. This dataset simulates real-world industry codebases where human and AI-authored code are interleaved, unlike existing benchmarks that often use academic or completely AI- or human-authored code. The dataset was constructed using Python code files from GitHub repositories. Initial benchmarking showed that the top-performing algorithm, AIGCode Detector, achieved an F1 score of 0.48 for chunk-level detection and 0.56 for line-level detection. AI

IMPACT Enables better tools for managing and analyzing codebases with mixed human and AI authorship.

RANK_REASON The cluster describes a new benchmark dataset for AI research. [lever_c_demoted from research: ic=1 ai=1.0]

Read on arXiv cs.AI →

AI-generated summary · Google Gemini · from 1 sources. How we write summaries →

COVERAGE [1]

  1. arXiv cs.AI TIER_1 English(EN) · Luke Patterson, Li Wang, Adam Faulkner ·

    HybridCodeAuthorship: A Benchmark Dataset for Line-Level Code Authorship Detection

    arXiv:2606.12620v1 Announce Type: cross Abstract: Thanks to the rapid adoption of AI code assistants powered by large language models (LLMs), industry codebases are, increasingly, a hybrid of AI- and human-authored code. For risk management and productivity analysis purposes, it …