PulseAugur / Brief

Brief

last 24h
[1/1] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. CIDR: A Large-Scale Industrial Source Code Dataset for Software Engineering Research

    Researchers have introduced CIDR, a large-scale dataset of industrial source code designed to advance software engineering research. The dataset includes 2,440 repositories from 12 partner organizations, totaling 373 million lines of code across 138 programming languages. CIDR is distinctive in that it consists of proprietary production codebases, which were processed through rigorous quality selection and anonymization. It is intended for research in code intelligence, model pre-training, and agent evaluation.

    IMPACT: Enables new research in code intelligence and supports the development of code language models and AI agents.