Researchers have developed Curation-Bench, a new benchmark designed to evaluate the ability of generalist coding agents to automate the data curation process for AI model training. Initial tests show that agents can perform basic data selection within ten iterations, matching existing baselines. However, agents tend to make only minor adjustments rather than exploring fundamentally new data policy families. A scaffolded approach, requiring agents to cite and adapt prior research methods, led to the autonomous composition of a superior data selection policy that outperformed published baselines with significantly less data.
AI IMPACT Automated data curation could significantly reduce the cost and effort of training AI models, potentially accelerating development.
RANK_REASON The cluster describes a new academic paper introducing a benchmark and findings on automating AI data curation.
Read on Hugging Face Daily Papers →
- Claude Code
- Codex
- Curation-Bench
- DataComp-Small
- generalist coding agents
- Kimi K2.5
- LLaVA-665K
- OpenHands
- Qwen3.5-397B
AI-generated summary · Google Gemini · from 1 source.