Brief

last 24h

[2/2] 222 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
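One way such a 0–100 score could be assembled is sketched below. The feed names the four signals but not how they are combined, so the weights, the 24-hour half-life, the multiplicative use of time decay, and the function name `score_story` are all assumptions made for illustration.

```python
# Hypothetical weights for the three quality signals (they sum to 1.0).
WEIGHTS = {"authority": 0.40, "cluster_strength": 0.35, "headline_signal": 0.25}
# Hypothetical half-life: a story loses half its score every 24 hours.
HALF_LIFE_HOURS = 24.0


def score_story(authority: float, cluster_strength: float,
                headline_signal: float, age_hours: float) -> float:
    """Combine three signals in [0, 1] with exponential time decay into a 0-100 score."""
    signals = {
        "authority": authority,
        "cluster_strength": cluster_strength,
        "headline_signal": headline_signal,
    }
    # Clamp each signal to [0, 1] so the final score stays within 0-100.
    base = sum(WEIGHTS[name] * min(max(value, 0.0), 1.0)
               for name, value in signals.items())
    # Exponential decay: 1.0 for a brand-new story, 0.5 after one half-life.
    decay = 0.5 ** (max(age_hours, 0.0) / HALF_LIFE_HOURS)
    return round(100.0 * base * decay, 1)
```

Under these assumptions, a story's score halves every 24 hours. A real ranker might instead treat time decay as a fourth additive component.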

TOOL · arXiv cs.AI English(EN) · 6h

Can Generalist Agents Automate Data Curation?

Researchers have introduced Curation-Bench, a benchmark that tests whether generalist AI agents can automate data curation for AI development. On vision-language instruction-tuning tasks, agents could carry out the curation loop but struggled to explore new policy families, focusing instead on local variations. Given methodological guidance and adaptation scaffolds, agents autonomously composed a data-selection policy that surpassed existing baselines with a significantly smaller data budget, highlighting the need for structured adaptation rather than simple prompting.

IMPACT Demonstrates a path toward automating a critical, labor-intensive aspect of AI development, potentially accelerating model training and improving efficiency.
- Generalist Agents
- Curation-Bench

RESEARCH · arXiv cs.AI English(EN) · 2w · [2 sources]

Governance by Construction for Generalist Agents

A new policy system called CUGA has been introduced to govern generalist AI agents that operate autonomously. It acts as a modular, policy-as-code layer that integrates with LLM agents to keep their behavior predictable and auditable without model fine-tuning. CUGA enforces governance at five checkpoints: intent guarding, playbook steering, tool usage guidance, human-in-the-loop approvals for high-risk actions, and output formatting.

IMPACT Introduces a framework for safer and more auditable deployment of autonomous enterprise AI agents.
- LLM
- Generalist Agents
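
The five checkpoints described in the summary can be pictured as a chain of policy functions that an agent action must pass through. The sketch below is not CUGA's actual API. Every name in it (`AgentAction`, `PolicyViolation`, `run_checkpoints`, the rule sets) is hypothetical, and the rules are toy stand-ins for real policies.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical policy data, standing in for real policy-as-code rules.
BLOCKED_INTENTS = {"delete_all_records"}
ALLOWED_TOOLS = {"search", "send_email"}
PLAYBOOKS = {"refund": "follow the refund playbook"}


class PolicyViolation(Exception):
    """Raised when a checkpoint refuses to let the action continue."""


@dataclass
class AgentAction:
    intent: str
    tool: str
    high_risk: bool
    output: str
    audit_log: List[str] = field(default_factory=list)


def run_checkpoints(action: AgentAction,
                    approve: Callable[[AgentAction], bool]) -> AgentAction:
    """Apply the five checkpoints in order, recording each decision."""
    # 1. Intent guarding: refuse disallowed intents outright.
    if action.intent in BLOCKED_INTENTS:
        raise PolicyViolation(f"intent '{action.intent}' is blocked")
    action.audit_log.append("intent: allowed")

    # 2. Playbook steering: attach the playbook matching the intent, if any.
    playbook = PLAYBOOKS.get(action.intent, "no playbook")
    action.audit_log.append(f"playbook: {playbook}")

    # 3. Tool usage guidance: only allow-listed tools may be called.
    if action.tool not in ALLOWED_TOOLS:
        raise PolicyViolation(f"tool '{action.tool}' is not allowed")
    action.audit_log.append(f"tool: {action.tool} allowed")

    # 4. Human-in-the-loop approval: required only for high-risk actions.
    if action.high_risk and not approve(action):
        raise PolicyViolation("reviewer rejected high-risk action")
    action.audit_log.append("approval: passed")

    # 5. Output formatting: normalize the final output.
    action.output = action.output.strip()
    action.audit_log.append("output: formatted")
    return action
```

Because the checks wrap the agent rather than change the model, the rules can be edited without fine-tuning, and the audit log records each decision for later review.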
