PulseAugur

New ProCUA-SFT dataset boosts AI agent desktop performance

Researchers have developed ProCUA-SFT, a dataset for training computer-use agents (CUAs), models that operate graphical desktop environments through screenshots and keyboard/mouse actions. Existing datasets such as AgentNet have shown negative transfer, hindering rather than improving agent performance. ProCUA-SFT comprises 3.1 million step-level samples drawn from synthetic trajectories and addresses this problem with an automated pipeline for task generation and verification. Fine-tuning the UI-TARS 7B model on ProCUA-SFT produced a significant gain on the OSWorld benchmark, outperforming models trained on AgentNet. A portion of ProCUA-SFT was also integrated into the Nemotron 3 Nano Omni model to strengthen its computer-use capabilities.

IMPACT This new dataset significantly improves AI agents' ability to interact with desktop environments, potentially accelerating the development of more capable and autonomous software agents.

RANK_REASON The cluster describes a new dataset and technical report detailing its creation and performance improvements for AI agents, which falls under research.


AI-generated summary · Google Gemini · from 2 sources.


COVERAGE [2]

  1. arXiv cs.LG TIER_1 English(EN) · Jaehun Jung, Ximing Lu, Brandon Cui, Muhammad Khalifa, Shaokun Zhang, Hao Zhang, Jin Xu, Amala Sanjay Deshmukh, Karan Sapra, Andrew Tao, Yejin Choi, Jan Kautz, Mingjie Liu, Yi Dong ·

    ProCUA-SFT Technical Report

    arXiv:2606.17321v1. Abstract: Training computer-use agents (CUAs), models that interact with graphical desktops through screenshots and keyboard/mouse actions, requires large-scale, diverse trajectory data collected in full desktop environments. The largest …

  2. Hugging Face Daily Papers TIER_1 English(EN) ·

    ProCUA-SFT Technical Report

    Training computer-use agents using a large-scale synthetic dataset with automated task generation and verification achieves significantly improved performance on desktop interaction benchmarks.