New dataset and models advance desktop computer-use agents

By PulseAugur Editorial · [1 source] · 2026-06-11 04:00

Researchers have introduced GroundCUA, a large-scale dataset for grounding computer-use agents in desktop environments, that is, for accurately connecting natural language instructions to the correct on-screen elements. The dataset comprises 56,000 screenshots with over 3.56 million human-verified annotations across 87 applications. Using this dataset, the GroundNext models, at 3B and 7B parameter scales, achieved state-of-the-art performance on five benchmarks with significantly less training data than previous methods.

IMPACT Enhances AI agent capabilities for desktop environments, potentially leading to more sophisticated automation tools.

RANK_REASON The cluster contains a research paper detailing a new dataset and models for improving AI agents.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.AI TIER_1 English(EN) · Aarash Feizi, Shravan Nayak, Xiangru Jian, Kevin Qinghong Lin, Kaixin Li, Rabiul Awal, Xing Han Lù, Johan Obando-Ceron, Juan A. Rodriguez, Nicolas Chapados, David Vazquez, Adriana Romero-Soriano, Reihaneh Rabbany, Perouz Taslakian, Christopher Pal, Spa… · 2026-06-11 04:00

Grounding Computer Use Agents on Human Demonstrations

arXiv:2511.07332v2 Announce Type: replace-cross Abstract: Building reliable computer-use agents requires grounding: accurately connecting natural language instructions to the correct on-screen elements. While large datasets exist for web and mobile interactions, high-quality reso…
