Researchers from Fudan University and Tongyi Lab have developed ToolCUA, a training paradigm for agents that use both graphical user interface (GUI) operations and tool calls. Their experiments show that simply equipping agents with tools does not automatically improve performance: models often struggle to choose between GUI and tool actions, which lowers accuracy. ToolCUA addresses this in two stages. It first synthesizes interleaved GUI-Tool trajectories, then applies online agentic reinforcement learning with a novel tool-efficient path reward that guides the agent toward optimal action paths.
AI IMPACT This training paradigm could enable more capable agents that use both graphical interfaces and external tools efficiently, improving task completion and reducing errors.
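The summary does not give the formula for the tool-efficient path reward, so the following is only a minimal sketch of what a reward of this kind might look like, not ToolCUA's actual definition. It assumes a successful episode earns a base reward plus a bonus that shrinks as the action path grows longer than a reference length. The names `Step`, `path_reward`, `reference_len`, and `efficiency_weight` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """One agent action; `kind` is either "gui" or "tool" (hypothetical schema)."""
    kind: str


def path_reward(
    steps: List[Step],
    success: bool,
    reference_len: int,
    efficiency_weight: float = 0.5,
) -> float:
    """Illustrative path-efficiency reward (an assumption, not the paper's formula).

    A failed episode earns 0. A successful one earns 1 plus a bonus that is
    largest when the path is no longer than `reference_len` steps.
    """
    if not success:
        return 0.0
    path_len = max(len(steps), 1)  # guard against division by zero
    efficiency = min(1.0, reference_len / path_len)
    return 1.0 + efficiency_weight * efficiency


# A short tool-based path versus a long GUI-only path for the same task.
tool_path = [Step("tool"), Step("gui")]
gui_path = [Step("gui") for _ in range(8)]

tool_score = path_reward(tool_path, success=True, reference_len=2)
gui_score = path_reward(gui_path, success=True, reference_len=2)
failed_score = path_reward(tool_path, success=False, reference_len=2)
```

Under these assumptions the shorter path scores higher than the longer one even though both succeed, which is the kind of signal that could push an agent toward a tool call when it shortcuts a long GUI sequence.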
RANK_REASON The cluster describes a new training paradigm and methodology for AI agents, presented in a research paper, with open-sourced code and model weights.
AI-generated summary · Google Gemini · from 1 source.