A History-Aware Visually Grounded Critic for Computer Use Agents
Researchers have developed HiViG, a framework intended to improve the performance of Computer Use Agents (CUAs) in complex graphical user interface (GUI) environments. HiViG addresses limitations of existing critics by combining awareness of the agent's past actions with visual grounding to detect errors. This multimodal critic, trained on real GUI trajectories, evaluates each proposed action by summarizing what has already been achieved and verifying the execution coordinates against the screenshot, so that flawed actions can be caught before they are executed.
AI IMPACT: Enhances AI agent reliability in complex GUI tasks by reducing planning and execution errors.
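
The two checks described above (history awareness and coordinate verification) can be illustrated with a small rule-based sketch. This is not HiViG itself, which is a trained multimodal model; every name here (`Box`, `ProposedAction`, `critique`) is hypothetical, and the dictionary of element bounding boxes is an assumed stand-in for what a real critic would read from the screenshot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of a UI element, in screenshot pixels (assumed input)."""
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, x: int, y: int) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


@dataclass(frozen=True)
class ProposedAction:
    """A click the agent wants to perform (hypothetical structure)."""
    subgoal: str  # what the agent says this click accomplishes
    target: str   # name of the UI element it intends to hit
    x: int
    y: int


def critique(action, completed_subgoals, elements):
    """Return (accept, reason) for a proposed action before it is executed."""
    # History check: reject a step that the summary of past actions already covers.
    if action.subgoal in completed_subgoals:
        return False, f"subgoal already achieved: {action.subgoal}"
    # Grounding check: the click must land inside the intended element.
    box = elements.get(action.target)
    if box is None:
        return False, f"target not visible: {action.target}"
    if not box.contains(action.x, action.y):
        return False, f"click ({action.x}, {action.y}) misses {action.target}"
    return True, "ok"


# Toy state: one visible button and one subgoal already completed.
elements = {"Save button": Box(100, 200, 180, 230)}
history = ["open file menu"]

good = critique(ProposedAction("click save", "Save button", 140, 215), history, elements)
miss = critique(ProposedAction("click save", "Save button", 300, 215), history, elements)
repeat = critique(ProposedAction("open file menu", "Save button", 140, 215), history, elements)
```

In this toy setup the first action is accepted, the second is rejected as an execution error because the coordinates fall outside the button, and the third is rejected as a planning error because it repeats a completed step. A learned critic would replace both hand-written rules with model judgments over the action history and the screenshot.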