Computer-Using Agents
PulseAugur coverage of Computer-Using Agents: every cluster mentioning the topic across labs, papers, and developer communities, ranked by signal.
-
New benchmark reveals AI agents struggle with real-world SaaS tasks
Researchers have introduced SaaS-Bench, a new benchmark designed to evaluate computer-using agents (CUAs) on realistic professional workflows. The benchmark uses 23 Software-as-a-Service (SaaS) systems across six d…
-
New benchmark reveals AI agents struggle with real-world SaaS tasks
Researchers have introduced SaaS-Bench, a new benchmark designed to evaluate computer-using agents (CUAs) on realistic professional workflows within Software-as-a-Service (SaaS) environments. The benchmark comprises 106…
-
Survey maps safety and security threats of autonomous computer-using agents
A new survey paper categorizes the safety and security threats posed by Computer-Using Agents (CUAs). These agents, powered by LLMs, can autonomously interact with software and interfaces, presenting novel risks. The pa…