GoClick model offers lightweight GUI element grounding for on-device AI agents

By PulseAugur Editorial · [1 source] · 2026-04-28 04:00

Researchers have developed GoClick, a lightweight vision-language model designed for precise GUI element grounding on resource-constrained devices. Unlike existing large models, GoClick uses an encoder-decoder architecture and a progressive data refinement pipeline to achieve high accuracy with significantly fewer parameters. This makes on-device execution practical for GUI agents, improving latency and performance, and the model has also shown success when integrated into device-cloud collaboration frameworks.

IMPACT Enables on-device GUI interaction for agents, potentially improving mobile app automation and accessibility.

RANK_REASON Academic paper introducing a new lightweight model for GUI element grounding.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.CV TIER_1 English(EN) · Hongxin Li, Yuntao Chen, Zhaoxiang Zhang · 2026-04-28 04:00

GoClick: Lightweight Element Grounding Model for Autonomous GUI Interaction

arXiv:2604.23941v1. Abstract: Graphical User Interface (GUI) element grounding (precisely locating elements on screenshots based on natural language instructions) is fundamental for agents interacting with GUIs. Deploying this capability directly on resource-con…
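
To make the grounding task concrete, here is a minimal sketch of how a predicted click point might be scored against a target element. It is an illustration only, not GoClick's evaluation code: the `Box` class, the `click_accuracy` function, and the point-inside-box criterion are all assumptions introduced here and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical bounding box of a GUI element, in screenshot pixels."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        # A click counts as a hit if it lands inside the box (edges included).
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def click_accuracy(
    predictions: Sequence[Tuple[float, float]],
    targets: Sequence[Box],
) -> float:
    """Fraction of predicted (x, y) points that fall inside their target box.

    Assumed scoring rule for illustration; the source does not state
    which metric the authors use.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    hits = sum(1 for (x, y), box in zip(predictions, targets) if box.contains(x, y))
    return hits / len(targets)
```

In this sketch, a grounding model would take a screenshot and an instruction such as "tap the search button" and return an (x, y) point; `click_accuracy` then reports how often those points land on the intended element.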
