Researchers have developed a new framework called Propose-then-Critic to improve the accuracy of GUI grounding, the task of mapping natural language instructions to specific pixel locations within graphical user interfaces. The method uses a reinforcement learning paradigm in which a 'proposer' module generates candidate targets and a 'critic' module evaluates them and selects the best one. The two modules are trained to co-evolve: the proposer's diverse outputs help the critic become more robust, and the critic's improving judgment lets the proposer explore more options. Experiments across six benchmarks showed significant improvements in both grounding accuracy and critic reliability.
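The selection step described above can be pictured with a small sketch. It is only an illustration under stated assumptions, not the paper's implementation: `select_best`, `toy_proposer`, `toy_critic`, the candidate coordinates, and the button center are all hypothetical stand-ins, and the reinforcement learning that co-trains the two modules is not shown.

```python
from typing import Callable, List, Tuple

Point = Tuple[int, int]  # (x, y) pixel coordinate on the screenshot


def select_best(candidates: List[Point], critic: Callable[[Point], float]) -> Point:
    """Return the candidate that the critic scores highest."""
    if not candidates:
        raise ValueError("proposer returned no candidates")
    return max(candidates, key=critic)


def toy_proposer(instruction: str) -> List[Point]:
    # Hypothetical stand-in: a real proposer would be a learned model that
    # reads the instruction and the screenshot. Here the output is fixed.
    return [(40, 300), (212, 48), (500, 120)]


def toy_critic(point: Point) -> float:
    # Hypothetical stand-in: score by negative squared distance to an
    # assumed button center, so closer candidates score higher.
    button_x, button_y = 210, 50
    return -((point[0] - button_x) ** 2 + (point[1] - button_y) ** 2)


chosen = select_best(toy_proposer("click the Save button"), toy_critic)
print(chosen)
```

In this toy setup the critic prefers the candidate nearest the assumed button center. A more diverse candidate set gives the critic more to discriminate between, which is the intuition behind training the two modules together.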
Summary written by gemini-2.5-flash-lite from 1 source.