Brief · PulseAugur

TOOL · arXiv cs.CL English(EN) · 7h

Do MLLMs Capture How Interfaces Guide User Behavior? A Benchmark for Multimodal UI/UX Design Understanding

Researchers have developed WiserUI-Bench, a benchmark for evaluating how well multimodal large language models (MLLMs) understand the effect of user interface (UI) design on user behavior. The benchmark comprises 300 real-world UI image pairs drawn from industry A/B tests, along with expert interpretations of why certain designs were more effective. Initial experiments show that current MLLMs have a limited grasp of how UI/UX design influences user actions.

IMPACT This benchmark could drive MLLM development toward a more nuanced understanding of user interaction and design principles.

multimodal large language models
WiserUI-Bench
Jaehyun Jeon