Brief · PulseAugur

TOOL · arXiv cs.CL · English (EN) · 19h

UXBench: Benchmarking User Experience in AI Assistants

Researchers have introduced UXBench, a benchmark for evaluating the user experience of AI assistants. It focuses on preference alignment and dialogue generation and draws on more than 70,000 interaction logs from a Chinese AI assistant. UXBench comprises three tasks (UX Judge, UX Eval, and UX Recovery) and has been tested on 26 large language models to assess how well they understand and improve user experience.

IMPACT: Establishes a new evaluation framework for AI assistants, pushing for user-centric optimization beyond raw capability.

language models
UXBench