X+Slides: Benchmarking Audience-Conditioned Slide Generation
Researchers have introduced X+Slides, a benchmark that evaluates how well large language models condition generated slide decks on their intended audience. Where previous benchmarks focused on completeness and technical depth, X+Slides accounts for audience-specific needs: specialists require proofs, for example, while decision-makers look for conclusions. The benchmark uses a dynamic evaluation framework with 8,133 probes across 113 topics and seven presentation scenes, and it reports metrics such as Audience Coverage, Domain-wise Coverage, Efficiency, and Correctness. Initial experiments on systems such as DeepPresenter and NotebookLM indicate that current systems convey a significant portion of audience-essential information but still leave room for improvement.
AI IMPACT This benchmark could drive improvements in LLM-generated content by focusing evaluation on audience adaptation, which could lead to more effective communication tools.