Researchers have introduced X+Slides, a new benchmark for evaluating audience-conditioned slide generation by large language models. Unlike previous benchmarks, which focused on completeness and technical depth, X+Slides accounts for the target audience's needs: specialists may require proofs, for example, while decision-makers want conclusions. The benchmark uses a dynamic evaluation framework with 8,133 probes across 113 topics and seven presentation scenes, and it measures Audience Coverage, Domain-wise Coverage, Efficiency, and Correctness. Initial experiments on systems such as DeepPresenter and SlideTailor showed that current systems can convey a significant portion of audience-essential information, but that there is still room for improvement in source-grounded evaluation.
AI IMPACT This benchmark could drive improvements in LLM-generated presentations, making them more tailored and effective for specific audiences.
RANK_REASON The cluster contains a research paper introducing a new benchmark for evaluating LLM capabilities.
- alphaXiv
- arXiv
- CatalyzeX
- DagsHub
- DeepPresenter
- Gotit.pub
- Hugging Face
- NotebookLM
- ScienceCast
- SlideTailor
- X+Slides
AI-generated summary · Google Gemini · from 1 source.