PulseAugur

New X+Slides benchmark evaluates audience-conditioned slide generation

Researchers have introduced X+Slides, a benchmark for evaluating audience-conditioned slide generation by large language models. Previous benchmarks focused on completeness and technical depth; X+Slides also accounts for what the target audience needs, such as proofs for specialists or conclusions for decision-makers. The benchmark uses a dynamic evaluation framework with 8,133 probes across 113 topics and seven presentation scenes, and it measures Audience Coverage, Domain-wise Coverage, Efficiency, and Correctness. Initial experiments on systems such as DeepPresenter and SlideTailor showed that current systems convey a significant portion of audience-essential information but still leave room for improvement in source-grounded evaluation.

IMPACT This benchmark could drive improvements in LLM-generated presentations, making them more tailored and effective for specific audiences.

RANK_REASON The cluster contains a research paper introducing a new benchmark for evaluating LLM capabilities.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.AI TIER_1 English(EN) · Fan Wu

    X+Slides: Benchmarking Audience-Conditioned Slide Generation

    Automatically generating slide decks from source documents is an important application of large language models (LLMs). Existing benchmarks primarily assess slide completeness and technical depth, while overlooking the target audience as a critical real-world factor. For instance…