PulseAugur
research · [2 sources]

New corpus and metrics advance LLM use in systematic literature reviews

Two new research papers explore the use of large language models (LLMs) in systematic reviews. The first introduces a large-scale, cross-disciplinary corpus of over 300,000 systematic reviews, designed to improve benchmarking of the retrieval and screening components of the review pipeline. The second, LLM4SCREENLIT, offers recommendations for evaluating LLM performance in literature screening, proposing a Weighted Matthews Correlation Coefficient (WMCC) to better account for the imbalanced, cost-asymmetric nature of the task.
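The digest does not spell out how the WMCC is defined. As a hedged sketch, one common way to make MCC cost-sensitive is to scale the error cells of the confusion matrix by misclassification costs before computing the standard MCC; the weights and counts below are illustrative, and the paper's exact WMCC formulation may differ:

```python
import math

def mcc(tp, fp, tn, fn):
    """Standard Matthews Correlation Coefficient from confusion-matrix counts."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

def weighted_mcc(tp, fp, tn, fn, fn_cost=10.0, fp_cost=1.0):
    """Illustrative cost-weighted MCC: error cells are scaled by their
    misclassification costs before computing MCC, so missing a relevant
    paper (FN) is penalised more heavily than screening an extra one (FP).
    NOTE: hypothetical weighting scheme, not the paper's exact WMCC."""
    return mcc(tp, fp_cost * fp, tn, fn_cost * fn)

# Imbalanced screening example: 50 relevant vs 950 irrelevant papers,
# and a screener that misses 10 of the 50 relevant papers.
print(round(mcc(40, 0, 950, 10), 3))           # plain MCC
print(round(weighted_mcc(40, 0, 950, 10), 3))  # drops once misses are costed
```

With these hypothetical counts, the weighted score is noticeably lower than the plain MCC, reflecting the asymmetric cost of missing relevant studies.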

Summary written by gemini-2.5-flash-lite from 2 sources.

IMPACT New datasets and evaluation metrics for LLMs in systematic reviews could improve the efficiency and accuracy of scientific literature analysis.

RANK_REASON The cluster contains two academic papers published on arXiv, detailing new datasets and evaluation methodologies for LLMs in systematic reviews.


COVERAGE [2]

  1. arXiv cs.CL TIER_1 · Pierre Achkar, Tim Gollub, Arno Simons, Harrisen Scells, Martin Potthast

    A Large-Scale, Cross-Disciplinary Corpus of Systematic Reviews

    arXiv:2604.22864v1 Announce Type: cross Abstract: Existing benchmarks for systematic reviewing remain limited either in scale or in disciplinary coverage, with some collections comprising only a modest number of topics and others focusing primarily on biomedical research. We pres…

  2. arXiv cs.LG TIER_1 · Lech Madeyski, Barbara Kitchenham, Martin Shepperd

    LLM4SCREENLIT: Recommendations on Assessing the Performance of Large Language Models for Screening Literature in Systematic Reviews

    arXiv:2511.12635v2 Announce Type: replace-cross Abstract: Context: Large language models (LLMs) are increasingly used to screen literature for systematic reviews (SRs), but the standard confusion-matrix metrics used to evaluate them can mislead under the imbalanced, cost-asymmetr…
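The abstract's claim that standard confusion-matrix metrics can mislead under imbalance is easy to see with a minimal example (counts are hypothetical): a screener that rejects every paper in a 5%-relevant corpus still scores 95% accuracy while finding nothing.

```python
# "Reject all" screener on a corpus of 50 relevant / 950 irrelevant papers.
tp, fp, tn, fn = 0, 0, 950, 50

accuracy = (tp + tn) / (tp + fp + tn + fn)
recall = tp / (tp + fn)  # fraction of relevant papers actually found

print(accuracy)  # 0.95 despite missing every relevant paper
print(recall)    # 0.0
```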