Two new research papers explore the application of large language models (LLMs) to systematic reviews. The first introduces a large-scale, cross-disciplinary corpus of over 300,000 systematic reviews, designed to improve benchmarking of retrieval and screening components. The second, LLM4SCREENLIT, offers recommendations for evaluating LLM performance in literature screening and proposes a Weighted Matthews Correlation Coefficient (WMCC) to better account for the imbalanced nature of the task.
Impact: New datasets and evaluation metrics for LLMs in systematic reviews could improve the efficiency and accuracy of scientific literature analysis.
Ranking rationale: The cluster contains two academic papers published on arXiv that describe a new dataset and a new evaluation methodology for LLMs in systematic reviews.
AI-generated summary · Google Gemini · from 2 sources.