New RenoBench dataset advances citation parsing evaluation

By PulseAugur Editorial · [1 source] · 2026-06-02 04:00

Researchers have introduced RenoBench, a new public benchmark dataset for evaluating citation parsing systems. Sourced from four major publishing ecosystems, the dataset comprises 10,000 annotated citations spanning multiple languages and publication types. Initial evaluations show that language models, especially when fine-tuned, perform strongly on this task. The benchmark is intended to support more standardized and reproducible research in automated citation parsing and scientometrics.

IMPACT Provides a standardized benchmark for evaluating and advancing citation parsing technologies, which are crucial for metascientific research.

RANK_REASON The cluster describes a new academic paper introducing a benchmark dataset for a specific NLP task.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.CL TIER_1 English(EN) · Parth Sarin, Juan Pablo Alperin, Adam Buttrick, Dione Mentis · 2026-06-02 04:00

RenoBench: A Citation Parsing Benchmark

arXiv:2603.25640v2 Announce Type: replace-cross Abstract: Accurate parsing of citations is necessary for machine-readable scholarly infrastructure. But, despite sustained interest in this problem, existing evaluation techniques are often not generalizable, based on synthetic data…
