Brief · PulseAugur

TOOL · arXiv cs.IR (Information Retrieval) · 17h

ScholarQuest: A Taxonomy-Guided Benchmark for Agentic Academic Paper Search in Open Literature Environments

Researchers have introduced ScholarQuest, a benchmark for evaluating AI agents on academic paper search. It is built on more than 1,000 computer science topics and four distinct research intents, and aims to provide a more realistic and systematic assessment than existing evaluation methods. Initial results show that agentic approaches outperform traditional single-shot retrieval, but even the top-performing agents achieve limited recall, leaving significant room for improvement.

IMPACT This benchmark could accelerate the development of more effective AI-powered academic search tools.
