Researchers have developed HistoriQA-ThirdRepublic, a new French-language dataset designed for multi-hop question answering in historical research. This corpus, created in collaboration with a historian, contains 1782 questions derived from parliamentary debates and newspapers of the French Third Republic (1870-1940). It aims to evaluate retrieval-augmented and large language model systems by capturing complex reasoning patterns such as cross-source synthesis and temporal reasoning, bridging the gap between NLP benchmarks and historical scholarship. AI
IMPACT Provides a specialized dataset for evaluating LLMs in historical research, potentially improving domain-specific QA capabilities.
RANK_REASON The cluster contains a newly published academic paper detailing a new dataset for NLP research. [lever_c_demoted from research: ic=1 ai=1.0]
AI-generated summary · Google Gemini · from 1 sources. How we write summaries →