New French historical QA dataset released for LLM evaluation

By PulseAugur Editorial · [1 source] · 2026-07-01 04:00

Researchers have presented HistoriQA-ThirdRepublic, a French-language dataset for multi-hop question answering in historical research. The corpus, created in collaboration with a historian, contains 1,782 questions derived from parliamentary debates and newspapers of the French Third Republic (1870-1940). It is intended for evaluating retrieval-augmented and large language model systems on complex reasoning patterns such as cross-source synthesis and temporal reasoning, with the aim of bridging the gap between NLP benchmarks and historical scholarship.

IMPACT Provides a specialized dataset for evaluating LLMs in historical research, potentially improving domain-specific QA capabilities.

RANK_REASON The cluster contains a newly published academic paper detailing a new dataset for NLP research.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.AI TIER_1 English(EN) · Aurélien Pellet (LRE), Julien Perez (EPITA, LRE), Marie Puren (LRE, CJM) · 2026-07-01 04:00

HistoriQA-ThirdRepublic: Multi-Hop Question Answering Corpus for Historical Research, Parliamentary Debates from the French Third Republic (1870-1940)

arXiv:2606.31325v1 Announce Type: new Abstract: We present HistoriQA-ThirdRepublic: a French-language dataset of multi-hop historical questions derived from parliamentary debates and newspapers of the French Third Republic. Designed in collaboration with a historian, the corpus c…
