PulseAugur
New bilingual dataset and RAG system improve geospatial question answering

Researchers have developed a new bilingual dataset and a hybrid retrieval-augmented generation (RAG) system for answering geospatial questions about Tatarstan. The system integrates semantic search with geospatial filtering, achieving high accuracy on a test set of 500 queries. The paper also details experiments with different reader architectures, finding XLM-RoBERTa-large to be the most effective, and makes all resources publicly available on Hugging Face.
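As a rough illustration of what combining semantic search with geospatial filtering can look like, the sketch below first keeps only records within a radius of a query point and then ranks the survivors by cosine similarity. It is not the paper's implementation: the function names, the radius-based filter, the approximate coordinates, and the hand-made three-dimensional "embeddings" are all assumptions chosen to keep the example self-contained.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def hybrid_retrieve(query_vec, center, radius_km, records, top_k=3):
    """Geospatial filter first, then semantic ranking of the remaining records."""
    lat0, lon0 = center
    nearby = [
        r for r in records
        if haversine_km(lat0, lon0, r["lat"], r["lon"]) <= radius_km
    ]
    ranked = sorted(nearby, key=lambda r: cosine(query_vec, r["vec"]), reverse=True)
    return ranked[:top_k]

# Hypothetical toy records: approximate coordinates, made-up embedding vectors.
RECORDS = [
    {"name": "Kazan", "lat": 55.7963, "lon": 49.1088, "vec": [0.9, 0.1, 0.0]},
    {"name": "Zelenodolsk", "lat": 55.8438, "lon": 48.5178, "vec": [0.2, 0.9, 0.1]},
    {"name": "Naberezhnye Chelny", "lat": 55.7436, "lon": 52.3958, "vec": [0.95, 0.05, 0.0]},
]

if __name__ == "__main__":
    hits = hybrid_retrieve([1.0, 0.0, 0.0], (55.7963, 49.1088), 50.0, RECORDS)
    print([r["name"] for r in hits])
```

In this toy setup the record that is semantically closest to the query lies well outside the 50 km radius, so the geospatial filter drops it before ranking. A real system would replace the toy vectors with embeddings from an actual encoder and would likely use a spatial index rather than a linear scan.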

Impact: This work provides a new dataset and a high-performing system for multilingual geospatial question answering, potentially benefiting digital humanities and geocoding services.

Ranking rationale: This is a research paper detailing a new dataset and system for geospatial question answering.

Read on arXiv cs.CL →

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

  1. arXiv cs.CL · TIER_1 · English (EN) · Mullosharaf K. Arabov

    Tatarstan Toponyms: A Bilingual Dataset and Hybrid RAG System for Geospatial Question Answering

    arXiv:2605.05962v1. Announce type: new. Abstract: This paper addresses automatic geospatial question answering over multilingual toponymic data. An original bilingual dataset of toponyms of the Republic of Tatarstan is introduced, comprising 9,688 structured records with linguistic…

  2. arXiv cs.CL · TIER_1 · English (EN) · Mullosharaf K. Arabov

    Tatarstan Toponyms: A Bilingual Dataset and Hybrid RAG System for Geospatial Question Answering

    This paper addresses automatic geospatial question answering over multilingual toponymic data. An original bilingual dataset of toponyms of the Republic of Tatarstan is introduced, comprising 9,688 structured records with linguistic, etymological, administrative, and coordinate i…