PulseAugur
research · [2 sources]

New bilingual dataset and RAG system improve geospatial question answering

Researchers have released a bilingual dataset of toponyms of the Republic of Tatarstan, comprising 9,688 structured records, together with a hybrid retrieval-augmented generation (RAG) system for answering geospatial questions about the region. The system combines semantic search with geospatial filtering and is reported to achieve high accuracy on a test set of 500 queries. The paper also compares reader architectures, finding XLM-RoBERTa-large the most effective, and makes all resources publicly available on Hugging Face.

Summary written by gemini-2.5-flash-lite from 2 sources.
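To make the idea of combining semantic search with geospatial filtering concrete, here is a minimal sketch. It is not the authors' implementation: the summary does not say how the two steps are combined, so this version assumes a filter-then-rank order. The records, the `embed` bag-of-words stand-in for a real sentence encoder, and the function names `haversine_km` and `hybrid_retrieve` are all hypothetical. The reader model that extracts the final answer is left out.

```python
import math
from collections import Counter

# Hypothetical toy records: names, descriptions and coordinates are
# illustrative only and are not taken from the published dataset.
RECORDS = [
    {"name": "Place A", "text": "village on the river bank", "lat": 55.80, "lon": 49.10},
    {"name": "Place B", "text": "river in the eastern district", "lat": 55.70, "lon": 52.40},
    {"name": "Place C", "text": "lake near the city", "lat": 55.82, "lon": 49.15},
]


def embed(text):
    """Stand-in for a sentence encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def hybrid_retrieve(query, center, radius_km, records=RECORDS, top_k=2):
    """Keep records within radius_km of center, then rank them by similarity."""
    lat, lon = center
    nearby = [
        r for r in records
        if haversine_km(lat, lon, r["lat"], r["lon"]) <= radius_km
    ]
    q = embed(query)
    ranked = sorted(nearby, key=lambda r: cosine(q, embed(r["text"])), reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    for hit in hybrid_retrieve("river village", (55.79, 49.12), radius_km=50):
        print(hit["name"])
```

In a full RAG pipeline the retrieved records would then be passed as context to a reader or generator; filtering before ranking is only one possible design, and a system could equally rank first and filter afterwards or blend the two scores.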

IMPACT This work provides a new dataset and a high-performing system for multilingual geospatial question answering, potentially benefiting digital humanities and geocoding services.

RANK_REASON This is a research paper detailing a new dataset and system for geospatial question answering.


COVERAGE [2]

  1. arXiv cs.CL TIER_1 · Mullosharaf K. Arabov

    Tatarstan Toponyms: A Bilingual Dataset and Hybrid RAG System for Geospatial Question Answering

    arXiv:2605.05962v1. This paper addresses automatic geospatial question answering over multilingual toponymic data. An original bilingual dataset of toponyms of the Republic of Tatarstan is introduced, comprising 9,688 structured records with linguistic…

  2. arXiv cs.CL TIER_1 · Mullosharaf K. Arabov

    Tatarstan Toponyms: A Bilingual Dataset and Hybrid RAG System for Geospatial Question Answering

    This paper addresses automatic geospatial question answering over multilingual toponymic data. An original bilingual dataset of toponyms of the Republic of Tatarstan is introduced, comprising 9,688 structured records with linguistic, etymological, administrative, and coordinate i…