New RAG research tackles tabular data, cost, and cross-lingual knowledge
By PulseAugur Editorial
Summary by gemini-2.5-flash-lite from 18 sources
Several recent research papers explore advancements in Retrieval-Augmented Generation (RAG) systems. One paper introduces Orthogonal Subspace Decomposition (OSD) to separate task-specific behavior from document knowledge in parametric RAG, improving adapter composition. Another paper, CroSearch-R1, proposes a framework to better leverage cross-lingual knowledge for RAG by integrating multilingual information into a reinforcement learning process. Additionally, research investigates the impact of coreference resolution on RAG, demonstrating its ability to reduce ambiguity and improve performance, particularly for smaller models. Other studies focus on enhancing RAG for specific domains like financial reports through reranking analysis and for knowledge graph question answering using semantic caching.
arXiv:2605.00318v1 Announce Type: new Abstract: Tabular documents such as CSV and Excel files are widely used in enterprise data pipelines, yet existing chunking strategies for retrieval-augmented generation (RAG) are primarily designed for unstructured text and do not account for tabular structure. We propose a structure-awar…
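The abstract is cut off before the method is described. As a loose illustration of the general idea of structure-aware chunking for tabular files (not the paper's algorithm; the function name and chunking rule are invented), one might split a CSV into row groups while repeating the header in every chunk, so each chunk remains self-describing when retrieved on its own:

```python
import csv
import io

def chunk_csv(text, rows_per_chunk=2):
    """Split CSV text into chunks of whole rows, repeating the header
    row in each chunk so every chunk is independently readable.
    Illustrative sketch only; not the paper's method."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    chunks = []
    for i in range(0, len(body), rows_per_chunk):
        part = [header] + body[i:i + rows_per_chunk]
        chunks.append("\n".join(",".join(r) for r in part))
    return chunks

data = "id,name,region\n1,Acme,EU\n2,Bolt,US\n3,Cori,APAC"
chunks = chunk_csv(data)
# Every chunk starts with the "id,name,region" header row.
```

Plain text chunkers would happily split mid-row or separate rows from their header; keeping rows intact and headers attached is the kind of structural constraint the abstract hints at.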
arXiv cs.LG
Shawqi Al-Maliki, Ammar Gharaibeh, Mohamed Rahouti, Mohammad Ruhul Amin, Mohamed Abdallah, Junaid Qadir, Ala Al-Fuqaha
arXiv:2604.26981v1 Announce Type: cross Abstract: Large Language Models (LLMs) have revolutionized the field of natural language processing. However, they exhibit some limitations, including a lack of reliability and transparency: they may hallucinate and fail to provide sources …
arXiv:2603.06198v2 Announce Type: replace Abstract: Retrieval-Augmented Generation (RAG) is a framework in which a Generator, such as a Large Language Model (LLM), produces answers by retrieving documents from an external collection using a Retriever. In practice, Generators must…
arXiv:2604.26176v1 Announce Type: cross Abstract: The integration of Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) has significantly advanced Knowledge Graph Question Answering (KGQA). However, existing LLM-driven KGQA systems act as stateless planners, generating retrieval plans in isolation without exp…
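The editorial summary above associates this line of KGQA work with semantic caching. As a rough sketch of what a semantic cache might look like (the bag-of-words "embedding" and the similarity threshold are toy stand-ins invented for illustration, not the paper's components), a system can reuse an earlier answer when a new query is close enough to one already served:

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'; a real system would use a
    learned sentence encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """Return a cached answer when a new query is similar enough to a
    previously answered one, skipping retrieval and planning."""
    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.entries = []  # list of (query_embedding, answer)

    def get(self, query):
        q = embed(query)
        best = max(self.entries, key=lambda e: cosine(q, e[0]), default=None)
        if best is not None and cosine(q, best[0]) >= self.threshold:
            return best[1]
        return None

    def put(self, query, answer):
        self.entries.append((embed(query), answer))

cache = SemanticCache()
cache.put("who founded the acme corporation", "Jane Doe")
hit = cache.get("who founded acme corporation")   # near-duplicate query
miss = cache.get("capital of france")             # unrelated query
```

The cache hit avoids re-running the whole retrieval pipeline, which is where the latency and cost savings would come from.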
arXiv cs.CL
Weihang Su, Hanwen Zhang, Qingyao Ai, Yiqun Liu
arXiv:2604.26768v1 Announce Type: new Abstract: Parametric Retrieval-Augmented Generation (PRAG) encodes external documents into lightweight parameter modules that can be retrieved and merged at inference time, offering a promising alternative to in-context retrieval augmentation. Despite its potential, many PRAG implementatio…
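The abstract describes parameter modules that are retrieved and merged at inference time. One common way to merge such modules (not necessarily the paper's rule; the function and data layout are invented) is a weighted average of the per-document parameter deltas, sketched here with plain Python lists standing in for tensors:

```python
def merge_modules(modules, weights):
    """Merge per-document parameter modules by weighted averaging.
    Each module maps a parameter name to a flat list of delta values;
    a real system would use tensors (e.g. LoRA-style adapters).
    The merging rule is illustrative, not taken from the paper."""
    total = sum(weights)
    merged = {}
    for name in modules[0]:
        merged[name] = [
            sum(w * m[name][i] for m, w in zip(modules, weights)) / total
            for i in range(len(modules[0][name]))
        ]
    return merged

doc_a = {"layer0.delta": [1.0, 0.0]}   # module encoding document A
doc_b = {"layer0.delta": [0.0, 1.0]}   # module encoding document B
merged = merge_modules([doc_a, doc_b], weights=[0.5, 0.5])
# merged["layer0.delta"] == [0.5, 0.5]
```

Weighted averaging is the simplest composition rule; the OSD work mentioned in the summary above is precisely about making such compositions interfere less.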
arXiv:2603.16877v2 Announce Type: replace Abstract: Financial analysts face significant challenges extracting information from lengthy 10-K reports, which often exceed 100 pages. This paper presents a Retrieval-Augmented Generation (RAG) system designed to answer questions about …
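The snippet mentions reranking in a RAG pipeline over 10-K filings. As a minimal illustration of the reranking step (the token-overlap scorer below is a toy stand-in for a learned reranker such as a cross-encoder; names and data are invented):

```python
def rerank(query, chunks, top_k=2):
    """Re-order retrieved chunks by a relevance score and keep the
    best top_k. The overlap score is a toy stand-in for a learned
    reranking model."""
    q_tokens = set(query.lower().split())

    def score(chunk):
        c_tokens = set(chunk.lower().split())
        return len(q_tokens & c_tokens) / (len(c_tokens) or 1)

    return sorted(chunks, key=score, reverse=True)[:top_k]

chunks = [
    "Item 7 discusses liquidity and capital resources.",
    "The auditor's report covers internal controls.",
    "Net revenue grew due to higher capital expenditure.",
]
top = rerank("capital resources and liquidity", chunks)
```

In long-document settings like 10-K reports, first-stage retrieval casts a wide net and the reranker decides which few chunks actually reach the generator's context window.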
arXiv:2604.25182v1 Announce Type: new Abstract: A multilingual collection may contain useful knowledge in other languages to supplement and correct the facts in the original language for Retrieval-Augmented Generation (RAG). However, the vanilla approach that simply concatenates multiple pieces of knowledge from different lang…
arXiv cs.CL
Youngjoon Jang, Seongtae Hong, Junyoung Son, Sungjin Park, Chanjun Park, Heuiseok Lim
arXiv:2507.07847v3 Announce Type: replace Abstract: Retrieval-Augmented Generation (RAG) has emerged as a crucial framework in natural language processing (NLP), improving factual consistency and reducing hallucinations by integrating external document retrieval with large langua…
arXiv cs.LG
Zhuoling Li, Ha Linh Hong Tran Nguyen, Valeria Bladinieres, Maxim Romanovsky
arXiv:2604.24623v1 Announce Type: cross Abstract: Graph-based Retrieval-Augmented Generation (GraphRAG) extends traditional RAG by using knowledge graphs (KGs) to give large language models (LLMs) a structured, semantically coherent context, yielding more grounded answers. However, the GraphRAG reasoning process remains a black-box,…
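GraphRAG hands KG structure to the LLM as context. One simple way to do that (purely illustrative; the paper's actual serialization is not given in the snippet) is to linearize retrieved triples into plain sentences for the prompt:

```python
def linearize_triples(triples):
    """Render KG (subject, predicate, object) triples as sentences so
    they can be placed in an LLM prompt as structured context.
    Illustrative only; real GraphRAG systems vary in serialization."""
    return " ".join(f"{s} {p} {o}." for s, p, o in triples)

triples = [
    ("Marie Curie", "was born in", "Warsaw"),
    ("Marie Curie", "won", "the Nobel Prize in Physics"),
]
context = linearize_triples(triples)
# "Marie Curie was born in Warsaw. Marie Curie won the Nobel Prize in Physics."
```

Because each sentence maps back to a specific triple, this kind of serialization also gives an audit trail, which is relevant to the black-box concern the abstract raises.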
arXiv cs.AI
Miao Xie, Xiao Zhang, Yi Li, Chunli Lv
arXiv:2604.22843v1 Announce Type: cross Abstract: Retrieval-Augmented Generation (RAG) has been proposed to mitigate hallucinations in large language models (LLMs), where generated outputs may be factually incorrect. However, existing RAG approaches predominantly rely on vector s…
arXiv:2604.22757v1 Announce Type: cross Abstract: We introduce StratRAG, an open-source retrieval evaluation dataset for benchmarking Retrieval-Augmented Generation (RAG) systems on multi-hop reasoning tasks under realistic, noisy document-pool conditions. Derived from HotpotQA (…
arXiv:2510.11541v2 Announce Type: replace Abstract: Retrieval-augmented generation (RAG) has demonstrated its ability to enhance Large Language Models (LLMs) by integrating external knowledge sources. However, multi-hop questions, which require the identification of multiple know…
arXiv cs.CL
Lichang Song, Ting Long, Yi Chang
arXiv:2602.18734v2 Announce Type: replace Abstract: Retrieval-Augmented Generation (RAG) has demonstrated strong effectiveness in knowledge-intensive tasks by grounding language generation in external evidence. Despite its success, many existing RAG systems are built based on a r…