New RAG research tackles adversarial attacks and bias
By PulseAugur Editorial · [14 sources]
Two new research papers explore methods to improve the reliability and fairness of Retrieval-Augmented Generation (RAG) systems. One paper introduces BiRD, a defense mechanism that uses bidirectional ranking to detect and mitigate adversarial poisoning attacks, significantly reducing attack success rates while maintaining task accuracy. The other paper proposes a fairness-aware retrieval framework that models and controls bias introduced during the retrieval process, aiming to balance relevance and fairness in RAG outputs.
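To make the relevance-versus-fairness trade-off concrete, here is a minimal sketch of one generic approach: greedy top-k re-ranking with a penalty on over-represented groups. It is an illustration only and not the method from the paper summarized above. The `(doc_id, relevance, group)` schema, the function name `fair_top_k`, and the weight `lam` are all assumptions introduced for this example.

```python
from collections import Counter


def fair_top_k(candidates, k, lam=0.5):
    """Greedily pick k documents, trading relevance against group balance.

    candidates: list of (doc_id, relevance, group) tuples (hypothetical schema).
    lam: 0.0 means pure relevance ranking; larger values penalize groups
         that already hold a large share of the selected documents.
    """
    selected = []
    counts = Counter()
    remaining = list(candidates)

    while remaining and len(selected) < k:
        def score(candidate):
            _, relevance, group = candidate
            # Share of already-selected documents that come from this group.
            share = counts[group] / len(selected) if selected else 0.0
            return relevance - lam * share

        best = max(remaining, key=score)
        selected.append(best)
        counts[best[2]] += 1
        remaining.remove(best)

    return [doc_id for doc_id, _, _ in selected]


docs = [("a1", 0.9, "A"), ("a2", 0.85, "A"), ("a3", 0.8, "A"), ("b1", 0.6, "B")]
by_relevance = fair_top_k(docs, k=3, lam=0.0)
balanced = fair_top_k(docs, k=3, lam=0.5)
```

With `lam=0.0` the selection is ordinary relevance ranking; raising `lam` lets a lower-scoring document from an under-represented group displace a near-duplicate from the dominant group. How groups are defined and how the penalty is calibrated are the hard parts, and they are left open here.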
AI
IMPACT
New research offers methods to enhance RAG system security against attacks and improve fairness, potentially increasing trust and adoption.
RANK_REASON
Two academic papers published on arXiv detailing new methods for improving Retrieval-Augmented Generation systems.
arXiv:2605.26356v1 Announce Type: new Abstract: In-context learning has recently been linked to implicit gradient descent in linear self-attention models, suggesting that context can induce a forward-pass update. Retrieval-augmented generation (RAG) also relies on context, but re…
arXiv cs.AI
TIER_1 · English (EN) · Yu-Chen Den, Yung-Yu Shih, Zhi Rui Tam, Kuan-Yu Chen, Pu-Jen Cheng, Yun-Nung Chen, Eugene Yang
arXiv:2605.26902v1 Announce Type: cross Abstract: Generative retrieval (GR) maps queries directly to document identifiers (docids) using parametric knowledge. However, this design makes corpus expansion costly: adding new documents requires updating model parameters to encode new…
arXiv cs.AI
TIER_1 · English (EN) · Tetsuya Sakai, Jina Lee, Hanpei Fang, Young-In Song
arXiv:2605.26400v1 Announce Type: cross Abstract: We propose a framework for evaluating structured generative search summaries that are placed atop organic web search results. A structured summary, generated by a large language model, typically consists of an overview, several se…
arXiv cs.AI
TIER_1 · English (EN) · Zhe Yu, Wenpeng Xing, Chen Ye, Xuyang Teng, Bo Yang, Changting Lin, Meng Han
arXiv:2605.27157v1 Announce Type: new Abstract: Retrieval-augmented LLMs are deployed for tasks where evidence quality determines action safety, yet evaluation protocols assume that single-turn robustness predicts robustness when evidence accumulates across turns. We show this as…
Retrieval-augmented LLMs are deployed for tasks where evidence quality determines action safety, yet evaluation protocols assume that single-turn robustness predicts robustness when evidence accumulates across turns. We show this assumption is fundamentally incorrect. Models exhi…
Generative retrieval (GR) maps queries directly to document identifiers (docids) using parametric knowledge. However, this design makes corpus expansion costly: adding new documents requires updating model parameters to encode new document-docid associations, which incurs repeated train…
arXiv:2605.25379v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) has become the standard way to ground large language models in external knowledge, but many systems still organize evidence as flat chunks and retrieve it through largely unstructured search. Thi…
We propose a framework for evaluating structured generative search summaries that are placed atop organic web search results. A structured summary, generated by a large language model, typically consists of an overview, several sections with section titles, and a list of source d…
The growing adoption of Retrieval-Augmented Generation (RAG) has led to a rise in adversarial attacks. Existing defenses, relying on semantic analysis or voting, face a trade-off between high computational cost and limited robustness under strong poisoning attacks. Their fundamen…
Retrieval-Augmented Generation (RAG) improves reliability of large language models by incorporating external knowledge, but the retrieval process can introduce bias that propagates to generated outputs. This issue is particularly challenging in top-k settings, where multiple docu…
arXiv:2605.25039v1 Announce Type: new Abstract: Large language models (LLMs) demonstrate strong performance in natural language processing but often generate factual errors when relying solely on parametric knowledge. Retrieval-Augmented Generation (RAG) mitigates these errors by…
RAG sounds complicated. It's not. But a lot of introductions to RAG make it sound more mysterious than it actually is. They use terms like "semantic search" and "vector embeddings" and "retrieval pipeline" before explaining what the actual problem is. So l…
Hey. I built Aiki, a lightweight tool that lets you chat with Wikipedia locally. What it does: - Downloads and chunks Wikipedia articles (you can choose those articles by their name or articles and also the option of…