Researchers have developed a new framework called TRACE to detect poisoning attacks in retrieval-augmented generation (RAG) systems. These attacks manipulate RAG models by inserting malicious documents into their retrieval corpora, leading to incorrect outputs. TRACE offers a lightweight solution by analyzing token influence attribution to identify these poisoned answers, bypassing the need for computationally intensive auxiliary classifiers or LLM verification. Experiments show TRACE effectively detects poisoning and reveals attacker-specified target answers across various QA benchmarks and LLMs. AI
IMPACT Enhances the security and reliability of retrieval-augmented generation systems, crucial for many AI applications.
RANK_REASON The cluster contains a research paper detailing a new framework for detecting attacks on AI systems.
- arXiv
- Hugging Face
- LLM
- QA
- retrieval-augmented generation
- TRACE
- alphaXiv
- CatalyzeX
- DagsHub
- Gotit.pub
- Influence Flower
- LLMs
- ScienceCast
AI-generated summary · Google Gemini · from 2 sources. How we write summaries →