A recent study by Emory University and IBM Research investigated how stale documents affect retrieval-augmented generation (RAG) systems. The experiment found that outdated information in a RAG system's index can lead to inaccurate model responses, much as adversarial poisoning does. The study tested three retrieval configurations: dense vector retrieval with HNSW (Hierarchical Navigable Small World graphs), BM25 sparse retrieval, and a governed selector. The governed selector, which pre-filters documents by eligibility and version, achieved a 97% pass rate, outperforming the other two methods on stale data and offering a more robust defense against potential poisoning attacks.
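The pre-filtering idea can be illustrated with a minimal Python sketch. It is not the study's implementation: the `Doc` fields, the `eligible` flag, the rule that only the newest version of a document may be served, and the toy term-overlap ranking are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str      # logical document identity, shared across versions (hypothetical field)
    version: int     # higher means newer (hypothetical field)
    eligible: bool   # hypothetical governance flag, e.g. approved and not retracted
    text: str


def governed_candidates(docs):
    """Keep only the newest version of each document, and only if it is eligible."""
    latest = {}
    for doc in docs:
        current = latest.get(doc.doc_id)
        if current is None or doc.version > current.version:
            latest[doc.doc_id] = doc
    return [d for d in latest.values() if d.eligible]


def overlap_score(query, doc):
    """Toy relevance score: number of distinct query terms found in the document."""
    return len(set(query.lower().split()) & set(doc.text.lower().split()))


def retrieve(query, docs, k=3):
    """Filter first, then rank, so a stale version is never a ranking candidate."""
    candidates = governed_candidates(docs)
    ranked = sorted(candidates, key=lambda d: overlap_score(query, d), reverse=True)
    return ranked[:k]


corpus = [
    Doc("refund-policy", 1, True, "refund window is 30 days"),   # stale version
    Doc("refund-policy", 2, True, "refund window is 14 days"),   # current version
    Doc("shipping", 1, False, "shipping refund takes 5 days"),   # ineligible
]
results = retrieve("refund window days", corpus)
```

In this sketch the stale version 1 of `refund-policy` and the ineligible `shipping` document are removed before any scoring happens, so no similarity score, however high, can bring them back. A real system would replace `overlap_score` with BM25 or a vector index and would need its own policy for documents whose newest version is ineligible.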
IMPACT Highlights the critical need for robust document management in RAG systems to ensure accuracy and security.
RANK_REASON Research paper detailing findings on RAG system performance with stale data.
- BM25
- ContextNest
- Emory University
- Hierarchical Navigable Small World graphs
- IBM Research
- retrieval-augmented generation
AI-generated summary · Google Gemini · from 1 source.