When Retrieval Doesn't Help: A Large-Scale Study of Biomedical RAG
A new study posted on arXiv challenges the effectiveness of Retrieval-Augmented Generation (RAG) in biomedical question answering. The researchers found that RAG provided only minor and inconsistent improvements across the models and datasets tested, and that the choice of base model had a far greater impact on performance. The findings suggest that current large language models struggle to make effective use of retrieved information, which points to model capabilities, rather than retrieval methods, as the primary bottleneck.
IMPACT: The findings suggest that current LLMs need improved reasoning over retrieved data, potentially shifting focus from RAG enhancements to core model capabilities.