Answer Presence Drives RAG Rewriting Gains
Researchers examined the gains that a "rewriter" LLM brings to retrieval-augmented generation (RAG) pipelines for question answering. Their findings suggest that the observed F1 improvements are not solely due to better evidence curation; they are driven in large part by whether the gold answer string appears in the rewritten context. In their experiments, removing the gold answer from the rewritten context sharply reduced performance, while injecting it into rewrites that lacked it produced notable gains across various models and datasets.
IMPACT: Reveals that answer presence, not just evidence quality, drives RAG performance, suggesting a need for new evaluation methods.
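To make the removal and injection experiments described above concrete, here is a minimal Python sketch of an answer-presence check and the two interventions. It is an illustration under stated assumptions, not the researchers' code: the function names (`contains_answer`, `remove_answer`, `inject_answer`), the SQuAD-style normalization and token-level F1, and the choice to inject by appending the answer are all hypothetical. The summary does not say how the study matched, removed, or injected answer strings.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """SQuAD-style normalization (assumed): lowercase, drop punctuation and articles."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def token_f1(prediction: str, gold: str) -> float:
    """Token-overlap F1 between a predicted answer and the gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def contains_answer(context: str, gold: str) -> bool:
    """True if the normalized gold answer appears as whole words in the context."""
    gold_norm = normalize(gold)
    if not gold_norm:
        return False
    return f" {gold_norm} " in f" {normalize(context)} "


def remove_answer(context: str, gold: str, mask: str = "") -> str:
    """Ablation (assumed form): replace every case-insensitive match of the gold string."""
    pattern = re.compile(re.escape(gold), flags=re.IGNORECASE)
    return " ".join(pattern.sub(mask, context).split())


def inject_answer(context: str, gold: str) -> str:
    """Injection (assumed form): append the gold string only if it is absent."""
    if contains_answer(context, gold):
        return context
    return f"{context} {gold}".strip()


if __name__ == "__main__":
    rewrite = "The Eiffel Tower is located in Paris, France."
    gold = "Paris"
    ablated = remove_answer(rewrite, gold)
    restored = inject_answer(ablated, gold)
    for name, ctx in [("original", rewrite), ("ablated", ablated), ("injected", restored)]:
        print(name, contains_answer(ctx, gold), repr(ctx))
```

In an evaluation of this kind, each variant of the rewritten context would be passed to the reader model and the resulting answers scored with `token_f1`. Comparing scores across the original, ablated, and injected contexts separates the effect of answer presence from the effect of evidence quality.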