Fine-grained Claim-level RAG Benchmark for Law
Researchers have introduced ClaimRAG-LAW, a benchmark dataset for evaluating retrieval-augmented generation (RAG) systems in the legal domain. The dataset covers both French and English and includes diverse question types aimed at legal experts and non-experts alike. Evaluating current state-of-the-art legal RAG systems with this benchmark revealed significant limitations in both retrieval and generation at the fine-grained claim level.
AI IMPACT: Provides a more granular evaluation of legal RAG systems, which could help improve accuracy and reduce hallucinations in AI-generated legal responses.