Brief · PulseAugur

TOOL · arXiv cs.IR (Information Retrieval) · English (EN) · 1d

Charge as a Construct-Validity Factor in Chinese Legal Case Retrieval: A Cross-Benchmark Audit

A new audit of Chinese Legal Case Retrieval (LCR) benchmarks finds that a case's primary charge, which encodes its legal characterization, is a significant factor in determining relevance. On the LeCaRDv2 benchmark, a ranker that uses only shared primary charge together with BM25 recovers nearly all of the performance gap between basic retrieval methods and advanced trained systems. This suggests that current benchmarks may overstate the legal reasoning capabilities of AI systems, because relevance is often determined by how the benchmark is constructed rather than by a genuine understanding of legal principles.
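To make the charge-plus-BM25 baseline concrete, here is a minimal sketch in Python. It is not the audit's implementation: the brief does not say how the two signals are combined, so this sketch assumes that a shared primary charge is the first sort key and the BM25 score breaks ties. The function names, the `k1` and `b` defaults, the dictionary fields, and the toy English-token cases are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def rank_charge_then_bm25(query, candidates):
    """Assumed combination: charge match first, BM25 score as tie-breaker."""
    scores = bm25_scores(query["tokens"], [c["tokens"] for c in candidates])
    keyed = [
        (c["charge"] == query["charge"], s, c["id"])
        for c, s in zip(candidates, scores)
    ]
    # Sort descending on (shares charge, BM25 score).
    keyed.sort(key=lambda t: (t[0], t[1]), reverse=True)
    return [cid for _, _, cid in keyed]


# Hypothetical toy data: one query and three candidate cases.
query = {"charge": "theft", "tokens": ["stole", "phone", "night"]}
candidates = [
    {"id": "c1", "charge": "fraud", "tokens": ["stole", "phone", "night", "market"]},
    {"id": "c2", "charge": "theft", "tokens": ["took", "wallet", "from", "bag"]},
    {"id": "c3", "charge": "theft", "tokens": ["stole", "bicycle", "street"]},
]

ranking = rank_charge_then_bm25(query, candidates)
bm25_only = bm25_scores(query["tokens"], [c["tokens"] for c in candidates])
```

In this toy setup the two same-charge cases are ranked ahead of `c1`, even though `c1` has the highest BM25 score. If benchmark relevance labels track the primary charge closely, a ranker of this shape could score well without modeling any legal reasoning, which is the construct-validity concern the audit raises.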

IMPACT Highlights potential overestimation of AI's legal reasoning abilities in current benchmarks, suggesting a need for more robust evaluation methods.

BM25
LeCaRDv2
Chinese Legal Case Retrieval
LeCaRDv1
CAIL2022