Evaluating Universal Machine Learning Force Fields Against Experimental Measurements
UniFFBench, a new evaluation framework featuring the MinX dataset, assesses universal machine learning force fields (UMLFFs) against experimental measurements. The framework includes over 1,500 mineral systems under extreme conditions and validates predictions against experimental data. An evaluation of six leading UMLFFs revealed a significant "reality gap": models that performed well on computational benchmarks struggled with experimental complexity, with prediction errors too high for practical applications.
AI IMPACT: Highlights limitations in current ML force fields, potentially guiding future research toward more experimentally grounded models.