SCI-PRM: A Tool Aware Process Reward Model for Scientific Reasoning Verification
Researchers have developed Sci-PRM, a process reward model designed to improve scientific reasoning in AI systems. The model is trained on a new dataset, SCIPRM70K, which contains detailed "Chain-of-Tool" trajectories that interleave reasoning with the execution of scientific tools. Sci-PRM provides fine-grained supervision on tool selection, accuracy, and interpretation, helping foundation models carry out complex scientific tasks with fewer hallucinations.
AI IMPACT: Enhances AI's capability in complex scientific domains by improving tool usage and factual consistency.