BigFinanceBench: A Workflow-Grounded Benchmark for Financial-Research Agents
Researchers have introduced BigFinanceBench, a benchmark that evaluates whether financial-research agents derive their answers in an auditable way. It comprises 928 expert-authored tasks with detailed rubrics that assess the full workflow rather than only the final output. In initial evaluations of ten leading AI agents, the best performer earned only 58.8% of the rubric score, indicating significant room for improvement in financial research capabilities.
IMPACT: This benchmark will drive development of more transparent and auditable AI agents for financial research.