CrowdMath: A Dataset of Crowdsourced Mathematical Research Discussions
Researchers have introduced CrowdMath, a new dataset comprising 164 annotated discussion chains from a collaborative mathematical research program. The dataset captures the nuances of open-problem solving, including partial arguments, error identification, and reasoning repair, which are absent from existing benchmarks. While frontier models show promise in predicting the flow of mathematical discussions, they struggle to accurately classify the functional roles of individual contributions within these collaborative efforts.
AI IMPACT: This dataset could push frontier models to better understand and participate in complex, collaborative problem-solving scenarios.