GeoDial: A Multimodal Conversational Tutoring Dataset for Geometry Problem-Solving with Visual Tutor Turns
Researchers have introduced GeoDial, a new multimodal dataset designed to train AI tutors for geometry problem-solving. The dataset comprises over 1,300 teacher-student dialogs in which instructional turns are explicitly linked to visual diagram highlights. While fine-tuning vision-language models on GeoDial improved their conversational tutoring abilities, these models still struggled to generate accurate diagram highlights, indicating a need for better integration of visual reasoning and pedagogical interaction in AI.
IMPACT This dataset could advance the development of AI tutors capable of visually grounded instruction, addressing a key limitation in current AI educational tools.