Cross-Dataset Bloom Question Classification: Supervised Models and Prompted LLMs
Researchers evaluated Large Language Models (LLMs) for classifying assessment questions by Bloom's taxonomy level, a task that could reduce instructor workload. Traditional supervised machine learning and deep learning models showed a substantial drop in performance when applied to datasets they were not trained on. In contrast, prompted LLMs performed more consistently across datasets, suggesting they are the more robust option for this task. The study also introduced a user-friendly interface to help instructors classify question banks; it was found to be highly usable and to require minimal effort.
AI IMPACT: LLMs offer a more generalizable approach to educational question classification, potentially reducing instructor workload and improving assessment consistency.
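The cross-dataset comparison described above can be illustrated with a minimal sketch. It is not the study's method: a toy word-count classifier stands in for the supervised models, and the two tiny question sets (`dataset_a`, `dataset_b`) are invented here purely for illustration. The point is only the protocol, which trains on one dataset and then scores both on that dataset and on a differently worded one.

```python
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the question and split on whitespace, dropping basic punctuation."""
    for mark in "?.,":
        text = text.replace(mark, " ")
    return text.lower().split()

def train(examples):
    """Count how often each word appears under each Bloom level."""
    counts = defaultdict(Counter)
    for question, level in examples:
        counts[level].update(tokenize(question))
    return counts

def predict(model, question):
    """Return the level whose training vocabulary overlaps most with the question."""
    tokens = tokenize(question)
    scores = {level: sum(words[t] for t in tokens) for level, words in model.items()}
    # sorted() makes ties resolve deterministically (alphabetical order of levels)
    return max(sorted(scores), key=scores.get)

def accuracy(model, examples):
    """Fraction of questions whose predicted level matches the label."""
    correct = sum(predict(model, q) == level for q, level in examples)
    return correct / len(examples)

# Hypothetical toy data: same two Bloom levels, different phrasing conventions.
dataset_a = [
    ("Define the term algorithm.", "Remember"),
    ("List the layers of the OSI model.", "Remember"),
    ("Compare quicksort and mergesort.", "Analyze"),
    ("Contrast TCP and UDP.", "Analyze"),
]
dataset_b = [
    ("What is meant by recursion?", "Remember"),
    ("Break down the causes of a deadlock.", "Analyze"),
]

model = train(dataset_a)
in_domain = accuracy(model, dataset_a)      # scored on the training dataset
cross_domain = accuracy(model, dataset_b)   # scored on an unseen dataset

print(f"in-domain accuracy:     {in_domain:.2f}")
print(f"cross-dataset accuracy: {cross_domain:.2f}")
```

Because the toy model only memorizes surface vocabulary from `dataset_a`, it has nothing to go on when `dataset_b` phrases the same levels differently. That is a simplified picture of why a model fitted to one question bank may not transfer to another.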