Researchers have developed a pipeline that uses large language models (LLMs) as judges for educational assessment, specifically question-level marking in preparation for university admissions exams. The system grounds LLM outputs in official curriculum documents and marking guidelines to improve accuracy and consistency. The pipeline identifies each question's topic and cognitive demand, then uses syllabus artifacts to generate rubrics and evaluate student responses. Its results are reported as comparable to those of human tutors, with more traceable justifications.
AI IMPACT This research introduces a method for grounding LLM-based educational assessment in official curricula, which could improve the reliability and transparency of automated grading systems.
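The stages named in the summary (topic identification, cognitive-demand classification, syllabus grounding, rubric generation, marking) can be sketched as a simple chain of model calls. This is a minimal illustration under stated assumptions, not the researchers' implementation: the names `Judgement`, `retrieve_syllabus`, `judge`, and `fake_llm`, the prompt wording, and the naive keyword lookup that stands in for syllabus retrieval are all hypothetical, and the model is a stub so the example runs without any external service.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type: any function that maps a prompt string to a model reply.
LLM = Callable[[str], str]


@dataclass
class Judgement:
    topic: str
    cognitive_demand: str
    rubric: str
    verdict: str
    sources: list[str]  # syllabus excerpts the verdict can be traced back to


def retrieve_syllabus(topic: str, syllabus: dict[str, str]) -> list[str]:
    """Return excerpts whose key occurs in the topic (naive keyword match,
    an assumption standing in for whatever retrieval the real system uses)."""
    return [text for key, text in syllabus.items() if key.lower() in topic.lower()]


def judge(question: str, answer: str, syllabus: dict[str, str], llm: LLM) -> Judgement:
    # Stage 1: identify the question's topic and cognitive demand.
    topic = llm(f"Name the syllabus topic of this question: {question}")
    demand = llm(f"Classify the cognitive demand of this question: {question}")
    # Stage 2: ground later prompts in official syllabus text.
    excerpts = retrieve_syllabus(topic, syllabus)
    context = "\n".join(excerpts)
    # Stage 3: generate a rubric from the excerpts, then mark the answer.
    rubric = llm(f"Write a marking rubric using only these excerpts:\n{context}\nQuestion: {question}")
    verdict = llm(f"Mark the answer against the rubric.\nRubric: {rubric}\nAnswer: {answer}")
    return Judgement(topic, demand, rubric, verdict, excerpts)


def fake_llm(prompt: str) -> str:
    """Stub model with canned replies, used only to make the sketch runnable."""
    if prompt.startswith("Name"):
        return "Quadratic equations"
    if prompt.startswith("Classify"):
        return "apply"
    if prompt.startswith("Write"):
        return "1 mark: both roots correct"
    return "1/1"


syllabus = {
    "quadratic": "Solve quadratics by factorising.",
    "probability": "Compute probabilities of independent events.",
}
result = judge("Solve x^2 - 5x + 6 = 0.", "x = 2 or x = 3", syllabus, fake_llm)
```

Returning the retrieved excerpts alongside the verdict is one way to make a justification traceable: a reviewer can check each mark against the syllabus text the rubric was built from.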
RANK_REASON The cluster contains an academic paper detailing a new research methodology and pipeline.
AI-generated summary · Google Gemini · from 1 source.