PulseAugur

LLM-as-Judge pipeline grounds AI marking in official curriculum

Researchers have developed a pipeline that uses large language models (LLMs) as judges for question-level marking of student work in preparation for university admissions exams. The system grounds LLM outputs in official curriculum documents and marking guidelines, with the aim of making marking accurate and consistent. The pipeline identifies each question's topic and cognitive demand, then uses syllabus artifacts to generate rubrics and evaluate student responses. Its results are reported as comparable to those of human tutors, with more traceable justifications.
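The stages named in the summary (topic identification, cognitive-demand classification, rubric generation from syllabus material, and response evaluation) can be pictured with a minimal Python sketch. Everything here is a hypothetical illustration and not the paper's implementation: the names `SyllabusEntry`, `identify_topic`, `classify_demand`, `build_rubric`, `mark`, and `keyword_judge` are assumptions, the keyword heuristics stand in for LLM calls, and the sample syllabus is invented for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class SyllabusEntry:
    """Hypothetical stand-in for one official curriculum item."""
    topic: str
    keywords: tuple[str, ...]
    marking_guideline: str  # assumed format: criteria separated by ';'


def identify_topic(question: str, syllabus: list[SyllabusEntry]) -> SyllabusEntry:
    # Pick the entry whose keywords overlap the question most.
    # A real pipeline would likely ask an LLM; this is a toy heuristic.
    words = set(question.lower().split())
    return max(syllabus, key=lambda e: len(words & set(e.keywords)))


def classify_demand(question: str) -> str:
    # Toy cognitive-demand classifier based on command verbs (assumption).
    q = question.lower()
    if any(v in q for v in ("evaluate", "justify", "prove")):
        return "high"
    if any(v in q for v in ("explain", "calculate", "solve")):
        return "medium"
    return "low"


def build_rubric(entry: SyllabusEntry) -> list[str]:
    # Derive rubric criteria directly from the marking guideline text,
    # so every criterion traces back to a curriculum document.
    return [c.strip() for c in entry.marking_guideline.split(";") if c.strip()]


def mark(question: str, answer: str, syllabus: list[SyllabusEntry],
         judge: Callable[[str, str], bool]) -> dict:
    entry = identify_topic(question, syllabus)
    criteria = build_rubric(entry)
    # One verdict per criterion keeps the justification traceable.
    verdicts = [(c, judge(c, answer)) for c in criteria]
    return {
        "topic": entry.topic,
        "demand": classify_demand(question),
        "score": sum(ok for _, ok in verdicts),
        "max_score": len(verdicts),
        "justification": verdicts,
    }


def keyword_judge(criterion: str, answer: str) -> bool:
    # Stub judge: replace with an LLM call in practice (assumption).
    return criterion.split()[-1].lower() in answer.lower()


SYLLABUS = [
    SyllabusEntry("Quadratics", ("quadratic", "roots", "equation"),
                  "uses the discriminant; states both roots"),
    SyllabusEntry("Probability", ("probability", "dice", "random"),
                  "lists the outcomes; gives a probability"),
]

result = mark(
    "Solve the quadratic equation x^2 - 5x + 6 = 0",
    "The discriminant is 1, so the roots are x = 2 and x = 3.",
    SYLLABUS,
    keyword_judge,
)
print(result["topic"], result["demand"], result["score"], result["max_score"])
```

Passing the judge in as a callable keeps the curriculum grounding (topic, rubric) separate from the model that issues verdicts, so the stub can be swapped for a real LLM client without touching the rest of the sketch.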

IMPACT This research introduces a novel method for grounding LLM-based educational assessment in official curricula, potentially improving the reliability and transparency of automated grading systems.

RANK_REASON The cluster contains an academic paper detailing a new research methodology and pipeline.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.AI TIER_1 English (EN) · Xiwei Xu, Chen Wang, Jacky Jiang, Phil Yang, Qian Fu, Mohan Dhall, Wenjie Zhang, Liming Zhu

    LLM-as-Judge in Education: A Curriculum-Grounded Marking Pipeline

    arXiv:2606.17507v1. Abstract: Generative AI and large language models (LLMs) are increasingly applied to question generation and automated assessment. However, deploying LLMs in preparation for high-stakes exams requires more than prompt engineering; it demands …