
TOOL · arXiv cs.CL English(EN) · 1w

GradeLegal: Automated Grading for German Legal Cases

Researchers have developed GradeLegal, a system that uses large language models to automate the grading of German legal exam solutions. The study evaluated 27 LLMs across several prompting strategies and found that reasoning-oriented models can reach high agreement with expert graders in public law, with a quadratic weighted kappa of 0.91. Agreement in criminal law was lower, which suggests that grading in that area is a harder task. Ensembling multiple models further improved grading accuracy, offering a potential alternative to top-tier proprietary models.
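
For readers unfamiliar with the metric, the sketch below shows one common way to compute quadratic weighted kappa between a model's grades and an expert's grades, plus a simple median ensemble over several models. It is illustrative only: the brief does not state the grading scale or the ensembling method used in the study, so the 0 to 18 point range, the function names, and the sample grades are all assumptions.

```python
from collections import Counter
from statistics import median_low


def quadratic_weighted_kappa(rater_a, rater_b, min_grade, max_grade):
    """Agreement between two lists of integer grades (1.0 = perfect)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters need the same non-zero number of grades")
    if max_grade <= min_grade:
        raise ValueError("max_grade must be greater than min_grade")

    total = len(rater_a)
    span = max_grade - min_grade
    observed = Counter(zip(rater_a, rater_b))  # joint grade counts
    hist_a = Counter(rater_a)                  # marginal counts, rater A
    hist_b = Counter(rater_b)                  # marginal counts, rater B

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(min_grade, max_grade + 1):
        for j in range(min_grade, max_grade + 1):
            # Quadratic penalty: larger grade gaps cost disproportionately more.
            weight = ((i - j) ** 2) / (span ** 2)
            weighted_observed += weight * observed[(i, j)]
            # Expected count if the two raters graded independently.
            weighted_expected += weight * hist_a[i] * hist_b[j] / total

    if weighted_expected == 0:
        # Both raters gave one identical grade throughout; kappa is
        # undefined, and this sketch reports it as full agreement.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


def ensemble_median(grades_per_model):
    """Combine several models' grades by taking the per-exam (low) median."""
    return [median_low(column) for column in zip(*grades_per_model)]


if __name__ == "__main__":
    # Hypothetical grades on an assumed 0-18 point scale.
    expert = [6, 10, 12, 4, 9]
    models = [
        [5, 10, 12, 6, 9],
        [7, 10, 3, 4, 8],
        [6, 11, 12, 4, 9],
    ]
    combined = ensemble_median(models)
    print("ensemble grades:", combined)
    print("QWK vs expert:", quadratic_weighted_kappa(expert, combined, 0, 18))
```

The median is used here because it is robust to a single model producing an outlier grade; a mean, a majority vote, or a learned combination would be equally plausible choices.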

IMPACT Automated grading systems could streamline feedback for legal students and reduce bottlenecks for educators.

LLMs
German legal exams
GradeLegal
Abdullah Al Zubaer