Two new research papers examine how well large language models (LLMs) perform at automated essay scoring (AES). The first synthesizes 65 studies and finds that agreement between LLM and human scores varies significantly with context. The second investigates domain-adaptive pretraining (DAPT) on learner corpora for AES and suggests that targeted DAPT can improve in-domain scoring but does not consistently improve cross-dataset transferability.
AI IMPACT These studies show that LLM performance in educational assessment is nuanced, and they point to areas where further research and development are needed before LLMs can be applied reliably.
RANK_REASON The cluster contains two academic papers published on arXiv discussing research findings related to LLMs and automated essay scoring.
AI-generated summary · Google Gemini · from 3 sources.