GradingAttack: Exposing Security Vulnerabilities in LLM-Based Educational Grading Agents
Researchers have developed a framework called GradingAttack to expose security vulnerabilities in educational grading agents based on large language models (LLMs). The study introduces token-level and prompt-level attack strategies designed to manipulate grading outcomes while remaining hard to detect. In the reported experiments, these attacks effectively compromised grading agents, highlighting the need for more secure LLM systems in education.
IMPACT: Highlights critical security flaws in LLM-based educational tools, necessitating the development of more robust and trustworthy AI systems for academic integrity.