PulseAugur

New K-12 knowledge graph benchmarks LLM curriculum cognition

Researchers have developed K12-KGraph, a knowledge graph designed to evaluate and train large language models (LLMs) for K-12 education. The graph, derived from official textbooks, captures curriculum structure, including prerequisites and concept relationships, rather than only factual recall. Alongside it, they built K12-Bench, a 23,640-question benchmark, and K12-Train, a fine-tuning dataset. Experiments show that current LLMs struggle with curriculum cognition, and that fine-tuning on K12-Train significantly improves performance on educational benchmarks with high sample efficiency.

Impact: Establishes a new benchmark for evaluating LLM understanding of educational curricula, potentially driving development of more pedagogically aware AI.

Ranking rationale: The cluster describes a new academic paper introducing a dataset and benchmark for evaluating LLMs in an educational context.

Read on arXiv cs.CL →

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

  1. arXiv cs.CL · English (EN) · Wentao Zhang

    K12-KGraph: A Curriculum-Aligned Knowledge Graph for Benchmarking and Training Educational LLMs

    Large language models (LLMs) are increasingly used in K-12 education, yet existing benchmarks such as C-Eval, CMMLU, GaokaoBench, and EduEval mainly evaluate factual recall through exam-style question answering. Effective educational AI additionally requires curriculum cognition:…