PulseAugur

LLM-as-judges

PulseAugur coverage of LLM-as-judges — every cluster mentioning LLM-as-judges across labs, papers, and developer communities, ranked by signal.

Total · 30d: 1 (1 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 1 (1 over 90d)
RECENT · PAGE 1/1 · 1 TOTAL
  1. RESEARCH · CL_22175

    Study reveals rubric design impacts human-autorater agreement in LLM evaluations

    A new research paper examines how changes to evaluation rubrics affect agreement between human evaluators and autoraters (AI models acting as judges). The study found that providing clear examples and context wi…