ENTITY LLM-as-judges

PulseAugur coverage of LLM-as-judges — every cluster mentioning LLM-as-judges across labs, papers, and developer communities, ranked by signal.

Total · 30d: 1 (1 over 90d)

Releases · 30d: 0 (0 over 90d)

Papers · 30d: 1 (1 over 90d)

RECENT · PAGE 1/1 · 1 TOTAL

RESEARCH · CL_22175 · May 7 · 13:55

Study reveals rubric design impacts human-autorater agreement in LLM evaluations

A new research paper examines how changes to evaluation rubrics affect agreement between human evaluators and autoraters (AI models acting as judges). The study found that providing clear examples and context wi…