PulseAugur

JudgmentBench

PulseAugur coverage of JudgmentBench: every cluster mentioning JudgmentBench across labs, papers, and developer communities, ranked by signal.

Total · 30d: 1 (1 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 1 (1 over 90d)
SENTIMENT · 30D: 1 day with sentiment data

RECENT · PAGE 1/1 · 1 TOTAL
  1. TOOL · CL_51032

    JudgmentBench dataset shows preference judgments outperform rubrics for AI evaluation

    Researchers have introduced JudgmentBench, a new benchmark dataset designed to compare rubric-based scoring against pairwise preference judgments for evaluating AI model outputs. The dataset comprises 1,539 rubric score…