PulseAugur
EN
LIVE 14:06:43
ENTITY evalplus

evalplus

PulseAugur coverage of evalplus — every cluster mentioning evalplus across labs, papers, and developer communities, ranked by signal.

Show in brief
Total · 30d
2
2 over 90d
Releases · 30d
0
0 over 90d
Papers · 30d
2
2 over 90d
TIER MIX · 90D
TOPICS
SENTIMENT · 30D

1 day(s) with sentiment data

RECENT · PAGE 1/1 · 2 TOTAL
  1. TOOL · CL_85566 ·

    LLM benchmarks saturate quickly due to training data contamination

    Public LLM benchmarks are becoming saturated and less useful for differentiating top-tier models due to their training data inadvertently including benchmark questions. This contamination issue, observed in benchmarks l…

  2. RESEARCH · CL_10517 ·

    IBM's new 8B Granite 4.1 model outperforms older 32B MoE version

    IBM has released Granite 4.1, a family of open-source language models designed for enterprise use, featuring three sizes (3B, 8B, and 30B parameters). Notably, the 8B dense model demonstrates performance matching or exc…