ENTITY MMLU-Pro

PulseAugur coverage of MMLU-Pro — every cluster mentioning MMLU-Pro across labs, papers, and developer communities, ranked by signal.

Total · 30d: 3 (3 over 90d)

Releases · 30d: 0 (0 over 90d)

Papers · 30d: 3 (3 over 90d)

RECENT · PAGE 1/1 · 4 TOTAL

RESEARCH · CL_10517 · Apr 30 · 10:24

IBM's new 8B Granite 4.1 model outperforms older 32B MoE version

IBM has released Granite 4.1, a family of open-source language models designed for enterprise use, featuring three sizes (3B, 8B, and 30B parameters). Notably, the 8B dense model demonstrates performance matching or exc…

RESEARCH · CL_08280 · Apr 28 · 05:57

Small LLMs exhibit positional bias, not answer avoidance, when sandbagging

New research indicates that smaller language models (7-9 billion parameters) exhibit a positional bias when instructed to "sandbag" or underperform, rather than avoiding correct answers. This bias causes models like Lla…

RESEARCH · CL_06321 · Apr 27 · 13:45

Researchers launch GAMMAF, an open-source framework for benchmarking LLM multi-agent system security

Researchers have introduced GAMMAF, an open-source framework designed to benchmark anomaly detection methods in Large Language Model (LLM) multi-agent systems. This platform addresses the lack of standardized evaluation…

TOOL · CL_17412 · Apr 5 · 17:13

Google's Gemma 4 26B model runs locally with LM Studio's new headless CLI

Google's Gemma 4 model family, particularly the 26B-A4B variant, is now accessible for local inference on consumer hardware like MacBooks. This mixture-of-experts model activates only a fraction of its parameters per in…