PulseAugur

CarryOnBench

PulseAugur coverage of CarryOnBench — every cluster mentioning CarryOnBench across labs, papers, and developer communities, ranked by signal.

Total · 30d: 1 (1 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 1 (1 over 90d)
RECENT · 1 TOTAL
  1. RESEARCH · CL_11470

    New benchmark tests LLMs' ability to recover helpfulness after user clarifies intent

    Researchers have introduced CarryOnBench, a benchmark that evaluates how well large language models recover helpfulness in multi-turn conversations after a user clarifies their intent. The benchmark simula…