PulseAugur

TriggerBench

PulseAugur coverage of TriggerBench — every cluster mentioning TriggerBench across labs, papers, and developer communities, ranked by signal.

Total · 30d: 2 (2 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 2 (2 over 90d)
SENTIMENT · 30D

1 day with sentiment data

RECENT · PAGE 1/1 · 2 TOTAL
  1. TOOL · CL_106821

    New benchmark TriggerBench reveals prospective memory challenges for LLMs

    Researchers have introduced TriggerBench, a new benchmark designed to evaluate prospective memory (PM) in large language models (LLMs). Unlike retrospective memory (RM), which relies on explicit queries, PM assesses an …

  2. RESEARCH · CL_103038

    New research explores multilingual LLM scaling, knowledge integration, and specialized evaluation

    Researchers are developing new methods and benchmarks to improve the capabilities and evaluation of large language models (LLMs). Google DeepMind has introduced ATLAS, a framework for optimizing multilingual LLM trainin…