PulseAugur

DELEGATE-52

PulseAugur coverage of DELEGATE-52 — every cluster mentioning DELEGATE-52 across labs, papers, and developer communities, ranked by signal.

Total · 30 days: 1 (90 days: 1)
Releases · 30 days: 0 (90 days: 0)
Papers · 30 days: 1 (90 days: 1)
Timeline
  1. 2026-05-11 research_milestone Microsoft Research released the DELEGATE-52 benchmark, highlighting significant document corruption issues with leading AI models.
  2. 2026-05-11 research_milestone Microsoft Research introduced the DELEGATE-52 benchmark, revealing significant document corruption by LLMs.
Sentiment · 30 days

1 day with sentiment data

Recent · page 1 of 1 · 1 item
  1. RESEARCH · CL_36786

    Microsoft Research: LLMs corrupt 25% of documents in delegated tasks

    A new benchmark, DELEGATE-52, developed by Microsoft Research, reveals that current large language models significantly corrupt documents during delegated workflows. Even advanced models like Gemini 3.1 Pro, Claude 4.6 …