deep research agents

PulseAugur coverage of deep research agents: every cluster mentioning them across labs, papers, and developer communities, ranked by signal.

Total: 5 over 90d
Releases: 0 over 90d
Papers: 5 over 90d
Sentiment (30d): 3 days with sentiment data

RECENT · PAGE 1/1 · 5 TOTAL

TOOL · CL_135377 · Jul 10 · 04:00

New DR-Arena framework automates LLM agent evaluation

Researchers have developed DR-Arena, an automated evaluation framework designed to assess the capabilities of deep research agents, which are advanced large language models capable of autonomous investigation. Unlike st…

TOOL · CL_110956 · Jun 25 · 19:19

New research shows 13 words can poison LLMs via user content

A new research paper details a method for poisoning Large Language Models (LLMs) by subtly altering user-generated content. The study suggests that as few as 13 words can be sufficient to compromise the model's integrit…

TOOL · CL_92135 · Jun 15 · 14:45

Research Paper Reveals User-Generated Content Can Poison Deep-Research AI Agents

A new research paper details a vulnerability in deep-research agents, which can be compromised through user-generated content. The study, available on arXiv, explores how malicious input can poison these AI systems. Thi…

TOOL · CL_74401 · Jun 6 · 04:00

Research paper warns of 'Search-Time Contamination' inflating AI agent benchmarks

A new research paper identifies a problem called Search-Time Contamination (STC) in deep research agents that use web search for evaluation. This contamination occurs when agents retrieve benchmark metadata, question co…

TOOL · CL_40852 · May 18 · 23:55

New benchmark reveals LLM judges unreliable for research agents

Researchers have developed a new benchmark called REFLECT to evaluate the reliability of Large Language Models (LLMs) when used as judges for deep research agents. These agents automate complex information-seeking tasks…
