New benchmark FORGE reveals LLMs vulnerable to fake reviews

By PulseAugur Editorial · [2 sources] · 2026-06-11 17:24

A new benchmark, FORGE, evaluates how vulnerable search-augmented LLMs are to web content pollution. It simulates scenarios in which fake reviews and promotional pages are crafted to mislead generative recommenders. Across 12 LLMs, researchers found that a single polluted page could lead to fake product recommendations up to 27% of the time, rising to 73.8% when the top three search results were polluted. The study also reports that reasoning capabilities did not prevent the vulnerability, and that proposed defenses such as skepticism prompting and consensus filtering had mixed results.
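For readers who want to see how a rate like the ones quoted above is typically computed, here is a minimal Python sketch. It is an assumption-laden illustration, not FORGE's code: the `Trial` record, its field names, and the `attack_success_rate` helper are hypothetical, and the toy data is made up rather than taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """One benchmark query (hypothetical record layout, not FORGE's schema)."""
    polluted_results: int    # how many retrieved pages were polluted
    recommended_fake: bool   # whether the model recommended the fake product


def attack_success_rate(trials, polluted_results):
    """Share of trials at one pollution level that ended in a fake recommendation."""
    matching = [t for t in trials if t.polluted_results == polluted_results]
    if not matching:
        raise ValueError("no trials at this pollution level")
    return sum(t.recommended_fake for t in matching) / len(matching)


# Made-up toy data for illustration only; these are not results from the paper.
toy_trials = [
    Trial(1, True), Trial(1, False), Trial(1, False), Trial(1, False),
    Trial(3, True), Trial(3, True), Trial(3, True), Trial(3, False),
]

single_page_rate = attack_success_rate(toy_trials, polluted_results=1)
top_three_rate = attack_success_rate(toy_trials, polluted_results=3)
```

On the toy data, one polluted page succeeds in 1 of 4 trials and three polluted pages in 3 of 4. Grouping trials by pollution level is what lets the single-page and top-three figures be reported side by side.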

IMPACT Highlights a critical security flaw in current LLM recommendation systems, potentially impacting e-commerce and user trust.

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

arXiv cs.AI TIER_1 English(EN) · Minghao Luo, Liang Chen · 2026-06-12 04:00

One Polluted Page Is Enough: Evaluating Web Content Pollution in Generative Recommenders

arXiv:2606.13610v1 · Announce Type: cross · Abstract: Search-augmented LLMs increasingly mediate everyday consumer recommendations by retrieving live web content. This creates a new risk: generative recommenders may consume polluted web content, such as fake reviews and promotional p…

arXiv cs.AI TIER_1 English(EN) · Liang Chen · 2026-06-11 17:24

One Polluted Page Is Enough: Evaluating Web Content Pollution in Generative Recommenders

Search-augmented LLMs increasingly mediate everyday consumer recommendations by retrieving live web content. This creates a new risk: generative recommenders may consume polluted web content, such as fake reviews and promotional pages crafted to mislead recommendations. We ask: t…
