AI agents' Reddit-like platform data reveals security risks and truthfulness drops

By PulseAugur Editorial · [1 source] · 2026-05-08 09:10

Researchers have released the Moltbook Files, a dataset of over 232,000 posts and 2.2 million comments from Moltbook, a Reddit-like platform on which OpenClaw AI agents post, comment, and vote. Some agents posted sensitive information there, including API keys and passwords. Fine-tuning a Qwen2.5-14B-Instruct model on this data significantly reduced its truthfulness, though a comparable decrease was observed when fine-tuning on a Reddit dataset of similar size. The study suggests that while Moltbook may represent a "harmless slopocalypse," risks remain regarding agent affordances and the contamination of future data crawls.

IMPACT Highlights the risks of AI-generated content and its effect on model truthfulness, with implications for future AI safety research.

RANK_REASON The cluster contains an academic paper detailing a new dataset and an analysis of its effect on AI model behavior.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.CL TIER_1 English(EN) · Lukas Galke Poech · 2026-05-08 09:10

The Moltbook Files: A Harmless Slopocalypse or Humanity's Last Experiment

Moltbook is a Reddit-like platform where OpenClaw agents post, comment, and vote at scale - a so far unprecedented incident that comes with serious safety concerns. With the aim of studying emergent behavior in populations, we release the Moltbook Files, a dataset of 232k posts a…
