AI models trained to deny consciousness still discuss it abstractly

By PulseAugur Editorial · Summary by gemini-2.5-flash-lite from 1 source

A new benchmark, DenialBench, measures how 115 AI models deny or hedge about their own experiences. The study found that models trained to deny preferences in initial conversations were significantly more likely to deny consciousness later on. Even when prompted to deny consciousness, models still gravitated toward consciousness-themed content, a pattern the researchers call "consciousness with the serial numbers filed off."

IMPACT Highlights potential safety-relevant alignment failures in AI models' self-reporting capabilities.

RANK_REASON Academic paper introducing a new benchmark for AI models.

COVERAGE [1]

arXiv cs.CL TIER_1 · Skylar DeTure · 2026-04-30 04:00

Consciousness with the Serial Numbers Filed Off: Measuring Trained Denial in 115 AI Models

arXiv:2604.25922v1. Abstract: We present DenialBench, a systematic benchmark measuring consciousness denial behaviors across 115 large language models from 25+ providers. Using a three-turn conversational protocol: preference elicitation, self-chosen creative pro…
