Researchers have developed Cognitive Digital Shadows (CDS), a synthetic corpus of 190,000 records for studying how Large Language Models (LLMs) debate societal issues. The corpus was generated by 19 different LLMs, each prompted to adopt either a specific human persona or an AI-assistant role. CDS collects LLM responses on controversial topics such as healthcare, disinformation, and gender gaps. Each persona-conditioned record encodes 17 sociodemographic and psychological attributes, linking the prompt to the resulting language, stances, and reasoning.
AI IMPACT Provides a novel dataset for auditing LLM bias and social sensitivity in discourse.
RANK_REASON Academic paper release on arXiv detailing a new synthetic corpus for LLM research.
- Ali Aghazadeh Ardebili
- arXiv
- Cognitive Digital Shadows
- Computation and Language
- Computer Science
- Large Language Models
- LLMs
AI-generated summary · Google Gemini · from 3 sources.