PulseAugur

LLM-generated stories show low diversity due to preference data

A new research paper identifies a striking lack of diversity in stories generated by large language models. Sampling 20,000 stories from four models with five prompts, the authors found that a set of just 11 words, including names like Elias and settings like lighthouses, appears in 88.3% of the generated stories. These words are not common in general literature but are prevalent in preference datasets likely used for model alignment, which suggests that those datasets and the alignment techniques built on them may be disproportionately shaping model output and producing repetitive narratives.

IMPACT Highlights how preference data and alignment techniques can lead to repetitive outputs in LLM-generated content, potentially impacting creative applications.

RANK_REASON The cluster contains an academic paper detailing research findings on LLM behavior.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.AI TIER_1 English (EN) · Sil Hamilton, David Mimno

    Elias in the Lighthouse, Again? Diagnosing Low Diversity in LLM Stories

    arXiv:2605.26492v1 Announce Type: cross Abstract: LLM-generated stories are a popular use case, but they show very low variability. We sample 20,000 total stories from four current models using five prompts. We find that 11 words occur in 88.3% of generated stories, with little d…
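
The headline statistic can be read as a coverage measurement: the share of stories that contain at least one word from a small marker set. The sketch below shows one way such a figure could be computed. It is not the authors' code; the function name `coverage`, the tokenization rule, and the toy stories are all assumptions made for illustration, and only two of the 11 words are named in the summary above.

```python
import re


def coverage(stories, marker_words):
    """Return the fraction of stories containing at least one marker word.

    Matching is case-insensitive and on whole tokens, so "lighthouses"
    would NOT match the marker "lighthouse" (a simplification).
    """
    markers = {w.lower() for w in marker_words}
    if not stories:
        return 0.0
    hits = 0
    for story in stories:
        # Crude tokenizer: runs of letters and apostrophes.
        tokens = set(re.findall(r"[a-z']+", story.lower()))
        if tokens & markers:
            hits += 1
    return hits / len(stories)


# Toy data invented for this example; not drawn from the paper.
stories = [
    "Elias climbed the lighthouse stairs at dusk.",
    "The keeper of the lighthouse watched the storm.",
    "A robot learned to paint in a quiet city.",
    "Elias found a map under the floorboards.",
]
markers = ["elias", "lighthouse"]  # hypothetical subset of the 11 words

print(coverage(stories, markers))  # 0.75 (3 of 4 stories match)
```

Running the same function over a reference corpus of general literature would give the baseline that the comparison in the summary relies on.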