PulseAugur

New framework StereoTales finds harmful stereotypes in 23 LLMs

Researchers have developed StereoTales, a multilingual framework and dataset for identifying and evaluating social biases in large language models. The framework analyzes over 650,000 generated stories across 10 languages from 23 LLMs, uncovering more than 1,500 harmful stereotypes. All evaluated models exhibit significant harmful stereotypes in open-ended generation, and these biases shift with the prompt language, reflecting culturally specific issues. Notably, human and LLM judgments of the stereotypes' harmfulness show substantial agreement.
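The reported human–LLM agreement on harmfulness can be quantified with a standard inter-rater statistic such as Cohen's kappa. The sketch below is hypothetical (not the authors' code): it assumes binary harmfulness labels per stereotype candidate and compares a human annotator against an LLM judge.

```python
# Hypothetical sketch, not the StereoTales implementation: measuring
# agreement between human and LLM harmfulness labels with Cohen's kappa.
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labeling the same items."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items both raters label identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: expected overlap given each rater's label marginals.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Toy labels (1 = harmful, 0 = benign) for six stereotype candidates.
human = [1, 0, 1, 1, 0, 1]
llm   = [1, 0, 1, 0, 0, 1]
print(round(cohens_kappa(human, llm), 3))  # → 0.667
```

A kappa well above zero indicates agreement beyond chance; kappa of 1 is perfect agreement. The actual paper's annotation scheme and agreement metric may differ.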

Summary written by gemini-2.5-flash-lite from 1 source. How we write summaries →

IMPACT Identifies widespread, culturally adaptive harmful stereotypes in LLMs, highlighting a critical area for model safety and alignment research.

RANK_REASON The cluster describes a new academic paper detailing a novel dataset and evaluation pipeline for studying bias in LLMs.

Read on arXiv cs.CL →

COVERAGE [1]

  1. arXiv cs.CL TIER_1 · Matteo Dora

    StereoTales: A Multilingual Framework for Open-Ended Stereotype Discovery in LLMs

    Multilingual studies of social bias in open-ended LLM generation remain limited: most existing benchmarks are English-centric, template-based, or restricted to recognizing pre-specified stereotypes. We introduce StereoTales, a multilingual dataset and evaluation pipeline for syst…