PulseAugur

LLMs analyze language ideologies in Luxembourgish news comments

Researchers have developed a method that uses sparse crosscoders to track how linguistic features emerge and consolidate within large language models during pretraining. The technique includes a novel metric, Relative Indirect Effects (RelIE), which helps identify when specific capabilities become causally important for task performance. The approach is architecture-agnostic and scalable, offering a more interpretable way to analyze representation learning in LLMs. A separate study explores the use of LLMs to detect language ideologies in news comments written in Luxembourgish, a small language with limited representation in training data. It investigates whether machine translation into high-resource languages improves LLM performance on this task, and suggests that LLMs can be practical tools for identifying ideological content despite current optimization limitations.

Impact: Provides new methods for understanding LLM internal representations and explores the utility of LLMs for sociolinguistic analysis.

Ranking rationale: This cluster contains two academic papers published on arXiv: one detailing a new method for analyzing LLM pretraining, and another exploring LLM applications in sociolinguistics.

Read on arXiv cs.AI →

AI-generated summary · Google Gemini · from 4 sources.


Sources [4]

  1. arXiv cs.AI TIER_1 English(EN) · Deniz Bayazit, Aaron Mueller, Antoine Bosselut ·

    Crosscoding Through Time: Tracking Emergence & Consolidation Of Linguistic Representations Throughout LLM Pretraining

    arXiv:2509.05291v2 Announce Type: replace-cross Abstract: Large language models (LLMs) learn non-trivial abstractions during pretraining, such as detecting irregular plural noun subjects. However, because traditional evaluation methods (e.g., benchmarking) fail to reveal how mode…

  2. arXiv cs.CL TIER_1 English(EN) · Emilia Milano, Alistair Plum, Yves Scherrer, Christoph Purschke ·

    Language Ideologies in a Multilingual Society: An LLM-based Analysis of Luxembourgish News Comments

    arXiv:2604.27661v1 Announce Type: new Abstract: Detecting language ideologies is a valuable yet complex task for understanding how identities are constructed through discourse. In Luxembourg's multicultural and multilingual society, language ideologies reflect more than simple pr…

  3. arXiv cs.CL TIER_1 English(EN) · Christoph Purschke ·

    Language Ideologies in a Multilingual Society: An LLM-based Analysis of Luxembourgish News Comments

    Detecting language ideologies is a valuable yet complex task for understanding how identities are constructed through discourse. In Luxembourg's multicultural and multilingual society, language ideologies reflect more than simple preferences: they carry deep cultural and social m…

  4. Hugging Face Daily Papers TIER_1 English(EN) ·

    Language Ideologies in a Multilingual Society: An LLM-based Analysis of Luxembourgish News Comments

    Detecting language ideologies is a valuable yet complex task for understanding how identities are constructed through discourse. In Luxembourg's multicultural and multilingual society, language ideologies reflect more than simple preferences: they carry deep cultural and social m…