DIA-HARM: Dialectal Disparities in Harmful Content Detection Across 50 English Dialects

New benchmark shows AI detectors underperform on dialects other than Standard American English

By the PulseAugur editorial team · [1 source] · 2026-06-30 04:00

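A new benchmark, DIA-HARM, evaluates harmful content detection models across 50 English dialects. The researchers found that these models, which are trained predominantly on Standard American English, are markedly brittle when they encounter dialectal variants, and their performance degrades as a result. Fine-tuned Transformer models generally outperform zero-shot large language models, and multilingual models generalize across dialects better than monolingual ones.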

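Impact: AI content moderation systems may systematically disadvantage speakers of dialects other than Standard American English, so broader dialectal training data is needed.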

Ranking rationale: This cluster describes a new academic paper that introduces a benchmark for evaluating AI model performance on dialectal variants.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.CL · Jason Lucas, Matt Murtagh, Ali Al-Lawati, Uchendu Uchendu, Adaku Uchendu, Dongwon Lee · 2026-06-30 04:00

DIA-HARM: Dialectal Disparities in Harmful Content Detection Across 50 English Dialects

arXiv:2604.05318v2 · Abstract: Harmful content detectors, particularly disinformation classifiers, are predominantly developed and evaluated on Standard American English (SAE), leaving their robustness to dialectal variation unexplored. We present DIA-HARM, t…
