Obfuscation Rules for Detecting and Detoxifying Korean Toxicity

New Korean dataset tackles obfuscated toxic language in LLMs

By the PulseAugur editorial team · [1 source] · 2026-05-29 04:00

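Researchers have introduced KOTOX, a new dataset intended to improve the detection and detoxification of toxic Korean text, particularly when users disguise it with obfuscation techniques. The dataset categorizes Korean obfuscation patterns and provides transformation rules derived from real-world examples, which makes it possible to build paired neutral, toxic, and obfuscated sentences. Models trained on KOTOX handle obfuscated text better without losing performance on non-obfuscated content, a step toward mitigating disguised toxic expressions in Korean language models.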
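To make the idea of rule-based pairing concrete, here is a minimal Python sketch of how one transformation rule could turn a toxic sentence into an obfuscated counterpart and bundle it with its neutral version. Everything named here is an assumption for illustration: the `Triplet` type, the `make_triplet` helper, and the syllable-to-jamo decomposition rule are hypothetical and are not taken from KOTOX, whose actual rule set is not listed in this summary. The decomposition itself relies only on the standard Unicode layout of precomposed Hangul syllables.

```python
from dataclasses import dataclass
from typing import Callable

# Compatibility jamo in the order used by the Unicode Hangul syllable block.
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"        # 19 initial consonants
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"      # 21 vowels
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28 incl. "none"

SYLLABLE_BASE = 0xAC00   # code point of the first precomposed syllable
SYLLABLE_COUNT = 11172   # 19 * 21 * 28


def decompose_syllables(text: str) -> str:
    """Hypothetical obfuscation rule: split each Hangul syllable into jamo."""
    out = []
    for ch in text:
        offset = ord(ch) - SYLLABLE_BASE
        if 0 <= offset < SYLLABLE_COUNT:
            initial, rest = divmod(offset, 21 * 28)
            medial, final = divmod(rest, 28)
            out.append(INITIALS[initial] + MEDIALS[medial] + FINALS[final])
        else:
            out.append(ch)  # leave non-Hangul characters untouched
    return "".join(out)


@dataclass(frozen=True)
class Triplet:
    """Illustrative record for one paired example (not the KOTOX schema)."""
    neutral: str
    toxic: str
    obfuscated: str


def make_triplet(neutral: str, toxic: str, rule: Callable[[str], str]) -> Triplet:
    # The obfuscated side is derived mechanically from the toxic side.
    return Triplet(neutral=neutral, toxic=toxic, obfuscated=rule(toxic))


if __name__ == "__main__":
    # Placeholder sentences chosen for illustration only.
    example = make_triplet(neutral="친구", toxic="바보", rule=decompose_syllables)
    print(example)
```

Because the obfuscated sentence is generated from the toxic one by a deterministic rule, the three fields stay aligned by construction, which is the property a paired dataset of this kind depends on.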

Impact: Strengthens LLM safety by improving detection of disguised toxic language in Korean.

Ranking rationale: This cluster describes a new academic paper and dataset published on arXiv.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.AI TIER_1 English(EN) · Yejin Lee, Su-Hyeon Kim, Hyundong Jin, Dayoung Kim, Yeonsoo Kim, Yo-Sub Han · 2026-05-29 04:00

Obfuscation Rules for Detecting and Detoxifying Korean Toxicity

arXiv:2510.10961v3 Announce Type: replace-cross Abstract: As language models become increasingly deployed in online environments, toxicity detection and detoxification have received growing attention. Existing studies primarily focus on non-obfuscated text, which limits robustnes…
