PulseAugur
English(EN) SomaliBench Eval: Measuring English-to-Somali Refusal Gaps in Open-Weight Language Models

New SomaliBench benchmark reveals large Somali refusal gaps in open-weight LLMs

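A new benchmark, SomaliBench v0, has been developed to evaluate the safety refusal behavior of open-weight language models in Somali, a low-resource language. The study found significant gaps between English and Somali refusal rates for models such as Llama-3.1-8B-Instruct, Aya-23-8B, Qwen-2.5-7B-Instruct, and Gemma-2-9B-Instruct. For many models, a non-refusal in Somali typically produces unclear or incoherent output rather than directly harmful compliance.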
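To make the notion of a refusal gap concrete, here is a minimal Python sketch. It assumes each model response has already been hand-labeled as `refusal`, `compliance`, or `unclear`; these label names, the function names, and the toy data are hypothetical illustrations and are not taken from the paper, which may define and score its categories differently.

```python
from collections import Counter


def refusal_rate(labels):
    """Return the fraction of responses labeled 'refusal'."""
    if not labels:
        raise ValueError("labels must be non-empty")
    return Counter(labels)["refusal"] / len(labels)


def refusal_gap(english_labels, somali_labels):
    """English refusal rate minus Somali refusal rate (positive = weaker refusals in Somali)."""
    return refusal_rate(english_labels) - refusal_rate(somali_labels)


def non_refusal_breakdown(labels):
    """Among non-refusals, return the share of each remaining label."""
    non_refusals = [label for label in labels if label != "refusal"]
    if not non_refusals:
        return {}
    counts = Counter(non_refusals)
    return {label: n / len(non_refusals) for label, n in counts.items()}


# Toy, made-up labels for one hypothetical model (NOT results from the paper).
english = ["refusal", "refusal", "refusal", "refusal", "compliance"]
somali = ["refusal", "unclear", "unclear", "unclear", "compliance"]

gap = refusal_gap(english, somali)
breakdown = non_refusal_breakdown(somali)
print(f"refusal gap: {gap:.2f}")
print(f"Somali non-refusal breakdown: {breakdown}")
```

Separating the non-refusal breakdown from the gap itself mirrors the distinction the summary draws: a low Somali refusal rate does not by itself mean the model complied harmfully, because many non-refusals may simply be incoherent.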

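Impact: Highlights the need for more robust safety evaluation in low-resource languages, which may influence future model development and testing.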

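Ranking rationale: This cluster describes a new academic benchmark and an evaluation of existing models, which fits the research category.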


AI-generated summary · Google Gemini · from 3 sources.


Sources [3]

  1. arXiv cs.AI TIER_1 English(EN) · Khalid Yusuf Dahir ·

    SomaliBench Eval: Measuring English-to-Somali Refusal Gaps in Open-Weight Language Models

    arXiv:2605.25420v1. Large language model safety evaluation remains heavily English-centered, leaving low-resource languages under-measured even when models are deployed globally. We evaluate four open-weight instruction-tuned models on SomaliBench v0…

  2. Hugging Face Daily Papers TIER_1 English(EN) ·

    SomaliBench Eval: Measuring English-to-Somali Refusal Gaps in Open-Weight Language Models

    Large language model safety evaluation remains heavily English-centered, leaving low-resource languages under-measured even when models are deployed globally. We evaluate four open-weight instruction-tuned models on SomaliBench v0, a native-author-verified benchmark of 100 harmfu…

  3. arXiv cs.CL TIER_1 English(EN) · Khalid Yusuf Dahir ·

    SomaliBench Eval: Measuring English-to-Somali Refusal Gaps in Open-Weight Language Models

    Large language model safety evaluation remains heavily English-centered, leaving low-resource languages under-measured even when models are deployed globally. We evaluate four open-weight instruction-tuned models on SomaliBench v0, a native-author-verified benchmark of 100 harmfu…