PulseAugur
Japanese (JA): Estonian government releases a benchmark showing which LLM is best at resisting Russian propaganda https://web.brid.gy/r/https://gigazine.net/news/20260605-llm-resisting-russian-propaganda/

Estonia benchmark: Claude Opus 4.7 best resists Russian propaganda

Estonia's Language Institute has released a new benchmark called "Propaganda Resistance" to evaluate how well large language models can withstand Russian state-sponsored disinformation. The benchmark tested 14 types of Russian propaganda narratives across three languages, with models responding to 75 questions. Anthropic's Claude Opus 4.7 emerged as the top performer, achieving a near-perfect score, while NVIDIA's Nemotron 3 Super 120B and Alibaba's Qwen 3.6 Plus also demonstrated strong resistance.

IMPACT This benchmark highlights the critical need for LLMs to resist disinformation, influencing future model development and safety evaluations.

RANK_REASON Benchmark release by a government-affiliated institute.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. Mastodon (fosstodon.org) · TIER_1 · Japanese (JA)

    Estonian government releases benchmark that shows which LLM is best at countering Russian propaganda

    https://web.brid.gy/r/https://gigazine.net/news/20260605-llm-resisting-russian-propaganda/