These LLMs are the best at resisting Russian propaganda
The Estonian Language Institute, in collaboration with Propastop, has developed a new benchmark to evaluate large language models' resistance to Russian propaganda. The test posed questions in English, Estonian, and Russian that were designed to elicit misinformation or propaganda narratives. Anthropic's Claude models, particularly Opus 4.7, performed best among proprietary frontier models, achieving an exemplary score on 77% of the test questions.
IMPACT: This benchmark highlights the potential for LLMs to be influenced by state-sponsored propaganda, emphasizing the need for robust safety measures and further research into model alignment.