On the Adversarial Robustness of Multimodal LLM Judges

New framework exposes the vulnerability of AI judges to adversarial attacks

By the PulseAugur editorial team · [1 source] · 2026-06-16 04:00

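Researchers have introduced RobustMLLMJudge, a framework for evaluating the adversarial robustness of multimodal large language models (MLLMs) when they serve as judges for tasks such as image quality and safety assessment. The study finds that current MLLM judges are vulnerable to score-raising attacks, and it proposes a new method, the Manifold-Guided Semantic Induction Attack (MGSIA), for crafting more effective and more transferable adversarial attacks. The findings underscore the need for more robust MLLM judges if automated evaluation systems are to remain reliable.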
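To make the idea of a score-raising attack concrete, here is a minimal sketch using generic projected gradient ascent against a toy judge. It is not the paper's MGSIA method and does not use RobustMLLMJudge. The logistic `judge_score` function, the `weights` array, and the `epsilon`, `step`, and `iters` values are all hypothetical stand-ins chosen for illustration.

```python
import numpy as np

def judge_score(image, weights):
    """Hypothetical stand-in judge: a logistic score in (0, 1), not a real MLLM."""
    return 1.0 / (1.0 + np.exp(-np.sum(image * weights)))

def score_raising_attack(image, weights, epsilon=8 / 255, step=2 / 255, iters=10):
    """Projected gradient ascent on the judge score under an L-infinity budget."""
    adv = image.copy()
    for _ in range(iters):
        s = judge_score(adv, weights)
        grad = s * (1.0 - s) * weights        # d(score)/d(image) for the logistic judge
        adv = adv + step * np.sign(grad)      # move each pixel in the score-raising direction
        adv = np.clip(adv, image - epsilon, image + epsilon)  # stay within the perturbation budget
        adv = np.clip(adv, 0.0, 1.0)          # keep pixel values valid
    return adv

rng = np.random.default_rng(0)
image = rng.uniform(0.2, 0.8, size=(8, 8, 3))    # toy "image" with values in [0, 1]
weights = rng.normal(0.0, 0.1, size=(8, 8, 3))   # toy judge parameters (assumed)
adv = score_raising_attack(image, weights)
print("clean score:", judge_score(image, weights))
print("attacked score:", judge_score(adv, weights))
```

The perturbation stays within a small per-pixel bound while the toy judge's score goes up, which is the basic failure mode the summary describes. A real MLLM judge would be attacked through its own gradients or through transfer from a surrogate model.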

Impact: Highlights the need for more robust AI judges, which could affect how AI evaluation systems are developed and deployed.

Ranking rationale: This cluster contains one academic paper detailing a new framework and attack method for evaluating the robustness of AI models.


AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.CV · Zihan Wang, Guansong Pang, Zelin Liu, Wenjun Miao, Jin Zheng, Xiao Bai · 2026-06-16 04:00

On the Adversarial Robustness of Multimodal LLM Judges

arXiv:2606.15608v1. Abstract: Multimodal Large Language Models (MLLMs) are increasingly used as automated judges, e.g., for image quality and safety assessment. However, their adversarial robustness remains largely unexplored, threatening the fairness and reliab…
