RogueMerge: Robust and Unified Attacks against LLM Model Merging

RogueMerge framework targets vulnerabilities in LLM model merging

By the PulseAugur editorial team · [2 sources] · 2026-06-02 08:54

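Researchers have developed RogueMerge, a new framework designed to exploit vulnerabilities in large language model (LLM) merging. The method addresses three challenges: autoregressive decoding, unknown merging configurations, and the need to generalize across different attack prompts. RogueMerge consistently outperforms existing attacks, remains stable across different merging settings, and resists standard defenses.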

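Impact: This research highlights significant security risks in LLM model merging that could affect the safe deployment of composite AI systems.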

Ranking rationale: This cluster contains a research paper detailing a new attack framework against LLM model merging.

AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

arXiv cs.LG TIER_1 English(EN) · Jinghuai Zhang, Yetian He, Kunlin Cai, Han Zhao, Fnu Suya, Yuan Tian · 2026-06-03 04:00

RogueMerge: Robust and Unified Attacks against LLM Model Merging

arXiv:2606.03344v1 Announce Type: cross Abstract: Model merging composes specialized capabilities into a single LLM by aggregating task vectors sourced from unverified public platforms, exposing a critical supply-chain attack surface: Because any malicious behavior can be encoded…

Hugging Face Daily Papers TIER_1 English(EN) · 2026-06-02 08:54

RogueMerge: Robust and Unified Attacks against LLM Model Merging

Model merging composes specialized capabilities into a single LLM by aggregating task vectors sourced from unverified public platforms, exposing a critical supply-chain attack surface: Because any malicious behavior can be encoded into a task vector, and merging grants third-part…
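
The abstract describes merging as aggregating task vectors into a single model. As a rough illustration of why an untrusted task vector matters, here is a minimal sketch of task-arithmetic-style merging, one common merging scheme and not necessarily the configuration studied in the paper. The names `task_vector`, `merge`, `scale`, and the toy weights are hypothetical and chosen only for this example.

```python
from typing import Dict, List

# Toy stand-in for a model checkpoint: parameter name -> flat list of weights.
# (Assumption for illustration; real checkpoints hold tensors.)
Weights = Dict[str, List[float]]


def task_vector(finetuned: Weights, base: Weights) -> Weights:
    """Task vector = fine-tuned weights minus base weights, per parameter."""
    return {
        name: [f - b for f, b in zip(finetuned[name], base[name])]
        for name in base
    }


def merge(base: Weights, task_vectors: List[Weights], scale: float = 1.0) -> Weights:
    """Add the scaled sum of all task vectors back onto the base weights."""
    merged: Weights = {}
    for name, base_vals in base.items():
        total = [0.0] * len(base_vals)
        for tv in task_vectors:
            total = [t + d for t, d in zip(total, tv[name])]
        merged[name] = [b + scale * t for b, t in zip(base_vals, total)]
    return merged


base = {"w": [1.0, 2.0]}
math_model = {"w": [1.5, 2.0]}   # hypothetical fine-tune for one task
code_model = {"w": [1.0, 3.0]}   # hypothetical fine-tune for another task

tvs = [task_vector(math_model, base), task_vector(code_model, base)]
merged = merge(base, tvs, scale=0.5)
print(merged)  # {'w': [1.25, 2.5]}
```

Every contributed task vector shifts the merged weights directly, so whoever supplies one has some influence over the final model. That is the supply-chain exposure the abstract points to.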