PulseAugur
Vulnerability of Natural Language Classifiers to Evolutionary Generated Adversarial Text

New GAversary tool generates adversarial attacks against NLP models

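Researchers have developed GAversary, a novel hybrid genetic algorithm designed to generate adversarial attacks against natural language processing models. This black-box method needs only the model's logit outputs to guide its search for vulnerabilities. GAversary uses GloVe embeddings to propose semantically similar word substitutions, and it significantly reduces the target model's accuracy on benchmark datasets. In one instance it cut accuracy from 76.8% to 5.8%, outperforming the existing BAE and A2T attacks, although it perturbs more words and takes slightly longer to run.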
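The general idea of a logit-guided, black-box genetic search over word substitutions can be illustrated with a toy sketch. This is not GAversary itself: the summary does not describe the hybrid components, operators, or hyperparameters, so everything below is an assumption made for illustration. In particular, `NEIGHBOURS` is a hand-written table standing in for GloVe nearest neighbours, `toy_logits` is a hypothetical keyword-weight classifier standing in for the target model, and `fitness`, `mutate`, `crossover`, and `attack` are illustrative names.

```python
import random

# Hypothetical stand-in for GloVe nearest neighbours (hand-written, not real embeddings).
NEIGHBOURS = {
    "great": ["fine", "decent", "solid"],
    "love": ["like", "enjoy", "tolerate"],
    "wonderful": ["pleasant", "nice", "okay"],
    "movie": ["film", "picture"],
}

# Hypothetical target model: per-word weights for a toy sentiment classifier.
POSITIVE_WEIGHTS = {
    "great": 2.0, "love": 2.0, "wonderful": 2.0,
    "fine": 0.3, "decent": 0.2, "solid": 0.5,
    "like": 0.6, "enjoy": 1.0, "tolerate": -0.5,
    "pleasant": 0.8, "nice": 0.7, "okay": -0.2,
}


def toy_logits(tokens):
    """Black-box interface: return [negative_logit, positive_logit]."""
    score = sum(POSITIVE_WEIGHTS.get(t, 0.0) for t in tokens)
    return [-score, score]


def fitness(tokens, true_label, logits_fn):
    """Margin of the best wrong class over the true class; > 0 means misclassified."""
    logits = logits_fn(tokens)
    best_other = max(v for i, v in enumerate(logits) if i != true_label)
    return best_other - logits[true_label]


def mutate(tokens, original, rng):
    """Replace one word with a neighbour of the ORIGINAL word at that position."""
    child = list(tokens)
    positions = [i for i, w in enumerate(original) if w in NEIGHBOURS]
    if positions:
        i = rng.choice(positions)
        child[i] = rng.choice(NEIGHBOURS[original[i]])
    return child


def crossover(a, b, rng):
    """Uniform crossover: take each position from one of the two parents."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def attack(original, true_label, logits_fn, pop_size=30, generations=50, seed=0):
    """Return an adversarial token list, or None if the search budget runs out."""
    rng = random.Random(seed)
    population = [mutate(original, original, rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda t: fitness(t, true_label, logits_fn), reverse=True)
        if fitness(population[0], true_label, logits_fn) > 0:
            return population[0]
        parents = population[: pop_size // 2]  # keep the fitter half unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), original, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    best = max(population, key=lambda t: fitness(t, true_label, logits_fn))
    return best if fitness(best, true_label, logits_fn) > 0 else None


sentence = "i love this great wonderful movie".split()
adversarial = attack(sentence, true_label=1, logits_fn=toy_logits)
print(" ".join(adversarial) if adversarial else "no adversarial example found")
```

The search only ever calls `logits_fn`, which mirrors the black-box setting described above: no gradients or model internals are needed. Drawing substitutes from the neighbours of the original word at each position keeps every candidate within one substitution step of the source text per position, a simple way to limit semantic drift in this sketch.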

Impact: This research highlights a new way to test the robustness of NLP models, which could lead to more secure and reliable AI systems.

Ranking rationale: This cluster contains one research paper detailing a new method for generating adversarial attacks against NLP models.


AI-generated summary · Google Gemini · from 2 sources.


Sources [2]

  1. arXiv cs.AI TIER_1 · Manjinder Singh, Alexander E. I. Brownlee, Mohamed Elawady

    Vulnerability of Natural Language Classifiers to Evolutionary Generated Adversarial Text

    arXiv:2606.27215v1. Abstract: Deep learning models have achieved impressive performance across various fields but remain vulnerable to adversarial inputs, particularly in NLP, where such attacks can have significant real-world consequences. Adversarial attacks o…

  2. arXiv cs.AI TIER_1 · Mohamed Elawady

    Vulnerability of Natural Language Classifiers to Evolutionary Generated Adversarial Text

    Deep learning models have achieved impressive performance across various fields but remain vulnerable to adversarial inputs, particularly in NLP, where such attacks can have significant real-world consequences. Adversarial attacks often involve small, semantically similar token r…