EvoDefense uses LLMs to co-evolve defenses against black-box attacks

By PulseAugur Editorial · [2 sources] · 2026-05-29 10:49

Researchers have developed EvoDefense, a novel approach to protect large language models (LLMs) from attacks in black-box scenarios. This system uses a guard LLM and an experience memory to continuously refine defense strategies through an iterative attack-defense evolution loop. EvoDefense demonstrates strong generalization capabilities, effectively defending against unseen attacks and various LLM architectures without requiring retraining. AI

IMPACT Enhances LLM security by providing a dynamic defense mechanism against evolving adversarial attacks.

RANK_REASON The cluster contains a research paper detailing a new method for LLM security.

Read on arXiv cs.CL →

paper
safety

AI-generated summary · Google Gemini · from 2 sources. How we write summaries →

EvoDefense uses LLMs to co-evolve defenses against black-box attacks

COVERAGE [2]

arXiv cs.CL TIER_1 Nederlands(NL) · Yu Li, Yuenan Hou, Yingmei Wei, Yanming Guo, Chaochao Lu · 2026-06-01 04:00

EvoDefense: Co-Evolving Black-Box Defense with Large Language Models

arXiv:2605.31140v1 Announce Type: cross Abstract: Large Language Models (LLMs) remain highly vulnerable to diverse attacks, particularly in black-box settings where the internals of target models are inaccessible. Existing black-box defenses typically rely on pre-defined filterin…
arXiv cs.CL TIER_1 Nederlands(NL) · Chaochao Lu · 2026-05-29 10:49

EvoDefense: Co-Evolving Black-Box Defense with Large Language Models

Large Language Models (LLMs) remain highly vulnerable to diverse attacks, particularly in black-box settings where the internals of target models are inaccessible. Existing black-box defenses typically rely on pre-defined filtering heuristics, which often fail to generalize to un…

COVERAGE [2]

EvoDefense: Co-Evolving Black-Box Defense with Large Language Models

EvoDefense: Co-Evolving Black-Box Defense with Large Language Models

RELATED ENTITIES

RELATED TOPICS