ENTITY
Jailbreak Attacks
PulseAugur coverage of Jailbreak Attacks — every cluster mentioning Jailbreak Attacks across labs, papers, and developer communities, ranked by signal.
Total · 30d: 2 (2 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 2 (2 over 90d)
RECENT · PAGE 1/1 · 2 TOTAL
- New safeguard uses draft models to detect LLM jailbreaks
Researchers have developed a new safeguard to improve the safety of large language models (LLMs) against jailbreak attacks. This system leverages the transferability of attacks from larger models to smaller "draft" mode…
- New research tackles LLM jailbreaks with dynamic evaluation and robust defense strategies
Multiple research papers explore advanced techniques for enhancing the safety and robustness of large language models (LLMs) against jailbreak attacks. These studies introduce novel frameworks and methods for evaluating…