Entity
JailbreakBench
PulseAugur coverage of JailbreakBench — every cluster mentioning JailbreakBench across labs, papers, and developer communities, ranked by signal.
Total · 30 days
2
2 in the last 90 days
Releases · 30 days
0
0 in the last 90 days
Papers · 30 days
2
2 in the last 90 days
Tier distribution · 90 days
Sentiment · 30 days
2 days with sentiment data
Recent · Page 1 of 1 · 2 items
- Reflect-Guard enhances LLM safety with logical self-reflection
Researchers have developed Reflect-Guard, a new method to improve the safety of large language models against adversarial prompts. This technique uses chain-of-thought self-reflection, fine-tuning models like Llama-Guar…
- New LASH framework boosts LLM jailbreaking by combining attack methods
Researchers have developed LASH, a novel framework designed to enhance the jailbreaking of large language models. LASH adaptively combines outputs from multiple existing attack methods, treating them as seed prompts. Th…