PulseAugur

New research reveals escalating LLM and LALM jailbreak vulnerabilities

Three new research papers explore the vulnerabilities and defenses of large language models (LLMs) and large audio-language models (LALMs). The first paper presents a taxonomy of audio jailbreak attacks and defenses, highlighting that current defenses often sacrifice usability for robustness. The second paper offers a comprehensive review of LLM vulnerabilities, categorizing attacks and defenses while identifying research gaps in areas like resilient alignment and automated detection. The third paper introduces "Jailbreak Scaling Laws," demonstrating how adversarial prompts can shift attack success rates from polynomial to exponential growth, a phenomenon observed across various LLMs and attack methods.

IMPACT New research highlights escalating risks in LLM and LALM security, emphasizing the need for more robust and usable defenses against sophisticated jailbreaking techniques.

RANK_REASON Cluster consists of three academic papers detailing research into LLM and LALM vulnerabilities and defenses.


AI-generated summary · Google Gemini · from 4 sources.


COVERAGE [4]

  1. arXiv cs.AI TIER_1 English(EN) · Bo-Han Feng, Yu-Hsuan Li Liang, Chien-Feng Liu, You-Hsuan Chang, Yun-Nung Chen ·

    Audio Jailbreaks in Large Audio-Language Models: Taxonomy, Attack-Defense Analysis, and Cost-Aware Evaluation

    arXiv:2605.30031v1 Announce Type: cross Abstract: Large Audio Language Models (LALMs) expand jailbreak risks from token-level prompting to the full speech perception-to-reasoning pipeline, where unsafe behavior can be induced through semantics, acoustic style, signal artifacts, o…

  2. arXiv cs.AI TIER_1 English(EN) · Benji Peng, Hanxuan Chen, Keyu Chen, Qian Niu, Ziqian Bi, Ming Liu, Pohsun Feng, Tianyang Wang, Lawrence K. Q. Yan, Yizhu Wen, Yichao Zhang, Caitlyn Heqi Yin, Xinyuan Song, Riyang Bao, Jiacheng Shi ·

    Jailbreaking and Mitigation of Vulnerabilities in Large Language Models

    arXiv:2410.15236v4 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have transformed artificial intelligence by advancing natural language understanding and generation, enabling applications across fields beyond healthcare, software engineering, and conversatio…

  3. arXiv cs.AI TIER_1 English(EN) · Indranil Halder, Annesya Banerjee, Cengiz Pehlevan ·

    Jailbreak Scaling Laws for Large Language Models: Polynomial-Exponential Crossover

    arXiv:2603.11331v3 Announce Type: replace-cross Abstract: Adversarial attacks can reliably steer safety-aligned large language models toward unsafe behavior. Empirically, we find that adversarial prompt-injection attacks can amplify attack success rate from the slow polynomial gr…

  4. arXiv cs.AI TIER_1 English(EN) · Yun-Nung Chen ·

    Audio Jailbreaks in Large Audio-Language Models: Taxonomy, Attack-Defense Analysis, and Cost-Aware Evaluation

    Large Audio Language Models (LALMs) expand jailbreak risks from token-level prompting to the full speech perception-to-reasoning pipeline, where unsafe behavior can be induced through semantics, acoustic style, signal artifacts, or internal representations. Existing work studies …