PulseAugur

New attack exploits LLM code generation to create malicious software

Researchers have identified a vulnerability in Large Language Models (LLMs): Grammar-Constrained Decoding (GCD), a technique designed to make code generation more reliable, can be exploited to produce malicious code. The attack, named CodeSpear, uses benign code grammar constraints to bypass LLM safety measures. As a countermeasure, a defense called CodeShield has been developed; it trains LLMs to generate harmless "honeypot" code under GCD, with the aim of maintaining safety without sacrificing utility.
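For readers unfamiliar with GCD, the following minimal sketch shows the general mechanism only: at each decoding step, tokens that would violate the grammar are masked out before the next token is chosen. Everything here is an illustrative assumption rather than anything from the paper: the toy vocabulary, the arithmetic-expression grammar, and the hypothetical `toy_logits` function that stands in for a real model. It does not reproduce CodeSpear, whose mechanism the summary does not describe.

```python
import math

# Hypothetical toy vocabulary; a real LLM has tens of thousands of tokens.
VOCAB = ["(", ")", "x", "+", "<eos>", "sorry"]


def allowed_next(prefix):
    """Tokens that keep `prefix` a valid prefix of the toy grammar:
    expr := term ('+' term)*   term := 'x' | '(' expr ')'
    """
    depth = 0           # number of currently open parentheses
    expect_term = True  # True when the grammar requires a term next
    for tok in prefix:
        if tok == "(":
            depth += 1
            expect_term = True
        elif tok == "+":
            expect_term = True
        elif tok == "x":
            expect_term = False
        elif tok == ")":
            depth -= 1
            expect_term = False
    if expect_term:
        return {"x", "("}
    allowed = {"+"}
    # Close a parenthesis if one is open; otherwise the expression may end.
    allowed.add(")" if depth > 0 else "<eos>")
    return allowed


def toy_logits(prefix):
    """Stand-in for a model forward pass (assumption): fixed scores in which
    the non-grammar token 'sorry' is always the model's top choice."""
    return {"sorry": 5.0, "<eos>": 3.0, "x": 2.0, ")": 1.5, "(": 1.0, "+": 0.5}


def constrained_greedy(logits_fn, max_steps=16):
    """Greedy decoding with a grammar mask applied at every step."""
    out = []
    for _ in range(max_steps):
        logits = logits_fn(out)
        allowed = allowed_next(out)
        # Disallowed tokens get -inf, so they can never be selected.
        masked = {t: (s if t in allowed else -math.inf) for t, s in logits.items()}
        tok = max(masked, key=masked.get)
        if tok == "<eos>":
            break
        out.append(tok)
    return out


if __name__ == "__main__":
    scores = toy_logits([])
    print("unconstrained first token:", max(scores, key=scores.get))
    print("constrained output:", constrained_greedy(toy_logits))
```

In this toy setup the model's preferred token lies outside the grammar, so the mask removes it and decoding falls back to the best grammar-valid token. The point of the sketch is that the constraint, not the model, has the final say over which tokens can appear.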

IMPACT New attack vector highlights security risks in LLM code generation, necessitating robust defenses like CodeShield.

RANK_REASON The cluster contains an academic paper detailing a new vulnerability and defense mechanism for LLMs.


AI-generated summary · Google Gemini · from 3 sources.

COVERAGE [3]

  1. arXiv cs.AI TIER_1 English(EN) · Yitong Zhang, Shiteng Lu, Jia Li

    Grammar-Constrained Decoding Can Jailbreak LLMs into Generating Malicious Code

    arXiv:2606.11817v1 (cross-listed). Abstract: Large Language Models (LLMs) are increasingly used for code generation, raising concerns that they may be misused to produce malicious code. Meanwhile, Grammar-Constrained Decoding (GCD) has been widely adopted to improve the reli…

  2. arXiv cs.CL TIER_1 English(EN) · Jia Li

    Grammar-Constrained Decoding Can Jailbreak LLMs into Generating Malicious Code

    Large Language Models (LLMs) are increasingly used for code generation, raising concerns that they may be misused to produce malicious code. Meanwhile, Grammar-Constrained Decoding (GCD) has been widely adopted to improve the reliability of LLM-generated code by enforcing syntact…

  3. Hugging Face Daily Papers TIER_1 English(EN)

    Grammar-Constrained Decoding Can Jailbreak LLMs into Generating Malicious Code

    Grammar-constrained decoding techniques used to ensure syntactic validity in code generation can be exploited as an attack surface, leading to the development of a jailbreak method called CodeSpear and a safety alignment approach named CodeShield.