PulseAugur / Brief

last 24h
[1/1] 222 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. Persona Attack: Incremental Memory Injection Jailbreak Attack against Large Language Models

    Recent research covers both new attacks on large language models (LLMs) and new defenses against them. One attack, "Persona Attack," exploits conversational memory to bypass safety protocols, achieving a 95% success rate in some configurations. On the defensive side, a framework called THRD uses a training-free method to detect and mitigate multi-turn jailbreak attacks by analyzing temporal risk accumulation, reducing attack success rates to as low as 0.2% with minimal impact on model utility. A separate study benchmarks LLMs on cryptanalysis, revealing their potential and limitations in security contexts and raising concerns about their susceptibility to certain attacks.

    IMPACT New research highlights evolving LLM vulnerabilities and novel defense mechanisms, both of which matter for maintaining AI safety and security.