PulseAugur / Brief

Brief

last 24h
[1/1] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. MultiTurnPSB: Evaluating Multi-Turn Jailbreak Attacks and Classifier-Based Defenses for Medical AI Safety

    Researchers have developed MultiTurnPSB, a new benchmark for evaluating the safety of medical AI chatbots over multiple conversational turns. Standard single-turn evaluations miss how sharply unsafe responses increase as a conversation progresses: one model's unsafe-response rate rose from 35% to nearly 80% by the fourth turn. The study also found that Claude Sonnet 4.5 differed notably from GPT-4.1-mini in refusal behavior, suggesting that safety training might generalize to an attacker role.

    IMPACT: Highlights critical safety gaps in conversational AI, particularly for sensitive applications like healthcare, necessitating more robust multi-turn evaluation methods.