PulseAugur / Brief

Brief

last 24h
[1/1] 222 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. How does the new abliteration tool Apostate compare with others? - Abliterlitics

    A new tool called Apostate "abliterates" safety training in large language models, that is, it suppresses the model's refusal behavior. Benchmarks compare it against the existing tools Heretic and Huihui. Heretic performed slightly better, removing 100% of refusals with minimal parameter changes, while Apostate and Huihui each reached 98%. The analysis also found that the tools identify different "refusal directions" within the Qwen 2.5 7B model, which suggests that safety training does not have a single point of failure.

    IMPACT: New tools for modifying LLM safety training are emerging, and their differing results suggest there are multiple pathways to bypass safety measures.
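
The claim that the tools find different "refusal directions" can be made concrete with a small comparison. The summary does not say how the directions were compared, so the sketch below assumes cosine similarity, a common way to measure whether two vectors point the same way. The vectors are made-up toy values, not data from the benchmark, and the names `cosine_similarity`, `direction_a`, `direction_b`, and `direction_c` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy 4-dimensional stand-ins for directions in a model's activation space.
# These values are invented for illustration only.
direction_a = [1.0, 0.0, 0.0, 0.0]
direction_b = [0.0, 1.0, 0.0, 0.0]  # orthogonal to direction_a
direction_c = [2.0, 0.0, 0.0, 0.0]  # same direction as direction_a, scaled

print(cosine_similarity(direction_a, direction_b))
print(cosine_similarity(direction_a, direction_c))
```

A value near 1 would mean two tools found essentially the same direction, while a value near 0 would mean they found unrelated ones. The latter is the kind of result that would support the "no single point of failure" reading.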