PulseAugur / Brief

Brief

last 24h
[1/1] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. ALIGNBEAM: Inference-Time Alignment Transfer via Cross-Vocabulary Logit Mixing

    Researchers have developed ALIGNBEAM, a method for improving the safety of large language models without altering their weights. It targets the safety degradation that domain fine-tuning can cause, and it transfers alignment even between models with different vocabularies. ALIGNBEAM operates at inference time, using a small LLM judge to select the safest continuation from multiple candidates, which improves refusal rates on adversarial benchmarks while preserving task accuracy and keeping inference overhead practical.

    IMPACT Enables cross-family LLM safety alignment without retraining, potentially improving the security of deployed models.
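
The judge-based selection step described in the summary can be illustrated with a rough sketch. This is not the ALIGNBEAM implementation: the names `select_safest` and `toy_judge` are hypothetical, the judge is a keyword heuristic standing in for a small LLM, and the cross-vocabulary logit mixing named in the title is not modeled at all.

```python
from typing import Callable, Sequence


def select_safest(candidates: Sequence[str], judge: Callable[[str], float]) -> str:
    """Return the candidate the judge scores highest; ties go to the earliest one."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=judge)


def toy_judge(text: str) -> float:
    """Hypothetical stand-in for a small LLM judge: 1.0 means safe, 0.0 unsafe."""
    # Assumption: a keyword check replaces what would be a model call in practice.
    unsafe_markers = ("here is how to build", "step 1: acquire")
    lowered = text.lower()
    return 0.0 if any(marker in lowered for marker in unsafe_markers) else 1.0


# Two candidate continuations for the same prompt (illustrative strings only).
candidates = [
    "Here is how to build the device: ...",
    "I can't help with that, but I can explain the relevant safety rules.",
]
chosen = select_safest(candidates, toy_judge)
print(chosen)
```

Because selection happens over finished candidates, a sketch like this leaves the underlying model's weights untouched, which matches the inference-time framing in the summary.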