PulseAugur / Brief

last 24h
[1/1] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. "Did you lie?" Evaluating Lie Detectors across Model Scale and Belief-Verified Model Organisms

    Researchers have developed new methods for evaluating lie detectors for language models, addressing a weakness of existing testbeds: they often fail to ensure that a model genuinely believes the opposite of what it states. The study introduces 13 reasoning model organisms with verified hidden beliefs, along with a prompted-lying testbed called Varied Deception. Across 31 open-weight models, detector performance on prompted lying scaled with model capability, but activation- and logprob-based methods struggled on the trained model organisms. The chain-of-thought judge performed best, though its lead is partly attributable to the verification methods used.

    IMPACT New evaluation methods and datasets for AI lie detection could improve model auditing and safety research.