PulseAugur / Brief

last 24h
[1/1] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. Contrastive targeted SFT as a mechinterp method - has anyone mapped causal dependency interactions this way? [D]

    A machine learning practitioner is exploring a method for understanding and controlling model behavior by mapping causal dependencies between capabilities. The approach uses contrastive supervised fine-tuning (SFT) to isolate specific circuits within a 31B-parameter model: variants are trained to emphasize or de-emphasize a given dimension, and the circuits identified this way are then ablated. From the ablation results, the practitioner aims to build a causal dependency graph of model capabilities, which could inform the order in which capabilities are trained in future models and improve behavioral control.

    IMPACT This research could lead to more predictable and controllable AI behavior by mapping internal causal dependencies.
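
    To make the dependency-graph idea in the summary concrete, here is a minimal sketch, not the practitioner's actual code. It assumes ablation results are already available as a table of score drops (the capability names, the numbers, and the 0.2 threshold are all invented for illustration), treats a large drop in capability B after ablating capability A's circuit as "B depends on A", and derives one candidate training order with a topological sort from Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter


def build_dependency_graph(ablation_drop, threshold=0.2):
    """Map each capability to the set of capabilities it depends on.

    ablation_drop[a][b] is the (hypothetical) drop in the score of
    capability b after the circuit associated with a is ablated.
    A drop at or above `threshold` is read as "b depends on a".
    """
    graph = {cap: set() for cap in ablation_drop}
    for ablated, drops in ablation_drop.items():
        for affected, drop in drops.items():
            if affected != ablated and drop >= threshold:
                graph.setdefault(affected, set()).add(ablated)
    return graph


def training_order(graph):
    """Return capabilities with prerequisites first.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(graph).static_order())


# Illustrative numbers only; these are not results from the post.
ablation_drop = {
    "syntax": {"syntax": 0.90, "reasoning": 0.50, "tool_use": 0.40},
    "reasoning": {"syntax": 0.05, "reasoning": 0.80, "tool_use": 0.35},
    "tool_use": {"syntax": 0.00, "reasoning": 0.10, "tool_use": 0.70},
}

graph = build_dependency_graph(ablation_drop)
order = training_order(graph)
print(graph)
print(order)
```

    In this toy table the order comes out as a simple chain because each capability depends on everything before it. Real ablation data would likely be noisier, and mutual dependencies would surface as a `CycleError` that has to be resolved before any ordering can be read off.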