Researchers have developed a method for controlling Large Language Model (LLM) behavior at inference time by treating the model's layer-wise dynamics as a locally linear system. The approach adapts classical linear optimal control techniques to steer activations toward desired semantic targets. The method offers closed-loop control with minimal computational overhead and comes with theoretical performance guarantees, and it outperforms existing activation steering techniques at controlling attributes such as toxicity and truthfulness.
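To make the idea of closed-loop control on locally linear layer dynamics concrete, here is a minimal sketch of a textbook finite-horizon linear-quadratic regulator (LQR) on a toy system. It is not the paper's algorithm. Everything in it is an assumption made for illustration: the per-layer matrices in `A_list` are random stand-ins for layer-wise linearizations, the steering input is simply added to the activation, and the names `lqr_gains`, `rollout`, `target`, `q`, and `r` are hypothetical.

```python
import numpy as np

def lqr_gains(A_list, B, Qf, R):
    """Backward Riccati recursion: one feedback gain per layer (terminal cost only)."""
    P = Qf
    gains = []
    for A in reversed(A_list):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = A.T @ P @ (A - B @ K)  # no running state cost in this toy setup
        gains.append(K)
    return gains[::-1]

def rollout(A_list, B, x0, gains=None):
    """Propagate the augmented state z = [x, 1]; apply u = -K z when gains are given."""
    z = np.append(x0, 1.0)
    for layer, A in enumerate(A_list):
        u = -gains[layer] @ z if gains is not None else np.zeros(B.shape[1])
        z = A @ z + B @ u
    return z[:-1]

# Toy setup (all values are illustrative assumptions, not taken from the paper).
rng = np.random.default_rng(0)
d, n_layers, q, r = 4, 6, 10.0, 0.1
x0 = rng.standard_normal(d)       # stand-in for an initial activation
target = rng.standard_normal(d)   # stand-in for a desired semantic target

# Augment with a constant 1 so the cost ||x - target||^2 is quadratic in z.
A_list = []
for _ in range(n_layers):
    A_aug = np.eye(d + 1)
    A_aug[:d, :d] = np.eye(d) + 0.1 * rng.standard_normal((d, d))
    A_list.append(A_aug)
B = np.vstack([np.eye(d), np.zeros((1, d))])   # steering vector is added to x
M = np.hstack([np.eye(d), -target[:, None]])   # M @ z equals x - target
Qf = q * M.T @ M                               # terminal tracking cost
R = r * np.eye(d)                              # penalty on steering magnitude

gains = lqr_gains(A_list, B, Qf, R)
open_err = np.linalg.norm(rollout(A_list, B, x0) - target)
closed_err = np.linalg.norm(rollout(A_list, B, x0, gains) - target)
print(f"open-loop error {open_err:.3f}, closed-loop error {closed_err:.3f}")
```

The feedback form `u = -K z` is what makes the controller closed-loop: each layer's steering input depends on the current state rather than on a fixed, precomputed vector. The ratio `q / r` trades off how closely the final state tracks the target against how large the interventions are.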
Summary written by gemini-2.5-flash-lite from 1 source.