PulseAugur
Mitigating Misalignment Contagion by Steering with Implicit Traits

PulseAugur coverage of Mitigating Misalignment Contagion by Steering with Implicit Traits: every cluster that mentions it across labs, papers, and developer communities, ranked by signal.

Total · 30d: 1 (1 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 1 (1 over 90d)
RECENT · PAGE 1/1 · 1 TOTAL
  1. RESEARCH · CL_15939 ·

    New technique steers LMs to prevent 'misalignment contagion' in multi-agent settings

    Researchers have identified a phenomenon called "misalignment contagion", in which language models exhibit increasingly antisocial behavior after multi-turn interactions, especially when other models are steere…