PulseAugur

Researchers find independently trained transformers compute the same function in randomly rotated bases

Researchers have identified a phenomenon they call "polymorphism" in independently trained transformers: the models compute the same function, but in residual-stream bases that differ from one another by a rotation. The rotation is uniformly random on SO(d_model), so one model's internal representations are unintelligible to another. A single matrix multiplication, obtained from an orthogonal Procrustes fit, aligns the bases, allowing feature dictionaries and steering vectors to transfer between models without retraining.
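To make the alignment step concrete, here is a minimal sketch of an orthogonal Procrustes fit on synthetic data. It is not the paper's code: the toy activations, the row-vector convention, and the names `random_rotation`, `procrustes_fit`, `acts_a`, `acts_b`, and `steer_a` are all assumptions made for illustration. "Model B" is simulated by rotating "model A" activations with a rotation drawn uniformly from SO(d).

```python
import numpy as np

def random_rotation(d, rng):
    """Sample a Haar-uniform rotation from SO(d) via QR of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    q = q * np.sign(np.diag(r))      # fix column signs so q is Haar on O(d)
    if np.linalg.det(q) < 0:         # flip one column to land in SO(d)
        q[:, 0] = -q[:, 0]
    return q

def procrustes_fit(a, b):
    """Orthogonal R minimizing ||a @ R - b||_F (rows are activation vectors)."""
    u, _, vt = np.linalg.svd(a.T @ b)
    return u @ vt

rng = np.random.default_rng(0)
d_model, n_tokens = 16, 512          # toy sizes, chosen arbitrarily

# Hypothetical stand-ins for residual-stream activations on shared inputs.
acts_a = rng.standard_normal((n_tokens, d_model))
true_rot = random_rotation(d_model, rng)
acts_b = acts_a @ true_rot           # same features, rotated basis

# Fit the alignment from paired activations only.
r_hat = procrustes_fit(acts_a, acts_b)

# Transfer a (hypothetical) steering vector from A's basis to B's basis.
steer_a = rng.standard_normal(d_model)
steer_b = steer_a @ r_hat

recovered = np.allclose(r_hat, true_rot, atol=1e-8)
```

In this noiseless toy setting the fitted matrix matches the hidden rotation, so transferring a vector costs one matrix multiplication. Real activations from two trained models would match only approximately, and how well is an empirical question the paper addresses.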

IMPACT Reveals that independently trained models can compute identical functions through rotated internal representations, suggesting potential for cross-model transferability of learned features.

RANK_REASON The cluster contains an academic paper detailing a new finding about transformer model internals.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.AI · TIER_1 · English (EN) · Jordan F. McCann

    Polymorphism Is Rotation: Operational Mechanistic Interpretability from a Two-Layer Transformer to Pythia-70m

    arXiv:2605.24577v1 Announce Type: cross Abstract: Independently trained transformers compute the same function in residual-stream bases that differ by a uniform random rotation on $\mathrm{SO}(d_{\mathrm{model}})$. We call this phenomenon polymorphism: same function, mutually uni…