Mean-Pooled Cosine Similarity is Not Length-Invariant: Theory and Cross-Domain Evidence for a Length-Invariant Alternative
A new paper published on arXiv argues that mean-pooled cosine similarity, a common metric for comparing neural representations, is not length-invariant. The researchers demonstrate that sequence length alone can heavily influence this metric, potentially skewing results in cross-lingual and cross-modal comparisons. They propose using Centered Kernel Alignment (CKA) as a more robust, length-invariant alternative for evaluating representational similarity.
IMPACT: Calls into question the validity of mean-pooled cosine similarity as an evaluation metric, which may affect how models are assessed and compared, particularly in cross-lingual and cross-modal settings.
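
The length effect described in the summary can be illustrated with a small synthetic sketch. This is not the paper's experiment or code: the toy data model (every token embedding is a shared offset plus independent Gaussian noise), the helper names `mean_pooled_cosine`, `linear_cka`, and `toy_sequence`, and all parameter values are assumptions made for illustration. The CKA function uses the standard linear formulation, which may differ from the variant the authors evaluate.

```python
import numpy as np


def mean_pooled_cosine(a, b):
    """Cosine similarity between the mean-pooled vectors of two (length, dim) arrays."""
    u, v = a.mean(axis=0), b.mean(axis=0)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def linear_cka(x, y):
    """Standard linear CKA between two (n_samples, dim) arrays with equal n_samples."""
    x = x - x.mean(axis=0)  # center each feature column
    y = y - y.mean(axis=0)
    cross = np.linalg.norm(y.T @ x, "fro") ** 2
    return float(cross / (np.linalg.norm(x.T @ x, "fro") * np.linalg.norm(y.T @ y, "fro")))


def toy_sequence(rng, length, dim=64, offset=0.5):
    # Assumed toy model: shared offset + independent unit-variance Gaussian noise.
    return offset + rng.standard_normal((length, dim))


rng = np.random.default_rng(0)

# Average mean-pooled cosine similarity between pairs of unrelated toy sequences.
# Only the sequence length changes between the three settings.
avg_by_length = {}
for length in (4, 32, 256):
    sims = [
        mean_pooled_cosine(toy_sequence(rng, length), toy_sequence(rng, length))
        for _ in range(200)
    ]
    avg_by_length[length] = float(np.mean(sims))
    print(f"length={length:4d}  mean-pooled cosine ~ {avg_by_length[length]:.3f}")

# Sanity checks on linear CKA: a representation compared with itself, and with
# a rotated copy of itself, should both score 1 up to floating-point error.
x = rng.standard_normal((100, 64))
q, _ = np.linalg.qr(rng.standard_normal((64, 64)))  # random orthogonal matrix
cka_self = linear_cka(x, x)
cka_rotated = linear_cka(x, x @ q)
print(f"CKA(x, x) = {cka_self:.6f}, CKA(x, x @ q) = {cka_rotated:.6f}")
```

In this toy setup the two sequences in each pair share nothing except the offset, yet the averaged cosine score rises as the sequences get longer, because mean pooling averages away the per-token noise and leaves the shared component. Note that linear CKA compares two sets of examples of equal size rather than two pooled vectors, so it is not a drop-in replacement for a single cosine call.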