LLM-based Embeddings: Attention Values Encode Sentence Semantics Better Than Hidden States
A new research paper argues that attention value vectors, rather than hidden states, capture sentence semantics more effectively in Large Language Models (LLMs). The paper introduces Value Aggregation (VA), a method that pools token value vectors across layers and token indices and outperforms existing LLM-based embeddings in a training-free setting. A refined technique, Aligned Weighted VA (AlignedWVA), further improves performance by interpreting layer attention outputs as aligned weighted value vectors, achieving state-of-the-art results.
AI IMPACT: Proposes a new method for generating semantically richer sentence embeddings from LLMs, potentially improving downstream NLP applications.
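The pooling idea summarized above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the summary only says that VA "pools" value vectors, so mean pooling is an assumption here, as are the function names `value_aggregation` and `aligned_weighted_values`, the single-head simplification, and the array shapes. The second function only illustrates the standard fact that an attention output is an output projection applied to an attention-weighted sum of value vectors; how AlignedWVA actually chooses weights and alignment is not stated in the summary.

```python
import numpy as np


def value_aggregation(values):
    """Mean-pool value vectors over tokens, then over layers (assumed pooling).

    values: array of shape (num_layers, num_tokens, d_v), hypothetical layout.
    Returns a single embedding of shape (d_v,).
    """
    values = np.asarray(values, dtype=float)
    if values.ndim != 3:
        raise ValueError("expected shape (num_layers, num_tokens, d_v)")
    per_layer = values.mean(axis=1)   # pool over token indices -> (L, d_v)
    return per_layer.mean(axis=0)     # pool over layers -> (d_v,)


def aligned_weighted_values(values, attn_weights, w_out):
    """Weighted variant: attention-weighted values passed through an output projection.

    values:       (num_layers, num_tokens, d_v)
    attn_weights: (num_layers, num_tokens), each row summing to 1 (assumption:
                  e.g. one query token's attention distribution per layer)
    w_out:        (num_layers, d_v, d_model), per-layer output projection
    Returns an embedding of shape (d_model,), averaged over layers (assumption).
    """
    values = np.asarray(values, dtype=float)
    # Weighted sum of value vectors per layer -> (L, d_v)
    weighted = np.einsum("lt,ltd->ld", attn_weights, values)
    # Project into the model's output space per layer -> (L, d_model)
    aligned = np.einsum("ld,ldm->lm", weighted, w_out)
    return aligned.mean(axis=0)


# Toy data standing in for value vectors extracted from an LLM.
rng = np.random.default_rng(0)
num_layers, num_tokens, d_v = 4, 6, 8
toy_values = rng.normal(size=(num_layers, num_tokens, d_v))

va_embedding = value_aggregation(toy_values)

# With uniform weights and identity projections, the weighted variant
# reduces to plain mean pooling.
uniform = np.full((num_layers, num_tokens), 1.0 / num_tokens)
identity = np.stack([np.eye(d_v)] * num_layers)
wva_embedding = aligned_weighted_values(toy_values, uniform, identity)
```

In practice the value vectors would be taken from a real model's attention layers, and the resulting embeddings compared with cosine similarity; the random arrays here only exercise the shapes.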