Microsoft has released Differential Transformer V2, an update to its attention mechanism for large language models. The new version computes attention more sparsely, which reduces computational cost and improves the efficiency and scalability of transformer models.
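The summary gives no V2 internals, but the core idea of the original Differential Transformer is public: attention is computed as the difference of two softmax attention maps, which cancels common-mode noise and yields sparser effective attention. A minimal NumPy sketch of that V1-style mechanism follows; the function name, weight-matrix arguments, and the fixed scalar `lam` (learnable in the paper) are illustrative assumptions, not Microsoft's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def differential_attention(X, Wq1, Wk1, Wq2, Wk2, Wv, lam=0.5):
    """Sketch of differential attention (V1-style, hypothetical names).

    Two independent query/key projections produce two attention maps;
    their weighted difference cancels shared "noise" attention mass.
    """
    d = Wq1.shape[1]
    A1 = softmax((X @ Wq1) @ (X @ Wk1).T / np.sqrt(d))
    A2 = softmax((X @ Wq2) @ (X @ Wk2).T / np.sqrt(d))
    # Differential map: subtract the second softmax map, scaled by lam.
    return (A1 - lam * A2) @ (X @ Wv)
```

The subtraction is the key design choice: spurious attention that appears in both maps cancels, concentrating the remaining weight on a sparser set of tokens.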
Summary written by gemini-2.5-flash-lite from 1 source.