Subquadratic AI introduces SubQ-1.1-Small, a new model using Smart Sparse Attention
Subquadratic AI has released SubQ 1.1 Small, a model that uses Smart Sparse Attention to achieve near-perfect long-context retrieval at context lengths of up to 12 million tokens. The model uses up to 1,000 times less attention compute than standard methods. At 1 million tokens, SubQ 1.1 Small requires 64.5 times less compute and runs 56 times faster than FlashAttention-2, while maintaining strong general reasoning capabilities.
AI IMPACT: Significantly advances long-context retrieval, potentially enabling new applications that require processing massive documents or codebases.