Subquadratic AI has released SubQ 1.1 Small, a model that uses Smart Sparse Attention to achieve near-perfect long-context retrieval at up to 12 million tokens. The model uses up to 1,000 times less attention compute than standard methods. At 1 million tokens, SubQ 1.1 Small requires 64.5 times less compute and runs 56 times faster than FlashAttention-2, while maintaining strong general reasoning capabilities.
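To put the 64.5x figure at 1 million tokens in perspective, here is a rough back-of-the-envelope sketch. It assumes that attention compute is proportional to the number of query-key pairs scored, and that the dense baseline scores every pair. Neither assumption comes from the release, and the function names are hypothetical; this is not a description of how Smart Sparse Attention actually selects tokens.

```python
def dense_attention_pairs(n_tokens: int) -> int:
    """Query-key pairs scored by dense attention (assumption: every query sees every key)."""
    return n_tokens * n_tokens


def implied_keys_per_query(n_tokens: int, reduction: float) -> float:
    """Average keys per query if the pair count shrinks by `reduction` (illustrative only)."""
    sparse_pairs = dense_attention_pairs(n_tokens) / reduction
    return sparse_pairs / n_tokens


n = 1_000_000               # context length quoted in the summary
reported_reduction = 64.5   # compute reduction quoted in the summary

keys = implied_keys_per_query(n, reported_reduction)
print(f"Implied average keys per query: {keys:,.0f} of {n:,}")
print(f"Fraction of the context scored per query: {keys / n:.2%}")
```

Under these assumptions, each query would score only a small fraction of the context on average. The real accounting may differ, since the reported figure could include costs that do not scale with pair count.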
IMPACT Significantly advances long-context retrieval, potentially enabling new applications requiring processing of massive documents or codebases.
RANK_REASON New model release from a frontier AI lab with detailed technical specifications and benchmark results.
AI-generated summary · Google Gemini · from 1 source.