What "Subquadratic Attention" Actually Means
SubQ has launched a new frontier LLM, also named SubQ, with a 12 million token context window and a novel subquadratic attention mechanism. The approach aims to overcome the computational limits of traditional full attention, whose cost grows quadratically: doubling the context length quadruples the compute. SubQ's learned-sparse attention dynamically selects relevant token pairs at inference time, offering a significant cost reduction compared with full-attention models.
AI IMPACT: Enables processing of much larger contexts, such as entire codebases and long agent traces, potentially reducing reliance on retrieval augmentation.
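
The scaling claim in the summary (doubling the context quadruples full-attention compute) can be checked with a few lines of arithmetic. The Python sketch below counts query-key pairs only. It is not SubQ's implementation: the function names and the per-query budget K are hypothetical, chosen for illustration, and the count ignores whatever it costs a learned-sparse model to decide which pairs to keep.

```python
def full_attention_pairs(n_tokens: int) -> int:
    """Query-key pairs scored by full attention: every token against every token."""
    return n_tokens * n_tokens


def sparse_attention_pairs(n_tokens: int, keys_per_query: int) -> int:
    """Pairs scored if each query attends to at most `keys_per_query` keys.

    `keys_per_query` is a hypothetical fixed budget, not a published SubQ
    parameter. The cost of selecting those keys is not counted here.
    """
    return n_tokens * min(keys_per_query, n_tokens)


def main() -> None:
    K = 4096  # assumed budget, for illustration only
    for n in (1_000_000, 2_000_000, 12_000_000):
        full = full_attention_pairs(n)
        sparse = sparse_attention_pairs(n, K)
        print(f"n={n:>12,}  full={full:.2e}  sparse(K={K})={sparse:.2e}")


if __name__ == "__main__":
    main()
```

With a fixed budget, the sparse pair count grows linearly with context length, which is one way an attention scheme can be subquadratic. Whether SubQ uses a fixed budget, a length-dependent one, or something else is not stated in the summary.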