PulseAugur

Subquadratic debuts 12M-token context window with linear scaling architecture

Subquadratic, a startup with 11 PhD researchers, has launched a model built on its Subquadratic Selective Attention (SSA) architecture, which the company says scales linearly with context length. The design supports a 12-million-token context window and is meant to avoid the quadratic cost of the dense attention used in conventional LLMs. Early benchmarks reportedly show performance competitive with models such as GPT-5.5 and Claude Opus on MRCR v2 and SWE-Bench, with significantly faster inference.

Impact: Linear scaling in compute and memory with context length could significantly reduce the cost and improve the ROI of RAG and agentic decomposition.
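To make the scaling claim concrete, here is a minimal sketch of the arithmetic, assuming idealized cost models: dense attention cost proportional to the square of the token count, and a linear architecture proportional to the token count itself. The function names and unit costs are hypothetical and chosen for illustration only. This is not Subquadratic's implementation and it says nothing about SSA's real constants or measured performance.

```python
# Idealized cost models (hypothetical, for illustration only).
# Units are arbitrary "token interactions", not FLOPs or dollars.

def dense_attention_cost(n_tokens: int) -> int:
    """Dense attention: every token attends to every token, so cost ~ n^2."""
    return n_tokens * n_tokens


def linear_attention_cost(n_tokens: int, per_token_budget: int = 1) -> int:
    """Assumed linear model: each token does a fixed amount of work, so cost ~ n."""
    return n_tokens * per_token_budget


def growth_factor(cost_fn, n_tokens: int, multiplier: int = 2) -> float:
    """How much the cost grows when the context is multiplied by `multiplier`."""
    return cost_fn(n_tokens * multiplier) / cost_fn(n_tokens)


if __name__ == "__main__":
    # Doubling the context: 4x for the quadratic model, 2x for the linear one.
    print("dense growth on doubling: ", growth_factor(dense_attention_cost, 1_000))
    print("linear growth on doubling:", growth_factor(linear_attention_cost, 1_000))

    # Ratio between the two idealized models at a few context lengths.
    for n in (128_000, 1_000_000, 12_000_000):
        ratio = dense_attention_cost(n) // linear_attention_cost(n)
        print(f"{n:>12,} tokens: dense/linear ratio = {ratio:,}")
```

Under these assumptions the gap between the two models grows in proportion to the context length, which is why the difference matters far more at 12 million tokens than at typical context sizes. Real systems have constant factors and memory effects that this sketch ignores.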

Ranking rationale: A startup released a new model with a novel architecture and provided benchmark results.

Read on dev.to (LLM tag) →

AI-generated summary · Google Gemini · from 1 source.


Sources [1]

  1. dev.to (LLM tag) · English (EN) · Andrew Kew

    12 million tokens, linear cost: Subquadratic's bet against the attention tax

    The quadratic attention problem has quietly shaped everything you've built with LLMs. RAG pipelines, agentic decomposition, hybrid architectures: these aren't the natural shape of AI systems. They're workarounds. Doubling the context quadruples the compute, so everyone stoppe…