GQA
PulseAugur coverage of GQA — every cluster mentioning GQA across labs, papers, and developer communities, ranked by signal.
Sentiment data available for 2 days
-
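OpenMythos tutorial demonstrates recurrent Transformers for deeper computation
The OpenMythos framework builds advanced recurrent-depth Transformer models, as demonstrated in a tutorial that runs on Google Colab. The tutorial shows how to build and compare Multi-head Latent Attention (MLA) and Grouped-Query Attention (GQA) model variants, and analyzes their parameter counts and the stability of the recurrent injection matrix. It sets up a synthetic compositional reasoning task in which the model learns to predict modular sums of fixed values, illustrating how recurrence enables deeper computation through parameter reuse.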
-
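Transformer LLM architectures converge on a standardized stack
A recent analysis of 53 large language models released between 2017 and 2025 shows that Transformer architectures are converging markedly. The de facto standard comprises pre-normalization (RMSNorm), rotary position embeddings (RoPE), SwiGLU activations in the MLP, and shared key-value attention (MQA/GQA). The convergence is attributed to improved optimization stability, better quality per FLOP, and practical considerations such as kernel availability and KV-cache economics.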
-
IBM releases Granite 4.1 LLMs with 512K context and Apache 2.0 license
IBM has released the Granite 4.1 family of large language models, comprising 3B, 8B, and 30B parameter versions. These models were trained on approximately 15 trillion tokens through a five-stage pre-training process th…
-
BLASST paper introduces dynamic sparse attention for faster LLM inference
Researchers have developed BLASST, a novel sparse attention mechanism designed to accelerate inference for large language models with long contexts. This drop-in solution dynamically skips attention blocks using a simpl…
-
Kwai Summary Attention compresses historical contexts for efficient long-context LLMs
Researchers have introduced Kwai Summary Attention (KSA), a novel attention mechanism designed to address the quadratic time complexity of standard softmax attention in large language models. KSA aims to maintain a line…
-
DeepSeek benchmarks MLA vs GQA on A100, revealing bandwidth-quality tradeoff
A technical analysis explores DeepSeek's decision to use MLA (Multi-head Latent Attention) over GQA (Grouped-Query Attention) in its models. The author highlights this choice as a strategic trade-off between compu…
-
DeepSeek-V4, LoRA, and other LLM techniques detailed in new blogs
A series of six blog posts has been published on Outcome School, detailing fundamental components of contemporary large language models. The posts cover technical concepts such as RMSNorm, DeepSeek-V4, LoRA, RoPE, GQA, …
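
Several items above refer to shared key-value attention (MQA/GQA) and to KV-cache economics. As a rough illustration only, the sketch below estimates KV-cache size for multi-head (MHA), grouped-query (GQA), and multi-query (MQA) attention. The function name `kv_cache_bytes` and the model shape (32 layers, 32 query heads, head dimension 128, 8192-token context, 2-byte elements) are hypothetical values chosen for the example and are not taken from any of the models or articles listed here.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch=1, bytes_per_elem=2):
    """Bytes needed to cache keys and values for a batch of sequences."""
    # Factor 2: one cached tensor for keys and one for values, per layer.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * bytes_per_elem


# Hypothetical model shape, for illustration only.
N_LAYERS, N_Q_HEADS, HEAD_DIM, SEQ_LEN = 32, 32, 128, 8192

# Number of key/value heads per variant:
# MHA keeps one KV head per query head, GQA shares each KV head
# across a group of query heads, MQA uses a single KV head.
variants = {"MHA": N_Q_HEADS, "GQA": 8, "MQA": 1}

sizes = {
    name: kv_cache_bytes(N_LAYERS, kv_heads, HEAD_DIM, SEQ_LEN)
    for name, kv_heads in variants.items()
}

for name, size in sizes.items():
    print(f"{name}: {size / 2**30:.3f} GiB")
```

Under these assumed numbers the cache scales linearly with the number of key/value heads, so the 8-head GQA variant needs a quarter of the MHA cache and the MQA variant needs 1/32 of it.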