PulseAugur

Cola

PulseAugur coverage of Cola — every cluster mentioning Cola across labs, papers, and developer communities, ranked by signal.

Total · 30 days
3
Within 90 days: 3
Releases · 30 days
0
Within 90 days: 0
Papers · 30 days
3
Within 90 days: 3
Tier distribution · 90 days
Sentiment · 30 days

1 day with sentiment data

Recent · Page 1 of 1 · 3 items
  1. RESEARCH · CL_30733 ·

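    LLM pretraining research explores sparse versus dense and low-rank methods

    Two new research papers examine methods for efficient pretraining of large language models. The first compares dense and sparse Mixture-of-Experts (MoE) Transformer architectures at small scale, finding that MoE models improve validation loss when matched on active parameters but do not outperform dense models at equal total parameter capacity. The second studies various low-rank pretraining techniques, showing that even when validation perplexity is similar, these methods converge to geometrically different solutions and do not fully replicate the generalization or internal representations of full-rank training.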

  2. RESEARCH · CL_14140 ·

    Lost in State Space: Probing Frozen Mamba Representations

    A new research paper investigates the internal workings of Mamba, a recurrent neural network architecture. The study tested the hypothesis that Mamba's state could directly yield semantic sentence summaries without addi…

  3. RESEARCH · CL_05149 ·

    LoRA fine-tuning research suggests rank 1 is sufficient, proposes data-aware initialization

    Three new research papers explore methods to optimize LoRA fine-tuning for large language models. One paper proposes reducing the LoRA rank threshold to 1 for binary classification tasks, showing competitive performance…