PulseAugur

LiveCodeBench V6

PulseAugur coverage of LiveCodeBench V6 — every cluster mentioning LiveCodeBench V6 across labs, papers, and developer communities, ranked by signal.

Total · 30 days: 3 (90 days: 3)
Releases · 30 days: 0 (90 days: 0)
Papers · 30 days: 2 (90 days: 2)
Tier distribution · 90 days
Sentiment · 30 days (sentiment data available for 1 day)

Recent · Page 1 of 1 · 3 items
  1. RESEARCH · CL_40825 ·

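    New self-distillation methods improve large language model performance on reasoning tasks

    Researchers have developed new self-distillation techniques for large language models that improve performance without relying on external feedback. AVSD (Adaptive View Self-Distillation) balances consensus signals across multiple privileged-information views and uses view-specific residuals to strengthen learning. Self-Policy Distillation (SPD) extracts capability subspaces from gradients to improve performance and generalization, particularly in code generation and mathematical reasoning. CEPO (Contrastive Evidence Policy Optimization) sharpens credit assignment for key tokens by contrasting correct and incorrect answers, improving accuracy on multimodal mathematical reasoning benchmarks.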

  2. RESEARCH · CL_02960 ·

    Process Supervision via Verbal Critique Improves Reasoning in Large Language Models

    Researchers have developed a new framework called Verbal Process Supervision (VPS) that enhances the reasoning capabilities of large language models without requiring gradient updates. This method utilizes structured na…

  3. FRONTIER RELEASE · CL_01735 ·

    Google DeepMind launches Deep Think for Gemini Ultra subscribers

    Google DeepMind has released a new AI capability called Deep Think, now available to Google AI Ultra subscribers via the Gemini app. This feature utilizes parallel thinking techniques, allowing the model to explore mult…