PulseAugur
DeepSeek-R1-Zero

PulseAugur coverage of DeepSeek-R1-Zero — every cluster mentioning DeepSeek-R1-Zero across labs, papers, and developer communities, ranked by signal.

Total · 30 days: 1 (90 days: 1)
Releases · 30 days: 0 (90 days: 0)
Papers · 30 days: 1 (90 days: 1)
Recent · page 1 of 1 · 1 item
  1. RESEARCH · CL_05788 ·

    Kwai AI's SRPO achieves DeepSeek-R1-Zero performance with 10x fewer training steps

    Researchers from Kuaishou's Kwaipilot team have developed a novel reinforcement learning framework called SRPO, designed to improve the efficiency and performance of large language models. This new method addresses limi…