PulseAugur

Dingwei Zhu

PulseAugur coverage of Dingwei Zhu — every cluster mentioning Dingwei Zhu across labs, papers, and developer communities, ranked by signal.

Total · 30d: 1 (1 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 1 (1 over 90d)
RECENT · PAGE 1/1 · 1 TOTAL
  1. RESEARCH · CL_05416

    DVPO and EVPO advance LLM post-training with novel RL optimization techniques

    Researchers have introduced DVPO, a reinforcement learning framework for Large Language Model (LLM) post-training, aimed at settings where supervision signals are noisy or incomplete. DVPO util…