ENTITY Dingwei Zhu

PulseAugur coverage of Dingwei Zhu — every cluster mentioning Dingwei Zhu across labs, papers, and developer communities, ranked by signal.

Total · 30d: 1 (1 over 90d)

Releases · 30d: 0 (0 over 90d)

Papers · 30d: 1 (1 over 90d)

RECENT · PAGE 1/1 · 1 TOTAL

RESEARCH · CL_05416 · Apr 21 · 14:07

DVPO and EVPO advance LLM post-training with novel RL optimization techniques

Researchers have introduced DVPO, a reinforcement learning framework for Large Language Model (LLM) post-training, aimed at settings where supervision signals are noisy or incomplete. DVPO util…