PulseAugur

On-Policy Distillation

PulseAugur coverage of On-Policy Distillation — every cluster mentioning the topic across labs, papers, and developer communities, ranked by signal.

Total · 30d: 3 (3 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 3 (3 over 90d)
TIER MIX · 90D
SENTIMENT · 30D — 2 days with sentiment data

RECENT · PAGE 1/1 · 3 TOTAL
  1. TOOL · CL_27502

    ProteinOPD framework enhances protein design alignment with 8x speedup

    Researchers have developed ProteinOPD, a new framework for aligning protein language models (PLMs) with desired functions. This method adapts pretrained PLMs into specialized teachers and distills their knowledge into a…

  2. RESEARCH · CL_21952

    New methods enhance on-policy distillation for LLMs

    Researchers have developed new methods to improve the efficiency and stability of on-policy distillation (OPD) for large language models. One approach, vOPD, uses a control variate baseline derived from the reverse KL d…

  3. RESEARCH · CL_06734

    Researchers refine on-policy distillation for more stable LLM training

    Researchers have identified significant empirical failure modes in on-policy distillation (OPD), a technique used for post-training large language models. The standard implementation, which relies on sampled-token log-r…
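The entries above both refer to the standard on-policy distillation objective: the student samples tokens, and the loss is the reverse KL divergence from student to teacher, estimated per sampled token by the log-ratio of student and teacher probabilities. A minimal toy sketch of that estimator (not taken from any cited paper; the distributions and function names here are illustrative assumptions):

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over a small vocabulary.
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def reverse_kl(student_lp, teacher_lp):
    # Exact per-position reverse KL: E_{x ~ student}[log p_s(x) - log p_t(x)].
    return sum(math.exp(ls) * (ls - lt) for ls, lt in zip(student_lp, teacher_lp))

def sampled_token_log_ratio(student_lp, teacher_lp, token):
    # The sampled-token log-ratio used as the per-token OPD signal; its
    # expectation under the student distribution equals reverse_kl().
    return student_lp[token] - teacher_lp[token]

# Toy 4-token vocabulary with hypothetical logits.
student_lp = log_softmax([2.0, 1.0, 0.5, -1.0])
teacher_lp = log_softmax([1.5, 1.5, 0.0, 0.0])

exact = reverse_kl(student_lp, teacher_lp)
# Averaging the sampled-token estimator over all tokens, weighted by the
# student's probabilities, recovers the exact reverse KL.
avg = sum(
    math.exp(student_lp[t]) * sampled_token_log_ratio(student_lp, teacher_lp, t)
    for t in range(4)
)
assert abs(exact - avg) < 1e-9
```

The high variance of this single-sample estimator is precisely what baseline methods such as the vOPD approach summarized above aim to reduce.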