ENTITY On-Policy Distillation

PulseAugur coverage of On-Policy Distillation — every cluster mentioning On-Policy Distillation across labs, papers, and developer communities, ranked by signal.

Total · 30d

3 over 90d

Releases · 30d

0 over 90d

Papers · 30d

3 over 90d

SENTIMENT · 30D

2 days with sentiment data

RECENT · PAGE 1/1 · 3 TOTAL

TOOL · CL_27502 · May 11 · 08:38

ProteinOPD framework enhances protein design alignment with 8x speedup

Researchers have developed ProteinOPD, a new framework for aligning protein language models (PLMs) with desired functions. This method adapts pretrained PLMs into specialized teachers and distills their knowledge into a…

RESEARCH · CL_21952 · May 8 · 04:00

New methods enhance on-policy distillation for LLMs

Researchers have developed new methods to improve the efficiency and stability of on-policy distillation (OPD) for large language models. One approach, vOPD, uses a control variate baseline derived from the reverse KL d…

RESEARCH · CL_06734 · Apr 28 · 04:00

Researchers refine on-policy distillation for more stable LLM training

Researchers have identified significant empirical failure modes in on-policy distillation (OPD), a technique used for post-training large language models. The standard implementation, which relies on sampled-token log-r…
