On-policy self-distillation

PulseAugur coverage of on-policy self-distillation: every cluster that mentions it across labs, papers, and developer communities, ranked by signal.

Total · 30d: 2 (2 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 2 (2 over 90d)
Sentiment · 30d: 2 days with sentiment data

RECENT · PAGE 1/1 · 2 TOTAL
  1. RESEARCH · CL_79119

    New Trajectory-Refined Distillation improves LLM training

    Researchers have introduced Trajectory-Refined Distillation (TRD), a new method to improve the post-training process for large language models. TRD addresses a problem called "prefix failure" in on-policy distillation, …

  2. TOOL · CL_68337

    New distillation method enhances AI safety without sacrificing reasoning

    Researchers have developed a new method called Constitutional On-Policy Safe Distillation (COPSD) to improve the safety and helpfulness of AI models. Existing on-policy self-distillation techniques can lead to a collaps…