Direct Preference Optimisation

PulseAugur coverage of Direct Preference Optimisation: every cluster that mentions it across labs, papers, and developer communities, ranked by signal.

Total · 30d: 2 (2 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 2 (2 over 90d)
Sentiment · 30d: 2 days with sentiment data

RECENT · PAGE 1/1 · 2 TOTAL
  1. RESEARCH · CL_99653

    Sequential DPO shows varied impact on language model preferences

    Researchers have investigated the impact of sequential Direct Preference Optimization (DPO) on language models, finding that it does not uniformly degrade previously learned preferences. The study, using Llama-3.1-8B-In…

  2. TOOL · CL_53684

    New framework boosts LLM safety alignment with curriculum learning

    Researchers have developed a new framework called Staged-Competence to improve the safety alignment of large language models using Direct Preference Optimization (DPO). This curriculum learning approach organizes prefer…