ENTITY
Direct Preference Optimisation
PulseAugur coverage of Direct Preference Optimisation — every cluster mentioning Direct Preference Optimisation across labs, papers, and developer communities, ranked by signal.
Total · 30d: 2 (2 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 2 (2 over 90d)
SENTIMENT · 30D
2 days with sentiment data
RECENT · PAGE 1/1 · 2 TOTAL
- Sequential DPO shows varied impact on language model preferences
Researchers have investigated the impact of sequential Direct Preference Optimization (DPO) on language models, finding that it does not uniformly degrade previously learned preferences. The study, using Llama-3.1-8B-In…
- New framework boosts LLM safety alignment with curriculum learning
Researchers have developed a new framework called Staged-Competence to improve the safety alignment of large language models using Direct Preference Optimization (DPO). This curriculum learning approach organizes prefer…