PulseAugur

DPO

PulseAugur coverage of DPO — every cluster mentioning DPO across labs, papers, and developer communities, ranked by signal.

Total · 30d: 38 (38 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 32 (32 over 90d)
RECENT · PAGE 1/1 · 2 TOTAL
  1. TOOL · CL_29384

    New TBPO method optimizes language models at token level

    Researchers have introduced Token-level Bregman Preference Optimization (TBPO), a new method for aligning language models using pairwise preferences. Unlike existing approaches that focus on full sequences, TBPO operate…

  2. TOOL · CL_27578

    EvoPref algorithm enhances LLM alignment with evolutionary optimization

    Researchers have developed EvoPref, a novel multi-objective evolutionary algorithm designed to improve the alignment of large language models (LLMs). Unlike traditional gradient-based methods that can lead to preference…