Researchers have introduced Discrete Tilt Matching (DTM), a likelihood-free method for fine-tuning discrete diffusion large language models (dLLMs). DTM reframes fine-tuning as state-level matching of local unmasking posteriors, yielding a weighted cross-entropy objective that is more stable than standard reinforcement learning approaches. Experiments show that DTM improves performance on tasks such as Sudoku and Countdown while remaining competitive on mathematical benchmarks.
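The source does not spell out DTM's objective, but the weighted cross-entropy it describes belongs to a familiar family. A minimal sketch of such an objective, assuming per-token weights (the actual DTM weighting scheme is not given in the summary, so `weights` here is a placeholder):

```python
import numpy as np

def weighted_cross_entropy(logits, targets, weights):
    """Weighted cross-entropy over discrete tokens.

    A generic sketch of the kind of weighted objective attributed to DTM;
    the weighting scheme is a placeholder, not DTM's actual tilt weights.
    """
    # Numerically stable log-softmax over the vocabulary axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    # Negative log-likelihood of each target token.
    nll = -log_probs[np.arange(len(targets)), targets]
    # Per-token weights tilt the objective toward selected positions.
    return float((weights * nll).mean())

# Hypothetical example: 3 token positions, vocabulary of size 5.
logits = np.zeros((3, 5))            # uniform predictions
targets = np.array([0, 2, 4])
weights = np.array([1.0, 0.5, 2.0])  # placeholder weights
loss = weighted_cross_entropy(logits, targets, weights)
```

Because this is an ordinary (weighted) cross-entropy, it can be optimized with standard supervised-learning machinery, which is consistent with the stability claim in the summary.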
Summary written by gemini-2.5-flash-lite from 1 source.