Brief · PulseAugur

RESEARCH · arXiv cs.CL English(EN) · 3w · [93 sources]

Dynamic Chunking for Diffusion Language Models

Researchers are exploring new methods to improve diffusion language models (DLMs), which can offer faster inference than autoregressive models. Several recent papers introduce techniques to enhance DLM performance, including NAVIRA for decoupled remasking, SARDI for retrieval-augmented generation using discarded tokens, and AXON for supportive token revealing. Another study identifies limitations in DLMs, such as a locality bias and distraction from mask tokens, and proposes a mask-agnostic loss function to improve context comprehension. Additionally, a survey gives an overview of the DLM landscape, covering foundational principles, state-of-the-art models, and future research directions.

IMPACT New techniques aim to improve the speed and accuracy of diffusion language models, potentially making them more competitive with autoregressive models.

DCDM
Dynamic Chunking Diffusion Model
Chunking Attention
OpenWebText
RePlaid
arXiv
Hugging Face
Block Approximate Sparse Attention
Dynamic Chunking Diffusion Models
FlashAttention
Diffusion Language Models
DLM-SWAI
Boundary-Guided Policy Optimization
Eso-LMs
Masked Diffusion Models
DiffRetriever
BlockBatch
dgMARK
Dynamic Infilling Anchors (DIA)
PRISM
AXON
Masked Diffusion Language Models
T$^\star$
Hanchen Xia
SARDI
Autoregressive Language Models
NAVIRA
Maksim Kryzhanovskiy
Tianyi Li
Julianna Piskorz