Brief · PulseAugur

TOOL · arXiv cs.LG · 9h

DAPD: Dependency-Aware Parallel Decoding via Attention for Diffusion LLMs

Researchers have introduced Dependency-Aware Parallel Decoding (DAPD), a method for accelerating decoding in diffusion large language models (dLLMs). DAPD uses self-attention to construct a conditional dependency graph, then unmasks tokens in parallel by identifying independent sets within that graph. The approach is training-free and requires no auxiliary models or retraining. It improves the accuracy-steps trade-off and makes better use of the any-order generation capability of dLLMs.
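To make the independent-set idea concrete, here is a minimal Python sketch of one decoding step. It is an illustration under stated assumptions, not the authors' implementation: the attention threshold, the symmetrization by elementwise maximum, the greedy confidence-ordered selection, and the names `build_dependency_graph` and `greedy_independent_set` are all hypothetical choices that the brief does not specify.

```python
import numpy as np

def build_dependency_graph(attn, masked, threshold):
    """Connect two masked positions when attention between them is strong.

    ASSUMPTION: an edge exists if attention in either direction exceeds
    `threshold`; the source does not state how edges are derived.
    """
    sym = np.maximum(attn, attn.T)  # treat dependency as undirected
    adj = {i: set() for i in masked}
    for a, i in enumerate(masked):
        for j in masked[a + 1:]:
            if sym[i, j] > threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def greedy_independent_set(adj, priority):
    """Pick positions in descending priority, skipping neighbors of picks.

    ASSUMPTION: a greedy heuristic ordered by model confidence; it returns
    a maximal (not necessarily maximum) independent set.
    """
    chosen, blocked = [], set()
    for i in sorted(adj, key=lambda p: -priority[p]):
        if i not in blocked:
            chosen.append(i)
            blocked.update(adj[i])
    return sorted(chosen)

# Toy example: 4 masked positions; 0-1 and 2-3 attend strongly to each other.
attn = np.array([
    [0.00, 0.60, 0.05, 0.05],
    [0.50, 0.00, 0.05, 0.05],
    [0.05, 0.05, 0.00, 0.70],
    [0.05, 0.05, 0.40, 0.00],
])
confidence = np.array([0.9, 0.8, 0.3, 0.7])  # hypothetical per-token confidence
masked = [0, 1, 2, 3]

graph = build_dependency_graph(attn, masked, threshold=0.3)
selected = greedy_independent_set(graph, confidence)
print(selected)  # [0, 3]
```

In this toy case positions 0 and 3 are unmasked together in one step because no edge links them, while positions 1 and 2 wait for a later step, once their strongly coupled neighbors have been filled in.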

IMPACT Speeds up inference for diffusion LLMs, which could support wider adoption of these models.

Dream
Diffusion LLMs
Moongyu Jeon