Sakana AI Proposes DiffusionBlocks: a Block-wise Training Framework That Converts Residual Networks into Independently Trainable Denoising Modules
Sakana AI has introduced DiffusionBlocks, a framework for training neural networks with less memory. The method partitions a network into multiple blocks and trains each block independently, so only one block's layers are processed at a time rather than the full network. This significantly cuts memory requirements during training without sacrificing performance across various architectures. The approach builds on the connection between residual networks and diffusion models, treating each residual connection as a discretized denoising step.
Impact: Reduces training memory requirements for deep neural networks, potentially enabling larger models and faster iteration cycles.
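To make the two ideas in the summary concrete, here is a minimal toy sketch in plain Python. It is not Sakana AI's implementation: the helper names `partition_layers` and `residual_forward` are hypothetical, the "layers" are simple scalar functions, and no training or denoising objective is shown. It only illustrates (a) splitting a stack of layers into contiguous blocks, so the largest block bounds how many layers are handled at once, and (b) why a residual update `x + f(x)` has the same form as one discrete step of an iterative refinement process.

```python
def partition_layers(num_layers, num_blocks):
    """Split layer indices 0..num_layers-1 into contiguous, near-equal blocks.

    Hypothetical helper for illustration; the actual partitioning rule
    used by DiffusionBlocks is not described in this summary.
    """
    base, extra = divmod(num_layers, num_blocks)
    blocks, start = [], 0
    for b in range(num_blocks):
        # The first `extra` blocks take one additional layer each.
        size = base + (1 if b < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def residual_forward(x, layers):
    """Apply residual layers in sequence: x <- x + f(x).

    Each update has the form of one discrete (Euler-style) step,
    which is the analogy the summary draws with denoising steps.
    """
    for f in layers:
        x = x + f(x)
    return x


# Toy network: 12 identical "layers", each pulling x halfway toward 0.
layers = [lambda x: -0.5 * x for _ in range(12)]
blocks = partition_layers(len(layers), num_blocks=4)

# Layers handled at once if blocks are processed one at a time.
peak_layers = max(len(block) for block in blocks)

# Running one block is just running its slice of residual steps.
first_block = [layers[i] for i in blocks[0]]
after_first_block = residual_forward(8.0, first_block)
```

In this toy setup each of the 4 blocks holds 3 of the 12 layers, so a block-at-a-time procedure touches 3 layers at once instead of 12. How each block is actually trained as a denoiser is beyond what this summary specifies.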