ChunkFT: Byte-Streamed Optimization for Memory-Efficient Full Fine-Tuning
Researchers have developed ChunkFT, a novel framework designed to significantly reduce the memory required for full-parameter fine-tuning of large language models. This method dynamically activates a working set of parameters, enabling gradient computation on sub-tensors without altering the model architecture. Experiments show ChunkFT can fine-tune models like Llama 3-8B on a single consumer GPU, achieving performance comparable to traditional full fine-tuning while using substantially less memory.
AI IMPACT: Enables fine-tuning of large language models on consumer hardware, potentially democratizing advanced model customization.
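
The summary describes activating only a working set of parameters at a time and computing gradients on sub-tensors. The toy sketch below illustrates that general idea in plain Python and is not ChunkFT's actual algorithm: the function names (`chunk_gradient`, `chunked_finetune`), the quadratic loss, the round-robin chunk schedule, and the momentum optimizer are all assumptions chosen for illustration. Every parameter is still trained, but gradient and optimizer state exist only for the currently active chunk.

```python
from typing import List, Tuple


def chunk_gradient(w: List[float], target: List[float],
                   start: int, stop: int) -> List[float]:
    """Gradient of sum((w - target)^2), restricted to indices [start, stop)."""
    return [2.0 * (w[i] - target[i]) for i in range(start, stop)]


def chunked_finetune(w: List[float], target: List[float], chunk_size: int,
                     sweeps: int, inner_steps: int = 5,
                     lr: float = 0.1, beta: float = 0.5
                     ) -> Tuple[List[float], int]:
    """Update all parameters, one chunk at a time (hypothetical schedule).

    Returns the trained parameters and the largest optimizer-state size
    that was ever alive at once.
    """
    w = list(w)  # work on a copy so the caller's list is untouched
    peak_state = 0
    for _ in range(sweeps):
        # Round-robin over chunks: an assumed schedule, chosen for simplicity.
        for start in range(0, len(w), chunk_size):
            stop = min(start + chunk_size, len(w))
            # Optimizer state is allocated only for the active chunk.
            momentum = [0.0] * (stop - start)
            peak_state = max(peak_state, len(momentum))
            for _ in range(inner_steps):
                grad = chunk_gradient(w, target, start, stop)
                for j, g in enumerate(grad):
                    momentum[j] = beta * momentum[j] + g
                    w[start + j] -= lr * momentum[j]
    return w, peak_state


if __name__ == "__main__":
    target = [float(i) for i in range(8)]
    trained, peak = chunked_finetune([0.0] * 8, target, chunk_size=2, sweeps=20)
    print("peak optimizer-state entries:", peak, "of", len(trained), "parameters")
```

In this sketch the optimizer state never exceeds `chunk_size` entries regardless of the total parameter count, which is the kind of saving the summary attributes to working on sub-tensors. A real implementation would also have to manage activation memory and framework-level gradient buffers, which the toy example does not model.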