Researchers have found that fine-tuning a single layer of a large language model (LLM) can be as effective as tuning the entire model when using Zeroth-Order (ZO) optimization. This dominant layer, identified by analyzing activation outliers before training, consistently matches or surpasses full-model ZO fine-tuning across various tasks and model families. Its effectiveness stems from its high perturbation sensitivity and early position in the residual stream, which lets optimization signals propagate efficiently. In the reported experiments, the method delivers up to a 4.52x training speedup while maintaining or improving performance.
AI IMPACT This research could significantly reduce the computational cost and time required for fine-tuning LLMs, making advanced model adaptation more accessible.
RANK_REASON The cluster contains a research paper detailing a novel method for fine-tuning LLMs.
AI-generated summary · Google Gemini · from 1 source.
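To make the idea of single-layer ZO fine-tuning concrete, here is a minimal sketch of one zeroth-order update restricted to a single layer. It rests on several assumptions that are not in the summary: a two-point (SPSA-style) estimator stands in for whatever ZO estimator the paper uses, a toy two-layer NumPy network stands in for an LLM, and the names `loss_fn`, `zo_step_single_layer`, `layer0`, and `layer1` are hypothetical. The layer to tune is simply passed in; the activation-outlier analysis that selects the dominant layer is not shown.

```python
import numpy as np


def loss_fn(params, x, y):
    """Mean squared error of a toy two-layer tanh network (stand-in for an LLM)."""
    hidden = np.tanh(x @ params["layer0"])
    pred = hidden @ params["layer1"]
    return float(np.mean((pred - y) ** 2))


def zo_step_single_layer(params, layer, x, y, lr=1e-2, eps=1e-3, rng=None):
    """One two-point zeroth-order step that updates only params[layer].

    Assumption: an SPSA-style estimator; the paper's exact estimator may differ.
    """
    rng = np.random.default_rng() if rng is None else rng
    original = params[layer]
    # Random perturbation direction, drawn only for the chosen layer.
    z = rng.standard_normal(original.shape)

    # Two forward passes with the chosen layer perturbed in opposite directions.
    params[layer] = original + eps * z
    loss_plus = loss_fn(params, x, y)
    params[layer] = original - eps * z
    loss_minus = loss_fn(params, x, y)

    # Scalar estimate of the directional derivative along z (no backpropagation).
    g = (loss_plus - loss_minus) / (2.0 * eps)

    # Update the chosen layer only; every other layer is left untouched.
    params[layer] = original - lr * g * z
    return loss_fn(params, x, y)
```

Calling `zo_step_single_layer(params, "layer0", x, y)` in a loop would tune only `layer0`. Because perturbations are sampled and applied for one layer rather than for all parameters, each step touches far fewer weights, which is one plausible source of the kind of speedup the summary reports.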