Distilling Linearized Behavior into Non-Linear Fine-Tuning for Effective Task Arithmetic
Researchers have developed a method that combines the benefits of linear and non-linear fine-tuning for large language models. The approach distills the desirable properties of linearized models, which are well suited to task arithmetic operations such as model merging, into standard non-linear fine-tuned models. The result supports effective task composition and strong benchmark performance without the inference-time cost of purely linearized models.
AI IMPACT: Enables more efficient and effective task arithmetic in language models without increased inference cost.
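For readers unfamiliar with task arithmetic, the sketch below shows the commonly used task-vector formulation: a task vector is the difference between fine-tuned and pretrained weights, and models are merged by adding scaled task vectors back to the pretrained weights. This is an illustration under stated assumptions, not the distillation method summarized above. The toy parameter dictionaries, the function names `task_vector` and `apply_task_vectors`, and the scaling coefficient `alpha` are all hypothetical.

```python
# Illustrative sketch only: toy "models" are dicts mapping a parameter name
# to a list of floats. Real checkpoints would hold tensors instead.


def task_vector(pretrained, finetuned):
    """Return the per-parameter difference finetuned - pretrained."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def apply_task_vectors(pretrained, vectors, alpha=1.0):
    """Return pretrained + alpha * (sum of task vectors), per parameter."""
    merged = {}
    for name, base in pretrained.items():
        merged[name] = [
            b + alpha * sum(v[name][i] for v in vectors)
            for i, b in enumerate(base)
        ]
    return merged


# Hypothetical weights: one pretrained model and two fine-tuned variants.
theta_0 = {"w": [1.0, 2.0], "b": [0.0]}
theta_a = {"w": [1.5, 2.0], "b": [0.25]}
theta_b = {"w": [1.0, 3.0], "b": [-0.5]}

tau_a = task_vector(theta_0, theta_a)
tau_b = task_vector(theta_0, theta_b)

# Addition composes both tasks into one set of weights.
merged = apply_task_vectors(theta_0, [tau_a, tau_b], alpha=1.0)

# A negative coefficient moves the weights away from a task.
negated = apply_task_vectors(theta_0, [tau_a], alpha=-1.0)

print(merged)
print(negated)
```

Merging of this kind tends to work best when the task vectors interfere little with one another, which is the property linearized fine-tuning is credited with in the summary above. In practice `alpha` is usually tuned on held-out data rather than fixed at 1.0.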