Brief · PulseAugur

TOOL · arXiv cs.AI English(EN) · 1d

Distilling Linearized Behavior into Non-Linear Fine-Tuning for Effective Task Arithmetic

Researchers have developed a method to combine the benefits of linear and non-linear fine-tuning for large language models. Their approach distills the desirable properties of linearized models, which are well suited to task arithmetic operations such as model merging, into standard non-linear fine-tuned models. This allows for effective task composition and strong benchmark performance without the inference-time costs associated with purely linearized models.
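For readers new to the term, task arithmetic in its generic form can be sketched as follows. This is a toy illustration only, not the paper's distillation method: the dict-of-lists "models", the function names `task_vector` and `merge`, and the scaling factor `alpha` are all assumptions made for the example.

```python
# Hypothetical illustration of generic task arithmetic on toy "models".
# Each model is a dict mapping a parameter name to a flat list of floats.
# None of these names come from the paper; they are assumptions for the sketch.

def task_vector(pretrained, finetuned):
    """Return the task vector: fine-tuned weights minus pre-trained weights."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }

def merge(pretrained, task_vectors, alpha=1.0):
    """Add the scaled sum of the task vectors to the pre-trained weights."""
    merged = {}
    for name, base in pretrained.items():
        total = [0.0] * len(base)
        for tv in task_vectors:
            total = [t + d for t, d in zip(total, tv[name])]
        merged[name] = [b + alpha * t for b, t in zip(base, total)]
    return merged

# Toy weights: one shared pre-trained model and two fine-tuned variants.
pretrained = {"w": [1.0, 2.0, 3.0]}
finetuned_a = {"w": [1.5, 2.0, 3.0]}
finetuned_b = {"w": [1.0, 2.0, 2.0]}

tau_a = task_vector(pretrained, finetuned_a)
tau_b = task_vector(pretrained, finetuned_b)
merged = merge(pretrained, [tau_a, tau_b], alpha=1.0)
print(merged)
```

In this toy case the two task vectors touch different coordinates, so the merged weights carry both edits without interference. Real fine-tuned models generally do not separate this cleanly, which is the kind of difficulty that linearized fine-tuning is meant to ease.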

IMPACT Enables effective task arithmetic in language models without the added inference cost of linearized models.

Hugging Face
arXiv
Angelo Porrello