From "Weak" Signals to Strong Models: Preference Delta Aggregation with LoRA Merging
Researchers have developed Preference Delta Aggregation (PDA), a framework that improves large language models by combining multiple "weak" supervision signals, each derived from comparing a pair of less capable models. Because merging the resulting LoRA adapters can cause them to interfere with one another, the authors also introduce Geometric Alignment Merging (GAM), a method that aligns adapter subspaces before aggregation. In evaluations on knowledge reasoning and agentic search tasks, PDA with GAM outperformed single-signal methods, and performance improved with each additional signal incorporated.
AI IMPACT: Introduces a novel method for improving LLM training efficiency and performance by aggregating weak preference signals.
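
To make the idea of aligning adapter subspaces before aggregation concrete, here is a minimal NumPy sketch. It is not the paper's GAM algorithm, whose details this summary does not give. As a stand-in it uses orthogonal Procrustes alignment: each LoRA adapter (B, A) is rotated within its rank-r space toward a reference adapter, which leaves its weight update B @ A unchanged, and the factors are then averaged. The function names (`procrustes_rotation`, `align_adapter`, `merge_aligned`) and the choice of the first adapter as reference are assumptions made for illustration.

```python
import numpy as np


def procrustes_rotation(B, B_ref):
    """Orthogonal R (r x r) minimizing ||B @ R - B_ref||_F."""
    U, _, Vt = np.linalg.svd(B.T @ B_ref)
    return U @ Vt


def align_adapter(B, A, B_ref):
    """Rotate one adapter's factors toward the reference.

    Because R is orthogonal, (B @ R) @ (R.T @ A) == B @ A, so the
    adapter's weight update is unchanged; only its factorization moves.
    """
    R = procrustes_rotation(B, B_ref)
    return B @ R, R.T @ A


def merge_aligned(adapters):
    """Average LoRA factors after aligning each to the first adapter.

    `adapters` is a list of (B, A) pairs with B of shape (d_out, r)
    and A of shape (r, d_in). Illustrative stand-in, not the paper's GAM.
    """
    B_ref = adapters[0][0]
    aligned = [align_adapter(B, A, B_ref) for B, A in adapters]
    B_avg = np.mean([B for B, _ in aligned], axis=0)
    A_avg = np.mean([A for _, A in aligned], axis=0)
    return B_avg, A_avg


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    B1, A1 = rng.normal(size=(8, 2)), rng.normal(size=(2, 6))
    # Second adapter: same update as the first, expressed in a rotated basis.
    Q, _ = np.linalg.qr(rng.normal(size=(2, 2)))
    B2, A2 = B1 @ Q, Q.T @ A1

    naive = ((B1 + B2) / 2) @ ((A1 + A2) / 2)  # average factors directly
    Bm, Am = merge_aligned([(B1, A1), (B2, A2)])

    print("naive merge error:  ", np.linalg.norm(naive - B1 @ A1))
    print("aligned merge error:", np.linalg.norm(Bm @ Am - B1 @ A1))
```

The toy case shows why alignment can matter: two adapters that encode the same update in different bases distort each other when their factors are averaged directly, while the aligned merge recovers the shared update.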