Brief · PulseAugur

TOOL · arXiv cs.AI · 1d

Federated Variational Preference Alignment with Gumbel-Softmax Prior for Personalized User Preferences

Researchers propose Federated Variational Preference Alignment with Gumbel-Softmax Prior (FedVPA-GP), a framework for personalizing large language models in a federated learning setting. The approach aims to disentangle conflicting user preferences, such as helpfulness versus harmlessness, while keeping user data private. It introduces two components: a Federated Mixture Prior, which stabilizes variational inference, and an Orthogonal Loss, which enforces separation between preference prototypes. In the reported experiments, FedVPA-GP outperforms monolithic baselines.
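The brief does not give the formulas behind these components, so the following is only a minimal sketch of two standard building blocks that the names suggest: a Gumbel-Softmax relaxation for softly assigning a user to one of several preference prototypes, and a generic orthogonality penalty that grows as prototypes become more similar. The function names `gumbel_softmax` and `orthogonal_penalty`, the temperature `tau`, and the toy prototype vectors are all assumptions made for illustration; they are not taken from the paper and may differ from what FedVPA-GP actually does.

```python
import numpy as np

def gumbel_softmax(logits, tau, rng):
    """Draw one relaxed one-hot sample over K categories (standard Gumbel-Softmax)."""
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    g = -np.log(-np.log(u))           # Gumbel(0, 1) noise
    y = (logits + g) / tau            # lower tau -> closer to one-hot
    y = y - y.max()                   # subtract max for numerical stability
    e = np.exp(y)
    return e / e.sum()

def orthogonal_penalty(prototypes):
    """Sum of squared off-diagonal cosine similarities between prototype rows.

    Hypothetical stand-in for an orthogonal loss: it is zero when all
    prototypes are mutually orthogonal and grows as they align.
    """
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    gram = p @ p.T
    off_diag = gram - np.eye(len(p))
    return float(np.sum(off_diag ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy logits over two hypothetical prototypes, e.g. helpfulness and harmlessness.
    weights = gumbel_softmax(np.array([2.0, 0.5]), tau=0.5, rng=rng)
    print("soft assignment:", weights)
    print("penalty, orthogonal prototypes:", orthogonal_penalty(np.eye(2)))
    print("penalty, identical prototypes:",
          orthogonal_penalty(np.array([[1.0, 0.0], [1.0, 0.0]])))
```

In this sketch the penalty is zero for orthogonal prototypes and largest when they coincide, which is the behavior one would expect from a loss meant to keep preference prototypes separated.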

IMPACT Enables more nuanced and personalized LLM behavior by disentangling conflicting user preferences in a privacy-preserving manner.

Large Language Models
Federated Learning
FedVPA-GP
Variational Preference Learning
HH-RLHF dataset