PRISM: Preference-Aware Influence Function Based Data Selection Method for Efficient Fine-Tuning
Researchers have developed PRISM, a data selection method for efficient fine-tuning of large language models that prioritizes high-value training data. PRISM weights target examples according to model preference, producing a preference-aware target direction. The limited training budget is then spent on the data samples that most effectively steer the model toward the desired behaviors. The researchers report that PRISM outperforms existing methods in both general fine-tuning and safety alignment.
AI IMPACT: Improves LLM training efficiency by optimizing data selection, potentially reducing costs and improving model alignment.
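The selection idea in the summary (weight the target examples, combine them into one target direction, then spend the budget on the training samples best aligned with it) can be sketched roughly as follows. This is an illustrative sketch, not PRISM's actual implementation. The summary does not say how the preference weights are computed, so the softmax over per-target `preference_scores` is an assumption, as are all function and variable names. The gradient inner product is used as a simple first-order stand-in for an influence score and omits any Hessian term.

```python
import numpy as np


def preference_weights(preference_scores, temperature=1.0):
    """Softmax over per-target preference scores (assumed weighting scheme)."""
    scores = np.asarray(preference_scores, dtype=float) / temperature
    scores = scores - scores.max()  # subtract the max for numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()


def select_training_examples(train_grads, target_grads, preference_scores, budget):
    """Pick the `budget` training examples best aligned with the weighted target direction.

    train_grads:  (n_train, d) per-example training gradients (hypothetical input)
    target_grads: (n_target, d) per-example target gradients (hypothetical input)
    """
    train_grads = np.asarray(train_grads, dtype=float)
    target_grads = np.asarray(target_grads, dtype=float)

    # Preference-aware target direction: weighted sum of target gradients.
    weights = preference_weights(preference_scores)
    target_direction = weights @ target_grads

    # First-order influence proxy: inner product with the target direction.
    scores = train_grads @ target_direction

    # Highest-scoring examples first; stable sort keeps ties in input order.
    order = np.argsort(-scores, kind="stable")
    return order[:budget], scores


# Toy data: four training gradients, two target gradients in 2-D.
train_grads = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.5, 0.5]]
target_grads = [[1.0, 0.0], [0.0, 1.0]]
preference_scores = [2.0, 0.0]  # the first target example gets more weight

selected, scores = select_training_examples(
    train_grads, target_grads, preference_scores, budget=2
)
print(selected.tolist())
```

In this toy setup the first target example dominates the direction, so the training examples whose gradients point along the first axis score highest. In practice the gradients would come from the model being fine-tuned, typically after some dimensionality reduction, which this sketch leaves out.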