Coherence Maximization Improves Pluralistic Alignment
Researchers have developed a method called Internal Coherence Maximization (ICM) to generate persona-specific examples for aligning AI systems with diverse human values. The approach infers labels by maximizing the predictability of examples, allowing AI models to be steered toward a target group's values without extensive human supervision. In experiments across four benchmarks, ICM-inferred examples performed comparably to human-labeled data, and coherence proved to be a critical factor for better generalization.
AI IMPACT: Introduces a novel method for scalable value specification in AI, potentially improving alignment with diverse human values.
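
To make the idea of inferring labels by maximizing predictability concrete, here is a minimal toy sketch. It is not the researchers' implementation: a 3-nearest-neighbour vote over one-dimensional points stands in for the AI model's predictions, and every name in it (`predict`, `coherence`, `infer_labels`, `XS`, `NOISY`) is hypothetical. The score counts how many examples have a label that the other examples would predict, and a greedy search flips one label at a time whenever that raises the score. This toy score is also maximized by giving every example the same label, so the sketch depends on a reasonable starting labeling.

```python
from typing import List


def predict(i: int, xs: List[float], labels: List[int], k: int = 3) -> int:
    """Predict the label of example i from its k nearest *other* examples.

    Assumption: this nearest-neighbour vote is a stand-in for asking a
    model to predict one label given all the other labeled examples.
    """
    others = sorted(
        (j for j in range(len(xs)) if j != i),
        key=lambda j: abs(xs[j] - xs[i]),
    )
    votes = [labels[j] for j in others[:k]]
    return 1 if sum(votes) * 2 > len(votes) else 0


def coherence(xs: List[float], labels: List[int]) -> int:
    """Count the examples whose label is predictable from the rest."""
    return sum(predict(i, xs, labels) == labels[i] for i in range(len(xs)))


def infer_labels(xs: List[float], init: List[int]) -> List[int]:
    """Greedy search: keep a single-label flip only if coherence rises."""
    labels = list(init)
    best = coherence(xs, labels)
    improved = True
    while improved:
        improved = False
        for i in range(len(labels)):
            labels[i] ^= 1  # tentatively flip label i
            score = coherence(xs, labels)
            if score > best:
                best = score
                improved = True
            else:
                labels[i] ^= 1  # no gain, undo the flip
    return labels


# Hypothetical data: two well-separated groups, one wrong label in each.
XS = [0.0, 0.5, 1.0, 1.5, 9.0, 9.5, 10.0, 10.5]
NOISY = [1, 0, 1, 1, 0, 0, 1, 0]

if __name__ == "__main__":
    inferred = infer_labels(XS, NOISY)
    print(inferred, coherence(XS, inferred))
```

On this toy data the search repairs the two inconsistent labels without any human-provided ground truth, which is the intuition behind using coherence as the supervision signal.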