Spectral DPPs via NEPv: A Scalable Continuous Relaxation of Determinantal MAP for Diversity-Aware Data Selection
Researchers have developed a method called Spectral DPPs via NEPv for the NP-hard problem of selecting diverse, high-quality subsets from large datasets. The approach recasts the determinantal point process (DPP) MAP objective as a continuous optimization problem on the Stiefel manifold, which leads to a Nonlinear Eigenvalue Problem with eigenvector dependency (NEPv). The proposed solver, OurMethod, integrates with common machine learning kernels and scales near-linearly with the size of the candidate pool.
AI IMPACT: This method could improve efficiency in data curation and subset selection for training large AI models.
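
For readers unfamiliar with the discrete problem being relaxed, the sketch below shows the standard determinantal MAP objective, log det(L_S) for a size-k subset S of a kernel matrix L, together with the classic greedy baseline. It is not the NEPv solver described in the summary, whose details are not given here. The function names, the RBF similarity, the quality weights, and the toy data are all assumptions chosen for illustration.

```python
import numpy as np


def log_det_objective(L, subset):
    """Discrete DPP MAP objective: log det of the kernel restricted to `subset`."""
    idx = np.asarray(subset)
    sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
    # A non-positive determinant means the subset is degenerate for this kernel.
    return logdet if sign > 0 else -np.inf


def greedy_dpp_map(L, k):
    """Greedy baseline (illustrative, not the summarized method):
    repeatedly add the item that most increases log det(L_S)."""
    n = L.shape[0]
    selected = []
    for _ in range(k):
        best_item, best_val = None, -np.inf
        for i in range(n):
            if i in selected:
                continue
            val = log_det_objective(L, selected + [i])
            if val > best_val:
                best_item, best_val = i, val
        if best_item is None:  # no candidate keeps the determinant positive
            break
        selected.append(best_item)
    return selected


# Toy candidate pool (assumed): 50 items with 8-dimensional features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))
quality = rng.uniform(0.5, 1.5, size=50)  # hypothetical per-item quality scores

# Quality-weighted RBF kernel: L_ij = q_i * exp(-||x_i - x_j||^2 / 2) * q_j.
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
L = quality[:, None] * np.exp(-0.5 * sq_dists) * quality[None, :]

subset = greedy_dpp_map(L, k=5)
print("selected indices:", subset)
print("log det objective:", log_det_objective(L, subset))
```

This naive greedy loop recomputes a determinant for every candidate at every step, so its cost grows quickly with the pool size. That cost is the kind of bottleneck a continuous relaxation with near-linear scaling is meant to avoid.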