FIGMA: Towards FIne-Grained Music retrievAl
Researchers have developed FIGMA, a new architecture for fine-grained music retrieval from natural language descriptions. Unlike previous models, which struggle with detailed musical attributes such as tempo or key, FIGMA uses a multi-view contrastive approach to capture both high-level semantics and frame-level musical details. The advance is supported by the new FGMCaps dataset, which contains 380,000 music-caption pairs and has enabled FIGMA to significantly outperform existing models on music retrieval tasks.
AI IMPACT Enhances AI's ability to understand and retrieve music based on detailed musical attributes, potentially impacting music discovery and curation tools.
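
To make the idea of a multi-view contrastive objective more concrete, here is a minimal sketch in plain Python. It is not FIGMA's actual loss, which the summary does not specify. It assumes two views per track, a clip-level embedding and a list of frame-level embeddings, and combines a symmetric InfoNCE loss over each. The function names (`cosine`, `info_nce`, `multi_view_loss`), the max-over-frames pooling, the temperature, and the `frame_weight` parameter are all hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(sim, temperature=0.07):
    """Symmetric InfoNCE over a square similarity matrix.

    Row i and column i are assumed to be the matched music/caption pair.
    """
    n = len(sim)

    def one_direction(rows):
        total = 0.0
        for i, row in enumerate(rows):
            logits = [s / temperature for s in row]
            m = max(logits)  # subtract the max for numerical stability
            log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
            total += log_denom - logits[i]  # negative log-softmax of the match
        return total / n

    cols = [list(c) for c in zip(*sim)]  # transpose for the caption-to-music direction
    return 0.5 * (one_direction(sim) + one_direction(cols))


def multi_view_loss(clip_embs, frame_embs, text_embs, frame_weight=0.5):
    """Hypothetical two-view loss: clip-level view plus frame-level view."""
    # Global view: one embedding per track compared with each caption.
    global_sim = [[cosine(c, t) for t in text_embs] for c in clip_embs]
    # Frame view (assumed pooling): score a caption by its best-matching frame.
    frame_sim = [
        [max(cosine(f, t) for f in frames) for t in text_embs]
        for frames in frame_embs
    ]
    return info_nce(global_sim) + frame_weight * info_nce(frame_sim)


# Toy data: two tracks, two frames each, two captions.
clip_embs = [[1.0, 0.0], [0.0, 1.0]]
frame_embs = [[[1.0, 0.0], [0.9, 0.1]], [[0.0, 1.0], [0.1, 0.9]]]
text_embs = [[1.0, 0.0], [0.0, 1.0]]

aligned = multi_view_loss(clip_embs, frame_embs, text_embs)
shuffled = multi_view_loss(clip_embs, frame_embs, text_embs[::-1])
```

On this toy data the correctly paired captions give a much lower loss than the swapped ones, which is the behavior a contrastive retrieval objective is meant to reward. A frame-level term of this kind is one plausible way to let time-local attributes such as tempo changes influence the match, though the real model may handle this differently.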