Researchers evaluated pretrained music embeddings for recognizing jazz standards from audio, a task made difficult by variation in performance, tempo, and arrangement across recordings. A Harmonic CNN model trained from scratch overfit to the performances in its training set, while pretrained embeddings from foundation models gave better retrieval results but were sensitive to performer identity. A lightweight contrastive projection helped mitigate this sensitivity, suggesting that jazz standard recognition can serve as a valuable benchmark for music representation models.
AI IMPACT This research could lead to improved music information retrieval systems, particularly for complex genres like jazz.
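To make the idea of a contrastive projection on top of frozen embeddings concrete, here is a minimal sketch. The summary does not state which loss, projection architecture, or hyperparameters the paper uses, so everything below is an assumption: a single linear projection and a supervised InfoNCE-style loss in which other performances of the same standard are treated as positives. The names `project`, `contrastive_loss`, and `tune_ids` are hypothetical and chosen for illustration only.

```python
import numpy as np


def project(embeddings, weight):
    """Apply a linear projection, then L2-normalise each row.

    `embeddings` stands in for frozen pretrained features (n_items x d_in);
    `weight` is the only trainable part (d_in x d_out). Both are assumptions.
    """
    z = embeddings @ weight
    return z / np.linalg.norm(z, axis=1, keepdims=True)


def contrastive_loss(z, tune_ids, temperature=0.1):
    """Supervised InfoNCE-style loss over a batch of projected embeddings.

    For each anchor, other items with the same tune id are positives and
    every other item in the batch is a negative. Anchors with no positive
    in the batch are skipped (at least one anchor must have a positive).
    """
    tune_ids = np.asarray(tune_ids)
    n = len(tune_ids)
    not_self = ~np.eye(n, dtype=bool)
    positives = (tune_ids[:, None] == tune_ids[None, :]) & not_self

    # Cosine similarities scaled by temperature; mask out self-similarity.
    sim = np.where(not_self, (z @ z.T) / temperature, -np.inf)

    # Numerically stable log-softmax over all other items in the batch.
    row_max = sim.max(axis=1, keepdims=True)
    shifted = sim - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Mean log-probability of the positives, per anchor that has any.
    has_pos = positives.any(axis=1)
    pos_sum = np.where(positives, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_sum[has_pos] / positives.sum(axis=1)[has_pos]
    return float(per_anchor.mean())
```

Minimizing a loss of this kind pulls different performances of the same standard together in the projected space, which is one plausible way to reduce sensitivity to performer identity. In practice `weight` would be trained with an autodiff framework; this sketch only evaluates the loss.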
RANK_REASON Academic paper published on arXiv detailing research findings.
AI-generated summary · Google Gemini · from 2 sources.