The Relative Instability of Model Comparison with Cross-validation
A new paper on arXiv shows that cross-validation, a standard statistical technique for comparing machine learning models, can yield unstable and therefore invalid inferences about which model performs better. The research specifically highlights that the Lasso and soft-thresholding methods, although each is stable when evaluated on its own, can produce unreliable results when compared against each other. This instability calls into question the routine use of cross-validation for model comparison unless relative stability has first been verified.
AI IMPACT: Highlights potential flaws in standard model evaluation techniques and urges caution when interpreting comparative results.
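
To make the kind of comparison at issue concrete, here is a minimal sketch of a K-fold cross-validation interval for the difference in test error between two soft-thresholded least squares fits. It is an illustration under stated assumptions, not the paper's procedure: the function names, the synthetic data, the squared-error loss, the two threshold values, and the normal-approximation interval are all hypothetical choices made here. The interval it returns is exactly the sort of output whose validity depends on relative stability, so it should not be read as trustworthy by default.

```python
import numpy as np


def soft_threshold(v, lam):
    """Shrink each entry of v toward zero by lam, zeroing entries below lam."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)


def fit_soft_thresholded_ls(X, y, lam):
    """Ordinary least squares followed by soft-thresholding (assumed setup)."""
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    return soft_threshold(beta_ols, lam)


def cv_difference_interval(X, y, lam_a, lam_b, n_folds=5, z=1.96, seed=0):
    """Naive K-fold CV estimate and normal-approximation interval for the
    difference in squared test error between thresholds lam_a and lam_b."""
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = np.array_split(rng.permutation(n), n_folds)
    diffs = np.empty(n)
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        beta_a = fit_soft_thresholded_ls(X[train_mask], y[train_mask], lam_a)
        beta_b = fit_soft_thresholded_ls(X[train_mask], y[train_mask], lam_b)
        err_a = (y[test_idx] - X[test_idx] @ beta_a) ** 2
        err_b = (y[test_idx] - X[test_idx] @ beta_b) ** 2
        # Per-observation loss difference on held-out data.
        diffs[test_idx] = err_a - err_b
    mean = diffs.mean()
    half_width = z * diffs.std(ddof=1) / np.sqrt(n)
    return mean, mean - half_width, mean + half_width


# Hypothetical sparse low-dimensional linear model for demonstration only.
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.normal(size=(200, 10))
beta_true = np.zeros(10)
beta_true[:3] = [1.0, -0.5, 0.25]
y_demo = X_demo @ beta_true + demo_rng.normal(size=200)

estimate, lower, upper = cv_difference_interval(X_demo, y_demo, 0.05, 0.5)
print(f"CV error difference: {estimate:.3f}, interval: [{lower:.3f}, {upper:.3f}]")
```

Rerunning the sketch with different values of `seed` or different draws of the data is one informal way to see how much the estimated difference and its interval move around; a formal check of relative stability is beyond what this example attempts.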