Guiding Multi-Objective Genetic Programming with Description Length Improves Symbolic Regression Solutions
Researchers propose using description length (DL) and fractional Bayes factor (FBF) criteria to improve symbolic regression with genetic programming. These criteria favor compact, generalizable expressions, mitigating overfitting and structural bloat, especially on noisy data. The study compared several search and selection strategies and found that applying DL/FBF as a post-selection step improves test performance over traditional AIC/BIC baselines, whereas using DL/FBF directly as the fitness function can lead to premature convergence.
AI IMPACT: Introduces refined model-selection techniques for genetic programming that may improve the accuracy and generalizability of symbolic regression models.
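
To make the idea of post-selection concrete, here is a minimal sketch of the AIC/BIC baseline mentioned above: once the search has produced a set of candidate expressions, each is scored and the lowest-scoring one is kept. The candidate list, the residual values, and the helper names `gaussian_ic` and `select_by_bic` are hypothetical, invented for illustration. The scores assume a least-squares fit with Gaussian errors and omit additive constants. The DL and FBF criteria from the study are not reproduced here, since the summary does not give their formulas.

```python
import math

def gaussian_ic(rss, n, k):
    """Return (AIC, BIC) for a least-squares fit with Gaussian errors.

    rss: residual sum of squares, n: number of samples,
    k: number of fitted parameters. Additive constants are dropped,
    so values are only meaningful for comparing models on the same data.
    """
    neg2_loglik = n * math.log(rss / n)
    aic = neg2_loglik + 2 * k
    bic = neg2_loglik + k * math.log(n)
    return aic, bic

# Hypothetical candidates from a finished search:
# (expression, residual sum of squares, number of fitted constants)
candidates = [
    ("c0", 120.0, 1),
    ("c0 + c1*x", 40.0, 2),
    ("c0 + c1*x + c2*x**2", 12.0, 3),
    ("c0 + c1*x + c2*x**2 + c3*sin(c4*x)", 11.5, 5),
]
n_samples = 50  # assumed training-set size

def select_by_bic(candidates, n):
    """Post-selection step: keep the candidate with the lowest BIC."""
    return min(candidates, key=lambda c: gaussian_ic(c[1], n, c[2])[1])

best = select_by_bic(candidates, n_samples)
print(best[0])
```

In this toy data the largest expression has the smallest residual, but its two extra constants cost more under BIC than the small gain in fit, so the quadratic is selected. Because the criterion is applied only after the search finishes, it does not influence which expressions the search explores.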