TailedTS: Benchmark Dataset for Heavy-Tailed Time Series Prediction and Periodicity Quantification
Researchers have introduced TailedTS, a benchmark dataset for evaluating time series forecasting models on heavy-tailed, zero-inflated, non-Gaussian data. Derived from Wikipedia page view data, TailedTS contains approximately 24.69 billion data points. A small percentage of pages receives the majority of views, which makes the dataset a challenging testbed for model robustness. The dataset also supports research into periodicity quantification and standardized prediction benchmarks that use non-Gaussian loss functions; the benchmarks show that standard estimators perform poorly on high-volume data.
AI impact: Introduces a new dataset intended to improve the robustness of time series forecasting models against extreme volatility and non-Gaussian distributions.
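The three ideas in the summary (view concentration, periodicity, and non-Gaussian losses) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names `top_share`, `autocorrelation`, and `poisson_nll` are hypothetical, sample autocorrelation is only one simple way to quantify periodicity, and Poisson negative log-likelihood is only one example of a non-Gaussian loss. None of it is taken from the TailedTS benchmark itself.

```python
import math


def top_share(totals, frac=0.01):
    """Fraction of all views captured by the top `frac` of pages (hypothetical helper)."""
    ranked = sorted(totals, reverse=True)
    k = max(1, int(len(ranked) * frac))  # always keep at least one page
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


def autocorrelation(series, lag):
    """Sample autocorrelation at `lag`; a high value hints at a period of `lag` steps."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0 or lag >= n:
        return 0.0  # constant series or lag out of range: no usable signal
    cov = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return cov / var


def poisson_nll(y_true, y_pred, eps=1e-9):
    """Mean Poisson negative log-likelihood, omitting the log(y!) constant.

    Assumed here as a stand-in for a count-oriented, non-Gaussian loss.
    """
    return sum(p - y * math.log(p + eps) for y, p in zip(y_true, y_pred)) / len(y_true)


# Toy data (synthetic, not from TailedTS): one very popular page among 100,
# and a zero-inflated series with a weekly spike.
page_totals = [1000] + [1] * 99
weekly = [0, 0, 0, 0, 0, 5, 9] * 8

share = top_share(page_totals, frac=0.01)
ac_week = autocorrelation(weekly, lag=7)
ac_other = autocorrelation(weekly, lag=3)
```

On the toy data, the single popular page dominates the view total, and the lag-7 autocorrelation is much higher than the lag-3 value, which is the kind of signal a periodicity measure would pick up in weekly traffic patterns.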