PulseAugur

New TailedTS dataset challenges time series models with heavy-tailed data

Researchers have introduced TailedTS, a benchmark dataset for evaluating time series forecasting models on heavy-tailed, zero-inflated, and non-Gaussian data. Derived from hourly Wikipedia page view observations throughout 2024, TailedTS contains approximately 24.69 billion data points. A small percentage of pages receive the majority of views, which makes the dataset a demanding testbed for model robustness. The dataset also supports research into periodicity quantification and standardized prediction benchmarks that use non-Gaussian loss functions, and its results indicate that standard estimators perform poorly on high-volume data.
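To make the terms above concrete, here is a minimal Python sketch of how view concentration, zero inflation, and one possible non-Gaussian loss could be measured. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the numbers are made up rather than taken from TailedTS, and the Poisson negative log-likelihood is only one example of a non-Gaussian loss, not necessarily the one used in the paper.

```python
import math


def zero_fraction(counts):
    """Fraction of observations that are exactly zero (a zero-inflation check)."""
    return sum(1 for c in counts if c == 0) / len(counts)


def top_share(totals, frac):
    """Share of all views captured by the top `frac` of pages, ranked by total views."""
    ranked = sorted(totals, reverse=True)
    k = max(1, math.ceil(frac * len(ranked)))
    return sum(ranked[:k]) / sum(ranked)


def poisson_nll(y, mu):
    """Mean Poisson negative log-likelihood of counts y under rates mu (each mu > 0)."""
    terms = (m - v * math.log(m) + math.lgamma(v + 1) for v, m in zip(y, mu))
    return sum(terms) / len(y)


# Toy, made-up numbers (NOT from TailedTS): ten pages, one of them dominant.
page_totals = [9000, 500, 200, 100, 80, 50, 40, 20, 10, 0]
# Toy hourly counts for a single low-traffic page.
hourly = [0, 0, 3, 0, 1, 0, 0, 12, 0, 2]

print(top_share(page_totals, 0.1))  # 0.9: the top 10% of pages hold 90% of views
print(zero_fraction(hourly))        # 0.6: 60% of the hours have no views
print(poisson_nll(hourly, [1.8] * len(hourly)))  # loss of a constant-rate forecast
```

A count-based likelihood such as this one penalizes errors relative to the predicted rate, whereas squared error lets a handful of very high-traffic pages dominate the score, which is one reason non-Gaussian losses are of interest for data like this.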

Summary written by gemini-2.5-flash-lite from 1 source.

IMPACT Introduces a new dataset to improve the robustness of time series forecasting models against extreme volatility and non-Gaussian distributions.

RANK_REASON The cluster describes the release of a new academic benchmark dataset.



COVERAGE [1]

  1. arXiv stat.ML TIER_1 · Xinyu Chen, HanQin Cai, Lijun Ding, Jinhua Zhao ·

    TailedTS: Benchmark Dataset for Heavy-Tailed Time Series Prediction and Periodicity Quantification

    arXiv:2605.16361v1 (cross-listed). Abstract: We present TailedTS, a large-scale benchmark dataset derived from Wikipedia hourly page view observations throughout 2024, specifically designed to test time series forecasting models under heavy-tailed, zero-inflated, and non-Gau…