PulseAugur

New TailedTS dataset challenges time series models with heavy-tailed data

Researchers have introduced TailedTS, a benchmark dataset for evaluating time series forecasting models on heavy-tailed, zero-inflated, and non-Gaussian data. Derived from Wikipedia page view data, TailedTS contains approximately 24.69 billion data points. A small percentage of pages receive the majority of views, which makes the dataset a challenging testbed for model robustness. The dataset also supports research into periodicity quantification and standardized prediction benchmarks using non-Gaussian loss functions, and the results indicate that standard estimators perform poorly on high-volume data.
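To make "heavy-tailed" and "zero-inflated" concrete, here is a minimal sketch that measures both properties on synthetic data. It does not use TailedTS itself: the Zipf-like generator and the names `synthetic_views`, `top_share`, and `zero_fraction` are hypothetical, chosen only for illustration, and the parameters (`scale`, `alpha`) are assumptions rather than values from the paper.

```python
import math
import statistics


def synthetic_views(n_pages=1000, scale=1000, alpha=1.2):
    """Hypothetical Zipf-like view counts: the page at rank r gets int(scale / r**alpha).

    Truncating to an integer drives low-ranked pages to exactly zero,
    which mimics zero inflation. This is NOT the TailedTS data.
    """
    return [int(scale / rank ** alpha) for rank in range(1, n_pages + 1)]


def top_share(views, fraction):
    """Share of all views captured by the top `fraction` of pages."""
    if not views:
        raise ValueError("views must not be empty")
    k = max(1, math.ceil(fraction * len(views)))
    ranked = sorted(views, reverse=True)
    total = sum(ranked)
    if total == 0:
        return 0.0
    return sum(ranked[:k]) / total


def zero_fraction(views):
    """Fraction of pages with zero views."""
    if not views:
        raise ValueError("views must not be empty")
    return sum(1 for v in views if v == 0) / len(views)


if __name__ == "__main__":
    views = synthetic_views()
    print("share of views held by top 1% of pages:", top_share(views, 0.01))
    print("fraction of pages with zero views:", zero_fraction(views))
    # On heavy-tailed data the mean is pulled up by a few large values,
    # while the median stays near the bulk of the distribution.
    print("mean:", statistics.mean(views), "median:", statistics.median(views))
```

On data like this, the gap between the mean and the median hints at why estimators and losses built on Gaussian assumptions can behave poorly.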

Impact: Introduces a new dataset to improve the robustness of time series forecasting models against extreme volatility and non-Gaussian distributions.

Ranking rationale: The cluster describes the release of a new academic benchmark dataset.

Read on arXiv stat.ML →

AI-generated summary · Google Gemini · from 1 source. How we write summaries →

Sources [1]

  1. arXiv stat.ML · TIER_1 · English (EN) · Xinyu Chen, HanQin Cai, Lijun Ding, Jinhua Zhao

    TailedTS: Benchmark Dataset for Heavy-Tailed Time Series Prediction and Periodicity Quantification

    arXiv:2605.16361v1 Announce Type: cross Abstract: We present TailedTS, a large-scale benchmark dataset derived from Wikipedia hourly page view observations throughout 2024, specifically designed to test time series forecasting models under heavy-tailed, zero-inflated, and non-Gau…