The performance of time series forecasting has recently been greatly improved
by the introduction of transformers. In this paper, we propose a general
multi-scale framework that can be applied to state-of-the-art
transformer-based time series forecasting models (e.g., FEDformer, Autoformer).
By iteratively refining a forecasted time series at multiple scales with shared
weights, and by introducing architecture adaptations and a specially-designed
normalization scheme, we achieve significant performance improvements,
from 5.5% to 38.5% across datasets and transformer architectures,
with minimal additional computational overhead. Through detailed ablation studies,
we demonstrate the effectiveness of each of our architectural and
methodological contributions. Furthermore, experiments on a variety of public
datasets show that the resulting multi-scale models outperform their
corresponding baselines. Our code is publicly available at
https://github.com/BorealisAI/scaleformer
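
To make the coarse-to-fine refinement concrete, below is a minimal PyTorch sketch of the iterative multi-scale loop. It is an illustration under stated assumptions, not the released implementation: the `model` callable, the pooling-based downsampling, the scale factors, and the normalization are hypothetical stand-ins (the repository above contains the actual code). It assumes `horizon` is divisible by the coarsest scale and each scale divides the previous one.

```python
import torch
import torch.nn.functional as F

def normalize(x):
    # Hypothetical per-series normalization: zero mean, unit std over time.
    mean = x.mean(dim=1, keepdim=True)
    std = x.std(dim=1, keepdim=True) + 1e-5
    return (x - mean) / std, mean, std

def multi_scale_forecast(model, history, horizon, scales=(16, 4, 1)):
    """Refine a forecast from coarse to fine resolution with one shared model.

    history: (batch, length, features) input series
    model:   a shared-weight forecaster taking (inputs, prev_forecast)
    """
    batch, _, feats = history.shape
    # Start from a zero forecast at the coarsest resolution.
    forecast = torch.zeros(batch, horizon // scales[0], feats)
    for i, s in enumerate(scales):
        # Downsample the history to the current scale via average pooling.
        hist_s = F.avg_pool1d(history.transpose(1, 2), kernel_size=s).transpose(1, 2)
        hist_s, mean, std = normalize(hist_s)
        # One shared-weight forward pass refines the coarser forecast.
        forecast = model(hist_s, (forecast - mean) / std)
        forecast = forecast * std + mean
        if i + 1 < len(scales):
            # Upsample so this forecast becomes the input at the next, finer scale.
            forecast = forecast.repeat_interleave(s // scales[i + 1], dim=1)
    return forecast
```

In this sketch, the same `model` is applied at every scale (the shared weights mentioned above), and each pass conditions on the upsampled forecast from the previous, coarser pass; the normalization at each scale stands in for the specially-designed scheme described in the paper.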