
    A Systematic Review for Transformer-based Long-term Series Forecasting

    The emergence of deep learning has yielded noteworthy advancements in time series forecasting (TSF). Transformer architectures, in particular, have seen broad adoption in TSF tasks and have proven highly successful at extracting semantic correlations among the elements of a long sequence. Various variants have enabled the transformer architecture to handle long-term time series forecasting (LTSF) tasks effectively. In this article, we first present a comprehensive overview of transformer architectures and the subsequent enhancements developed to address various LTSF tasks. We then summarize the publicly available LTSF datasets and the relevant evaluation metrics. Furthermore, we provide insights into best practices and techniques for effectively training transformers for time-series analysis. Lastly, we propose potential research directions in this rapidly evolving field.
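
    As a concrete illustration of the pipeline this survey covers, below is a minimal sketch (ours, not taken from any surveyed paper) of a transformer encoder applied to a multivariate lookback window with a linear forecasting head; all shapes and hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TinyTSFTransformer(nn.Module):
    def __init__(self, n_features=7, d_model=64, horizon=96):
        super().__init__()
        self.embed = nn.Linear(n_features, d_model)            # per-step embedding
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, horizon)                # forecast from final state

    def forward(self, x):                         # x: (batch, lookback, n_features)
        h = self.encoder(self.embed(x))           # self-attention over the whole window
        return self.head(h[:, -1, :])             # (batch, horizon) point forecast

y_hat = TinyTSFTransformer()(torch.randn(8, 336, 7))   # e.g. a 336-step lookback
```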

    TRU-NET: A Deep Learning Approach to High Resolution Prediction of Rainfall

    Climate models (CMs) are used to evaluate the impact of climate change on the risk of floods and strong precipitation events. However, these numerical simulators have difficulty representing precipitation events accurately, mainly due to limited spatial resolution when simulating multi-scale dynamics in the atmosphere. To improve the prediction of high-resolution precipitation, we apply a Deep Learning (DL) approach whose input consists of CM simulations of model fields (weather variables) that are more predictable than local precipitation. To this end, we present TRU-NET (Temporal Recurrent U-Net), an encoder-decoder model featuring a novel 2D cross-attention mechanism between contiguous convolutional-recurrent layers to effectively model multi-scale spatio-temporal weather processes. We use a conditional-continuous loss function to capture the zero-skewed, extreme-event patterns of rainfall. Experiments show that our model consistently attains lower RMSE and MAE scores than a DL model prevalent in short-term precipitation prediction, and improves upon the rainfall predictions of a state-of-the-art dynamical weather model. Moreover, by evaluating the performance of our model under various training and testing data-formulation strategies, we show that there is enough data for our deep learning approach to produce robust, high-quality results across seasons and varying regions.
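
    A conditional-continuous loss of the kind mentioned above is, in spirit, a zero-inflated objective. The sketch below is a hedged reconstruction under that assumption, not TRU-NET's actual code: one head classifies rain occurrence, a second regresses the amount, and the regression term applies only where rain occurred. The head names and the rain threshold are illustrative.

```python
import torch
import torch.nn.functional as F

def conditional_continuous_loss(p_rain, amount_pred, amount_true, threshold=0.1):
    """Zero-inflated rainfall objective: occurrence BCE + masked amount MSE."""
    rained = (amount_true > threshold).float()    # binary rain/no-rain mask
    bce = F.binary_cross_entropy(p_rain, rained)  # occurrence term
    mse = (rained * (amount_pred - amount_true) ** 2).sum() / rained.sum().clamp(min=1)
    return bce + mse                              # joint objective
```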

    TempEE: Temporal-Spatial Parallel Transformer for Radar Echo Extrapolation Beyond Auto-Regression

    Meteorological radar reflectivity data (i.e., radar echo) significantly influence precipitation prediction. They can facilitate accurate and expeditious forecasting of short-term heavy rainfall, bypassing the need for complex Numerical Weather Prediction (NWP) models. Compared to conventional models, Deep Learning (DL)-based radar echo extrapolation algorithms exhibit higher effectiveness and efficiency. Nevertheless, the development of a reliable and generalizable echo extrapolation algorithm is impeded by three primary challenges: cumulative error spreading, imprecise representation of sparsely distributed echoes, and inaccurate description of non-stationary motion processes. To tackle these challenges, this paper proposes a novel radar echo extrapolation algorithm called the Temporal-Spatial Parallel Transformer, referred to as TempEE. TempEE avoids auto-regression and instead employs a one-step forward strategy to prevent cumulative error spreading during extrapolation. Additionally, we incorporate a Multi-level Temporal-Spatial Attention mechanism to improve the algorithm's ability to capture both global and local information while efficiently emphasizing task-related regions, including sparse echo representations. Furthermore, the algorithm extracts spatio-temporal representations from continuous echo images using a parallel encoder to model the non-stationary motion process for echo extrapolation. The superiority of TempEE has been demonstrated on the classic radar echo extrapolation task using a real-world dataset, and extensive experiments have further validated the efficacy and indispensability of its components.
    Comment: Accepted by IEEE Transactions on Geoscience and Remote Sensing; see https://ieeexplore.ieee.org/document/1023874
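
    To make the distinction concrete, here is a minimal sketch contrasting auto-regressive rollout with the one-step forward strategy the abstract credits for avoiding cumulative error: all future echo frames are predicted from the observed sequence in a single pass. `model` is a hypothetical module mapping (batch, t, H, W) frame stacks to frame stacks; this is our illustration, not TempEE's code.

```python
import torch

def autoregressive_rollout(model, frames, t_out):
    # Feed each predicted frame back in: errors compound across the horizon.
    preds = []
    for _ in range(t_out):
        nxt = model(frames)[:, -1:]               # predict one frame ahead
        preds.append(nxt)
        frames = torch.cat([frames[:, 1:], nxt], dim=1)
    return torch.cat(preds, dim=1)

def one_step_forward(model, frames):
    # Predict every horizon frame from the observed sequence in a single pass,
    # so no prediction is ever conditioned on another prediction.
    return model(frames)
```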

    A review on Day-Ahead Solar Energy Prediction

    Accurate day-ahead prediction of solar energy plays a vital role in planning supply and demand in a power grid system. Previous studies have based predictions on weather forecasts composed of numerical and textual data. Although such forecasts can reflect temporal factors, they do not always yield the most accurate and precise results, which is why incorporating methods and techniques that enhance accuracy is an important topic. This paper provides an in-depth review of current deep learning-based forecasting models for renewable energy.

    SwinVRNN: A Data-Driven Ensemble Forecasting Model via Learned Distribution Perturbation

    Data-driven approaches to medium-range weather forecasting have recently shown extraordinary promise for ensemble forecasting thanks to their fast inference speed compared to traditional numerical weather prediction (NWP) models, but their forecast accuracy can hardly match the state-of-the-art operational ECMWF Integrated Forecasting System (IFS) model. Previous data-driven attempts achieve ensemble forecasting using simple perturbation methods, such as initial-condition perturbation and Monte Carlo dropout. However, they mostly suffer from unsatisfactory ensemble performance, which is arguably attributable to sub-optimal ways of applying perturbation. We propose a Swin Transformer-based Variational Recurrent Neural Network (SwinVRNN), a stochastic weather forecasting model combining a SwinRNN predictor with a perturbation module. SwinRNN is designed as a Swin Transformer-based recurrent neural network that predicts future states deterministically. To model the stochasticity in prediction, we design a perturbation module following the Variational Auto-Encoder paradigm, which learns multivariate Gaussian distributions of a time-variant stochastic latent variable from data. Ensemble forecasting can then be achieved simply by perturbing the model features with noise sampled from the learned distribution. We also compare four categories of perturbation methods for ensemble forecasting: fixed distribution perturbation, learned distribution perturbation, MC dropout, and multi-model ensemble. Comparisons on the WeatherBench dataset show that the learned distribution perturbation method using our SwinVRNN model achieves superior forecast accuracy and reasonable ensemble spread due to joint optimization of the two targets. More notably, SwinVRNN surpasses the operational IFS on the surface variables of 2-m temperature and 6-hourly total precipitation at all lead times up to five days.
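
    The learned-distribution perturbation idea can be sketched as follows: a VAE-style module outputs the mean and log-variance of a Gaussian latent, and ensemble members are produced by re-sampling that latent and adding it to the predictor's features. All module interfaces and shapes below are our illustrative assumptions, not SwinVRNN's implementation.

```python
import torch

def ensemble_forecast(predictor, perturb_net, x, n_members=10):
    feats = predictor.encode(x)                   # deterministic hidden features
    mu, logvar = perturb_net(feats)               # learned Gaussian parameters
    members = []
    for _ in range(n_members):
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterized sample
        members.append(predictor.decode(feats + z))           # one perturbed forecast
    return torch.stack(members)                   # (n_members, ...) ensemble
```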

    AMLNet: Adversarial Mutual Learning Neural Network for Non-AutoRegressive Multi-Horizon Time Series Forecasting

    Multi-horizon time series forecasting, crucial across diverse domains, demands high accuracy and speed. While AutoRegressive (AR) models excel in short-term predictions, they suffer from speed and error issues as the horizon extends. Non-AutoRegressive (NAR) models suit long-term predictions but struggle with interdependence, yielding unrealistic results. We introduce AMLNet, an innovative NAR model that achieves realistic forecasts through an online Knowledge Distillation (KD) approach. AMLNet harnesses the strengths of both AR and NAR models by training a deep AR decoder and a deep NAR decoder collaboratively, serving as ensemble teachers that impart knowledge to a shallower NAR decoder. This knowledge transfer is facilitated through two key mechanisms: 1) outcome-driven KD, which dynamically weights the contribution of the KD losses from the teacher models, enabling the shallow NAR decoder to incorporate the ensemble's diversity; and 2) hint-driven KD, which employs adversarial training to extract valuable insights from the models' hidden states for distillation. Extensive experimentation showcases AMLNet's superiority over conventional AR and NAR models, presenting a promising avenue for multi-horizon time series forecasting that enhances accuracy and expedites computation.
    Comment: 10 pages, 3 figures
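
    A hedged sketch of outcome-driven KD as the abstract describes it: the two teachers' losses are converted into weights so the teacher that is currently more accurate contributes more to the student's distillation loss. The softmax weighting rule below is our illustrative assumption, not AMLNet's exact formulation.

```python
import torch
import torch.nn.functional as F

def outcome_driven_kd(student_out, ar_out, nar_out, target):
    errs = torch.stack([F.mse_loss(ar_out, target),
                        F.mse_loss(nar_out, target)])
    w = F.softmax(-errs.detach(), dim=0)          # more accurate teacher -> larger weight
    kd = (w[0] * F.mse_loss(student_out, ar_out.detach()) +
          w[1] * F.mse_loss(student_out, nar_out.detach()))
    return kd + F.mse_loss(student_out, target)   # distillation + ground-truth term
```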

    DiffLoad: Uncertainty Quantification in Load Forecasting with Diffusion Model

    Electrical load forecasting is of great significance for decision-making in power systems, such as unit commitment and energy management. In recent years, various self-supervised neural network-based methods have been applied to electrical load forecasting to improve forecasting accuracy and capture uncertainties. However, most current methods are based on Gaussian likelihoods, which aim to accurately estimate the distribution expectation under a given covariate. This kind of approach is difficult to adapt to situations where the temporal data exhibit distribution shift and outliers. In this paper, we propose a diffusion-based Seq2Seq structure to estimate epistemic uncertainty and use the robust additive Cauchy distribution to estimate aleatoric uncertainty. Rather than accurately forecasting conditional expectations, we demonstrate our method's ability to separate the two types of uncertainty and to cope with such anomalous scenarios.
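
    The robust Cauchy likelihood mentioned above can be sketched as follows: the network predicts a location and a scale per step, and the heavy-tailed Cauchy negative log-likelihood down-weights outliers relative to a Gaussian. This is a minimal illustration of the standard Cauchy NLL, with parameter names that are our assumptions.

```python
import torch

def cauchy_nll(loc, log_scale, y):
    scale = log_scale.exp()                       # keep the scale strictly positive
    # -log p(y) for Cauchy(loc, scale): log(pi*scale) + log(1 + ((y-loc)/scale)^2)
    return (torch.log(torch.pi * scale) +
            torch.log1p(((y - loc) / scale) ** 2)).mean()
```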

    Traffic Prediction using Artificial Intelligence: Review of Recent Advances and Emerging Opportunities

    Traffic prediction plays a crucial role in alleviating traffic congestion, a critical problem globally that results in negative consequences such as lost hours of additional travel time and increased fuel consumption. Integrating emerging technologies into transportation systems provides opportunities for significantly improving traffic prediction and brings about new research problems. To lay the foundation for understanding the open research challenges in traffic prediction, this survey aims to provide a comprehensive overview of traffic prediction methodologies. Specifically, we focus on recent advances and emerging research opportunities in Artificial Intelligence (AI)-based traffic prediction methods, owing to their recent success and potential, with an emphasis on multivariate traffic time series modeling. We first provide a list and explanation of the various data types and resources used in the literature. Next, the essential data preprocessing methods within the traffic prediction context are categorized, and the prediction methods and applications are subsequently summarized. Lastly, we present primary research challenges in traffic prediction and discuss some directions for future research.
    Comment: Published in Transportation Research Part C: Emerging Technologies (TR_C), Volume 145, 202
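
    A minimal sketch of the sliding-window preprocessing step common to the AI-based methods this survey covers: a multivariate traffic series is cut into (lookback, horizon) pairs for supervised training. Window sizes and the sensor count are illustrative assumptions.

```python
import numpy as np

def make_windows(series, lookback=12, horizon=3):
    # series: (timesteps, n_sensors), e.g. 5-minute flow readings per detector
    X, y = [], []
    for t in range(len(series) - lookback - horizon + 1):
        X.append(series[t:t + lookback])                       # model input
        y.append(series[t + lookback:t + lookback + horizon])  # prediction target
    return np.stack(X), np.stack(y)

X, y = make_windows(np.random.rand(1000, 207))    # e.g. 207 sensors, as in METR-LA
```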

    Transform Diabetes - Harnessing Transformer-Based Machine Learning and Layered Ensemble with Enhanced Training for Improved Glucose Prediction.

    Type 1 diabetes is a common chronic disease characterized by the body's inability to regulate the blood glucose level, leading to severe health consequences if glucose is not managed manually. Accurate blood glucose level predictions can enable better disease management and inform subsequent treatment decisions. However, predicting future blood glucose levels is a complex problem due to the inherent complexity and variability of the human body. This thesis investigates using a Transformer model to outperform a state-of-the-art Convolutional Recurrent Neural Network model at forecasting blood glucose levels on the same dataset. The problem is structured, and the data are preprocessed, as a multivariate multi-step time series. A unique Layered Ensemble technique that Enhances the Training of the final model is introduced. This technique manages missing data and counters potential issues from other techniques by employing a Long Short-Term Memory model and a Transformer model together. The experimental results show that this novel ensemble technique reduces the root mean squared error by approximately 14.28% when predicting the blood glucose level 30 minutes into the future, compared to the state-of-the-art model. This improvement highlights the potential of the approach to assist diabetes patients with effective disease management.
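
    A hedged sketch of the layered-ensemble idea the thesis describes: an LSTM and a Transformer are trained on the same multivariate windows and their 30-minute-ahead forecasts are combined. The simple averaging shown here is our illustrative assumption; the thesis's exact combination rule may differ.

```python
import torch

def ensemble_predict(lstm_model, transformer_model, x):
    # x: (batch, lookback, n_features) window of glucose and auxiliary signals
    with torch.no_grad():
        p_lstm = lstm_model(x)                    # (batch, horizon) forecast
        p_tf = transformer_model(x)               # (batch, horizon) forecast
    return 0.5 * (p_lstm + p_tf)                  # simple average of the two members
```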