1,396 research outputs found
An Interpretable Hybrid Predictive Model of COVID-19 Cases using Autoregressive Model and LSTM
The Coronavirus Disease 2019 (COVID-19) has had a profound impact on global
health and economy, making it crucial to build accurate and interpretable
data-driven predictive models for COVID-19 cases to improve policy making. The
extremely large scale of the pandemic and the intrinsically changing
transmission characteristics pose great challenges for effective COVID-19 case
prediction. To address this challenge, we propose a novel hybrid model in which
the interpretability of the Autoregressive model (AR) and the predictive power
of the long short-term memory neural networks (LSTM) join forces. The proposed
hybrid model is formalized as a neural network with an architecture that
connects two composing model blocks, of which the relative contribution is
decided data-adaptively in the training procedure. We demonstrate the favorable
performance of the hybrid model over its two component models as well as other
popular predictive models through comprehensive numerical studies on two data
sources under multiple evaluation metrics. Specifically, in county-level data
of 8 California counties, our hybrid model achieves 4.173% MAPE on average,
outperforming the composing AR (5.629%) and LSTM (4.934%). In country-level
datasets, our hybrid model outperforms the widely-used predictive models - AR,
LSTM, SVM, Gradient Boosting, and Random Forest - in predicting COVID-19 cases
in 8 countries around the world. In addition, we illustrate the
interpretability of our proposed hybrid model, a key feature not shared by most
black-box predictive models for COVID-19 cases. Our study provides a new and
promising direction for building effective and interpretable data-driven
models, which could have significant implications for public health policy
making and control of the current and potential future pandemics.
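The abstract above reports results as MAPE and describes a hybrid whose two blocks contribute in data-adaptive proportions. The following is a minimal sketch of those two ideas only, not the authors' architecture: it assumes the AR and LSTM forecasts are already available as plain lists, and it replaces the jointly trained network with a closed-form least-squares blend weight. The function names and toy numbers are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def fit_blend_weight(actual, pred_a, pred_b):
    """Least-squares weight w, clipped to [0, 1], for w * pred_a + (1 - w) * pred_b.

    Assumption: a stand-in for the data-adaptive weighting that the paper
    learns during training; here it is solved in closed form.
    """
    num = sum((a - b) * (y - b) for y, a, b in zip(actual, pred_a, pred_b))
    den = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    if den == 0.0:
        return 0.5  # the two forecasts are identical, so any weight works
    return min(1.0, max(0.0, num / den))


def blend(pred_a, pred_b, w):
    """Convex combination of two forecasts."""
    return [w * a + (1.0 - w) * b for a, b in zip(pred_a, pred_b)]


# Toy data (hypothetical): one forecast runs 10% high, the other 10% low.
actual = [100.0, 200.0, 400.0]
ar_forecast = [110.0, 220.0, 440.0]
lstm_forecast = [90.0, 180.0, 360.0]

w = fit_blend_weight(actual, ar_forecast, lstm_forecast)
hybrid_forecast = blend(ar_forecast, lstm_forecast, w)
```

On this toy input the two component errors cancel, so the blend beats both components. Real forecasts will rarely be this symmetric.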
Interpreting County Level COVID-19 Infection and Feature Sensitivity using Deep Learning Time Series Models
Interpretable machine learning plays a key role in healthcare because it is
challenging to understand feature importance in deep learning model
predictions. We propose a novel framework that uses deep learning to study
feature sensitivity for model predictions. This work combines sensitivity
analysis with heterogeneous time-series deep learning model prediction, which
corresponds to the interpretations of spatio-temporal features. We forecast
county-level COVID-19 infection using the Temporal Fusion Transformer. We then
apply a sensitivity analysis that extends the Morris method to measure how sensitive the
outputs are to perturbations of our static and dynamic input
features. The significance of the work is grounded in a real-world COVID-19
infection prediction with highly non-stationary, finely granular, and
heterogeneous data. 1) Our model can capture the detailed daily changes of
temporal and spatial model behaviors and achieves high prediction performance
compared to a PyTorch baseline. 2) By analyzing the Morris sensitivity indices
and attention patterns, we decipher the meaning of feature importance with
observational population and dynamic model changes. 3) We have collected 2.5
years of socioeconomic and health features over 3142 US counties, such as
observed cases and deaths, and a number of static (age distribution, health
disparity, and industry) and dynamic features (vaccination, disease spread,
transmissible cases, and social distancing). Using the proposed framework, we
conduct extensive experiments and show our model can learn complex interactions
and perform predictions for daily infection at the county level. Being able to
model the disease infection with a hybrid prediction and description accuracy
measurement with Morris index at the county level is a central idea that sheds
light on individual feature interpretation via sensitivity analysis.
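To make the sensitivity idea in the abstract above concrete, here is a minimal sketch of Morris-style elementary effects. It is a simplified one-at-a-time design around random base points, not the authors' extension of the method and not the full trajectory-based Morris design. The model is any Python callable on a feature list; the name `morris_mu_star` and the toy model are hypothetical.

```python
import random


def morris_mu_star(model, n_features, n_base_points=20, delta=0.1, seed=0):
    """Mean absolute elementary effect (mu*) per feature.

    For each random base point in the unit hypercube, every feature is
    perturbed by `delta` in turn and the change in model output is
    divided by `delta`. A larger mu* means the output is more sensitive
    to that feature.
    """
    rng = random.Random(seed)
    totals = [0.0] * n_features
    for _ in range(n_base_points):
        # Sample so that x[i] + delta stays inside [0, 1].
        x = [rng.uniform(0.0, 1.0 - delta) for _ in range(n_features)]
        base = model(x)
        for i in range(n_features):
            shifted = list(x)
            shifted[i] += delta
            totals[i] += abs((model(shifted) - base) / delta)
    return [t / n_base_points for t in totals]


def toy_model(x):
    """Hypothetical model: feature 0 matters most, feature 1 not at all."""
    return 3.0 * x[0] + 0.0 * x[1] - 2.0 * x[2]


mu_star = morris_mu_star(toy_model, n_features=3)
```

For a linear model the elementary effects equal the absolute coefficients, which makes it a convenient sanity check before applying the same loop to a trained forecaster.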
The impact of spatio-temporal travel distance on epidemics using an interpretable attention-based sequence-to-sequence model
Amidst the COVID-19 pandemic, travel restrictions have emerged as crucial
interventions for mitigating the spread of the virus. In this study, we enhance
the predictive capabilities of our model, Sequence-to-Sequence Epidemic
Attention Network (S2SEA-Net), by incorporating an attention module, allowing
us to assess the impact of distinct classes of travel distances on epidemic
dynamics. Furthermore, our model provides forecasts for new confirmed cases and
deaths. To achieve this, we leverage daily data on population movement across
various travel distance categories, coupled with county-level epidemic data in
the United States. Our findings illuminate a compelling relationship between
the volume of travelers at different distance ranges and the trajectories of
COVID-19. Notably, a discernible spatial pattern emerges with respect to these
travel distance categories on a national scale. We unveil the geographical
variations in the influence of population movement at different travel
distances on the dynamics of epidemic spread. This will contribute to the
formulation of strategies for future epidemic prevention and public health
policies.
Comment: 18 pages, 7 figures
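The abstract above does not specify how the attention module in S2SEA-Net is built. As an illustration of how attention weights can rank input groups such as travel-distance classes, the sketch below uses generic scaled dot-product attention. The embeddings are made up and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over several keys."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)


# Hypothetical 2-d embeddings for three travel-distance classes.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]  # hypothetical decoder state

weights = attention_weights(query, keys)
```

The weights are non-negative and sum to one, so they can be read as the relative share of attention each class receives for this query.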
A Novel Deep Learning Model For Hotel Demand and Revenue Prediction amid COVID-19
The COVID-19 pandemic has cast a substantial impact on the tourism and hospitality sector. Public policies such as travel restrictions and stay-at-home orders have significantly affected tourist activities and service businesses' operations and profitability. It is essential to develop interpretable forecasting models to support managerial and organizational decision-making. We developed DemandNet, a novel deep learning framework for predicting time series data under the influence of the COVID-19 pandemic. The DemandNet framework has the following unique characteristics. First, it selects the top static and dynamic features embedded in the time series data. Second, it includes a nonlinear model which can provide interpretable insight into the previously seen data. Third, a novel prediction model is developed to leverage the above characteristics to make robust long-term forecasts. We evaluated DemandNet using daily hotel demand and revenue data from eight cities in the US between 2013 and 2020. Our findings reveal that DemandNet outperforms the state-of-the-art models and can accurately predict the effect of the COVID-19 pandemic on hotel demand and revenue.
Distributional Drift Adaptation with Temporal Conditional Variational Autoencoder for Multivariate Time Series Forecasting
Because real-world multivariate time series (MTS) are nonstationary, their
distribution changes over time, which is known as distribution drift. Most
existing MTS forecasting models suffer greatly from distribution drift, and
their forecasting performance degrades over time. Existing methods address
distribution drift via adapting to the latest arrived data or self-correcting
per the meta knowledge derived from future data. Despite their great success in
MTS forecasting, these methods hardly capture the intrinsic distribution
changes, especially from a distributional perspective. Accordingly, we propose
a novel framework, the temporal conditional variational autoencoder (TCVAE), to model
the dynamic distributional dependencies over time between historical
observations and future data in MTSs and infer the dependencies as a temporal
conditional distribution to leverage latent variables. Specifically, a novel
temporal Hawkes attention mechanism represents temporal factors subsequently
fed into feed-forward networks to estimate the prior Gaussian distribution of
latent variables. The representation of temporal factors further dynamically
adjusts the structures of Transformer-based encoder and decoder to distribution
changes by leveraging a gated attention mechanism. Moreover, we introduce
conditional continuous normalization flow to transform the prior Gaussian to a
complex and form-free distribution to facilitate flexible inference of the
temporal conditional distribution. Extensive experiments conducted on six
real-world MTS datasets demonstrate the TCVAE's superior robustness and
effectiveness over the state-of-the-art MTS forecasting baselines. We further
illustrate the TCVAE applicability through multifaceted case studies and
visualization in real-world scenarios.
Comment: 13 pages, 6 figures, submitted to IEEE Transactions on Neural
Networks and Learning Systems (TNNLS)
Short term energy consumption forecasting using neural basis expansion analysis for interpretable time series
Smart grids and smart homes are getting people's attention in the modern era of smart cities. The advancements of smart technologies and smart grids have created challenges related to energy efficiency and production according to the future demand of clients. Machine learning, specifically neural network-based methods, have been successful in energy consumption prediction, but gaps remain due to uncertainty in the data and limitations of the algorithms. Research published in the literature has used small datasets and profiles of primarily single users; therefore, models have difficulties when applied to large datasets with profiles of different customers. Thus, a smart grid environment requires a model that handles consumption data from thousands of customers. The proposed model enhances the newly introduced method of Neural Basis Expansion Analysis for interpretable Time Series (N-BEATS) with a big dataset of energy consumption of 169 customers. Further, to validate the results of the proposed model, a performance comparison has been carried out with the Long Short Term Memory (LSTM), Blocked LSTM, Gated Recurrent Units (GRU), Blocked GRU and Temporal Convolutional Network (TCN). The proposed interpretable model improves the prediction accuracy on the big dataset containing energy consumption profiles of multiple customers. Incorporating covariates into the model improved accuracy by learning past and future energy consumption patterns. Based on a large dataset, the proposed model performed better for daily, weekly, and monthly energy consumption predictions. The forecasting accuracy of the N-BEATS interpretable model for 1-day-ahead energy consumption with day as covariates remained better than the 1, 2, 3, and 4-week scenarios.
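The interpretable N-BEATS configuration mentioned above is usually described as projecting learned coefficients onto fixed bases: a low-degree polynomial for trend and a Fourier basis for seasonality. The sketch below shows only that projection step, with hand-picked coefficients. In the real model the coefficients come from the network, and nothing here reproduces the paper's covariate handling. The function names are hypothetical.

```python
import math


def trend_forecast(thetas, horizon):
    """Polynomial trend: sum over p of thetas[p] * t**p, with t = h / horizon."""
    ts = [h / horizon for h in range(horizon)]
    return [sum(theta * t ** p for p, theta in enumerate(thetas)) for t in ts]


def seasonality_forecast(cos_thetas, sin_thetas, horizon):
    """Fourier seasonality: sum over i of c_i*cos(2*pi*i*t) + s_i*sin(2*pi*i*t)."""
    out = []
    for h in range(horizon):
        t = h / horizon
        value = 0.0
        for i, (c, s) in enumerate(zip(cos_thetas, sin_thetas)):
            angle = 2.0 * math.pi * i * t
            value += c * math.cos(angle) + s * math.sin(angle)
        out.append(value)
    return out


# Hand-picked coefficients (hypothetical): a linear trend plus one cosine cycle.
trend = trend_forecast([1.0, 2.0], horizon=4)
season = seasonality_forecast([0.0, 1.0], [0.0, 0.0], horizon=4)
forecast = [a + b for a, b in zip(trend, season)]
```

Because each component is a sum of named basis functions, the trend and seasonal parts of a forecast can be plotted and inspected separately, which is the sense in which this configuration is called interpretable.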
Data-Centric Epidemic Forecasting: A Survey
The COVID-19 pandemic has brought forth the importance of epidemic
forecasting for decision makers in multiple domains, ranging from public health
to the economy as a whole. While forecasting epidemic progression is frequently
conceptualized as being analogous to weather forecasting, it has some
key differences and remains a non-trivial task. The spread of diseases is
subject to multiple confounding factors spanning human behavior, pathogen
dynamics, weather and environmental conditions. Research interest has been
fueled by the increased availability of rich data sources capturing previously
unobservable facets and also due to initiatives from government public health
and funding agencies. This has resulted, in particular, in a spate of work on
'data-centered' solutions which have shown potential in enhancing our
forecasting capabilities by leveraging non-traditional data sources as well as
recent innovations in AI and machine learning. This survey delves into various
data-driven methodological and practical advancements and introduces a
conceptual framework to navigate through them. First, we enumerate the large
number of epidemiological datasets and novel data streams that are relevant to
epidemic forecasting, capturing various factors like symptomatic online
surveys, retail and commerce, mobility, genomics data and more. Next, we
discuss methods and modeling paradigms focusing on the recent data-driven
statistical and deep-learning based methods as well as on the novel class of
hybrid models that combine domain knowledge of mechanistic models with the
effectiveness and flexibility of statistical approaches. We also discuss
experiences and challenges that arise in real-world deployment of these
forecasting systems including decision-making informed by forecasts. Finally,
we highlight some challenges and open problems found across the forecasting
pipeline.
Comment: 67 pages, 12 figures