    Effects of Missing Data Imputation Methods on Univariate Time Series Forecasting with ARIMA and LSTM

    Missing data are common in real-life studies, and missing observations within a univariate time series cause analytical problems throughout the analysis. Imputing missing values is therefore an unavoidable step in the analysis of any incomplete univariate time series. The reviewed literature shows that existing studies focus on comparing the distributions of imputed data, which leaves a knowledge gap on how different imputation methods for univariate time series affect the fit and prediction performance of time series models. In this work, we evaluated the predictive performance of autoregressive integrated moving average (ARIMA) and long short-term memory (LSTM) models on time series imputed under the missing completely at random (MCAR) mechanism with the following techniques: Kalman smoothing on ARIMA, Kalman smoothing on a structural time series model, mean imputation, exponentially weighted moving average, simple moving average, and linear, cubic spline, Stineman, and KNN interpolation. Missing values were generated at rates of 10%, 15%, 25%, and 35% from complete data of 24-hour ambulatory diastolic blood pressure readings. Model performance was compared on imputed and original data using mean absolute percentage error (MAPE) and root mean square error (RMSE). Kalman smoothing on structural time series, exponentially weighted moving average, and Kalman smoothing on ARIMA were the best missing data replacement techniques as the gap of missingness increased. The performance of mean imputation, cubic spline, KNN, and the other simple interpolation methods declined significantly as the gap of missingness increased. The LSTM gave better predictions on the original training data, but the ARIMA predictions on imputed data gave consistent results across the four scenarios.
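The masking and scoring steps named in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the series is synthetic rather than blood pressure data, only two of the listed imputation methods (mean and linear interpolation) are shown, and the imputed values are scored directly against the held-out truth instead of through a fitted ARIMA or LSTM model. All function names (`mask_mcar`, `impute_mean`, `impute_linear`, `rmse`, `mape`) are hypothetical.

```python
import numpy as np

def mask_mcar(series, rate, rng):
    """Copy `series` and set about `rate` of its interior points to NaN (MCAR)."""
    out = series.astype(float).copy()
    n_missing = int(round(rate * len(out)))
    # Assumption: keep both endpoints observed so linear interpolation is defined.
    idx = rng.choice(np.arange(1, len(out) - 1), size=n_missing, replace=False)
    out[idx] = np.nan
    return out

def impute_mean(series):
    """Replace every NaN with the mean of the observed values."""
    out = series.copy()
    out[np.isnan(out)] = np.nanmean(series)
    return out

def impute_linear(series):
    """Replace every NaN by linear interpolation between observed neighbours."""
    out = series.copy()
    missing = np.isnan(out)
    t = np.arange(len(out))
    out[missing] = np.interp(t[missing], t[~missing], out[~missing])
    return out

def rmse(actual, predicted):
    """Root mean square error."""
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (requires nonzero `actual`)."""
    return float(100.0 * np.mean(np.abs((actual - predicted) / actual)))

rng = np.random.default_rng(0)
t = np.arange(96)
# Synthetic stand-in for a 24-hour series; NOT the study's data.
truth = 80 + 8 * np.sin(2 * np.pi * t / 96) + rng.normal(0, 1.5, size=96)

results = {}
for rate in (0.10, 0.15, 0.25, 0.35):  # missingness rates from the abstract
    observed = mask_mcar(truth, rate, rng)
    gaps = np.isnan(observed)
    for name, impute in (("mean", impute_mean), ("linear", impute_linear)):
        filled = impute(observed)
        # Score only the positions that were masked.
        results[(rate, name)] = (rmse(truth[gaps], filled[gaps]),
                                 mape(truth[gaps], filled[gaps]))

for key, (r, m) in results.items():
    print(key, f"RMSE={r:.2f}", f"MAPE={m:.2f}%")
```

To move closer to the comparison the abstract describes, each filled series would be passed to a forecasting model and the errors computed on the forecasts rather than on the imputed points.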

    Tracking Progression of Patient State of Health in Critical Care Using Inferred Shared Dynamics in Physiological Time Series

    Physiologic systems generate complex dynamics in their output signals that reflect the changing state of the underlying control systems. In this work, we used a switching vector autoregressive (switching VAR) framework to systematically learn and identify a collection of vital sign dynamics, which may recur within the same patient and be shared across the entire cohort. We show that these dynamical behaviors can be used to characterize and elucidate the progression of patients' states of health over time. Using the mean arterial blood pressure time series of 337 ICU patients during the first 24 hours of their ICU stays, we demonstrated that the dynamics learned from as early as the first 8 hours of patients' ICU stays can achieve hospital mortality prediction performance similar to that of the well-known SAPS-I acuity scores, suggesting that the discovered latent dynamics structure may yield more timely insights into the progression of a patient's state of health than the traditional snapshot-based acuity scores.
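To make the model class in the abstract above concrete, here is a minimal sketch that simulates data from a switching VAR(1) process: a hidden Markov chain picks one of several autoregressive regimes at each step. It only generates data and does not perform the learning or mortality prediction described in the abstract. The coefficient matrices, transition probabilities, noise level, and the per-series "mode share" summary are illustrative assumptions, and `simulate_switching_var` is a hypothetical name.

```python
import numpy as np

def simulate_switching_var(A, P, n_steps, noise_std, rng):
    """Simulate x_t = A[z_t] @ x_{t-1} + noise, with z_t a Markov chain over modes.

    A: array of shape (n_modes, dim, dim), one VAR(1) matrix per mode.
    P: array of shape (n_modes, n_modes), row-stochastic transition matrix.
    """
    n_modes, dim, _ = A.shape
    z = np.zeros(n_steps, dtype=int)      # hidden mode at each step (starts in mode 0)
    x = np.zeros((n_steps, dim))          # observed multivariate series
    x[0] = rng.normal(0, noise_std, size=dim)
    for t in range(1, n_steps):
        z[t] = rng.choice(n_modes, p=P[z[t - 1]])
        x[t] = A[z[t]] @ x[t - 1] + rng.normal(0, noise_std, size=dim)
    return z, x

# Two illustrative, stable regimes (assumed values, not taken from the paper).
A = np.array([
    [[0.9, 0.0], [0.1, 0.8]],    # mode 0: slowly decaying dynamics
    [[0.2, -0.5], [0.5, 0.2]],   # mode 1: fast, oscillatory dynamics
])
P = np.array([[0.95, 0.05],
              [0.10, 0.90]])     # modes tend to persist

rng = np.random.default_rng(0)
modes, series = simulate_switching_var(A, P, n_steps=200, noise_std=0.1, rng=rng)

# One possible per-series summary (an assumption): fraction of time in each mode.
mode_share = np.bincount(modes, minlength=len(A)) / len(modes)
print(mode_share)
```

In a learning setting the modes would be hidden and inferred from the series; sharing the same set of matrices `A` across many series mirrors the idea of dynamics shared across a cohort.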