Time Series Modeling of Irregularly Sampled Multivariate Clinical Data
Building an accurate predictive model of clinical time series for a patient is critical for understanding the patient's condition, its dynamics, and optimal patient management. Unfortunately, this process is challenging because of: (1) multivariate behaviors: real-world dynamics are multivariate and are better described by multivariate time series (MTS); (2) irregular samples: sequential observations are collected at different times, and the time elapsed between two consecutive observations may vary; and (3) patient variability: clinical MTS vary from patient to patient, and an individual patient may exhibit short-term variability reflecting the different events affecting care and the patient's state.
In this dissertation, we investigate different ways of developing and refining forecasting models from irregularly sampled clinical MTS data collections. First, we focus on refinements of a popular model for MTS analysis, the linear dynamical system (LDS) (a.k.a. Kalman filter), and its application to MTS forecasting. We propose (1) a regularized LDS learning framework that automatically shuts down an LDS's spurious and unnecessary dimensions and consequently prevents overfitting given a small amount of data; and (2) a generalized LDS learning framework via matrix factorization, which allows various constraints to be easily incorporated to guide the learning process. Second, we study ways of modeling irregularly sampled univariate clinical time series. We develop a new two-layer hierarchical dynamical system model for irregularly sampled clinical time series prediction. We demonstrate that our new system adapts better to irregular samples and supports more accurate predictions. Finally, we propose, develop and experiment with two personalized forecasting frameworks for modeling and predicting clinical MTS of an individual patient. The first approach relies on model adaptation techniques. It calibrates the population-based model's predictions with patient-specific residual models, which are learned from the difference between the patient observations and the population-based model's predictions. The second framework relies on adaptive model selection strategies to combine the advantages of the population-based, patient-specific and short-term individualized predictive models. We demonstrate the benefits and advantages of the aforementioned frameworks on synthetic data sets, public time series data sets and clinical data extracted from EHRs.
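The residual-based calibration described in this abstract can be illustrated with a small sketch. It is not the dissertation's method: the abstract does not say what form the patient-specific residual model takes, so a straight-line least-squares fit over time is assumed here, and the function names `fit_residual_model` and `calibrated_prediction` are hypothetical.

```python
import numpy as np

def fit_residual_model(times, observed, population_pred):
    """Fit a line to one patient's residuals (observed minus population prediction).

    Assumption: a linear-in-time residual model; the abstract does not
    specify the model class used for the residuals.
    """
    times = np.asarray(times, dtype=float)
    residuals = np.asarray(observed, dtype=float) - np.asarray(population_pred, dtype=float)
    # Design matrix [1, t]; least squares uses the irregular time stamps directly.
    design = np.column_stack([np.ones_like(times), times])
    coef, *_ = np.linalg.lstsq(design, residuals, rcond=None)
    return coef  # [intercept, slope]

def calibrated_prediction(t_new, population_pred_new, coef):
    """Correct a population-based prediction with the patient's residual model."""
    return population_pred_new + coef[0] + coef[1] * t_new

# Irregularly spaced observation times (hours) for one hypothetical patient.
t = np.array([0.0, 1.5, 4.0, 4.5])
population = np.full(4, 10.0)          # population-based model's predictions
patient = population + 2.0 + 0.5 * t   # this patient drifts away from the population
coef = fit_residual_model(t, patient, population)
corrected = calibrated_prediction(6.0, 10.0, coef)
```

The population model stays untouched. Only the small residual model is refit per patient, which is why this kind of adaptation can work with few patient-specific observations.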
A multivariate timeseries modeling approach to severity of illness assessment and forecasting in ICU with sparse, heterogeneous clinical data
The ability to determine patient acuity (or severity of illness) has immediate practical use for clinicians. We evaluate the use of multivariate timeseries modeling with multi-task Gaussian process (GP) models using noisy, incomplete, sparse, heterogeneous and unevenly-sampled clinical data, including both physiological signals and clinical notes. The learned multi-task GP (MTGP) hyperparameters are then used to assess and forecast patient acuity. Experiments were conducted with two real clinical data sets acquired from ICU patients: firstly, estimating cerebrovascular pressure reactivity, an important indicator of secondary damage for traumatic brain injury patients, by learning the interactions between intracranial pressure and mean arterial blood pressure signals, and secondly, mortality prediction using clinical progress notes. In both cases, MTGPs provided improved results: an MTGP model provided better results than single-task GP models for signal interpolation and forecasting (0.91 vs 0.69 RMSE), and the use of MTGP hyperparameters obtained improved results when used as additional classification features (0.812 vs 0.788 AUC). Intel Science and Technology Center for Big Data; National Institutes of Health. (U.S.). National Library of Medicine (Biomedical Informatics Research Training Grant NIH/NLM 2T15 LM007092-22); National Institute of Biomedical Imaging and Bioengineering (U.S.) (R01 Grant EB001659); Singapore. Agency for Science, Technology and Research (Graduate Scholarship
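To make the multi-task idea concrete, here is a minimal sketch of a GP posterior mean in which two unevenly sampled signals share information through a task covariance matrix. It assumes the intrinsic coregionalization form (task covariance `B` multiplied by an RBF kernel over time) and fixed hyperparameters; the abstract does not state the kernel used, and the paper learns its hyperparameters. The function names are hypothetical.

```python
import numpy as np

def rbf(t1, t2, length_scale=1.0):
    """Squared-exponential kernel between two vectors of time stamps."""
    d = t1[:, None] - t2[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def mtgp_posterior_mean(t_obs, task_obs, y_obs, t_new, task_new, B,
                        length_scale=1.0, noise=1e-2):
    """Posterior mean of a multi-task GP with covariance B[i, j] * k(t, t').

    Assumptions: zero prior mean, fixed hyperparameters, and a positive
    semi-definite task covariance matrix B supplied by the caller.
    """
    K = B[task_obs[:, None], task_obs[None, :]] * rbf(t_obs, t_obs, length_scale)
    K_star = B[task_new[:, None], task_obs[None, :]] * rbf(t_new, t_obs, length_scale)
    alpha = np.linalg.solve(K + noise * np.eye(len(t_obs)), y_obs)
    return K_star @ alpha

# Two signals observed at different, unevenly spaced times.
t_obs = np.array([0.0, 1.3, 2.1, 3.7, 0.5, 2.9])
task_obs = np.array([0, 0, 0, 0, 1, 1])            # signal index per observation
y_obs = np.sin(t_obs) + 0.2 * task_obs
B = np.array([[1.0, 0.9], [0.9, 1.0]])              # strongly correlated signals

# Forecast signal 1 at times where only signal 0 was observed nearby.
t_new = np.array([1.3, 3.7])
task_new = np.array([1, 1])
pred = mtgp_posterior_mean(t_obs, task_obs, y_obs, t_new, task_new, B)
```

Because the off-diagonal entry of `B` is large, observations of signal 0 inform predictions for signal 1 at times where signal 1 was never measured. A single-task GP cannot do this.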
Deep Learning Models for Irregularly Sampled and Incomplete Time Series
Irregularly sampled time series data arise naturally in many application domains including biology, ecology, climate science, astronomy, geology, finance, and health. Such data present fundamental challenges to many classical models from machine learning and statistics. The first challenge with modeling such data is the presence of variable time gaps between the observation time points. The second challenge is that the dimensionality of the inputs can be different for different data cases. This occurs naturally due to the fact that different data cases are likely to include different numbers of observations. The third challenge is that different irregularly sampled instances have observations recorded at different times. This results in a lack of temporal alignment across data cases. There could also be a lack of alignment of observation time points across different dimensions in the same multivariate time series. These features of irregularly sampled time series data invalidate the assumption of a coherent fully-observed fixed-dimensional feature space that underlies many basic supervised and unsupervised learning models.
In this thesis, we focus on the development of deep learning models for the problems of supervised and unsupervised learning from irregularly sampled time series data. We begin by introducing a computationally efficient architecture for whole time series classification and regression problems based on the use of a novel deterministic interpolation-based layer that acts as a bridge between multivariate irregularly sampled time series data instances and standard neural network layers that assume regularly-spaced or fixed-dimensional inputs. The architecture is based on the use of a radial basis function (RBF) kernel interpolation network followed by the application of a prediction network. Next, we show how the use of fixed RBF kernel functions can be relaxed through the use of a novel attention-based continuous-time interpolation framework. We show that using attention to learn temporal similarity results in improvements over fixed RBF kernels and other recent approaches in terms of both supervised and unsupervised tasks. Next, we present a novel deep learning framework for probabilistic interpolation that significantly improves uncertainty quantification in the output interpolations. Furthermore, we show that this framework is also able to improve classification performance. As our final contribution, we study fusion architectures for learning from text data combined with irregularly sampled time series data.
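The role of an RBF interpolation layer as a bridge to fixed-dimensional inputs can be sketched as follows. This is a plain normalized kernel smoother with a fixed bandwidth, chosen as an assumption for illustration. The thesis's exact parameterization is not given in the abstract, and the name `rbf_interpolate` and the `alpha` parameter are hypothetical.

```python
import numpy as np

def rbf_interpolate(t_obs, x_obs, t_ref, alpha=1.0):
    """Map one irregularly sampled series onto a fixed reference grid.

    Returns the kernel-weighted average of the observations at each
    reference time, plus the summed kernel weight ("intensity"), which
    indicates how much data lies near each reference point.
    """
    t_obs = np.asarray(t_obs, dtype=float)
    x_obs = np.asarray(x_obs, dtype=float)
    t_ref = np.asarray(t_ref, dtype=float)
    # weights[i, j] = exp(-alpha * (t_ref[i] - t_obs[j])^2)
    weights = np.exp(-alpha * (t_ref[:, None] - t_obs[None, :]) ** 2)
    intensity = weights.sum(axis=1)
    smooth = (weights @ x_obs) / np.maximum(intensity, 1e-12)
    return smooth, intensity

t_ref = np.linspace(0.0, 10.0, 5)                      # shared reference grid
# Two data cases with different numbers of observations and different times.
a_smooth, a_int = rbf_interpolate([0.3, 2.2, 2.9, 7.5], [1.0, 1.4, 1.1, 0.7], t_ref)
b_smooth, b_int = rbf_interpolate([1.0, 9.0], [2.0, 2.5], t_ref)
features = np.stack([a_smooth, b_smooth])              # fixed shape: (2, 5)
```

Both series end up with the same length regardless of how many observations they contain, so they can be stacked and passed to a standard prediction network. The intensity channel keeps the information about where observations were sparse.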
Spatiotemporal Modeling of Multivariate Signals With Graph Neural Networks and Structured State Space Models
Multivariate signals are prevalent in various domains, such as healthcare,
transportation systems, and space sciences. Modeling spatiotemporal
dependencies in multivariate signals is challenging due to (1) long-range
temporal dependencies and (2) complex spatial correlations between sensors. To
address these challenges, we propose representing multivariate signals as
graphs and introduce GraphS4mer, a general graph neural network (GNN)
architecture that captures both spatial and temporal dependencies in
multivariate signals. Specifically, (1) we leverage the Structured State Space
model (S4), a state-of-the-art sequence model, to capture long-term temporal
dependencies and (2) we propose a graph structure learning layer in GraphS4mer
to learn dynamically evolving graph structures in the data. We evaluate our
proposed model on three distinct tasks and show that GraphS4mer consistently
improves over existing models, including (1) seizure detection from
electroencephalography signals, outperforming a previous GNN with
self-supervised pretraining by 3.1 points in AUROC; (2) sleep staging from
polysomnography signals, a 4.1-point improvement in macro-F1 score compared to
existing sleep staging models; and (3) traffic forecasting, reducing MAE by
8.8% compared to existing GNNs and by 1.4% compared to Transformer-based
models.