Time Series Modeling of Irregularly Sampled Multivariate Clinical Data
Building an accurate predictive model of a patient's clinical time series is critical for understanding the patient's condition and its dynamics, and for optimal patient management. Unfortunately, this process is challenging because of: (1) multivariate behaviors: real-world dynamics are multivariate and are better described by multivariate time series (MTS); (2) irregular samples: sequential observations are collected at different times, and the time elapsed between two consecutive observations may vary; and (3) patient variability: clinical MTS vary from patient to patient, and an individual patient may exhibit short-term variability reflecting the different events affecting care and the patient's state.
In this dissertation, we investigate different ways of developing and refining forecasting models from collections of irregularly sampled clinical MTS data. First, we focus on refinements of a popular model for MTS analysis, the linear dynamical system (LDS) (a.k.a. Kalman filter), and its application to MTS forecasting. We propose (1) a regularized LDS learning framework that automatically shuts down an LDS's spurious and unnecessary dimensions and consequently prevents overfitting when only a small amount of data is available; and (2) a generalized LDS learning framework via matrix factorization, which allows various constraints to be easily incorporated to guide the learning process. Second, we study ways of modeling irregularly sampled univariate clinical time series. We develop a new two-layer hierarchical dynamical system model for irregularly sampled clinical time series prediction. We demonstrate that our new system adapts better to irregular samples and supports more accurate predictions. Finally, we propose, develop and experiment with two personalized forecasting frameworks for modeling and predicting the clinical MTS of an individual patient. The first framework relies on model adaptation techniques. It calibrates the population-based model's predictions with patient-specific residual models, which are learned from the difference between the patient's observations and the population-based model's predictions. The second framework relies on adaptive model selection strategies to combine the advantages of the population-based, patient-specific and short-term individualized predictive models. We demonstrate the benefits and advantages of the aforementioned frameworks on synthetic data sets, public time series data sets and clinical data extracted from EHRs.
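The residual-calibration idea in the first personalized framework can be illustrated with a deliberately small sketch. It is not the dissertation's implementation: the population model here is a hypothetical stand-in that repeats the last observed value, the residual model is assumed to be a plain mean of the patient's past one-step errors, and all function names are invented for illustration.

```python
from statistics import mean


def population_forecast(history):
    """Hypothetical population-based model: predict the next value as the last one seen."""
    return history[-1]


def fit_residual_offset(patient_series):
    """Assumed patient-specific residual model: the mean one-step error that the
    population model makes on this patient's own observations."""
    residuals = [
        patient_series[t] - population_forecast(patient_series[:t])
        for t in range(1, len(patient_series))
    ]
    return mean(residuals) if residuals else 0.0


def calibrated_forecast(patient_series):
    """Population prediction corrected by the patient-specific residual estimate."""
    return population_forecast(patient_series) + fit_residual_offset(patient_series)


# A patient whose values rise steadily; the population model alone would lag behind.
series = [10.0, 11.0, 12.0, 13.0]
forecast = calibrated_forecast(series)
```

In this toy case the population model always under-predicts by the same amount, so the learned offset corrects the next forecast. A richer residual model (for example, one that depends on time or on recent values) would slot into `fit_residual_offset` without changing the overall structure.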
Automatic Classification of Irregularly Sampled Time Series with Unequal Lengths: A Case Study on Estimated Glomerular Filtration Rate
A patient's estimated glomerular filtration rate (eGFR) can provide important
information about disease progression and kidney function. Traditionally, an
eGFR time series is interpreted by a human expert labelling it as stable or
unstable. While this approach works for individual patients, its time-consuming
nature precludes the quick evaluation of risk in large numbers of
patients. However, automating this process poses significant challenges as eGFR
measurements are usually recorded at irregular intervals and the series of
measurements differs in length between patients. Here we present a two-tier
system to automatically classify an eGFR trend. First, we model the time series
using Gaussian process regression (GPR) to fill in 'gaps' by resampling a
fixed-size vector of fifty time-dependent observations. Second, we classify the
resampled eGFR time series using a K-NN/SVM classifier, and evaluate its
performance via 5-fold cross-validation. Using this approach we achieved an
F-score of 0.90, compared to 0.96 for 5 human experts when scored amongst
themselves.
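
The two tiers described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the kernel (RBF), its length scale, the noise level, the constant mean, the per-patient resampling grid and the toy data are all assumptions, and the second tier uses only a nearest-neighbour vote rather than the K-NN/SVM classifier from the abstract. A library such as scikit-learn would normally replace the hand-written solver.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two time points."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def gp_resample(times, values, n_points=50, length_scale=1.0, noise=1e-2):
    """Tier 1: GP posterior mean evaluated on an evenly spaced grid that spans
    the patient's own observation window (constant mean = sample mean)."""
    mu = sum(values) / len(values)
    K = [[rbf(s, t, length_scale) + (noise if i == j else 0.0)
          for j, t in enumerate(times)] for i, s in enumerate(times)]
    alpha = solve(K, [v - mu for v in values])
    lo, hi = min(times), max(times)
    grid = [lo + (hi - lo) * k / (n_points - 1) for k in range(n_points)]
    return [mu + sum(rbf(g, t, length_scale) * a for t, a in zip(times, alpha))
            for g in grid]


def knn_predict(train, query, k=1):
    """Tier 2: majority label among the k nearest resampled series."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)


# Toy eGFR series with irregular times (years) and unequal lengths.
train = [
    (gp_resample([0, 1, 2.5, 4], [60, 61, 59, 60]), "stable"),
    (gp_resample([0, 0.5, 2, 3, 4], [60, 50, 40, 30, 20]), "unstable"),
]
query = gp_resample([0, 2, 4], [58, 45, 25])
predicted = knn_predict(train, query, k=1)
```

Because every series is mapped to a vector of the same length, series with different numbers of measurements become directly comparable, which is what lets a standard distance-based classifier be applied in the second tier.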