NPRL: Nightly Profile Representation Learning for Early Sepsis Onset Prediction in ICU Trauma Patients
Sepsis is a syndrome that develops in response to the presence of infection.
It is characterized by severe organ dysfunction and is one of the leading
causes of mortality in Intensive Care Units (ICUs) worldwide. These
complications can be reduced through the early administration of antibiotics, so
the ability to anticipate sepsis onset early is crucial to the survival and
well-being of patients. Machine learning algorithms currently deployed in
medical infrastructure have demonstrated poor performance and are insufficient
for anticipating sepsis onset early. In recent years, deep learning
methodologies have been proposed to predict sepsis, but some fail to capture
the time of onset (e.g., classifying a patient's entire visit as developing
sepsis or not) and others are unrealistic to deploy in medical facilities
(e.g., creating training instances with a fixed time to onset, which requires
the onset time to be known a priori). Therefore, in this paper, we first
propose a novel yet realistic prediction framework that predicts each morning
whether sepsis onset will occur within the next 24 hours, using the most
recent data, collected overnight, when patient-to-provider ratios are higher
due to cross-coverage and each patient is observed less closely.
However, as we increase the prediction frequency to daily, the number of
negative instances grows while the number of positive instances stays the
same. This creates a severe class imbalance, making it hard for a machine
learning model to capture rare sepsis cases. To address this problem, we
propose nightly profile representation learning (NPRL) for each patient. We prove
that NPRL can theoretically alleviate the rare event problem. Our empirical
study using data from a level-1 trauma center further demonstrates the
effectiveness of our proposal.
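The daily framing above can be sketched as a labeling procedure: each ICU morning becomes one training instance, labeled positive only if sepsis onset falls within the following 24 hours. This is an illustrative sketch, not the authors' code; the function name, the 6 a.m. cut-off, and the example dates are assumptions. It also makes the class imbalance concrete: a long stay with a single onset yields one positive instance and many negatives.

```python
from datetime import datetime, timedelta

def daily_labels(admit, discharge, onset=None):
    """Generate one (morning, label) instance per ICU day.

    Label is 1 if sepsis onset falls within 24 hours of that morning,
    0 otherwise. Illustrative sketch of the daily prediction framing;
    the 6 a.m. cut-off is an assumed convention.
    """
    labels = []
    # First full morning after admission.
    morning = admit.replace(hour=6, minute=0, second=0, microsecond=0)
    if morning < admit:
        morning += timedelta(days=1)
    while morning < discharge:
        positive = (onset is not None
                    and morning <= onset < morning + timedelta(hours=24))
        labels.append((morning, int(positive)))
        morning += timedelta(days=1)
    return labels

# A 10-day stay with onset on day 7: one positive, nine negatives.
labels = daily_labels(datetime(2023, 1, 1, 12), datetime(2023, 1, 11, 12),
                      onset=datetime(2023, 1, 7, 15))
n_pos = sum(label for _, label in labels)
```

Under this scheme the positive count is fixed by the number of onsets, while the negative count grows with every additional monitored day, which is the imbalance NPRL is designed to alleviate.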
Deep Learning Models for Irregularly Sampled and Incomplete Time Series
Irregularly sampled time series data arise naturally in many application domains including biology, ecology, climate science, astronomy, geology, finance, and health. Such data present fundamental challenges to many classical models from machine learning and statistics. The first challenge with modeling such data is the presence of variable time gaps between the observation time points. The second challenge is that the dimensionality of the inputs can be different for different data cases. This occurs naturally due to the fact that different data cases are likely to include different numbers of observations. The third challenge is that different irregularly sampled instances have observations recorded at different times. This results in a lack of temporal alignment across data cases. There could also be a lack of alignment of observation time points across different dimensions in the same multivariate time series. These features of irregularly sampled time series data invalidate the assumption of a coherent fully-observed fixed-dimensional feature space that underlies many basic supervised and unsupervised learning models.
In this thesis, we focus on the development of deep learning models for the problems of supervised and unsupervised learning from irregularly sampled time series data. We begin by introducing a computationally efficient architecture for whole time series classification and regression problems based on the use of a novel deterministic interpolation-based layer that acts as a bridge between multivariate irregularly sampled time series data instances and standard neural network layers that assume regularly-spaced or fixed-dimensional inputs. The architecture is based on the use of a radial basis function (RBF) kernel interpolation network followed by the application of a prediction network. Next, we show how the use of fixed RBF kernel functions can be relaxed through the use of a novel attention-based continuous-time interpolation framework. We show that using attention to learn temporal similarity results in improvements over fixed RBF kernels and other recent approaches in terms of both supervised and unsupervised tasks. Next, we present a novel deep learning framework for probabilistic interpolation that significantly improves uncertainty quantification in the output interpolations. Furthermore, we show that this framework is also able to improve classification performance. As our final contribution, we study fusion architectures for learning from text data combined with irregularly sampled time series data.
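The core of the interpolation-layer idea can be sketched as follows: an irregularly sampled series is mapped onto a fixed set of reference time points by a kernel-weighted average, producing a fixed-dimensional input for a downstream prediction network. This is a minimal sketch, not the thesis's exact parameterization; the function name, the bandwidth `alpha`, and the per-point normalization are assumptions.

```python
import numpy as np

def rbf_interpolate(obs_times, obs_values, ref_times, alpha=5.0):
    """Interpolate an irregularly sampled series onto fixed reference
    time points using a normalized RBF kernel (illustrative sketch)."""
    # Kernel weight between each reference point and each observation.
    w = np.exp(-alpha * (ref_times[:, None] - obs_times[None, :]) ** 2)
    w /= w.sum(axis=1, keepdims=True)   # normalize weights per reference point
    return w @ obs_values               # weighted average of the observations

t_obs = np.array([0.0, 0.3, 1.1, 2.0])        # irregular observation times
x_obs = np.array([1.0, 2.0, 0.5, 1.5])        # observed values
t_ref = np.linspace(0.0, 2.0, 5)              # regular grid for downstream layers
x_ref = rbf_interpolate(t_obs, x_obs, t_ref)  # fixed-dimensional representation
```

Because the weights are normalized, each interpolated value is a convex combination of the observations, so the output stays within the observed range regardless of how irregular the sampling is.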
Sequential Gaussian Processes for Online Learning of Nonstationary Functions
Many machine learning problems can be framed in the context of estimating
functions, and often these are time-dependent functions that are estimated in
real-time as observations arrive. Gaussian processes (GPs) are an attractive
choice for modeling real-valued nonlinear functions due to their flexibility
and uncertainty quantification. However, the typical GP regression model
suffers from several drawbacks: i) conventional GP inference scales cubically
with the number of observations; ii) updating a GP model
sequentially is not trivial; and iii) covariance kernels often enforce
stationarity constraints on the function, while GPs with non-stationary
covariance kernels are often intractable to use in practice. To overcome these
issues, we propose an online sequential Monte Carlo algorithm to fit mixtures
of GPs that capture non-stationary behavior while allowing for fast,
distributed inference. By formulating hyperparameter optimization as a
multi-armed bandit problem, we accelerate mixing for real-time inference. Our
approach empirically improves performance over state-of-the-art methods for
online GP estimation in the context of prediction for simulated non-stationary
data and hospital time series data.
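The drawbacks listed above can be seen in the conventional baseline: exact GP regression with a stationary squared-exponential kernel requires an O(n^3) linear solve over all n observations at every refit, and the single stationary kernel cannot adapt its smoothness across the input space. The sketch below shows that baseline only, not the paper's sequential Monte Carlo mixture; the function name and hyperparameter values are illustrative assumptions.

```python
import numpy as np

def gp_posterior_mean(X, y, X_star, lengthscale=0.2, noise=0.1):
    """Exact GP posterior mean with a stationary squared-exponential
    kernel: the conventional baseline whose cubic-cost solve and
    stationarity assumption motivate online mixture approaches."""
    def k(a, b):
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale ** 2)
    # O(n^3) solve over all n observations; repeated from scratch
    # whenever a new observation arrives.
    K = k(X, X) + noise * np.eye(len(X))
    return k(X_star, X) @ np.linalg.solve(K, y)

X = np.linspace(0.0, 1.0, 20)          # observation inputs
y = np.sin(2 * np.pi * X)              # noiseless sinusoidal targets
mu = gp_posterior_mean(X, y, np.array([0.25, 0.75]))
```

Appending one observation forces a full refit of `K`, which is why sequential updates are nontrivial for exact GPs and why a distributed, particle-based mixture of GPs is attractive for streaming nonstationary data.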