Path Imputation Strategies for Signature Models of Irregular Time Series
The signature transform is a 'universal nonlinearity' on the space of
continuous vector-valued paths, and has received attention for use in machine
learning on time series. However, real-world temporal data is typically
observed at discrete points in time, and must first be transformed into a
continuous path before signature techniques can be applied. We make this step
explicit by characterising it as an imputation problem, and empirically assess
the impact of various imputation strategies when applying signature-based
neural nets to irregular time series data. For one of these strategies,
Gaussian process (GP) adapters, we propose an extension (GP-PoM) that makes uncertainty information directly available to the subsequent classifier while at the same time avoiding costly Monte Carlo (MC) sampling. In our
experiments, we find that the choice of imputation drastically affects shallow
signature models, whereas deeper architectures are more robust. Next, we
observe that uncertainty-aware predictions (based on GP-PoM or indicator
imputations) are beneficial for predictive performance, even compared to the
uncertainty-aware training of conventional GP adapters. In conclusion, we have
demonstrated that the path construction is indeed crucial for signature models
and that our proposed strategy leads to competitive performance in general,
while improving the robustness of signature models in particular.
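As a rough illustration of the GP-adapter idea underlying GP-PoM, the sketch below fits an exact GP to one irregularly observed channel and hands the posterior mean and variance at a regular query grid to a downstream classifier as features, instead of drawing Monte Carlo samples. The RBF kernel, noise level, and helper names are illustrative assumptions, not the authors' implementation.

```python
# Minimal "posterior moments" GP adapter sketch: pass mean and variance directly
# to the classifier rather than MC samples. All names here are hypothetical.
import numpy as np

def rbf_kernel(a: np.ndarray, b: np.ndarray, lengthscale: float = 1.0) -> np.ndarray:
    """Squared-exponential kernel between two sets of 1-D time points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior_moments(t_obs, y_obs, t_query, noise=0.1, lengthscale=1.0):
    """Exact GP regression: posterior mean and variance at t_query."""
    K = rbf_kernel(t_obs, t_obs, lengthscale) + noise**2 * np.eye(len(t_obs))
    Ks = rbf_kernel(t_query, t_obs, lengthscale)
    Kss = rbf_kernel(t_query, t_query, lengthscale)
    alpha = np.linalg.solve(K, y_obs)
    mean = Ks @ alpha
    var = np.diag(Kss - Ks @ np.linalg.solve(K, Ks.T))
    return mean, var

# Irregularly observed channel -> uncertainty-aware features for a classifier.
t_obs = np.array([0.1, 0.7, 1.9, 3.2])
y_obs = np.array([0.3, 0.9, 0.4, -0.2])
t_query = np.linspace(0.0, 4.0, 16)
mean, var = gp_posterior_moments(t_obs, y_obs, t_query)
features = np.concatenate([mean, var])  # classifier sees both values and uncertainty
```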
Interpolation-Prediction Networks for Irregularly Sampled Time Series
In this paper, we present a new deep learning architecture for addressing the
problem of supervised learning with sparse and irregularly sampled multivariate
time series. The architecture is based on the use of a semi-parametric
interpolation network followed by the application of a prediction network. The
interpolation network allows for information to be shared across multiple
dimensions of a multivariate time series during the interpolation stage, while
any standard deep learning model can be used for the prediction network. This
work is motivated by the analysis of physiological time series data in
electronic health records, which are sparse, irregularly sampled, and
multivariate. We investigate the performance of this architecture on both
classification and regression tasks, showing that our approach outperforms a
range of baseline and recently proposed models. Comment: International Conference on Learning Representations. arXiv admin note: substantial text overlap with arXiv:1812.0053
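A minimal sketch of the interpolation-then-prediction pattern described above: irregular observations are smoothed onto a fixed reference grid with a learnable-bandwidth kernel, and the gridded series feeds a standard sequence model. The shapes, the single smoothing kernel, and the GRU prediction head are assumptions for illustration; the actual interpolation network is richer (e.g., it also interpolates across channels).

```python
# Kernel-smoother interpolation onto a reference grid, followed by any standard
# sequence model. Dimensions and the GRU head are assumed for illustration.
import torch
import torch.nn as nn

class KernelInterpolation(nn.Module):
    def __init__(self, n_ref: int = 32, t_max: float = 48.0):
        super().__init__()
        self.register_buffer("t_ref", torch.linspace(0.0, t_max, n_ref))
        self.log_bandwidth = nn.Parameter(torch.zeros(1))  # learned smoothing width

    def forward(self, t_obs, y_obs, mask):
        # t_obs, y_obs, mask: (batch, channels, n_obs); mask is 1 where observed.
        bw = torch.exp(self.log_bandwidth)
        d = self.t_ref.view(1, 1, -1, 1) - t_obs.unsqueeze(2)      # (B, C, n_ref, n_obs)
        w = torch.exp(-0.5 * (d / bw) ** 2) * mask.unsqueeze(2)
        w = w / (w.sum(dim=-1, keepdim=True) + 1e-6)
        return (w * y_obs.unsqueeze(2)).sum(dim=-1)                # (B, C, n_ref)

interp = KernelInterpolation()
predictor = nn.GRU(input_size=4, hidden_size=64, batch_first=True)  # 4 channels assumed

t = torch.rand(8, 4, 20) * 48.0
y = torch.randn(8, 4, 20)
m = (torch.rand(8, 4, 20) > 0.5).float()
grid = interp(t, y, m)                        # (8, 4, 32) regular representation
out, _ = predictor(grid.permute(0, 2, 1))     # prediction network over the reference grid
```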
Learning to Detect Sepsis with a Multitask Gaussian Process RNN Classifier
We present a scalable end-to-end classifier that uses streaming physiological
and medication data to accurately predict the onset of sepsis, a
life-threatening complication from infections that has high mortality and
morbidity. Our proposed framework models the multivariate trajectories of
continuous-valued physiological time series using multitask Gaussian processes,
seamlessly accounting for the high uncertainty, frequent missingness, and
irregular sampling rates typically associated with real clinical data. The
Gaussian process is directly connected to a black-box classifier that predicts
whether a patient will become septic, chosen in our case to be a recurrent
neural network to account for the extreme variability in the length of patient
encounters. We show how to scale the computations associated with the Gaussian
process in a manner so that the entire system can be discriminatively trained
end-to-end using backpropagation. In a large cohort of heterogeneous inpatient
encounters at our university health system we find that it outperforms several
baselines at predicting sepsis, and yields 19.4% and 55.5% improved areas under
the Receiver Operating Characteristic and Precision Recall curves as compared
to the NEWS score currently used by our hospital. Comment: Presented at the 34th International Conference on Machine Learning (ICML 2017), Sydney, Australia
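The wiring of the end-to-end idea can be sketched as follows: a GP with learnable hyperparameters maps irregular observations onto a regular latent grid, a reparameterised posterior draw feeds an RNN classifier, and gradients flow back through the GP. This single-task toy omits the multitask structure and the scalable linear-algebra machinery used in the paper; all names and sizes are assumptions.

```python
# End-to-end differentiable GP -> RNN pipeline (single-task toy version).
import torch
import torch.nn as nn

class GPToGrid(nn.Module):
    def __init__(self, grid):
        super().__init__()
        self.register_buffer("grid", grid)
        self.log_ls = nn.Parameter(torch.zeros(1))          # kernel lengthscale
        self.log_noise = nn.Parameter(torch.zeros(1) - 2.0)  # observation noise

    def k(self, a, b):
        d = a[:, None] - b[None, :]
        return torch.exp(-0.5 * (d / torch.exp(self.log_ls)) ** 2)

    def forward(self, t_obs, y_obs):
        K = self.k(t_obs, t_obs) + torch.exp(self.log_noise) * torch.eye(len(t_obs))
        Ks = self.k(self.grid, t_obs)
        mean = Ks @ torch.linalg.solve(K, y_obs)
        var = (1.0 - (Ks * torch.linalg.solve(K, Ks.T).T).sum(-1)).clamp_min(1e-6)
        return mean + var.sqrt() * torch.randn_like(mean)    # reparameterised draw

gp = GPToGrid(torch.linspace(0.0, 24.0, 24))
rnn = nn.GRU(input_size=1, hidden_size=32, batch_first=True)
head = nn.Linear(32, 1)

t_obs, y_obs = torch.tensor([1.0, 3.5, 9.0, 20.0]), torch.tensor([0.2, 0.8, 0.1, -0.4])
z = gp(t_obs, y_obs)                      # latent series on the regular grid
_, h = rnn(z.view(1, -1, 1))
sepsis_logit = head(h[-1])                # trained end-to-end with backpropagation
```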
Scalable Joint Models for Reliable Uncertainty-Aware Event Prediction
Missing data and noisy observations pose significant challenges for reliably
predicting events from irregularly sampled multivariate time series
(longitudinal) data. Imputation methods, which are typically used for
completing the data prior to event prediction, lack a principled mechanism to
account for the uncertainty due to missingness. Alternatively, state-of-the-art
joint modeling techniques can be used for jointly modeling the longitudinal and
event data and compute event probabilities conditioned on the longitudinal
observations. These approaches, however, make strong parametric assumptions and
do not easily scale to multivariate signals with many observations. Our
proposed approach introduces several key innovations. First, we develop a
flexible and scalable joint model based upon sparse multiple-output Gaussian
processes. Unlike state-of-the-art joint models, the proposed model can explain
highly challenging structure including non-Gaussian noise while scaling to
large data. Second, we derive an optimal policy for predicting events using the
distribution of the event occurrence estimated by the joint model. The derived
policy trades off the cost of a delayed detection versus incorrect assessments
and abstains from making decisions when the estimated event probability does
not satisfy the derived confidence criteria. Experiments on a large dataset
show that the proposed framework significantly outperforms state-of-the-art
techniques in event prediction. Comment: To appear in IEEE Transactions on Pattern Analysis and Machine Intelligence
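A toy version of a cost-sensitive decision rule with an abstain option, in the spirit of the derived policy: pick the action with the lowest expected cost given the estimated event probability, and defer when that probability is too ambiguous. The cost values and resulting thresholds below are made-up assumptions.

```python
# Expected-cost decision rule with abstention (illustrative costs only).
def decide(p_event: float,
           cost_false_alarm: float = 1.0,
           cost_missed_or_delayed: float = 5.0,
           cost_abstain: float = 0.5) -> str:
    """Pick the action with the lowest expected cost given P(event)."""
    expected_cost = {
        "alarm":    (1.0 - p_event) * cost_false_alarm,    # alarm raised but no event
        "no_alarm": p_event * cost_missed_or_delayed,      # event occurs but no alarm
        "abstain":  cost_abstain,                          # defer the decision
    }
    return min(expected_cost, key=expected_cost.get)

for p in (0.05, 0.2, 0.6, 0.95):
    print(p, decide(p))   # abstains only in the ambiguous middle range
```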
Modeling Irregularly Sampled Clinical Time Series
While the volume of electronic health records (EHR) data continues to grow,
it remains rare for hospital systems to capture dense physiological data
streams, even in the data-rich intensive care unit setting. Instead, typical
EHR records consist of sparse and irregularly observed multivariate time
series, which are well understood to present particularly challenging problems
for machine learning methods. In this paper, we present a new deep learning
architecture for addressing this problem based on the use of a semi-parametric
interpolation network followed by the application of a prediction network. The
interpolation network allows for information to be shared across multiple
dimensions during the interpolation stage, while any standard deep learning
model can be used for the prediction network. We investigate the performance of
this architecture on the problems of mortality and length of stay prediction. Comment: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018
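To make the two downstream tasks concrete, the sketch below places a binary mortality head and a length-of-stay regression head on top of one shared sequence representation; the GRU encoder merely stands in for the interpolation-prediction network, and all dimensions are assumptions.

```python
# Two task-specific heads over a shared encoder (placeholder for the
# interpolation-prediction architecture).
import torch
import torch.nn as nn

class TwoHeadPredictor(nn.Module):
    def __init__(self, repr_dim: int = 64):
        super().__init__()
        self.encoder = nn.GRU(input_size=8, hidden_size=repr_dim, batch_first=True)
        self.mortality_head = nn.Linear(repr_dim, 1)   # binary outcome (logit)
        self.los_head = nn.Linear(repr_dim, 1)         # length of stay (regression)

    def forward(self, gridded_series):
        # gridded_series: (batch, n_ref, channels), e.g. output of an interpolation stage
        _, h = self.encoder(gridded_series)
        h = h[-1]
        return self.mortality_head(h), self.los_head(h)

model = TwoHeadPredictor()
mortality_logit, los = model(torch.randn(16, 32, 8))
loss = (nn.functional.binary_cross_entropy_with_logits(mortality_logit, torch.rand(16, 1))
        + nn.functional.mse_loss(los, torch.rand(16, 1) * 10))
```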
Progressive Growing of Neural ODEs
Neural Ordinary Differential Equations (NODEs) have proven to be a powerful
modeling tool for approximating (interpolation) and forecasting (extrapolation)
irregularly sampled time series data. However, their performance degrades
substantially when applied to real-world data, especially long-term data with
complex behaviors (e.g., long-term trend across years, mid-term seasonality
across months, and short-term local variation across days). To address the
modeling of such complex data with different behaviors at different frequencies
(time spans), we propose a novel progressive learning paradigm of NODEs for
long-term time series forecasting. Specifically, following the principle of
curriculum learning, we gradually increase the complexity of data and network
capacity as training progresses. Our experiments with both synthetic data and
real traffic data (PeMS Bay Area traffic data) show that our training
methodology consistently improves the performance of vanilla NODEs by over 64%.
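A hedged sketch of what a curriculum over forecast horizons could look like for a Neural ODE forecaster: early stages train on short horizons (simpler behaviour) and later stages extend them. The fixed-step Euler integrator, the tiny ODE function, and the stage schedule are stand-ins; the paper additionally grows network capacity, which is omitted here.

```python
# Curriculum-style training loop for a toy Neural ODE forecaster.
import torch
import torch.nn as nn

class TinyNODE(nn.Module):
    def __init__(self, dim: int = 1, hidden: int = 32):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim))

    def forward(self, z0: torch.Tensor, n_steps: int, dt: float = 0.1) -> torch.Tensor:
        zs, z = [], z0
        for _ in range(n_steps):            # explicit Euler integration of dz/dt = f(z)
            z = z + dt * self.f(z)
            zs.append(z)
        return torch.stack(zs, dim=1)       # (batch, n_steps, dim)

model = TinyNODE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
t_full = torch.linspace(0.0, 6.0, 60)
series = torch.sin(t_full).view(1, -1, 1)   # stand-in for a long-term signal

for horizon in (10, 20, 40, 59):            # curriculum: progressively longer horizons
    for _ in range(200):
        pred = model(series[:, 0], n_steps=horizon)
        loss = ((pred - series[:, 1:horizon + 1]) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
```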
Integrating Physiological Time Series and Clinical Notes with Deep Learning for Improved ICU Mortality Prediction
Intensive Care Unit Electronic Health Records (ICU EHRs) store multimodal
data about patients including clinical notes, sparse and irregularly sampled
physiological time series, lab results, and more. To date, most methods
designed to learn predictive models from ICU EHR data have focused on a single
modality. In this paper, we leverage the recently proposed
interpolation-prediction deep learning architecture (Shukla and Marlin 2019) as
a basis for exploring how physiological time series data and clinical notes can
be integrated into a unified mortality prediction model. We study both early
and late fusion approaches and demonstrate how the relative predictive value of
clinical text and physiological data changes over time. Our results show that a
late fusion approach can provide a statistically significant improvement in
mortality prediction performance over using individual modalities in isolation. Comment: Presented at the ACM Conference on Health, Inference and Learning (Workshop Track), 2020
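A schematic late-fusion head illustrating the decision-level combination studied above: each modality is scored by its own model and the per-modality probabilities are mixed with a learned weight. Both unimodal encoders below are placeholders, and all dimensions are assumed.

```python
# Decision-level (late) fusion of a time-series model and a clinical-notes model.
import torch
import torch.nn as nn

class LateFusionMortality(nn.Module):
    def __init__(self, ts_dim: int = 64, text_dim: int = 128):
        super().__init__()
        self.ts_head = nn.Linear(ts_dim, 1)        # stands in for the time-series model
        self.text_head = nn.Linear(text_dim, 1)    # stands in for the notes model
        self.alpha = nn.Parameter(torch.tensor(0.5))  # learned mixing weight

    def forward(self, ts_repr, text_repr):
        p_ts = torch.sigmoid(self.ts_head(ts_repr))
        p_text = torch.sigmoid(self.text_head(text_repr))
        a = torch.sigmoid(self.alpha)               # keep the weight in (0, 1)
        return a * p_ts + (1 - a) * p_text          # fused mortality probability

model = LateFusionMortality()
p = model(torch.randn(4, 64), torch.randn(4, 128))  # (4, 1) fused predictions
```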
Unsupervised Online Anomaly Detection On Irregularly Sampled Or Missing Valued Time-Series Data Using LSTM Networks
We study anomaly detection and introduce an algorithm that processes variable
length, irregularly sampled sequences or sequences with missing values. Our
algorithm is fully unsupervised; however, it can be readily extended to supervised or semi-supervised cases when anomaly labels are present, as remarked throughout the paper. Our approach uses Long Short-Term Memory (LSTM)
networks in order to extract temporal features and find the most relevant
feature vectors for anomaly detection. We incorporate the sampling time
information to our model by modulating the standard LSTM model with time
modulation gates. After obtaining the most relevant features from the LSTM, we
label the sequences using a Support Vector Data Description (SVDD) model. We
introduce a loss function and then jointly optimize the feature extraction and
sequence processing mechanisms in an end-to-end manner. Through this joint
optimization, the LSTM extracts the most relevant features for anomaly detection, later to be used in the SVDD, hence completely removing the need for
feature selection by expert knowledge. Furthermore, we provide a training
algorithm for the online setup, where we optimize our model parameters with
individual sequences as the new data arrives. Finally, on real-life datasets,
we show that our model significantly outperforms the standard approaches thanks
to its combination of LSTM with SVDD and joint optimization. Comment: 11 pages
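An illustrative sketch of the two ingredients described above: an LSTM whose state is modulated by the elapsed time between samples, and an SVDD-style objective that pulls normal feature vectors towards a learned centre. The decay-based gate and the soft-boundary loss are assumptions for clarity, not the paper's exact parameterisation.

```python
# Time-modulated LSTM features plus a soft-boundary SVDD objective (toy version).
import torch
import torch.nn as nn

class TimeModulatedLSTM(nn.Module):
    def __init__(self, in_dim: int, hidden: int):
        super().__init__()
        self.cell = nn.LSTMCell(in_dim, hidden)
        self.decay = nn.Linear(1, hidden)            # maps delta-t to a modulation gate

    def forward(self, x, dt):
        # x: (batch, steps, in_dim); dt: (batch, steps, 1) elapsed time per sample
        h = x.new_zeros(x.size(0), self.cell.hidden_size)
        c = x.new_zeros(x.size(0), self.cell.hidden_size)
        for s in range(x.size(1)):
            gate = torch.sigmoid(self.decay(dt[:, s]))   # shrink memory after long gaps
            h, c = self.cell(x[:, s], (gate * h, gate * c))
        return h                                         # final feature vector

def svdd_loss(features, center, radius, nu: float = 0.1):
    """Soft-boundary SVDD: small radius plus penalty for points outside the sphere."""
    dist = ((features - center) ** 2).sum(dim=1)
    return radius ** 2 + (1.0 / nu) * torch.relu(dist - radius ** 2).mean()

enc = TimeModulatedLSTM(in_dim=3, hidden=16)
center = torch.zeros(16)
radius = torch.tensor(1.0, requires_grad=True)
feats = enc(torch.randn(8, 12, 3), torch.rand(8, 12, 1))
loss = svdd_loss(feats, center, radius)      # jointly optimised with the LSTM parameters
```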
Set Functions for Time Series
Despite the eminent successes of deep neural networks, many architectures are
often hard to transfer to irregularly-sampled and asynchronous time series that
commonly occur in real-world datasets, especially in healthcare applications.
This paper proposes a novel approach for classifying irregularly-sampled time
series with unaligned measurements, focusing on high scalability and data
efficiency. Our method, SeFT (Set Functions for Time Series), is based on recent advances in differentiable set function learning; it is extremely parallelizable and has a beneficial memory footprint, thus scaling well to large datasets of long time series and to online monitoring scenarios. Furthermore, our approach permits
quantifying per-observation contributions to the classification outcome. We
extensively compare our method with existing algorithms on multiple healthcare
time series datasets and demonstrate that it performs competitively whilst
significantly reducing runtime. Comment: Accepted at the International Conference on Machine Learning (ICML) 2020
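A minimal deep-sets-style sketch of the idea: every observation is a (time, value, modality) triple, embedded independently and pooled with attention weights so that each observation's contribution to the prediction can be read off. The encodings and dimensions are assumptions, not the SeFT implementation.

```python
# Permutation-invariant classifier over a set of (time, value, modality) observations.
import torch
import torch.nn as nn

class SetClassifier(nn.Module):
    def __init__(self, n_modalities: int = 10, emb: int = 32):
        super().__init__()
        self.mod_emb = nn.Embedding(n_modalities, emb)
        self.phi = nn.Sequential(nn.Linear(emb + 2, emb), nn.ReLU(), nn.Linear(emb, emb))
        self.attn = nn.Linear(emb, 1)
        self.rho = nn.Linear(emb, 1)

    def forward(self, times, values, modalities, mask):
        # times, values: (batch, n_obs); modalities: long (batch, n_obs); mask: bool
        x = torch.cat([times.unsqueeze(-1), values.unsqueeze(-1),
                       self.mod_emb(modalities)], dim=-1)
        h = self.phi(x)                                           # per-observation embedding
        scores = self.attn(h).squeeze(-1).masked_fill(~mask, -1e9)
        w = torch.softmax(scores, dim=-1)                         # per-observation weight
        pooled = (w.unsqueeze(-1) * h).sum(dim=1)                 # permutation-invariant pool
        return self.rho(pooled), w                                # logit + contributions

model = SetClassifier()
B, N = 4, 50
logit, contrib = model(torch.rand(B, N), torch.randn(B, N),
                       torch.randint(0, 10, (B, N)), torch.rand(B, N) > 0.2)
```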
A Survey on Principles, Models and Methods for Learning from Irregularly Sampled Time Series
Irregularly sampled time series data arise naturally in many application
domains including biology, ecology, climate science, astronomy, and health.
Such data represent fundamental challenges to many classical models from
machine learning and statistics due to the presence of non-uniform intervals
between observations. However, there has been significant progress within the
machine learning community over the last decade on developing specialized
models and architectures for learning from irregularly sampled univariate and
multivariate time series data. In this survey, we first describe several axes
along which approaches to learning from irregularly sampled time series differ
including what data representations they are based on, what modeling primitives
they leverage to deal with the fundamental problem of irregular sampling, and
what inference tasks they are designed to perform. We then survey the recent
literature organized primarily along the axis of modeling primitives. We
describe approaches based on temporal discretization, interpolation,
recurrence, attention and structural invariance. We discuss similarities and
differences between approaches and highlight primary strengths and weaknesses. Comment: Presented at NeurIPS 2020 Workshop: ML Retrospectives, Surveys & Meta-Analyses (ML-RSA)