Attend and Diagnose: Clinical Time Series Analysis using Attention Models
With widespread adoption of electronic health records, there is an increased
emphasis on predictive models that can effectively deal with clinical
time-series data. Powered by Recurrent Neural Network (RNN) architectures with
Long Short-Term Memory (LSTM) units, deep neural networks have achieved
state-of-the-art results in several clinical prediction tasks. Despite the
success of RNNs, their sequential nature prohibits parallelized computing, thus
making them inefficient, particularly when processing long sequences. Recently,
architectures which are based solely on attention mechanisms have shown
remarkable success in transduction tasks in NLP, while being computationally
superior. In this paper, for the first time, we utilize attention models for
clinical time-series modeling, thereby dispensing with recurrence entirely. We
develop the SAnD (Simply Attend and Diagnose) architecture, which
employs a masked self-attention mechanism, and uses positional encoding and
dense interpolation strategies for incorporating temporal order. Furthermore,
we develop a multi-task variant of SAnD to jointly infer models with
multiple diagnosis tasks. Using the recent MIMIC-III benchmark datasets, we
demonstrate that the proposed approach achieves state-of-the-art performance in
all tasks, outperforming LSTM models and classical baselines with
hand-engineered features.
Comment: AAAI 201
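The two ingredients named in the abstract above, positional encoding and masked self-attention, can be illustrated with a minimal sketch. This is not the SAnD implementation. The abstract does not specify the mask or the encoding, so a causal mask (each time step attends only to itself and earlier steps) and the sinusoidal encoding from the Transformer literature are assumed. Query, key, and value projections are left as the identity for brevity, and the function names and toy series are hypothetical.

```python
import math


def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding (assumed; the abstract does not name one)."""
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(d_model):
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            pe[pos][i] = math.sin(angle) if i % 2 == 0 else math.cos(angle)
    return pe


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def masked_self_attention(x):
    """Scaled dot-product self-attention with a causal mask (assumed).

    x is a list of seq_len vectors of equal length d. Step t receives
    non-zero weights only for steps 0..t. Projections are the identity.
    """
    seq_len, d = len(x), len(x[0])
    outputs, weights = [], []
    for t in range(seq_len):
        # Scores against the current and earlier steps only.
        scores = [
            sum(a * b for a, b in zip(x[t], x[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        # Future steps get exactly zero weight.
        row = softmax(scores) + [0.0] * (seq_len - t - 1)
        weights.append(row)
        outputs.append(
            [sum(row[s] * x[s][j] for s in range(seq_len)) for j in range(d)]
        )
    return outputs, weights


# Toy clinical series: 4 time steps, 4 features per step (made-up values).
series = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.1, 0.0, 0.2],
    [0.3, 0.3, 0.3, 0.3],
    [0.9, 0.8, 0.1, 0.0],
]
pe = positional_encoding(len(series), len(series[0]))
x = [[v + p for v, p in zip(row, pe_row)] for row, pe_row in zip(series, pe)]
outputs, weights = masked_self_attention(x)
```

Each row of `weights` can be computed independently of the others, which is the property the abstract contrasts with the step-by-step dependence of an RNN.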
Assessing hyper parameter optimization and speedup for convolutional neural networks
The increased processing power of graphical processing units (GPUs) and the availability of large image datasets have fostered a renewed interest in extracting semantic information from images. Promising results for complex image categorization problems have been achieved using deep learning, with neural networks composed of many layers. Convolutional neural networks (CNNs) are one such architecture, and they provide more opportunities for image classification. Advances in CNNs enable the development of training models using large labelled image datasets, but the hyper parameters need to be specified, which is challenging and complex because there are so many of them. A substantial amount of computational power and processing time is required to determine the optimal hyper parameters that define a model yielding good results. This article provides a survey of the hyper parameter search and optimization methods for CNN architectures.
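As an illustration of what a hyper parameter search looks like in practice, here is a minimal sketch of random search, one widely used method. The abstract does not say which methods the survey covers, so this choice is an assumption. The search space, the parameter names, and `toy_validation_loss` are hypothetical stand-ins. A real objective would train a CNN and return its validation loss, which is where the computational cost mentioned above comes from.

```python
import math
import random


def random_search(objective, space, n_trials=50, seed=0):
    """Sample n_trials configurations at random and keep the best one.

    space maps each hyper parameter name to a function that draws one
    value from a random.Random instance. Lower objective is better.
    """
    rng = random.Random(seed)
    best_cfg, best_score = None, float("inf")
    for _ in range(n_trials):
        cfg = {name: sampler(rng) for name, sampler in space.items()}
        score = objective(cfg)
        if score < best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


# Hypothetical search space for a small CNN.
space = {
    # Learning rate sampled log-uniformly between 1e-5 and 1e-1.
    "learning_rate": lambda rng: 10 ** rng.uniform(-5, -1),
    "num_filters": lambda rng: rng.choice([16, 32, 64, 128]),
    "dropout": lambda rng: rng.uniform(0.0, 0.7),
}


def toy_validation_loss(cfg):
    """Stand-in for 'train the CNN and measure validation loss'."""
    return (
        (math.log10(cfg["learning_rate"]) + 3) ** 2
        + (cfg["dropout"] - 0.3) ** 2
        + abs(cfg["num_filters"] - 64) / 64
    )


best_cfg, best_score = random_search(toy_validation_loss, space, n_trials=200, seed=1)
```

Every trial is independent, so in a real setting the loop can be spread across GPUs or machines. That is one reason random search is a common baseline before more elaborate optimizers are tried.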
The Parameter Houlihan: a solution to high-throughput identifiability indeterminacy for brutally ill-posed problems
One way to interject knowledge into clinically impactful forecasting is to
use data assimilation, a nonlinear regression that projects data onto a
mechanistic physiologic model, instead of a set of functions, such as neural
networks. Such regressions have an advantage of being useful with particularly
sparse, non-stationary clinical data. However, physiological models are often
nonlinear and can have many parameters, leading to potential problems with
parameter identifiability, or the ability to find a unique set of parameters
that minimizes forecasting error. The identifiability problems can be minimized
or eliminated by reducing the number of parameters estimated, but reducing the
number of estimated parameters also reduces the flexibility of the model and
hence increases forecasting error. We propose a method, the parameter Houlihan,
that combines traditional machine learning techniques with data assimilation,
to select the right set of model parameters to minimize forecasting error while
reducing identifiability problems. The method worked well: for our cohort, the
data assimilation-based glucose forecasts and estimates made with the
Houlihan-selected parameter sets generally had lower forecasting errors than
those made with other parameter selection methods, such as by-hand parameter
selection. Nevertheless, the forecast with the lowest forecast error does not
always accurately represent physiology, but further advancements of the
algorithm provide a path for improving physiologic fidelity as well. Our hope
is that this methodology represents a first step toward combining machine
learning with data assimilation and provides a lower-threshold entry point for
using data assimilation with clinical data by helping select the right
parameters to estimate.