Bootstrapping Graph Convolutional Neural Networks for Autism Spectrum Disorder Classification
Using predictive models to identify patterns that can act as biomarkers for
different neuropathological conditions is becoming highly prevalent. In this
paper, we consider the problem of Autism Spectrum Disorder (ASD) classification
where previous work has shown that it can be beneficial to incorporate a wide
variety of meta features, such as socio-cultural traits, into predictive
modeling. A graph-based approach naturally suits these scenarios, where a
contextual graph captures traits that characterize a population, while the
specific brain activity patterns are utilized as a multivariate signal at the
nodes. Graph neural networks have shown improvements in inference over
graph-structured data. Though the underlying graph strongly dictates the
overall performance, there exists no systematic way of choosing an appropriate
graph in practice, thus making predictive models non-robust. To address this,
we propose a bootstrapped version of graph convolutional neural networks
(G-CNNs) that utilizes an ensemble of weakly trained G-CNNs, and reduces the
sensitivity of the model to the choice of graph construction. We demonstrate its
effectiveness on the challenging Autism Brain Imaging Data Exchange (ABIDE)
dataset and show that our approach improves upon recently proposed graph-based
neural networks. We also show that our method remains more robust to noisy
graphs
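The ensemble aggregation step of such a bootstrapped approach can be sketched as follows; the weak models here are toy stand-ins for G-CNNs trained on different randomly constructed population graphs, and the probability-averaging rule is illustrative rather than the authors' exact implementation:

```python
def ensemble_predict(models, x):
    """Average the class-probability outputs of weakly trained models,
    each assumed to be built on a different random graph construction."""
    probs = [m(x) for m in models]
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]

# Toy stand-ins for weak G-CNNs: each is biased toward one class.
weak_models = [
    lambda x: [0.6, 0.2, 0.2],
    lambda x: [0.6, 0.2, 0.2],
    lambda x: [0.2, 0.6, 0.2],
]
consensus = ensemble_predict(weak_models, x=None)
```

Averaging probabilities rather than hard votes lets no single graph construction dominate the final prediction, which is the sense in which the ensemble reduces sensitivity to the graph choice.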
A Generative Modeling Approach to Limited Channel ECG Classification
Processing temporal sequences is central to a variety of applications in
health care, and in particular multi-channel Electrocardiogram (ECG) is a
highly prevalent diagnostic modality that relies on robust sequence modeling.
While Recurrent Neural Networks (RNNs) have led to significant advances in
automated diagnosis with time-series data, they perform poorly when models are
trained using a limited set of channels. A crucial limitation of existing
solutions is that they rely solely on discriminative models, which tend to
generalize poorly in such scenarios. In order to combat this limitation, we
develop a generative modeling approach to limited channel ECG classification.
This approach first uses a Seq2Seq model to implicitly generate the missing
channel information, and then uses the latent representation to perform the
actual supervised task. This decoupling enables the use of unlabeled data
and also provides highly robust metric spaces for subsequent discriminative
learning. Our experiments with the Physionet dataset clearly evidence the
effectiveness of our approach over standard RNNs in disease prediction.
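The decoupling described above (first infer a latent representation, then fit the classifier on it) can be sketched with hypothetical stand-ins; a real Seq2Seq encoder is learned, while the summary-statistic encoder and linear classifier below are purely illustrative:

```python
def encode(sequence):
    """Stand-in for the learned Seq2Seq encoder: maps a variable-length
    single-channel signal to a fixed-length latent vector."""
    return [sum(sequence) / len(sequence), min(sequence), max(sequence)]

def classify(z, weights, bias):
    """Discriminative stage: a linear classifier on the latent space."""
    score = sum(zi * wi for zi, wi in zip(z, weights)) + bias
    return 1 if score > 0 else 0

z = encode([0.1, 0.9, 0.4, 0.2])          # latent from the generative stage
label = classify(z, weights=[1.0, -0.5, 0.5], bias=-0.6)
```

Because the encoder never sees labels, it can be trained on unlabeled recordings; only the second stage needs supervision.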
Understanding Behavior of Clinical Models under Domain Shifts
The hypothesis that computational models can be reliable enough to be adopted
in prognosis and patient care is revolutionizing healthcare. Deep learning, in
particular, has been a game changer in building predictive models, thus leading
to community-wide data curation efforts. However, due to inherent variabilities
in population characteristics and biological systems, these models are often
biased to the training datasets. This can be limiting when models are deployed
in new environments, when there are systematic domain shifts not known a
priori. In this paper, we propose to emulate, with a given dataset, a large
class of domain shifts that can occur in clinical settings, and argue that
evaluating the behavior of predictive models under those shifts is an
effective way to quantify their reliability. More specifically, we develop an
approach for building realistic scenarios, based on analysis of \textit{disease
landscapes} in multi-label classification. Using the openly available MIMIC-III
EHR dataset for phenotyping, our work sheds light, for the first time, on data
regimes where deep clinical models can fail to generalize. This work emphasizes
the need for novel validation mechanisms driven by real-world domain shifts in
AI for healthcare.
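One simple instance of the emulation idea, a prior-probability shift on a single phenotype label, can be sketched as follows; the record format and the single-label focus are illustrative simplifications of the paper's multi-label disease landscapes:

```python
import random

def shifted_eval_split(records, label, target_rate, size, seed=0):
    """Resample an evaluation split so that the prevalence of `label` is
    forced to `target_rate`, emulating a systematic shift between the
    training population and a deployment population."""
    rng = random.Random(seed)
    pos = [r for r in records if r[label]]
    neg = [r for r in records if not r[label]]
    n_pos = round(size * target_rate)
    return rng.sample(pos, n_pos) + rng.sample(neg, size - n_pos)

cohort = [{"sepsis": i < 30} for i in range(100)]   # 30% prevalence
shifted = shifted_eval_split(cohort, "sepsis", target_rate=0.6, size=20)
```

Evaluating a fixed model on a family of such splits, with prevalence swept over a range, exposes how quickly its performance degrades away from the training distribution.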
A Regularized Attention Mechanism for Graph Attention Networks
Machine learning models that can exploit the inherent structure in data have
gained prominence. In particular, there is a surge in deep learning solutions
for graph-structured data, due to their widespread applicability in several
fields. Graph attention networks (GAT), a recent addition to the broad class of
feature learning models in graphs, utilizes the attention mechanism to
efficiently learn continuous vector representations for semi-supervised
learning problems. In this paper, we perform a detailed analysis of GAT models,
and present interesting insights into their behavior. In particular, we show
that the models are vulnerable to heterogeneous rogue nodes and hence propose
novel regularization strategies to improve the robustness of GAT models. Using
benchmark datasets, we demonstrate performance improvements on semi-supervised
learning, using the proposed robust variant of GAT.
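For context, GAT's attention coefficients are softmax-normalized scores over each node's neighborhood. The sketch below (LeakyReLU omitted for brevity, and the parameter vector `a` chosen arbitrarily) shows why a single rogue neighbor with extreme features can monopolize the weights:

```python
import math

def attention_weights(h_i, neighbor_feats, a):
    """GAT-style coefficients: e_ij = a . [h_i || h_j], followed by a
    softmax over the neighbors j (the full model also applies a LeakyReLU
    to e_ij before the softmax)."""
    scores = [sum(ak * xk for ak, xk in zip(a, h_i + h_j))
              for h_j in neighbor_feats]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

normal = [[1.0, 0.0], [0.9, 0.1]]
rogue = normal + [[10.0, -5.0]]            # one heterogeneous outlier node
w = attention_weights([1.0, 0.0], rogue, a=[0.5, 0.5, 0.5, 0.5])
```

Here the outlier node captures most of the attention mass; the regularizers proposed in the paper aim to keep the learned weights from being dominated by such rogue nodes.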
Learning Stable Multilevel Dictionaries for Sparse Representations
Sparse representations using learned dictionaries are being increasingly used
with success in several data processing and machine learning applications. The
availability of abundant training data necessitates the development of
efficient, robust and provably good dictionary learning algorithms. Algorithmic
stability and generalization are desirable characteristics for dictionary
learning algorithms that aim to build global dictionaries which can efficiently
model any test data similar to the training samples. In this paper, we propose
an algorithm to learn dictionaries for sparse representations from large scale
data, and prove that the proposed learning algorithm is stable and
generalizable asymptotically. The algorithm employs a 1-D subspace clustering
procedure, the K-hyperline clustering, in order to learn a hierarchical
dictionary with multiple levels. We also propose an information-theoretic
scheme to estimate the number of atoms needed in each level of learning and
develop an ensemble approach to learn robust dictionaries. Using the proposed
dictionaries, the sparse code for novel test data can be computed using a
low-complexity pursuit procedure. We demonstrate the stability and
generalization characteristics of the proposed algorithm using simulations. We
also evaluate the utility of the multilevel dictionaries in compressed recovery
and subspace learning applications.
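The K-hyperline step at the heart of each level can be sketched as alternating assignment and refitting, where every cluster is a line through the origin; the initialization, stopping rule, and toy data below are simplified for illustration:

```python
import numpy as np

def k_hyperline(X, K, iters=10, seed=0):
    """K-hyperline clustering: each cluster is a 1-D subspace (a line
    through the origin). Points join the line maximizing |<x, d_k>|, and
    each direction is refit as the top right singular vector of its
    assigned points."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((K, X.shape[1]))
    D /= np.linalg.norm(D, axis=1, keepdims=True)
    assign = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        assign = np.abs(X @ D.T).argmax(axis=1)      # assignment step
        for k in range(K):
            pts = X[assign == k]
            if len(pts):                             # refit step
                D[k] = np.linalg.svd(pts, full_matrices=False)[2][0]
    return D, assign

X = np.array([[2., 0.], [1., 0.], [-3., 0.],
              [0., 2.], [0., -1.], [0., 4.]])
directions, labels = k_hyperline(X, K=2)
```

Unlike K-means, the absolute inner product makes x and -x equivalent, which is what makes the clusters genuine 1-D subspaces rather than centroids.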
Attend and Diagnose: Clinical Time Series Analysis using Attention Models
With widespread adoption of electronic health records, there is an increased
emphasis for predictive models that can effectively deal with clinical
time-series data. Powered by Recurrent Neural Network (RNN) architectures with
Long Short-Term Memory (LSTM) units, deep neural networks have achieved
state-of-the-art results in several clinical prediction tasks. Despite the
success of RNNs, their sequential nature prohibits parallelized computation,
thus making them inefficient, particularly when processing long sequences.
Recently,
architectures which are based solely on attention mechanisms have shown
remarkable success in transduction tasks in NLP, while being computationally
superior. In this paper, for the first time, we utilize attention models for
clinical time-series modeling, thereby dispensing with recurrence entirely. We
develop the \textit{SAnD} (Simply Attend and Diagnose) architecture, which
employs a masked, self-attention mechanism, and uses positional encoding and
dense interpolation strategies for incorporating temporal order. Furthermore,
we develop a multi-task variant of \textit{SAnD} to jointly infer models with
multiple diagnosis tasks. Using the recent MIMIC-III benchmark datasets, we
demonstrate that the proposed approach achieves state-of-the-art performance in
all tasks, outperforming LSTM models and classical baselines with
hand-engineered features. Comment: AAAI 201
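The masked self-attention at the core of such an architecture can be sketched in a single-head, unprojected form; a real model learns query/key/value projections and adds positional encodings, while the causal mask is the piece shown here:

```python
import numpy as np

def causal_self_attention(X):
    """Single-head self-attention with Q = K = V = X and a causal mask,
    so that time step t attends only to steps <= t, preserving temporal
    order without any recurrence."""
    T, d = X.shape
    scores = (X @ X.T) / np.sqrt(d)
    scores[np.triu(np.ones((T, T), dtype=bool), k=1)] = -np.inf
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ X, w

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
out, weights = causal_self_attention(X)
```

Every time step is processed in one matrix product, which is why attention models parallelize where RNNs cannot.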
Calibrating Healthcare AI: Towards Reliable and Interpretable Deep Predictive Models
The widespread adoption of representation learning technologies in clinical
decision making strongly emphasizes the need for characterizing model
reliability and enabling rigorous introspection of model behavior. While the
former need is often addressed by incorporating uncertainty quantification
strategies, the latter challenge is addressed using a broad class of
interpretability techniques. In this paper, we argue that these two objectives
are not necessarily disparate and propose to utilize prediction calibration to
meet both objectives. More specifically, our approach comprises a
calibration-driven learning method, which is also used to design an
interpretability technique based on counterfactual reasoning. Furthermore, we
introduce \textit{reliability plots}, a holistic evaluation mechanism for model
reliability. Using a lesion classification problem with dermoscopy images, we
demonstrate the effectiveness of our approach and infer interesting insights
about the model behavior.
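While reliability plots are the paper's own diagnostic, the underlying calibration notion can be illustrated with the standard expected calibration error (ECE), which compares per-bin confidence to per-bin accuracy:

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Average |accuracy - confidence| gap per confidence bin, weighted by
    bin occupancy; zero for a perfectly calibrated classifier."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        bins[min(int(conf * n_bins), n_bins - 1)].append((conf, ok))
    ece = 0.0
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            accuracy = sum(ok for _, ok in b) / len(b)
            ece += (len(b) / len(confidences)) * abs(accuracy - avg_conf)
    return ece

# An overconfident classifier: 90% confidence, 25% accuracy.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 0, 0, 0])
```

A scalar like ECE summarizes miscalibration, whereas plotting the per-bin gaps, in the spirit of reliability plots, shows where in the confidence range the model is untrustworthy.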
An Unsupervised Approach to Solving Inverse Problems using Generative Adversarial Networks
Solving inverse problems continues to be a challenge in a wide array of
applications, ranging from deblurring and image inpainting to source separation.
Most existing techniques solve such inverse problems by either explicitly or
implicitly finding the inverse of the model. The former class of techniques
require explicit knowledge of the measurement process which can be unrealistic,
and rely on strong analytical regularizers to constrain the solution space,
which often do not generalize well. The latter approaches have had remarkable
success in part due to deep learning, but require a large collection of
source-observation pairs, which can be prohibitively expensive. In this paper,
we propose an unsupervised technique to solve inverse problems with generative
adversarial networks (GANs). Using a pre-trained GAN in the space of source
signals, we show that one can reliably recover solutions to underdetermined
problems in a `blind' fashion, i.e., without knowledge of the measurement
process. We solve this by making successive estimates on the model and the
solution in an iterative fashion. We show promising results in three
challenging applications, namely blind source separation, image deblurring, and
recovering an image from its edge map, and outperform several baselines.
Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification
Automated diagnostic assistants in healthcare necessitate accurate AI models
that can be trained with limited labeled data, can cope with severe class
imbalances and can support simultaneous prediction of multiple disease
conditions. To this end, we present a deep learning framework that utilizes a
number of key components to enable robust modeling in such challenging
scenarios. Using an important use-case in chest X-ray classification, we
provide several key insights on the effective use of data augmentation,
self-training via distillation and confidence tempering for small data learning
in medical imaging. Our results show that, using 85% less labeled data, we can
build predictive models that match the performance of classifiers trained in a
large-scale data setting.
Beyond L2-Loss Functions for Learning Sparse Models
Incorporating sparsity priors in learning tasks can give rise to simple, and
interpretable models for complex high dimensional data. Sparse models have
found widespread use in structure discovery, recovering data from corruptions,
and a variety of large scale unsupervised and supervised learning problems.
Assuming the availability of sufficient data, these methods infer dictionaries
for sparse representations by optimizing for high-fidelity reconstruction. In
most scenarios, the reconstruction quality is measured using the squared
Euclidean distance, and efficient algorithms have been developed for both batch
and online learning cases. However, new application domains motivate looking
beyond conventional loss functions. For example, robust loss functions such as
the $\ell_1$ and Huber losses are useful in learning outlier-resilient models,
and the quantile loss is beneficial in discovering structures that are
representative of a particular quantile. These new applications motivate our
work in generalizing sparse learning to a broad class of convex loss functions.
In particular, we consider the class of piecewise linear-quadratic (PLQ) cost
functions that includes the Huber, $\ell_1$, quantile, Vapnik, and hinge
losses, and smoothed variants of these penalties. We propose an algorithm to
learn dictionaries and obtain sparse codes when the data reconstruction
fidelity is measured using any smooth PLQ cost function. We provide convergence
guarantees for the proposed algorithm, and demonstrate the convergence behavior
using empirical experiments. Furthermore, we present three case studies that
require the use of PLQ cost functions: (i) robust image modeling, (ii) tag
refinement for image annotation and retrieval and (iii) computing empirical
confidence limits for subspace clustering. Comment: 10 pages, 6 figure
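As a concrete member of the PLQ family, the Huber penalty is quadratic near zero and linear in the tails, which is what makes it outlier-resilient relative to the squared loss:

```python
def huber(r, delta=1.0):
    """Huber penalty: 0.5*r^2 for |r| <= delta, and delta*(|r| - delta/2)
    beyond, so large residuals grow linearly instead of quadratically."""
    a = abs(r)
    return 0.5 * r * r if a <= delta else delta * (a - 0.5 * delta)

small, large = huber(0.5), huber(4.0)
```

At a residual of 4, the squared loss contributes 8.0 while Huber contributes only 3.5, so an outlier cannot dominate the reconstruction objective.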