Attention Gated Networks: Learning to Leverage Salient Regions in Medical Images
We propose a novel attention gate (AG) model for medical image analysis that
automatically learns to focus on target structures of varying shapes and sizes.
Models trained with AGs implicitly learn to suppress irrelevant regions in an
input image while highlighting salient features useful for a specific task.
This eliminates the need for explicit external
tissue/organ localisation modules when using convolutional neural networks
(CNNs). AGs can be easily integrated into standard CNN models such as VGG or
U-Net architectures with minimal computational overhead while increasing the
model sensitivity and prediction accuracy. The proposed AG models are evaluated
on a variety of tasks, including medical image classification and segmentation.
For classification, we demonstrate the use case of AGs in scan plane detection
for fetal ultrasound screening. We show that the proposed attention mechanism
can provide efficient object localisation while improving the overall
prediction performance by reducing false positives. For segmentation, the
proposed architecture is evaluated on two large 3D CT abdominal datasets with
manual annotations for multiple organs. Experimental results show that AG
models consistently improve the prediction performance of the base
architectures across different datasets and training sizes while preserving
computational efficiency. Moreover, AGs guide the model activations to be
focused around salient regions, which provides better insights into how model
predictions are made. The source code for the proposed AG models is publicly
available.
Comment: Accepted for Medical Image Analysis (Special Issue on Medical Imaging
with Deep Learning). arXiv admin note: substantial text overlap with
arXiv:1804.03999, arXiv:1804.0533
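The gating idea in the abstract above can be illustrated with a minimal sketch. It assumes an additive attention form (a ReLU over the summed projections of the feature and gating vectors, followed by a sigmoid score); the abstract itself does not spell out the formula, so the function name `attention_gate` and the parameters `w_x`, `w_g`, `psi` and `bias` are hypothetical and chosen only for illustration. Pure Python lists stand in for feature maps.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def attention_gate(features, gating, w_x, w_g, psi, bias=0.0):
    """Scale each location's feature vector by a coefficient in (0, 1).

    features: list of per-location feature vectors
    gating:   list of per-location gating vectors (same length as features)
    w_x, w_g: hypothetical projection matrices onto a shared hidden size
    psi:      hypothetical vector mapping the hidden activation to a score
    """
    gated, alphas = [], []
    for x, g in zip(features, gating):
        # Additive attention (an assumption): ReLU over the summed projections.
        hidden = [max(0.0, a + b)
                  for a, b in zip(matvec(w_x, x), matvec(w_g, g))]
        alpha = sigmoid(sum(p * h for p, h in zip(psi, hidden)) + bias)
        alphas.append(alpha)
        # Low alpha suppresses the location, high alpha keeps it.
        gated.append([alpha * v for v in x])
    return gated, alphas


features = [[1.0, 2.0], [0.5, -1.0]]
gating = [[1.0], [0.0]]
w_x = [[1.0, 0.0], [0.0, 1.0]]
w_g = [[1.0], [1.0]]
psi = [1.0, 1.0]
gated, alphas = attention_gate(features, gating, w_x, w_g, psi, bias=-2.0)
```

In this toy input the first location receives a larger coefficient than the second, so its features pass through nearly unchanged while the second location is attenuated.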
Bivariate Beta-LSTM
Long Short-Term Memory (LSTM) captures long-term dependencies through a cell
state maintained by the input and forget gates, each of which models its
output as a value in [0,1] through a sigmoid function. However, because the
sigmoid function changes gradually, the sigmoid gate is not flexible in
representing multi-modality or skewness. In addition, previous models do not
model the correlation between the gates, which would offer a new way to
introduce an inductive bias on the relationship between previous and current input.
This paper proposes a new gate structure with the bivariate Beta distribution.
The proposed gate structure enables probabilistic modeling of the gates within
the LSTM cell so that modelers can customize the cell state flow with
priors and distributions. Moreover, we show theoretically that the gradient has
a higher upper bound than with the sigmoid function, and we observe empirically
that the bivariate Beta distribution gate structure provides higher
gradient values in training. We demonstrate the effectiveness of the bivariate
Beta gate structure on sentence classification, image classification, polyphonic
music modeling, and image caption generation.
Comment: AAAI 202
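As a rough illustration of correlated gates in [0,1], the sketch below draws an (input, forget) gate pair from a bivariate Beta built from three Gamma variables with a shared denominator term. This construction is an assumption made for the example and is not necessarily the one used in the paper; the names `sample_bivariate_beta` and `cell_update` and the shape parameters `a`, `b`, `c` are hypothetical. A real model would produce the shape parameters from the hidden state and input rather than fixing them.

```python
import random


def sample_bivariate_beta(a, b, c, rng):
    """Draw two positively correlated values in (0, 1).

    u, v, w are independent Gamma variables; sharing w between the two
    ratios makes the pair correlated (an assumed construction).
    """
    u = rng.gammavariate(a, 1.0)
    v = rng.gammavariate(b, 1.0)
    w = rng.gammavariate(c, 1.0)
    return u / (u + w), v / (v + w)


def cell_update(c_prev, candidate, input_gate, forget_gate):
    """Standard LSTM cell-state update with externally supplied gate values."""
    return forget_gate * c_prev + input_gate * candidate


rng = random.Random(0)
samples = [sample_bivariate_beta(2.0, 2.0, 2.0, rng) for _ in range(2000)]

# Sample covariance between the two gates.
mean_i = sum(i for i, _ in samples) / len(samples)
mean_f = sum(f for _, f in samples) / len(samples)
cov = sum((i - mean_i) * (f - mean_f) for i, f in samples) / len(samples)

input_gate, forget_gate = samples[0]
c_next = cell_update(c_prev=1.0, candidate=0.5,
                     input_gate=input_gate, forget_gate=forget_gate)
```

Unlike a pair of independent sigmoid gates, the sampled pair shares the `w` term, so the two gate values tend to rise and fall together.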