MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network
The inability to interpret the model prediction in semantically and visually
meaningful ways is a well-known shortcoming of most existing computer-aided
diagnosis methods. In this paper, we propose MDNet to establish a direct
multimodal mapping between medical images and diagnostic reports that can read
images, generate diagnostic reports, retrieve images by symptom descriptions,
and visualize attention, to provide justifications of the network diagnosis
process. MDNet includes an image model and a language model. The image model is
proposed to enhance multi-scale feature ensembles and utilization efficiency.
The language model, integrated with our improved attention mechanism, aims to
read and explore discriminative image feature descriptions from reports to
learn a direct mapping from sentence words to image pixels. The overall network
is trained end-to-end using our developed optimization strategy. Using a
dataset of pathology bladder cancer images and their diagnostic reports
(BCIDR), we conduct experiments demonstrating that MDNet outperforms
comparative baselines. The proposed image model also obtains state-of-the-art
performance on two CIFAR datasets.
Comment: CVPR2017 Ora
Signal2Image Modules in Deep Neural Networks for EEG Classification
Deep learning has revolutionized computer vision by exploiting the increased
availability of big data and the power of parallel computational units such as
graphics processing units. The vast majority of deep learning research is
conducted using images as training data; however, the biomedical domain is rich
in physiological signals that are used for diagnosis and prediction problems.
It is still an open research question how to best utilize signals to train deep
neural networks.
In this paper we define Signal2Image modules (S2Is) as trainable or
non-trainable prefix modules that convert signals, such as
electroencephalography (EEG), to image-like representations, making them
suitable for training image-based deep neural networks, defined as 'base
models'. We compare the accuracy and time performance of four S2Is ('signal as
image', spectrogram, and one- and two-layer Convolutional Neural Networks (CNNs))
combined with a set of 'base models' (LeNet, AlexNet, VGGnet, ResNet, DenseNet)
along with the depth-wise and 1D variations of the latter. We also provide
empirical evidence that the one-layer CNN S2I performs better than non-trainable
S2Is in eleven out of fifteen tested models for classifying EEG signals, and
we present visual comparisons of the outputs of the S2Is.
Comment: 4 pages, 2 figures, 1 table, EMBC 201
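
As an illustration of what a non-trainable S2I of the spectrogram kind might look like, the following is a minimal sketch that turns a 1D signal into a 2D image-like array that an image-based network could consume. It is not the authors' implementation: the function name `spectrogram_s2i`, the frame length, the hop size, the Hann window, and the log scaling with min-max normalization are all assumptions chosen for the example.

```python
import numpy as np


def spectrogram_s2i(signal, frame_len=64, hop=32):
    """Hypothetical non-trainable S2I: 1D signal -> 2D log-magnitude spectrogram.

    Returns an array of shape (frame_len // 2 + 1, n_frames) scaled to [0, 1].
    All parameter defaults are illustrative assumptions, not values from the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.ndim != 1 or signal.size < frame_len:
        raise ValueError("expected a 1D signal with at least frame_len samples")

    # Slice the signal into overlapping frames and apply a Hann window to each.
    n_frames = 1 + (signal.size - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )

    # Magnitude of the real FFT per frame: shape (n_frames, frame_len // 2 + 1).
    magnitude = np.abs(np.fft.rfft(frames, axis=1))

    # Log-compress and transpose so rows are frequency bins, columns are time.
    image = np.log1p(magnitude).T

    # Min-max scale to [0, 1] so the output resembles a grayscale image.
    span = image.max() - image.min()
    if span > 0:
        image = (image - image.min()) / span
    else:
        image = np.zeros_like(image)
    return image
```

For a 256-sample signal with the default settings, the result is a 33 x 7 array (33 frequency bins by 7 time frames). A trainable S2I, such as the one-layer CNN mentioned in the abstract, would replace this fixed transform with learned parameters.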