A Novel Weight-Shared Multi-Stage CNN for Scale Robustness
Convolutional neural networks (CNNs) have demonstrated remarkable results in
image classification on benchmark tasks and in practical applications. CNNs
with deeper architectures have recently achieved even higher performance thanks
to their robustness to translation of objects in images as well as their
numerous parameters and the resulting high expressive power. However, CNNs
have limited robustness to other geometric transformations such as scaling
and rotation. This limits further performance gains of deep CNNs, yet no
established solution exists. This study focuses on scale transformation
and proposes a network architecture called the weight-shared multi-stage
network (WSMS-Net), which consists of multiple stages of CNNs. The proposed
WSMS-Net is easily combined with existing deep CNNs such as ResNet and DenseNet
and enables them to acquire robustness to object scaling. Experimental results
on the CIFAR-10, CIFAR-100, and ImageNet datasets demonstrate that existing
deep CNNs combined with the proposed WSMS-Net achieve higher accuracies for
image classification tasks with only a minor increase in the number of
parameters and computation time.
Comment: accepted version, 13 pages
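A minimal NumPy sketch of the weight-sharing idea the abstract describes: the same convolution kernel is applied at every stage to progressively downscaled copies of the input, and the pooled responses of all stages are concatenated. This is an illustrative toy (single channel, average-pool downscaling, global average pooling), not the paper's actual WSMS-Net; all function names are ours.

```python
import numpy as np

def conv2d(x, k):
    # Valid 2D cross-correlation, single channel (naive loops for clarity).
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def downscale(x):
    # 2x2 average pooling as a crude image downscale.
    H, W = x.shape
    return x[:H // 2 * 2, :W // 2 * 2].reshape(H // 2, 2, W // 2, 2).mean(axis=(1, 3))

def wsms_features(image, kernel, n_stages=2):
    # Each stage processes a progressively downscaled copy of the input
    # with the SAME kernel (weight sharing across stages); the globally
    # average-pooled response of every stage is kept, so an object seen at
    # a different scale excites a different stage of the shared filter.
    feats = []
    x = image
    for _ in range(n_stages):
        feats.append(conv2d(x, kernel).mean())  # global average pool
        x = downscale(x)
    return np.array(feats)
```

Because the kernel is shared, adding stages costs almost no extra parameters, matching the abstract's claim of only a minor parameter increase.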
MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network
The inability to interpret the model prediction in semantically and visually
meaningful ways is a well-known shortcoming of most existing computer-aided
diagnosis methods. In this paper, we propose MDNet to establish a direct
multimodal mapping between medical images and diagnostic reports that can read
images, generate diagnostic reports, retrieve images by symptom descriptions,
and visualize attention, to provide justifications of the network diagnosis
process. MDNet comprises an image model and a language model. The image model
is designed to enhance multi-scale feature ensembles and their utilization
efficiency.
The language model, integrated with our improved attention mechanism, aims to
read and explore discriminative image feature descriptions from reports to
learn a direct mapping from sentence words to image pixels. The overall network
is trained end-to-end using our optimization strategy. On a dataset of
pathology bladder cancer images with their diagnostic reports (BCIDR), we
conduct extensive experiments to demonstrate that MDNet outperforms
comparative baselines. The proposed image model obtains state-of-the-art
performance on two CIFAR datasets as well.
Comment: CVPR 2017 Oral
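The mapping from sentence words to image pixels can be sketched as plain dot-product attention: a word embedding scores every spatial location of a feature map, and a softmax over locations yields the attention map used for visualization. This is a generic sketch under our own assumptions, not MDNet's improved attention mechanism; shapes and names are illustrative.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(z - z.max())
    return e / e.sum()

def word_to_pixel_attention(word_vec, feature_map):
    # feature_map: (H, W, D) spatial image features; word_vec: (D,).
    # Dot-product similarity between the word and each location, normalized
    # with softmax, gives an attention map over pixels; the attention-weighted
    # sum of features is the context vector fed back to the language model.
    H, W, D = feature_map.shape
    scores = feature_map.reshape(H * W, D) @ word_vec  # similarity per location
    attn = softmax(scores).reshape(H, W)               # sums to 1 over pixels
    context = (attn[..., None] * feature_map).sum(axis=(0, 1))
    return attn, context
```

Visualizing `attn` over the input image is what lets such a network justify which regions support a given word in the generated report.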