Efficient video indexing for monitoring disease activity and progression in the upper gastrointestinal tract
Endoscopy is a routine imaging technique used for both diagnosis and
minimally invasive surgical treatment. While the endoscopy video contains a
wealth of information, tools to capture this information for the purpose of
clinical reporting are rather poor. To date, endoscopists do not have any
access to tools that enable them to browse the video data in an efficient and
user-friendly manner. Fast and reliable video retrieval methods could, for
example, allow them to review data from previous exams and therefore improve
their ability to monitor disease progression. Deep learning provides new
avenues of compressing and indexing video in an extremely efficient manner. In
this study, we propose to use an autoencoder for efficient video compression
and fast retrieval of video images. To boost the accuracy of video image
retrieval and to address data variability such as multi-modality and viewpoint
changes, we propose the integration of a Siamese network. We demonstrate that
our approach is competitive in retrieving images from 3 large-scale videos of 3
different patients obtained against the query samples of their previous
diagnosis. Quantitative validation shows that the combined approach yields an
overall improvement of 5% and 8% over classical and variational autoencoders,
respectively.
Comment: Accepted at IEEE International Symposium on Biomedical Imaging (ISBI), 201
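The retrieval step described in this abstract can be illustrated with a minimal sketch. It assumes each video frame has already been compressed to a short latent code by some encoder; the frame names, the three-dimensional codes, and the use of cosine similarity are all hypothetical choices made for illustration and are not taken from the paper, which may rank frames differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length latent codes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero code as matching nothing
    return dot / (norm_a * norm_b)


def retrieve(query_code, index, top_k=3):
    """Return the top_k (frame_id, similarity) pairs, most similar first."""
    scored = [
        (frame_id, cosine_similarity(query_code, code))
        for frame_id, code in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical index: frame id -> latent code produced by an encoder.
index = {
    "frame_0001": [0.9, 0.1, 0.0],
    "frame_0002": [0.0, 1.0, 0.0],
    "frame_0003": [0.8, 0.2, 0.1],
}

# Hypothetical latent code of a query image from a previous exam.
query = [1.0, 0.0, 0.0]
results = retrieve(query, index, top_k=2)
for frame_id, score in results:
    print(frame_id, round(score, 3))
```

Because only the short codes are compared, the index stays small relative to the raw video, which is the practical appeal of indexing in a learned latent space.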
MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network
The inability to interpret the model prediction in semantically and visually
meaningful ways is a well-known shortcoming of most existing computer-aided
diagnosis methods. In this paper, we propose MDNet to establish a direct
multimodal mapping between medical images and diagnostic reports that can read
images, generate diagnostic reports, retrieve images by symptom descriptions,
and visualize attention, to provide justifications of the network diagnosis
process. MDNet includes an image model and a language model. The image model is
proposed to enhance multi-scale feature ensembles and utilization efficiency.
The language model, integrated with our improved attention mechanism, aims to
read and explore discriminative image feature descriptions from reports to
learn a direct mapping from sentence words to image pixels. The overall network
is trained end-to-end by using our developed optimization strategy. On a
dataset of bladder cancer pathology images and their diagnostic reports (BCIDR),
we conduct sufficient experiments to demonstrate that MDNet outperforms
comparative baselines. The proposed image model obtains state-of-the-art
performance on two CIFAR datasets as well.
Comment: CVPR2017 Ora