12,434 research outputs found

    Efficient video indexing for monitoring disease activity and progression in the upper gastrointestinal tract

    Full text link
    Endoscopy is a routine imaging technique used for both diagnosis and minimally invasive surgical treatment. While the endoscopy video contains a wealth of information, tools to capture this information for the purpose of clinical reporting are rather poor. In date, endoscopists do not have any access to tools that enable them to browse the video data in an efficient and user friendly manner. Fast and reliable video retrieval methods could for example, allow them to review data from previous exams and therefore improve their ability to monitor disease progression. Deep learning provides new avenues of compressing and indexing video in an extremely efficient manner. In this study, we propose to use an autoencoder for efficient video compression and fast retrieval of video images. To boost the accuracy of video image retrieval and to address data variability like multi-modality and view-point changes, we propose the integration of a Siamese network. We demonstrate that our approach is competitive in retrieving images from 3 large scale videos of 3 different patients obtained against the query samples of their previous diagnosis. Quantitative validation shows that the combined approach yield an overall improvement of 5% and 8% over classical and variational autoencoders, respectively.Comment: Accepted at IEEE International Symposium on Biomedical Imaging (ISBI), 201

    MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network

    Full text link
    The inability to interpret the model prediction in semantically and visually meaningful ways is a well-known shortcoming of most existing computer-aided diagnosis methods. In this paper, we propose MDNet to establish a direct multimodal mapping between medical images and diagnostic reports that can read images, generate diagnostic reports, retrieve images by symptom descriptions, and visualize attention, to provide justifications of the network diagnosis process. MDNet includes an image model and a language model. The image model is proposed to enhance multi-scale feature ensembles and utilization efficiency. The language model, integrated with our improved attention mechanism, aims to read and explore discriminative image feature descriptions from reports to learn a direct mapping from sentence words to image pixels. The overall network is trained end-to-end by using our developed optimization strategy. Based on a pathology bladder cancer images and its diagnostic reports (BCIDR) dataset, we conduct sufficient experiments to demonstrate that MDNet outperforms comparative baselines. The proposed image model obtains state-of-the-art performance on two CIFAR datasets as well.Comment: CVPR2017 Ora
    • …
    corecore