
    Deep Lesion Graphs in the Wild: Relationship Learning and Organization of Significant Radiology Image Findings in a Diverse Large-scale Lesion Database

    Radiologists routinely find and annotate significant abnormalities on a large number of radiology images in their daily work. Such abnormalities, or lesions, have been collected over the years and stored in hospitals' picture archiving and communication systems. However, they are essentially unsorted and lack semantic annotations such as type and location. In this paper, we aim to organize and explore them by learning a deep feature representation for each lesion. A large-scale and comprehensive dataset, DeepLesion, is introduced for this task; it contains bounding boxes and size measurements of over 32K lesions. To model their similarity relationships, we leverage multiple sources of supervision, including types, self-supervised location coordinates, and sizes, which require little manual annotation effort yet describe useful attributes of the lesions. A triplet network is then used to learn lesion embeddings, with a sequential sampling strategy to capture their hierarchical similarity structure. Experiments show promising qualitative and quantitative results on lesion retrieval, clustering, and classification. The learned embeddings can further be used to build a lesion graph for various clinically useful applications; we propose algorithms for intra-patient lesion matching and missing-annotation mining, and experimental results validate their effectiveness. Comment: Accepted by CVPR 2018. DeepLesion URL added.
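
    The triplet setup summarized above is standard metric learning; below is a minimal sketch, assuming PyTorch, of how such lesion embeddings might be trained. The backbone, embedding size, and margin are illustrative assumptions, and the paper's sequential sampling strategy is reduced here to a generic pre-sampled triplet.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LesionEmbedder(nn.Module):
    """Hypothetical embedding network; the paper's actual backbone differs."""
    def __init__(self, backbone: nn.Module, dim: int = 256):
        super().__init__()
        self.backbone = backbone        # CNN feature extractor
        self.head = nn.LazyLinear(dim)  # project features to embedding space

    def forward(self, x):
        feats = self.backbone(x).flatten(1)
        return F.normalize(self.head(feats), dim=1)  # unit-norm embeddings

# Pull anchor/positive (similar type, location, size) together,
# push the negative away by at least the margin.
triplet_loss = nn.TripletMarginLoss(margin=0.2)

def train_step(model, anchor, positive, negative, optimizer):
    """One update on a sampled triplet of lesion image crops."""
    optimizer.zero_grad()
    loss = triplet_loss(model(anchor), model(positive), model(negative))
    loss.backward()
    optimizer.step()
    return loss.item()
```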

    Towards Interpretable Machine Learning in Medical Image Analysis

    Over the past few years, ML has demonstrated human-expert-level performance in many medical image analysis tasks. However, due to the black-box nature of classic deep ML models, translating these models from the bench to the bedside to support stakeholders in the desired tasks brings substantial challenges. One solution is interpretable ML, which attempts to reveal the working mechanisms of complex models. From a human-centered design perspective, interpretability is not a property of the ML model but an affordance, i.e., a relationship between algorithm and user. Thus, prototyping and user evaluations are critical to attaining solutions that afford interpretability. Following human-centered design principles in highly specialized, high-stakes domains such as medical image analysis is challenging due to limited access to end users, a dilemma further exacerbated by the large knowledge imbalance between ML designers and end users. To overcome this predicament, we first define four levels of clinical evidence that can be used to justify interpretability when designing ML models. We argue that designing ML models with two of these levels of clinical evidence, 1) commonly used clinical evidence, such as clinical guidelines, and 2) clinical evidence developed iteratively with end users, is more likely to yield models that are indeed interpretable to end users.

    In this dissertation, we first address how to design interpretable ML in medical image analysis that affords interpretability with these two different levels of clinical evidence. We strongly recommend formative user research as the first step of interpretable model design, to understand user needs and domain requirements, and we highlight the importance of empirical user evaluation to support transparent ML design choices and facilitate the adoption of human-centered design principles. Together, these aspects increase the likelihood that the algorithms afford interpretability and enable stakeholders to capitalize on the benefits of interpretable ML.

    In detail, we first propose neural-symbolic reasoning to implement public clinical evidence in the designed models for routinely performed clinical tasks. We utilize the routinely applied clinical taxonomy for abnormality classification in chest x-rays, and we establish a spleen injury grading system that strictly follows the clinical guidelines, performing symbolic reasoning over the detected and segmented salient clinical features. We then propose an entire interpretable pipeline for UM prognostication with cytopathology images: formative user research found that pathologists consider cell composition informative for UM prognostication, so we build a model that analyzes cell composition directly. Finally, we conduct a comprehensive user study to assess the human factors of human-machine teaming with the designed model, e.g., whether the proposed model indeed affords interpretability to pathologists; the model designed through this human-centered process proved interpretable to pathologists for UM prognostication. All in all, this dissertation introduces a comprehensive human-centered design approach to interpretable ML solutions in medical image analysis that afford interpretability to end users.
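
    As a toy illustration of the guideline-driven symbolic reasoning described above, the sketch below maps detected imaging findings to a discrete grade through explicit rules. The feature names and cutoffs are hypothetical placeholders, not the dissertation's actual spleen injury grading criteria.

```python
# Toy guideline-style rules applied to the outputs of a detection/
# segmentation model. Feature names and cutoffs are placeholders only.
def grade_injury(findings: dict) -> int:
    """Map detected findings to a grade via explicit, traceable rules."""
    if findings.get("active_bleeding", False):
        return 4
    if findings.get("laceration_depth_cm", 0.0) > 3.0:
        return 3
    if findings.get("hematoma_area_fraction", 0.0) > 0.5:
        return 2
    return 1

# The rule that fired, not just the grade, is what affords interpretability:
print(grade_injury({"laceration_depth_cm": 4.2}))  # -> 3
```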

    Data efficient deep learning for medical image analysis: A survey

    The rapid evolution of deep learning has significantly advanced the field of medical image analysis. However, despite these achievements, further enhancement of deep learning models for medical image analysis faces a significant challenge due to the scarcity of large, well-annotated datasets. To address this issue, recent years have witnessed a growing emphasis on the development of data-efficient deep learning methods. This paper conducts a thorough review of data-efficient deep learning methods for medical image analysis. To this end, we categorize these methods based on the level of supervision they rely on: no supervision, inexact supervision, incomplete supervision, inaccurate supervision, and only limited supervision. We further divide these categories into finer subcategories; for example, we split inexact supervision into multiple instance learning and learning with weak annotations, and incomplete supervision into semi-supervised learning, active learning, domain-adaptive learning, and so on. Furthermore, we systematically summarize commonly used datasets for data-efficient deep learning in medical image analysis and investigate future research directions to conclude this survey. Comment: Under Review.
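
    To make the "inexact supervision" category concrete, here is a minimal multiple instance learning sketch, assuming PyTorch: a single slide-level label supervises an entire bag of patches, with per-patch scores pooled by a max. The max-pooling choice and the instance network are illustrative assumptions, not a specific method from the survey.

```python
import torch
import torch.nn as nn

class MILMaxPool(nn.Module):
    """Bag-level logit from per-instance logits via max pooling."""
    def __init__(self, instance_net: nn.Module):
        super().__init__()
        self.instance_net = instance_net  # scores each patch independently

    def forward(self, bag):              # bag: (n_patches, C, H, W)
        logits = self.instance_net(bag)  # (n_patches, 1)
        return logits.max(dim=0).values  # bag is positive if any patch is

# Training needs only the bag label, never patch-level annotations:
criterion = nn.BCEWithLogitsLoss()
```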

    Machine learning methods for the analysis and interpretation of images and other multi-dimensional data

    The abstract is in the attachment.

    A Study on the Design of Visual Attention Models for Salient Region Detection

    Visual attention is an important mechanism in the human visual system. When humans observe images and videos, they usually do not describe all of the content; instead, they tend to talk about the semantically important regions and objects. The human eye is attracted by certain regions of interest rather than the entire scene, and these regions, which carry the main meaningful or semantic content, are called salient regions. Visual saliency detection refers to the use of intelligent algorithms to simulate the human visual attention mechanism, extracting both low-level features and high-level semantic information to localize the salient object regions in images and videos. The generated saliency map indicates the regions that are likely to attract human attention. As a fundamental problem in image processing and computer vision, visual saliency detection has been extensively studied to solve practical tasks such as image and video compression, image retargeting, and object detection. Visual attention mechanisms for saliency detection are generally divided into two categories: bottom-up models and top-down models. Bottom-up attention algorithms focus on low-level visual features such as colour and edges to locate salient objects, while top-down attention uses supervised learning to detect saliency. In recent years, research has increasingly turned to deep neural networks with attention mechanisms to improve the accuracy of saliency detection. The design of deep attention neural networks is inspired by human visual attention: the main goal is to enable the network to automatically capture the information that is critical to the target task, suppress irrelevant information, and shift attention from the whole scene to local regions. Various attention modules have been developed for saliency detection and semantic segmentation; for example, the spatial attention module in a convolutional network generates a spatial attention map by exploiting the inter-spatial relationships of features, while the channel attention module produces an attention map by exploring the inter-channel relationships of features. These well-designed attention modules have been proven effective in improving the accuracy of saliency detection. This dissertation investigates the visual attention mechanism in salient object detection and applies it to digital histopathology image analysis for the detection and classification of breast cancer metastases. The main research content comprises three parts.

    First, we studied the semantic attention mechanism and propose a semantic attention approach to accurately localize salient objects in complex scenarios. The proposed semantic attention uses Faster R-CNN to capture high-level deep features, replacing the last layer of Faster R-CNN with a fully connected layer and a sigmoid function for visual saliency detection; it calculates the attention probabilities of proposals by comparing their feature distances to the candidate salient object. The method introduces a re-weighting mechanism to reduce the influence of complex backgrounds, and a proposal selection mechanism to remove background noise and obtain objects with accurate shape and contour. Simulation results show that the semantic attention mechanism is robust to images with complex backgrounds thanks to its consideration of high-level object concepts, and the algorithm achieved outstanding performance among salient object detection algorithms of the same period.

    Second, we designed a deep segmentation network (DSNet) for salient object prediction. We explored a Pyramidal Attentional ASPP (PA-ASPP) module that provides pixel-level attention. DSNet extracts multi-level features with a dilated ResNet-101, and the multi-scale contextual information is locally weighted with the proposed PA-ASPP. The pyramid feature aggregation encodes the multi-level features from three different scales; this fusion incorporates neighboring scales of context features more precisely to produce better pixel-level attention. Finally, a scale-aware selection (SAS) module locally weights the multi-scale contextual features, capturing the important contexts of ASPP for accurate and consistent dense prediction. Simulation results demonstrate that the proposed PA-ASPP is effective and generates more coherent results, and that with SAS the model can adaptively and effectively capture regions at different scales.

    Finally, building on the preceding research on attention mechanisms, we propose a novel Deep Regional Metastases Segmentation (DRMS) framework for the detection and classification of breast cancer metastases. Digitalized whole-slide images (WSIs) are high-resolution, often gigapixels in size, yet the abnormal regions are typically small and most of the slide is normal. Highly trained pathologists usually first localize the regions of interest in the whole slide and then examine the selected regions closely; even so, the process is time-consuming and prone to missed diagnoses. Through observation and analysis, we believe visual attention is well suited to digital pathology image analysis. The integrated framework for WSI analysis can capture the granularity and variability of WSIs and the rich information in multi-grained pathological images. We first utilize the proposed attention-based DSNet to detect regional metastases at the patch level; we then adopt Density-Based Spatial Clustering of Applications with Noise (DBSCAN) to predict whole metastases from individual slides; finally, we determine patient-level pN-stages by aggregating the slide-level predictions. In combination, these techniques make better use of the multi-grained information in histological lymph node sections of whole-slide images. Experiments on large-scale clinical datasets (e.g., CAMELYON17) demonstrate that our method delivers advanced performance and provides consistent and accurate metastasis detection.
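
    Below is a minimal sketch of the slide-level aggregation step described above, assuming scikit-learn's DBSCAN over the coordinates of patches that the patch-level network flagged as metastatic; the eps and min_samples values are illustrative, not the framework's tuned parameters.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_metastases(patch_xy: np.ndarray, eps: float = 2.0,
                       min_samples: int = 3) -> list:
    """Group positive patch coordinates into candidate metastases.
    DBSCAN labels isolated patches -1, treating them as noise."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(patch_xy)
    return [patch_xy[labels == k] for k in set(labels) if k != -1]

# e.g. (column, row) grid coordinates of patches predicted positive:
positives = np.array([[10, 12], [10, 13], [11, 12], [40, 5]])
print(len(cluster_metastases(positives)))  # -> 1 cluster; [40, 5] is noise
```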