10,781 research outputs found

    An Interpretable Deep Hierarchical Semantic Convolutional Neural Network for Lung Nodule Malignancy Classification

    Full text link
    While deep learning methods are increasingly being applied to tasks such as computer-aided diagnosis, these models are difficult to interpret, do not incorporate prior domain knowledge, and are often considered as a "black-box." The lack of model interpretability hinders them from being fully understood by target users such as radiologists. In this paper, we present a novel interpretable deep hierarchical semantic convolutional neural network (HSCNN) to predict whether a given pulmonary nodule observed on a computed tomography (CT) scan is malignant. Our network provides two levels of output: 1) low-level radiologist semantic features, and 2) a high-level malignancy prediction score. The low-level semantic outputs quantify the diagnostic features used by radiologists and serve to explain how the model interprets the images in an expert-driven manner. The information from these low-level tasks, along with the representations learned by the convolutional layers, are then combined and used to infer the high-level task of predicting nodule malignancy. This unified architecture is trained by optimizing a global loss function including both low- and high-level tasks, thereby learning all the parameters within a joint framework. Our experimental results using the Lung Image Database Consortium (LIDC) show that the proposed method not only produces interpretable lung cancer predictions but also achieves significantly better results compared to common 3D CNN approaches

    Multi-scale Discriminant Saliency with Wavelet-based Hidden Markov Tree Modelling

    Full text link
    The bottom-up saliency, an early stage of humans' visual attention, can be considered as a binary classification problem between centre and surround classes. Discriminant power of features for the classification is measured as mutual information between distributions of image features and corresponding classes . As the estimated discrepancy very much depends on considered scale level, multi-scale structure and discriminant power are integrated by employing discrete wavelet features and Hidden Markov Tree (HMT). With wavelet coefficients and Hidden Markov Tree parameters, quad-tree like label structures are constructed and utilized in maximum a posterior probability (MAP) of hidden class variables at corresponding dyadic sub-squares. Then, a saliency value for each square block at each scale level is computed with discriminant power principle. Finally, across multiple scales is integrated the final saliency map by an information maximization rule. Both standard quantitative tools such as NSS, LCC, AUC and qualitative assessments are used for evaluating the proposed multi-scale discriminant saliency (MDIS) method against the well-know information based approach AIM on its released image collection with eye-tracking data. Simulation results are presented and analysed to verify the validity of MDIS as well as point out its limitation for further research direction.Comment: arXiv admin note: substantial text overlap with arXiv:1301.396

    STV-based Video Feature Processing for Action Recognition

    Get PDF
    In comparison to still image-based processes, video features can provide rich and intuitive information about dynamic events occurred over a period of time, such as human actions, crowd behaviours, and other subject pattern changes. Although substantial progresses have been made in the last decade on image processing and seen its successful applications in face matching and object recognition, video-based event detection still remains one of the most difficult challenges in computer vision research due to its complex continuous or discrete input signals, arbitrary dynamic feature definitions, and the often ambiguous analytical methods. In this paper, a Spatio-Temporal Volume (STV) and region intersection (RI) based 3D shape-matching method has been proposed to facilitate the definition and recognition of human actions recorded in videos. The distinctive characteristics and the performance gain of the devised approach stemmed from a coefficient factor-boosted 3D region intersection and matching mechanism developed in this research. This paper also reported the investigation into techniques for efficient STV data filtering to reduce the amount of voxels (volumetric-pixels) that need to be processed in each operational cycle in the implemented system. The encouraging features and improvements on the operational performance registered in the experiments have been discussed at the end

    Combining local regularity estimation and total variation optimization for scale-free texture segmentation

    Get PDF
    Texture segmentation constitutes a standard image processing task, crucial to many applications. The present contribution focuses on the particular subset of scale-free textures and its originality resides in the combination of three key ingredients: First, texture characterization relies on the concept of local regularity ; Second, estimation of local regularity is based on new multiscale quantities referred to as wavelet leaders ; Third, segmentation from local regularity faces a fundamental bias variance trade-off: In nature, local regularity estimation shows high variability that impairs the detection of changes, while a posteriori smoothing of regularity estimates precludes from locating correctly changes. Instead, the present contribution proposes several variational problem formulations based on total variation and proximal resolutions that effectively circumvent this trade-off. Estimation and segmentation performance for the proposed procedures are quantified and compared on synthetic as well as on real-world textures
    • …
    corecore