10,782 research outputs found
An Interpretable Deep Hierarchical Semantic Convolutional Neural Network for Lung Nodule Malignancy Classification
While deep learning methods are increasingly being applied to tasks such as
computer-aided diagnosis, these models are difficult to interpret, do not
incorporate prior domain knowledge, and are often considered as a "black-box."
The lack of model interpretability hinders them from being fully understood by
target users such as radiologists. In this paper, we present a novel
interpretable deep hierarchical semantic convolutional neural network (HSCNN)
to predict whether a given pulmonary nodule observed on a computed tomography
(CT) scan is malignant. Our network provides two levels of output: 1) low-level
radiologist semantic features, and 2) a high-level malignancy prediction score.
The low-level semantic outputs quantify the diagnostic features used by
radiologists and serve to explain how the model interprets the images in an
expert-driven manner. The information from these low-level tasks, along with
the representations learned by the convolutional layers, are then combined and
used to infer the high-level task of predicting nodule malignancy. This unified
architecture is trained by optimizing a global loss function including both
low- and high-level tasks, thereby learning all the parameters within a joint
framework. Our experimental results using the Lung Image Database Consortium
(LIDC) show that the proposed method not only produces interpretable lung
cancer predictions but also achieves significantly better results compared to
common 3D CNN approaches
Multi-scale Discriminant Saliency with Wavelet-based Hidden Markov Tree Modelling
The bottom-up saliency, an early stage of humans' visual attention, can be
considered as a binary classification problem between centre and surround
classes. Discriminant power of features for the classification is measured as
mutual information between distributions of image features and corresponding
classes . As the estimated discrepancy very much depends on considered scale
level, multi-scale structure and discriminant power are integrated by employing
discrete wavelet features and Hidden Markov Tree (HMT). With wavelet
coefficients and Hidden Markov Tree parameters, quad-tree like label structures
are constructed and utilized in maximum a posterior probability (MAP) of hidden
class variables at corresponding dyadic sub-squares. Then, a saliency value for
each square block at each scale level is computed with discriminant power
principle. Finally, across multiple scales is integrated the final saliency map
by an information maximization rule. Both standard quantitative tools such as
NSS, LCC, AUC and qualitative assessments are used for evaluating the proposed
multi-scale discriminant saliency (MDIS) method against the well-know
information based approach AIM on its released image collection with
eye-tracking data. Simulation results are presented and analysed to verify the
validity of MDIS as well as point out its limitation for further research
direction.Comment: arXiv admin note: substantial text overlap with arXiv:1301.396
STV-based Video Feature Processing for Action Recognition
In comparison to still image-based processes, video features can provide rich and intuitive information about dynamic events occurred over a period of time, such as human actions, crowd behaviours, and other subject pattern changes. Although substantial progresses have been made in the last decade on image processing and seen its successful applications in face matching and object recognition, video-based event detection still remains one of the most difficult challenges in computer vision research due to its complex continuous or discrete input signals, arbitrary dynamic feature definitions, and the often ambiguous analytical methods. In this paper, a Spatio-Temporal Volume (STV) and region intersection (RI) based 3D shape-matching method has been proposed to facilitate the definition and recognition of human actions recorded in videos. The distinctive characteristics and the performance gain of the devised approach stemmed from a coefficient factor-boosted 3D region intersection and matching mechanism developed in this research. This paper also reported the investigation into techniques for efficient STV data filtering to reduce the amount of voxels (volumetric-pixels) that need to be processed in each operational cycle in the implemented system. The encouraging features and improvements on the operational performance registered in the experiments have been discussed at the end
Combining local regularity estimation and total variation optimization for scale-free texture segmentation
Texture segmentation constitutes a standard image processing task, crucial to
many applications. The present contribution focuses on the particular subset of
scale-free textures and its originality resides in the combination of three key
ingredients: First, texture characterization relies on the concept of local
regularity ; Second, estimation of local regularity is based on new multiscale
quantities referred to as wavelet leaders ; Third, segmentation from local
regularity faces a fundamental bias variance trade-off: In nature, local
regularity estimation shows high variability that impairs the detection of
changes, while a posteriori smoothing of regularity estimates precludes from
locating correctly changes. Instead, the present contribution proposes several
variational problem formulations based on total variation and proximal
resolutions that effectively circumvent this trade-off. Estimation and
segmentation performance for the proposed procedures are quantified and
compared on synthetic as well as on real-world textures
- …