7 research outputs found
Coarse-to-Fine Classification via Parametric and Nonparametric Models for Computer-Aided Diagnosis
Classification is one of the core problems in Computer-Aided Diagnosis (CAD),
targeting early cancer detection through the interpretation of 3D medical
images. High detection sensitivity at a desirably low false-positive (FP) rate
is critical for a CAD system to be accepted as a valuable or even indispensable
tool in radiologists' workflow. Given the various spurious image noise sources
that cause observation uncertainty, this remains a very challenging task. In this
paper, we propose a novel, two-tiered coarse-to-fine (CTF) classification
cascade framework to tackle this problem. We first obtain
classification-critical data samples (e.g., samples on the decision boundary)
extracted from the holistic data distributions using a robust parametric model
(e.g., \cite{Raykar08}); then we build a graph-embedding based nonparametric
classifier on sampled data, which can more accurately preserve or formulate the
complex classification boundary. These two steps can also be considered as
effective "sample pruning" and "feature pursuing + NN/template matching",
respectively. Our approach is comprehensively validated in CAD systems for
colorectal polyp detection and lung nodule detection, which target two of the
deadliest cancers, using hospital-scale, multi-site clinical datasets. The
results show that our
method achieves overall better classification/detection performance than
existing state-of-the-art algorithms using single-layer classifiers, such as
the support vector machine variants \cite{Wang08}, boosting \cite{Slabaugh10},
logistic regression \cite{Ravesteijn10}, relevance vector machine
\cite{Raykar08}, k-nearest neighbor \cite{Murphy09} or spectral projections
on graph \cite{Cai08}.
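As a rough illustration of the two tiers described above, the following Python
sketch shows the overall control flow, with logistic regression standing in for
the robust parametric model (the paper uses a relevance-vector-machine-style
classifier \cite{Raykar08}) and a plain k-nearest-neighbor classifier standing
in for the graph-embedding based nonparametric step; the function name, margin
threshold and other parameters are illustrative assumptions, not the authors'
implementation.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.neighbors import KNeighborsClassifier

    def coarse_to_fine_predict(X_train, y_train, X_test, margin=0.2, n_neighbors=5):
        # Tier 1 ("sample pruning"): a parametric model scores every training
        # sample and keeps only those near the decision boundary (prob. ~ 0.5).
        coarse = LogisticRegression(max_iter=1000).fit(X_train, y_train)
        p_train = coarse.predict_proba(X_train)[:, 1]
        near_boundary = np.abs(p_train - 0.5) < margin
        if near_boundary.sum() < n_neighbors:      # fall back if too few remain
            return (coarse.predict_proba(X_test)[:, 1] > 0.5).astype(int)

        # Tier 2 ("feature pursuing + kNN/template matching"): a nonparametric
        # neighbor-based classifier is fit only on the boundary-critical samples
        # (the paper additionally learns a graph embedding before kNN).
        fine = KNeighborsClassifier(n_neighbors=n_neighbors).fit(
            X_train[near_boundary], y_train[near_boundary])

        # Easy test candidates are resolved by the coarse tier; ambiguous ones
        # are passed to the fine tier.
        p_test = coarse.predict_proba(X_test)[:, 1]
        y_pred = (p_test > 0.5).astype(int)
        ambiguous = np.abs(p_test - 0.5) < margin
        if ambiguous.any():
            y_pred[ambiguous] = fine.predict(X_test[ambiguous])
        return y_pred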
Improving Computer-aided Detection using Convolutional Neural Networks and Random View Aggregation
Automated computer-aided detection (CADe) in medical imaging has been an
important tool in clinical practice and research. State-of-the-art methods
often show high sensitivities but at the cost of high false-positive (FP) rates
per patient. We design a two-tiered coarse-to-fine cascade framework that
first operates a candidate generation system at sensitivities of 100% but
at high FP levels. By leveraging existing CAD systems, coordinates of regions
or volumes of interest (ROI or VOI) for lesion candidates are generated in this
step and function as input for a second tier, which is our focus in this study.
In this second stage, we generate 2D (two-dimensional) or 2.5D views via
sampling through scale transformations, random translations and rotations with
respect to each ROI's centroid coordinates. These random views are used to
train deep convolutional neural network (ConvNet) classifiers. In testing, the
trained ConvNets are employed to assign class (e.g., lesion, pathology)
probabilities for a new set of random views that are then averaged at each
ROI to compute a final per-candidate classification probability. This second
tier behaves as a highly selective process to reject difficult false positives
while preserving high sensitivities. The methods are evaluated on three
different data sets with different numbers of patients: 59 patients for
sclerotic metastases detection, 176 patients for lymph node detection, and
1,186 patients for colonic polyp detection. Experimental results show the
ability of ConvNets to generalize well to different medical imaging CADe
applications and scale elegantly to various data sets. Our proposed methods
improve CADe performance markedly in all cases. CADe sensitivities improved
from 57% to 70%, from 43% to 77% and from 58% to 75% at 3 FPs per patient for
sclerotic metastases, lymph nodes and colonic polyps, respectively.
Comment: 2D vs 2.5D vs 3D inputs and comparison to other standard classifiers
such as SVM have been addressed by more experimentation and two completely
new sections and figures. Results and Discussions have been updated
accordingly.
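A minimal sketch of the random-view aggregation step in the second tier,
assuming NumPy arrays and a placeholder convnet_predict function that returns
the trained ConvNet's lesion probability for one view; the patch size, number
of views and jitter range are illustrative, and random in-plane rotations and
scalings are omitted for brevity:

    import numpy as np

    def aggregate_random_views(volume, centroid, convnet_predict,
                               n_views=50, patch=32, max_shift=3, seed=None):
        # Score one lesion candidate by averaging ConvNet probabilities over
        # randomly sampled 2.5D views around its centroid (assumes the centroid
        # lies far enough from the volume border for the crops below).
        rng = np.random.default_rng(seed)
        half = patch // 2
        probs = []
        for _ in range(n_views):
            # Random translation of the sampling center around the ROI centroid.
            cz, cy, cx = (int(c) + int(s) for c, s in
                          zip(centroid, rng.integers(-max_shift, max_shift + 1, 3)))
            # 2.5D view: three orthogonal slices through the jittered center.
            axial    = volume[cz, cy - half:cy + half, cx - half:cx + half]
            coronal  = volume[cz - half:cz + half, cy, cx - half:cx + half]
            sagittal = volume[cz - half:cz + half, cy - half:cy + half, cx]
            view = np.stack([axial, coronal, sagittal], axis=0)  # 3-channel input
            probs.append(convnet_predict(view))                  # P(lesion) for this view
        # Final per-candidate probability = mean over all random views.
        return float(np.mean(probs))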
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
Remarkable progress has been made in image recognition, primarily due to the
availability of large-scale annotated datasets and the revival of deep
convolutional neural networks (CNNs).
CNNs enable learning data-driven, highly representative, layered hierarchical
image features from sufficient training data. However, obtaining datasets as
comprehensively annotated as ImageNet in the medical imaging domain remains a
challenge. There are currently three major techniques that successfully apply
CNNs to medical image classification: training the CNN from scratch, using
off-the-shelf pre-trained CNN features, and conducting unsupervised CNN
pre-training with supervised fine-tuning. Another effective method is transfer
learning, i.e., fine-tuning CNN models pre-trained from natural image dataset
to medical image tasks. In this paper, we exploit three important, but
previously understudied factors of employing deep convolutional neural networks
to computer-aided detection problems. We first explore and evaluate different
CNN architectures. The studied models contain 5 thousand to 160 million
parameters, and vary in numbers of layers. We then evaluate the influence of
dataset scale and spatial image context on performance. Finally, we examine
when and why transfer learning from pre-trained ImageNet (via fine-tuning) can
be useful. We study two specific computer-aided detection (CADe) problems,
namely thoraco-abdominal lymph node (LN) detection and interstitial lung
disease (ILD) classification. We achieve state-of-the-art performance on
mediastinal LN detection, with 85% sensitivity at 3 false positives per
patient, and report the first five-fold cross-validation classification results
on predicting axial CT slices with ILD categories. Our extensive empirical
evaluation, CNN model analysis and valuable insights can be extended to the
design of high-performance CAD systems for other medical imaging tasks.
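One of the strategies examined above, transfer learning by fine-tuning an
ImageNet-pretrained network, might look roughly like the following PyTorch
sketch (ResNet-18 is used here purely as a stand-in backbone, and the number of
classes, frozen layers and hyperparameters are placeholder assumptions rather
than the settings studied in the paper):

    import torch
    import torch.nn as nn
    from torchvision import models

    # Start from an ImageNet-pretrained backbone and replace its classification
    # head with one sized for the medical task (e.g., ILD category prediction).
    num_classes = 6                                  # placeholder target classes
    model = models.resnet18(pretrained=True)
    model.fc = nn.Linear(model.fc.in_features, num_classes)

    # Freeze the early layers and fine-tune only the deepest block and the new
    # head, a common choice when the medical training set is small.
    for name, param in model.named_parameters():
        if not (name.startswith("layer4") or name.startswith("fc")):
            param.requires_grad = False

    optimizer = torch.optim.SGD([p for p in model.parameters() if p.requires_grad],
                                lr=1e-3, momentum=0.9)
    criterion = nn.CrossEntropyLoss()

    def train_step(images, labels):
        # images: (N, 3, 224, 224) CT-slice crops replicated to three channels.
        model.train()
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        return loss.item()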
On conformal divergences and their population minimizers
Total Bregman divergences are a recent tweak of ordinary Bregman divergences
originally motivated by applications that required invariance under rotations.
They have displayed superior results compared to ordinary Bregman divergences
on several clustering, computer vision, medical imaging and machine learning
tasks. These preliminary results raise two important problems: first, to report
a complete characterization of the left and right population minimizers for
this class of total Bregman divergences; and second, to characterize a
principled superset of total and ordinary Bregman divergences with good
clustering properties, from which one could tailor the choice of a divergence
to a particular application.
In this paper, we provide and study one such superset with interesting
geometric features, that we call conformal divergences, and focus on their left
and right population minimizers. Our results are obtained in a recently coined
-geometric structure that is a generalization of the dually flat affine
connections in information geometry. We characterize both analytically and
geometrically the population minimizers. We prove that conformal divergences
(resp. total Bregman divergences) are essentially exhaustive for their left
(resp. right) population minimizers. We further report new results and extend
previous results on the robustness to outliers of the left and right population
minimizers, and discuss the role of the -geometric structure in
clustering. Additional results are also given.
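For orientation, the objects named above can be recalled with the standard
definitions from the Bregman divergence literature (background notation, not a
restatement of the paper's results): for a strictly convex, differentiable
generator F, the ordinary Bregman divergence and its "total" variant, which
rescales by a conformal factor depending on the second argument, are

    B_F(x : y) = F(x) - F(y) - \langle x - y,\; \nabla F(y) \rangle,
    \qquad
    tB_F(x : y) = \frac{B_F(x : y)}{\sqrt{1 + \lVert \nabla F(y) \rVert^2}} .

For example, the squared Euclidean generator F(x) = \tfrac{1}{2}\lVert x \rVert^2
gives B_F(x : y) = \tfrac{1}{2}\lVert x - y \rVert^2, whose total version divides
this by \sqrt{1 + \lVert y \rVert^2}; conformal divergences generalize this
pattern by allowing other positive conformal factors.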
Coarse-to-Fine Curriculum Learning
When faced with learning challenging new tasks, humans often follow sequences
of steps that allow them to incrementally build up the necessary skills for
performing these new tasks. However, in machine learning, models are most often
trained to solve the target tasks directly. Inspired by human learning, we
propose a novel curriculum learning approach which decomposes challenging tasks
into sequences of easier intermediate goals that are used to pre-train a model
before tackling the target task. We focus on classification tasks, and design
the intermediate tasks using an automatically constructed label hierarchy. We
train the model at each level of the hierarchy, from coarse labels to fine
labels, transferring acquired knowledge across these levels. For instance, the
model will first learn to distinguish animals from objects, and then use this
acquired knowledge when learning to classify among more fine-grained classes
such as cat, dog, car, and truck. Most existing curriculum learning algorithms
for supervised learning consist of scheduling the order in which the training
examples are presented to the model. In contrast, our approach focuses on the
output space of the model. We evaluate our method on several established
datasets and show significant performance gains especially on classification
problems with many labels. We also evaluate on a new synthetic dataset which
allows us to study multiple aspects of our method.
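A compact sketch of the output-space curriculum described above, in PyTorch
(the fine-to-coarse mapping here is hand-written merely to illustrate the idea;
in the method it comes from an automatically constructed label hierarchy, and
the architecture and hyperparameters are placeholders):

    import torch
    import torch.nn as nn

    # Hypothetical two-level hierarchy: fine class index -> coarse class index,
    # e.g. {cat, dog} -> animal and {car, truck} -> object.
    fine_to_coarse = {0: 0, 1: 0, 2: 1, 3: 1}
    n_coarse, n_fine = 2, 4

    trunk = nn.Sequential(nn.Linear(64, 128), nn.ReLU(),
                          nn.Linear(128, 128), nn.ReLU())
    coarse_head = nn.Linear(128, n_coarse)
    fine_head = nn.Linear(128, n_fine)
    criterion = nn.CrossEntropyLoss()

    def train(head, features, labels, epochs=10, lr=1e-3):
        # Train the shared trunk together with the given classification head.
        opt = torch.optim.Adam(list(trunk.parameters()) + list(head.parameters()), lr=lr)
        for _ in range(epochs):
            opt.zero_grad()
            loss = criterion(head(trunk(features)), labels)
            loss.backward()
            opt.step()

    def coarse_to_fine_fit(features, fine_labels):
        # Stage 1: pre-train on the easier, coarse-level task.
        coarse_labels = torch.tensor([fine_to_coarse[int(y)] for y in fine_labels])
        train(coarse_head, features, coarse_labels)
        # Stage 2: reuse the trunk (knowledge transfer) and train the fine head
        # on the target, fine-grained task.
        train(fine_head, features, fine_labels)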