5 research outputs found
EigenRank by Committee: A Data Subset Selection and Failure Prediction paradigm for Robust Deep Learning based Medical Image Segmentation
Translation of fully automated deep learning based medical image segmentation
technologies to clinical workflows faces two main algorithmic challenges. The
first is the collection and archival of large quantities of manually annotated
ground truth data for both training and validation. The second is the relative
inability of the majority of deep learning based segmentation techniques to
alert physicians to a likely segmentation failure. Here we propose a novel
algorithm, named 'Eigenrank', which addresses both of these challenges.
Eigenrank can select, for manual labeling, a subset of medical images from a
large database, such that a U-Net trained on this subset is superior to one
trained on a randomly selected subset of the same size. Eigenrank can also be
used to pick out cases in a large database where deep learning segmentation
will fail. We present our algorithm, followed by results and a discussion of
how Eigenrank exploits the von Neumann information to perform both data subset
selection and failure prediction for medical image segmentation using deep
learning.
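The abstract does not define the von Neumann information it refers to. As a rough illustration only, the sketch below computes the standard von Neumann entropy, S(rho) = -tr(rho log rho), of a trace-normalised Gram matrix built from per-image feature vectors. The function name, the use of a Gram matrix, and the feature input are assumptions made for this example and are not taken from the Eigenrank paper.

```python
import numpy as np


def von_neumann_entropy(features):
    """Von Neumann entropy of the trace-normalised Gram matrix of `features`.

    `features` is an (n_samples, n_dims) array (hypothetical input for this
    sketch). It must contain at least one non-zero row, otherwise the trace
    normalisation below divides by zero.
    """
    features = np.asarray(features, dtype=float)
    gram = features @ features.T           # (n_samples, n_samples), PSD
    rho = gram / np.trace(gram)            # unit trace, like a density matrix
    eigvals = np.linalg.eigvalsh(rho)      # real eigenvalues of a symmetric matrix
    eigvals = eigvals[eigvals > 1e-12]     # drop numerical zeros: 0 * log(0) := 0
    return float(-np.sum(eigvals * np.log(eigvals)))
```

Under these assumptions, mutually orthogonal feature vectors of equal norm give the maximum entropy log(n), while a set of identical feature vectors gives an entropy of zero.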
Experimental Design for Overparameterized Learning with Application to Single Shot Deep Active Learning
The impressive performance exhibited by modern machine learning models hinges
on the ability to train such models on a very large amount of labeled data.
However, since access to large volumes of labeled data is often limited or
expensive, it is desirable to alleviate this bottleneck by carefully curating
the training set. Optimal experimental design is a well-established paradigm
for selecting data points to be labeled so as to maximally inform the learning
process. Unfortunately, classical theory on optimal experimental design focuses
on selecting examples in order to learn underparameterized (and thus,
non-interpolative) models, while modern machine learning models such as deep
neural networks are overparameterized, and oftentimes are trained to be
interpolative. As such, classical experimental design methods are not
applicable in many modern learning setups. Indeed, the predictive performance
of underparameterized models tends to be variance dominated, so classical
experimental design focuses on variance reduction, while the predictive
performance of overparameterized models can also be, as is shown in this paper,
bias dominated or of mixed nature. In this paper we propose a design strategy
that is well suited for overparameterized regression and interpolation. We
demonstrate the applicability of our method in the context of deep learning by
proposing a new algorithm for single-shot deep active learning.
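To make the classical, variance-focused criterion concrete, here is a minimal sketch of greedy A-optimal design for linear regression. It repeatedly adds the candidate point that most reduces trace((X_S^T X_S + ridge * I)^-1), which for ordinary least squares (ridge close to zero) is proportional to the summed variance of the coefficient estimates. This illustrates the classical baseline the abstract contrasts against, not the design strategy proposed in the paper; the function name and the ridge term are assumptions.

```python
import numpy as np


def greedy_a_optimal(X, k, ridge=1e-3):
    """Greedily pick k row indices of X that minimise trace(inv(information)).

    X is an (n_candidates, n_features) array of unlabeled points. `ridge`
    is a small regulariser (an assumption of this sketch) that keeps the
    information matrix invertible before enough points are selected.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    if k > n:
        raise ValueError("k cannot exceed the number of candidate points")
    selected = []
    info = ridge * np.eye(d)               # running X_S^T X_S + ridge * I
    for _ in range(k):
        best_i, best_score = None, np.inf
        for i in range(n):
            if i in selected:
                continue
            candidate = info + np.outer(X[i], X[i])
            score = np.trace(np.linalg.inv(candidate))
            if score < best_score:
                best_i, best_score = i, score
        selected.append(best_i)
        info += np.outer(X[best_i], X[best_i])
    return selected
```

Note that the criterion never looks at labels, which is what allows the whole subset to be chosen in a single shot before any annotation takes place.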
Gone Fishing: Neural Active Learning with Fisher Embeddings
There is an increasing need for effective active learning algorithms that are
compatible with deep neural networks. This paper motivates and revisits a
classic, Fisher-based active selection objective, and proposes BAIT, a
practical, tractable, and high-performing algorithm that makes it viable for
use with neural models. BAIT draws inspiration from the theoretical analysis of
maximum likelihood estimators (MLE) for parametric models. It selects batches
of samples by optimizing a bound on the MLE error in terms of the Fisher
information, which we show can be implemented efficiently at scale by
exploiting linear-algebraic structure especially amenable to execution on
modern hardware. Our experiments demonstrate that BAIT outperforms the previous
state of the art on both classification and regression problems, and is
flexible enough to be used with a variety of model architectures.
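The abstract does not state the objective in full. As a hedged illustration of Fisher-based batch selection, the sketch below assumes a linear-Gaussian model, where each sample's Fisher information is x x^T, and greedily minimises trace(M^-1 F), with M the Fisher information of the selected batch and F the average Fisher information over the pool. The objective form, the function name, and the ridge term are assumptions for this example; BAIT's actual procedure for neural networks is described in the paper.

```python
import numpy as np


def fisher_greedy_batch(pool, batch_size, ridge=1e-3):
    """Greedy batch selection minimising trace(solve(M, F)).

    pool: (n_samples, n_features) array of unlabeled inputs.
    M: ridge * I plus the sum of x x^T over the selected samples.
    F: average of x x^T over the whole pool.
    Assumes a linear-Gaussian model (an assumption of this sketch).
    """
    pool = np.asarray(pool, dtype=float)
    n, d = pool.shape
    if batch_size > n:
        raise ValueError("batch_size cannot exceed the pool size")
    pool_fisher = pool.T @ pool / n
    batch_fisher = ridge * np.eye(d)
    chosen = []
    remaining = set(range(n))
    for _ in range(batch_size):
        scores = {}
        for i in sorted(remaining):
            candidate = batch_fisher + np.outer(pool[i], pool[i])
            # trace(candidate^-1 @ pool_fisher) without forming the inverse
            scores[i] = np.trace(np.linalg.solve(candidate, pool_fisher))
        best = min(scores, key=scores.get)   # ties resolve to the lowest index
        chosen.append(best)
        remaining.remove(best)
        batch_fisher += np.outer(pool[best], pool[best])
    return chosen
```

Weighting by the pool's Fisher information means the batch favours directions that matter for the data the model will be asked about, rather than treating all parameter directions equally.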
A Survey on Deep Learning of Small Sample in Biomedical Image Analysis
Deep learning has proven to be a promising technique for computer-aided
biomedical image analysis, owing to its end-to-end learning framework and the
availability of large-scale labelled samples. However, in many cases of
biomedical image analysis, deep learning techniques suffer from the small
sample learning (SSL) dilemma, caused mainly by a lack of annotations. To make
deep learning more practical for biomedical image analysis, in this paper we
survey the key SSL techniques that help alleviate this dilemma, drawing on the
development of related techniques in computer vision applications. In order
to accelerate the clinical usage of biomedical image analysis based on deep
learning techniques, we intentionally expand this survey to include the
explanation methods for deep models that are important to clinical decision
making. We survey the key SSL techniques by dividing them into five categories:
(1) explanation techniques, (2) weakly supervised learning techniques, (3)
transfer learning techniques, (4) active learning techniques, and (5)
miscellaneous techniques involving data augmentation, domain knowledge,
traditional shallow methods and attention mechanisms. These key techniques are
expected to effectively support the application of deep learning in clinical
biomedical image analysis, and to further improve the analysis performance,
especially when large-scale annotated samples are not available. We build demos
at https://github.com/PengyiZhang/MIADeepSSL
Embracing Imperfect Datasets: A Review of Deep Learning Solutions for Medical Image Segmentation
The medical imaging literature has witnessed remarkable progress in
high-performing segmentation models based on convolutional neural networks.
Despite the new performance highs, the recent advanced segmentation models
still require large, representative, and high-quality annotated datasets.
However, rarely do we have a perfect training dataset, particularly in the
field of medical imaging, where data and annotations are both expensive to
acquire. Recently, a large body of research has studied the problem of medical
image segmentation with imperfect datasets, tackling two major dataset
limitations: scarce annotations, where only limited annotated data is available
for training, and weak annotations, where the training data has only sparse
annotations, noisy annotations, or image-level annotations. In this article, we
provide a detailed review of the solutions above, summarizing both the
technical novelties and empirical results. We further compare the benefits and
requirements of the surveyed methodologies and provide our recommended
solutions. We hope this survey article increases the community's awareness of the
techniques that are available to handle imperfect medical image segmentation
datasets.
Comment: Accepted for publication in the journal Medical Image Analysis