31 research outputs found
Obtaining Consensus Annotations For Retinal Image Segmentation Using Random Forest And Graph Cuts
We combine random forest (RF) classifiers and graph cuts (GC) to generate a consensus segmentation from the annotations of multiple experts. Supervised RFs quantify the consistency of an annotator through a normalized consistency score, while semi-supervised RFs predict missing expert annotations. The normalized score is used as the penalty cost in a second-order Markov random field (MRF) cost function, and the final consensus label is obtained by GC optimization. Experimental results on real patient retinal image datasets show that the consensus segmentation produced by our method is more accurate than those obtained by competing methods.
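The unary-cost idea behind this consensus scheme can be illustrated with a minimal sketch. Note this is not the paper's method: a consistency-weighted vote stands in for the second-order MRF pairwise term and the graph-cut optimization, and the masks, scores, and function name below are hypothetical.

```python
import numpy as np

def weighted_consensus(annotations, consistency):
    """Per-pixel consensus from multiple binary expert masks.

    annotations: (K, H, W) array of 0/1 masks from K annotators
    consistency: (K,) consistency score per annotator
    Returns the label minimizing the consistency-weighted unary
    cost, i.e. a weighted majority vote (pairwise smoothness and
    graph-cut optimization are omitted in this sketch).
    """
    annotations = np.asarray(annotations, dtype=float)
    w = np.asarray(consistency, dtype=float)
    w = w / w.sum()                                # normalize weights
    score = np.tensordot(w, annotations, axes=1)   # weighted fraction voting 1
    return (score >= 0.5).astype(np.uint8)

# Toy example: three annotators, the third one noisy
masks = np.array([
    [[1, 1, 0], [0, 0, 0]],
    [[1, 1, 0], [0, 1, 0]],
    [[0, 0, 1], [1, 1, 1]],   # noisy annotator
])
scores = np.array([0.45, 0.45, 0.10])  # hypothetical consistency scores
consensus = weighted_consensus(masks, scores)
```

Down-weighting the low-consistency annotator lets the consensus follow the two reliable experts even where the noisy mask disagrees.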
Silver Standard Masks for Data Augmentation Applied to Deep-Learning-Based Skull-Stripping
The bottleneck of convolutional neural networks (CNN) for medical imaging is
the number of annotated data required for training. Manual segmentation is
considered to be the "gold-standard". However, medical imaging datasets with
expert manual segmentation are scarce as this step is time-consuming and
expensive. We propose in this work the use of what we refer to as silver
standard masks for data augmentation in deep-learning-based skull-stripping,
also known as brain extraction. We generated the silver standard masks using
the consensus algorithm Simultaneous Truth and Performance Level Estimation
(STAPLE). We evaluated CNN models generated by the silver and gold standard
masks. Then, we validated the silver standard masks for CNN training on one
dataset, and showed its generalization to two other datasets. Our results
indicated that models generated with silver standard masks are comparable to
models generated with gold standard masks and have better generalizability.
Moreover, our results also indicate that silver standard masks could be used to
augment the input dataset at the training stage, reducing the need for manual
segmentation at this step.
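The STAPLE algorithm used to generate these silver standard masks can be sketched in a simplified binary form: an EM loop that alternates between estimating the per-voxel probability of the true label and re-estimating each rater's sensitivity and specificity. This is a hedged sketch, not the published algorithm — it fixes the foreground prior rather than estimating it, and all names and data below are hypothetical.

```python
import numpy as np

def staple_binary(D, prior=0.5, n_iter=50):
    """Simplified binary STAPLE via expectation-maximization.

    D: (K, N) array of 0/1 decisions from K raters over N voxels.
    Returns (W, p, q): posterior P(truth=1) per voxel, plus each
    rater's estimated sensitivity p and specificity q.
    """
    D = np.asarray(D, dtype=float)
    K, N = D.shape
    p = np.full(K, 0.9)   # initial sensitivities
    q = np.full(K, 0.9)   # initial specificities
    for _ in range(n_iter):
        # E-step: per-voxel posterior that the truth is 1,
        # given every rater's decision and current (p, q)
        a = prior * np.prod(np.where(D == 1, p[:, None], 1 - p[:, None]), axis=0)
        b = (1 - prior) * np.prod(np.where(D == 1, 1 - q[:, None], q[:, None]), axis=0)
        W = a / (a + b)
        # M-step: re-estimate rater performance against the posterior
        p = (D @ W) / W.sum()
        q = ((1 - D) @ (1 - W)) / (1 - W).sum()
    return W, p, q

# Toy example: three raters over six voxels, rater 3 the least reliable
D = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 1, 0],
])
W, p, q = staple_binary(D)
silver_mask = (W > 0.5).astype(np.uint8)
```

Thresholding the posterior `W` yields a consensus (here, a "silver standard") mask, while `p` and `q` quantify how much each rater agreed with that consensus.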
Disentangling Human Error from the Ground Truth in Segmentation of Medical Images
Recent years have seen increasing use of supervised learning methods for
segmentation tasks. However, the predictive performance of these algorithms
depends on the quality of labels. This problem is particularly pertinent in the
medical image domain, where both the annotation cost and inter-observer
variability are high. In a typical label acquisition process, different human
experts provide their estimates of the 'true' segmentation labels under the
influence of their own biases and competence levels. Treating these noisy
labels blindly as the ground truth limits the performance that automatic
segmentation algorithms can achieve. In this work, we present a method for
jointly learning, from purely noisy observations alone, the reliability of
individual annotators and the true segmentation label distributions, using two
coupled CNNs. The separation of the two is achieved by encouraging the
estimated annotators to be maximally unreliable while achieving high fidelity
with the noisy training data. We first define a toy segmentation dataset based
on MNIST and study the properties of the proposed algorithm. We then
demonstrate the utility of the method on three public medical imaging
segmentation datasets with simulated (when necessary) and real diverse
annotations: 1) MSLSC (multiple-sclerosis lesions); 2) BraTS (brain tumours);
3) LIDC-IDRI (lung abnormalities). In all cases, our method outperforms
competing methods and relevant baselines particularly in cases where the number
of annotations is small and the amount of disagreement is large. The
experiments also show strong ability to capture the complex spatial
characteristics of annotators' mistakes.
Our code is available at https://github.com/moucheng2017/LearnNoisyLabelsMedicalImages
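The core modelling idea — mapping an estimated true label distribution through an annotator-specific confusion matrix — can be sketched without the networks. In the paper, two coupled CNNs produce these quantities; below, static numpy arrays stand in for their outputs, and the numbers and names are hypothetical.

```python
import numpy as np

def noisy_prediction(p_true, cm):
    """Predict what a given annotator would label at one pixel.

    p_true: (C,) estimated true label distribution (first CNN's output)
    cm:     (C, C) row-stochastic confusion matrix for one annotator
            (second CNN's output); cm[i, j] = P(annotator says j | truth is i)
    Returns the label distribution the annotator is expected to produce.
    """
    return p_true @ cm

# Hypothetical 2-class example: the model believes this pixel is
# foreground (class 1) with probability 0.9 ...
p_true = np.array([0.1, 0.9])
# ... but this annotator under-segments: when the truth is foreground,
# they label it as such only 70% of the time.
cm = np.array([[0.95, 0.05],
               [0.30, 0.70]])
p_noisy = noisy_prediction(p_true, cm)
# Training fits p_noisy to that annotator's observed mask while also
# penalizing the trace of cm, pushing each estimated annotator toward
# maximal unreliability -- the mechanism that separates annotator
# error from the underlying true segmentation.
trace_penalty = np.trace(cm)
```

Because the confusion matrices absorb each annotator's systematic bias, the first network's output `p_true` is freed to model the underlying segmentation rather than the noise.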