
    Obtaining Consensus Annotations For Retinal Image Segmentation Using Random Forest And Graph Cuts

    We combine random forest (RF) classifiers and graph cuts (GC) to generate a consensus segmentation from multiple expert annotations. Supervised RFs quantify the consistency of each annotator through a normalized consistency score, while semi-supervised RFs predict missing expert annotations. The normalized score is used as the penalty cost in a second-order Markov random field (MRF) cost function, and the final consensus label is obtained by GC optimization. Experimental results on real patient retinal image datasets show that the consensus segmentation produced by our method is more accurate than those obtained by competing methods.
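    As a rough illustration of the fusion step described above, the sketch below feeds per-pixel foreground probabilities (standing in for the RF-derived, consistency-weighted scores) into a graph-cut optimization of a pairwise MRF energy, using the PyMaxflow library. The function name consensus_graph_cut, the prob_fg input, and the smoothness weight are illustrative assumptions, not the authors' exact formulation.

        # Sketch: graph-cut fusion of RF-derived per-pixel foreground
        # probabilities. `prob_fg` and `smoothness` are hypothetical inputs.
        import numpy as np
        import maxflow  # PyMaxflow

        def consensus_graph_cut(prob_fg, smoothness=1.0, eps=1e-6):
            """prob_fg: 2D array of P(foreground) in [0, 1] per pixel."""
            g = maxflow.Graph[float]()
            nodes = g.add_grid_nodes(prob_fg.shape)
            # Pairwise (second-order MRF) term: Potts-like penalty between
            # 4-connected neighbours, encouraging smooth consensus labels.
            g.add_grid_edges(nodes, smoothness)
            # Unary term: negative log-likelihood costs from the scores.
            cost_fg = -np.log(prob_fg + eps)        # cost of labelling foreground
            cost_bg = -np.log(1.0 - prob_fg + eps)  # cost of labelling background
            g.add_grid_tedges(nodes, cost_fg, cost_bg)
            g.maxflow()
            # Boolean mask; flip it if your source/sink convention differs.
            return g.get_grid_segments(nodes)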

    Silver Standard Masks for Data Augmentation Applied to Deep-Learning-Based Skull-Stripping

    The bottleneck of convolutional neural networks (CNNs) for medical imaging is the amount of annotated data required for training. Manual segmentation is considered the "gold standard". However, medical imaging datasets with expert manual segmentations are scarce, as this step is time-consuming and expensive. In this work we propose the use of what we refer to as silver standard masks for data augmentation in deep-learning-based skull-stripping, also known as brain extraction. We generated the silver standard masks using the consensus algorithm Simultaneous Truth and Performance Level Estimation (STAPLE). We evaluated CNN models trained with the silver and gold standard masks, validated the silver standard masks for CNN training on one dataset, and showed their generalization to two other datasets. Our results indicate that models trained with silver standard masks are comparable to models trained with gold standard masks and generalize better. Moreover, they also indicate that silver standard masks could be used to augment the input dataset at the training stage, reducing the need for manual segmentation at this step.
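    To make the consensus step concrete, below is a minimal, self-contained sketch of binary STAPLE via expectation-maximization, without the spatial prior and other refinements of the full algorithm (production use would typically rely on an existing implementation, e.g. in SimpleITK). The staple function name, the (raters x voxels) layout of masks, and the iteration count are assumptions for illustration.

        # Sketch: binary STAPLE by EM. `masks` is a hypothetical stack of R
        # binary segmentations flattened to shape (R, N) voxels.
        import numpy as np

        def staple(masks, n_iter=30, eps=1e-12):
            R, N = masks.shape
            d = masks.astype(float)
            w = d.mean(axis=0)       # initial soft consensus per voxel
            p = np.full(R, 0.9)      # per-rater sensitivity P(d=1 | T=1)
            q = np.full(R, 0.9)      # per-rater specificity P(d=0 | T=0)
            prior = w.mean()         # fixed global foreground prior
            for _ in range(n_iter):
                # E-step: posterior that each voxel is truly foreground.
                a = prior * np.prod(np.where(d == 1, p[:, None], 1 - p[:, None]), axis=0)
                b = (1 - prior) * np.prod(np.where(d == 0, q[:, None], 1 - q[:, None]), axis=0)
                w = a / (a + b + eps)
                # M-step: re-estimate each rater's performance parameters.
                p = (d * w).sum(axis=1) / (w.sum() + eps)
                q = ((1 - d) * (1 - w)).sum(axis=1) / ((1 - w).sum() + eps)
            return w  # threshold at 0.5 to obtain a silver standard mask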

    Disentangling Human Error from the Ground Truth in Segmentation of Medical Images

    Recent years have seen increasing use of supervised learning methods for segmentation tasks. However, the predictive performance of these algorithms depends on the quality of labels. This problem is particularly pertinent in the medical image domain, where both the annotation cost and inter-observer variability are high. In a typical label acquisition process, different human experts provide their estimates of the "true" segmentation labels under the influence of their own biases and competence levels. Treating these noisy labels blindly as the ground truth limits the performance that automatic segmentation algorithms can achieve. In this work, we present a method for jointly learning, from purely noisy observations alone, the reliability of individual annotators and the true segmentation label distributions, using two coupled CNNs. The separation of the two is achieved by encouraging the estimated annotators to be maximally unreliable while achieving high fidelity with the noisy training data. We first define a toy segmentation dataset based on MNIST and study the properties of the proposed algorithm. We then demonstrate the utility of the method on three public medical imaging segmentation datasets with simulated (when necessary) and real diverse annotations: 1) MSLSC (multiple-sclerosis lesions); 2) BraTS (brain tumours); 3) LIDC-IDRI (lung abnormalities). In all cases, our method outperforms competing methods and relevant baselines, particularly when the number of annotations is small and the amount of disagreement is large. The experiments also show a strong ability to capture the complex spatial characteristics of annotators' mistakes. Our code is available at https://github.com/moucheng2017/LearnNoisyLabelsMedicalImages.
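    The core of the approach lends itself to a short sketch: one CNN outputs the estimated true label distribution, a second outputs per-annotator confusion matrices that map it to each annotator's noisy distribution, and a trace penalty on those matrices encourages the estimated annotators to be maximally unreliable. The PyTorch code below is a hedged illustration under assumed tensor shapes; noisy_label_loss, the A-annotator layout, and the alpha weight are illustrative, not the authors' exact configuration (see their repository for the real implementation).

        # Sketch: loss coupling a segmentation CNN with per-annotator,
        # per-pixel confusion matrices. Shapes and names are assumptions.
        import torch
        import torch.nn.functional as F

        def noisy_label_loss(true_probs, conf_mats, noisy_labels, alpha=0.1):
            """
            true_probs:   (B, C, H, W) softmax output of the segmentation CNN
            conf_mats:    (B, A, C, C, H, W) row-stochastic confusion matrices
                          from the second, coupled CNN (one per annotator)
            noisy_labels: (B, A, H, W) integer labels from each of A annotators
            """
            B, A, C, _, H, W = conf_mats.shape
            total = 0.0
            for a in range(A):
                cm = conf_mats[:, a]  # (B, C, C, H, W)
                # q(observed = c) = sum_d P(observed = c | true = d) * p(true = d)
                noisy_probs = torch.einsum('bcdhw,bdhw->bchw', cm, true_probs)
                total = total + F.nll_loss(torch.log(noisy_probs + 1e-12),
                                           noisy_labels[:, a])
                # Minimizing the trace pushes the estimated confusion matrices
                # away from identity, i.e. away from "perfect" annotators.
                trace = cm.diagonal(dim1=1, dim2=2).sum(dim=-1)  # (B, H, W)
                total = total + alpha * trace.mean()
            return total / A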
