8 research outputs found

    Assessing emphysema in CT scans of the lungs: Using machine learning, crowdsourcing and visual similarity


    Augmenting Chest X-ray Datasets with Non-Expert Annotations

    The advancement of machine learning algorithms in medical image analysis requires the expansion of training datasets. A popular and cost-effective approach is automated annotation extraction from free-text medical reports, primarily due to the high costs associated with expert clinicians annotating chest X-ray images. However, it has been shown that the resulting datasets are susceptible to biases and shortcuts. Another strategy to increase the size of a dataset is crowdsourcing, a widely adopted practice in general computer vision with some success in medical image analysis. In a similar vein to crowdsourcing, we enhance two publicly available chest X-ray datasets by incorporating non-expert annotations. However, instead of using diagnostic labels, we annotate shortcuts in the form of tubes. We collect 3.5k chest drain annotations for CXR14 and 1k annotations for four different tube types in PadChest. We train a chest drain detector with the non-expert annotations that generalizes well to expert labels. Moreover, we compare our annotations to those provided by experts and show "moderate" to "almost perfect" agreement. Finally, we present a pathology agreement study to raise awareness about ground-truth annotations. We make our annotations and code available.
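    A minimal sketch of how such a non-expert vs. expert agreement check might look, assuming binary per-image chest drain labels from one expert and one non-expert reader. The label arrays are made up, and the "moderate" / "almost perfect" wording follows the Landis & Koch reading of Cohen's kappa, which is one common way to report agreement; the paper's exact measure is not reproduced here.

```python
# Illustrative agreement check between non-expert and expert chest drain labels.
# cohen_kappa_score comes from scikit-learn; the label arrays are invented.
import numpy as np
from sklearn.metrics import cohen_kappa_score

expert_labels = np.array([1, 0, 0, 1, 1, 0, 1, 0])  # 1 = chest drain present
crowd_labels  = np.array([1, 0, 1, 1, 1, 0, 1, 0])  # non-expert labels, same images

kappa = cohen_kappa_score(expert_labels, crowd_labels)

def interpret(k):
    # Landis & Koch (1977) bands for non-negative kappa values
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
             (0.80, "substantial"), (1.01, "almost perfect")]
    return next(name for upper, name in bands if k < upper)

print(f"kappa = {kappa:.2f} ({interpret(kappa)} agreement)")
```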

    Large-scale medical image annotation with quality-controlled crowdsourcing

    Accurate annotations of medical images are essential for various clinical applications. The remarkable advances in machine learning, especially deep learning based techniques, show great potential for automatic image segmentation. However, these solutions require a huge amount of accurately annotated reference data for training. Especially in the domain of medical image analysis, the availability of domain experts for reference data generation is becoming a major bottleneck for machine learning applications. In this context, crowdsourcing has gained increasing attention as a tool for low-cost and large-scale data annotation. As a method to outsource cognitive tasks to anonymous non-expert workers over the internet, it has evolved into a valuable tool for data annotation in various research fields. Major challenges in crowdsourcing remain the high variance in annotation quality as well as the lack of domain-specific knowledge of the individual workers. Current state-of-the-art methods for quality control usually induce further costs, as they rely on a redundant distribution of tasks or perform additional annotations on tasks with an already known reference outcome. The aim of this thesis is to apply common crowdsourcing techniques for large-scale medical image annotation and to create a cost-effective quality control method for crowd-sourced image annotation. The problem of large-scale medical image annotation is addressed by introducing a hybrid crowd-algorithm approach that allowed expert-level organ segmentation in CT scans. A pilot study performed on the case of liver segmentation in abdominal CT scans showed that the proposed approach is able to create organ segmentations matching the quality of those created by medical experts. Recording the behavior of individual non-expert online workers during the annotation process in clickstreams enabled the derivation of an annotation quality measure that could successfully be used to merge crowd-sourced segmentations. A comprehensive validation study performed with various object classes from publicly available data sets demonstrated that the presented quality control measure generalizes well over different object classes and clearly outperforms state-of-the-art methods in terms of costs and segmentation quality. In conclusion, the methods introduced in this thesis are an essential contribution to reducing annotation costs and further improving the quality of crowd-sourced image segmentation.
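    To illustrate the merging step described above, the sketch below performs quality-weighted majority voting over binary segmentation masks. It assumes per-worker quality scores are already given (in the thesis they are derived from clickstreams, which is not reproduced here); the masks, scores and threshold are made-up examples, not the thesis method itself.

```python
# Minimal sketch: merge crowd-sourced binary masks by weighted majority voting,
# weighting each worker by an externally supplied quality score.
import numpy as np

def merge_segmentations(masks, quality_scores, threshold=0.5):
    """Merge binary masks of shape (n_workers, H, W) into one consensus mask."""
    masks = np.asarray(masks, dtype=float)
    weights = np.asarray(quality_scores, dtype=float)
    weights = weights / weights.sum()
    # Weighted vote per pixel, then threshold to a binary consensus mask
    consensus = np.tensordot(weights, masks, axes=1)
    return consensus >= threshold

# Toy example: three workers, 4x4 masks, the third worker is less reliable
worker_masks = np.array([
    [[0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]],
    [[0, 1, 1, 0], [0, 1, 1, 1], [0, 1, 1, 0], [0, 0, 0, 0]],
    [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]],
])
print(merge_segmentations(worker_masks, quality_scores=[0.9, 0.8, 0.2]).astype(int))
```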

    Crowdsourced emphysema assessment

    Classification of emphysema patterns is believed to be useful for improved diagnosis and prognosis of chronic obstructive pulmonary disease. Emphysema patterns can be assessed visually on lung CT scans. Visual assessment is a complex and time-consuming task performed by experts, making it unsuitable for obtaining large amounts of labeled data. We investigate whether visual assessment of emphysema can be framed as an image similarity task that does not require experts. Substituting untrained annotators for experts makes it possible to label data sets much faster and at a lower cost. We use crowd annotators to gather similarity triplets and use t-distributed stochastic triplet embedding to learn an embedding. The quality of the embedding is evaluated by predicting expert-assessed emphysema patterns. We find that although performance varies due to low-quality triplets and randomness in the embedding, we still achieve a median F1 score of 0.58 for prediction of four patterns.
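    A rough sketch of the t-distributed stochastic triplet embedding (t-STE) objective named in the abstract, fitted with a generic optimizer. The random triplets, embedding dimension and choice of L-BFGS are illustrative assumptions rather than the paper's setup; the downstream pattern prediction is only hinted at in a comment.

```python
# Sketch of the t-STE objective (van der Maaten & Weinberger, 2012) for learning
# a 2-D embedding from "i is more similar to j than to k" crowd triplets.
import numpy as np
from scipy.optimize import minimize

def tste_loss(flat_x, triplets, n, dim=2, alpha=1.0):
    """Negative log-likelihood of the triplets under a Student-t kernel."""
    X = flat_x.reshape(n, dim)
    i, j, k = triplets.T
    d_ij = np.sum((X[i] - X[j]) ** 2, axis=1)
    d_ik = np.sum((X[i] - X[k]) ** 2, axis=1)
    k_ij = (1.0 + d_ij / alpha) ** (-(alpha + 1.0) / 2.0)
    k_ik = (1.0 + d_ik / alpha) ** (-(alpha + 1.0) / 2.0)
    p = k_ij / (k_ij + k_ik)
    return -np.sum(np.log(p + 1e-12))

rng = np.random.default_rng(0)
n_items = 20
triplets = rng.integers(0, n_items, size=(200, 3))
triplets = triplets[(triplets[:, 0] != triplets[:, 1]) & (triplets[:, 0] != triplets[:, 2])]

x0 = rng.normal(scale=0.1, size=n_items * 2)
res = minimize(tste_loss, x0, args=(triplets, n_items), method="L-BFGS-B")
embedding = res.x.reshape(n_items, 2)  # downstream: e.g. a k-NN classifier on this embedding
```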
