
    CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison

    Large, labeled datasets have driven deep learning methods to achieve expert-level performance on a variety of medical imaging tasks. We present CheXpert, a large dataset that contains 224,316 chest radiographs of 65,240 patients. We design a labeler to automatically detect the presence of 14 observations in radiology reports, capturing uncertainties inherent in radiograph interpretation. We investigate different approaches to using the uncertainty labels for training convolutional neural networks that output the probability of these observations given the available frontal and lateral radiographs. On a validation set of 200 chest radiographic studies which were manually annotated by 3 board-certified radiologists, we find that different uncertainty approaches are useful for different pathologies. We then evaluate our best model on a test set composed of 500 chest radiographic studies annotated by a consensus of 5 board-certified radiologists, and compare the performance of our model to that of 3 additional radiologists in the detection of 5 selected pathologies. On Cardiomegaly, Edema, and Pleural Effusion, the model ROC and PR curves lie above all 3 radiologist operating points. We release the dataset to the public as a standard benchmark to evaluate performance of chest radiograph interpretation models. The dataset is freely available at https://stanfordmlgroup.github.io/competitions/chexpert. Comment: Published in AAAI 201

    Identification of fossil worm tubes from Phanerozoic hydrothermal vents and cold seeps

    One of the main limitations to understanding the evolutionary history of hydrothermal vent and cold seep communities is the identification of tube fossils from ancient deposits. Tube-dwelling annelids are some of the most conspicuous inhabitants of modern vent and seep ecosystems, and ancient vent and seep tubular fossils are usually considered to have been made by annelids. However, the taxonomic affinities of many tube fossils from vents and seeps are contentious, or have remained largely undetermined due to difficulties in identification. In this study, we make a detailed chemical (Fourier-transform infrared spectroscopy and pyrolysis gas-chromatography mass-spectrometry) and morphological assessment of modern annelid tubes from six families, and fossil tubes (seven tube types from the Cenozoic, 12 Mesozoic and four Palaeozoic) from hydrothermal vent and cold seep environments. Characters identified from these investigations were used to explore for the first time the systematics of ancient vent and seep tubes within a cladistic framework. Results reveal details of the compositions and ultrastructures of modern tubes, and also suggest that two types of tubes from ancient vent localities were made by the annelid family Siboglinidae, which often dominates modern vents and seeps. Our results also highlight that several vent and seep tube fossils formerly thought to have been made by annelids cannot be assigned an annelid affiliation with any certainty. The findings overall improve the level of quality control with regard to interpretations of fossil tubes, and, most importantly, suggest that siboglinids likely occupied Mesozoic vents and seeps, greatly increasing the minimum age of the clade relative to earlier molecular estimates

    Midlife managerial experience is linked to late life hippocampal morphology and function

    An active cognitive lifestyle has been suggested to have a protective role in the long-term maintenance of cognition. Amongst healthy older adults, more managerial or supervisory experiences in midlife are linked to a slower hippocampal atrophy rate in late life. Yet whether similar links exist in individuals with Mild Cognitive Impairment (MCI) is not known, nor whether these differences have any functional implications. 68 volunteers from the Sydney SMART Trial, diagnosed with non-amnestic MCI, were divided into high and low managerial experience (HME/LME) during their working life. All participants underwent neuropsychological testing, structural and resting-state functional MRI. Group comparisons were performed on hippocampal volume, morphology, hippocampal seed-based functional connectivity, memory and executive function and self-ratings of memory proficiency. HME was linked to better memory function (p = 0.024), mediated by larger hippocampal volume (p = 0.025). More specifically, deformation analysis found HME had relatively more volume in the CA1 sub-region of the hippocampus (p < 0.05). Paradoxically, this group rated their memory proficiency worse (p = 0.004), a result correlated with diminished functional connectivity between the right hippocampus and right prefrontal cortex (p < 0.001). Finally, hierarchical regression modelling substantiated this double dissociation.

    NIBBS-Search for Fast and Accurate Prediction of Phenotype-Biased Metabolic Systems

    Understanding of genotype-phenotype associations is important not only for furthering our knowledge on internal cellular processes, but also essential for providing the foundation necessary for genetic engineering of microorganisms for industrial use (e.g., production of bioenergy or biofuels). However, genotype-phenotype associations alone do not provide enough information to alter an organism's genome to either suppress or exhibit a phenotype. It is important to look at the phenotype-related genes in the context of the genome-scale network to understand how the genes interact with other genes in the organism. Identification of metabolic subsystems involved in the expression of the phenotype is one way of placing the phenotype-related genes in the context of the entire network. A metabolic system refers to a metabolic network subgraph; nodes are compounds and edge labels are the enzymes that catalyze the reaction. The metabolic subsystem could be part of a single metabolic pathway or span parts of multiple pathways. Arguably, comparative genome-scale metabolic network analysis is a promising strategy to identify these phenotype-related metabolic subsystems. Network Instance-Based Biased Subgraph Search (NIBBS) is a graph-theoretic method for genome-scale metabolic network comparative analysis that can identify metabolic systems that are statistically biased toward phenotype-expressing organismal networks. We set up experiments with target phenotypes like hydrogen production, TCA expression, and acid-tolerance. We show via extensive literature search that some of the resulting metabolic subsystems are indeed phenotype-related and formulate hypotheses for other systems in terms of their role in phenotype expression. NIBBS is also orders of magnitude faster than MULE, one of the most efficient maximal frequent subgraph mining algorithms that could be adjusted for this problem. Also, the set of phenotype-biased metabolic systems output by NIBBS comes very close to the set of phenotype-biased subgraphs output by an exact maximally-biased subgraph enumeration algorithm (MBS-Enum). The code (NIBBS and the module to visualize the identified subsystems) is available at http://freescience.org/cs/NIBBS

    Deep learning for chest radiograph diagnosis: A retrospective comparison of the CheXNeXt algorithm to practicing radiologists.

    Background: Chest radiograph interpretation is critical for the detection of thoracic diseases, including tuberculosis and lung cancer, which affect millions of people worldwide each year. This time-consuming task typically requires expert radiologists to read the images, leading to fatigue-based diagnostic error and lack of diagnostic expertise in areas of the world where radiologists are not available. Recently, deep learning approaches have been able to achieve expert-level performance in medical image interpretation tasks, powered by large network architectures and fueled by the emergence of large labeled datasets. The purpose of this study is to investigate the performance of a deep learning algorithm on the detection of pathologies in chest radiographs compared with practicing radiologists. Methods and findings: We developed CheXNeXt, a convolutional neural network to concurrently detect the presence of 14 different pathologies, including pneumonia, pleural effusion, pulmonary masses, and nodules in frontal-view chest radiographs. CheXNeXt was trained and internally validated on the ChestX-ray8 dataset, with a held-out validation set consisting of 420 images, sampled to contain at least 50 cases of each of the original pathology labels. On this validation set, the majority vote of a panel of 3 board-certified cardiothoracic specialist radiologists served as reference standard. We compared CheXNeXt's discriminative performance on the validation set to the performance of 9 radiologists using the area under the receiver operating characteristic curve (AUC). The radiologists included 6 board-certified radiologists (average experience 12 years, range 4-28 years) and 3 senior radiology residents, from 3 academic institutions. We found that CheXNeXt achieved radiologist-level performance on 11 pathologies and did not achieve radiologist-level performance on 3 pathologies. The radiologists achieved statistically significantly higher AUC performance on cardiomegaly, emphysema, and hiatal hernia, with AUCs of 0.888 (95% confidence interval [CI] 0.863-0.910), 0.911 (95% CI 0.866-0.947), and 0.985 (95% CI 0.974-0.991), respectively, whereas CheXNeXt's AUCs were 0.831 (95% CI 0.790-0.870), 0.704 (95% CI 0.567-0.833), and 0.851 (95% CI 0.785-0.909), respectively. CheXNeXt performed better than radiologists in detecting atelectasis, with an AUC of 0.862 (95% CI 0.825-0.895), statistically significantly higher than radiologists' AUC of 0.808 (95% CI 0.777-0.838); there were no statistically significant differences in AUCs for the other 10 pathologies. The average time to interpret the 420 images in the validation set was substantially longer for the radiologists (240 minutes) than for CheXNeXt (1.5 minutes). The main limitations of our study are that neither CheXNeXt nor the radiologists were permitted to use patient history or review prior examinations and that evaluation was limited to a dataset from a single institution. Conclusions: In this study, we developed and validated a deep learning algorithm that classified clinically important abnormalities in chest radiographs at a performance level comparable to practicing radiologists. Once tested prospectively in clinical settings, the algorithm could have the potential to expand patient access to chest radiograph diagnostics.

    MRI Radiogenomics of Pediatric Medulloblastoma: A Multicenter Study

    Background: Radiogenomics of pediatric medulloblastoma (MB) offers an opportunity for MB risk stratification, which may aid therapeutic decision making, family counseling, and selection of patient groups suitable for targeted genetic analysis. / Purpose: To develop machine learning strategies that identify the four clinically significant MB molecular subgroups. / Materials and Methods: In this retrospective study, consecutive pediatric patients with newly diagnosed MB at MRI at 12 international pediatric sites between July 1997 and May 2020 were identified. There were 1800 features extracted from T2- and contrast-enhanced T1-weighted preoperative MRI scans. A two-stage sequential classifier was designed—one that first identifies non-wingless (WNT) and non–sonic hedgehog (SHH) MB and then differentiates therapeutically relevant WNT from SHH. Further, a classifier that distinguishes high-risk group 3 from group 4 MB was developed. An independent, binary subgroup analysis was conducted to uncover radiomics features unique to infantile versus childhood SHH subgroups. The best-performing models from six candidate classifiers were selected, and performance was measured on holdout test sets. CIs were obtained by bootstrapping the test sets for 2000 random samples. Model accuracy score was compared with the no-information rate using the Wald test. / Results: The study cohort comprised 263 patients (mean age ± SD at diagnosis, 87 months ± 60; 166 boys). A two-stage classifier outperformed a single-stage multiclass classifier. The combined, sequential classifier achieved a microaveraged F1 score of 88% and a binary F1 score of 95% specifically for WNT. A group 3 versus group 4 classifier achieved an area under the receiver operating characteristic curve of 98%. Of the Image Biomarker Standardization Initiative features, texture and first-order intensity features were most contributory across the molecular subgroups. / Conclusion: An MRI-based machine learning decision path allowed identification of the four clinically relevant molecular pediatric medulloblastoma subgroups.