65 research outputs found

    Was there COVID-19 back in 2012? Challenge for AI in Diagnosis with Similar Indications

    Purpose: Since the recent COVID-19 outbreak, there has been an avalanche of research papers applying deep-learning-based image processing to chest radiographs (CXRs) for detection of the disease. We aimed to test the performance of two top models for CXR COVID-19 diagnosis on external datasets to assess model generalizability. Methods: In this paper, we present our argument regarding the efficiency and applicability of existing deep learning models for COVID-19 diagnosis. We provide results from two popular models, COVID-Net and CoroNet, evaluated on three publicly available datasets and an additional institutional dataset collected from Emory Hospital between January and May 2020, containing patients tested for COVID-19 infection using RT-PCR. Results: COVID-Net has a large false positive rate (FPR) on both the CheXpert (55.3%) and MIMIC-CXR (23.4%) datasets. On the Emory dataset, COVID-Net has 61.4% sensitivity, an F1-score of 0.54, and a precision of 0.49. The FPR of the CoroNet model is significantly lower than COVID-Net's across all the datasets: Emory (9.1%), CheXpert (1.3%), ChestX-ray14 (0.02%), MIMIC-CXR (0.06%). Conclusion: The models reported good to excellent performance on their internal datasets; however, we observed in our testing that their performance dramatically worsened on external data. This likely has several causes, including overfitting due to a lack of appropriate control patients and ground-truth labels. The fourth, institutional dataset was labeled using RT-PCR, which can be positive without radiographic findings and vice versa. Therefore, a fusion model combining both clinical and radiographic data may have better performance and generalizability.
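
    The metrics reported above (FPR, sensitivity, precision, F1) follow directly from binary confusion-matrix counts. The following is a minimal sketch of how they can be computed, assuming scikit-learn is available; the toy labels are illustrative and this is not the authors' evaluation code.

    # Minimal sketch: how the reported metrics relate to confusion-matrix counts.
    # Illustrative only; not the authors' evaluation pipeline.
    from sklearn.metrics import confusion_matrix

    def covid_cxr_metrics(y_true, y_pred):
        """FPR, sensitivity, precision and F1 for a binary COVID-19 label."""
        tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
        fpr = fp / (fp + tn) if (fp + tn) else 0.0          # false positive rate
        sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall / true positive rate
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        f1 = (2 * precision * sensitivity / (precision + sensitivity)
              if (precision + sensitivity) else 0.0)
        return {"FPR": fpr, "sensitivity": sensitivity,
                "precision": precision, "F1": f1}

    # Toy example (1 = COVID-19 positive by RT-PCR, 0 = negative):
    print(covid_cxr_metrics([0, 0, 1, 1, 1, 0], [0, 1, 1, 0, 1, 0]))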

    PadChest: A large chest x-ray image dataset with multi-label annotated reports

    We present a large-scale, high-resolution labeled chest x-ray dataset for the automated exploration of medical images along with their associated reports. This dataset includes more than 160,000 images obtained from 67,000 patients that were interpreted and reported by radiologists at Hospital San Juan (Spain) from 2009 to 2017, covering six different position views and additional information on image acquisition and patient demographics. The reports were labeled with 174 different radiographic findings, 19 differential diagnoses and 104 anatomic locations organized as a hierarchical taxonomy and mapped onto standard Unified Medical Language System (UMLS) terminology. Of these reports, 27% were manually annotated by trained physicians and the remaining set was labeled using a supervised method based on a recurrent neural network with attention mechanisms. The generated labels were then validated on an independent test set, achieving a 0.93 Micro-F1 score. To the best of our knowledge, this is one of the largest public chest x-ray databases suitable for training supervised models on radiographs, and the first to contain radiographic reports in Spanish. The PadChest dataset can be downloaded from http://bimcv.cipf.es/bimcv-projects/padchest/
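
    The reported 0.93 Micro-F1 refers to micro-averaged F1 over the multi-label report annotations. Below is a minimal sketch of that computation, assuming scikit-learn; the finding names are hypothetical, not the actual 174-label PadChest taxonomy.

    # Minimal sketch: micro-averaged F1 for multi-label report annotations.
    # Finding names are hypothetical, not the actual PadChest taxonomy.
    from sklearn.preprocessing import MultiLabelBinarizer
    from sklearn.metrics import f1_score

    findings = ["cardiomegaly", "pleural effusion", "pneumonia", "normal"]

    # Manually annotated ground truth vs. labels produced by an automatic labeler.
    y_true = [{"cardiomegaly", "pleural effusion"}, {"normal"}, {"pneumonia"}]
    y_pred = [{"cardiomegaly"}, {"normal"}, {"pneumonia", "pleural effusion"}]

    mlb = MultiLabelBinarizer(classes=findings)
    micro_f1 = f1_score(mlb.fit_transform(y_true), mlb.transform(y_pred),
                        average="micro")
    print(f"Micro-F1: {micro_f1:.2f}")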

    Deep learning in medical imaging and radiation therapy

    Peer reviewed. Full text: https://deepblue.lib.umich.edu/bitstream/2027.42/146980/1/mp13264_am.pdf and https://deepblue.lib.umich.edu/bitstream/2027.42/146980/2/mp13264.pd

    Label Uncertainty and Learning Using Partially Available Privileged Information for Clinical Decision Support: Applications in Detection of Acute Respiratory Distress Syndrome

    Artificial intelligence and machine learning have the potential to transform health care by deriving new and important insights from the vast amount of data generated during routine delivery of healthcare. The digitization of health data provides an important opportunity for new knowledge discovery and improved care delivery through the development of clinical decision support that can leverage this data to support various aspects of healthcare - from early diagnosis to epidemiology, drug development, and robotic-assisted surgery. These diverse efforts share the ultimate goal of improving quality of care and outcomes for patients. This thesis aims to tackle long-standing problems in machine learning and healthcare, such as modeling label uncertainty (e.g., from ambiguity in diagnosis or poorly labeled examples) and representation of data that may not be reliably accessible in a live environment. Label uncertainty hinges on the fact that even clinical experts may have low confidence when assigning a medical diagnosis to some patients due to ambiguity in the case or imperfect reliability of the diagnostic criteria. As a result, some data used for model training may be mislabeled, hindering the model’s ability to learn the complexity of the underlying task and adversely affecting the algorithm’s overall performance. In this work, I describe a heuristic approach for physicians to quantify their diagnostic uncertainty. I also propose an implementation of instance-weighted support vector machines to incorporate this information during model training. To address the issue of unreliable data, this thesis examines the idea of learning using “partially available” privileged information. This paradigm, based on knowledge transfer, allows models to use additional data that is available during training but may not be accessible during testing/deployment. This type of data is abundant in healthcare, where much more information about a patient’s health status is available in retrospective analysis (e.g., in the training data) but not available in real-time environments (e.g., in the test set). In this thesis, “privileged information” consists of features extracted from chest x-rays (CXRs) using novel feature engineering algorithms and transfer learning with deep residual networks. This example works well for numerous clinical applications, since CXRs are retrospectively accessible during model training but may not be available in a live environment due to the delay from ordering, developing, and processing the request. This thesis is motivated by improving diagnosis of acute respiratory distress syndrome (ARDS), a life-threatening lung injury associated with high mortality. The diagnosis of ARDS serves as a model for many medical conditions where standard tests are not routinely available and diagnostic uncertainty is common. While this thesis focuses on improving diagnosis of ARDS, the proposed learning methods will generalize across various healthcare settings, allowing for better characterization of patient health status and improving the overall quality of patient care. This thesis also includes development of methods for time-series analysis of longitudinal health data, signal processing techniques for quality assessment, lung segmentation from complex CXRs, and a novel feature extraction algorithm for quantification of pulmonary opacification. These algorithms were tested and validated on data obtained from patients at Michigan Medicine and additional external sources.
These studies demonstrate that careful, principled use of methodologies in machine learning and artificial intelligence can potentially assist healthcare providers with early detection of ARDS and help make a timely, accurate medical diagnosis to improve outcomes for patients. PhD, Bioinformatics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/167930/1/nreamaro_1.pd
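
    A minimal sketch of the instance-weighting idea described in the abstract above, assuming scikit-learn's support for per-sample weights; the features, labels, and confidence scores here are synthetic and this is not the thesis's actual implementation.

    # Minimal sketch: instance-weighted SVM training, where each example's
    # weight reflects the physician's diagnostic confidence in its label.
    # Synthetic data; not the thesis's actual features or weighting scheme.
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))                               # e.g., CXR-derived features
    y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)   # ARDS yes/no label

    # Physician-reported confidence in each label, mapped to (0, 1];
    # low-confidence (ambiguous) cases contribute less to the decision boundary.
    confidence = rng.uniform(0.1, 1.0, size=200)

    clf = SVC(kernel="rbf", C=1.0)
    clf.fit(X, y, sample_weight=confidence)                      # instance-weighted training
    print("Training accuracy:", clf.score(X, y))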

    Federated learning enables big data for rare cancer boundary detection.

    Although machine learning (ML) has shown promise across disciplines, out-of-sample generalizability remains a concern. This is currently addressed by sharing multi-site data, but such centralization is challenging or infeasible to scale due to various limitations. Federated ML (FL) provides an alternative paradigm for accurate and generalizable ML by sharing only numerical model updates. Here we present the largest FL study to date, involving data from 71 sites across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, reporting the largest such dataset in the literature (n = 6,314). We demonstrate a 33% delineation improvement for the surgically targetable tumor, and 23% for the complete tumor extent, over a publicly trained model. We anticipate our study to: 1) enable more healthcare studies informed by large and diverse data, ensuring meaningful results for rare diseases and underrepresented populations; 2) facilitate further analyses for glioblastoma by releasing our consensus model; and 3) demonstrate the effectiveness of FL at such scale and task complexity as a paradigm shift for multi-site collaborations, alleviating the need for data sharing.
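
    A minimal federated-averaging sketch of the paradigm described above, in which sites share only numerical parameter updates rather than imaging data; this generic NumPy illustration uses a toy least-squares model and is not the study's actual platform or segmentation network.

    # Minimal federated-averaging (FedAvg) sketch: each site trains locally and
    # shares only numerical parameter updates, never its data. Toy example only.
    import numpy as np

    def local_update(global_weights, site_data, lr=0.1, steps=10):
        """One round of local training: plain least-squares gradient steps."""
        w = global_weights.copy()
        X, y = site_data
        for _ in range(steps):
            grad = X.T @ (X @ w - y) / len(y)
            w -= lr * grad
        return w

    def federated_round(global_weights, sites):
        """Aggregate local models, weighted by each site's sample count."""
        updates = [local_update(global_weights, data) for data in sites]
        sizes = np.array([len(data[1]) for data in sites], dtype=float)
        return np.average(updates, axis=0, weights=sizes)

    rng = np.random.default_rng(0)
    true_w = rng.normal(size=5)
    sites = []
    for n in (50, 120, 80):                       # three sites of different sizes
        X = rng.normal(size=(n, 5))
        sites.append((X, X @ true_w + 0.1 * rng.normal(size=n)))

    w = np.zeros(5)
    for _ in range(20):                           # 20 communication rounds
        w = federated_round(w, sites)
    print("Recovered weights:", np.round(w, 2))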