Lung Segmentation from Chest X-rays using Variational Data Imputation
Pulmonary opacification is inflammation of the lungs caused by many
respiratory ailments, including the novel coronavirus disease 2019 (COVID-19).
Chest X-rays (CXRs) with such opacifications render regions of lungs
imperceptible, making it difficult to perform automated image analysis on them.
In this work, we focus on segmenting lungs from such abnormal CXRs as part of a
pipeline aimed at automated risk scoring of COVID-19 from CXRs. We treat the
high opacity regions as missing data and present a modified CNN-based image
segmentation network that utilizes a deep generative model for data imputation.
We train this model on normal CXRs with extensive data augmentation and
demonstrate the usefulness of this model to extend to cases with extreme
abnormalities.
Comment: Accepted to be presented at the first Workshop on the Art of Learning
with Missing Values (Artemiss) hosted by the 37th International Conference on
Machine Learning (ICML). Source code, training data and the trained models
are available here: https://github.com/raghavian/lungVAE
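The abstract's core idea is to treat high-opacity regions as missing data and impute them before segmentation. As a minimal, purely illustrative sketch of that framing (the paper uses a deep generative model, a VAE; the function name and mean-fill strategy here are hypothetical stand-ins, not the released code):

```python
def impute_missing(image, mask):
    """Fill pixels flagged as missing (mask == 1) in a 2D image.

    Toy stand-in for VAE-based imputation: high-opacity pixels are
    treated as missing and replaced with the mean of the observed
    pixels. `image` and `mask` are equal-shaped lists of lists.
    """
    observed = [p for row, mrow in zip(image, mask)
                for p, m in zip(row, mrow) if m == 0]
    mean = sum(observed) / len(observed)
    return [[mean if m == 1 else p for p, m in zip(row, mrow)]
            for row, mrow in zip(image, mask)]

# A 2x2 "radiograph" where the top-right pixel is opacified:
img = [[0.2, 0.9], [0.4, 0.6]]
msk = [[0, 1], [0, 0]]
print(impute_missing(img, msk))  # opaque pixel replaced by the observed mean
```

In the paper itself the imputed intensities come from a learned generative model rather than a global mean, which is what lets the segmentation network extend to extreme abnormalities.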
Learning to detect chest radiographs containing lung nodules using visual attention networks
Machine learning approaches hold great potential for the automated detection
of lung nodules in chest radiographs, but training the algorithms requires very
large amounts of manually annotated images, which are difficult to obtain. Weak
labels indicating whether a radiograph is likely to contain pulmonary nodules
are typically easier to obtain at scale by parsing historical free-text
radiological reports associated with the radiographs. Using a repository of
over 700,000 chest radiographs, in this study we demonstrate that promising
nodule detection performance can be achieved using weak labels through
convolutional neural networks for radiograph classification. We propose two
network architectures for the classification of images likely to contain
pulmonary nodules using both weak labels and manually-delineated bounding
boxes, when these are available. Annotated nodules are used at training time to
deliver a visual attention mechanism informing the model about its localisation
performance. The first architecture extracts saliency maps from high-level
convolutional layers and compares the estimated position of a nodule against
the ground truth, when this is available. A corresponding localisation error is
then back-propagated along with the softmax classification error. The second
approach consists of a recurrent attention model that learns to observe a short
sequence of smaller image portions through reinforcement learning. When a
nodule annotation is available at training time, the reward function is
modified accordingly so that exploring portions of the radiographs away from a
nodule incurs a larger penalty. Our empirical results demonstrate the potential
advantages of these architectures in comparison to competing methodologies.
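The first architecture back-propagates a localisation error alongside the softmax classification error whenever a nodule annotation exists. A minimal sketch of such a joint objective (the function name, squared-distance penalty, and weighting are illustrative assumptions, not the authors' code):

```python
import math

def combined_loss(class_probs, true_class,
                  pred_center=None, true_center=None, weight=1.0):
    """Hypothetical joint objective: softmax cross-entropy for the
    weak label, plus a localisation penalty (squared distance between
    the estimated and annotated nodule positions) when a bounding-box
    annotation is available at training time.
    """
    ce = -math.log(class_probs[true_class])  # classification error
    if pred_center is not None and true_center is not None:
        loc = sum((p - t) ** 2 for p, t in zip(pred_center, true_center))
        return ce + weight * loc             # back-propagate both terms
    return ce                                # weak label only

# Weakly labeled image vs. one with an annotated nodule position:
print(combined_loss([0.5, 0.5], 1))
print(combined_loss([0.5, 0.5], 1, pred_center=(30.0, 42.0),
                    true_center=(30.0, 40.0), weight=0.1))
```

The second (recurrent attention) approach encodes the same signal differently: the reward function of the reinforcement learner is penalised when attended image portions stray from an annotated nodule.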
Collaborative Training of Medical Artificial Intelligence Models with non-uniform Labels
Artificial intelligence (AI) methods are revolutionizing medical image
analysis. However, robust AI models require large multi-site datasets for
training. While multiple stakeholders have provided publicly available
datasets, the ways in which these data are labeled differ widely. For example,
one dataset of chest radiographs might contain labels denoting the presence of
metastases in the lung, while another dataset of chest radiographs might focus
on the presence of pneumonia. With conventional approaches, these data cannot
be used together to train a single AI model. We propose a new framework that we
call flexible federated learning (FFL) for collaborative training on such data.
Using publicly available data of 695,000 chest radiographs from five
institutions - each with differing labels - we demonstrate that large and
heterogeneously labeled datasets can be used to train one big AI model with
this framework. We find that models trained with FFL are superior to models
that are trained on matching annotations only. This may pave the way for
training of truly large-scale AI models that make efficient use of all existing
data.
Comment: 2 figures, 3 tables, 5 supplementary tables
Domain Specific Deep Neural Network Model for Classification of Abnormalities on Chest Radiographs
This study collected a pre-processed dataset of chest radiographs, formulated a deep neural network model for detecting abnormalities, evaluated the performance of the formulated model, and implemented a prototype of it, with the view to developing a deep neural network model that automatically classifies abnormalities in chest radiographs. To achieve the overall purpose of this research, a large set of chest X-ray images was sourced and collected from the CheXpert dataset, an online repository of annotated chest radiographs compiled by the Machine Learning Research group at Stanford University. The chest radiographs were preprocessed into a format that can be fed into a deep neural network; the preprocessing techniques used were standardization and normalization. The classification problem was formulated as a multi-label binary classification model, which used a convolutional neural network architecture to decide whether an abnormality was present or not in the chest radiographs. The classification model was evaluated using specificity, sensitivity, and Area Under the Curve (AUC) score as parameters. A prototype of the classification model was implemented using Keras, an open-source deep learning framework, in the Python programming language. Based on the AUC-ROC curve, the model was able to classify Atelectasis, Support devices, Pleural effusion, Pneumonia, a normal CXR (no finding), Pneumothorax, and Consolidation; however, Lung opacity and Cardiomegaly had output probabilities of less than 0.5 and were thus classified as absent. Precision, recall, and F1 score values were all 0.78; this implies that the numbers of false positives and false negatives are the same, revealing some measure of label imbalance in the dataset. The study concluded that the developed model is sufficient to classify abnormalities present in chest radiographs as present or absent.
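The evaluation described above thresholds each finding's output probability at 0.5 and then reports sensitivity and specificity per finding. A small self-contained sketch of that step (the helper name is hypothetical, and this is an illustration of the metric computation, not code from the study):

```python
def sensitivity_specificity(probs, labels, threshold=0.5):
    """Threshold sigmoid outputs at 0.5 to decide present/absent for
    one finding, then compute sensitivity (true positive rate) and
    specificity (true negative rate) against the annotations.
    """
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        pred = 1 if p >= threshold else 0
        if y == 1:
            tp += pred
            fn += 1 - pred
        else:
            fp += pred
            tn += 1 - pred
    return tp / (tp + fn), tn / (tn + fp)

# Four radiographs scored for one finding, two truly positive:
print(sensitivity_specificity([0.9, 0.2, 0.6, 0.4], [1, 1, 0, 0]))
```

In a multi-label setting this is run once per finding, which is why a finding whose probabilities never reach 0.5 (as reported for Lung opacity and Cardiomegaly) is always classified as absent.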
Enhancing Network Initialization for Medical AI Models Using Large-Scale, Unlabeled Natural Images
Pre-training datasets, like ImageNet, have become the gold standard in
medical image analysis. However, the emergence of self-supervised learning
(SSL), which leverages unlabeled data to learn robust features, presents an
opportunity to bypass the intensive labeling process. In this study, we
explored if SSL for pre-training on non-medical images can be applied to chest
radiographs and how it compares to supervised pre-training on non-medical
images and on medical images. We utilized a vision transformer and initialized
its weights based on (i) SSL pre-training on natural images (DINOv2), (ii) SL
pre-training on natural images (ImageNet dataset), and (iii) SL pre-training on
chest radiographs from the MIMIC-CXR database. We tested our approach on over
800,000 chest radiographs from six large global datasets, diagnosing more than
20 different imaging findings. Our SSL pre-training on curated images not only
outperformed ImageNet-based pre-training (P<0.001 for all datasets) but, in
certain cases, also exceeded SL on the MIMIC-CXR dataset. Our findings suggest
that selecting the right pre-training strategy, especially with SSL, can be
pivotal for improving artificial intelligence (AI)'s diagnostic accuracy in
medical imaging. By demonstrating the promise of SSL in chest radiograph
analysis, we underline a transformative shift towards more efficient and
accurate AI models in medical imaging.
Was there COVID-19 back in 2012? Challenge for AI in Diagnosis with Similar Indications
Purpose: Since the recent COVID-19 outbreak, there has been an avalanche of research papers applying deep-learning-based image processing to chest radiographs for detection of the disease. Our purpose is to test the performance of the two top models for CXR COVID-19 diagnosis on external datasets to assess model generalizability.
Methods: In this paper, we present our argument regarding the efficiency and applicability of existing deep learning models for COVID-19 diagnosis. We provide results from two popular models, COVID-Net and CoroNet, evaluated on three publicly available datasets and an additional institutional dataset collected from EMORY Hospital between January and May 2020, containing patients tested for COVID-19 infection using RT-PCR.
Results: There is a large false positive rate (FPR) for COVID-Net on both the ChexPert (55.3%) and MIMIC-CXR (23.4%) datasets. On the EMORY dataset, COVID-Net has 61.4% sensitivity, 0.54 F1-score and 0.49 precision. The FPR of the CoroNet model is significantly lower across all the datasets as compared to COVID-Net: EMORY (9.1%), ChexPert (1.3%), ChestX-ray14 (0.02%), MIMIC-CXR (0.06%).
Conclusion: The models reported good to excellent performance on their internal datasets; however, we observed from our testing that their performance dramatically worsened on external data. This likely stems from several causes, including model overfitting due to a lack of appropriate control patients and ground-truth labels. The fourth institutional dataset was labeled using RT-PCR, which could be positive without radiographic findings and vice versa. Therefore, a fusion model of both clinical and radiographic data may have better performance and generalization.
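The results section quotes FPR, sensitivity, precision, and F1 for each model and dataset. These all derive from the same confusion-matrix counts, as the illustrative helper below shows (the function and the example counts are hypothetical, not data from the study):

```python
def fpr_f1(tp, fp, tn, fn):
    """Derive the metrics quoted in the results from raw
    confusion-matrix counts: false positive rate, precision,
    sensitivity (recall), and F1 score.
    """
    fpr = fp / (fp + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return fpr, precision, recall, f1

# Balanced hypothetical counts: every metric comes out to 0.5,
# illustrating why a high FPR drags precision and F1 down together.
print(fpr_f1(tp=50, fp=50, tn=50, fn=50))
```

Because FPR depends on the negative (control) population, a test set lacking appropriate control patients can make a model look far better than it generalizes, which is the failure mode this paper highlights.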