1,128 research outputs found
Reliability-based cleaning of noisy training labels with inductive conformal prediction in multi-modal biomedical data mining
Accurately labeling biomedical data presents a challenge. Traditional
semi-supervised learning methods often under-utilize available unlabeled data.
To address this, we propose a novel reliability-based training data cleaning
method employing inductive conformal prediction (ICP). This method capitalizes
on a small set of accurately labeled training data and leverages ICP-calculated
reliability metrics to rectify mislabeled data and outliers within vast
quantities of noisy training data. The efficacy of the method is validated
across three classification tasks within distinct modalities: filtering
drug-induced-liver-injury (DILI) literature with title and abstract, predicting
ICU admission of COVID-19 patients through CT radiomics and electronic health
records, and subtyping breast cancer using RNA-sequencing data. Varying levels
of noise were introduced into the training labels through label permutation.
Results show significant enhancements in classification performance: accuracy
enhancement in 86 out of 96 DILI experiments (up to 11.4%), AUROC and AUPRC
enhancements in all 48 COVID-19 experiments (up to 23.8% and 69.8%), and
accuracy and macro-average F1 score improvements in 47 out of 48 RNA-sequencing
experiments (up to 74.6% and 89.0%). Our method offers the potential to
substantially boost classification performance in multi-modal biomedical
machine learning tasks. Importantly, it accomplishes this without necessitating
an excessive volume of meticulously curated training data.
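
The reliability-based cleaning idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a label-conditional calibration set drawn from the small, accurately labeled data, a nonconformity score of 1 minus the predicted class probability, and a hypothetical relabeling threshold of 0.8. The function names `icp_p_value` and `clean_label` are invented for illustration.

```python
def icp_p_value(cal_scores, test_score):
    """Conformal p-value: fraction of calibration nonconformity scores
    that are at least as large as the test score (with the +1 correction)."""
    n_at_least = sum(1 for s in cal_scores if s >= test_score)
    return (n_at_least + 1) / (len(cal_scores) + 1)


def clean_label(probs, given_label, cal_scores_by_label, threshold=0.8):
    """Return a (possibly corrected) label and the per-label p-values.

    probs: dict mapping label -> predicted probability from a model trained
           on the clean subset (assumption).
    cal_scores_by_label: dict mapping label -> nonconformity scores of clean
           calibration examples carrying that label (assumption).
    threshold: hypothetical p-value above which a competing label replaces
           the noisy one.
    """
    p_values = {
        label: icp_p_value(cal_scores_by_label[label], 1.0 - probs[label])
        for label in probs
    }
    best = max(p_values, key=p_values.get)
    if best != given_label and p_values[best] >= threshold:
        return best, p_values
    return given_label, p_values


# Toy usage: a sample labeled "neg" that the model confidently calls "pos".
calibration = {"pos": [0.1, 0.2, 0.3, 0.4], "neg": [0.1, 0.2, 0.3, 0.4]}
new_label, p_values = clean_label({"pos": 0.95, "neg": 0.05}, "neg", calibration)
```

In this toy case the p-value of the given label is low and that of the competing label is high, so the label is flipped. How the paper handles outliers and which thresholds it uses are not stated in the abstract.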
Learning to detect chest radiographs containing lung nodules using visual attention networks
Machine learning approaches hold great potential for the automated detection
of lung nodules in chest radiographs, but training the algorithms requires very
large amounts of manually annotated images, which are difficult to obtain. Weak
labels indicating whether a radiograph is likely to contain pulmonary nodules
are typically easier to obtain at scale by parsing historical free-text
radiological reports associated with the radiographs. Using a repository of
over 700,000 chest radiographs, in this study we demonstrate that promising
nodule detection performance can be achieved using weak labels through
convolutional neural networks for radiograph classification. We propose two
network architectures for the classification of images likely to contain
pulmonary nodules using both weak labels and manually-delineated bounding
boxes, when these are available. Annotated nodules are used at training time to
deliver a visual attention mechanism informing the model about its localisation
performance. The first architecture extracts saliency maps from high-level
convolutional layers and compares the estimated position of a nodule against
the ground truth, when this is available. A corresponding localisation error is
then back-propagated along with the softmax classification error. The second
approach consists of a recurrent attention model that learns to observe a short
sequence of smaller image portions through reinforcement learning. When a
nodule annotation is available at training time, the reward function is
modified accordingly so that exploring portions of the radiographs away from a
nodule incurs a larger penalty. Our empirical results demonstrate the potential
advantages of these architectures in comparison to competing methodologies.
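
The first architecture's training signal, a classification error plus a localisation error that applies only when a bounding box exists, can be sketched as follows. This is an illustrative assumption rather than the paper's formulation: the estimated nodule position is taken as the saliency-weighted centroid, the localisation error is a squared distance to the box centre, and `weight` is a hypothetical balancing factor.

```python
import math


def softmax_cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for one example."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def saliency_centroid(saliency):
    """Saliency-weighted mean (row, col); assumes non-negative values
    with a positive total."""
    total = sum(sum(row) for row in saliency)
    r = sum(i * v for i, row in enumerate(saliency) for v in row) / total
    c = sum(j * v for row in saliency for j, v in enumerate(row)) / total
    return r, c


def combined_loss(logits, target, saliency, box_center=None, weight=1.0):
    """Classification loss, plus a localisation term only when a
    manually delineated box (given here by its centre) is available."""
    loss = softmax_cross_entropy(logits, target)
    if box_center is not None:
        r, c = saliency_centroid(saliency)
        loss += weight * ((r - box_center[0]) ** 2 + (c - box_center[1]) ** 2)
    return loss


# Toy usage: saliency concentrated at (1, 1) on a 2x2 map.
toy_saliency = [[0.0, 0.0], [0.0, 1.0]]
weak_only = combined_loss([0.0, 0.0], 0, toy_saliency)
with_box = combined_loss([0.0, 0.0], 0, toy_saliency, box_center=(0, 0))
```

Weakly labeled images contribute only the classification term, so both kinds of label can be mixed within one training set.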
Abnormality Detection in Mammography using Deep Convolutional Neural Networks
Breast cancer is the most common cancer in women worldwide. The most common
screening technology is mammography. To reduce the cost and workload of
radiologists, we propose a computer-aided detection approach for classifying
and localizing calcifications and masses in mammogram images. To improve on
conventional approaches, we apply deep convolutional neural networks (CNN) for
automatic feature learning and classifier building. In computer-aided
mammography, deep CNN classifiers cannot be trained directly on full mammogram
images because of the loss of image details from resizing at input layers.
Instead, our classifiers are trained on labelled image patches and then adapted
to work on full mammogram images for localizing the abnormalities.
State-of-the-art deep convolutional neural networks are compared on their
performance in classifying the abnormalities. Experimental results indicate
that VGGNet achieves the best overall classification accuracy, at 92.53%.
For localizing abnormalities, ResNet is selected for computing class activation
maps because it is ready to be deployed without structural change or further
training. Our approach demonstrates that deep convolutional neural network
classifiers have remarkable localization capabilities even though no supervision
on the location of abnormalities is provided. Comment: 6 page
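
A class activation map of the kind mentioned above can be sketched in a few lines. The sketch assumes a network whose last convolutional feature maps feed a global average pooling layer followed by a single linear classifier, which is why no structural change is needed. The helper names and the toy numbers are hypothetical, not taken from the paper.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of the final convolutional feature maps, using the
    classifier weights of one class (one weight per feature map)."""
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, w_k in zip(feature_maps, class_weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += w_k * fmap[i][j]
    return cam


def peak_location(cam):
    """(row, col) of the strongest activation, a crude localisation."""
    cells = ((i, j) for i in range(len(cam)) for j in range(len(cam[0])))
    return max(cells, key=lambda ij: cam[ij[0]][ij[1]])


# Toy usage: two 2x2 feature maps and the weights of one class.
toy_maps = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 2.0]]]
toy_cam = class_activation_map(toy_maps, [0.5, 1.0])
toy_peak = peak_location(toy_cam)
```

In practice the map would be upsampled to the mammogram's resolution before being overlaid on it; that step is omitted here.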