Federated Learning for Breast Density Classification: A Real-World Implementation
Building robust deep learning-based models requires large quantities of
diverse training data. In this study, we investigate the use of federated
learning (FL) to build medical imaging classification models in a real-world
collaborative setting. Seven clinical institutions from across the world joined
this FL effort to train a model for breast density classification based on
Breast Imaging, Reporting & Data System (BI-RADS). We show that despite
substantial differences among the datasets from all sites (mammography system,
class distribution, and dataset size) and without centralizing data, we can
successfully train AI models in federation. The results show that models
trained using FL perform on average 6.3% better than their counterparts trained
on an institute's local data alone. Furthermore, we show a 45.8% relative
improvement in the models' generalizability when evaluated on the other
participating sites' testing data.
Comment: Accepted at the 1st MICCAI Workshop on "Distributed And Collaborative
Learning"; add citation to Fig. 1 & 2 and update Fig.
Federated learning for predicting clinical outcomes in patients with COVID-19
Federated learning (FL) is a method used for training artificial intelligence models with data from multiple sources while maintaining data anonymity, thus removing many barriers to data sharing. Here we used data from 20 institutes across the globe to train an FL model, called EXAM (electronic medical record (EMR) chest X-ray AI model), that predicts the future oxygen requirements of symptomatic patients with COVID-19 using inputs of vital signs, laboratory data and chest X-rays. EXAM achieved an average area under the curve (AUC) >0.92 for predicting outcomes at 24 and 72 h from the time of initial presentation to the emergency room, and it provided 16% improvement in average AUC measured across all participating sites and an average increase in generalizability of 38% when compared with models trained at a single site using that site's data. For prediction of mechanical ventilation treatment or death at 24 h at the largest independent test site, EXAM achieved a sensitivity of 0.950 and specificity of 0.882. In this study, FL facilitated rapid data science collaboration without data exchange and generated a model that generalized across heterogeneous, unharmonized datasets for prediction of clinical outcomes in patients with COVID-19, setting the stage for the broader use of FL in healthcare.
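For readers unfamiliar with how a model can be trained across sites without exchanging data, the following is a minimal sketch of one server-side aggregation step using sample-weighted federated averaging. The abstracts above do not state which aggregation scheme either study used, so this is a generic illustration, not the authors' method; the function name `federated_average` and the toy parameter vectors are hypothetical.

```python
from typing import List, Sequence


def federated_average(
    site_weights: Sequence[Sequence[float]],
    site_sizes: Sequence[int],
) -> List[float]:
    """Combine per-site model parameters into one global parameter vector.

    Each site trains locally and shares only its parameters, never its data.
    Sites are weighted by their local sample count (an assumed design choice).
    """
    if not site_weights or len(site_weights) != len(site_sizes):
        raise ValueError("need one parameter vector and one size per site")
    total = sum(site_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    dim = len(site_weights[0])
    averaged = [0.0] * dim
    for params, n_samples in zip(site_weights, site_sizes):
        if len(params) != dim:
            raise ValueError("all sites must share the same model shape")
        for i, value in enumerate(params):
            # Larger sites contribute proportionally more to the global model.
            averaged[i] += value * n_samples / total
    return averaged


# Toy example: two sites with 1 and 3 local samples respectively.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In a full training run this step would be repeated over many rounds, with the averaged parameters sent back to each site as the starting point for the next round of local training.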