Federated Learning for Medical Image Analysis: A Survey
Machine learning in medical imaging often faces a fundamental dilemma, namely
the small sample size problem. Many recent studies suggest using multi-domain
data pooled from different acquisition sites/datasets to improve statistical
power. However, medical images from different sites cannot be easily shared to
build large datasets for model training due to privacy protection reasons. As a
promising solution, federated learning, which enables collaborative training of
machine learning models based on data from different sites without cross-site
data sharing, has attracted considerable attention recently. In this paper, we
conduct a comprehensive survey of the recent development of federated learning
methods in medical image analysis. We first introduce the background and
motivation of federated learning for dealing with privacy protection and
collaborative learning issues in medical imaging. We then present a
comprehensive review of recent advances in federated learning methods for
medical image analysis. Specifically, existing methods are categorized based on
three critical aspects of a federated learning system: the client end, the
server end, and communication techniques. In each category, we summarize the
existing federated learning methods according to specific research problems in
medical image analysis and also provide insights into the motivations of
different approaches. In addition, we provide a review of existing benchmark
medical imaging datasets and software platforms for current federated learning
research. We also conduct an experimental study to empirically evaluate typical
federated learning methods for medical image analysis. This survey can help to
better understand the current research status, challenges and potential
research opportunities in this promising research field.
Comment: 19 pages, 6 figures
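The abstract above describes collaborative training without cross-site data sharing but does not name a specific aggregation rule. As a minimal sketch, assuming the widely used federated averaging (FedAvg) scheme, the server step can be written as a sample-size-weighted average of client parameters. The function name `fedavg` and the toy values are hypothetical and are not taken from the survey.

```python
def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighting each client by its
    number of local training samples (assumed FedAvg-style server step)."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical sites: the second holds three times as much data,
# so its parameters count three times as much in the global model.
global_weights = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_weights)
```

Only parameter vectors and sample counts reach the server in this sketch; the images themselves stay at each site.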
RC-SSFL: Towards Robust and Communication-efficient Semi-supervised Federated Learning System
Federated Learning (FL) is an emerging decentralized artificial intelligence
paradigm, which promises to train a high-quality shared global model while
protecting user data privacy. However, the current systems rely heavily on a
strong assumption: all clients have a wealth of ground truth labeled data,
which may not always be feasible in real life. In this paper, we present a
practical Robust and Communication-efficient Semi-supervised FL (RC-SSFL)
system design that can enable the clients to jointly learn a high-quality model
that is comparable to typical FL's performance. In this setting, we assume that
the client has only unlabeled data and the server has a limited amount of
labeled data. We also consider that malicious clients can launch poisoning
attacks to harm the performance of the global model. To solve this issue,
RC-SSFL employs a minimax optimization-based client selection strategy to
select the clients who hold high-quality updates and uses geometric median
aggregation to robustly aggregate model updates. Furthermore, RC-SSFL
implements a novel symmetric quantization method to greatly improve
communication efficiency. Extensive case studies on two real-world datasets
demonstrate that RC-SSFL can maintain performance comparable to typical FL
in the presence of poisoning attacks and reduce communication overhead by
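The abstract above mentions geometric median aggregation as a defense against poisoned updates. The following is a minimal sketch of that general idea using Weiszfeld's iterative algorithm, which is an assumption here: the abstract does not say how RC-SSFL computes the median. The function name, iteration count, and toy updates are hypothetical.

```python
import math


def geometric_median(points, iters=100, eps=1e-9):
    """Approximate the geometric median of a list of vectors with
    Weiszfeld's algorithm (iteratively re-weighted averaging)."""
    dim = len(points[0])
    # Start from the coordinate-wise mean.
    est = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    for _ in range(iters):
        num = [0.0] * dim
        denom = 0.0
        for p in points:
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(p, est)))
            w = 1.0 / max(dist, eps)  # eps guards against division by zero
            denom += w
            for i in range(dim):
                num[i] += w * p[i]
        est = [x / denom for x in num]
    return est


# Four honest updates near (1, 1) and one hypothetical poisoned update.
updates = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [1.1, 1.2], [100.0, -100.0]]
median = geometric_median(updates)
mean = [sum(u[i] for u in updates) / len(updates) for i in range(2)]
print("geometric median:", median)
print("plain mean:      ", mean)
```

In this toy case the plain mean is dragged far toward the outlier, while the geometric median stays close to the honest cluster, which is the robustness property the abstract relies on.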
Federated Self-Supervised Learning of Multi-Sensor Representations for Embedded Intelligence
Smartphones, wearables, and Internet of Things (IoT) devices produce a wealth
of data that cannot be accumulated in a centralized repository for learning
supervised models due to privacy, bandwidth limitations, and the prohibitive
cost of annotations. Federated learning provides a compelling framework for
learning models from decentralized data, but conventionally, it assumes the
availability of labeled samples, whereas on-device data are generally either
unlabeled or cannot be annotated readily through user interaction. To address
these issues, we propose a self-supervised approach termed
scalogram-signal correspondence learning, based on the wavelet transform, to
learn useful representations from unlabeled sensor inputs, such as
electroencephalography, blood volume pulse, accelerometer, and WiFi channel
state information. Our auxiliary task requires a deep temporal neural network
to determine if a given pair of a signal and its complementary viewpoint (i.e.,
a scalogram generated with a wavelet transform) align with each other or not
through optimizing a contrastive objective. We extensively assess the quality
of learned features with our multi-view strategy on diverse public datasets,
achieving strong performance in all domains. We demonstrate the effectiveness
of representations learned from an unlabeled input collection on downstream
tasks by training a linear classifier over the pretrained network, and show their
usefulness in the low-data regime, transfer learning, and cross-validation. Our methodology
achieves competitive performance with fully-supervised networks, and it
outperforms pre-training with autoencoders in both central and federated
contexts. Notably, it improves the generalization in a semi-supervised setting
as it reduces the volume of labeled data required through leveraging
self-supervised learning.
Comment: Accepted for publication at IEEE Internet of Things Journal
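The abstract above describes deciding whether a signal and a scalogram belong together by optimizing a contrastive objective, without giving the exact loss. As a rough sketch, assuming an InfoNCE-style loss over a batch of already computed embeddings, matched pairs should score higher than mismatched ones. The function `info_nce`, the temperature value, and the toy embeddings are hypothetical and are not taken from the paper.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def info_nce(signal_emb, scalogram_emb, temperature=0.1):
    """InfoNCE-style loss: for each signal embedding i, the scalogram
    embedding at index i is the positive and all others are negatives."""
    n = len(signal_emb)
    loss = 0.0
    for i in range(n):
        logits = [dot(signal_emb[i], scalogram_emb[j]) / temperature
                  for j in range(n)]
        m = max(logits)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_sum_exp - logits[i]
    return loss / n


signals = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]      # each scalogram matches its signal
mismatched = [[0.0, 1.0], [1.0, 0.0]]   # pairs swapped

aligned_loss = info_nce(signals, aligned)
mismatched_loss = info_nce(signals, mismatched)
print(aligned_loss, mismatched_loss)
```

The loss is small when each signal embedding is closest to its own scalogram embedding and large when the pairs are swapped.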
Dealing With Heterogeneous 3D MR Knee Images: A Federated Few-Shot Learning Method With Dual Knowledge Distillation
Federated Learning has gained popularity among medical institutions since it
enables collaborative training between clients (e.g., hospitals) without
aggregating data. However, due to the high cost associated with creating
annotations, especially for large 3D image datasets, clinical institutions do
not have enough supervised data for training locally. Thus, the performance of
the collaborative model is subpar under limited supervision. On the other hand,
large institutions have the resources to compile data repositories with
high-resolution images and labels. Therefore, individual clients can utilize
the knowledge acquired in the public data repositories to mitigate the shortage
of private annotated images. In this paper, we propose a federated few-shot
learning method with dual knowledge distillation. This method allows joint
training with limited annotations across clients without jeopardizing privacy.
The supervised learning of the proposed method extracts features from limited
labeled data in each client, while the unsupervised data is used to distill
both feature and response-based knowledge from a national data repository to
further improve the accuracy of the collaborative model and reduce the
communication cost. Extensive evaluations are conducted on 3D magnetic
resonance knee images from a private clinical dataset. Our proposed method
shows superior performance and less training time than other semi-supervised
federated learning methods. Codes and additional visualization results are
available at https://github.com/hexiaoxiao-cs/fedml-knee
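The abstract above mentions distilling both feature-based and response-based knowledge. The sketch below combines two generic distillation terms: a KL divergence between temperature-softened teacher and student outputs, and a mean squared error between their feature vectors. This is an illustration of the general technique under assumed names and weights (`dual_kd_loss`, `alpha`, `T`), not the authors' implementation, which is available at the repository linked above.

```python
import math


def softmax(logits, T=1.0):
    m = max(logits)
    exps = [math.exp((l - m) / T) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def response_kd(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened class probabilities."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def feature_kd(student_feat, teacher_feat):
    """Mean squared error between student and teacher feature vectors."""
    return sum((s - t) ** 2
               for s, t in zip(student_feat, teacher_feat)) / len(student_feat)


def dual_kd_loss(s_logits, t_logits, s_feat, t_feat, alpha=0.5, T=2.0):
    """Weighted sum of response-based and feature-based distillation terms."""
    return (alpha * response_kd(s_logits, t_logits, T)
            + (1 - alpha) * feature_kd(s_feat, t_feat))


same = dual_kd_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [0.5, 0.5], [0.5, 0.5])
diff = dual_kd_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0], [0.0, 1.0], [0.5, 0.5])
print(same, diff)
```

Both terms vanish when the student reproduces the teacher exactly and grow as their outputs or features diverge.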
An Evaluation of Non-Contrastive Self-Supervised Learning for Federated Medical Image Analysis
Privacy and annotation bottlenecks are two major issues that profoundly
affect the practicality of machine learning-based medical image analysis.
Although significant progress has been made in these areas, these issues are
not yet fully resolved. In this paper, we seek to tackle these concerns head-on
and systematically explore the applicability of non-contrastive self-supervised
learning (SSL) algorithms under federated learning (FL) simulations for medical
image analysis. We conduct thorough experiments with recently proposed
state-of-the-art non-contrastive frameworks under standard FL setups. With the
state-of-the-art contrastive learning algorithm SimCLR as our comparative baseline, we
benchmark the performance of our four chosen non-contrastive algorithms under
non-i.i.d. data conditions and with a varying number of clients. We present a
holistic evaluation of these techniques on 6 standardized medical imaging
datasets. We further analyse the trends that emerge from our findings,
with the aim of identifying directions for further research.
To the best of our knowledge, ours is the first work to perform such a thorough
analysis of federated self-supervised learning for medical imaging. All of our
source code will be made public upon acceptance of the paper.