19 research outputs found
UniFed: A Unified Framework for Federated Learning on Non-IID Image Features
How to tackle non-iid data is a crucial topic in federated learning. This
challenging problem not only affects the training process, but also harms the
performance of clients not participating in training. Existing literature
mainly focuses on either side, yet still lacks a unified solution to handle
these two types (internal and external) of clients in a joint way. In this
work, we propose a unified framework to tackle the non-iid issues for internal
and external clients together. Firstly, we propose to use client-specific batch
normalization in either internal or external clients to alleviate feature
distribution shifts incurred by non-iid data. Then we present theoretical
analysis to demonstrate the benefits of client-specific batch normalization.
Specifically, we show that our approach promotes convergence speed for
federated training and yields lower generalization error bound for external
clients. Furthermore, we use causal reasoning to form a causal view to explain
the advantages of our framework. Finally, we conduct extensive experiments on
natural and medical images, where our method achieves state-of-the-art
performance, converges faster, and shows good compatibility.
We also performed comprehensive analytical studies on a real-world medical
dataset to demonstrate the effectiveness
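The abstract above does not spell out how client-specific batch normalization fits into aggregation. The following is a minimal sketch, assuming that each client keeps its batch-normalization parameters local and that only the remaining parameters are averaged. The function names, the `bn` key prefix, and the flat list-of-floats parameter format are hypothetical choices for illustration, not the authors' implementation.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def aggregate_except_bn(client_params: List[Params], bn_prefix: str = "bn") -> Params:
    """Average every parameter across clients except keys starting with bn_prefix.

    Assumption: batch-normalization parameters are identified by a key prefix.
    """
    shared_keys = [k for k in client_params[0] if not k.startswith(bn_prefix)]
    n = len(client_params)
    return {
        k: [
            sum(p[k][i] for p in client_params) / n
            for i in range(len(client_params[0][k]))
        ]
        for k in shared_keys
    }


def apply_global(local: Params, global_shared: Params) -> Params:
    """Overwrite shared parameters with the global average; keep local BN values."""
    updated = dict(local)
    updated.update(global_shared)
    return updated


clients = [
    {"conv.w": [1.0, 2.0], "bn.gamma": [0.5]},
    {"conv.w": [3.0, 4.0], "bn.gamma": [1.5]},
]
shared = aggregate_except_bn(clients)
first_client = apply_global(clients[0], shared)
```

In this sketch the convolution weights are synchronized across clients while each client's `bn.gamma` stays untouched.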
FedSoup: Improving Generalization and Personalization in Federated Learning via Selective Model Interpolation
Cross-silo federated learning (FL) enables the development of machine
learning models on datasets distributed across data centers such as hospitals
and clinical research laboratories. However, recent research has found that
current FL algorithms face a trade-off between local and global performance
when confronted with distribution shifts. Specifically, personalized FL methods
have a tendency to overfit to local data, leading to a sharp valley in the
local model and inhibiting its ability to generalize to out-of-distribution
data. In this paper, we propose a novel federated model soup method (i.e.,
selective interpolation of model parameters) to optimize the trade-off between
local and global performance. Specifically, during the federated training
phase, each client maintains its own global model pool by monitoring the
performance of the interpolated model between the local and global models. This
allows us to alleviate overfitting and seek flat minima, which can
significantly improve the model's generalization performance. We evaluate our
method on retinal and pathological image classification tasks, and our proposed
method achieves significant improvements for out-of-distribution
generalization. Our code is available at https://github.com/ubc-tea/FedSoup.
Comment: Accepted by MICCAI202
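The selective interpolation described in this abstract can be illustrated with a greedy "soup" loop. This is a minimal sketch, assuming that a candidate global model joins the client's pool only when the averaged model does not score worse on a held-out metric. The acceptance rule, the `score` callback, and the flat weight vectors are assumptions made for illustration and may differ from the released FedSoup code.

```python
from typing import Callable, List

Vector = List[float]


def average(models: List[Vector]) -> Vector:
    """Element-wise mean of equally sized weight vectors."""
    n = len(models)
    return [sum(m[i] for m in models) / n for i in range(len(models[0]))]


def selective_soup(
    local: Vector,
    candidates: List[Vector],
    score: Callable[[Vector], float],
) -> Vector:
    """Greedily add candidates to the pool if the averaged model scores no worse."""
    pool = [local]
    best = score(local)
    for cand in candidates:
        trial_score = score(average(pool + [cand]))
        if trial_score >= best:
            pool.append(cand)
            best = trial_score
    return average(pool)


# Toy metric (hypothetical): higher is better, the optimum is at weight 1.0.
def toy_score(w: Vector) -> float:
    return -abs(w[0] - 1.0)


souped = selective_soup([0.0], [[2.0], [-5.0]], toy_score)
```

In the toy run, the first candidate is accepted because averaging it with the local model moves the weight to the optimum, and the second is rejected because it would pull the average away.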
Client-Level Differential Privacy via Adaptive Intermediary in Federated Medical Imaging
Despite recent progress in enhancing the privacy of federated learning (FL)
via differential privacy (DP), the trade-off of DP between privacy protection
and performance is still underexplored for real-world medical scenarios. In this
paper, we propose to optimize the trade-off under the context of client-level
DP, which focuses on privacy during communications. However, FL for medical
imaging typically involves far fewer participants (hospitals) than other
domains (e.g., mobile devices), so ensuring that clients are differentially
private is much more challenging. To tackle this problem, we propose an adaptive
intermediary strategy to improve performance without harming privacy.
Specifically, we show theoretically that splitting clients into sub-clients,
which serve as intermediaries between hospitals and the server, can mitigate the
noise introduced by DP without harming privacy. Our proposed approach is
empirically evaluated on both classification and segmentation tasks using two
public datasets, and its effectiveness is demonstrated with significant
performance improvements and comprehensive analytical studies. Code is
available at: https://github.com/med-air/Client-DP-FL.
Comment: Accepted by 26th International Conference on Medical Image Computing
and Computer Assisted Intervention (MICCAI'23)
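To see why the number of participants matters, consider the standard Gaussian-mechanism form of client-level DP averaging. Each update is clipped to a norm bound, the clipped updates are averaged, and noise is added to the average. The sketch below assumes this generic recipe, not the paper's specific intermediary strategy. The function names and the `noise_multiplier` parameter are hypothetical. Whether splitting hospitals into sub-clients preserves the privacy guarantee is the paper's theoretical claim and is not demonstrated here.

```python
import math
import random
from typing import List

Vector = List[float]


def clip(update: Vector, max_norm: float) -> Vector:
    """Scale the update down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def noise_std_on_mean(noise_multiplier: float, max_norm: float, n: int) -> float:
    """Standard deviation of the Gaussian noise on the averaged update."""
    return noise_multiplier * max_norm / n


def dp_average(
    updates: List[Vector],
    max_norm: float,
    noise_multiplier: float,
    rng: random.Random,
) -> Vector:
    """Clip each update, average, and add Gaussian noise to every coordinate."""
    n = len(updates)
    clipped = [clip(u, max_norm) for u in updates]
    std = noise_std_on_mean(noise_multiplier, max_norm, n)
    return [
        sum(u[i] for u in clipped) / n + rng.gauss(0.0, std)
        for i in range(len(updates[0]))
    ]


# With the same clipping bound and noise multiplier, more participants
# means less noise on the average.
few = noise_std_on_mean(1.0, 1.0, 2)
many = noise_std_on_mean(1.0, 1.0, 4)
```

Under these assumptions the noise on the averaged update shrinks in proportion to the number of participants. That is the intuition for why a setting with only a handful of hospitals is harder than one with many mobile devices.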
Fair Federated Medical Image Segmentation via Client Contribution Estimation
How to ensure fairness is an important topic in federated learning (FL).
Recent studies have investigated how to reward clients based on their
contribution (collaboration fairness), and how to achieve uniformity of
performance across clients (performance fairness). Despite achieving progress
on either one, we argue that it is critical to consider them together, in order
to engage and motivate more diverse clients joining FL to derive a high-quality
global model. In this work, we propose a novel method to optimize both types of
fairness simultaneously. Specifically, we propose to estimate client
contribution in gradient and data space. In gradient space, we monitor the
gradient direction differences of each client with respect to others. In
data space, we measure the prediction error on client data using an auxiliary
model. Based on this contribution estimation, we propose an FL method, federated
training via contribution estimation (FedCE), which uses the estimated
contributions as global model aggregation weights. We have theoretically
analyzed our method and
empirically evaluated it on two real-world medical datasets. The effectiveness
of our approach has been validated with significant performance improvements,
better collaboration fairness, better performance fairness, and comprehensive
analytical studies.
Comment: Accepted at CVPR 202
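The abstract states that estimated contributions serve as global aggregation weights. It does not say how the gradient-space and data-space estimates are combined. The sketch below therefore takes one non-negative contribution score per client as given and shows only the final weighting step. The function names and the normalization by the sum of scores are assumptions for illustration.

```python
from typing import List

Vector = List[float]


def normalize(scores: List[float]) -> List[float]:
    """Turn non-negative contribution scores into weights that sum to 1."""
    total = sum(scores)
    if total <= 0:
        raise ValueError("contribution scores must have a positive sum")
    return [s / total for s in scores]


def weighted_aggregate(models: List[Vector], scores: List[float]) -> Vector:
    """Contribution-weighted average of client models (hypothetical helper)."""
    weights = normalize(scores)
    return [
        sum(w * m[i] for w, m in zip(weights, models))
        for i in range(len(models[0]))
    ]


# Two clients; the second is assumed to have three times the contribution score.
global_model = weighted_aggregate([[0.0, 0.0], [4.0, 8.0]], [1.0, 3.0])
```

With scores 1 and 3 the weights become 0.25 and 0.75, so the global model sits three quarters of the way toward the second client's parameters.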
Author Correction: Federated learning enables big data for rare cancer boundary detection.
10.1038/s41467-023-36188-7 NATURE COMMUNICATIONS 14
Federated learning enables big data for rare cancer boundary detection.
Although machine learning (ML) has shown promise across disciplines, out-of-sample generalizability is concerning. This is currently addressed by sharing multi-site data, but such centralization is challenging/infeasible to scale due to various limitations. Federated ML (FL) provides an alternative paradigm for accurate and generalizable ML, by only sharing numerical model updates. Here we present the largest FL study to date, involving data from 71 sites across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, reporting the largest such dataset in the literature (n = 6,314). We demonstrate a 33% delineation improvement for the surgically targetable tumor, and 23% for the complete tumor extent, over a publicly trained model. We anticipate our study to: 1) enable more healthcare studies informed by large diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further analyses for glioblastoma by releasing our consensus model, and 3) demonstrate the FL effectiveness at such scale and task-complexity as a paradigm shift for multi-site collaborations, alleviating the need for data-sharing
A New Scheduling Algorithm for Reducing Data Aggregation Latency in Wireless Sensor Networks
Existing works on data aggregation in wireless sensor networks (WSNs) usually use a single channel which results in a long latency due to high interference, especially in high-density networks. Therefore, data aggregation is a fundamental yet time-consuming task in WSNs. We present an improved algorithm to reduce data aggregation latency. Our algorithm has a latency bound of 16R + Δ – 11, where Δ is the maximum degree and R is the network radius. We prove that our algorithm has smaller latency than the algorithm in [1]. The simulation results show that our algorithm has much better performance in practice than previous works
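The latency bound 16R + Δ - 11 can be evaluated for a concrete topology. The sketch below assumes that R is the graph radius in hops (the smallest eccentricity over all nodes) and that Δ is the maximum node degree. The abstract does not define R more precisely, so this reading is an assumption, as are the function names and the adjacency-list format.

```python
from collections import deque
from typing import Dict, List

Graph = Dict[int, List[int]]


def eccentricity(graph: Graph, source: int) -> int:
    """Largest hop distance from source to any node, found by BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    if len(dist) != len(graph):
        raise ValueError("graph must be connected")
    return max(dist.values())


def latency_bound(graph: Graph) -> int:
    """Evaluate 16R + Delta - 11 with R = graph radius, Delta = max degree."""
    radius = min(eccentricity(graph, v) for v in graph)
    max_degree = max(len(neighbors) for neighbors in graph.values())
    return 16 * radius + max_degree - 11


# A five-node path 0-1-2-3-4: radius 2 (from the middle node), max degree 2.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
bound = latency_bound(path)  # 16*2 + 2 - 11 = 23
```

The bound is a worst-case guarantee in time slots, so small graphs like this one will typically finish aggregation well below it.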
Pulmonary Alveolar Proteinosis in Setting of Inhaled Toxin Exposure and Chronic Substance Abuse
Pulmonary alveolar proteinosis (PAP) is a rare lung disorder in which defects in alveolar macrophage maturation or function lead to the accumulation of proteinaceous surfactant in alveolar space, resulting in impaired gas exchange and hypoxemia. PAP is categorized into three types: hereditary, autoimmune, and secondary. We report a case of secondary PAP in a 47-year-old man, whose risk factors include occupational exposure to inhaled toxins, especially aluminum dust, the use of anabolic steroids, and alcohol abuse, which in mice leads to alveolar macrophage dysfunction through a zinc-dependent mechanism that inhibits granulocyte macrophage-colony stimulating factor (GM-CSF) receptor signalling. Although the rarity and vague clinical presentation of PAP can pose diagnostic challenges, clinician awareness of PAP risk factors may facilitate the diagnostic process and lead to more prompt treatment
How do natural and human factors influence ecosystem services changing? A case study in two most developed regions of China
Understanding the mechanisms that influence changes in ecosystem services (ESs) is critical to the sustainable management of ecosystems. However, existing studies ignore the different importance of influencing factors of ESs in different periods and do not consider the spatiotemporal heterogeneity of influence factors. In this study, we first quantified six ESs for the Yangtze River Delta (YRD) and Pearl River Delta (PRD) in 2000 and 2020 based on remote sensing data, including water yield, grain production, climate regulation, air purification, biodiversity, and recreation. Then, eight factors influencing ESs were selected from natural and human perspectives, and random forest was used to determine the importance level of factors influencing ESs. Finally, the GTWR model was used to explore the spatial and temporal differentiation of factors influencing ESs. The results showed that the spatial variation of the six ESs in the YRD and PRD was irregular from 2000 to 2020. In 2000, natural factors (forest, topography, climate) dominated the regional ESs, while in 2020 human factors (population, economy, human activities) gradually replaced the dominance of natural factors on ESs. The spatial and temporal heterogeneity of multiple influence factors on ESs in the YRD and PRD is significant, and we interpret the ecological implications in detail and propose a series of policy recommendations. The results of this study could provide an important reference for scientific guidance to enhance ecological sustainability in developed regions