Federated Transfer Learning with Multimodal Data

Author: Sun Yulian
Publication date: 05/09/2022

Smart cars, smartphones, and other Internet of Things (IoT) devices usually have more than one sensor and therefore produce multimodal data. Federated Learning makes it possible to draw on a wealth of multimodal data from different devices without sharing raw data. Transfer Learning methods help transfer knowledge from some devices to others. Federated Transfer Learning methods combine the benefits of both Federated Learning and Transfer Learning. This newly proposed Federated Transfer Learning framework aims at connecting data islands with privacy protection. Our construction is based on Federated Learning and Transfer Learning. Compared with previous Federated Transfer Learning approaches, where each user must have data with identical modalities (either all unimodal or all multimodal), our new framework is more generic: it allows a hybrid distribution of user data. The core strategy is to use two different but inherently connected training methods for our two types of users. Supervised Learning is adopted for users with only unimodal data (Type 1), while Self-Supervised Learning is applied to users with multimodal data (Type 2) to learn both the features of each modality and the connection between them. This connection knowledge from Type 2 users helps Type 1 users in later stages of training. Training in the new framework can be divided into three steps. In the first step, users who have data with identical modalities are grouped together. For example, users with only sound signals are in group one, those with only images are in group two, users with multimodal data are in group three, and so on. In the second step, Federated Learning is executed within the groups, where Supervised Learning or Self-Supervised Learning is used depending on the group's nature. Most of the Transfer Learning happens in the third step, where the related parts of the networks obtained from the previous steps are aggregated (federated).

Comment: 73 pages, 54 figures, master thesis
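The first two steps described in the abstract above (grouping users by modality, then federating within each group) can be illustrated with a minimal sketch. This is not the thesis's implementation: the helper names `group_by_modalities` and `federated_average`, the toy client dictionary, and the choice of a sample-weighted FedAvg-style average are all assumptions made for illustration.

```python
from collections import defaultdict

import numpy as np


def group_by_modalities(clients):
    """Step 1 (hypothetical helper): group client ids by their set of modalities."""
    groups = defaultdict(list)
    for client_id, modalities in clients.items():
        groups[frozenset(modalities)].append(client_id)
    return dict(groups)


def federated_average(weights, num_samples):
    """Sample-weighted mean of per-client parameter vectors.

    Assumption: a FedAvg-style aggregation; the abstract does not say
    which aggregation rule the thesis uses.
    """
    weights = np.asarray(weights, dtype=float)
    num_samples = np.asarray(num_samples, dtype=float)
    return (weights * num_samples[:, None]).sum(axis=0) / num_samples.sum()


# Toy federation: two sound-only users, one image-only user, one multimodal user.
clients = {
    "u1": {"sound"},
    "u2": {"image"},
    "u3": {"sound", "image"},
    "u4": {"sound"},
}
groups = group_by_modalities(clients)

# Step 2 (sketch): aggregate locally trained parameters within the sound-only group.
sound_group_params = federated_average(
    weights=[[1.0, 1.0], [3.0, 3.0]],  # made-up parameter vectors for u1 and u4
    num_samples=[1, 3],                # made-up local dataset sizes
)
```

In the third step, the abstract describes aggregating the related parts of the group-level networks; that step depends on the network architecture and is not sketched here.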

arXiv.org e-Print Archive

Federated Learning in Computer Vision

Author: Rizzoli G.
Shenaj D.
Zanuttigh P.
Publication venue: IEEE
Publication date: 01/01/2023

Federated Learning (FL) has recently emerged as a novel machine learning paradigm that preserves privacy and accounts for the distributed nature of the learning process in many real-world settings. Computer vision tasks deal with huge datasets, often with critical privacy issues; many federated learning approaches have therefore been presented that exploit FL's distributed and privacy-preserving nature. First, this paper introduces the different FL settings used in computer vision and the main challenges that need to be tackled. Then, it provides a comprehensive overview of the different strategies used for FL in vision applications and presents several approaches for image classification, object detection, and semantic segmentation, as well as for focused settings in face recognition and medical imaging. For each approach, the considered FL setting, the employed data and methodologies, and the achieved results are thoroughly discussed.

Archivio istituzionale della ricerca - Università di Padova

Heterogeneous Datasets for Federated Survival Analysis Simulation

Author: Alberto Archetti
André Martin
Eugenio Lomurno
Francesco Lattari
Matteo Matteucci
Publication date: 01/01/2023

This repo contains three algorithms for constructing realistic federated datasets for survival analysis. Each algorithm starts from an existing non-federated dataset and assigns each sample to a specific client in the federation. The algorithms are:

- uniform_split: assigns each sample to a random client with uniform probability;
- quantity_skewed_split: assigns each sample to a random client according to the Dirichlet distribution [3, 4];
- label_skewed_split: assigns each sample to a time bin, then assigns a set of samples from each bin to the clients according to the Dirichlet distribution [3, 4].

For more information, please take a look at our paper at https://arxiv.org/abs/2301.12166 [1].

Content

- federated_survival_datasets.zip: the content of the repository at https://github.com/archettialberto/federated_survival_datasets
- Heterogheneous_Datasets_for_Federated_Survival_Analysis_Simulation.pdf: the conference paper describing the work.

Installation

Federated Survival Datasets is built on top of numpy and scikit-learn. To install those libraries you can run pip install -r requirements.txt. To import survival datasets into your project, we strongly recommend SurvSet (https://github.com/ErikinBC/SurvSet) [2], a comprehensive collection of more than 70 survival datasets.

Usage

    import numpy as np
    import pandas as pd
    from federated_survival_datasets import label_skewed_split

    # import a survival dataset and extract the input array X and the output array y
    df = pd.read_csv("metabric.csv")
    X = df[[f"x{i}" for i in range(9)]].to_numpy()
    y = np.array(
        [(e, t) for e, t in zip(df["event"], df["time"])],
        dtype=[("event", bool), ("time", float)],
    )

    # run the splitting algorithm
    client_data = label_skewed_split(num_clients=8, X=X, y=y)

    # check the number of samples assigned to each client
    for i, (X_c, y_c) in enumerate(client_data):
        print(f"Client {i} - X: {X_c.shape}, y: {y_c.shape}")

We provide an example notebook in the zipped folder to illustrate the proposed algorithms. It requires scikit-survival, seaborn, and pandas.

References

[1] Archetti, A., Lomurno, E., Lattari, F., Martin, A., & Matteucci, M. (2023). Heterogeneous Datasets for Federated Survival Analysis Simulation. arXiv preprint arXiv:2301.12166.
[2] Drysdale, E. (2022). SurvSet: An open-source time-to-event dataset repository. arXiv preprint arXiv:2203.03094.
[3] Hsu, T. M. H., Qi, H., & Brown, M. (2019). Measuring the effects of non-identical data distribution for federated visual classification. arXiv preprint arXiv:1909.06335.
[4] Li, Q., Diao, Y., Chen, Q., & He, B. (2022, May). Federated learning on non-iid data silos: An experimental study. In 2022 IEEE 38th International Conference on Data Engineering (ICDE) (pp. 965-978). IEEE.
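To make the idea of a Dirichlet-based, quantity-skewed assignment concrete, here is a minimal sketch using only numpy. It is not the repository's `quantity_skewed_split`: the function name `dirichlet_quantity_split`, its parameters (`alpha`, `seed`), and the exact sampling procedure are assumptions made for illustration. The sketch draws per-client proportions from a symmetric Dirichlet distribution and then assigns each sample index to a client with those probabilities, so smaller `alpha` values tend to give more unbalanced client sizes.

```python
import numpy as np


def dirichlet_quantity_split(num_samples, num_clients, alpha=1.0, seed=0):
    """Hypothetical helper: assign sample indices to clients with Dirichlet-skewed sizes."""
    rng = np.random.default_rng(seed)
    # Client proportions drawn from a symmetric Dirichlet(alpha, ..., alpha).
    proportions = rng.dirichlet(alpha * np.ones(num_clients))
    # Each sample picks a client independently, with probability given by the proportions.
    assignment = rng.choice(num_clients, size=num_samples, p=proportions)
    # Return one array of sample indices per client (some may be empty).
    return [np.flatnonzero(assignment == c) for c in range(num_clients)]


parts = dirichlet_quantity_split(num_samples=1000, num_clients=8, alpha=0.5)
for i, idx in enumerate(parts):
    print(f"Client {i}: {len(idx)} samples")
```

The returned index arrays can be used to slice `X` and `y` arrays such as those built in the usage example, for instance `X[parts[0]]` for the first client.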

Archivio istituzionale della ricerca - Politecnico di Milano