Towards causal federated learning : a federated approach to learning representations using causal invariance
Federated learning is an emerging privacy-preserving approach to distributed machine learning that builds a shared model by training locally on participating devices (clients) and aggregating the local models into a global one. Because this approach avoids collecting and aggregating raw data centrally, it greatly reduces the associated privacy risks. However, the data samples across participating clients are usually not independent and identically distributed (non-i.i.d.), and out-of-distribution (OOD) generalization of the learned models can be poor. Federated learning also remains vulnerable to security attacks in which a few malicious participants try to insert backdoors, degrade the aggregated model, or infer the data owned by other participants. In this work, we propose an approach for learning invariant (causal) features common to all participating clients in a federated learning setup and analyse empirically how it improves the OOD accuracy as well as the privacy of the final learned model. Although federated learning allows participants to contribute their local data without revealing it, it faces issues in data security and in accurately paying participants for quality data contributions. In this report, we also propose an EOS Blockchain design and workflow to establish data security and a novel validation-error-based metric by which we qualify gradient uploads for payment, and we implement a small example of our Blockchain Causal Federated Learning model to analyze its performance with respect to robustness, privacy and fairness in incentivization.
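The local-training and aggregation loop that the abstract above describes can be illustrated with a minimal sketch. It is not the authors' method: it assumes plain federated averaging weighted by client sample count, a toy two-parameter linear model, and hypothetical helper names (`local_update`, `aggregate`, `federated_round`). The causal-invariance objective and the blockchain workflow are not modelled.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (x, y) pair held privately by one client


def local_update(weights: Sequence[float], data: Sequence[Sample],
                 lr: float = 0.1, epochs: int = 5) -> List[float]:
    """Train y ~ w0 + w1 * x on one client's data with gradient descent."""
    w0, w1 = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w0 and w1.
        g0 = sum(2 * (w0 + w1 * x - y) for x, y in data) / n
        g1 = sum(2 * (w0 + w1 * x - y) * x for x, y in data) / n
        w0 -= lr * g0
        w1 -= lr * g1
    return [w0, w1]


def aggregate(client_weights: Sequence[Sequence[float]],
              client_sizes: Sequence[int]) -> List[float]:
    """Average the client models, weighting each by its number of samples."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def federated_round(global_weights: Sequence[float],
                    client_datasets: Sequence[Sequence[Sample]]) -> List[float]:
    """One round: each client trains locally; only model weights are shared."""
    updates = [local_update(global_weights, d) for d in client_datasets]
    sizes = [len(d) for d in client_datasets]
    return aggregate(updates, sizes)


if __name__ == "__main__":
    # Two clients with different x ranges (a toy stand-in for non-i.i.d. data),
    # both generated without noise from y = 2x + 1.
    client_a = [(0.0, 1.0), (0.5, 2.0), (1.0, 3.0)]
    client_b = [(1.0, 3.0), (1.5, 4.0), (2.0, 5.0)]
    weights = [0.0, 0.0]
    for _ in range(200):
        weights = federated_round(weights, [client_a, client_b])
    print(weights)
```

In this toy setting the raw samples never leave `local_update`, and only the weight vectors reach `aggregate`. Because both clients share the same underlying relationship, the averaged model should approach an intercept of 1 and a slope of 2. Real non-i.i.d. data would not generally behave this well, which is the problem the abstract highlights.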
A global metagenomic map of urban microbiomes and antimicrobial resistance
We present a global atlas of 4,728 metagenomic samples from mass-transit systems in 60 cities over 3 years, representing the first systematic, worldwide catalog of the urban microbial ecosystem. This atlas provides an annotated, geospatial profile of microbial strains, functional characteristics, antimicrobial resistance (AMR) markers, and genetic elements, including 10,928 viruses, 1,302 bacteria, 2 archaea, and 838,532 CRISPR arrays not found in reference databases. We identified 4,246 known species of urban microorganisms and a consistent set of 31 species found in 97% of samples that were distinct from human commensal organisms. Profiles of AMR genes varied widely in type and density across cities. Cities showed distinct microbial taxonomic signatures that were driven by climate and geographic differences. These results constitute a high-resolution global metagenomic atlas that enables discovery of organisms and genes, highlights potential public health and forensic applications, and provides a culture-independent view of AMR burden in cities.
Funding: the Tri-I Program in Computational Biology and Medicine (CBM) funded by NIH grant 1T32GM083937; GitHub; Philip Blood and the Extreme Science and Engineering Discovery Environment (XSEDE), supported by NSF grant number ACI-1548562 and NSF award number ACI-1445606; NASA (NNX14AH50G, NNX17AB26G), the NIH (R01AI151059, R25EB020393, R21AI129851, R35GM138152, U01DA053941); STARR Foundation (I13-0052); LLS (MCL7001-18, LLS 9238-16, LLS-MCL7001-18); the NSF (1840275); the Bill and Melinda Gates Foundation (OPP1151054); the Alfred P. Sloan Foundation (G-2015-13964); Swiss National Science Foundation grant number 407540_167331; NIH award number UL1TR000457; the US Department of Energy Joint Genome Institute under contract number DE-AC02-05CH11231; the National Energy Research Scientific Computing Center, supported by the Office of Science of the US Department of Energy; Stockholm Health Authority grant SLL 20160933; the Institut Pasteur Korea; an NRF Korea grant (NRF-2014K1A4A7A01074645, 2017M3A9G6068246); the CONICYT Fondecyt Iniciación grants 11140666 and 11160905; Keio University Funds for Individual Research; funds from the Yamagata prefectural government and the city of Tsuruoka; JSPS KAKENHI grant number 20K10436; the bilateral AT-UA collaboration fund (WTZ:UA 02/2019; Ministry of Education and Science of Ukraine, UA:M/84-2019, M/126-2020); Kyiv Academic University; Ministry of Education and Science of Ukraine project numbers 0118U100290 and 0120U101734; Centro de Excelencia Severo Ochoa 2013–2017; the CERCA Programme / Generalitat de Catalunya; the CRG-Novartis-Africa mobility program 2016; research funds from National Cheng Kung University and the Ministry of Science and Technology, Taiwan (MOST grant number 106-2321-B-006-016); we thank all the volunteers who made sampling NYC possible, Minciencias (project no. 639677758300), CNPq (EDN - 309973/2015-5), the Open Research Fund of Key Laboratory of Advanced Theory and Application in Statistics and Data Science – MOE, ECNU, the Research Grants Council of Hong Kong through project 11215017, National Key RD Project of China (2018YFE0201603), and Shanghai Municipal Science and Technology Major Project (2017SHZDZX01) (L.S.