2 research outputs found

    Towards causal federated learning : a federated approach to learning representations using causal invariance

Federated learning is an emerging privacy-preserving approach to distributed machine learning that builds a shared model by training locally on participating devices (clients) and aggregating the local models into a global one. Because this approach avoids data collection and aggregation, it greatly reduces the associated privacy risks. However, the data samples across participating clients are usually not independent and identically distributed (non-i.i.d.), so out-of-distribution (OOD) generalization of the learned models can be poor. Beyond this challenge, federated learning remains vulnerable to security attacks in which a few malicious participants try to insert backdoors, degrade the aggregated model, or infer the data owned by other participants. In this work, we propose an approach for learning invariant (causal) features common to all participating clients in a federated learning setup, and we analyse empirically how it improves both the OOD accuracy and the privacy of the final learned model. Although federated learning allows participants to contribute their local data without revealing it, it faces issues in data security and in accurately paying participants for quality data contributions. In this report, we also propose an EOS Blockchain design and workflow to establish data security and a novel validation-error-based metric by which we qualify gradient uploads for payment, and we implement a small example of our Blockchain Causal Federated Learning model to analyze its performance with respect to robustness, privacy and fairness in incentivization.

    A global metagenomic map of urban microbiomes and antimicrobial resistance

We present a global atlas of 4,728 metagenomic samples from mass-transit systems in 60 cities over 3 years, representing the first systematic, worldwide catalog of the urban microbial ecosystem. This atlas provides an annotated, geospatial profile of microbial strains, functional characteristics, antimicrobial resistance (AMR) markers, and genetic elements, including 10,928 viruses, 1,302 bacteria, 2 archaea, and 838,532 CRISPR arrays not found in reference databases. We identified 4,246 known species of urban microorganisms and a consistent set of 31 species found in 97% of samples that were distinct from human commensal organisms. Profiles of AMR genes varied widely in type and density across cities. Cities showed distinct microbial taxonomic signatures that were driven by climate and geographic differences. These results constitute a high-resolution global metagenomic atlas that enables discovery of organisms and genes, highlights potential public health and forensic applications, and provides a culture-independent view of AMR burden in cities.
Funding: the Tri-I Program in Computational Biology and Medicine (CBM) funded by NIH grant 1T32GM083937; GitHub; Philip Blood and the Extreme Science and Engineering Discovery Environment (XSEDE), supported by NSF grant number ACI-1548562 and NSF award number ACI-1445606; NASA (NNX14AH50G, NNX17AB26G), the NIH (R01AI151059, R25EB020393, R21AI129851, R35GM138152, U01DA053941); STARR Foundation (I13-0052); LLS (MCL7001-18, LLS 9238-16, LLS-MCL7001-18); the NSF (1840275); the Bill and Melinda Gates Foundation (OPP1151054); the Alfred P. Sloan Foundation (G-2015-13964); Swiss National Science Foundation grant number 407540_167331; NIH award number UL1TR000457; the US Department of Energy Joint Genome Institute under contract number DE-AC02-05CH11231; the National Energy Research Scientific Computing Center, supported by the Office of Science of the US Department of Energy; Stockholm Health Authority grant SLL 20160933; the Institut Pasteur Korea; an NRF Korea grant (NRF-2014K1A4A7A01074645, 2017M3A9G6068246); the CONICYT Fondecyt Iniciación grants 11140666 and 11160905; Keio University Funds for Individual Research; funds from the Yamagata prefectural government and the city of Tsuruoka; JSPS KAKENHI grant number 20K10436; the bilateral AT-UA collaboration fund (WTZ:UA 02/2019; Ministry of Education and Science of Ukraine, UA:M/84-2019, M/126-2020); Kyiv Academic University; Ministry of Education and Science of Ukraine project numbers 0118U100290 and 0120U101734; Centro de Excelencia Severo Ochoa 2013–2017; the CERCA Programme / Generalitat de Catalunya; the CRG-Novartis-Africa mobility program 2016; research funds from National Cheng Kung University and the Ministry of Science and Technology, Taiwan (MOST grant number 106-2321-B-006-016); we thank all the volunteers who made sampling NYC possible, Minciencias (project no. 639677758300), CNPq (EDN - 309973/2015-5), the Open Research Fund of Key Laboratory of Advanced Theory and Application in Statistics and Data Science – MOE, ECNU, the Research Grants Council of Hong Kong through project 11215017, National Key R&D Project of China (2018YFE0201603), and Shanghai Municipal Science and Technology Major Project (2017SHZDZX01) (L.S.