12 research outputs found

    Uncovering the Potential of Federated Learning: Addressing Algorithmic and Data-driven Challenges under Privacy Restrictions

    Get PDF
    Federated learning is a groundbreaking distributed machine learning paradigm that allows for the collaborative training of models across various entities without directly sharing sensitive data, ensuring privacy and robustness. This Ph.D. dissertation delves into the intricacies of federated learning, investigating the algorithmic and data-driven challenges of deep learning models in the presence of additive noise in this framework. The main objective is to provide strategies to measure the generalization, stability, and privacy-preserving capabilities of these models and further improve them. To this end, five noise infusion mechanisms at varying noise levels within centralized and federated learning settings are explored. As model complexity is a key component of the generalization and stability of deep learning models during training and evaluation, a comparative analysis of three Convolutional Neural Network (CNN) architectures is provided. A key contribution of this study is introducing specific metrics for training with noise. Signal-to-Noise Ratio (SNR) is introduced as a quantitative measure of the trade-off between privacy and training accuracy of noise-infused models, aiming to find the noise level that yields optimal privacy and accuracy. Moreover, the Price of Stability and Price of Anarchy are defined in the context of privacy-preserving deep learning, contributing to the systematic investigation of the noise infusion mechanisms to enhance privacy without compromising performance. This research sheds light on the delicate balance between these critical factors, fostering a deeper understanding of the implications of noise-based regularization in machine learning. The present study also explores a real-world application of federated learning in weather prediction applications that suffer from the issue of imbalanced datasets. Utilizing data from multiple sources combined with advanced data augmentation techniques improves the accuracy and generalization of weather prediction models, even when dealing with imbalanced datasets. Overall, federated learning is pivotal in harnessing decentralized datasets for real-world applications while safeguarding privacy. By leveraging noise as a tool for regularization and privacy enhancement, this research study aims to contribute to the development of robust, privacy-aware algorithms, ensuring that AI-driven solutions prioritize both utility and privacy

    Enabling NATO’s Collective Defense: Critical Infrastructure Security and Resiliency (NATO COE-DAT Handbook 1)

    Get PDF
    In 2014 NATO’s Center of Excellence-Defence Against Terrorism (COE-DAT) launched the inaugural course on “Critical Infrastructure Protection Against Terrorist Attacks.” As this course garnered increased attendance and interest, the core lecturer team felt the need to update the course in critical infrastructure (CI) taking into account the shift from an emphasis on “protection” of CI assets to “security and resiliency.” What was lacking in the fields of academe, emergency management, and the industry practitioner community was a handbook that leveraged the collective subject matter expertise of the core lecturer team, a handbook that could serve to educate government leaders, state and private-sector owners and operators of critical infrastructure, academicians, and policymakers in NATO and partner countries. Enabling NATO’s Collective Defense: Critical Infrastructure Security and Resiliency is the culmination of such an effort, the first major collaborative research project under a Memorandum of Understanding between the US Army War College Strategic Studies Institute (SSI), and NATO COE-DAT. The research project began in October 2020 with a series of four workshops hosted by SSI. The draft chapters for the book were completed in late January 2022. Little did the research team envision the Russian invasion of Ukraine in February this year. The Russian occupation of the Zaporizhzhya nuclear power plant, successive missile attacks against Ukraine’s electric generation and distribution facilities, rail transport, and cyberattacks against almost every sector of the country’s critical infrastructure have been on world display. Russian use of its gas supplies as a means of economic warfare against Europe—designed to undermine NATO unity and support for Ukraine—is another timely example of why adversaries, nation-states, and terrorists alike target critical infrastructure. Hence, the need for public-private sector partnerships to secure that infrastructure and build the resiliency to sustain it when attacked. Ukraine also highlights the need for NATO allies to understand where vulnerabilities exist in host nation infrastructure that will undermine collective defense and give more urgency to redressing and mitigating those fissures.https://press.armywarcollege.edu/monographs/1951/thumbnail.jp

    Low-Resource Unsupervised NMT:Diagnosing the Problem and Providing a Linguistically Motivated Solution

    Get PDF
    Unsupervised Machine Translation hasbeen advancing our ability to translatewithout parallel data, but state-of-the-artmethods assume an abundance of mono-lingual data. This paper investigates thescenario where monolingual data is lim-ited as well, finding that current unsuper-vised methods suffer in performance un-der this stricter setting. We find that theperformance loss originates from the poorquality of the pretrained monolingual em-beddings, and we propose using linguis-tic information in the embedding train-ing scheme. To support this, we look attwo linguistic features that may help im-prove alignment quality: dependency in-formation and sub-word information. Us-ing dependency-based embeddings resultsin a complementary word representationwhich offers a boost in performance ofaround 1.5 BLEU points compared to stan-dardWORD2VECwhen monolingual datais limited to 1 million sentences per lan-guage. We also find that the inclusion ofsub-word information is crucial to improv-ing the quality of the embedding
    corecore