Tiny Groups Tackle Byzantine Adversaries
A popular technique for tolerating malicious faults in open distributed
systems is to establish small groups of participants, each of which has a
non-faulty majority. These groups are used as building blocks to design
attack-resistant algorithms.
Despite over a decade of active research, current constructions require group
sizes of O(log n), where n is the number of participants in the system.
This group size is important since communication and state costs scale
polynomially with this parameter. Given the stubbornness of this logarithmic
barrier, a natural question is whether better bounds are possible.
Here, we consider an attacker that controls a constant fraction of the total
computational resources in the system. By leveraging proof-of-work (PoW), we
demonstrate how to reduce the group size exponentially to O(log log n) while
maintaining strong security guarantees. This reduction in group size yields a
significant improvement in communication and state costs.
Comment: This work is supported by the National Science Foundation grant CCF
1613772 and a C Spire Research Gift.
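As a rough illustration of why the logarithmic barrier arises (this sketch is
not taken from the paper; the function name and parameters are hypothetical),
the following computes the probability that a uniformly sampled group acquires
a Byzantine majority when each member is adversarial independently with
probability f, a standard binomial approximation of sampling from a population
with adversarial fraction f:

```python
from math import comb

def byzantine_majority_prob(g, f):
    """Probability that a randomly sampled group of size g has a
    Byzantine majority, when each member is Byzantine independently
    with probability f (binomial approximation of sampling)."""
    return sum(comb(g, k) * f**k * (1 - f)**(g - k)
               for k in range(g // 2 + 1, g + 1))

# The failure probability decays exponentially in the group size, so
# guaranteeing that all of roughly n groups are good (union bound:
# n * p < 1) forces the group size to grow as O(log n).
assert byzantine_majority_prob(32, 0.25) < byzantine_majority_prob(8, 0.25)
```

The exponential decay in g is exactly why shaving the group size below
O(log n) requires a qualitatively different assumption, such as the bounded
computational power that PoW provides.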
SPFL: A Self-purified Federated Learning Method Against Poisoning Attacks
While federated learning (FL) is attractive for training on distributed data
in a privacy-preserving manner, the credibility of participating clients and
the non-inspectability of their data pose new security threats, of which
poisoning attacks are particularly rampant and hard to defend against without
compromising privacy, performance, or other desirable properties of FL. To
tackle this problem, we propose a self-purified FL (SPFL) method that enables
benign clients to exploit trusted historical features of a locally purified
model to supervise the training of the aggregated model in each iteration.
The purification is performed by attention-guided self-knowledge
distillation, where the teacher and student models are optimized locally for
task loss, distillation loss, and attention-based loss simultaneously. SPFL
imposes no restriction on the communication protocol or the aggregator at the
server, and it can work in tandem with any existing secure aggregation
algorithm or protocol for augmented security and privacy guarantees. We
experimentally demonstrate that SPFL outperforms state-of-the-art FL defenses
against various poisoning attacks. The attack success rate of an SPFL-trained
model is at most 3% above that of a clean model, even if the poisoning attack
is launched in every iteration with all but one of the clients being
malicious. Meanwhile, SPFL improves model quality on normal inputs compared
to FedAvg, both under attack and in the absence of an attack.
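As a toy illustration of the three-term objective described above (task loss,
distillation loss against the purified teacher, and an attention-based loss),
the sketch below combines them for a single example. The function names, the
weights alpha and beta, and the list-based tensors are hypothetical
simplifications, not the paper's implementation:

```python
from math import exp, log

def softmax(logits):
    # numerically stable softmax over a list of logits
    m = max(logits)
    e = [exp(x - m) for x in logits]
    s = sum(e)
    return [v / s for v in e]

def spfl_style_loss(student_logits, teacher_logits, label,
                    student_attn, teacher_attn,
                    alpha=0.5, beta=0.1):
    """Toy three-term objective: task cross-entropy, distillation KL
    against the (purified) teacher's output, and an MSE between
    attention maps. alpha/beta are illustrative weights."""
    p_s = softmax(student_logits)
    p_t = softmax(teacher_logits)
    task = -log(p_s[label])                                   # cross-entropy
    distill = sum(t * log(t / s) for t, s in zip(p_t, p_s))   # KL(teacher || student)
    attn = sum((a - b) ** 2
               for a, b in zip(teacher_attn, student_attn)) / len(student_attn)
    return task + alpha * distill + beta * attn
```

A student that matches the teacher pays only the task term; a poisoned update
that drifts from the purified teacher is penalized by the distillation and
attention terms, which is the intuition behind the self-purification.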
Emerging Trustworthiness Issues in Distributed Learning Systems
A distributed learning system allocates learning processes onto several workstations to enable faster learning algorithms. Federated Learning (FL) is an increasingly popular type of distributed learning which allows mutually untrusted clients to collaboratively train a common machine learning model without sharing their private/proprietary training data with each other. In this dissertation, we aim to address emerging trustworthiness issues in distributed learning systems, particularly in the field of FL.
First, we tackle the issue of robustness in FL and demonstrate its susceptibility to poisoning by presenting a comprehensive analysis of the various poisoning attacks and defensive aggregation rules proposed in the literature and connecting them under a common framework. To address this issue, we propose Federated Rank Learning (FRL), which reduces the space of client updates from a continuous space of float numbers in standard FL to a discrete space of integer values, limiting the adversary's options for poisoning attacks.
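To make the discrete update space concrete, here is a minimal sketch (an
assumed illustration, not FRL's actual code) of mapping a float parameter
update to integer ranks and aggregating rank updates by a coordinate-wise
vote:

```python
def to_rank_update(weights):
    """Map a float parameter vector to its integer ranks (0 = smallest).
    Every honest or malicious update is then a permutation of 0..d-1,
    which bounds how far a poisoned update can deviate from benign ones."""
    order = sorted(range(len(weights)), key=lambda i: weights[i])
    ranks = [0] * len(weights)
    for r, i in enumerate(order):
        ranks[i] = r
    return ranks

def vote_on_ranks(rank_updates):
    """Server-side sketch: sum ranks coordinate-wise (a Borda-style
    vote) instead of averaging raw floats, so no single client can
    inject an unbounded value."""
    d = len(rank_updates[0])
    return [sum(u[i] for u in rank_updates) for i in range(d)]
```

The key property is that a malicious client can only permute ranks, not scale
them, so its influence on the vote is bounded by construction.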
Next, we address the privacy concerns in FL, including access privacy and data privacy. An adversarial server in FL gains information about the data distribution of a target client by monitoring either (I) the local updates that the target submits throughout FL training or (II) the access pattern of the target, which can be privacy-sensitive in many real-world scenarios. To preserve access privacy, we design Heterogeneous Private Information Retrieval (HPIR), which allows clients to fetch their specific model parameters from untrusted servers without leaking any information. We believe that HPIR will enable new application scenarios for private distributed learning systems, as well as improve the usability of some of the known applications of PIR. To preserve data privacy, we show that local rankings leak less information about private training data: we conduct a comprehensive investigation of the privacy of rankings in FRL, measuring data leakage relative to the weight parameter updates of standard FL in the presence of the state-of-the-art white-box membership inference attack.
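HPIR's construction is not described in this abstract. As background, the
classic two-server XOR-based PIR scheme sketched below shows the core idea any
PIR protocol builds on: a client fetches one record without revealing its
index to either (non-colluding) server, because each server sees only a
uniformly random subset query:

```python
import secrets

def pir_query(db_size, index):
    """Build the two queries: a uniformly random subset S (as a 0/1
    mask) and S with the target index flipped. Either mask alone is
    uniformly random, so a single server learns nothing."""
    mask1 = [secrets.randbelow(2) for _ in range(db_size)]
    mask2 = list(mask1)
    mask2[index] ^= 1
    return mask1, mask2

def pir_answer(db, mask):
    """Server side: XOR together the integer records the mask selects."""
    out = 0
    for record, bit in zip(db, mask):
        if bit:
            out ^= record
    return out

def pir_reconstruct(answer1, answer2):
    # Everything except the target record cancels in the XOR.
    return answer1 ^ answer2

db = [7, 13, 42, 99]
q1, q2 = pir_query(len(db), 2)
assert pir_reconstruct(pir_answer(db, q1), pir_answer(db, q2)) == 42
```

In an FL setting, the "records" would be blocks of model parameters; the
heterogeneity that HPIR adds on top of this basic picture is beyond the scope
of this sketch.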
Finally, we address the issue of fairness in FL, where a single model cannot represent all clients equally due to heterogeneity in their data distributions. To alleviate this issue, we propose Equal and Equitable Federated Learning (E2FL). E2FL produces fair federated learning models by preserving both equity and equality among the participating clients: building on learning over parameter rankings, multiple global models are learned so that each group of clients can benefit from a personalized model.
Byzantine-Robust Learning on Heterogeneous Data via Gradient Splitting
Federated learning has exhibited vulnerabilities to Byzantine attacks, where
the Byzantine attackers can send arbitrary gradients to a central server to
destroy the convergence and performance of the global model. A wealth of robust
AGgregation Rules (AGRs) have been proposed to defend against Byzantine
attacks. However, Byzantine clients can still circumvent robust AGRs when data
is non-Identically and Independently Distributed (non-IID). In this paper, we
first reveal the root causes of performance degradation of current robust AGRs
in non-IID settings: the curse of dimensionality and gradient heterogeneity. In
order to address this issue, we propose GAS, a GrAdient Splitting approach
that can
successfully adapt existing robust AGRs to non-IID settings. We also provide a
detailed convergence analysis when the existing robust AGRs are combined with
GAS. Experiments on various real-world datasets verify the efficacy of our
proposed GAS. The implementation code is available at
https://github.com/YuchenLiu-a/byzantine-gas
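As a rough sketch of the gradient-splitting idea (a simplification, not the
paper's exact GAS procedure), one can partition each client's gradient into
low-dimensional splits and apply a robust aggregation rule, such as the
coordinate-wise median, on each split separately, which counters the curse of
dimensionality that defeats robust AGRs on full gradients:

```python
def split_and_aggregate(gradients, num_splits, robust_agr):
    """Partition each client's gradient into num_splits contiguous
    sub-vectors, run the given robust AGR on each low-dimensional
    split, then concatenate the per-split aggregates."""
    d = len(gradients[0])
    bounds = [round(i * d / num_splits) for i in range(num_splits + 1)]
    aggregated = []
    for lo, hi in zip(bounds, bounds[1:]):
        sub = [g[lo:hi] for g in gradients]
        aggregated.extend(robust_agr(sub))
    return aggregated

def coordinate_median(sub_gradients):
    """Coordinate-wise median, used here as an example robust AGR."""
    d = len(sub_gradients[0])
    agg = []
    for i in range(d):
        vals = sorted(g[i] for g in sub_gradients)
        agg.append(vals[len(vals) // 2])
    return agg
```

With two benign clients and one Byzantine client submitting a huge gradient,
the per-split median recovers the benign value in every coordinate.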
Decentralised computer systems
The architecture of the Web was designed to enable decentralised exchange of information. Early architects envisioned an egalitarian yet organic society thriving in cyberspace. The reality of the Web today, unfortunately, does not bear out these visions: information networks have repeatedly shown a tendency towards consolidation and centralisation with the current Web split between a handful of large corporations.
The advent of Bitcoin and successor blockchain networks re-ignited interest in developing alternatives to the centralised Web and paving a way back to the earlier architectural visions for the Web. This has led to immense hype around these technologies with the cryptocurrency market valued at several hundred billions of dollars at the time of writing. With great hype, apparently, come great scams. I start off by analysing the use of Bitcoin as an enabler for crime and then present both technical solutions as well as policy recommendations to mitigate the harm these crimes cause.
These policy recommendations then lead us on to look more closely at cryptocurrency's tamer cousin: permissioned blockchains. These systems, while less revolutionary in their premise, nevertheless aim to provide sweeping improvements in the efficiency and transparency of existing enterprise systems. To see whether they work in practice, I present the results of my work in delivering a production permissioned blockchain system to real users. This involves comparing several permissioned blockchain systems, exploring their deficiencies and developing solutions for the most egregious of those.
Lastly, I do a deep dive into one of the most persistent technical issues with permissioned blockchains, and decentralised networks in general: the lack of scalability in their consensus mechanisms. I present two novel consensus algorithms that aim to improve upon the state of the art in several ways. The first is designed to enable existing permissioned blockchain networks to scale to thousands of nodes. The second presents an entirely new way of building decentralised consensus systems, utilising a trie-based data structure at its core as opposed to the usual linear ledgers used in current systems.
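To illustrate what a trie-based core might look like (a minimal assumed
sketch, not the dissertation's actual design), the following Merkle-ized trie
commits an entire key/value state to a single root digest, so nodes can agree
on state by agreeing on one hash rather than replaying a linear ledger:

```python
import hashlib

class TrieNode:
    """Minimal Merkle-ized trie: each node's hash commits to its own
    value and to every child's hash, so the root digest authenticates
    the whole key/value state."""

    def __init__(self):
        self.children = {}
        self.value = None

    def insert(self, key, value):
        # walk/extend the path character by character, set value at leaf
        node = self
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.value = value

    def root_hash(self):
        # hash own value plus (sorted) child edges and child hashes,
        # giving a deterministic digest independent of insertion order
        h = hashlib.sha256()
        h.update(repr(self.value).encode())
        for ch in sorted(self.children):
            h.update(ch.encode())
            h.update(self.children[ch].root_hash().encode())
        return h.hexdigest()
```

Because child edges are hashed in sorted order, two replicas that applied the
same updates in different orders still arrive at the same root digest, which
is the property a consensus protocol over a trie would rely on.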