Towards Sybil Resilience in Decentralized Learning
Federated learning is a privacy-preserving machine learning technology but
suffers from limited scalability. This limitation mostly originates from the
internet connection and memory capacity of the central parameter server, and
the complexity of the model aggregation function. Decentralized learning has
recently been emerging as a promising alternative to federated learning. This
novel technology eliminates the need for a central parameter server by
decentralizing the model aggregation across all participating nodes. Numerous
studies have been conducted on improving the resilience of federated learning
against poisoning and Sybil attacks, whereas the resilience of decentralized
learning remains largely unstudied. This research gap serves as the main
motivator for this study, in which our objective is to improve the Sybil
poisoning resilience of decentralized learning.
We present SybilWall, an innovative algorithm focused on increasing the
resilience of decentralized learning against targeted Sybil poisoning attacks.
By combining a Sybil-resistant aggregation function based on similarity between
Sybils with a novel probabilistic gossiping mechanism, we establish a new
benchmark for scalable, Sybil-resilient decentralized learning.
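The abstract does not spell out SybilWall's aggregation rule, but similarity-based Sybil defenses typically down-weight incoming updates that look like near-duplicates of one another, since Sybil replicas tend to push highly similar models. The following is a minimal Python sketch of that idea; the function name, the cosine-similarity weighting, and the thresholds are illustrative assumptions, not SybilWall's actual rule:

    import numpy as np

    def sybil_resistant_aggregate(updates: np.ndarray) -> np.ndarray:
        """Aggregate neighbor updates, down-weighting near-duplicates.

        `updates` has shape (n_neighbors, n_params). Each update's weight
        is scaled down by its maximum cosine similarity to any other
        update, so clusters of Sybil clones receive little mass.
        Illustrative sketch only, not SybilWall's actual aggregation.
        """
        norms = np.linalg.norm(updates, axis=1, keepdims=True)
        unit = updates / np.clip(norms, 1e-12, None)
        sim = unit @ unit.T                       # pairwise cosine similarity
        np.fill_diagonal(sim, -np.inf)            # ignore self-similarity
        weights = np.clip(1.0 - sim.max(axis=1), 0.0, 1.0)
        if weights.sum() == 0:                    # degenerate: all identical
            weights = np.ones(len(updates))
        return (weights / weights.sum()) @ updates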
A comprehensive empirical evaluation demonstrated that SybilWall outperforms
existing state-of-the-art solutions designed for federated learning scenarios
and is the only algorithm to obtain consistent accuracy over a range of
adversarial attack scenarios. We also found SybilWall to diminish the utility
of creating many Sybils, as our evaluations demonstrate a higher success rate
among adversaries employing fewer Sybils. Finally, we suggest a number of
possible improvements to SybilWall and highlight promising future research
directions.
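The probabilistic gossiping mechanism is likewise not detailed in the abstract. As a generic point of reference, a probabilistic push-gossip round usually amounts to forwarding one's current model to a few randomly chosen peers with some probability; the sketch below assumes exactly that, and every name in it (`send_fn`, `p_forward`, `fanout`) is hypothetical:

    import random

    def gossip_round(node_id, model, neighbors, send_fn,
                     p_forward=0.5, fanout=2):
        """One round of probabilistic push-gossip: with probability
        `p_forward`, send the current model to `fanout` randomly chosen
        neighbors. Generic sketch; SybilWall's mechanism may differ."""
        if neighbors and random.random() < p_forward:
            for peer in random.sample(neighbors, min(fanout, len(neighbors))):
                send_fn(peer, node_id, model)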
Decentralized learning with budgeted network load using Gaussian copulas and classifier ensembles
We examine a network of learners that address the same classification task
but must learn from different data sets. The learners cannot share data but
instead share their models. Models are shared only once so as to limit the
network load. We introduce DELCO (standing for Decentralized Ensemble
Learning with COpulas), a new approach for aggregating the predictions of the
classifiers trained by each learner. The proposed method aggregates the
base classifiers using a probabilistic model relying on Gaussian copulas.
Experiments on logistic regression ensembles demonstrate competitive accuracy
and increased robustness in the case of dependent classifiers. A companion
Python implementation can be downloaded at https://github.com/john-klein/DELCO.
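As a rough illustration of copula-based fusion, the sketch below scores each class by reweighting the independent product of the base classifiers' posteriors with a Gaussian copula density whose correlation matrix would be estimated per class on validation data. All names are assumptions and the exact DELCO model may differ:

    import numpy as np
    from scipy.stats import norm

    def gaussian_copula_density(u: np.ndarray, R: np.ndarray) -> float:
        """Density of the Gaussian copula with correlation matrix R
        at a point u in (0, 1)^d."""
        z = norm.ppf(np.clip(u, 1e-6, 1 - 1e-6))   # probit transform
        quad = z @ (np.linalg.inv(R) - np.eye(len(z))) @ z
        return np.exp(-0.5 * quad) / np.sqrt(np.linalg.det(R))

    def copula_fuse(probs: np.ndarray, R_per_class: list,
                    priors: np.ndarray) -> int:
        """Fuse base-classifier posteriors for one test point.

        `probs[i, y]` is classifier i's estimated P(y | x). For each
        class, the classifiers' scores are treated as dependent through
        a Gaussian copula. Illustrative sketch, not DELCO's exact model.
        """
        scores = [priors[y] * np.prod(probs[:, y])
                  * gaussian_copula_density(probs[:, y], R)
                  for y, R in enumerate(R_per_class)]
        return int(np.argmax(scores))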
Making Byzantine Decentralized Learning Efficient
Decentralized-SGD (D-SGD) distributes heavy learning tasks across multiple
machines (a.k.a. nodes), effectively dividing the workload per node by
the size of the system. However, a handful of Byzantine (i.e.,
misbehaving) nodes can jeopardize the entire learning procedure. This
vulnerability is further amplified when the system is asynchronous.
Although approaches that confer Byzantine resilience to D-SGD have been
proposed, these significantly impact the efficiency of the process, to the
point of even negating the benefit of decentralization. This naturally raises
the question: can decentralized learning simultaneously enjoy Byzantine
resilience and reduced workload per node?
We answer positively by proposing a new algorithm that ensures Byzantine
resilience without losing the computational efficiency of D-SGD. Essentially,
our algorithm weakens the impact of Byzantine nodes by reducing the variance
in local updates using Polyak's momentum. Then, by establishing
coordination between nodes via signed echo broadcast and a
nearest-neighbor averaging scheme, we effectively tolerate Byzantine nodes
while distributing the overhead among the non-Byzantine nodes. To
demonstrate the correctness of our algorithm, we introduce and analyze a novel
Lyapunov function that accounts for the non-Markovian model drift
arising from the use of momentum. We also demonstrate the efficiency of
our algorithm through experiments on several image classification tasks.
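The abstract names the two key ingredients, Polyak's momentum for variance reduction and nearest-neighbor averaging for mixing, without giving formulas. A minimal sketch of how the two pieces might fit together, with hypothetical names and a distance-based neighbor selection that the paper may well refine:

    import numpy as np

    def local_momentum_step(params, momentum, grad, lr=0.1, beta=0.9):
        """One Polyak ('heavy-ball') momentum step; momentum smooths
        local updates, which weakens Byzantine influence."""
        momentum = beta * momentum + grad
        return params - lr * momentum, momentum

    def nearest_neighbor_average(own, received, n_byzantine):
        """Average own parameters with the neighbor models closest to
        them, discarding the `n_byzantine` most distant ones.
        Illustrative rule; the paper's exact scheme may differ."""
        dists = np.linalg.norm(received - own, axis=1)
        keep = np.argsort(dists)[: max(len(received) - n_byzantine, 0)]
        return np.vstack([own[None, :], received[keep]]).mean(axis=0)

Signed echo broadcast, which coordinates which models the honest nodes actually accept, is a communication-layer primitive and is omitted from this sketch.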
Two problems in applying Ljung's 'projection algorithms' to the analysis of decentralized learning
We show that Ljung's projection algorithms, which have recently been used by economists to establish convergence to rational expectations equilibrium, do not
seem to apply to learning or forecasting behavior that one would normally call
"decentralized." If the algorithm is defined in a way that allows individuals to have
differing information, then Ljung's theorem does not apply. And even if a similar
theorem could be proved that would allow for differing information, there remains
a Lyapunov-like condition that is central to Ljung's projection method and that
requires that individual beliefs be closely related to the equilibrium and to one
another.