Lagrange Coded Computing: Optimal Design for Resiliency, Security and Privacy

Author: Avestimehr Salman
Kalan Seyed Mohammadreza Mousavi
Li Songze
Raviv Netanel
Soltanolkotabi Mahdi
Yu Qian
Publication date: 01/04/2019

We consider a scenario involving computations over a massive dataset stored distributedly across multiple workers, which is at the core of distributed learning algorithms. We propose Lagrange Coded Computing (LCC), a new framework to simultaneously provide (1) resiliency against stragglers that may prolong computations; (2) security against Byzantine (or malicious) workers that deliberately modify the computation for their benefit; and (3) (information-theoretic) privacy of the dataset amidst possible collusion of workers. LCC, which leverages the well-known Lagrange polynomial to create computation redundancy in a novel coded form across workers, can be applied to any computation scenario in which the function of interest is an arbitrary multivariate polynomial of the input dataset, hence covering many computations of interest in machine learning. LCC significantly generalizes prior works to go beyond linear computations. It also enables secure and private computing in distributed settings, improving the computation and communication efficiency of the state-of-the-art. Furthermore, we prove the optimality of LCC by showing that it achieves the optimal tradeoff between resiliency, security, and privacy, i.e., in terms of tolerating the maximum number of stragglers and adversaries, and providing data privacy against the maximum number of colluding workers. Finally, we show via experiments on Amazon EC2 that LCC speeds up the conventional uncoded implementation of distributed least-squares linear regression by up to $13.43\times$, and also achieves a $2.36\times$-$12.65\times$ speedup over the state-of-the-art straggler mitigation strategies.
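The encoding idea in this abstract can be illustrated with a minimal sketch. It covers straggler resiliency only and rests on several assumptions not stated in the abstract: the data are scalars in a prime field, the function `f` is a fixed degree-2 polynomial, no random padding is added for privacy, and no worker is Byzantine. The names `lagrange_eval`, `lcc_compute`, `P`, and `DEG_F` are hypothetical.

```python
P = 2_147_483_647  # prime field modulus (2**31 - 1); an illustrative choice
DEG_F = 2          # degree of the polynomial f below


def lagrange_eval(xs, ys, z, p=P):
    """Evaluate at z the unique polynomial of degree < len(xs)
    passing through the points (xs[i], ys[i]), with arithmetic mod p."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (z - xj) % p
                den = den * (xi - xj) % p
        # pow(den, -1, p) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, p)) % p
    return total


def f(x, p=P):
    """Example computation: a degree-2 polynomial of one data block."""
    return (x * x + 3 * x + 1) % p


def lcc_compute(data, n_workers, stragglers=()):
    """Return [f(x) for x in data] using coded shares, ignoring stragglers."""
    k = len(data)
    betas = list(range(1, k + 1))                    # points carrying the data
    alphas = list(range(k + 1, k + 1 + n_workers))   # one point per worker
    # Encoding: worker i stores u(alpha_i), where u interpolates the data.
    shares = [lagrange_eval(betas, data, a) for a in alphas]
    # Each responsive worker applies f to its share only.
    results = [(a, f(s)) for i, (a, s) in enumerate(zip(alphas, shares))
               if i not in stragglers]
    # f(u(z)) has degree (k - 1) * DEG_F, so this many results suffice.
    needed = (k - 1) * DEG_F + 1
    if len(results) < needed:
        raise ValueError("too many stragglers to decode")
    xs, ys = zip(*results[:needed])
    # Decoding: interpolate f(u(z)) and read it off at the data points.
    return [lagrange_eval(xs, ys, b) for b in betas]
```

With three data blocks and seven workers, `lcc_compute([5, 7, 11], n_workers=7, stragglers={0, 3})` returns `[41, 71, 155]`, which matches applying `f` to each block directly even though two workers never responded.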

arXiv.org e-Print Archive

Caltech Authors

Large-Scale Kernel Methods for Independence Testing

Author: Filippi Sarah
Gretton Arthur
Sejdinovic Dino
Zhang Qinyi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 25/06/2016

Representations of probability measures in reproducing kernel Hilbert spaces provide a flexible framework for fully nonparametric hypothesis tests of independence, which can capture any type of departure from independence, including nonlinear associations and multivariate interactions. However, these approaches come with an at least quadratic computational cost in the number of observations, which can be prohibitive in many applications. Arguably, it is exactly in such large-scale datasets that capturing any type of dependence is of interest, so striking a favourable tradeoff between computational efficiency and test performance for kernel independence tests would have a direct impact on their applicability in practice. In this contribution, we provide an extensive study of the use of large-scale kernel approximations in the context of independence testing, contrasting block-based, Nyström and random Fourier feature approaches. Through a variety of synthetic data experiments, it is demonstrated that our novel large scale methods give comparable performance with existing methods whilst using significantly less computation time and memory. Comment: 29 pages, 6 figures
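As a rough illustration of the random Fourier feature approach named in this abstract, the sketch below computes a dependence statistic whose cost grows linearly with the number of observations. The abstract does not give the statistic's exact form, so this is an assumption: the squared Frobenius norm of the empirical cross-covariance between feature maps approximating Gaussian kernels. The function names, the 1-D inputs, and all parameters are hypothetical, and no null distribution or p-value is computed.

```python
import math
import random


def rff_features(values, freqs, phases):
    """Map 1-D samples to random Fourier features for a Gaussian kernel."""
    scale = math.sqrt(2.0 / len(freqs))
    return [[scale * math.cos(w * v + b) for w, b in zip(freqs, phases)]
            for v in values]


def center(features):
    """Subtract the per-feature mean from every row."""
    n = len(features)
    means = [sum(col) / n for col in zip(*features)]
    return [[v - m for v, m in zip(row, means)] for row in features]


def rff_hsic(xs, ys, num_features=20, bandwidth=1.0, seed=0):
    """Squared Frobenius norm of the feature cross-covariance matrix.

    Cost is O(n * num_features**2), so linear in the sample size n."""
    rng = random.Random(seed)

    def draw():
        freqs = [rng.gauss(0.0, 1.0 / bandwidth) for _ in range(num_features)]
        phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
        return freqs, phases

    cols_x = list(zip(*center(rff_features(xs, *draw()))))
    cols_y = list(zip(*center(rff_features(ys, *draw()))))
    n = len(xs)
    stat = 0.0
    for col_x in cols_x:
        for col_y in cols_y:
            cov = sum(a * b for a, b in zip(col_x, col_y)) / n
            stat += cov * cov
    return stat
```

The statistic should be noticeably larger for dependent samples (for example `ys` equal to `xs` plus small noise) than for independently drawn ones. Turning it into a test would additionally need a null distribution, for instance from permutations, which is omitted here.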

arXiv.org e-Print Archive

Springer - Publisher Connector

Oxford University Research Archive

Spiral - Imperial College Digital Repository

On the Complexity of Solving Quadratic Boolean Systems

Author: Magali Bardet
Jean-Charles Faugère
Bruno Salvy
Pierre-Jean Spaenlehauer
Publication venue: 'Elsevier BV'
Publication date: 25/05/2012

A fundamental problem in computer science is to find all the common zeroes of $m$ quadratic polynomials in $n$ unknowns over $\mathbb{F}_2$. The cryptanalysis of several modern ciphers reduces to this problem. Up to now, the best complexity bound was reached by an exhaustive search in $4\log_2 n\,2^n$ operations. We give an algorithm that reduces the problem to a combination of exhaustive search and sparse linear algebra. This algorithm has several variants depending on the method used for the linear algebra step. Under precise algebraic assumptions on the input system, we show that the deterministic variant of our algorithm has complexity bounded by $O(2^{0.841n})$ when $m=n$, while a probabilistic variant of the Las Vegas type has expected complexity $O(2^{0.792n})$. Experiments on random systems show that the algebraic assumptions are satisfied with probability very close to 1. We also give a rough estimate for the actual threshold between our method and exhaustive search, which is as low as 200, and thus very relevant for cryptographic applications. Comment: 25 pages
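To make the problem statement concrete, here is a minimal sketch of the naive exhaustive-search baseline over $\mathbb{F}_2$. It is not the algorithm proposed in the paper, and it is not the optimized enumeration behind the $4\log_2 n\,2^n$ bound either: it re-evaluates every polynomial at each of the $2^n$ assignments. The polynomial encoding (a tuple of quadratic index pairs, linear indices, and a constant) and the function names are hypothetical.

```python
from itertools import product


def evaluate(poly, x):
    """Evaluate one quadratic polynomial over F_2 at the 0/1 tuple x.

    poly = (quad, lin, const): quad is a list of index pairs (i, j) for
    monomials x_i * x_j, lin a list of indices for monomials x_i, and
    const is 0 or 1. Addition in F_2 is XOR, multiplication is AND."""
    quad, lin, const = poly
    value = const
    for i, j in quad:
        value ^= x[i] & x[j]
    for i in lin:
        value ^= x[i]
    return value


def solve_exhaustive(polys, n):
    """Return every assignment in {0, 1}^n on which all polynomials vanish."""
    return [x for x in product((0, 1), repeat=n)
            if all(evaluate(p, x) == 0 for p in polys)]


# Example system with m = n = 3:
#   x0*x1 + x2     = 0
#   x0*x2 + x1 + 1 = 0
#   x1*x2 + x1 + 1 = 0
example_system = [
    ([(0, 1)], [2], 0),
    ([(0, 2)], [1], 1),
    ([(1, 2)], [1], 1),
]
```

Calling `solve_exhaustive(example_system, 3)` returns `[(0, 1, 0)]`, the single common zero of the three polynomials.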

arXiv.org e-Print Archive

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot