
    Spectral Signatures in Backdoor Attacks

    A recent line of work has uncovered a new form of data poisoning: so-called \emph{backdoor} attacks. These attacks are particularly dangerous because they do not affect a network's behavior on typical, benign data. Rather, the network only deviates from its expected output when triggered by a perturbation planted by an adversary. In this paper, we identify a new property of all known backdoor attacks, which we call \emph{spectral signatures}. This property allows us to utilize tools from robust statistics to thwart the attacks. We demonstrate the efficacy of these signatures in detecting and removing poisoned examples on real image sets and state-of-the-art neural network architectures. We believe that understanding spectral signatures is a crucial first step towards designing ML systems secure against such backdoor attacks.
    Comment: 16 pages, accepted to NIPS 2018
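
    To make the idea concrete, here is a minimal NumPy sketch of the kind of spectral scoring the abstract describes: center the learned feature representations, project onto their top singular direction, and treat the largest squared projections as suspect. The function names, the choice of features, and the removal fraction are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def spectral_signature_scores(features):
    """Outlier score per example: squared projection of the centered
    feature vector onto the top singular direction of the feature
    matrix (features: n x d array, e.g. penultimate-layer activations)."""
    centered = features - features.mean(axis=0)
    # Top right-singular vector of the centered feature matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return (centered @ vt[0]) ** 2

def indices_to_keep(features, eps=0.05):
    """Keep all but the highest-scoring 1.5*eps fraction of examples,
    assuming at most an eps fraction are poisoned (illustrative choice)."""
    scores = spectral_signature_scores(features)
    k = int(1.5 * eps * len(scores))
    return np.argsort(scores)[: len(scores) - k]
```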

    Robust polynomial regression up to the information theoretic limit

    We consider the problem of robust polynomial regression, where one receives samples $(x_i, y_i)$ that are usually within $\sigma$ of a polynomial $y = p(x)$, but have a $\rho$ chance of being arbitrary adversarial outliers. Previously, it was known how to efficiently estimate $p$ only when $\rho < \frac{1}{\log d}$. We give an algorithm that works for the entire feasible range of $\rho < 1/2$, while simultaneously improving other parameters of the problem. We complement our algorithm, which gives a factor 2 approximation, with impossibility results that show, for example, that a $1.09$ approximation is impossible even with infinitely many samples.
    Comment: 19 pages. To appear in FOCS 2017
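
    As a simple illustration of the outlier-robust setting, the sketch below fits a polynomial by minimizing the sum of absolute residuals via a linear program; plain $\ell_1$ regression tolerates a constant fraction of adversarial points and is used here as a stand-in, not as the paper's algorithm.

```python
import numpy as np
from scipy.optimize import linprog

def l1_polyfit(x, y, degree):
    """Fit a degree-`degree` polynomial minimizing sum_i |y_i - p(x_i)|.
    Variables: polynomial coefficients a, plus per-sample slacks t_i with
    |y_i - p(x_i)| <= t_i, encoded as two one-sided inequalities."""
    n = len(x)
    V = np.vander(x, degree + 1)                      # design matrix
    cost = np.concatenate([np.zeros(degree + 1), np.ones(n)])
    #  V a - t <= y    and    -V a - t <= -y
    A_ub = np.block([[V, -np.eye(n)], [-V, -np.eye(n)]])
    b_ub = np.concatenate([y, -y])
    bounds = [(None, None)] * (degree + 1) + [(0, None)] * n
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[: degree + 1]                        # coefficients of p
```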

    Efficient Statistics, in High Dimensions, from Truncated Samples

    We provide an efficient algorithm for the classical problem, going back to Galton, Pearson, and Fisher, of estimating, with arbitrary accuracy, the parameters of a multivariate normal distribution from truncated samples. Truncation of samples from a $d$-variate normal ${\cal N}(\mathbf{\mu},\mathbf{\Sigma})$ means a sample is only revealed if it falls in some subset $S \subseteq \mathbb{R}^d$; otherwise the samples are hidden, and their count in proportion to the revealed samples is also hidden. We show that the mean $\mathbf{\mu}$ and covariance matrix $\mathbf{\Sigma}$ can be estimated with arbitrary accuracy in polynomial time, as long as we have oracle access to $S$, and $S$ has non-trivial measure under the unknown $d$-variate normal distribution. Additionally, we show that without oracle access to $S$, any non-trivial estimation is impossible.
    Comment: to appear at the 59th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2018
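
    For intuition, here is a toy one-dimensional sketch of a likelihood-based approach: stochastic gradient ascent on the truncated log-likelihood of ${\cal N}(\mu, 1)$, using the membership oracle for $S$ inside a rejection sampler. The interface and step-size schedule are assumptions for illustration; the paper's result covers $d$ dimensions with unknown covariance.

```python
import numpy as np

def estimate_truncated_mean(samples, in_S, lr=0.1, epochs=50, seed=0):
    """Estimate mu of N(mu, 1) from samples truncated to the set S,
    given only a membership oracle in_S(z) -> bool. The stochastic
    gradient of the truncated log-likelihood at x is (x - z), where
    z ~ N(mu, 1) conditioned on S (drawn here by rejection sampling;
    slow if S has small mass under the current estimate)."""
    rng = np.random.default_rng(seed)
    mu = samples.mean()                    # empirical initialization
    for _ in range(epochs):
        for x in samples:
            z = rng.normal(mu)
            while not in_S(z):             # oracle access to S
                z = rng.normal(mu)
            mu += lr * (x - z)             # gradient ascent step
        lr *= 0.95                         # simple decaying step size
    return mu
```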

    Byzantine Stochastic Gradient Descent

    This paper studies the problem of distributed stochastic optimization in an adversarial setting where, out of the $m$ machines which allegedly compute stochastic gradients every iteration, an $\alpha$-fraction are Byzantine, and can behave arbitrarily and adversarially. Our main result is a variant of stochastic gradient descent (SGD) which finds $\varepsilon$-approximate minimizers of convex functions in $T = \tilde{O}\big( \frac{1}{\varepsilon^2 m} + \frac{\alpha^2}{\varepsilon^2} \big)$ iterations. In contrast, traditional mini-batch SGD needs $T = O\big( \frac{1}{\varepsilon^2 m} \big)$ iterations, but cannot tolerate Byzantine failures. Further, we provide a lower bound showing that, up to logarithmic factors, our algorithm is information-theoretically optimal both in terms of sample complexity and time complexity.
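
    A minimal sketch of the setting, with coordinate-wise median aggregation standing in for the paper's more refined filtering scheme (the median alone does not achieve the stated rate, but it conveys how an $\alpha$-fraction of bad machines can be tolerated):

```python
import numpy as np

def aggregate(grads):
    """Coordinate-wise median of the m reported gradients: a simple
    Byzantine-tolerant aggregator (stand-in for the paper's scheme)."""
    return np.median(np.stack(grads), axis=0)

def byzantine_sgd(report_grads, w, lr=0.1, steps=500):
    """SGD where report_grads(w) returns a list of m gradient estimates,
    up to an alpha-fraction of which may be arbitrary (hypothetical
    interface, for illustration only)."""
    for _ in range(steps):
        w = w - lr * aggregate(report_grads(w))
    return w
```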