Robust Estimators in High Dimensions without the Computational Intractability
We study high-dimensional distribution learning in an agnostic setting where an adversary is allowed to arbitrarily corrupt an epsilon fraction of the samples. Such questions have a rich history spanning statistics, machine learning and theoretical computer science. Even in the most basic settings, the only known approaches are either computationally inefficient or lose dimension dependent factors in their error guarantees. This raises the following question: Is high-dimensional agnostic distribution learning even possible, algorithmically? In this work, we obtain the first computationally efficient algorithms for agnostically learning several fundamental classes of high-dimensional distributions: (1) a single Gaussian, (2) a product distribution on the hypercube, (3) mixtures of two product distributions (under a natural balancedness condition), and (4) mixtures of k Gaussians with identical spherical covariances. All our algorithms achieve error that is independent of the dimension, and in many cases depends nearly-linearly on the fraction of adversarially corrupted samples. Moreover, we develop a general recipe for detecting and correcting corruptions in high dimensions that may be applicable to many other problems.
Funding: United States. Office of Naval Research (Grant N00014-12-1-0999); National Science Foundation (U.S.) (CAREER Award CCF-1453261); National Science Foundation (U.S.) (CAREER Award CCF-0953960); Google (Firm) (Faculty Research Award); National Science Foundation (U.S.) Graduate Research Fellowship Program; NEC Corporation.
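As a quick illustration of the phenomenon the abstract alludes to (this is only a simulation of the corruption model, not one of the paper's algorithms; the dimension, corruption level, and shift direction are arbitrary demo choices), both the naive sample mean and the coordinate-wise median pick up an error that grows with the dimension once an epsilon fraction of samples is adversarially shifted:

```python
import numpy as np

# Simulation of the epsilon-corruption model only; not one of the paper's algorithms.
rng = np.random.default_rng(0)
d, n, eps = 500, 5000, 0.1             # dimension, sample size, corrupted fraction (demo values)

X = rng.standard_normal((n, d))        # clean samples from N(0, I); the true mean is 0
X[: int(eps * n)] += 5.0               # adversary shifts an eps fraction by +5 in every coordinate

mean_err = np.linalg.norm(X.mean(axis=0))            # naive sample mean
median_err = np.linalg.norm(np.median(X, axis=0))    # coordinate-wise median

print(f"sample mean error      : {mean_err:.2f}")    # roughly eps * 5 * sqrt(d)
print(f"coordinate-wise median : {median_err:.2f}")  # smaller, but still grows with sqrt(d)
```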
Confidence regions and minimax rates in outlier-robust estimation on the probability simplex
We consider the problem of estimating the mean of a distribution supported by
the k-dimensional probability simplex in the setting where an epsilon
fraction of observations are subject to adversarial corruption. A simple
particular example is the problem of estimating the distribution of a discrete
random variable. Assuming that the discrete variable takes k values, the
unknown parameter is a k-dimensional vector belonging to
the probability simplex. We first describe various settings of contamination
and discuss the relation between these settings. We then establish minimax
rates when the quality of estimation is measured by the total-variation
distance, the Hellinger distance, or the L2-distance between two
probability measures. We also provide confidence regions for the unknown mean
that shrink at the minimax rate. Our analysis reveals that the minimax rates
associated with these three distances are all different, but they are all
attained by the sample average. Furthermore, we show that the latter is
adaptive to the possible sparsity of the unknown vector. Some numerical
experiments illustrating our theoretical findings are reported.
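A minimal numerical sketch of this setting (the contaminating mechanism and parameter values below are arbitrary demo choices): draw samples of a discrete variable, let an epsilon fraction be overwritten adversarially, estimate the mean on the simplex by the sample average of empirical frequencies, and report the three distances discussed above.

```python
import numpy as np

rng = np.random.default_rng(1)
k, n, eps = 10, 20000, 0.05                 # categories, sample size, contamination level (demo values)

theta = rng.dirichlet(np.ones(k))           # true mean, a point on the probability simplex
sample = rng.choice(k, size=n, p=theta)     # clean observations of the discrete variable
sample[: int(eps * n)] = 0                  # adversary overwrites an eps fraction with category 0

# The sample average of one-hot encoded observations is the vector of empirical frequencies.
theta_hat = np.bincount(sample, minlength=k) / n

tv = 0.5 * np.abs(theta_hat - theta).sum()                                      # total variation
hellinger = np.sqrt(0.5 * ((np.sqrt(theta_hat) - np.sqrt(theta)) ** 2).sum())   # Hellinger
l2 = np.linalg.norm(theta_hat - theta)                                          # L2
print(f"TV: {tv:.3f}   Hellinger: {hellinger:.3f}   L2: {l2:.3f}")
```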
Data-driven Inverse Optimization with Imperfect Information
In data-driven inverse optimization an observer aims to learn the preferences
of an agent who solves a parametric optimization problem depending on an
exogenous signal. Thus, the observer seeks the agent's objective function that
best explains a historical sequence of signals and corresponding optimal
actions. We focus here on situations where the observer has imperfect
information, that is, where the agent's true objective function is not
contained in the search space of candidate objectives, where the agent suffers
from bounded rationality or implementation errors, or where the observed
signal-response pairs are corrupted by measurement noise. We formalize this
inverse optimization problem as a distributionally robust program minimizing
the worst-case risk that the predicted decision (i.e., the decision
implied by a particular candidate objective) differs from the agent's
actual response to a random signal. We show that our framework offers rigorous
out-of-sample guarantees for different loss functions used to measure
prediction errors and that the emerging inverse optimization problems can be
exactly reformulated as (or safely approximated by) tractable convex programs
when a new suboptimality loss function is used. We show through extensive
numerical tests that the proposed distributionally robust approach to inverse
optimization often attains better out-of-sample performance than
state-of-the-art approaches.
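A heavily simplified sketch of the setup (empirical risk only: the distributionally robust layer and the exact convex reformulations are not reproduced here, and the agent's linear program below is an invented toy example): the observer scores each candidate objective by the suboptimality loss, i.e., how much worse the observed action is than the best action under that candidate.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)

def agent_problem(c, s):
    """Toy forward problem: min c.x  s.t.  s*x1 + x2 >= 1,  0 <= x <= 10 (invented for the demo)."""
    res = linprog(c, A_ub=[[-s, -1.0]], b_ub=[-1.0], bounds=[(0, 10), (0, 10)])
    return res.x, res.fun

# Historical signal-response pairs generated by an agent with an unknown true objective.
c_true = np.array([0.3, 0.7])
signals = rng.uniform(0.2, 1.0, size=30)
responses = [agent_problem(c_true, s)[0] for s in signals]

def suboptimality_loss(c, s, x_obs):
    """c.x_obs minus the optimal value under c: zero iff the observed action is optimal for c."""
    _, opt = agent_problem(c, s)
    return float(c @ x_obs - opt)

# Score candidate objectives on the unit simplex by average (empirical) suboptimality loss.
grid = [np.array([a, 1.0 - a]) for a in np.linspace(0.0, 1.0, 101)]
risks = [np.mean([suboptimality_loss(c, s, x) for s, x in zip(signals, responses)]) for c in grid]
c_hat = grid[int(np.argmin(risks))]
print("estimated objective:", c_hat)   # typically close to the true ratio 0.3 : 0.7
```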
Robust Estimators are Hard to Compute
In modern statistics, the robust estimation of parameters of a regression hyperplane is a central problem. Robustness means that the estimation is not or only slightly affected by outliers in the data. In this paper, it is shown that the following robust estimators are hard to compute: LMS, LQS, LTS, LTA, MCD, MVE, Constrained M estimator, Projection Depth (PD) and Stahel-Donoho. In addition, a data set is presented such that the ltsReg procedure of R has probability less than 0.0001 of finding a correct answer. Furthermore, it is described how to design new robust estimators.
Keywords: computational statistics, complexity theory, robust statistics, algorithms, search heuristics
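To make the computational issue concrete, here is a small illustration (not taken from the paper) of exact least trimmed squares (LTS) by exhaustive search over subsets; the number of candidate subsets grows combinatorially with the sample size, which is why practical implementations such as ltsReg fall back on search heuristics that can miss the optimum.

```python
import numpy as np
from itertools import combinations
from math import comb

rng = np.random.default_rng(3)
n, h = 20, 15                        # sample size and trimming parameter (h points kept)

# Simple linear model with a few gross outliers in the response.
x = rng.uniform(-1, 1, n)
y = 2.0 * x + 1.0 + 0.1 * rng.standard_normal(n)
y[:3] += 20.0                        # three outliers
X = np.column_stack([np.ones(n), x])

def lts_exact(X, y, h):
    """Exact LTS: the fit minimizing the sum of the h smallest squared residuals."""
    best_obj, best_beta = np.inf, None
    for subset in combinations(range(len(y)), h):      # comb(n, h) candidate subsets
        idx = list(subset)
        beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        obj = np.sort((y - X @ beta) ** 2)[:h].sum()   # sum of the h smallest squared residuals
        if obj < best_obj:
            best_obj, best_beta = obj, beta
    return best_beta, best_obj

print(f"subsets to examine: {comb(n, h)}")             # already 15504 for n=20, h=15
beta_hat, _ = lts_exact(X, y, h)
print("LTS fit (intercept, slope):", np.round(beta_hat, 2))
```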
Efficient Statistics, in High Dimensions, from Truncated Samples
We provide an efficient algorithm for the classical problem, going back to
Galton, Pearson, and Fisher, of estimating, with arbitrary accuracy, the
parameters of a multivariate normal distribution from truncated samples.
Truncated samples from a d-variate normal means that a sample is only revealed if it falls
in some subset S; otherwise the samples are hidden and
their count in proportion to the revealed samples is also hidden. We show that
the mean and covariance matrix can be
estimated with arbitrary accuracy in polynomial time, as long as we have oracle
access to S, and S has non-trivial measure under the unknown d-variate
normal distribution. Additionally we show that without oracle access to S,
any non-trivial estimation is impossible.
Comment: to appear at the 59th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2018
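A minimal one-dimensional sketch of the underlying estimation task (univariate only, with a known truncation interval standing in for the set S; the paper's d-variate algorithm and oracle model are not reproduced): recover the mean and standard deviation of the untruncated normal by maximizing the truncated log-likelihood.

```python
import numpy as np
from scipy.stats import norm, truncnorm
from scipy.optimize import minimize

rng = np.random.default_rng(4)
mu_true, sigma_true, a, b = 1.0, 2.0, 0.0, 3.0      # unknown parameters and the known window S = [a, b]

# Draw samples from N(mu, sigma^2) conditioned on landing in [a, b].
aa, bb = (a - mu_true) / sigma_true, (b - mu_true) / sigma_true
x = truncnorm.rvs(aa, bb, loc=mu_true, scale=sigma_true, size=5000, random_state=rng)

def neg_loglik(params):
    """Negative truncated-normal log-likelihood; optimizing log(sigma) keeps sigma positive."""
    mu, log_sigma = params
    sigma = np.exp(log_sigma)
    mass = norm.cdf(b, mu, sigma) - norm.cdf(a, mu, sigma)      # probability mass of the window
    return -(norm.logpdf(x, mu, sigma) - np.log(mass)).sum()

res = minimize(neg_loglik, x0=np.array([x.mean(), np.log(x.std())]))
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
print(f"estimated mu={mu_hat:.2f}, sigma={sigma_hat:.2f}")      # close to mu_true=1.0, sigma_true=2.0
```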
Efficient computational strategies for doubly intractable problems with applications to Bayesian social networks
Powerful ideas recently appeared in the literature are adjusted and combined
to design improved samplers for Bayesian exponential random graph models.
Different forms of adaptive Metropolis-Hastings proposals (vertical, horizontal
and rectangular) are tested and combined with the delayed rejection (DR)
strategy with the aim of reducing the variance of the resulting Markov chain
Monte Carlo estimators for a given computational time. In the examples treated
in this paper the best combination, namely horizontal adaptation with delayed
rejection, leads to a variance reduction that varies between 92% and 144%
relative to the adaptive direction sampling approximate exchange algorithm of
Caimo and Friel (2011). These results correspond to an increased performance
which varies from 10% to 94% if we take simulation time into account. The
highest improvements are obtained when highly correlated posterior
distributions are considered.
Comment: 23 pages, 8 figures. Accepted to appear in Statistics and Computing
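To illustrate the delayed rejection idea in isolation (a plain random-walk example on a toy one-dimensional target, not the adaptive samplers for exponential random graph models developed in the paper): after a first-stage rejection, a second, more cautious proposal is attempted, with an acceptance probability corrected so that the chain still targets the right distribution.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)

def log_target(x):
    return -0.5 * x**2                       # toy target: standard normal, up to a constant

def dr_metropolis_step(x, sigma1=2.5, sigma2=0.5):
    """One delayed-rejection Metropolis step: a bold first stage, then a cautious second stage."""
    # Stage 1: symmetric random-walk proposal.
    y1 = x + sigma1 * rng.standard_normal()
    a1 = min(1.0, np.exp(log_target(y1) - log_target(x)))
    if rng.random() < a1:
        return y1
    # Stage 2: smaller step from x, accepted with the delayed-rejection correction
    # (second-stage proposal densities cancel because the proposal is symmetric about x).
    y2 = x + sigma2 * rng.standard_normal()
    a1_rev = min(1.0, np.exp(log_target(y1) - log_target(y2)))
    num = np.exp(log_target(y2)) * norm.pdf(y1, loc=y2, scale=sigma1) * (1.0 - a1_rev)
    den = np.exp(log_target(x)) * norm.pdf(y1, loc=x, scale=sigma1) * (1.0 - a1)
    if den > 0.0 and rng.random() < min(1.0, num / den):
        return y2
    return x

chain = np.empty(20000)
chain[0] = 5.0
for t in range(1, len(chain)):
    chain[t] = dr_metropolis_step(chain[t - 1])
print(f"mean={chain[5000:].mean():.2f}  var={chain[5000:].var():.2f}")   # roughly 0 and 1
```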
Being Robust (in High Dimensions) Can Be Practical
Robust estimation is much more challenging in high dimensions than it is in
one dimension: Most techniques either lead to intractable optimization problems
or estimators that can tolerate only a tiny fraction of errors. Recent work in
theoretical computer science has shown that, in appropriate distributional
models, it is possible to robustly estimate the mean and covariance with
polynomial time algorithms that can tolerate a constant fraction of
corruptions, independent of the dimension. However, the sample and time
complexity of these algorithms is prohibitively large for high-dimensional
applications. In this work, we address both of these issues by establishing
sample complexity bounds that are optimal, up to logarithmic factors, as well
as giving various refinements that allow the algorithms to tolerate a much
larger fraction of corruptions. Finally, we show on both synthetic and real
data that our algorithms have state-of-the-art performance and suddenly make
high-dimensional robust estimation a realistic possibility.
Comment: Appeared in ICML 2017
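A rough sketch of the spectral filtering idea behind this line of work (a simplified caricature with ad hoc thresholds, not the paper's algorithm or its guarantees): repeatedly examine the top eigenvector of the empirical covariance and discard points whose projections along it are outsized.

```python
import numpy as np

rng = np.random.default_rng(6)
d, n, eps = 100, 4000, 0.1                       # demo values

X = rng.standard_normal((n, d))                  # draw everything clean first (true mean 0)
X[: int(eps * n), 0] += 6.0                      # then shift an eps fraction along coordinate 0

def filtered_mean(X, rounds=20, z=3.0):
    """Crude spectral filter: trim points that stick out along the top covariance direction.

    The identity-covariance stopping rule and the z * std threshold are ad hoc demo choices."""
    X = X.copy()
    for _ in range(rounds):
        mu = X.mean(axis=0)
        cov = np.cov(X, rowvar=False)
        evals, evecs = np.linalg.eigh(cov)
        if evals[-1] < 1.5:                      # spectrum looks consistent with identity: stop
            break
        v = evecs[:, -1]                         # direction of largest variance
        scores = np.abs((X - mu) @ v)
        keep = scores < z * scores.std()
        if keep.all():
            break
        X = X[keep]
    return X.mean(axis=0)

print("naive mean error   :", round(float(np.linalg.norm(X.mean(axis=0))), 3))
print("filtered mean error:", round(float(np.linalg.norm(filtered_mean(X))), 3))
```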
Likelihood-based inference for max-stable processes
The last decade has seen max-stable processes emerge as a common tool for the
statistical modeling of spatial extremes. However, their application is
complicated due to the unavailability of the multivariate density function, and
so likelihood-based methods remain far from providing a complete and flexible
framework for inference. In this article we develop inferentially practical,
likelihood-based methods for fitting max-stable processes derived from a
composite-likelihood approach. The procedure is sufficiently reliable and
versatile to permit the simultaneous modeling of marginal and dependence
parameters in the spatial context at a moderate computational cost. The utility
of this methodology is examined via simulation, and illustrated by the analysis
of U.S. precipitation extremes.
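The composite-likelihood mechanics can be sketched on a stand-in model (a Gaussian process with exponential correlation, chosen only because its bivariate densities are easy to write down; the bivariate max-stable densities used in the paper are considerably more involved): the pairwise log-likelihood sums bivariate log-densities over all pairs of sites and is maximized in place of the unavailable full likelihood.

```python
import numpy as np
from scipy.stats import multivariate_normal
from scipy.optimize import minimize_scalar
from itertools import combinations

rng = np.random.default_rng(7)

# Stand-in spatial model (not max-stable): zero-mean Gaussian process, correlation exp(-h / range).
sites = rng.uniform(0, 10, size=(15, 2))                     # 15 site locations in the plane
dists = np.linalg.norm(sites[:, None] - sites[None, :], axis=-1)
range_true = 3.0
cov_true = np.exp(-dists / range_true)
data = rng.multivariate_normal(np.zeros(len(sites)), cov_true, size=200)   # 200 replications

def pairwise_neg_loglik(range_param):
    """Negative pairwise (composite) log-likelihood: bivariate log-densities summed over site pairs."""
    total = 0.0
    for i, j in combinations(range(len(sites)), 2):
        rho = np.exp(-dists[i, j] / range_param)
        cov2 = np.array([[1.0, rho], [rho, 1.0]])
        total += multivariate_normal.logpdf(data[:, [i, j]], mean=np.zeros(2), cov=cov2).sum()
    return -total

res = minimize_scalar(pairwise_neg_loglik, bounds=(0.1, 20.0), method="bounded")
print(f"estimated range parameter: {res.x:.2f}")             # close to range_true = 3.0
```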