
    Robust Estimators in High Dimensions without the Computational Intractability

    We study high-dimensional distribution learning in an agnostic setting where an adversary is allowed to arbitrarily corrupt an epsilon fraction of the samples. Such questions have a rich history spanning statistics, machine learning and theoretical computer science. Even in the most basic settings, the only known approaches are either computationally inefficient or lose dimension-dependent factors in their error guarantees. This raises the following question: Is high-dimensional agnostic distribution learning even possible, algorithmically? In this work, we obtain the first computationally efficient algorithms for agnostically learning several fundamental classes of high-dimensional distributions: (1) a single Gaussian, (2) a product distribution on the hypercube, (3) mixtures of two product distributions (under a natural balancedness condition), and (4) mixtures of k Gaussians with identical spherical covariances. All our algorithms achieve error that is independent of the dimension, and in many cases depends nearly linearly on the fraction of adversarially corrupted samples. Moreover, we develop a general recipe for detecting and correcting corruptions in high dimensions that may be applicable to many other problems.
    Funding: United States Office of Naval Research (Grant N00014-12-1-0999); National Science Foundation (U.S.) (CAREER Award CCF-1453261); National Science Foundation (U.S.) (CAREER Award CCF-0953960); Google (Faculty Research Award); National Science Foundation (U.S.) Graduate Research Fellowship Program; NEC Corporation
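
    To make the flavor of such algorithms concrete, here is a minimal sketch of a filtering-style robust mean estimator for data assumed to be drawn from N(mu, I) with an epsilon fraction of corruptions: it checks the top eigenvalue of the empirical covariance and strips points that project heavily onto the suspicious direction. The function name filter_mean and the thresholds are illustrative assumptions, not the paper's exact procedure.

        # Illustrative sketch (not the paper's exact algorithm): a filter-style
        # robust mean estimator for data assumed to be N(mu, I) with an eps
        # fraction of adversarial corruptions. Thresholds are heuristic.
        import numpy as np

        def filter_mean(X, eps, max_iter=20):
            # Iteratively remove points with large projection onto the top
            # eigenvector of the empirical covariance until its top eigenvalue
            # looks consistent with identity covariance.
            X = np.asarray(X, dtype=float)
            for _ in range(max_iter):
                mu = X.mean(axis=0)
                centered = X - mu
                cov = centered.T @ centered / len(X)
                eigvals, eigvecs = np.linalg.eigh(cov)
                top_val, top_vec = eigvals[-1], eigvecs[:, -1]
                # If no direction has abnormally large variance, accept the mean.
                if top_val <= 1 + 10 * eps:          # heuristic certificate
                    return mu
                # Otherwise, drop the tail of points with the largest scores
                # along the suspicious direction.
                scores = (centered @ top_vec) ** 2
                keep = scores <= np.quantile(scores, 1 - eps / 2)
                X = X[keep]
            return X.mean(axis=0)

        # Toy usage: 10% of points pushed far away in one direction.
        rng = np.random.default_rng(0)
        d, n, eps = 50, 5000, 0.1
        X = rng.normal(size=(n, d))
        X[: int(eps * n)] += 5.0                     # corruption
        print(np.linalg.norm(X.mean(axis=0)))        # naive mean error
        print(np.linalg.norm(filter_mean(X, eps)))   # filtered mean error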

    Confidence regions and minimax rates in outlier-robust estimation on the probability simplex

    We consider the problem of estimating the mean of a distribution supported on the k-dimensional probability simplex in the setting where an ε fraction of the observations are subject to adversarial corruption. A simple particular example is the problem of estimating the distribution of a discrete random variable. Assuming that the discrete variable takes k values, the unknown parameter θ is a k-dimensional vector belonging to the probability simplex. We first describe various settings of contamination and discuss the relations between these settings. We then establish minimax rates when the quality of estimation is measured by the total-variation distance, the Hellinger distance, or the L2-distance between two probability measures. We also provide confidence regions for the unknown mean that shrink at the minimax rate. Our analysis reveals that the minimax rates associated with these three distances are all different, but they are all attained by the sample average. Furthermore, we show that the latter is adaptive to the possible sparsity of the unknown vector. Some numerical experiments illustrating our theoretical findings are reported.
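
    As a quick illustration of the setting (not the paper's minimax constructions), the sketch below contaminates samples of a discrete distribution, estimates the simplex vector by the sample average of the empirical frequencies, and reports the total-variation, Hellinger, and L2 errors; the contamination scheme and parameter values are arbitrary choices made for the example.

        # Illustrative simulation of the contaminated-simplex setting: estimate
        # a discrete distribution by empirical frequencies and measure the
        # error in total variation, Hellinger, and L2 distance.
        import numpy as np

        rng = np.random.default_rng(1)
        k, n, eps = 20, 10_000, 0.05
        theta = rng.dirichlet(np.ones(k))            # unknown point on the simplex

        samples = rng.choice(k, size=n, p=theta)
        n_bad = int(eps * n)
        samples[:n_bad] = 0                          # adversary dumps mass on one atom

        counts = np.bincount(samples, minlength=k)
        theta_hat = counts / n                       # the sample average

        tv = 0.5 * np.abs(theta_hat - theta).sum()
        hellinger = np.sqrt(0.5 * ((np.sqrt(theta_hat) - np.sqrt(theta)) ** 2).sum())
        l2 = np.linalg.norm(theta_hat - theta)
        print(f"TV={tv:.4f}  Hellinger={hellinger:.4f}  L2={l2:.4f}")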

    Data-driven Inverse Optimization with Imperfect Information

    In data-driven inverse optimization, an observer aims to learn the preferences of an agent who solves a parametric optimization problem depending on an exogenous signal. Thus, the observer seeks the agent's objective function that best explains a historical sequence of signals and corresponding optimal actions. We focus here on situations where the observer has imperfect information, that is, where the agent's true objective function is not contained in the search space of candidate objectives, where the agent suffers from bounded rationality or implementation errors, or where the observed signal-response pairs are corrupted by measurement noise. We formalize this inverse optimization problem as a distributionally robust program minimizing the worst-case risk that the predicted decision (i.e., the decision implied by a particular candidate objective) differs from the agent's actual response to a random signal. We show that our framework offers rigorous out-of-sample guarantees for different loss functions used to measure prediction errors and that the emerging inverse optimization problems can be exactly reformulated as (or safely approximated by) tractable convex programs when a new suboptimality loss function is used. We show through extensive numerical tests that the proposed distributionally robust approach to inverse optimization often attains better out-of-sample performance than the state-of-the-art approaches.
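
    A toy Python sketch of the suboptimality idea, under assumptions made only for illustration (a small linear program as the agent's forward problem and a four-parameter linear cost): each observed signal-action pair charges a candidate objective the gap between the cost of the observed action and the cost of the best action under that candidate. The distributionally robust reformulation itself is not reproduced here; this is the plain sample-average version of the loss.

        # Toy sketch of a suboptimality-style loss in inverse optimization
        # (illustrative forward problem and parameterization). The agent solves
        #   min_x  c(theta, s)^T x   s.t.  A x <= b,  x >= 0,
        # and a candidate theta is scored by how suboptimal the observed action
        # is under the cost it induces, averaged over signal-action pairs.
        import numpy as np
        from scipy.optimize import linprog

        A = np.array([[1.0, 1.0]])                   # shared feasible set: x1 + x2 <= 1
        b = np.array([1.0])

        def cost(theta, s):
            # Assumed signal-dependent cost: a fixed part plus a signal-scaled part.
            return theta[:2] + s * theta[2:]

        def suboptimality_loss(theta, data):
            total = 0.0
            for s, x_obs in data:
                c = cost(theta, s)
                opt = linprog(c, A_ub=A, b_ub=b, bounds=[(0, None)] * 2)
                total += c @ x_obs - opt.fun         # >= 0 by construction
            return total / len(data)

        # Synthetic observations generated by a "true" theta, then scored.
        rng = np.random.default_rng(2)
        theta_true = np.array([1.0, 2.0, -3.0, 0.5])
        data = []
        for _ in range(50):
            s = rng.uniform(0, 1)
            res = linprog(cost(theta_true, s), A_ub=A, b_ub=b, bounds=[(0, None)] * 2)
            data.append((s, res.x))

        print(suboptimality_loss(theta_true, data))          # ~0 for the true theta
        print(suboptimality_loss(np.array([2.0, 1.0, 0.0, 0.0]), data))  # larger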

    Robust Estimators are Hard to Compute

    In modern statistics, the robust estimation of parameters of a regression hyperplane is a central problem. Robustness means that the estimation is not, or only slightly, affected by outliers in the data. In this paper, it is shown that the following robust estimators are hard to compute: LMS, LQS, LTS, LTA, MCD, MVE, the constrained M-estimator, projection depth (PD) and Stahel-Donoho. In addition, a data set is presented such that the ltsReg procedure of R has probability less than 0.0001 of finding a correct answer. Furthermore, it is described how to design new robust estimators.
    Keywords: computational statistics; complexity theory; robust statistics; algorithms; search heuristics
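
    The computational difficulty comes from the combinatorial choice of which observations to trim. The sketch below, assuming a standard linear-regression setup, evaluates the LTS objective (the sum of the h smallest squared residuals) and applies a naive elemental-subset heuristic in the spirit of, but not identical to, procedures such as ltsReg; like any such heuristic, it carries no guarantee of finding the exact optimum.

        # Sketch of the least trimmed squares (LTS) objective and a naive
        # random elemental-subset heuristic. Exact LTS must in effect search
        # over which h of the n residuals to keep, which is what makes it hard
        # to compute; heuristics like this one can miss the optimum.
        import numpy as np

        def lts_objective(beta, X, y, h):
            # Sum of the h smallest squared residuals for the fit beta.
            r2 = (y - X @ beta) ** 2
            return np.sort(r2)[:h].sum()

        def lts_elemental_search(X, y, h, n_trials=500, rng=None):
            # Fit exactly through many random p-point subsets and keep the
            # candidate with the smallest LTS objective on the full data.
            rng = rng or np.random.default_rng()
            n, p = X.shape
            best_beta, best_val = None, np.inf
            for _ in range(n_trials):
                idx = rng.choice(n, size=p, replace=False)
                beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
                val = lts_objective(beta, X, y, h)
                if val < best_val:
                    best_beta, best_val = beta, val
            return best_beta, best_val

        # Toy data with 15% gross outliers.
        rng = np.random.default_rng(3)
        n, p = 200, 3
        X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
        y = X @ np.array([1.0, 2.0, -1.0]) + 0.1 * rng.normal(size=n)
        y[:30] += 50.0                                # contaminate
        h = int(0.75 * n)
        beta_hat, val = lts_elemental_search(X, y, h, rng=rng)
        print(beta_hat)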

    Efficient Statistics, in High Dimensions, from Truncated Samples

    We provide an efficient algorithm for the classical problem, going back to Galton, Pearson, and Fisher, of estimating, with arbitrary accuracy, the parameters of a multivariate normal distribution from truncated samples. Truncation of samples from a d-variate normal N(μ, Σ) means that a sample is revealed only if it falls in some subset S ⊆ R^d; otherwise the sample is hidden, and the number of hidden samples in proportion to the revealed ones is also hidden. We show that the mean μ and covariance matrix Σ can be estimated with arbitrary accuracy in polynomial time, as long as we have oracle access to S and S has non-trivial measure under the unknown d-variate normal distribution. Additionally, we show that without oracle access to S, any non-trivial estimation is impossible.
    Comment: to appear at the 59th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2018
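
    A one-dimensional illustration of the underlying likelihood problem (the paper handles the d-variate case with oracle access to S): with S taken to be a known interval [a, b], the truncated Gaussian log-likelihood divides the usual density by the Gaussian mass of S, and maximizing it numerically recovers the mean and standard deviation. The specific interval, sample size, and optimizer below are assumptions made for the example.

        # One-dimensional illustration of maximum likelihood from truncated
        # samples, with S = [a, b] a known interval. We maximize the truncated
        # Gaussian log-likelihood numerically.
        import numpy as np
        from scipy.optimize import minimize
        from scipy.stats import norm, truncnorm

        mu_true, sigma_true, a, b = 2.0, 1.5, 1.0, 4.0

        # Draw samples that survive the truncation to S = [a, b].
        alpha, beta = (a - mu_true) / sigma_true, (b - mu_true) / sigma_true
        x = truncnorm.rvs(alpha, beta, loc=mu_true, scale=sigma_true,
                          size=5000, random_state=4)

        def neg_log_lik(params):
            mu, log_sigma = params
            sigma = np.exp(log_sigma)                # keep sigma positive
            mass = norm.cdf((b - mu) / sigma) - norm.cdf((a - mu) / sigma)
            return -(norm.logpdf(x, loc=mu, scale=sigma) - np.log(mass)).sum()

        res = minimize(neg_log_lik, x0=np.array([x.mean(), np.log(x.std())]),
                       method="Nelder-Mead")
        mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
        print(mu_hat, sigma_hat)                     # close to (2.0, 1.5)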

    Efficient computational strategies for doubly intractable problems with applications to Bayesian social networks

    Powerful ideas that recently appeared in the literature are adjusted and combined to design improved samplers for Bayesian exponential random graph models. Different forms of adaptive Metropolis-Hastings proposals (vertical, horizontal and rectangular) are tested and combined with the delayed rejection (DR) strategy, with the aim of reducing the variance of the resulting Markov chain Monte Carlo estimators for a given computational time. In the examples treated in this paper, the best combination, namely horizontal adaptation with delayed rejection, leads to a variance reduction that varies between 92% and 144% relative to the adaptive direction sampling approximate exchange algorithm of Caimo and Friel (2011). These results correspond to a performance increase of between 10% and 94% once simulation time is taken into account. The highest improvements are obtained when highly correlated posterior distributions are considered.
    Comment: 23 pages, 8 figures. Accepted to appear in Statistics and Computing.
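
    For readers unfamiliar with delayed rejection, the sketch below runs a generic two-stage delayed-rejection random-walk Metropolis sampler on a toy one-dimensional mixture target; it is not the paper's adaptive sampler for exponential random graph models, only an illustration of the second-stage acceptance ratio (following Tierney and Mira) that keeps the chain valid after a first rejection. The target and proposal scales are arbitrary choices.

        # Generic two-stage delayed-rejection random-walk Metropolis sketch on
        # a toy 1-D Gaussian-mixture target.
        import numpy as np
        from scipy.stats import norm

        def log_target(x):
            # Toy bimodal target density (unnormalized log scale).
            return np.logaddexp(norm.logpdf(x, -2.0, 1.0), norm.logpdf(x, 2.0, 0.7))

        def dr_metropolis(n_iter=20_000, s1=4.0, s2=0.5, seed=5):
            rng = np.random.default_rng(seed)
            x = 0.0
            chain = np.empty(n_iter)
            for t in range(n_iter):
                # Stage 1: bold symmetric proposal.
                y1 = x + s1 * rng.normal()
                a1 = min(1.0, np.exp(log_target(y1) - log_target(x)))
                if rng.uniform() < a1:
                    x = y1
                else:
                    # Stage 2: more timid proposal, with the delayed-rejection
                    # correction in the acceptance ratio.
                    y2 = x + s2 * rng.normal()
                    a1_rev = min(1.0, np.exp(log_target(y1) - log_target(y2)))
                    num = (np.exp(log_target(y2)) * norm.pdf(y1, loc=y2, scale=s1)
                           * (1.0 - a1_rev))
                    den = (np.exp(log_target(x)) * norm.pdf(y1, loc=x, scale=s1)
                           * (1.0 - a1))
                    if den > 0 and rng.uniform() < min(1.0, num / den):
                        x = y2
                chain[t] = x
            return chain

        chain = dr_metropolis()
        print(chain.mean(), chain.std())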

    Being Robust (in High Dimensions) Can Be Practical

    Robust estimation is much more challenging in high dimensions than it is in one dimension: most techniques either lead to intractable optimization problems or to estimators that can tolerate only a tiny fraction of errors. Recent work in theoretical computer science has shown that, in appropriate distributional models, it is possible to robustly estimate the mean and covariance with polynomial-time algorithms that can tolerate a constant fraction of corruptions, independent of the dimension. However, the sample and time complexity of these algorithms is prohibitively large for high-dimensional applications. In this work, we address both of these issues by establishing sample complexity bounds that are optimal, up to logarithmic factors, as well as by giving various refinements that allow the algorithms to tolerate a much larger fraction of corruptions. Finally, we show on both synthetic and real data that our algorithms have state-of-the-art performance and suddenly make high-dimensional robust estimation a realistic possibility.
    Comment: Appeared in ICML 2017.
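
    The dimension dependence that such guarantees avoid is easy to see numerically. The short sketch below, with an arbitrary corruption scheme chosen for illustration, shows the L2 error of the empirical mean (and of a coordinate-wise median baseline) growing with the dimension when a fixed fraction of the points is shifted.

        # Small synthetic illustration of why dimension-independent error
        # guarantees matter: with a fixed corruption fraction eps, the L2 error
        # of the empirical mean and of the coordinate-wise median both grow
        # with the dimension d, even though the true mean is 0 in every case.
        import numpy as np

        rng = np.random.default_rng(6)
        eps, n = 0.1, 20_000
        for d in (10, 100, 1000):
            X = rng.normal(size=(n, d))               # clean data: N(0, I_d)
            X[: int(eps * n)] += 2.0                   # shift an eps fraction
            err_mean = np.linalg.norm(X.mean(axis=0))
            err_med = np.linalg.norm(np.median(X, axis=0))
            print(f"d={d:5d}  mean error={err_mean:.2f}  "
                  f"coordinate-median error={err_med:.2f}")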

    Likelihood-based inference for max-stable processes

    The last decade has seen max-stable processes emerge as a common tool for the statistical modeling of spatial extremes. However, their application is complicated by the unavailability of the multivariate density function, and so likelihood-based methods remain far from providing a complete and flexible framework for inference. In this article, we develop inferentially practical, likelihood-based methods for fitting max-stable processes, derived from a composite-likelihood approach. The procedure is sufficiently reliable and versatile to permit the simultaneous modeling of marginal and dependence parameters in the spatial context at a moderate computational cost. The utility of this methodology is examined via simulation, and illustrated by the analysis of U.S. precipitation extremes.
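
    To illustrate the composite-likelihood mechanics (and only that), the sketch below uses the bivariate symmetric logistic extreme-value distribution with unit Fréchet margins as a stand-in for a max-stable model's bivariate density, sums its log-density over all pairs of sites, and maximizes over the dependence parameter alpha. The simulated data are independent across sites purely to keep the example short, so the fitted alpha should land near 1; the paper's spatial models use different bivariate densities.

        # Toy pairwise composite likelihood for an extreme-value model, using
        # the bivariate symmetric logistic distribution as a stand-in density.
        import itertools
        import numpy as np
        from scipy.optimize import minimize_scalar

        def logistic_log_density(z1, z2, alpha):
            # Log density of the bivariate symmetric logistic extreme-value
            # distribution with unit Frechet margins:
            #   F(z1, z2) = exp(-(z1**(-1/alpha) + z2**(-1/alpha))**alpha).
            s = z1 ** (-1.0 / alpha) + z2 ** (-1.0 / alpha)
            return (-(s ** alpha)
                    + (-1.0 / alpha - 1.0) * (np.log(z1) + np.log(z2))
                    + (alpha - 2.0) * np.log(s)
                    + np.log(s ** alpha + (1.0 - alpha) / alpha))

        def neg_composite_loglik(alpha, Z):
            # Pairwise composite log-likelihood: sum the bivariate log density
            # over every pair of sites and every observation.
            total = 0.0
            for i, j in itertools.combinations(range(Z.shape[1]), 2):
                total += logistic_log_density(Z[:, i], Z[:, j], alpha).sum()
            return -total

        # Simulated unit Frechet observations at 4 sites over 300 "years";
        # the sites are independent here, so the fit should approach alpha = 1.
        rng = np.random.default_rng(7)
        Z = -1.0 / np.log(rng.uniform(size=(300, 4)))

        res = minimize_scalar(neg_composite_loglik, args=(Z,),
                              bounds=(0.2, 1.0), method="bounded")
        print("fitted dependence parameter alpha:", res.x)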