Outlier detection using distributionally robust optimization under the Wasserstein metric

Authors: Chen Ruidi; Paschalidis Ioannis Ch.
Publication date: 01/01/2017

We present a Distributionally Robust Optimization (DRO) approach to outlier detection in a linear regression setting, where the closeness of probability distributions is measured using the Wasserstein metric. Training samples contaminated with outliers skew the regression plane computed by least squares and thus impede outlier detection. Classical approaches, such as robust regression, remedy this problem by downweighting the contribution of atypical data points. In contrast, our Wasserstein DRO approach hedges against a family of distributions that are close to the empirical distribution. We show that the resulting formulation encompasses a class of models that includes the regularized Least Absolute Deviation (LAD) as a special case. We provide new insights into the regularization term and give guidance on the selection of the regularization coefficient from the standpoint of a confidence region. We establish two types of performance guarantees for the solution to our formulation under mild conditions. One is related to its out-of-sample behavior, and the other concerns the discrepancy between the estimated and true regression planes. Extensive numerical results demonstrate the superiority of our approach to both robust regression and the regularized LAD in terms of estimation accuracy and outlier detection rates.
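The contrast the abstract draws between least squares and regularized LAD can be illustrated with a minimal sketch. It is not the authors' Wasserstein DRO formulation: it assumes a one-parameter, through-origin model and a plain absolute-value penalty on the slope, and the toy data and all function names are hypothetical. In that special case the regularized LAD slope is a weighted median, which keeps the example dependency-free.

```python
def weighted_median(values, weights):
    """Return a minimizer of sum_i w_i * |v_i - b| over b."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    return pairs[-1][0]


def least_squares_slope(xs, ys):
    """Slope of a through-origin fit minimizing sum (y - b*x)^2."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def regularized_lad_slope(xs, ys, penalty=0.0):
    """Slope minimizing sum |y - b*x| + penalty * |b| (through origin).

    Each term |y - b*x| equals |x| * |y/x - b|, and the penalty acts as
    one extra term with value 0 and weight `penalty`, so the minimizer
    is a weighted median of the ratios y/x.
    """
    ratios = [y / x for x, y in zip(xs, ys) if x != 0]
    weights = [abs(x) for x in xs if x != 0]
    ratios.append(0.0)
    weights.append(penalty)
    return weighted_median(ratios, weights)


# Hypothetical toy data: y = 2x exactly, except for one gross outlier.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 60.0]

ls_slope = least_squares_slope(xs, ys)
lad_slope = regularized_lad_slope(xs, ys, penalty=0.5)

# Flag the point with the largest absolute residual under the LAD fit.
residuals = [abs(y - lad_slope * x) for x, y in zip(xs, ys)]
suspect = residuals.index(max(residuals))
```

On this toy data the least-squares slope is pulled far above 2 by the single contaminated point, while the LAD slope stays at 2 and the largest residual singles out the contaminated sample.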

Boston University Institutional Repository (OpenBU)

Ridge Estimation of Inverse Covariance Matrices from High-Dimensional Data

Authors: Peeters Carel F. W.; van Wieringen Wessel N.
Publication venue: Elsevier BV
Publication date: 24/09/2015

We study ridge estimation of the precision matrix in the high-dimensional setting where the number of variables is large relative to the sample size. We first review two archetypal ridge estimators and note that the penalties they use do not coincide with common ridge penalties. Subsequently, starting from a common ridge penalty, analytic expressions are derived for two alternative ridge estimators of the precision matrix. The alternative estimators are compared to the archetypes with regard to eigenvalue shrinkage and risk. The alternatives are also compared to the graphical lasso within the context of graphical modeling. The comparisons may give reason to prefer the proposed alternative estimators.
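To see why some form of ridge regularization is needed in this setting, the sketch below builds a singular sample covariance matrix and then inverts a regularized version of it. It assumes the simple form $(S + \lambda I)^{-1}$ purely for illustration and should not be read as the alternative estimators the abstract proposes. The 2x2 toy data and the function names are hypothetical.

```python
def sample_covariance_2d(rows):
    """Maximum-likelihood sample covariance of two variables."""
    n = len(rows)
    means = [sum(r[j] for r in rows) / n for j in range(2)]
    cov = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - means[0], r[1] - means[1]]
        for i in range(2):
            for j in range(2):
                cov[i][j] += d[i] * d[j] / n
    return cov


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def ridge_precision_2d(cov, lam):
    """Invert cov + lam * I in closed form (assumed ridge-type form)."""
    a = [[cov[0][0] + lam, cov[0][1]],
         [cov[1][0], cov[1][1] + lam]]
    d = det2(a)
    return [[a[1][1] / d, -a[0][1] / d],
            [-a[1][0] / d, a[0][0] / d]]


# Hypothetical data: two samples of two variables, so the centered data
# have rank at most 1 and the sample covariance cannot be inverted.
rows = [[1.0, 2.0], [3.0, 6.0]]
S = sample_covariance_2d(rows)
singular = det2(S) == 0.0

# Adding lam to the diagonal shifts every eigenvalue up by lam,
# which makes the matrix invertible for any lam > 0.
precision = ridge_precision_2d(S, lam=1.0)
```

Here `S` is `[[1.0, 2.0], [2.0, 4.0]]`, whose determinant is zero, while the regularized matrix has determinant 6 and a well-defined inverse.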

arXiv.org e-Print Archive

Robustness in sparse linear models: relative efficiency based on robust approximate message passing

Author: Bradic Jelena
Publication date: 30/07/2015

Understanding efficiency in high dimensional linear models is a longstanding problem of interest. Classical work with smaller dimensional problems dating back to Huber and Bickel has illustrated the benefits of efficient loss functions. When the number of parameters $p$ is of the same order as the sample size $n$, $p \approx n$, an efficiency pattern different from the one of Huber was recently established. In this work, we consider the effects of model selection on the estimation efficiency of penalized methods. In particular, we explore whether sparsity results in new efficiency patterns when $p > n$. In the interest of deriving the asymptotic mean squared error for regularized M-estimators, we use the powerful framework of approximate message passing. We propose a novel, robust and sparse approximate message passing algorithm (RAMP) that is adaptive to the error distribution. Our algorithm includes many non-quadratic and non-differentiable loss functions. We derive its asymptotic mean squared error and show its convergence, while allowing $p, n, s \to \infty$, with $n/p \in (0,1)$ and $n/s \in (1,\infty)$. We identify new patterns of relative efficiency regarding a number of penalized M-estimators when $p$ is much larger than $n$. We show that the classical information bound is no longer reachable, even for light-tailed error distributions. We show that the penalized least absolute deviation estimator dominates the penalized least squares estimator in cases of heavy-tailed distributions. We observe this pattern for all choices of the number of non-zero parameters $s$, both $s \leq n$ and $s \approx n$. In non-penalized problems where $s = p \approx n$, the opposite regime holds. Therefore, we discover that the presence of model selection significantly changes the efficiency patterns.

Comment: 49 pages, 10 figures

arXiv.org e-Print Archive

Ezid

eScholarship - University of California