Loss minimization and parameter estimation with heavy tails
This work studies applications and generalizations of a simple estimation
technique that provides exponential concentration under heavy-tailed
distributions, assuming only bounded low-order moments. We show that the
technique can be used for approximate minimization of smooth and strongly
convex losses, and specifically for least squares linear regression. For
instance, our $d$-dimensional estimator requires just $\tilde{O}(d \log(1/\delta))$
random samples to obtain a constant factor
approximation to the optimal least squares loss with probability $1 - \delta$,
without requiring the covariates or noise to be bounded or subgaussian. We
provide further applications to sparse linear regression and low-rank
covariance matrix estimation with similar allowances on the noise and covariate
distributions. The core technique is a generalization of the median-of-means
estimator to arbitrary metric spaces. Comment: Final version as published in JMLR.
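For concreteness, here is a minimal sketch of the scalar median-of-means primitive underlying this line of work; the function name, block count, and test distribution are illustrative choices, not taken from the paper.

```python
import numpy as np

def median_of_means(x, k):
    """Median-of-means estimate of E[X] from a 1-D sample.

    Split the sample into k disjoint blocks, average each block, and
    return the median of the block means. With k ~ log(1/delta) blocks
    this gives exponential concentration under only a finite second moment.
    """
    x = np.asarray(x, dtype=float)
    n = (len(x) // k) * k          # drop the remainder so all blocks are equal-sized
    blocks = x[:n].reshape(k, -1)  # k blocks of size n // k
    return np.median(blocks.mean(axis=1))

# Example on a heavy-tailed (Pareto) sample
rng = np.random.default_rng(0)
sample = rng.pareto(2.5, size=10_000) + 1.0
print(median_of_means(sample, k=20))
```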
Empirical risk minimization for heavy-tailed losses
The purpose of this paper is to discuss empirical risk minimization when the
losses are not necessarily bounded and may have a distribution with heavy
tails. In such situations, usual empirical averages may fail to provide
reliable estimates and empirical risk minimization may provide large excess
risk. However, some robust mean estimators proposed in the literature may be
used to replace empirical means. In this paper, we investigate empirical risk
minimization based on a robust estimate proposed by Catoni. We develop
performance bounds based on chaining arguments tailored to Catoni's mean
estimator. Comment: Published at http://dx.doi.org/10.1214/15-AOS1350 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
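A hedged sketch of a Catoni-style mean estimator, the building block that replaces empirical averages in this line of work; the influence function below is the standard soft-truncation choice, and the bisection solver and the tuning of alpha are illustrative rather than the paper's exact prescription.

```python
import numpy as np

def catoni_mean(x, alpha, n_iter=50):
    """Catoni-style robust mean estimate.

    Finds theta such that sum_i psi(alpha * (x_i - theta)) = 0, where
    psi(t) = sign(t) * log(1 + |t| + t^2 / 2) is a soft truncation of t.
    The root is bracketed by min(x) and max(x) and found by bisection.
    """
    x = np.asarray(x, dtype=float)

    def psi(t):
        return np.sign(t) * np.log1p(np.abs(t) + 0.5 * t * t)

    lo, hi = x.min(), x.max()
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        # The objective is decreasing in theta, so bisect on its sign.
        if psi(alpha * (x - mid)).sum() > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# alpha is typically taken of order sqrt(2 log(1/delta) / (n * variance));
# the value below is just a plausible setting for this sample size.
rng = np.random.default_rng(1)
data = rng.standard_t(df=3, size=5_000)
print(catoni_mean(data, alpha=0.05))
```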
Recommended from our members
Adaptive Huber Regression.
Big data can easily be contaminated by outliers or contain variables with heavy-tailed distributions, which makes many conventional methods inadequate. To address this challenge, we propose the adaptive Huber regression for robust estimation and inference. The key observation is that the robustification parameter should adapt to the sample size, dimension and moments for optimal tradeoff between bias and robustness. Our theoretical framework deals with heavy-tailed distributions with bounded (1 + δ)-th moment for any δ > 0. We establish a sharp phase transition for robust estimation of regression parameters in both low and high dimensions: when δ ≥ 1, the estimator admits a sub-Gaussian-type deviation bound without sub-Gaussian assumptions on the data, while only a slower rate is available in the regime 0 < δ < 1 and the transition is smooth and optimal. In addition, we extend the methodology to allow both heavy-tailed predictors and observation noise. Simulation studies lend further support to the theory. In a genetic study of cancer cell lines that exhibit heavy-tailedness, the proposed methods are shown to be more robust and predictive.
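A minimal sketch of Huber regression with a sample-size-dependent robustification parameter, fitted here by iteratively reweighted least squares; the default scaling of tau only mirrors the adapt-to-(n, d) idea and is not the paper's exact calibration.

```python
import numpy as np

def adaptive_huber_regression(X, y, tau=None, n_iter=100):
    """Huber regression with a robustification parameter that grows with n.

    If tau is not supplied, it defaults to c * sigma_hat * sqrt(n / (d + log n)),
    an illustrative rule reflecting that the truncation level should adapt to
    the sample size and dimension. Fitting is by iteratively reweighted
    least squares (IRLS) for the Huber loss.
    """
    X = np.asarray(X, float)
    y = np.asarray(y, float)
    n, d = X.shape
    if tau is None:
        sigma_hat = np.std(y - X @ np.linalg.lstsq(X, y, rcond=None)[0])
        tau = 1.345 * sigma_hat * np.sqrt(n / (d + np.log(n)))

    beta = np.zeros(d)
    for _ in range(n_iter):
        r = y - X @ beta
        # Huber weights: 1 inside the truncation region, tau/|r| outside
        w = np.where(np.abs(r) <= tau, 1.0, tau / np.maximum(np.abs(r), 1e-12))
        Xw = X * w[:, None]
        beta_new = np.linalg.solve(Xw.T @ X, Xw.T @ y)   # weighted least squares step
        if np.allclose(beta_new, beta, atol=1e-8):
            break
        beta = beta_new
    return beta, tau
```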
Robust Estimation via Robust Gradient Estimation
We provide a new computationally-efficient class of estimators for risk
minimization. We show that these estimators are robust for general statistical
models: in the classical Huber epsilon-contamination model and in heavy-tailed
settings. Our workhorse is a novel robust variant of gradient descent, and we
provide conditions under which our gradient descent variant provides accurate
estimators in a general convex risk minimization problem. We provide specific
consequences of our theory for linear regression, logistic regression and for
estimation of the canonical parameters in an exponential family. These results
provide some of the first computationally tractable and provably robust
estimators for these canonical statistical models. Finally, we study the
empirical performance of our proposed methods on synthetic and real datasets,
and find that our methods convincingly outperform a variety of baselines. Comment: 48 pages, 5 figures.
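The paper's specific robust gradient estimator is not reproduced here; the sketch below uses a coordinate-wise median-of-means aggregator as a stand-in, applied to squared-loss linear regression, to illustrate the general robust-gradient-descent template.

```python
import numpy as np

def mom_gradient(per_sample_grads, k):
    """Coordinate-wise median-of-means aggregation of per-sample gradients."""
    n = (len(per_sample_grads) // k) * k
    blocks = per_sample_grads[:n].reshape(k, -1, per_sample_grads.shape[1])
    return np.median(blocks.mean(axis=1), axis=0)

def robust_gd_linear_regression(X, y, k=10, lr=0.1, n_iter=500):
    """Gradient descent for squared-loss linear regression in which each
    step uses a robustly aggregated gradient instead of the plain average.
    Assumes roughly standardized features so a fixed step size is stable.
    """
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(n_iter):
        residuals = X @ beta - y            # shape (n,)
        grads = residuals[:, None] * X      # per-sample gradients, shape (n, d)
        beta -= lr * mom_gradient(grads, k)
    return beta
```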
Robust Covariance Estimation for Approximate Factor Models
In this paper, we study robust covariance estimation under the approximate
factor model with observed factors. We propose a novel framework to first
estimate the initial joint covariance matrix of the observed data and the
factors, and then use it to recover the covariance matrix of the observed data.
We prove that once the initial matrix estimator is good enough to maintain the
element-wise optimal rate, the whole procedure will generate an estimated
covariance with desired properties. For data with only bounded fourth moments,
we propose to use Huber loss minimization to give the initial joint covariance
estimation. This approach is applicable to a much wider range of distributions,
including sub-Gaussian and elliptical distributions. We also present an
asymptotic result for Huber's M-estimator with a diverging parameter. The
conclusions are demonstrated by extensive simulations and real data analysis.
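As an illustration of Huber loss minimization for covariance entries, here is a hedged element-wise sketch; the truncation level and its n-dependent scaling are placeholder choices rather than the paper's calibration.

```python
import numpy as np

def huber_mean(z, tau, n_iter=50):
    """Huber M-estimate of a univariate mean: solve sum clip(z_i - mu, -tau, tau) = 0
    by bisection (the estimating equation is decreasing in mu)."""
    lo, hi = z.min(), z.max()
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        s = np.clip(z - mid, -tau, tau).sum()
        lo, hi = (mid, hi) if s > 0 else (lo, mid)
    return 0.5 * (lo + hi)

def robust_covariance(X, c=2.0):
    """Element-wise robust covariance: each entry is a Huber M-estimate of the
    centered cross products, with a truncation level that diverges with n
    (the c * sqrt(n / log n) rule is an illustrative choice)."""
    n, d = X.shape
    tau = c * np.sqrt(n / np.log(n))
    mu = np.array([huber_mean(X[:, j], tau) for j in range(d)])
    Xc = X - mu
    S = np.empty((d, d))
    for j in range(d):
        for k in range(j, d):
            S[j, k] = S[k, j] = huber_mean(Xc[:, j] * Xc[:, k], tau)
    return S
```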
Portfolio optimization for heavy-tailed assets: Extreme Risk Index vs. Markowitz
Using daily returns of the S&P 500 stocks from 2001 to 2011, we perform a
backtesting study of the portfolio optimization strategy based on the extreme
risk index (ERI). This method uses multivariate extreme value theory to
minimize the probability of large portfolio losses. With more than 400 stocks
to choose from, our study seems to be the first application of extreme value
techniques in portfolio management on a large scale. The primary aim of our
investigation is the potential of ERI in practice. The performance of this
strategy is benchmarked against the minimum variance portfolio and the equally
weighted portfolio. These fundamental strategies are important benchmarks for
large-scale applications. Our comparison includes annualized portfolio returns,
maximal drawdowns, transaction costs, portfolio concentration, and asset
diversity in the portfolio. In addition to that we study the impact of an
alternative tail index estimator. Our results show that the ERI strategy
significantly outperforms both the minimum-variance portfolio and the equally
weighted portfolio on assets with heavy tails. Comment: Manuscript accepted in the Journal of Empirical Finance.
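The ERI strategy depends on a tail index estimate for each asset; the classical Hill estimator (a standard baseline, not necessarily the alternative estimator studied in the paper) can be sketched as follows.

```python
import numpy as np

def hill_estimator(losses, k):
    """Hill estimator of the tail index from the k largest observations.

    For a heavy-tailed positive loss L with P(L > x) ~ x^{-alpha}, the
    estimator averages the log-spacings of the top order statistics;
    the reciprocal of that average estimates alpha. The choice of k is
    the usual bias-variance trade-off.
    """
    x = np.sort(np.asarray(losses, float))[::-1]   # descending order statistics
    top = x[:k + 1]
    log_spacings = np.log(top[:-1]) - np.log(top[-1])
    return 1.0 / log_spacings.mean()
```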
A Fast Simulation Method for the Sum of Subexponential Distributions
Estimating the probability that a sum of random variables (RVs) exceeds a
given threshold is a well-known challenging problem. Closed-form expression of
the sum distribution is usually intractable and presents an open problem. A
crude Monte Carlo (MC) simulation is the standard technique for the estimation
of this type of probability. However, this approach is computationally
expensive especially when dealing with rare events (i.e events with very small
probabilities). Importance Sampling (IS) is an alternative approach which
effectively improves the computational efficiency of the MC simulation. In this
paper, we develop a general framework based on IS approach for the efficient
estimation of the probability that the sum of independent and not necessarily
identically distributed heavy-tailed RVs exceeds a given threshold. The
proposed IS approach is based on constructing a new sampling distribution by
twisting the hazard rate of the original underlying distribution of each
component in the summation. A minmax approach is carried out for the
determination of the twisting parameter, for any given threshold. Moreover,
using this minmax optimal choice, the estimation of the probability of interest
is shown to be asymptotically optimal as the threshold goes to infinity. We
also offer some selected simulation results illustrating first the efficiency
of the proposed IS approach compared to the naive MC simulation. The
near-optimality of the minmax approach is then numerically analyzed.
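The hazard-rate twisting construction is not reproduced here; the sketch below illustrates the generic importance-sampling template for the sum of independent Pareto variables, with a simple shape-scaling proposal standing in for the paper's twisted distribution and theta playing the role of the twisting parameter.

```python
import numpy as np

def is_tail_prob_pareto(alphas, gamma, theta=0.5, n_samples=100_000, seed=0):
    """Importance-sampling estimate of P(X_1 + ... + X_d > gamma) for
    independent Pareto(alpha_i) variables (support x >= 1, pdf a * x^{-a-1}).

    The proposal draws each component from a heavier-tailed Pareto with
    shape theta * alpha_i, pushing more samples into the rare-event region;
    the estimate averages indicator * likelihood ratio.
    """
    rng = np.random.default_rng(seed)
    alphas = np.asarray(alphas, float)
    prop = theta * alphas                      # proposal shape parameters
    # Inverse-CDF sampling: U^{-1/a} is Pareto(a) on [1, inf)
    u = rng.random((n_samples, len(alphas)))
    x = u ** (-1.0 / prop)
    # Log likelihood ratio f/g summed across independent components
    log_lr = (np.log(alphas / prop) + (prop - alphas) * np.log(x)).sum(axis=1)
    hit = x.sum(axis=1) > gamma
    return np.mean(hit * np.exp(log_lr))

print(is_tail_prob_pareto(alphas=[2.0, 2.0, 2.0], gamma=50.0))
```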
Learning without Concentration
We obtain sharp bounds on the performance of Empirical Risk Minimization
performed in a convex class and with respect to the squared loss, without
assuming that class members and the target are bounded functions or have
rapidly decaying tails.
Rather than resorting to a concentration-based argument, the method used here
relies on a `small-ball' assumption and thus holds for classes consisting of
heavy-tailed functions and for heavy-tailed targets.
The resulting estimates scale correctly with the `noise level' of the
problem, and when applied to the classical, bounded scenario, always improve
the known bounds.
User-Friendly Covariance Estimation for Heavy-Tailed Distributions
We offer a survey of recent results on covariance estimation for heavy-tailed
distributions. By unifying ideas scattered in the literature, we propose
user-friendly methods that facilitate practical implementation. Specifically,
we introduce element-wise and spectrum-wise truncation operators, as well as
their $M$-estimator counterparts, to robustify the sample covariance matrix.
Different from the classical notion of robustness that is characterized by the
breakdown property, we focus on the tail robustness which is evidenced by the
connection between nonasymptotic deviation and confidence level. The key
observation is that the estimators need to adapt to the sample size,
dimensionality of the data and the noise level to achieve optimal tradeoff
between bias and robustness. Furthermore, to facilitate their practical use, we
propose data-driven procedures that automatically calibrate the tuning
parameters. We demonstrate their applications to a series of structured models
in high dimensions, including the bandable and low-rank covariance matrices and
sparse precision matrices. Numerical studies lend strong support to the
proposed methods.
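A minimal sketch of the element-wise truncation operator applied to the sample covariance; the single global truncation level below is an illustrative simplification of the entry-wise, data-driven calibration the survey advocates.

```python
import numpy as np

def truncated_covariance(X, tau):
    """Element-wise truncated covariance estimator: each centered cross
    product is clipped to [-tau, tau] before averaging, which robustifies
    the sample covariance against heavy tails."""
    n, d = X.shape
    Xc = X - X.mean(axis=0)          # simple centering; a robust location
                                     # estimate could be substituted here
    S = np.zeros((d, d))
    for j in range(d):
        for k in range(j, d):
            prod = np.clip(Xc[:, j] * Xc[:, k], -tau, tau)
            S[j, k] = S[k, j] = prod.mean()
    return S

# tau is typically calibrated per entry, e.g. of order
# sigma_jk * sqrt(n / log(d^2 / delta)); a single global tau is used
# here only to keep the sketch short.
```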
Geometric median and robust estimation in Banach spaces
In many real-world applications, collected data are contaminated by noise
with heavy-tailed distribution and might contain outliers of large magnitude.
In this situation, it is necessary to apply methods which produce reliable
outcomes even if the input contains corrupted measurements. We describe a
general method which allows one to obtain estimators with tight concentration
around the true parameter of interest taking values in a Banach space.
Suggested construction relies on the fact that the geometric median of a
collection of independent "weakly concentrated" estimators satisfies a much
stronger deviation bound than each individual element in the collection. Our
approach is illustrated through several examples, including sparse linear
regression and low-rank matrix recovery problems. Comment: Published at http://dx.doi.org/10.3150/14-BEJ645 in the Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
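A finite-dimensional sketch of the construction: weak estimators (here, block means) are aggregated by their geometric median, computed with Weiszfeld's algorithm; the block count and tolerances are illustrative choices.

```python
import numpy as np

def geometric_median(points, n_iter=100, tol=1e-9):
    """Geometric median of points in R^d via Weiszfeld's algorithm:
    the point minimizing the sum of Euclidean distances to the inputs."""
    points = np.asarray(points, float)
    z = points.mean(axis=0)
    for _ in range(n_iter):
        dist = np.maximum(np.linalg.norm(points - z, axis=1), tol)  # avoid /0
        w = 1.0 / dist
        z_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(z_new - z) < tol:
            return z_new
        z = z_new
    return z

def median_of_means_vector(X, k):
    """Boost weak estimators: split the sample into k blocks, compute each
    block's empirical mean, and return the geometric median of the block
    means. This is the finite-dimensional instance of the construction."""
    n = (len(X) // k) * k
    block_means = X[:n].reshape(k, -1, X.shape[1]).mean(axis=1)
    return geometric_median(block_means)
```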