On the use of robust regression in econometrics
The use of robust regression estimators has gained popularity among applied econometricians. The main argument invoked to justify the use of the robust estimators is that they provide efficiency gains in the presence of outliers or non-normal errors. Unfortunately, most practitioners seem to be unaware of the fact that heteroskedastic and skewed errors can dramatically affect the properties of these estimators. In this paper we reconsider the interpretation of the specific robust estimator that has become popular in applied econometrics, and conclude that its use in this context cannot be generally recommended.
Robustness analysis of a Maximum Correntropy framework for linear regression
In this paper we formulate a solution of the robust linear regression problem
in a general framework of correntropy maximization. Our formulation yields a
unified class of estimators which includes the Gaussian and Laplacian
kernel-based correntropy estimators as special cases. An analysis of the
robustness properties is then provided. The analysis includes a quantitative
characterization of the informativity degree of the regression which is
appropriate for studying the stability of the estimator. Using this tool, a
sufficient condition is expressed under which the parametric estimation error
is shown to be bounded. An explicit expression of the bound is given and
its numerical computation is discussed. For illustration purposes,
two special cases are studied numerically.
Comment: 10 pages, 5 figures, To appear in Automatic
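As a rough illustration of the Gaussian-kernel special case mentioned in the abstract, the sketch below maximizes the sample correntropy of the residuals with a fixed-point (iteratively reweighted least squares) iteration. This is a commonly used scheme and not necessarily the formulation of the paper; the kernel width `sigma`, the ordinary least squares starting value, the toy data, and all function names are assumptions made for the example.

```python
import math

def weighted_ls(xs, ys, ws):
    """Weighted least squares fit of y = intercept + slope * x."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def mcc_fit(xs, ys, sigma=1.0, n_iter=50):
    """Fixed-point iteration for the Gaussian-kernel correntropy criterion,
    i.e. maximizing sum_i exp(-r_i**2 / (2 * sigma**2)) over the line."""
    # Assumption: start from the ordinary least squares fit.
    intercept, slope = weighted_ls(xs, ys, [1.0] * len(xs))
    for _ in range(n_iter):
        resid = [y - intercept - slope * x for x, y in zip(xs, ys)]
        # Each point is weighted by the kernel evaluated at its residual,
        # so gross outliers receive weights close to zero.
        ws = [math.exp(-r * r / (2.0 * sigma ** 2)) for r in resid]
        intercept, slope = weighted_ls(xs, ys, ws)
    return intercept, slope

# Toy data (hypothetical): y = 1 + 2x plus small noise, one gross outlier.
xs = [float(i) for i in range(11)]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.2, -0.2]
ys = [1.0 + 2.0 * x + e for x, e in zip(xs, noise)]
ys[5] += 50.0

intercept, slope = mcc_fit(xs, ys)
print(intercept, slope)
```

The kernel width controls how aggressively large residuals are discounted. If `sigma` is very small relative to the initial residuals, all weights can underflow to zero and the weighted fit breaks down, so in practice the width would need to be chosen with some care.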
Asymptotic theory for iterated one-step Huber-skip estimators
Iterated one-step Huber-skip M-estimators are considered for regression problems. Each one-step estimator is a reweighted least squares estimator with zero/one weights determined by the initial estimator and the data. The asymptotic theory is given for iteration of such estimators using a tightness argument. The results apply to stationary as well as non-stationary regression problems.
Keywords: Huber-skip; iteration; one-step M-estimators; unit roots
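The reweighting step described in this abstract can be sketched in a few lines of Python for a single regressor with intercept. This is only an illustration under assumptions that are not taken from the abstract: the cutoff of 2.5, the MAD-based residual scale, the ordinary least squares starting value, the toy data, and all function names are hypothetical choices.

```python
from statistics import median

def weighted_ls(xs, ys, ws):
    """Weighted least squares fit of y = intercept + slope * x."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def skip_weights(xs, ys, intercept, slope, cutoff=2.5):
    """Zero/one weights: keep a point only if its scaled residual is small."""
    resid = [y - intercept - slope * x for x, y in zip(xs, ys)]
    # Assumption: residual scale estimated by the normalized MAD.
    scale = 1.4826 * median(abs(r) for r in resid)
    return [1.0 if abs(r) <= cutoff * scale else 0.0 for r in resid]

def iterated_huber_skip(xs, ys, n_iter=5, cutoff=2.5):
    """Iterate the one-step estimator, starting from ordinary least squares."""
    weights = [1.0] * len(xs)
    intercept, slope = weighted_ls(xs, ys, weights)
    for _ in range(n_iter):
        # Weights come from the previous estimate and the data ...
        weights = skip_weights(xs, ys, intercept, slope, cutoff)
        # ... and the next estimate is least squares on the retained points.
        intercept, slope = weighted_ls(xs, ys, weights)
    return intercept, slope, weights

# Toy data (hypothetical): y = 1 + 2x plus small noise, one gross outlier.
xs = [float(i) for i in range(11)]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.2, -0.2]
ys = [1.0 + 2.0 * x + e for x, e in zip(xs, noise)]
ys[5] += 50.0

intercept, slope, weights = iterated_huber_skip(xs, ys)
print(intercept, slope, weights)
```

On this toy data the outlying observation is the only one that receives weight zero, and the weights stop changing after the first step, so further iterations leave the estimate unchanged.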
Robust regularized singular value decomposition with application to mortality data
We develop a robust regularized singular value decomposition (RobRSVD) method
for analyzing two-way functional data. The research is motivated by the
application of modeling human mortality as a smooth two-way function of age
group and year. The RobRSVD is formulated as a penalized loss minimization
problem where a robust loss function is used to measure the reconstruction
error of a low-rank matrix approximation of the data, and an appropriately
defined two-way roughness penalty function is used to ensure smoothness along
each of the two functional domains. By viewing the minimization problem as two
conditional regularized robust regressions, we develop a fast iterative
reweighted least squares algorithm to implement the method. Our implementation
naturally incorporates missing values. Furthermore, our formulation allows
rigorous derivation of leave-one-row/column-out cross-validation and
generalized cross-validation criteria, which enable computationally efficient
data-driven penalty parameter selection. The advantages of the new robust
method over nonrobust ones are shown via extensive simulation studies and the
mortality rate application.
Comment: Published at http://dx.doi.org/10.1214/13-AOAS649 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Discriminative Density-ratio Estimation
Covariate shift is a challenging problem in supervised learning that
results from the discrepancy between the training and test distributions. An
effective approach that has recently drawn considerable attention in the research
community is to reweight the training samples to minimize that discrepancy.
Specifically, many methods are based on developing Density-ratio (DR) estimation
techniques that apply to both regression and classification problems. Although
these methods work well for regression problems, their performance on
classification problems is not satisfactory. The key observation is
that these methods focus on matching the sample marginal distributions without
paying attention to preserving the separation between classes in the reweighted
space. In this paper, we propose a novel method for Discriminative
Density-ratio (DDR) estimation that addresses the aforementioned problem and
aims at estimating the density-ratio of joint distributions in a class-wise
manner. The proposed algorithm is an iterative procedure that alternates
between estimating the class information for the test data and estimating a new
density ratio for each class. To incorporate the estimated class information of
the test data, a soft matching technique is proposed. In addition, we employ an
effective criterion which adopts mutual information as an indicator to stop the
iterative procedure while resulting in a decision boundary that lies in a
sparse region. Experiments on synthetic and benchmark datasets demonstrate the
superiority of the proposed method in terms of both accuracy and robustness.
Robust PCA as Bilinear Decomposition with Outlier-Sparsity Regularization
Principal component analysis (PCA) is widely used for dimensionality
reduction, with well-documented merits in various applications involving
high-dimensional data, including computer vision, preference measurement, and
bioinformatics. In this context, the fresh look advocated here brings
benefits from variable selection and compressive sampling to robustify PCA
against outliers. A least-trimmed squares estimator of a low-rank bilinear
factor analysis model is shown to be closely related to that obtained from an
ℓ0-(pseudo)norm-regularized criterion encouraging sparsity in a matrix
explicitly modeling the outliers. This connection suggests robust PCA schemes
based on convex relaxation, which lead naturally to a family of robust
estimators encompassing Huber's optimal M-class as a special case. Outliers are
identified by tuning a regularization parameter, which amounts to controlling
sparsity of the outlier matrix along the whole robustification path of (group)
least-absolute shrinkage and selection operator (Lasso) solutions. Beyond its
neat ties to robust statistics, the developed outlier-aware PCA framework is
versatile to accommodate novel and scalable algorithms to: i) track the
low-rank signal subspace robustly, as new data are acquired in real time; and
ii) determine principal components robustly in (possibly) infinite-dimensional
feature spaces. Synthetic and real data tests corroborate the effectiveness of
the proposed robust PCA schemes, when used to identify aberrant responses in
personality assessment surveys, as well as unveil communities in social
networks, and intruders from video surveillance data.
Comment: 30 pages, submitted to IEEE Transactions on Signal Processin
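The idea of modeling outliers explicitly through a sparse matrix can be sketched as follows. This is a minimal rank-one illustration, assuming an entrywise ℓ1 penalty as the convex surrogate and a plain alternating minimization; it is not the paper's algorithm, and the function names, the penalty weight `lam`, and the toy matrix are hypothetical.

```python
def soft(r, t):
    """Soft-thresholding operator: shrink r toward zero by t."""
    if r > t:
        return r - t
    if r < -t:
        return r + t
    return 0.0

def objective(Y, u, v, O, lam):
    """Squared fit error of Y ~ u v^T + O plus lam times the l1 norm of O."""
    n, p = len(u), len(v)
    fit = sum((Y[i][j] - u[i] * v[j] - O[i][j]) ** 2
              for i in range(n) for j in range(p))
    return fit + lam * sum(abs(o) for row in O for o in row)

def robust_rank1(Y, lam, n_iter=50):
    """Alternate exact minimizations over v, u, and the outlier matrix O."""
    n, p = len(Y), len(Y[0])
    u = [1.0] * n                      # assumption: constant starting vector
    O = [[0.0] * p for _ in range(n)]  # start with no declared outliers
    history = []
    for _ in range(n_iter):
        uu = sum(a * a for a in u)
        v = [sum(u[i] * (Y[i][j] - O[i][j]) for i in range(n)) / uu
             for j in range(p)]
        vv = sum(b * b for b in v)
        u = [sum(v[j] * (Y[i][j] - O[i][j]) for j in range(p)) / vv
             for i in range(n)]
        # Entries of O are the soft-thresholded residuals; a larger lam
        # declares fewer entries to be outliers.
        O = [[soft(Y[i][j] - u[i] * v[j], lam / 2.0) for j in range(p)]
             for i in range(n)]
        history.append(objective(Y, u, v, O, lam))
    return u, v, O, history

# Toy matrix (hypothetical): exactly rank one except for one corrupted cell.
Y = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0], [3.0, 3.0, 3.0], [4.0, 4.0, 4.0]]
Y[1][2] = 12.0

u, v, O, history = robust_rank1(Y, lam=2.0)
print(O)
```

Because each of the three updates minimizes the criterion exactly over its own block, the recorded objective values cannot increase from one sweep to the next. The bilinear problem is not convex, however, so the scheme may stop at a local optimum, and sweeping `lam` over a grid would trace out how many entries get flagged as outliers.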