A Parametric Framework for the Comparison of Methods of Very Robust Regression
There are several methods for obtaining very robust estimates of regression
parameters that asymptotically resist 50% of outliers in the data. Differences
in the behaviour of these algorithms depend on the distance between the
regression data and the outliers. We introduce a parameter that
defines a parametric path in the space of models and enables us to study, in a
systematic way, the properties of estimators as the groups of data move from
being far apart to close together. We examine, as a function of this parameter, the
variance and squared bias of five estimators and we also consider their power
when used in the detection of outliers. This systematic approach provides tools
for gaining knowledge and better understanding of the properties of robust
estimators.
Comment: Published at http://dx.doi.org/10.1214/13-STS437 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
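The parametric-path idea can be caricatured with a small Monte Carlo sketch (this is not the authors' framework; the contamination pattern, the choice of a Huber M-estimator fit by IRLS, and all settings below are illustrative assumptions): a distance parameter lam moves a group of vertical outliers away from the regression data, and we track the estimation error of ordinary least squares against a robust fit as lam grows.

```python
import numpy as np

def huber_irls(X, y, c=1.345, iters=50):
    """Huber M-estimator of regression coefficients via IRLS (illustrative sketch)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(iters):
        r = y - X @ beta
        s = np.median(np.abs(r - np.median(r))) / 0.6745 + 1e-12   # MAD residual scale
        a = np.abs(r / s)
        w = np.where(a > c, c / np.maximum(a, 1e-12), 1.0)         # Huber weights
        beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    return beta

rng = np.random.default_rng(0)
n, prop_out = 100, 0.2

def intercept_error(lam, reps=50):
    """Mean absolute error of the intercept estimate at outlier distance lam."""
    e_ols, e_hub = 0.0, 0.0
    for _ in range(reps):
        x = rng.normal(size=n)
        y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)
        y[: int(prop_out * n)] += lam            # outlier group at distance lam
        X = np.column_stack([np.ones(n), x])
        e_ols += abs(np.linalg.lstsq(X, y, rcond=None)[0][0] - 1.0)
        e_hub += abs(huber_irls(X, y)[0] - 1.0)
    return e_ols / reps, e_hub / reps

for lam in (0.0, 5.0, 10.0):
    ols_err, hub_err = intercept_error(lam)
    print(f"lam={lam:4.1f}  OLS error={ols_err:.2f}  Huber error={hub_err:.2f}")
```

At lam = 0 the two estimators behave almost identically; as the outlier group moves away, the least-squares error grows roughly linearly in lam while the M-estimator's stays bounded, which is the kind of systematic comparison the abstract describes.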
Bayesian Restricted Likelihood Methods: Conditioning on Insufficient Statistics in Bayesian Regression
Bayesian methods have proven themselves to be successful across a wide range
of scientific problems and have many well-documented advantages over competing
methods. However, these methods run into difficulties for two major and
prevalent classes of problems: handling data sets with outliers and dealing
with model misspecification. We outline the drawbacks of previous solutions to
both of these problems and propose a new method as an alternative. When working
with the new method, the data is summarized through a set of insufficient
statistics, targeting inferential quantities of interest, and the prior
distribution is updated with the summary statistics rather than the complete
data. By careful choice of conditioning statistics, we retain the main benefits
of Bayesian methods while reducing the sensitivity of the analysis to features
of the data not captured by the conditioning statistics. For reducing
sensitivity to outliers, classical robust estimators (e.g., M-estimators) are
natural choices for conditioning statistics. A major contribution of this work
is the development of a data augmented Markov chain Monte Carlo (MCMC)
algorithm for the linear model and a large class of summary statistics. We
demonstrate the method on simulated and real data sets containing outliers and
subject to model misspecification. Success is manifested in better predictive
performance for data points of interest as compared to competing methods.
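The conditioning idea can be illustrated with a rejection-sampling sketch (this is not the paper's data-augmented MCMC algorithm; the normal location model, the tolerance, and all settings are illustrative assumptions): for contaminated data, update the prior using only the sample median, a robust but insufficient statistic, by keeping prior draws whose simulated median matches the observed one.

```python
import numpy as np

rng = np.random.default_rng(0)
n, theta_true = 50, 2.0
y = rng.normal(theta_true, 1.0, n)
y[:5] += 20.0                        # 10% gross outliers contaminate the sample
t_obs = np.median(y)                 # robust, insufficient conditioning statistic

# Rejection sketch of the restricted posterior p(theta | median(y)):
# draw theta from the prior, simulate a data set, keep draws whose median is near t_obs.
theta = rng.normal(0.0, 10.0, 50_000)                    # prior: N(0, 10^2)
sim_median = np.median(rng.normal(theta[:, None], 1.0, (theta.size, n)), axis=1)
post = theta[np.abs(sim_median - t_obs) < 0.1]

print(f"contaminated sample mean:  {y.mean():.2f}")
print(f"restricted posterior mean: {post.mean():.2f}  (accepted draws: {post.size})")
```

Conditioning on the full data (whose sufficient statistic here is the sample mean) would drag the posterior toward the outliers; conditioning on the median largely ignores them, which is the sensitivity reduction the abstract describes.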
Regression on manifolds: Estimation of the exterior derivative
Collinearity and near-collinearity of predictors cause difficulties when
doing regression. In these cases, variable selection becomes untenable because
of mathematical issues concerning the existence and numerical stability of the
regression coefficients, and interpretation of the coefficients is ambiguous
because gradients are not defined. Using a differential geometric
interpretation, in which the regression coefficients are interpreted as
estimates of the exterior derivative of a function, we develop a new method to
do regression in the presence of collinearities. Our regularization scheme can
improve estimation error, and it can be easily modified to include lasso-type
regularization. These estimators also have simple extensions to the "large p, small n" context.
Comment: Published at http://dx.doi.org/10.1214/10-AOS823 in The Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
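The underlying difficulty is easy to reproduce numerically. This sketch uses plain ridge regularization rather than the paper's exterior-derivative estimator, and all settings are illustrative: with two nearly identical predictors, least-squares coefficients are barely identified along the difference direction, while a small ridge penalty pins down the well-identified sum direction.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 1e-3 * rng.normal(size=n)        # near-collinear copy of x1
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.1 * rng.normal(size=n)     # true coefficients (1, 1)

# Least squares: the direction x1 - x2 is barely identified, so the individual
# coefficients can swing wildly even though the fitted values stay good.
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# Ridge: shrinks the ill-identified direction toward zero, stabilizing the estimate.
lam = 1.0
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y)

print("condition number of X^T X:", f"{np.linalg.cond(X.T @ X):.1e}")
print("OLS coefficients:  ", beta_ols)
print("ridge coefficients:", beta_ridge)
```

The geometric point of the abstract is that only the well-identified directions carry meaning: gradients along the collapsed direction are not defined, so a sensible estimator should report only the component that the data can actually see.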
Function estimation with locally adaptive dynamic models
We present a nonparametric Bayesian method for fitting unsmooth and highly oscillating functions, which is based on a locally adaptive hierarchical extension of standard dynamic or state space models. The main idea is to introduce locally varying variances in the state equations and to add a further smoothness prior for this variance function. Estimation is fully Bayesian and carried out by recent MCMC techniques. The whole approach can be understood as an alternative to other nonparametric function estimators, such as local or penalized regression with variable bandwidth or smoothing parameter selection. Performance is illustrated with simulated data, including unsmooth examples constructed for wavelet shrinkage, and by an application to sales data. Although the approach is developed for classical Gaussian nonparametric regression, it can be extended to more complex regression problems
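The locally varying state-equation variances play the role of a locally varying smoothness penalty. A non-Bayesian caricature of that effect (this is not the paper's MCMC; the penalized least-squares form, the jump location treated as known, and all settings are illustrative assumptions): penalize second differences everywhere, but relax the penalty near a jump.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
f_true = np.where(np.arange(n) >= 50, 2.0, 0.0)   # piecewise-constant signal with one jump
y = f_true + 0.1 * rng.normal(size=n)

# Second-difference penalty matrix D (row i: f[i] - 2 f[i+1] + f[i+2]).
D = np.zeros((n - 2, n))
for i in range(n - 2):
    D[i, i : i + 3] = (1.0, -2.0, 1.0)

def fit(lams):
    """Penalized least squares: argmin_f ||y - f||^2 + sum_i lams[i] * (D f)_i^2."""
    return np.linalg.solve(np.eye(n) + D.T @ (lams[:, None] * D), y)

lam_global = np.full(n - 2, 100.0)      # one smoothness level everywhere
lam_local = lam_global.copy()
lam_local[47:51] = 0.0                  # relax the penalty around the jump at index 50

err = lambda f: np.max(np.abs(f - f_true)[45:56])
print(f"max error near jump, global penalty: {err(fit(lam_global)):.2f}")
print(f"max error near jump, local penalty:  {err(fit(lam_local)):.2f}")
```

The global penalty smears the jump; the locally relaxed one reproduces it. The paper's contribution is to infer this variance (penalty) function from the data via a smoothness prior and MCMC rather than fixing it by hand as done here.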
Autocovariance estimation in regression with a discontinuous signal and m-dependent errors: A difference-based approach
We discuss a class of difference-based estimators for the autocovariance in
nonparametric regression when the signal is discontinuous (change-point
regression), possibly highly fluctuating, and the errors form a stationary
m-dependent process. These estimators circumvent the explicit pre-estimation
of the unknown regression function, a task which is particularly challenging
for such signals. We provide explicit expressions for their mean squared errors
when the signal function is piecewise constant (segment regression) and the
errors are Gaussian. Based on this, we derive bias-optimized estimates which
do not depend on the particular (unknown) autocovariance structure. Notably,
for positively correlated errors, that part of the variance of our estimators
which depends on the signal is minimal as well. Further, we provide sufficient
conditions for √n-consistency; this result is extended to piecewise
Hölder regression with non-Gaussian errors.
We combine our bias-optimized autocovariance estimates with a
projection-based approach and derive covariance matrix estimates, a method
which is of independent interest. Several simulation studies as well as an
application to biophysical measurements complement this paper.
Comment: 41 pages, 3 figures, 3 tables.
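The first-order version of the difference-based idea is simple to state (this sketch covers only the iid-error, lag-0 case with independent noise; the paper treats general m-dependent errors, higher lags, and optimized difference schemes): squared first differences cancel a piecewise-constant signal except at its few jumps, so their average estimates 2·sigma^2 almost without bias, while the naive sample variance of y is badly inflated by the signal.

```python
import numpy as np

rng = np.random.default_rng(3)
n, sigma = 1000, 1.0
signal = np.where(np.arange(n) >= n // 2, 3.0, 0.0)   # one change-point of size 3
y = signal + rng.normal(scale=sigma, size=n)

# Difference-based estimator: E[(y[i+1] - y[i])^2] = 2 sigma^2 away from jumps,
# and each jump contributes only an O(1/n) bias term.
sigma2_diff = np.sum(np.diff(y) ** 2) / (2 * (n - 1))

# Naive estimator: confounds the noise variance with the signal variation.
sigma2_naive = np.var(y)

print(f"difference-based: {sigma2_diff:.2f}   naive: {sigma2_naive:.2f}   truth: {sigma**2:.2f}")
```

No pre-estimation of the regression function is needed, which is exactly the circumvention the abstract highlights; the paper's estimators extend this by choosing difference schemes whose bias and variance are optimized under m-dependence.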