Finding an unknown number of multivariate outliers
We use the forward search to provide robust Mahalanobis distances to detect the presence of outliers in a sample of multivariate normal data. Theoretical results on order statistics and on estimation in truncated samples provide the distribution of our test statistic. We also introduce several new robust distances with associated distributional results. Comparisons of our procedure with tests using other robust Mahalanobis distances show the good size and high power of our procedure. We also provide a unification of results on correction factors for estimation from truncated samples.
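As a point of reference for the distances mentioned in this abstract, the following is a minimal sketch of the classical (non-robust) squared Mahalanobis distance for bivariate data, with points flagged against a chi-square cutoff. It is not the forward search and not the authors' test: the function names, the simulated data, the planted outlier, and the 0.975 cutoff level are all assumptions made for illustration. The robust procedures described above replace the full-sample mean and covariance used here with estimates from outlier-free subsets.

```python
import math
import random

def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean,
    using the classical mean and unbiased covariance (not a robust estimate)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2  # determinant of the 2x2 covariance matrix
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # (dx, dy) S^{-1} (dx, dy)^T written out for the 2x2 case
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances

def chi2_quantile_2df(p):
    """Quantile of the chi-square distribution with 2 degrees of freedom.
    Closed form, because its CDF is 1 - exp(-x / 2)."""
    return -2.0 * math.log(1.0 - p)

random.seed(0)
# Hypothetical sample: 100 standard bivariate normal points plus one planted outlier.
data = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(100)]
data.append((8.0, -8.0))

d2 = mahalanobis_sq_2d(data)
cutoff = chi2_quantile_2df(0.975)  # assumed nominal level, roughly 7.38
flagged = [i for i, d in enumerate(d2) if d > cutoff]
print("cutoff:", round(cutoff, 3))
print("flagged indices:", flagged)
```

With a single gross outlier the classical distance still flags it, but several clustered outliers can inflate the covariance estimate enough to hide one another, which is the masking problem that motivates robust distances.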
A Parametric Framework for the Comparison of Methods of Very Robust Regression
There are several methods for obtaining very robust estimates of regression
parameters that asymptotically resist 50% of outliers in the data. Differences
in the behaviour of these algorithms depend on the distance between the
regression data and the outliers. We introduce a parameter that
defines a parametric path in the space of models and enables us to study, in a
systematic way, the properties of estimators as the groups of data move from
being far apart to close together. We examine, as a function of this parameter, the
variance and squared bias of five estimators and we also consider their power
when used in the detection of outliers. This systematic approach provides tools
for gaining knowledge and better understanding of the properties of robust
estimators. Comment: Published at http://dx.doi.org/10.1214/13-STS437 in
Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
Robust correlation analyses: false positive and power validation using a new open source Matlab toolbox
Pearson's correlation measures the strength of the association between two variables. The technique is, however, restricted to linear associations and is overly sensitive to outliers. Indeed, a single outlier can result in a highly inaccurate summary of the data. Yet, it remains the most commonly used measure of association in psychology research. Here we describe a free Matlab(R) based toolbox (http://sourceforge.net/projects/robustcorrtool/) that computes robust measures of association between two or more random variables: the percentage-bend correlation and skipped-correlations. After illustrating how to use the toolbox, we show that robust methods, where outliers are down-weighted or removed and accounted for in significance testing, provide better estimates of the true association with accurate false positive control and without loss of power. The different correlation methods were tested with normal data and normal data contaminated with marginal or bivariate outliers. We report estimates of effect size, false positive rate and power, and advise on which technique to use depending on the data at hand.
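The claim that a single outlier can distort Pearson's correlation can be illustrated with a small sketch. This is not the toolbox's code, and it does not implement the percentage-bend or skipped correlations: Spearman's rank correlation is swapped in as a simple, less outlier-sensitive stand-in, and the data values and function names are made up for illustration.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1; this sketch assumes there are no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = float(rank)
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson's r computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: a nearly perfect linear relationship.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [1.2, 1.9, 3.1, 4.2, 4.8, 6.1, 7.3, 7.9, 9.2, 9.8]
r_clean = pearson(x, y)

# Add one bivariate outlier and recompute both measures.
x_out = x + [11]
y_out = y + [-40.0]
r_out = pearson(x_out, y_out)
rho_out = spearman(x_out, y_out)

print("Pearson, clean data:   ", round(r_clean, 3))
print("Pearson, one outlier:  ", round(r_out, 3))
print("Spearman, one outlier: ", round(rho_out, 3))
```

In this example the single added point is enough to turn a strong positive Pearson correlation negative, while the rank-based measure stays positive. The robust methods described in the abstract go further by down-weighting or removing such points and accounting for that in the significance test.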