Depth weighted scatter estimators
General depth weighted scatter estimators are introduced and investigated.
For general depth functions, we find that these affine equivariant scatter
estimators are Fisher consistent and unbiased for a wide range of multivariate
distributions, and show that the sample scatter estimators are strong and
$\sqrt{n}$-consistent and asymptotically normal, and the influence functions of the
estimators exist and are bounded in general. We then concentrate on a specific
case of the general depth weighted scatter estimators, the projection depth
weighted scatter estimators, which include as a special case the well-known
Stahel-Donoho scatter estimator, whose limiting distribution had long been an
open problem until this paper. Large sample behavior, including consistency and asymptotic
normality, and efficiency and finite sample behavior, including breakdown point
and relative efficiency of the sample projection depth weighted scatter
estimators, are thoroughly investigated. The influence function and the maximum
bias of the projection depth weighted scatter estimators are derived and
examined. Unlike typical high-breakdown competitors, the projection depth
weighted scatter estimators can integrate high breakdown point and high
efficiency while enjoying a bounded-influence function and a moderate maximum
bias curve. Comparisons with leading estimators on asymptotic relative
efficiency and gross error sensitivity reveal that the projection depth
weighted scatter estimators behave very well overall and, consequently,
represent very favorable choices of affine equivariant multivariate scatter
estimators.
Comment: Published at http://dx.doi.org/10.1214/009053604000000922 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org
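As a rough illustration of the projection depth weighted scatter idea summarized above, the following Python sketch approximates the Stahel-Donoho outlyingness of 2-D points over a finite grid of directions, converts it to projection depth as 1 / (1 + outlyingness), and uses the depth itself as the weight. The function names, the direction grid, the choice of weight, and the toy data are all assumptions made for illustration; the paper's own weight functions and normalization are not reproduced here.

```python
import math
from statistics import median

def outlyingness(x, data, n_dirs=180):
    """Approximate Stahel-Donoho outlyingness of the 2-D point x.

    The supremum over all directions is replaced by a maximum over
    n_dirs equally spaced angles in [0, pi), which is an approximation.
    """
    worst = 0.0
    for k in range(n_dirs):
        angle = math.pi * k / n_dirs
        ux, uy = math.cos(angle), math.sin(angle)
        proj = [ux * px + uy * py for px, py in data]
        med = median(proj)
        mad = median(abs(v - med) for v in proj)
        if mad == 0.0:
            continue  # degenerate direction, skipped in this sketch
        worst = max(worst, abs(ux * x[0] + uy * x[1] - med) / mad)
    return worst

def projection_depth(x, data):
    """Projection depth: 1 / (1 + outlyingness)."""
    return 1.0 / (1.0 + outlyingness(x, data))

def depth_weighted_scatter(data):
    """Depth weighted mean and scatter; the weight is the depth itself (assumed)."""
    w = [projection_depth(p, data) for p in data]
    total = sum(w)
    mx = sum(wi * p[0] for wi, p in zip(w, data)) / total
    my = sum(wi * p[1] for wi, p in zip(w, data)) / total
    sxx = sum(wi * (p[0] - mx) ** 2 for wi, p in zip(w, data)) / total
    syy = sum(wi * (p[1] - my) ** 2 for wi, p in zip(w, data)) / total
    sxy = sum(wi * (p[0] - mx) * (p[1] - my) for wi, p in zip(w, data)) / total
    return w, (mx, my), [[sxx, sxy], [sxy, syy]]

# Hypothetical toy data: five central points and one gross outlier.
data = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0), (50.0, 50.0)]
depths, center, scatter = depth_weighted_scatter(data)
```

In this toy data the outlier receives a much smaller depth than the central points, so it contributes little to the weighted mean and scatter.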
On Weighted Multivariate Sign Functions
Multivariate sign functions are often used for robust estimation and
inference. We propose using data dependent weights in association with such
functions. The proposed weighted sign functions retain desirable robustness
properties, while significantly improving efficiency in estimation and
inference compared to unweighted multivariate sign-based methods. Using
weighted signs, we demonstrate methods of robust location estimation and robust
principal component analysis. We extend the scope of using robust multivariate
methods to include robust sufficient dimension reduction and functional outlier
detection. Several numerical studies and real data applications demonstrate the
efficacy of the proposed methodology.
Comment: Keywords: Multivariate sign, Principal component analysis, Data
depth, Sufficient dimension reductio
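For orientation, the unweighted multivariate (spatial) sign that such methods build on maps each observation to the unit vector pointing from a center towards it. The Python sketch below computes spatial signs and the resulting sign covariance matrix. The data-dependent weights proposed in the paper are not specified in the abstract, so they are left out; the function names and the toy data are assumptions for illustration only.

```python
import math

def spatial_sign(x, center):
    """Return (x - center) / ||x - center||, or the zero vector at the center."""
    diff = [xi - ci for xi, ci in zip(x, center)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        return [0.0] * len(diff)
    return [d / norm for d in diff]

def sign_covariance(points, center):
    """Average outer product of the (unweighted) spatial signs."""
    p = len(center)
    acc = [[0.0] * p for _ in range(p)]
    for x in points:
        s = spatial_sign(x, center)
        for i in range(p):
            for j in range(p):
                acc[i][j] += s[i] * s[j]
    n = len(points)
    return [[v / n for v in row] for row in acc]

# Hypothetical toy data: the distant point (0, -100) only contributes a direction.
points = [[1.0, 0.0], [0.0, 2.0], [-3.0, 0.0], [0.0, -100.0]]
scm = sign_covariance(points, [0.0, 0.0])
```

Because every observation is reduced to a direction, the distant point has the same influence on `scm` as any other point, which is the source of the robustness (and of the efficiency loss that weighting aims to reduce).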
Nonparametrically consistent depth-based classifiers
We introduce a class of depth-based classification procedures that are of a
nearest-neighbor nature. Depth, after symmetrization, indeed provides the
center-outward ordering that is necessary and sufficient to define nearest
neighbors. Like all their depth-based competitors, the resulting classifiers
are affine-invariant, hence in particular are insensitive to unit changes.
Unlike the former, however, the latter achieve Bayes consistency under
virtually any absolutely continuous distribution - a concept we call
nonparametric consistency, to stress the difference with the stronger universal
consistency of the standard NN classifiers. We investigate the finite-sample
performances of the proposed classifiers through simulations and show that they
outperform affine-invariant nearest-neighbor classifiers obtained through an
obvious standardization construction. We illustrate the practical value of our
classifiers on two real data examples. Finally, we briefly discuss the possible
uses of our depth-based neighbors in other inference problems.
Comment: Published at http://dx.doi.org/10.3150/13-BEJ561 in the Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
Evaluation of the Fourth Millennium Development Goal Realisation Using Robust and Nonparametric Tools Offered by Data Depth Concept
We briefly communicate the results of a nonparametric and robust evaluation of
the effects of the Fourth Millennium Development Goal of the United Nations.
The main aim of the goal was to reduce by two thirds, between 1990 and 2015, the
under-five child mortality rate. Our novel analysis was conducted by means of the
powerful and user-friendly tools offered by the Data Depth Concept, a
collection of multivariate techniques based on multivariate generalizations
of quantiles, ranges and order statistics. The results of our analysis are more
convincing than results obtained using classical statistical tools.
Comment: The paper is based on a poster submitted to the IASC 2014 Data
Competition - the poster was the runner-up (the second place
Semiparametric Inference and Lower Bounds for Real Elliptically Symmetric Distributions
This paper has a twofold goal. The first aim is to provide a deeper
understanding of the family of the Real Elliptically Symmetric (RES)
distributions by investigating their intrinsic semiparametric nature. The
second aim is to derive a semiparametric lower bound for the estimation of the
parametric component of the model. The RES distributions represent a
semiparametric model where the parametric part is given by the mean vector and
by the scatter matrix while the non-parametric, infinite-dimensional, part is
represented by the density generator. Since, in practical applications, we are
often interested only in the estimation of the parametric component, the
density generator can be considered as nuisance. The first part of the paper is
dedicated to placing the RES distributions in the framework of
semiparametric group models. In the second part of the paper, building on the
mathematical tools previously introduced, the Constrained Semiparametric
Cramér-Rao Bound (CSCRB) for the estimation of the mean vector and of the
constrained scatter matrix of a RES distributed random vector is introduced.
The CSCRB provides a lower bound on the Mean Squared Error (MSE) of any robust
$M$-estimator of mean vector and scatter matrix when no a-priori information on
the density generator is available. A closed form expression for the CSCRB is
derived. Finally, in simulations, we assess the statistical efficiency of
Tyler's and Huber's scatter matrix $M$-estimators with respect to the CSCRB.
Comment: This paper has been accepted for publication in IEEE Transactions on
Signal Processin
The APM Galaxy Survey III: An Analysis of Systematic Errors in the Angular Correlation Function and Cosmological Implications
We present measurements of the angular two-point galaxy correlation function,
$w(\theta)$, from the APM Galaxy Survey. The performance of various estimators
of $w(\theta)$ is assessed using simulated galaxy catalogues and analytic arguments.
Several error analyses show that residual plate-to-plate errors do not bias our
estimates of $w(\theta)$ by more than [...]. Direct comparison between our
photometry and external CCD photometry of over 13,000 galaxies from the Las
Campanas Deep Redshift Survey shows that the rms error in the APM plate zero
points lies in the range 0.04-0.05 magnitudes, in agreement with our previous
estimates. We estimate the effects on $w(\theta)$ of atmospheric extinction and
obscuration by dust in our Galaxy and conclude that these are negligible. We
use our best estimates of the systematic errors in the survey to calculate
corrected estimates of $w(\theta)$. Deep redshift surveys are used to determine the
selection function of the APM Galaxy Survey, and this is applied in Limber's
equation to compute how $w(\theta)$ scales as a function of limiting magnitude. Our
estimates of $w(\theta)$ are in excellent agreement with the scaling relation,
providing further evidence that systematic errors in the APM survey are small.
We explicitly remove large-scale structure by applying filters to the APM
galaxy maps and conclude that there is still strong evidence for more
clustering at large scales than predicted by the standard scale-invariant cold
dark matter (CDM) model. We compare the APM $w(\theta)$ and the three-dimensional power
spectrum derived by inverting $w(\theta)$ with the predictions of scale-invariant CDM
models. We show that the observations require [...] in the range
0.2-0.3 and are incompatible with the value [...] of the standard CDM
model.
Comment: 102 pages, plain TeX plus 41 postscript figures. Submitted to MNRA
On Cross-correlating Weak Lensing Surveys
The present generation of weak lensing surveys will be superseded by surveys
run from space with much better sky coverage and a high signal-to-noise
ratio, such as SNAP. However, removal of any systematics or noise will remain a
major cause of concern for any weak lensing survey. One of the best ways of
spotting any undetected source of systematic noise is to compare surveys which
probe the same part of the sky. In this paper we study various measures which
are useful in cross correlating weak lensing surveys with diverse survey
strategies. Using two different statistics - the shear components and the
aperture mass - we construct a class of estimators which encode such
cross-correlations. These techniques will also be useful in studies where the
entire source population from a specific survey can be divided into various
redshift bins to study cross correlations among them. We perform a detailed
study of the angular size dependence and redshift dependence of these
observables and of their sensitivity to the background cosmology. We find that
one-point and two-point statistics provide complementary tools which allow one
to constrain cosmological parameters and to obtain a simple estimate of the
noise of the survey.
Comment: 17 pages, 9 Figures, Submitted to MNRA