4,954 research outputs found

    Depth weighted scatter estimators

    General depth weighted scatter estimators are introduced and investigated. For general depth functions, we find that these affine equivariant scatter estimators are Fisher consistent and unbiased for a wide range of multivariate distributions, and show that the sample scatter estimators are strongly and $\sqrt{n}$-consistent and asymptotically normal, and that the influence functions of the estimators exist and are bounded in general. We then concentrate on a specific case of the general depth weighted scatter estimators, the projection depth weighted scatter estimators, which include as a special case the well-known Stahel-Donoho scatter estimator, whose limiting distribution had long been open until this paper. Large sample behavior, including consistency and asymptotic normality, and efficiency and finite sample behavior, including breakdown point and relative efficiency of the sample projection depth weighted scatter estimators, are thoroughly investigated. The influence function and the maximum bias of the projection depth weighted scatter estimators are derived and examined. Unlike typical high-breakdown competitors, the projection depth weighted scatter estimators can integrate high breakdown point and high efficiency while enjoying a bounded-influence function and a moderate maximum bias curve. Comparisons with leading estimators on asymptotic relative efficiency and gross error sensitivity reveal that the projection depth weighted scatter estimators behave very well overall and, consequently, represent very favorable choices of affine equivariant multivariate scatter estimators. Comment: Published at http://dx.doi.org/10.1214/009053604000000922 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
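
    The idea of a projection depth weighted scatter estimator can be illustrated with a minimal sketch. It is not the paper's implementation and rests on several assumptions: the data are 2-D, the supremum over directions is approximated on a finite grid of angles (`n_dirs`), outlyingness is measured with the median and MAD of each projection, depth is taken as 1 / (1 + outlyingness), and the weight function defaults to the depth itself. All function and parameter names are hypothetical.

```python
import math
import statistics


def projection_depths(data, n_dirs=180):
    """Approximate projection depth of each 2-D point in `data`.

    Assumption: the supremum over unit directions is replaced by a
    maximum over `n_dirs` equally spaced angles in [0, pi).
    """
    outlyingness = [0.0] * len(data)
    for k in range(n_dirs):
        angle = math.pi * k / n_dirs
        ux, uy = math.cos(angle), math.sin(angle)
        proj = [ux * x + uy * y for x, y in data]
        med = statistics.median(proj)
        mad = statistics.median([abs(p - med) for p in proj])
        if mad == 0.0:
            continue  # degenerate direction; simply skipped in this sketch
        for i, p in enumerate(proj):
            outlyingness[i] = max(outlyingness[i], abs(p - med) / mad)
    return [1.0 / (1.0 + o) for o in outlyingness]


def depth_weighted_scatter(data, weight=lambda depth: depth):
    """Depth weighted mean and scatter matrix of 2-D data.

    Assumption: `weight` maps a depth in (0, 1] to a non-negative weight;
    the identity used here is only a placeholder choice.
    """
    depths = projection_depths(data)
    w = [weight(d) for d in depths]
    total = sum(w)
    mx = sum(wi * x for wi, (x, _) in zip(w, data)) / total
    my = sum(wi * y for wi, (_, y) in zip(w, data)) / total
    sxx = sum(wi * (x - mx) ** 2 for wi, (x, _) in zip(w, data)) / total
    syy = sum(wi * (y - my) ** 2 for wi, (_, y) in zip(w, data)) / total
    sxy = sum(wi * (x - mx) * (y - my) for wi, (x, y) in zip(w, data)) / total
    return (mx, my), [[sxx, sxy], [sxy, syy]]


# A 5 x 5 grid of points plus one gross outlier.
grid = [(float(x), float(y)) for x in range(-2, 3) for y in range(-2, 3)]
sample = grid + [(50.0, 50.0)]
center, scatter = depth_weighted_scatter(sample)
```

    In this toy sample the outlier receives the smallest depth, so it contributes far less to `center` and `scatter` than it would to the ordinary mean and covariance. A weight function that decays faster in low depth would downweight it further; that choice is left open here.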

    On Weighted Multivariate Sign Functions

    Multivariate sign functions are often used for robust estimation and inference. We propose using data dependent weights in association with such functions. The proposed weighted sign functions retain desirable robustness properties, while significantly improving efficiency in estimation and inference compared to unweighted multivariate sign-based methods. Using weighted signs, we demonstrate methods of robust location estimation and robust principal component analysis. We extend the scope of using robust multivariate methods to include robust sufficient dimension reduction and functional outlier detection. Several numerical studies and real data applications demonstrate the efficacy of the proposed methodology. Comment: Keywords: Multivariate sign, Principal component analysis, Data depth, Sufficient dimension reduction
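
    As a rough illustration of the objects involved, the sketch below computes the usual spatial sign of a point (the unit vector pointing from a center to the point) and scales it by a supplied weight. The abstract does not state how the data dependent weights are constructed, so the weight is passed in as a plain number here; the function names and that interface are assumptions made for illustration only.

```python
import math


def spatial_sign(x, center):
    """Unit vector from `center` towards `x`; the zero vector if they coincide."""
    diff = [a - b for a, b in zip(x, center)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        return [0.0] * len(diff)
    return [d / norm for d in diff]


def weighted_sign(x, center, weight):
    """Spatial sign scaled by a caller-supplied weight (hypothetical interface)."""
    return [weight * s for s in spatial_sign(x, center)]


def average_weighted_sign(points, center, weights):
    """Mean of the weighted signs of `points` about `center`."""
    total = [0.0] * len(center)
    for x, w in zip(points, weights):
        for j, s in enumerate(weighted_sign(x, center, w)):
            total[j] += s
    return [t / len(points) for t in total]
```

    With all weights equal to one this reduces to the unweighted spatial sign; a center at which the average sign vanishes is the classical spatial median, which gives one way to read the location estimation mentioned above.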

    Nonparametrically consistent depth-based classifiers

    We introduce a class of depth-based classification procedures that are of a nearest-neighbor nature. Depth, after symmetrization, indeed provides the center-outward ordering that is necessary and sufficient to define nearest neighbors. Like all their depth-based competitors, the resulting classifiers are affine-invariant, hence in particular are insensitive to unit changes. Unlike the former, however, the latter achieve Bayes consistency under virtually any absolutely continuous distributions - a concept we call nonparametric consistency, to stress the difference with the stronger universal consistency of the standard $k$NN classifiers. We investigate the finite-sample performances of the proposed classifiers through simulations and show that they outperform affine-invariant nearest-neighbor classifiers obtained through an obvious standardization construction. We illustrate the practical value of our classifiers on two real data examples. Finally, we briefly discuss the possible uses of our depth-based neighbors in other inference problems. Comment: Published at http://dx.doi.org/10.3150/13-BEJ561 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

    Evaluation of the Fourth Millennium Development Goal Realisation Using Robust and Nonparametric Tools Offered by Data Depth Concept

    We briefly communicate results of a nonparametric and robust evaluation of the effects of \emph{the Fourth Millennium Development Goal of the United Nations}. The main aim of the goal was to reduce by two thirds, between 1990 and 2015, the under-five child mortality rate. Our novel analysis was conducted by means of the very powerful and user-friendly tools offered by the \emph{Data Depth Concept}, a collection of multivariate techniques based on multivariate generalizations of quantiles, ranges and order statistics. The results of our analysis are more convincing than results obtained using classical statistical tools. Comment: The paper is based on a poster submitted to the IASC 2014 Data Competition - the poster was the runner-up (the second place)

    Semiparametric Inference and Lower Bounds for Real Elliptically Symmetric Distributions

    This paper has a twofold goal. The first aim is to provide a deeper understanding of the family of the Real Elliptically Symmetric (RES) distributions by investigating their intrinsic semiparametric nature. The second aim is to derive a semiparametric lower bound for the estimation of the parametric component of the model. The RES distributions represent a semiparametric model where the parametric part is given by the mean vector and by the scatter matrix, while the non-parametric, infinite-dimensional part is represented by the density generator. Since, in practical applications, we are often interested only in the estimation of the parametric component, the density generator can be considered a nuisance. The first part of the paper is dedicated to placing the RES distributions in the framework of the semiparametric group models. In the second part of the paper, building on the mathematical tools previously introduced, the Constrained Semiparametric Cram\'{e}r-Rao Bound (CSCRB) for the estimation of the mean vector and of the constrained scatter matrix of a RES distributed random vector is introduced. The CSCRB provides a lower bound on the Mean Squared Error (MSE) of any robust $M$-estimator of mean vector and scatter matrix when no a-priori information on the density generator is available. A closed form expression for the CSCRB is derived. Finally, in simulations, we assess the statistical efficiency of Tyler's and Huber's scatter matrix $M$-estimators with respect to the CSCRB. Comment: This paper has been accepted for publication in IEEE Transactions on Signal Processing

    The APM Galaxy Survey III: An Analysis of Systematic Errors in the Angular Correlation Function and Cosmological Implications

    We present measurements of the angular two-point galaxy correlation function, $w(\theta)$, from the APM Galaxy Survey. The performance of various estimators of $w$ is assessed using simulated galaxy catalogues and analytic arguments. Several error analyses show that residual plate-to-plate errors do not bias our estimates of $w$ by more than $10^{-3}$. Direct comparison between our photometry and external CCD photometry of over 13,000 galaxies from the Las Campanas Deep Redshift Survey shows that the rms error in the APM plate zero points lies in the range 0.04-0.05 magnitudes, in agreement with our previous estimates. We estimate the effects on $w$ of atmospheric extinction and obscuration by dust in our Galaxy and conclude that these are negligible. We use our best estimates of the systematic errors in the survey to calculate corrected estimates of $w$. Deep redshift surveys are used to determine the selection function of the APM Galaxy Survey, and this is applied in Limber's equation to compute how $w$ scales as a function of limiting magnitude. Our estimates of $w$ are in excellent agreement with the scaling relation, providing further evidence that systematic errors in the APM survey are small. We explicitly remove large-scale structure by applying filters to the APM galaxy maps and conclude that there is still strong evidence for more clustering at large scales than predicted by the standard scale-invariant cold dark matter (CDM) model. We compare the APM $w$ and the three dimensional power spectrum derived by inverting $w$, with the predictions of scale-invariant CDM models. We show that the observations require $\Gamma=\Omega_0 h$ in the range 0.2-0.3 and are incompatible with the value $\Gamma=0.5$ of the standard CDM model. Comment: 102 pages, plain TeX plus 41 postscript figures. Submitted to MNRAS

    On Cross-correlating Weak Lensing Surveys

    The present generation of weak lensing surveys will be superseded by surveys run from space, such as SNAP, with much better sky coverage and a high signal-to-noise ratio. However, removal of any systematics or noise will remain a major cause of concern for any weak lensing survey. One of the best ways of spotting any undetected source of systematic noise is to compare surveys which probe the same part of the sky. In this paper we study various measures which are useful in cross-correlating weak lensing surveys with diverse survey strategies. Using two different statistics - the shear components and the aperture mass - we construct a class of estimators which encode such cross-correlations. These techniques will also be useful in studies where the entire source population from a specific survey can be divided into various redshift bins to study cross-correlations among them. We perform a detailed study of the angular size dependence and redshift dependence of these observables and of their sensitivity to the background cosmology. We find that one-point and two-point statistics provide complementary tools which allow one to constrain cosmological parameters and to obtain a simple estimate of the noise of the survey. Comment: 17 pages, 9 Figures, Submitted to MNRAS