33 research outputs found

    An optimal transportation approach for assessing almost stochastic order

    When stochastic dominance $F\leq_{st}G$ does not hold, we can improve agreement to stochastic order by suitably trimming both distributions. In this work we consider the $L_2$-Wasserstein distance, $\mathcal W_2$, to stochastic order of these trimmed versions. Our characterization for that distance naturally leads to consider a $\mathcal W_2$-based index of disagreement with stochastic order, $\varepsilon_{\mathcal W_2}(F,G)$. We provide asymptotic results allowing to test $H_0: \varepsilon_{\mathcal W_2}(F,G)\geq \varepsilon_0$ vs $H_a: \varepsilon_{\mathcal W_2}(F,G)<\varepsilon_0$, that, under rejection, would give statistical guarantee of almost stochastic dominance. We include a simulation study showing a good performance of the index under the normal model.

    Models for the Assessment of Treatment Improvement: The Ideal and the Feasible

    Comparisons of different treatments or production processes are the goals of a significant fraction of applied research. Unsurprisingly, two-sample problems play a main role in statistics through natural questions such as: Is the new treatment significantly better than the old? However, this is only partially answered by some of the usual statistical tools for this task. More importantly, practitioners are often not aware of the real meaning behind these statistical procedures. We analyze these troubles from the point of view of the order between distributions, the stochastic order, showing evidence of the limitations of the usual approaches and paying special attention to the classical comparison of means under the normal model. We discuss the unfeasibility of statistically proving stochastic dominance, but show that it is possible, instead, to gather statistical evidence to conclude that slightly relaxed versions of stochastic dominance hold. Research partially supported by the Spanish Ministerio de Economía y Competitividad and FEDER funds, grants MTM2014-56235-C2-1-P and MTM2014-56235-C2-2, and by Consejería de Educación de la Junta de Castilla y León, grant VA212U13.

    Letter to the editor

    This letter shows how the main result contained in a paper that recently appeared in the Journal of Multivariate Analysis was in fact a particular case of a more general theorem published three years before.

    Wide consensus aggregation in the Wasserstein space. Application to location-scatter families

    We introduce a general theory for a consensus-based combination of estimations of probability measures. Potential applications include parallelized or distributed sampling schemes as well as variations on aggregation from resampling techniques like boosting or bagging. Taking into account the possibility of very discrepant estimations, instead of a full consensus we consider a "wide consensus" procedure. The approach is based on the consideration of trimmed barycenters in the Wasserstein space of probability measures. We provide general existence and consistency results as well as suitable properties of these robustified Fréchet means. In order to get quick applicability, we also include characterizations of barycenters of probabilities that belong to (not necessarily elliptical) location and scatter families. For these families, we provide an iterative algorithm for the effective computation of trimmed barycenters, based on a consistent algorithm for computing barycenters, guaranteeing applicability in a wide setting of statistical problems.

    Robustness and Outliers

    Unexpected deviations from assumed models as well as the presence of certain amounts of outlying data are common in most practical statistical applications. This fact could lead to undesirable solutions when applying non-robust statistical techniques. This is often the case in cluster analysis, too. The search for homogeneous groups with large heterogeneity between them can be spoiled due to the lack of robustness of standard clustering methods. For instance, the presence of (even a few) outlying observations may result in heterogeneous clusters artificially joined together or in the detection of spurious clusters merely made up of outlying observations. In this chapter we will analyze the effects of different kinds of outlying data in cluster analysis and explore several alternative methodologies designed to avoid or minimize their undesirable effects. Funding: Ministerio de Economía, Industria y Competitividad (MTM2014-56235-C2-1-P); Junta de Castilla y León (research project support program, Ref. VA212U13).