33 research outputs found
An optimal transportation approach for assessing almost stochastic order
When stochastic dominance does not hold, we can improve agreement with
stochastic order by suitably trimming both distributions. In this work we
consider the Wasserstein distance to stochastic order of these trimmed
versions. Our characterization of that distance naturally leads to a
Wasserstein-based index of disagreement with stochastic order. We provide
asymptotic results that allow a hypothesis test which, under rejection of the
null, would give a statistical guarantee of almost stochastic dominance. We
include a simulation study showing good performance of the index under the
normal model.
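
As a rough illustration of a Wasserstein-based index of disagreement with stochastic order, the sketch below computes, for two samples, the share of the squared L2-Wasserstein distance that comes from quantile levels where the first sample's quantile exceeds the second's. The formula, the function name `w2_disagreement_index`, and the `grid_size` parameter are assumptions made for this example; the abstract does not state the index explicitly, so this should be read as one plausible form rather than the authors' definition.

```python
import numpy as np

def w2_disagreement_index(x, y, grid_size=1000):
    """Assumed index: fraction of the squared W2 distance between the
    empirical distributions of x and y that is due to quantile levels
    where the quantile of x exceeds the quantile of y."""
    # Midpoint grid on (0, 1), so the extreme levels 0 and 1 are avoided.
    t = (np.arange(grid_size) + 0.5) / grid_size
    qx = np.quantile(x, t)
    qy = np.quantile(y, t)
    diff = qx - qy
    # Grid approximation of the squared W2 distance (integral of diff^2).
    total = np.mean(diff ** 2)
    if total == 0.0:
        return 0.0  # identical quantile functions: nothing to disagree on
    # Only the positive part of diff violates "x stochastically below y".
    violation = np.mean(np.clip(diff, 0.0, None) ** 2)
    return violation / total
```

Under this assumed definition the value lies in [0, 1]: 0 when the first sample is stochastically smaller than the second at every grid level, and 1 when the order is fully reversed.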
Models for the Assessment of Treatment Improvement: The Ideal and the Feasible
Comparisons of different treatments or production processes are the goal of a significant fraction of applied research. Unsurprisingly, two-sample problems play a main role in statistics through natural questions such as: is the new treatment significantly better than the old one? However, this question is only partially answered by some of the usual statistical tools for the task. More importantly, practitioners are often unaware of the real meaning behind these statistical procedures. We analyze these difficulties from the point of view of the order between distributions, the stochastic order, showing evidence of the limitations of the usual approaches and paying special attention to the classical comparison of means under the normal model. We discuss the unfeasibility of statistically proving stochastic dominance, but show that it is possible, instead, to gather statistical evidence to conclude that slightly relaxed versions of stochastic dominance hold.
Research partially supported by the Spanish Ministerio de Economía y Competitividad and FEDER funds, grants MTM2014-56235-C2-1-P and MTM2014-56235-C2-2, and by Consejería de Educación de la Junta de Castilla y León, grant VA212U13.
Letter to the editor
This letter shows that the main result of a paper that recently appeared in the Journal of Multivariate Analysis is in fact a particular case of a more general theorem published three years earlier.
Wide consensus aggregation in the Wasserstein space. Application to location-scatter families
We introduce a general theory for a consensus-based combination of estimations of probability measures. Potential applications include parallelized or distributed sampling schemes as well as variations on aggregation from resampling techniques like boosting or bagging. Taking into account the possibility of very discrepant estimations, instead of a full consensus we consider a "wide consensus" procedure. The approach is based on trimmed barycenters in the Wasserstein space of probability measures. We provide general existence and consistency results as well as suitable properties of these robustified Fréchet means. To allow quick applicability, we also include characterizations of barycenters of probabilities that belong to (not necessarily elliptical) location and scatter families. For these families, we provide an iterative algorithm for the effective computation of trimmed barycenters, based on a consistent algorithm for computing barycenters, guaranteeing applicability in a wide range of statistical problems.
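
To make the idea of a trimmed barycenter concrete, here is a minimal sketch for the simplest location-scale case: one-dimensional normal distributions. It relies on two standard facts, that the squared L2-Wasserstein distance between N(m1, s1²) and N(m2, s2²) is (m1 − m2)² + (s1 − s2)², and that the equal-weight barycenter of such normals is the normal with the averaged mean and averaged standard deviation. The trimming step (discarding the most distant measures and re-averaging until the centre stabilizes) is a simple heuristic assumed for this example, not the iterative algorithm of the paper; the function name and the `alpha` and `n_iter` parameters are likewise hypothetical.

```python
import numpy as np

def trimmed_barycenter_1d_normals(means, sds, alpha=0.2, n_iter=50):
    """Heuristic trimmed W2 barycenter of 1-D normals N(means[i], sds[i]^2).

    Returns (mean, sd, kept), where kept holds the indices of the
    measures that were not trimmed. Illustrative assumption only.
    """
    # Each normal is represented by the point (mean, sd); W2 between two
    # 1-D normals equals the Euclidean distance between these points.
    params = np.column_stack([means, sds]).astype(float)
    n = len(params)
    keep = n - int(np.floor(alpha * n))   # number of measures retained
    center = np.median(params, axis=0)    # robust starting point
    for _ in range(n_iter):
        dist2 = np.sum((params - center) ** 2, axis=1)  # squared W2 to center
        kept = np.sort(np.argsort(dist2, kind="stable")[:keep])
        new_center = params[kept].mean(axis=0)  # barycenter of retained set
        if np.allclose(new_center, center):
            break
        center = new_center
    return center[0], center[1], kept
```

For instance, with four estimates close to N(0, 1) and one far-off estimate such as N(10, 25), a trimming level of `alpha=0.25` would be expected to discard the discrepant one, so the "wide consensus" reflects only the four agreeing estimates.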
Robustness and Outliers
Unexpected deviations from assumed models, as well as the presence of certain amounts of outlying data, are common in most practical statistical applications. This can lead to undesirable solutions when non-robust statistical techniques are applied, and cluster analysis is no exception. The search for homogeneous groups with large heterogeneity between them can be spoiled by the lack of robustness of standard clustering methods. For instance, the presence of even a few outlying observations may result in heterogeneous clusters being artificially joined together, or in the detection of spurious clusters made up merely of outlying observations. In this chapter we analyze the effects of different kinds of outlying data in cluster analysis and explore several alternative methodologies designed to avoid or minimize their undesirable effects.
Funding: Ministerio de Economía, Industria y Competitividad (MTM2014-56235-C2-1-P); Junta de Castilla y León (research project support programme, Ref. VA212U13).