26 research outputs found

    Supervised classification for a family of Gaussian functional models

    In the framework of supervised classification (discrimination) for functional data, it is shown that the optimal classification rule can be explicitly obtained for a class of Gaussian processes with "triangular" covariance functions. This explicit knowledge has two practical consequences. First, the consistency of the well-known nearest neighbors classifier (which is not guaranteed in problems with functional data) is established for the indicated class of processes. Second, and more important, parametric and nonparametric plug-in classifiers can be obtained by estimating the unknown elements in the optimal rule. The performance of these new plug-in classifiers is checked, with positive results, through a simulation study and a real data example.
    Comment: 30 pages, 6 figures, 2 tables.

    Anthropocene Geomorphic Change. Climate or Human Activities?

    An analysis of the evolution of sedimentation rates and disasters caused by surface geologic processes during the last century, at a global scale, is presented. Results show that erosion/sedimentation processes and the frequency of such disasters increased substantially, especially after the mid-twentieth century, coinciding with the period of intense change known as the "Great Acceleration." Increases for this type of disaster are significantly greater than for other disasters related to natural processes, reaching about 1 order of magnitude in little more than half a century. This implies an important "global geomorphic change." Comparisons and correlations between the changes observed in those processes and potential natural (rainfall) and human (degree of land surface transformation) drivers showed a strong relationship with the latter and a less clear one with the former. This suggests that the intensification of surface geologic processes is most likely due more to a coupling between land transformation and geomorphic processes than to one between climate and geomorphic processes.
    Funding was provided by projects: CAMGEO CGL2006–11341, Spain; PICT2011–1685, Argentina; MTM2014–56235‐C2–2 and CGL2017–82703‐R, Spain.

    A parametric registration model for warped distributions with Wasserstein’s distance.

    We consider a parametric deformation model for distributions. More precisely, we assume we observe J samples of random variables which are warped from an unknown distribution template. In this paper we tackle the problem of estimating the individual deformation parameters. For this, we construct a registration criterion based on the Wasserstein distance to quantify the alignment of the distributions. We prove consistency of the empirical estimators.
    Funding: Junta de Castilla y León (research project support programme, Ref. VA212U13); Ministerio de Economía, Industria y Competitividad (MTM2011-28657-C02-01); Ministerio de Economía, Industria y Competitividad (TM2011-28657-C02-02).
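As a rough illustration of a Wasserstein-based registration criterion, the sketch below handles only the simplest case one might assume: one-dimensional samples of equal size and a pure location shift as the deformation. The function names (`w2_empirical`, `estimate_shift`), the grid search, and the simulated data are all hypothetical choices made for illustration; they are not taken from the paper, whose deformation model and estimator are more general.

```python
import random


def w2_empirical(x, y):
    """2-Wasserstein distance between two equal-size 1-D samples.

    In one dimension the optimal coupling matches order statistics,
    so the distance is the root mean squared gap between sorted values.
    """
    if len(x) != len(y):
        raise ValueError("this sketch assumes equal sample sizes")
    xs, ys = sorted(x), sorted(y)
    return (sum((a - b) ** 2 for a, b in zip(xs, ys)) / len(xs)) ** 0.5


def estimate_shift(sample, reference, grid):
    """Return the shift on `grid` that best aligns `sample` with `reference`.

    Assumption: the deformation is a pure translation (hypothetical model).
    """
    return min(
        grid,
        key=lambda t: w2_empirical([v - t for v in sample], reference),
    )


rng = random.Random(0)
template = [rng.gauss(0.0, 1.0) for _ in range(500)]
# A second sample from the same template, warped by a shift of 2.0.
warped = [rng.gauss(0.0, 1.0) + 2.0 for _ in range(500)]
grid = [i / 10 for i in range(-50, 51)]
shift_hat = estimate_shift(warped, template, grid)
```

With a shift-only model the criterion is quadratic in the shift, so the grid search is used here only to keep the link to "minimize a Wasserstein criterion" visible; richer deformation families would need a proper optimizer.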

    Models for the Assessment of Treatment Improvement: The Ideal and the Feasible

    Comparisons of different treatments or production processes are the goals of a significant fraction of applied research. Unsurprisingly, two-sample problems play a central role in statistics through natural questions such as: is the new treatment significantly better than the old? However, this is only partially answered by some of the usual statistical tools for this task. More importantly, practitioners are often not aware of the real meaning behind these statistical procedures. We analyze these troubles from the point of view of the order between distributions, the stochastic order, showing evidence of the limitations of the usual approaches and paying special attention to the classical comparison of means under the normal model. We discuss the unfeasibility of statistically proving stochastic dominance, but show that it is possible, instead, to gather statistical evidence to conclude that slightly relaxed versions of stochastic dominance hold.
    Research partially supported by the Spanish Ministerio de Economía y Competitividad and FEDER funds, grants MTM2014-56235-C2-1-P and MTM2014-56235-C2-2, and by Consejería de Educación de la Junta de Castilla y León, grant VA212U13.

    Wide consensus aggregation in the Wasserstein space. Application to location-scatter families

    We introduce a general theory for a consensus-based combination of estimations of probability measures. Potential applications include parallelized or distributed sampling schemes as well as variations on aggregation from resampling techniques like boosting or bagging. Taking into account the possibility of very discrepant estimations, instead of a full consensus we consider a "wide consensus" procedure. The approach is based on the consideration of trimmed barycenters in the Wasserstein space of probability measures. We provide general existence and consistency results as well as suitable properties of these robustified Fréchet means. In order to get quick applicability, we also include characterizations of barycenters of probabilities that belong to (not necessarily elliptical) location and scatter families. For these families, we provide an iterative algorithm for the effective computation of trimmed barycenters, based on a consistent algorithm for computing barycenters, guaranteeing applicability in a wide setting of statistical problems.
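To make the idea of a trimmed barycenter concrete, here is a minimal sketch restricted to one-dimensional Gaussian estimates, where the 2-Wasserstein barycenter has a closed form (average the means and average the standard deviations). The trimming loop is a crude concentration-step heuristic assumed for illustration only: it discards whole estimates and is not the iterative algorithm of the paper. All function names and the example data are hypothetical.

```python
from math import hypot


def w2_gauss_1d(p, q):
    """W2 distance between N(m1, s1^2) and N(m2, s2^2), each given as (mean, sd)."""
    return hypot(p[0] - q[0], p[1] - q[1])


def barycenter_gauss_1d(params, weights=None):
    """W2 barycenter of 1-D Gaussians: weighted average of means and of sds."""
    n = len(params)
    weights = weights or [1.0 / n] * n
    mean = sum(w * m for w, (m, _) in zip(weights, params))
    sd = sum(w * s for w, (_, s) in zip(weights, params))
    return mean, sd


def trimmed_barycenter_sketch(params, alpha, max_iter=50):
    """Heuristic trimming (assumption, not the paper's method).

    Repeatedly keep the estimates closest to the current barycenter,
    discarding a fraction `alpha`, until the kept set stops changing.
    """
    keep_n = len(params) - int(alpha * len(params))
    kept = list(range(len(params)))
    for _ in range(max_iter):
        center = barycenter_gauss_1d([params[i] for i in kept])
        order = sorted(
            range(len(params)),
            key=lambda i: w2_gauss_1d(params[i], center),
        )
        new_kept = sorted(order[:keep_n])
        if new_kept == kept:
            break
        kept = new_kept
    # Recompute so the returned center always matches the returned kept set.
    return barycenter_gauss_1d([params[i] for i in kept]), kept


# Four hypothetical (mean, sd) estimates; the last one is very discrepant.
estimates = [(0.0, 1.0), (0.2, 1.1), (-0.1, 0.9), (10.0, 5.0)]
center, kept = trimmed_barycenter_sketch(estimates, alpha=0.25)
```

In this toy example the discrepant fourth estimate is dropped and the consensus is formed from the remaining three, which is the behaviour a "wide consensus" is meant to capture.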

    Goodness-of-fit tests for the functional linear model based on randomly projected empirical processes

    We consider marked empirical processes indexed by a randomly projected functional covariate to construct goodness-of-fit tests for the functional linear model with scalar response. The test statistics are built from continuous functionals over the projected process, resulting in computationally efficient tests that exhibit root-n convergence rates and circumvent the curse of dimensionality. The weak convergence of the empirical process is obtained conditionally on a random direction, whilst the almost sure equivalence between testing for significance on the original and on the projected functional covariate is proved. The computation of the test in practice involves calibration by wild bootstrap resampling and the combination of several p-values, arising from different projections, by means of the false discovery rate method. The finite sample properties of the tests are illustrated in a simulation study for a variety of linear models, underlying processes, and alternatives. The software provided implements the tests and allows the replication of simulations and data applications.
    Supported by projects MTM2014-56235-C2-2-P and MTM2017-86061-C2-2-P from the Spanish Ministry of Economy, Industry and Competitiveness. Supported by projects MTM2013-41383-P and MTM2016-76969-P from the Spanish Ministry of Economy, Industry and Competitiveness, and the European Regional Development Fund; project 10MDS207015PR from Dirección Xeral de I + D, Xunta de Galicia.

    Distribution and quantile functions, ranks and signs in dimension d: a measure transportation approach

    Unlike the real line, the real space R^d, for d ≥ 2, is not canonically ordered. As a consequence, such fundamental univariate concepts as quantile and distribution functions and their empirical counterparts, involving ranks and signs, do not canonically extend to the multivariate context. Palliating that lack of a canonical ordering has been an open problem for more than half a century, generating an abundant literature and motivating, among others, the development of statistical depth and copula-based methods. We show that, unlike the many definitions proposed in the literature, the measure transportation-based ranks and signs introduced in Chernozhukov, Galichon, Hallin and Henry (Ann. Statist. 45 (2017) 223-256) enjoy all the properties that make univariate ranks a successful tool for semiparametric inference. Related to those ranks, we propose a new center-outward definition of multivariate distribution and quantile functions, along with their empirical counterparts, for which we establish a Glivenko-Cantelli result. Our approach is based on McCann (Duke Math. J. 80 (1995) 309-323) and our results do not require any moment assumptions. The resulting ranks and signs are shown to be strictly distribution-free and essentially maximal ancillary in the sense of Basu (Sankhya 21 (1959) 247-256) which, in semiparametric models involving noise with unspecified density, can be interpreted as a finite-sample form of semiparametric efficiency. Although constituting a sufficient summary of the sample, empirical center-outward distribution functions are defined at observed values only. A continuous extension to the entire d-dimensional space, yielding smooth empirical quantile contours and sign curves while preserving the essential monotonicity and Glivenko-Cantelli features of the concept, is provided. A numerical study of the resulting empirical quantile contours is conducted.
    This paper results from the merging of Hallin (2017) and del Barrio, Cuesta-Albertos, Hallin and Matrán (2018). Eustasio del Barrio, Juan Cuesta-Albertos and Carlos Matrán are supported in part by FEDER, Spanish Ministerio de Economía y Competitividad, grant MTM2017-86061-C2; Eustasio del Barrio and Carlos Matrán also acknowledge the support of the Junta de Castilla y León, grants VA005P17 and VA002G18. Marc Hallin thanks Marc Henry for guiding his first steps into the subtleties of measure transportation.

    Impartial Trimmed k-means for Functional Data

    Affiliation: Cuesta-Albertos, Juan Antonio. Universidad de San Andrés. Departamento de Matemática y Ciencias; Argentina.
    Affiliation: Fraiman, Ricardo. Universidad de San Andrés. Departamento de Matemática y Ciencias; Argentina.