121 research outputs found

    Resistant estimates for high dimensional and functional data based on random projections

    We herein propose a new robust estimation method based on random projections that is adaptive and automatically produces a robust estimate, while enabling easy computations for high- or infinite-dimensional data. Under some restricted contamination models, the procedure is robust and attains full efficiency. We tested the method using both simulated and real data. Comment: 24 pages, 6 figures

    Resistant estimates for high dimensional and functional data based on random projections

    We herein propose a new robust estimation method based on random projections that is adaptive and automatically produces a robust estimate, while enabling easy computations for high- or infinite-dimensional data. Under some restricted contamination models, the procedure is robust and attains full efficiency. We tested the method using both simulated and real data. Affiliation: Fraiman, Jacob Ricardo. Universidad de San Andrés; Argentina. Universidad de la República; Uruguay. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Affiliation: Svarc, Marcela. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad de San Andrés; Argentina

    Interpretable Clustering using Unsupervised Binary Trees

    We herein introduce a new method of interpretable clustering that uses unsupervised binary trees. It is a three-stage procedure, the first stage of which entails a series of recursive binary splits to reduce the heterogeneity of the data within the new subsamples. During the second stage (pruning), consideration is given to whether adjacent nodes can be aggregated. Finally, during the third stage (joining), similar clusters are joined together, even if they do not descend from the same node originally. Consistency results are obtained, and the procedure is used on simulated and real data sets. Comment: 25 pages, 6 figures

    Feature Selection for Functional Data

    In this paper we address the problem of feature selection when the data are functional. We study several statistical procedures, including classification, regression and principal components. One advantage of the blinding procedure is that it is very flexible, since the features are defined by a set of functions, relevant to the problem being studied, that the user proposes. Our method is consistent under a set of quite general assumptions, and produces good results with the real data examples that we analyze. Comment: 22 pages, 4 figures

    An ANOVA approach for statistical comparisons of brain networks

    The study of brain networks has developed extensively over the last couple of decades. By contrast, techniques for the statistical analysis of these networks are less developed. In this paper, we focus on the statistical comparison of brain networks in a nonparametric framework and discuss the associated detection and identification problems. We tested network differences between groups with an analysis of variance (ANOVA) test we developed specifically for networks. We also propose and analyse the behaviour of a new statistical procedure designed to identify different subnetworks. As an example, we show the application of this tool to resting-state fMRI data obtained from the Human Connectome Project. We find, among other results, that the amount of sleep in the days before the scan is a relevant variable that must be controlled. Finally, we discuss the potential bias in neuroimaging findings that is generated by some behavioural and brain structure variables. Our method can also be applied to other kinds of networks, such as protein interaction networks, gene networks or social networks. Affiliation: Fraiman Borrazás, Daniel Edmundo. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad de San Andrés. Departamento de Matemáticas y Ciencias; Argentina. Affiliation: Fraiman, Jacob Ricardo. Universidad de la República; Uruguay. Instituto Pasteur de Montevideo; Uruguay

    A nonparametric approach to the estimation of lengths and surface areas

    The Minkowski content $L_0(G)$ of a body $G\subset\mathbb{R}^d$ represents the boundary length (for $d=2$) or the surface area (for $d=3$) of $G$. A method for estimating $L_0(G)$ is proposed. It relies on a nonparametric estimator based on the information provided by a random sample (taken on a rectangle containing $G$) in which we are able to identify whether every point is inside or outside $G$. Some theoretical properties concerning strong consistency, $L_1$-error and convergence rates are obtained. A practical application to a problem of image analysis in cardiology is discussed in some detail. A brief simulation study is provided. Comment: Published at http://dx.doi.org/10.1214/009053606000001532 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)