121 research outputs found
Resistant estimates for high dimensional and functional data based on random projections
We herein propose a new robust estimation method based on random projections
that is adaptive and automatically produces a robust estimate, while enabling
easy computations for high- or infinite-dimensional data. Under some restricted
contamination models, the procedure is robust and attains full efficiency. We
tested the method using both simulated and real data.
Comment: 24 pages, 6 figures
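The abstract does not spell out the estimator, so the following is only a minimal sketch of the general idea of using random projections for robust location estimation. It assumes a Stahel-Donoho-style rule, which is not taken from the paper: each observation is scored by its largest standardized distance from the median over many random one-dimensional projections, and observations scoring above a cutoff are discarded before averaging. The function name `projection_trimmed_mean` and the parameters `n_dirs` and `cutoff` are hypothetical.

```python
import math
import random
import statistics


def projection_trimmed_mean(data, n_dirs=200, cutoff=3.0, seed=0):
    """Mean of the observations that never look outlying in any random projection.

    Assumption: a Stahel-Donoho-style outlyingness rule, used here for
    illustration only; it is not the procedure of the paper.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    outlyingness = [0.0] * len(data)
    for _ in range(n_dirs):
        # Draw a direction uniformly on the unit sphere.
        d = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in d))
        d = [c / norm for c in d]
        # Project every observation onto the direction.
        proj = [sum(a * b for a, b in zip(row, d)) for row in data]
        med = statistics.median(proj)
        mad = statistics.median(abs(p - med) for p in proj)
        if mad == 0.0:
            continue  # degenerate projection, carries no scale information
        for i, p in enumerate(proj):
            score = abs(p - med) / (1.4826 * mad)
            outlyingness[i] = max(outlyingness[i], score)
    kept = [row for row, o in zip(data, outlyingness) if o <= cutoff]
    return [statistics.fmean(col) for col in zip(*kept)]


# Usage: 180 standard Gaussian points in dimension 5 plus 20 gross outliers.
rng = random.Random(1)
clean = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(180)]
outliers = [[10.0] * 5 for _ in range(20)]
robust_center = projection_trimmed_mean(clean + outliers)
plain_center = [statistics.fmean(col) for col in zip(*(clean + outliers))]
```

Only inner products with random directions are needed, which is why this kind of scheme stays cheap as the dimension grows. In the usage example the plain mean is pulled towards the outliers, while the trimmed mean should stay close to the origin.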
Resistant estimates for high dimensional and functional data based on random projections
We herein propose a new robust estimation method based on random projections that is adaptive and automatically produces a robust estimate, while enabling easy computations for high- or infinite-dimensional data. Under some restricted contamination models, the procedure is robust and attains full efficiency. We tested the method using both simulated and real data.
Affiliation: Fraiman, Jacob Ricardo. Universidad de San Andrés; Argentina. Universidad de la República; Uruguay. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Svarc, Marcela. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad de San Andrés; Argentina
Interpretable Clustering using Unsupervised Binary Trees
We herein introduce a new method of interpretable clustering that uses
unsupervised binary trees. It is a three-stage procedure, the first stage of
which entails a series of recursive binary splits to reduce the heterogeneity
of the data within the new subsamples. During the second stage (pruning),
consideration is given to whether adjacent nodes can be aggregated. Finally,
during the third stage (joining), similar clusters are joined together, even if
they do not descend from the same node originally. Consistency results are
obtained, and the procedure is used on simulated and real data sets.
Comment: 25 pages, 6 figures
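As a rough illustration of the first stage only, the sketch below searches for the single axis-aligned binary split that most reduces heterogeneity. It assumes heterogeneity is measured by the within-group sum of squared deviations from the coordinate means; the abstract does not state the criterion, and the helper names `sum_sq` and `best_split` are hypothetical. The pruning and joining stages are not shown.

```python
def sum_sq(rows):
    """Within-group sum of squared deviations from the coordinate means."""
    if not rows:
        return 0.0
    total = 0.0
    for col in zip(*rows):
        mean = sum(col) / len(col)
        total += sum((v - mean) ** 2 for v in col)
    return total


def best_split(rows, min_size=1):
    """Return (cost, variable index, threshold) of the best single split.

    Assumption: the cost of a split is the summed heterogeneity of its two
    children; candidate thresholds are midpoints between consecutive values.
    """
    best = None
    for j in range(len(rows[0])):
        values = sorted({r[j] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for r in rows if r[j] <= t]
            right = [r for r in rows if r[j] > t]
            if len(left) < min_size or len(right) < min_size:
                continue
            cost = sum_sq(left) + sum_sq(right)
            if best is None or cost < best[0]:
                best = (cost, j, t)
    return best


# Usage: two groups separated along the first variable.
points = [(0.0, 0.0), (0.1, 5.0), (0.2, 1.0), (10.0, 4.0), (10.1, 0.0), (10.2, 3.0)]
cost, variable, threshold = best_split(points)
```

Applying `best_split` recursively to each child gives a binary tree whose leaves are described by simple threshold rules on the original variables, which is what makes this style of clustering interpretable.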
Feature Selection for Functional Data
In this paper we address the problem of feature selection when the data is
functional, studying several statistical procedures including classification,
regression and principal components. One advantage of the blinding procedure is
that it is very flexible, since the features are defined by a set of functions,
relevant to the problem being studied, proposed by the user. Our method is
consistent under a set of quite general assumptions, and produces good results
with the real data examples that we analyze.
Comment: 22 pages, 4 figures
An ANOVA approach for statistical comparisons of brain networks
The study of brain networks has developed extensively over the last couple of decades. By contrast, techniques for the statistical analysis of these networks are less developed. In this paper, we focus on the statistical comparison of brain networks in a nonparametric framework and discuss the associated detection and identification problems. We tested network differences between groups with an analysis of variance (ANOVA) test we developed specifically for networks. We also propose and analyse the behaviour of a new statistical procedure designed to identify different subnetworks. As an example, we show the application of this tool in resting-state fMRI data obtained from the Human Connectome Project. We identify, among other variables, that the amount of sleep in the days before the scan is a relevant variable that must be controlled. Finally, we discuss the potential bias in neuroimaging findings that is generated by some behavioural and brain structure variables. Our method can also be applied to other kinds of networks, such as protein interaction networks, gene networks or social networks.
Affiliation: Fraiman Borrazás, Daniel Edmundo. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad de San Andrés. Departamento de Matemáticas y Ciencias; Argentina
Affiliation: Fraiman, Jacob Ricardo. Universidad de la República; Uruguay. Instituto Pasteur de Montevideo; Uruguay
A nonparametric approach to the estimation of lengths and surface areas
The Minkowski content $L_0(G)$ of a body $G\subset\mathbb{R}^d$ represents
the boundary length (for $d=2$) or the surface area (for $d=3$) of $G$. A
method for estimating $L_0(G)$ is proposed. It relies on a nonparametric
estimator based on the information provided by a random sample (taken on a
rectangle containing $G$) in which we are able to identify whether every point
is inside or outside $G$. Some theoretical properties concerning strong
consistency, $L_1$-error and convergence rates are obtained. A practical
application to a problem of image analysis in cardiology is discussed in some
detail. A brief simulation study is provided.
Comment: Published at http://dx.doi.org/10.1214/009053606000001532 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org
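The Minkowski content is commonly defined as the limit, as the radius shrinks to zero, of the volume of the band of points within that radius of the boundary, divided by twice the radius. The sketch below turns that definition into a Monte Carlo estimate for a planar set, under the sampling scheme described in the abstract (uniform points on a rectangle, each labelled inside or outside). It is an assumption-laden illustration rather than the paper's estimator: a sample point is counted as lying in the band when some sample point with the opposite label is within distance `eps`, and the names `estimate_boundary_length`, `is_inside` and `eps` are hypothetical.

```python
import math
import random
from collections import defaultdict


def estimate_boundary_length(is_inside, rect, n, eps, seed=0):
    """Monte Carlo sketch of a boundary-length estimate for a planar set.

    Assumption: the band around the boundary is approximated by the sample
    points that have an oppositely labelled sample point within eps.
    """
    (x0, x1), (y0, y1) = rect
    rng = random.Random(seed)
    pts = [(rng.uniform(x0, x1), rng.uniform(y0, y1)) for _ in range(n)]
    labels = [is_inside(x, y) for x, y in pts]

    # Hash points into square cells of side eps, so every neighbour within
    # distance eps lies in the same cell or in one of the 8 adjacent cells.
    grid = defaultdict(list)
    for i, (x, y) in enumerate(pts):
        grid[(math.floor(x / eps), math.floor(y / eps))].append(i)

    def near_boundary(i):
        x, y = pts[i]
        cx, cy = math.floor(x / eps), math.floor(y / eps)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in grid.get((cx + dx, cy + dy), ()):
                    if labels[j] != labels[i] and math.dist(pts[i], pts[j]) <= eps:
                        return True
        return False

    band_fraction = sum(near_boundary(i) for i in range(n)) / n
    band_area = band_fraction * (x1 - x0) * (y1 - y0)
    return band_area / (2 * eps)


# Usage: the unit disk, whose true boundary length is 2 * pi.
estimate = estimate_boundary_length(
    lambda x, y: x * x + y * y <= 1.0,
    rect=((-1.5, 1.5), (-1.5, 1.5)),
    n=5000,
    eps=0.15,
)
```

For a fixed sample size the estimate tends to fall somewhat below the true length, because points near the outer edge of the band rarely have an oppositely labelled neighbour within `eps`. Shrinking `eps` reduces the smoothing bias but requires a larger sample to keep the band populated.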