4,965 research outputs found
An adaptive composite quantile approach to dimension reduction
Sufficient dimension reduction [Li 1991] has long been a prominent issue in multivariate nonparametric regression analysis. To uncover the central dimension reduction space, we propose in this paper an adaptive composite quantile approach. Compared to existing methods, (1) it requires minimal assumptions and is capable of revealing all dimension reduction directions; (2) it is robust against outliers; and (3) it is structure-adaptive and thus more efficient. Asymptotic results are proved and numerical examples are provided, including a real data analysis.
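As a concrete illustration of the composite quantile idea (a minimal sketch, not the paper's estimator), the code below fits a single slope by combining the quantile check loss across several quantile levels, profiling out one intercept per level; the toy model, the quantile levels, and the grid search are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2_000
x = rng.standard_normal(n)
y = 2.0 * x + rng.standard_normal(n)   # toy linear model, true slope 2 (assumed)

taus = [0.25, 0.5, 0.75]               # quantile levels to combine

def check_loss(u, tau):
    # Quantile check loss: rho_tau(u) = u * (tau - 1{u < 0})
    return u * (tau - (u < 0))

def profiled_cqr_loss(b):
    # For a fixed slope b, the optimal intercept at level tau is the
    # tau-quantile of the residuals; profile it out, then average the
    # check loss over all levels (the "composite" part).
    r = y - b * x
    return sum(np.mean(check_loss(r - np.quantile(r, t), t)) for t in taus)

grid = np.linspace(0.0, 4.0, 401)      # crude grid search over the slope
slope = grid[int(np.argmin([profiled_cqr_loss(b) for b in grid]))]
```

Because each quantile level contributes its own check loss, the combined criterion stays informative even when the error distribution makes any single quantile (e.g. the median alone) inefficient, which is the robustness/efficiency trade-off the abstract alludes to.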
The Ginibre ensemble and Gaussian analytic functions
We show that as the matrix size n changes, the characteristic polynomial of the random matrix with i.i.d. complex Gaussian entries can be described recursively through a process analogous to P\'olya's urn scheme. As a result, we get a random analytic function in the limit, which is given by a mixture of Gaussian analytic functions. This gives another reason why the zeros of Gaussian analytic functions and the Ginibre ensemble exhibit similar local repulsion, but different global behavior. Our approach gives new explicit formulas for the limiting analytic function.
Comment: 23 pages, 1 figure
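To see the global behavior of the Ginibre ensemble concretely, one can sample it directly. The sketch below (not from the paper; matrix size is an arbitrary choice) draws an n x n matrix of i.i.d. standard complex Gaussians and checks numerically that its eigenvalues fill the disk of radius sqrt(n), as the circular law predicts.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
# Ginibre ensemble: i.i.d. standard complex Gaussian entries;
# real and imaginary parts each have variance 1/2, so entries have unit variance.
G = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
eig = np.linalg.eigvals(G)

# Circular law: the empirical spectral distribution fills the disk of
# radius sqrt(n) nearly uniformly for large n.
frac_inside = np.mean(np.abs(eig) <= 1.05 * np.sqrt(n))
```

Plotting `eig` in the complex plane shows the uniform disk (the global behavior), while zooming in reveals the local repulsion between nearby eigenvalues that the abstract contrasts with Gaussian analytic functions.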
Fast, Exact Bootstrap Principal Component Analysis for p>1 million
Many have suggested a bootstrap procedure for estimating the sampling variability of principal component analysis (PCA) results. However, when the number of measurements per subject (p) is much larger than the number of subjects (n), the challenge of calculating and storing the leading principal components from each bootstrap sample can be computationally infeasible. To address this, we outline methods for fast, exact calculation of bootstrap principal components, eigenvalues, and scores. Our methods leverage the fact that all bootstrap samples occupy the same n-dimensional subspace as the original sample. As a result, all bootstrap principal components are limited to the same n-dimensional subspace and can be efficiently represented by their low dimensional coordinates in that subspace. Several uncertainty metrics can be computed solely based on the bootstrap distribution of these low dimensional coordinates, without calculating or storing the p-dimensional bootstrap components. Fast bootstrap PCA is applied to a dataset of sleep electroencephalogram (EEG) recordings, and to a dataset of brain magnetic resonance images (MRIs) (p about 3 million). For the brain MRI dataset, our method allows for standard errors for the first 3 principal components based on 1000 bootstrap samples to be calculated on a standard laptop in 47 minutes, as opposed to approximately 4 days with standard methods.
Comment: 25 pages, including 9 figures and link to R package. 2014-05-14 update: final formatting edits for journal submission, condensed figure
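The subspace argument can be sketched in a few lines; this is a minimal illustration of the idea, not the authors' R package, and the toy data sizes are assumptions. A centered n x p data matrix has rank at most n, so every row-resampled bootstrap dataset lies in the span of the n right singular vectors, and each bootstrap PCA reduces to an SVD of an n x n matrix of low-dimensional coordinates.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 10_000                      # few subjects, many measurements (toy sizes)
X = rng.standard_normal((n, p))        # subjects x measurements
X -= X.mean(axis=0)                    # center columns

# One SVD of the original data; all bootstrap samples live in the row
# space spanned by the n right singular vectors (rows of Vt).
U, d, Vt = np.linalg.svd(X, full_matrices=False)
A = U * d                              # n x n low-dimensional coordinates: X = A @ Vt

B = 200
low_dim_pcs = []
for _ in range(B):
    idx = rng.integers(0, n, size=n)   # resample subjects with replacement
    Ab = A[idx] - A[idx].mean(axis=0)  # bootstrap sample, re-centered, in n dims
    _, db, Wt = np.linalg.svd(Ab, full_matrices=False)  # n x n SVD, not n x p
    low_dim_pcs.append(Wt[:3])         # leading PCs stored as length-n vectors

# A p-dimensional bootstrap PC is reconstructed only when needed:
v_b = Vt.T @ low_dim_pcs[0][0]
```

Uncertainty summaries (e.g. bootstrap standard errors of the components) can be computed from the distribution of the length-n vectors in `low_dim_pcs`, so the p-dimensional components never need to be stored.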
Sufficient dimension reduction based on an ensemble of minimum average variance estimators
We introduce a class of dimension reduction estimators based on an ensemble
of the minimum average variance estimates of functions that characterize the
central subspace, such as the characteristic functions, the Box--Cox transformations and wavelet bases. The ensemble estimators exhaustively
estimate the central subspace without imposing restrictive conditions on the
predictors, and have the same convergence rate as the minimum average variance
estimates. They are flexible and easy to implement, and allow repeated use of
the available sample, which enhances accuracy. They are applicable to both
univariate and multivariate responses in a unified form. We establish the
consistency and convergence rate of these estimators, and the consistency of a
cross-validation criterion for order determination. We compare the ensemble estimators with other estimators in a wide variety of models, and demonstrate their competitive performance.
Comment: Published at http://dx.doi.org/10.1214/11-AOS950 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
Determining the dimension of iterative Hessian transformation
The central mean subspace (CMS) and iterative Hessian transformation (IHT)
have been introduced recently for dimension reduction when the conditional mean
is of interest. Suppose that X is a vector-valued predictor and Y is a scalar
response. The basic problem is to find a lower-dimensional predictor \eta^TX
such that E(Y|X)=E(Y|\eta^TX). The CMS defines the inferential object for this
problem and IHT provides an estimating procedure. Compared with other methods,
IHT requires fewer assumptions and has been shown to perform well when the
additional assumptions required by those methods fail. In this paper we give an
asymptotic analysis of IHT and provide stepwise asymptotic hypothesis tests to
determine the dimension of the CMS, as estimated by IHT. Here, the original IHT
method has been modified to be invariant under location and scale
transformations. To provide empirical support for our asymptotic results, we present a series of simulation studies, which agree well with the theory.
The method is applied to analyze an ozone data set.
Comment: Published at http://dx.doi.org/10.1214/009053604000000661 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
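A minimal sketch of the iterative Hessian idea under a toy single-index model follows; all model choices are assumptions, and this simplified estimator illustrates the construction rather than reproducing the paper's exact procedure. Starting from the OLS direction of the standardized predictors, one repeatedly applies a Hessian-type matrix E[(Y - EY) Z Z^T] and takes the span of the resulting vectors as an estimate of the central mean subspace.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 20_000, 5
beta = np.zeros(p); beta[0] = 1.0      # true direction (hypothetical example)
X = rng.standard_normal((n, p))
# Single-index model: E(Y|X) depends on X only through beta^T X.
y = X @ beta + (X @ beta) ** 2 + 0.1 * rng.standard_normal(n)

Xc = X - X.mean(axis=0)
yc = y - y.mean()

# Standardize predictors: Z = Xc @ Sigma^{-1/2} (location/scale invariance).
Sigma = Xc.T @ Xc / n
evals, evecs = np.linalg.eigh(Sigma)
Sigma_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
Z = Xc @ Sigma_inv_sqrt

b = Z.T @ yc / n                       # OLS direction on the Z scale
M = (Z * yc[:, None]).T @ Z / n        # Hessian-type matrix E[(Y - EY) Z Z^T]

# Candidate matrix [b, Mb, M^2 b, ...]; its leading left singular vector
# estimates the (here one-dimensional) central mean subspace.
cols = [b]
for _ in range(p - 1):
    cols.append(M @ cols[-1])
K = np.column_stack(cols)
U, _, _ = np.linalg.svd(K, full_matrices=False)
eta = Sigma_inv_sqrt @ U[:, 0]         # map back to the X scale
eta /= np.linalg.norm(eta)
```

With more than one true direction, one would keep several leading singular vectors of `K`; deciding how many to keep is exactly the order-determination problem that the paper's stepwise asymptotic tests address.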