Brownian distance covariance

Author: Gábor J. Székely
Maria L. Rizzo
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 06/10/2010

Distance correlation is a new class of multivariate dependence coefficients applicable to random vectors of arbitrary and not necessarily equal dimension. Distance covariance and distance correlation are analogous to product-moment covariance and correlation, but generalize and extend these classical bivariate measures of dependence. Distance correlation characterizes independence: it is zero if and only if the random vectors are independent. The notion of covariance with respect to a stochastic process is introduced, and it is shown that population distance covariance coincides with the covariance with respect to Brownian motion; thus, both can be called Brownian distance covariance. In the bivariate case, Brownian covariance is the natural extension of product-moment covariance, as we obtain Pearson product-moment covariance by replacing the Brownian motion in the definition with the identity. The corresponding statistic has an elegantly simple computing formula. Advantages of applying Brownian covariance and correlation vs. the classical Pearson covariance and correlation are discussed and illustrated.

Comment: This paper is discussed in [arXiv:0912.3295], [arXiv:1010.0822], [arXiv:1010.0825], [arXiv:1010.0828], [arXiv:1010.0836], [arXiv:1010.0838], [arXiv:1010.0839]. Rejoinder at [arXiv:1010.0844]. Published at http://dx.doi.org/10.1214/09-AOAS312 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
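The computing formula mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the standard sample definition, in which the pairwise Euclidean distance matrices of the two samples are double-centered and the squared sample distance covariance is the mean of their entrywise product. The function names are hypothetical, and the O(n^2) pure-Python implementation is meant for small samples only.

```python
import math


def _centered_distances(points):
    """Double-centered Euclidean distance matrix of a list of equal-length tuples."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # d is symmetric, so the column means equal the row means.
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_covariance_sq(x, y):
    """Squared sample distance covariance of two paired samples."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    a = _centered_distances(x)
    b = _centered_distances(y)
    n = len(x)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2


def distance_correlation(x, y):
    """Sample distance correlation; defined as 0.0 when either sample is constant."""
    denom = math.sqrt(distance_covariance_sq(x, x) * distance_covariance_sq(y, y))
    if denom == 0.0:
        return 0.0
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(distance_covariance_sq(x, y), 0.0) / denom)


if __name__ == "__main__":
    xs = [(float(i),) for i in range(-5, 6)]
    ys = [(p[0] ** 2,) for p in xs]  # uncorrelated with xs, but fully dependent
    print(distance_correlation(xs, ys))
```

Each observation is passed as a tuple, so the two samples may have different dimensions. In the usage example, the Pearson correlation of `xs` and `ys` is zero by symmetry, while the distance correlation is strictly positive.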

arXiv.org e-Print Archive

CiteSeerX

Independence test for high dimensional data based on regularized canonical correlation coefficients

Author: Pan Guangming
Yang Yanrong
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2015

This paper proposes a new statistic to test independence between two high dimensional random vectors $\mathbf{X}: p_1 \times 1$ and $\mathbf{Y}: p_2 \times 1$. The proposed statistic is based on the sum of regularized sample canonical correlation coefficients of $\mathbf{X}$ and $\mathbf{Y}$. The asymptotic distribution of the statistic under the null hypothesis is established as a corollary of general central limit theorems (CLT) for the linear statistics of classical and regularized sample canonical correlation coefficients when $p_1$ and $p_2$ are both comparable to the sample size $n$. As applications of the developed independence test, various types of dependent structures, such as factor models, ARCH models and a general uncorrelated but dependent case, are investigated by simulations. As an empirical application, cross-sectional dependence of daily stock returns of companies between different sections in the New York Stock Exchange (NYSE) is detected by the proposed test.

Comment: Published at http://dx.doi.org/10.1214/14-AOS1284 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)

Approximating Subadditive Hadamard Functions on Implicit Matrices

Author: Braverman Vladimir
Roytman Alan
Vorsanger Gregory
Publication date: 03/11/2015

An important challenge in the streaming model is to maintain small-space approximations of entrywise functions performed on a matrix that is generated by the outer product of two vectors given as a stream. In other works, streams typically define matrices in a standard way via a sequence of updates, as in the work of Woodruff (2014) and others. We describe the matrix formed by the outer product, and other matrices that do not fall into this category, as implicit matrices. As such, we consider the general problem of computing over such implicit matrices with Hadamard functions, which are functions applied entrywise on a matrix. In this paper, we apply this generalization to provide new techniques for identifying independence between two vectors in the streaming model. The previous state of the art algorithm of Braverman and Ostrovsky (2010) gave a $(1 \pm \epsilon)$-approximation for the $L_1$ distance between the product and joint distributions, using space $O(\log^{1024}(nm) \epsilon^{-1024})$, where $m$ is the length of the stream and $n$ denotes the size of the universe from which stream elements are drawn. Our general techniques include the $L_1$ distance as a special case, and we give an improved space bound of $O(\log^{12}(n) \log^{2}(\frac{nm}{\epsilon}) \epsilon^{-7})$.
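To make the quantity being approximated concrete, the sketch below computes the $L_1$ distance between the empirical joint distribution and the product of the empirical marginals exactly. It is only a reference baseline under stated assumptions: it stores every distinct stream element, so it does not achieve the small-space bounds discussed in the abstract, and the function name `l1_independence_distance` is hypothetical.

```python
from collections import Counter


def l1_independence_distance(pairs):
    """Exact L1 distance between the empirical joint distribution of a stream
    of (i, j) pairs and the product of its empirical marginals."""
    pairs = list(pairs)
    m = len(pairs)
    if m == 0:
        raise ValueError("the stream must contain at least one pair")
    joint = Counter(pairs)
    left = Counter(i for i, _ in pairs)
    right = Counter(j for _, j in pairs)
    total = 0.0
    # The product of marginals is the outer product of the two frequency
    # vectors; the absolute difference is applied entrywise and summed.
    for i, count_i in left.items():
        for j, count_j in right.items():
            product = (count_i / m) * (count_j / m)
            total += abs(joint.get((i, j), 0) / m - product)
    return total


if __name__ == "__main__":
    independent = [(a, b) for a in (0, 1) for b in (0, 1)]
    dependent = [(0, 0), (1, 1)]
    print(l1_independence_distance(independent))
    print(l1_independence_distance(dependent))
```

The loop visits only pairs whose coordinates both appear in the stream, since every other entry of both matrices is zero. A distance of zero means the empirical joint distribution factorizes exactly.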

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Progress on Polynomial Identity Testing - II

Author: Saxena Nitin
Publication date: 05/01/2014

We survey the area of algebraic complexity theory, with a focus on the problem of polynomial identity testing (PIT). We discuss the key ideas that have gone into the results of the last few years.

Comment: 17 pages, 1 figure, survey
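For readers unfamiliar with the problem, the classical randomized approach to PIT can be sketched as below. This is an assumption-laden illustration and is not taken from the survey: it treats the two polynomials as black-box Python callables with integer coefficients, evaluates them at random points modulo a fixed prime, and relies on the Schwartz-Zippel lemma, by which a nonzero polynomial of total degree d vanishes at a uniformly random point of a field of size p with probability at most d/p. The function name and parameters are hypothetical.

```python
import random


def probably_identical(f, g, num_vars, trials=20, prime=2**61 - 1, rng=None):
    """Randomized black-box identity test for two integer polynomials.

    Returns False as soon as a random point separates f and g modulo `prime`;
    returns True if no such point is found in `trials` attempts. A True
    answer is only probabilistic, and polynomials whose difference has all
    coefficients divisible by `prime` cannot be told apart.
    """
    rng = rng or random.Random()
    for _ in range(trials):
        point = [rng.randrange(prime) for _ in range(num_vars)]
        if (f(*point) - g(*point)) % prime != 0:
            return False
    return True


if __name__ == "__main__":
    expanded = lambda x, y: x * x + 2 * x * y + y * y
    factored = lambda x, y: (x + y) ** 2
    wrong = lambda x, y: x * x + y * y
    print(probably_identical(expanded, factored, num_vars=2))
    print(probably_identical(expanded, wrong, num_vars=2))
```

The research question behind PIT is whether such tests can be made deterministic; this sketch shows only the randomized baseline.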

arXiv.org e-Print Archive

CiteSeerX

Crossref