On high-dimensional sign tests

Author: Paindaveine Davy
Verdebout Thomas
Publication venue: 'Bernoulli Society for Mathematical Statistics and Probability'
Publication date: 01/01/2016

Sign tests are among the most successful procedures in multivariate nonparametric statistics. In this paper, we consider several testing problems in multivariate analysis, directional statistics and multivariate time series analysis, and we show that, under appropriate symmetry assumptions, the fixed-$p$ multivariate sign tests remain valid in the high-dimensional case. Remarkably, our asymptotic results are universal, in the sense that, unlike in most previous works in high-dimensional statistics, $p$ may go to infinity in an arbitrary way as $n$ does. We conduct simulations that (i) confirm our asymptotic results, (ii) reveal that, even for relatively large $p$, chi-square critical values are to be favoured over the (asymptotically equivalent) Gaussian ones and (iii) show that, for testing i.i.d.-ness against serial dependence in the high-dimensional case, Portmanteau sign tests outperform their competitors in terms of validity-robustness.

Comment: Published at http://dx.doi.org/10.3150/15-BEJ710 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
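The abstract does not spell out any test statistic. As a rough illustration only, the sketch below assumes one classical sign procedure, the Rayleigh-type test of uniformity on the unit sphere, which uses the spatial signs $U_i = X_i/\|X_i\|$ and the statistic $R_n = (p/n)\,\|\sum_i U_i\|^2$. For fixed $p$ this is compared with chi-square quantiles with $p$ degrees of freedom; the standardized version $(R_n - p)/\sqrt{2p}$ is the one compared with Gaussian quantiles. All function names are hypothetical, and only the Gaussian p-value is computed because the standard library has no chi-square quantile function.

```python
import math
import random


def spatial_signs(sample):
    """Project each observation onto the unit sphere (zero vectors are not handled)."""
    signs = []
    for x in sample:
        norm = math.sqrt(sum(v * v for v in x))
        signs.append([v / norm for v in x])
    return signs


def sign_test_statistic(sample):
    """Rayleigh-type sign statistic R_n = (p / n) * ||sum_i U_i||^2."""
    signs = spatial_signs(sample)
    n, p = len(signs), len(signs[0])
    total = [sum(u[k] for u in signs) for k in range(p)]
    return (p / n) * sum(t * t for t in total)


def standardized_statistic(sample):
    """(R_n - p) / sqrt(2p), to be compared with standard normal quantiles."""
    p = len(sample[0])
    return (sign_test_statistic(sample) - p) / math.sqrt(2 * p)


def gaussian_p_value(z):
    """Upper-tail standard normal p-value for a standardized statistic z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


if __name__ == "__main__":
    rng = random.Random(0)
    n, p = 50, 200  # hypothetical sizes, with p larger than n
    # Spherically symmetric data, so the null hypothesis holds by construction.
    sample = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    z = standardized_statistic(sample)
    print("standardized statistic:", z)
    print("Gaussian p-value:", gaussian_p_value(z))
```

Only the directions of the observations enter the statistic, which is why such tests need symmetry assumptions rather than moment assumptions.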

arXiv.org e-Print Archive

DI-fusion

Convergence and Fluctuations of Regularized Tyler Estimators

Author: Alouini Mohamed-Slim
Couillet Romain
Kammoun Abla
Pascal Frederic
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 06/04/2015

This article studies the behavior of regularized Tyler estimators (RTEs) of scatter matrices. The key advantages of these estimators are twofold. First, they guarantee by construction a good conditioning of the estimate and second, being a derivative of robust Tyler estimators, they inherit their robustness properties, notably their resilience to the presence of outliers. Nevertheless, one major problem that poses the use of RTEs in practice is represented by the question of setting the regularization parameter $\rho$. While a high value of $\rho$ is likely to push all the eigenvalues away from zero, it comes at the cost of a larger bias with respect to the population covariance matrix. A deep understanding of the statistics of RTEs is essential to come up with appropriate choices for the regularization parameter. This is not an easy task and might be out of reach, unless one considers asymptotic regimes wherein the number of observations $n$ and/or their size $N$ increase together. First asymptotic results have recently been obtained under the assumption that $N$ and $n$ are large and commensurable. Interestingly, no results concerning the regime of $n$ going to infinity with $N$ fixed exist, even though the investigation of this assumption has usually predated the analysis of the most difficult $N$ and $n$ large case. This motivates our work. In particular, we prove in the present paper that the RTEs converge to a deterministic matrix when $n\to\infty$ with $N$ fixed, which is expressed as a function of the theoretical covariance matrix. We also derive the fluctuations of the RTEs around this deterministic matrix and establish that these fluctuations converge in distribution to a multivariate Gaussian distribution with zero mean and a covariance depending on the population covariance and the parameter $\rho$
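The abstract does not state the estimator's definition. For orientation, one form commonly used in the robust-estimation literature defines the RTE as the solution of a fixed-point equation. The notation below ($x_1,\dots,x_n$ for the $N$-dimensional observations, $\hat{C}_N(\rho)$ for the estimate) is an assumption and is not taken from the listing:

```latex
\[
\hat{C}_N(\rho) = (1-\rho)\,\frac{1}{n}\sum_{i=1}^{n}
  \frac{x_i x_i^{*}}{\frac{1}{N}\, x_i^{*}\,\hat{C}_N^{-1}(\rho)\, x_i}
  + \rho\, I_N ,
\qquad \rho \in \bigl(\max\{0,\, 1 - n/N\},\, 1\bigr].
\]
```

Under this form the trade-off described above is visible directly: the first term is positive semidefinite, so every eigenvalue of $\hat{C}_N(\rho)$ is at least $\rho$, while at $\rho = 1$ the estimate reduces to $I_N$ and ignores the data entirely.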

arXiv.org e-Print Archive

HAL-CentraleSupelec

Crossref

HAL-Rennes 1

Random geometric graphs in high dimension

Author: Ariosto Sebastiano
Erba Vittorio
Gherardi Marco
Rotondo Pietro
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2020

Many machine learning algorithms used for dimensional reduction and manifold learning leverage on the computation of the nearest neighbours to each point of a dataset to perform their tasks. These proximity relations define a so-called geometric graph, where two nodes are linked if they are sufficiently close to each other. Random geometric graphs, where the positions of nodes are randomly generated in a subset of $\mathbb{R}^{d}$, offer a null model to study typical properties of datasets and of machine learning algorithms. Up to now, most of the literature focused on the characterization of low-dimensional random geometric graphs whereas typical datasets of interest in machine learning live in high-dimensional spaces ($d \gg 10^{2}$). In this work, we consider the infinite dimensions limit of hard and soft random geometric graphs and we show how to compute the average number of subgraphs of given finite size $k$, e.g. the average number of $k$-cliques. This analysis highlights that local observables display different behaviors depending on the chosen ensemble: soft random geometric graphs with continuous activation functions converge to the naive infinite dimensional limit provided by Erdős-Rényi graphs, whereas hard random geometric graphs can show systematic deviations from it. We present numerical evidence that our analytical insights, exact in infinite dimensions, provide a good approximation also for dimension $d\gtrsim10$
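To make the comparison in this abstract concrete, here is a minimal sketch, not the authors' method, that builds a hard random geometric graph (two nodes linked when their Euclidean distance is below a threshold), counts its 3-cliques by brute force, and compares the count with the Erdős-Rényi expectation $\binom{n}{3} p^{3}$ at the same edge density $p$. The sampling domain (the unit hypercube), the parameter values and all function names are assumptions chosen for illustration.

```python
import itertools
import math
import random


def hard_rgg(points, radius):
    """Edge set of a hard geometric graph: i and j are linked iff dist < radius."""
    edges = set()
    for i, j in itertools.combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) < radius:
            edges.add((i, j))  # stored with i < j
    return edges


def count_triangles(n_nodes, edges):
    """Count 3-cliques by checking every ordered node triple against the edge set."""
    return sum(
        1
        for i, j, k in itertools.combinations(range(n_nodes), 3)
        if (i, j) in edges and (i, k) in edges and (j, k) in edges
    )


def erdos_renyi_expected_triangles(n_nodes, edge_prob):
    """Expected number of triangles in G(n, p): C(n, 3) * p**3."""
    return math.comb(n_nodes, 3) * edge_prob ** 3


if __name__ == "__main__":
    rng = random.Random(0)
    n, d, radius = 60, 20, 1.7  # hypothetical values
    points = [[rng.random() for _ in range(d)] for _ in range(n)]
    edges = hard_rgg(points, radius)
    density = len(edges) / math.comb(n, 2)
    print("edge density:", density)
    print("triangles in hard RGG:", count_triangles(n, edges))
    print("Erdos-Renyi expectation:", erdos_renyi_expected_triangles(n, density))
```

The brute-force triangle count costs $O(n^{3})$, which is fine for a small illustration but not for large graphs.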

arXiv.org e-Print Archive

Archivio istituzionale della Ricerca - Università degli Studi di Parma

AIR Universita degli studi di Milano

A Deterministic Equivalent for the Analysis of Non-Gaussian Correlated MIMO Multiple Access Channels

Author: Chen Jung-Chieh
Guo Mei-Hui
Pan Guangming
Wen Chao-Kai
Wong Kai-Kit
Publication date: 20/08/2011

Large dimensional random matrix theory (RMT) has provided an efficient analytical tool to understand multiple-input multiple-output (MIMO) channels and to aid the design of MIMO wireless communication systems. However, previous studies based on large dimensional RMT rely on the assumption that the transmit correlation matrix is diagonal or the propagation channel matrix is Gaussian. There is an increasing interest in the channels where the transmit correlation matrices are generally nonnegative definite and the channel entries are non-Gaussian. This class of channel models appears in several applications in MIMO multiple access systems, such as small cell networks (SCNs). To address these problems, we use the generalized Lindeberg principle to show that the Stieltjes transforms of this class of random matrices with Gaussian or non-Gaussian independent entries coincide in the large dimensional regime. This result permits to derive the deterministic equivalents (e.g., the Stieltjes transform and the ergodic mutual information) for non-Gaussian MIMO channels from the known results developed for Gaussian MIMO channels, and is of great importance in characterizing the spectral efficiency of SCNs.

Comment: This paper is the revision of the original manuscript titled "A Deterministic Equivalent for the Analysis of Small Cell Networks". We have revised the original manuscript and reworked on the organization to improve the presentation as well as readabilit
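For readers unfamiliar with the central object here, the Stieltjes transform of the empirical spectral distribution of an $N \times N$ Hermitian matrix $B$ with eigenvalues $\lambda_1,\dots,\lambda_N$ is the standard quantity below. The symbols $B$, $m_B$ and $z$ are introduced for illustration and do not come from the abstract:

```latex
\[
m_{B}(z) = \frac{1}{N}\operatorname{tr}\bigl(B - zI_N\bigr)^{-1}
         = \frac{1}{N}\sum_{k=1}^{N}\frac{1}{\lambda_k - z},
\qquad z \in \mathbb{C}\setminus\mathbb{R}.
\]
```

A deterministic equivalent, in the usual sense of the term, is a non-random function that $m_B(z)$ approaches as the matrix dimensions grow, so that spectral quantities can be computed without simulating the random matrix.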

arXiv.org e-Print Archive

UCL Discovery

DR-NTU (Digital Repository of NTU)