
A free central-limit theorem for dynamical systems

Author: Austern Morgane
Publication date: 18/08/2020

The free central-limit theorem, a fundamental theorem in free probability, states that empirical averages of freely independent random variables are asymptotically semi-circular. We extend this theorem to general dynamical systems of operators that we define using a free random variable $X$ coupled with a group of *-automorphisms describing the evolution of $X$. We introduce free mixing coefficients that measure how far a dynamical system is from being freely independent. Under conditions on those coefficients, we prove that the free central-limit theorem also holds for these processes and provide Berry-Esseen bounds. We generalize this to triangular arrays and U-statistics. Finally, we draw connections with classical probability and random matrix theory through a series of examples.

arXiv.org e-Print Archive
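
For orientation, the classical free central-limit theorem that the abstract above starts from can be written as follows. This is the standard textbook statement for freely independent, identically distributed variables, not the paper's extension to dynamical systems; the symbols $(\mathcal{A}, \varphi)$, $a_i$ and $\sigma$ are notation assumed here and do not appear in the record. The arrow `\xrightarrow` assumes the `amsmath` package.

```latex
% Assumed setting: (a_i) freely independent, identically distributed,
% self-adjoint elements of a non-commutative probability space
% (\mathcal{A}, \varphi), with \varphi(a_i) = 0 and \varphi(a_i^2) = \sigma^2.
\[
  \frac{a_1 + \dots + a_n}{\sqrt{n}}
  \;\xrightarrow[n \to \infty]{\text{distr}}\; s_\sigma ,
  \qquad
  d\mu_{s_\sigma}(x)
  = \frac{1}{2\pi\sigma^{2}} \sqrt{4\sigma^{2} - x^{2}}\;
    \mathbf{1}_{[-2\sigma,\,2\sigma]}(x)\, dx .
\]
% Convergence is in the sense of moments:
% \varphi\bigl(((a_1+\dots+a_n)/\sqrt{n})^k\bigr) \to \int x^k \, d\mu_{s_\sigma}(x)
% for every integer k \ge 1.
```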


Limit theorems beyond sums of I.I.D observations

Author: Austern Morgane
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2019

We consider second and third order limit theorems, namely central-limit theorems, Berry-Esseen bounds and concentration inequalities, and extend them to "symmetric" random objects and general estimators of exchangeable structures. At first, we consider random processes whose distribution satisfies a symmetry property. Examples include exchangeability, stationarity, and various others. We show that, under a suitable mixing condition, estimates computed as ergodic averages of such processes satisfy a central limit theorem, a Berry-Esseen bound, and a concentration inequality. These are generalized further to triangular arrays, to a class of generalized U-statistics, and to a form of random censoring. As applications, we obtain new results on exchangeability, and on estimation in random fields and certain network models; extend results on graphon models; give a simpler proof of a recent central limit theorem for marked point processes; and establish asymptotic normality of the empirical entropy of a large class of processes. In certain special cases, we recover well-known properties, which can hence be interpreted as a direct consequence of symmetry. The proofs adapt Stein's method. Subsequently, we consider a sequence of (potentially random) functions evaluated along a sequence of exchangeable structures. We show that, under general stability conditions, those values are asymptotically normal. Those conditions are vaguely reminiscent of those familiar from concentration results, though not identical. We require that the output of the functions does not vary significantly when an entry is disturbed, and that the size of this variation does not depend markedly on the other entries. Our result generalizes a number of known results, and as corollaries we obtain new results for several applications: for randomly sub-sampled subgraphs; for risk estimates obtained by K-fold cross validation; and for the empirical risk of double bagging algorithms. The proof adapts the martingale central-limit theorem.

Columbia University Academic Commons

Limit theorems for distributions invariant under groups of transformations

Authors: Austern Morgane, Orbanz Peter
Publication venue: INST MATHEMATICAL STATISTICS-IMS
Publication date: 01/08/2022

A distributional symmetry is invariance of a distribution under a group of transformations. Exchangeability and stationarity are examples. We explain that a result of ergodic theory implies a law of large numbers for such invariant distributions: If the group satisfies suitable conditions, expectations can be estimated by averaging over subsets of transformations, and these estimators are strongly consistent. We show that, if a mixing condition holds, the averages also satisfy a central limit theorem, a Berry-Esseen bound, and concentration. These are extended further to apply to triangular arrays, to randomly subsampled averages, and to a generalization of U-statistics. As applications, we obtain a general limit theorem for exchangeable random structures, and new results on stationary random fields, network models, and a class of marked point processes. We also establish asymptotic normality of the empirical entropy for a large class of processes. Some known results are recovered as special cases, and can hence be interpreted as an outcome of symmetry. The proofs adapt Stein's method

UCL Discovery

Wasserstein-p Bounds in the Central Limit Theorem under Weak Dependence

Authors: Austern Morgane, Liu Tianle
Publication date: 19/09/2022

The central limit theorem is one of the most fundamental results in probability and has been successfully extended to locally dependent data and strongly mixing random fields. In this paper, we establish its rate of convergence for transport distances: for arbitrary $p\ge 1$, we obtain an upper bound on the Wasserstein-$p$ distance for locally dependent random variables and strongly mixing stationary random fields. Our proofs adapt the Stein dependency neighborhood method to the Wasserstein-$p$ distance, and as a by-product we establish high-order local expansions of the Stein equation for dependent random variables. Finally, we demonstrate how our results can be used to obtain tail bounds that are asymptotically tight, and decrease polynomially fast, for the empirical average of weakly dependent random variables.

arXiv.org e-Print Archive
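
As a reminder of the distance the abstract above bounds, the Wasserstein-$p$ distance between two probability measures on the real line can be written as below. This is the standard definition and its one-dimensional quantile form, not a formula quoted from the paper; the symbols $\mu$, $\nu$, $\Gamma(\mu,\nu)$ and $F^{-1}$ are notation assumed here.

```latex
% \Gamma(\mu,\nu): set of couplings of \mu and \nu (joint laws with these marginals).
\[
  \mathcal{W}_p(\mu,\nu)
  = \Bigl( \inf_{\gamma \in \Gamma(\mu,\nu)}
      \int |x - y|^{p} \, d\gamma(x,y) \Bigr)^{1/p},
  \qquad p \ge 1 .
\]
% On the real line the infimum is attained by the quantile coupling,
% with F_\mu^{-1}, F_\nu^{-1} the quantile functions of \mu and \nu:
\[
  \mathcal{W}_p(\mu,\nu)^{p}
  = \int_0^1 \bigl| F_\mu^{-1}(u) - F_\nu^{-1}(u) \bigr|^{p} \, du .
\]
```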

Wasserstein-p Bounds in the Central Limit Theorem Under Local Dependence

Authors: Austern Morgane, Liu Tianle
Publication date: 09/07/2023

The central limit theorem (CLT) is one of the most fundamental results in probability, and establishing its rate of convergence has been a key question since the 1940s. For independent random variables, a series of recent works established optimal error bounds under the Wasserstein-$p$ distance (with $p\ge 1$). In this paper, we extend those results to locally dependent random variables, which include $m$-dependent random fields and U-statistics. Under conditions on the moments and the dependency neighborhoods, we derive optimal rates in the CLT for the Wasserstein-$p$ distance. Our proofs rely on approximating the empirical average of dependent observations by the empirical average of i.i.d. random variables. To do so, we expand the Stein equation to arbitrary orders by adapting Stein's dependency neighborhood method. Finally, we illustrate the applicability of our results by obtaining efficient tail bounds.
Comment: 49 pages. arXiv admin note: substantial text overlap with arXiv:2209.0937

arXiv.org e-Print Archive
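
The kind of statement in the abstract above can be illustrated numerically. The sketch below is a Monte Carlo illustration only, not the paper's method or bounds. It simulates a 1-dependent sequence, normalizes its sum, and estimates the Wasserstein-$p$ distance to the standard normal via the one-dimensional quantile coupling. The choice of sequence (`X_i = e_i + e_{i+1}` with centered exponential noise), the function names, the number of replicates and the seed are all assumptions made for this example.

```python
import random
from statistics import NormalDist


def normalized_sum(n, rng):
    """One draw of W_n = sigma_n^{-1} * sum_{i=1}^n X_i for the assumed
    1-dependent sequence X_i = e_i + e_{i+1}, e_i i.i.d. centered exponential."""
    e = [rng.expovariate(1.0) - 1.0 for _ in range(n + 1)]  # mean 0, variance 1
    total = sum(e[i] + e[i + 1] for i in range(n))
    # total = e_0 + 2*(e_1 + ... + e_{n-1}) + e_n, so Var = 1 + 4*(n-1) + 1.
    sigma_n = (4 * n - 2) ** 0.5
    return total / sigma_n


def wasserstein_p_to_normal(samples, p):
    """Estimate W_p between the empirical law of `samples` and N(0, 1)
    by matching sorted samples to standard normal quantiles."""
    std_normal = NormalDist()
    xs = sorted(samples)
    m = len(xs)
    cost = sum(
        abs(x - std_normal.inv_cdf((k + 0.5) / m)) ** p
        for k, x in enumerate(xs)
    )
    return (cost / m) ** (1.0 / p)


def estimate(n, p=2, reps=4000, seed=0):
    """Monte Carlo estimate of W_p(law of W_n, N(0, 1)) from `reps` replicates."""
    rng = random.Random(seed)
    draws = [normalized_sum(n, rng) for _ in range(reps)]
    return wasserstein_p_to_normal(draws, p)


if __name__ == "__main__":
    # The estimates should shrink as n grows, up to Monte Carlo error.
    for n in (2, 16, 64):
        print(n, round(estimate(n), 3))
```

The estimate includes sampling error of its own, so it is only informative while the true distance is larger than the Monte Carlo noise; increasing `reps` tightens it at the cost of run time.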

Asymptotics of Network Embeddings Learned via Subsampling

Authors: Austern Morgane, Davison Andrew
Publication date: 05/07/2021

Network data are ubiquitous in modern machine learning, with tasks of interest including node classification, node clustering and link prediction. A frequent approach begins by learning a Euclidean embedding of the network, to which algorithms developed for vector-valued data are applied. For large networks, embeddings are learned using stochastic gradient methods where the sub-sampling scheme can be freely chosen. Despite the strong empirical performance of such methods, they are not well understood theoretically. Our work encapsulates representation methods using a subsampling approach, such as node2vec, into a single unifying framework. We prove, under the assumption that the graph is exchangeable, that the distribution of the learned embedding vectors asymptotically decouples. Moreover, we characterize the asymptotic distribution and provide rates of convergence in terms of the latent parameters, which include the choice of loss function and the embedding dimension. This provides a theoretical foundation to understand what the embedding vectors represent and how well these methods perform on downstream tasks. Notably, we observe that typically used loss functions may lead to shortcomings, such as a lack of Fisher consistency.
Comment: 98 pages, 3 figures, 1 tabl

arXiv.org e-Print Archive

Smooth Edgeworth Expansion and Wasserstein-$p$ Bounds for Mixing Random Fields

Authors: Austern Morgane, Liu Tianle
Publication date: 13/09/2023

The Edgeworth expansion is a central tool of probability that refines the central limit theorem by providing higher-order approximations. In this paper, we consider $d$-dimensional mixing random fields $\bigl(X^{(n)}_{i}\bigr)_{i\in T_{n}}$ and study the empirical average $W_n:=\sigma_n^{-1} \sum_{i\in T_n}X^{(n)}_{i}$. First, under mixing and moment conditions, we obtain a smooth Edgeworth expansion for $W_n$ to any order $p$. The proof relies on Stein's method and a new constructive graph approach. Then we use the obtained expansion and a method recently proposed in Liu and Austern [2023] to obtain the first rates for the central limit theorem in Wasserstein-$p$ distance for arbitrary $p\ge 1$. Finally, we apply those results to obtain tail bounds and non-uniform Berry-Esseen bounds with polynomial decay.
Comment: 92 pages, 4 figures. arXiv admin note: substantial text overlap with arXiv:2209.0937

arXiv.org e-Print Archive
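
To make "higher-order approximation" concrete, the classical first-order Edgeworth expansion in the simplest setting reads as follows. This is the textbook i.i.d. case, given only as background; it is not the smooth expansion for mixing random fields obtained in the paper. The symbols $\gamma$, $\Phi$ and $\varphi$ are notation assumed here.

```latex
% Assumed setting: X_1, X_2, \dots i.i.d. with mean 0, variance \sigma^2,
% finite third moment, non-lattice distribution;
% \gamma = E[X_1^3] / \sigma^3 is the skewness,
% \Phi and \varphi are the standard normal c.d.f. and density,
% and W_n = (\sigma \sqrt{n})^{-1} \sum_{i=1}^{n} X_i.
\[
  \mathbb{P}(W_n \le x)
  = \Phi(x) - \frac{\gamma}{6\sqrt{n}} \,(x^{2} - 1)\, \varphi(x)
    + o\bigl(n^{-1/2}\bigr),
  \qquad \text{uniformly in } x \in \mathbb{R}.
\]
```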