Cleaning large correlation matrices: tools from random matrix theory
This review covers recent results concerning the estimation of large
covariance matrices using tools from Random Matrix Theory (RMT). We introduce
several RMT methods and analytical techniques, such as the Replica formalism
and Free Probability, with an emphasis on the Marchenko-Pastur equation that
provides information on the resolvent of multiplicatively corrupted noisy
matrices. Special care is devoted to the statistics of the eigenvectors of the
empirical correlation matrix, which turn out to be crucial for many
applications. We show in particular how these results can be used to build
consistent "Rotationally Invariant" estimators (RIE) for large correlation
matrices when there is no prior on the structure of the underlying process. The
last part of this review is dedicated to some real-world applications within
financial markets as a case in point. We establish empirically the efficacy of
the RIE framework, which is found to be superior in this case to all previously
proposed methods. The case of additively (rather than multiplicatively)
corrupted noisy matrices is also dealt with in a special Appendix. Several open
problems and interesting technical developments are discussed throughout the
paper. Comment: 165 pages, article submitted to Physics Report
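As a rough illustration of the noise bulk that the Marchenko-Pastur equation describes, the sketch below compares the eigenvalues of a pure-noise sample correlation matrix with the Marchenko-Pastur edges, then applies eigenvalue clipping. Clipping is a simpler, older cleaning scheme and is not the RIE discussed in the abstract. The function names, the sizes N and T, and the synthetic Gaussian data are all assumptions made for this example.

```python
import numpy as np

def marchenko_pastur_edges(q):
    """Edges of the Marchenko-Pastur bulk for q = N/T < 1 and unit variance."""
    return (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2

def clip_eigenvalues(corr, q):
    """Eigenvalue clipping (NOT the RIE): flatten eigenvalues inside the noise bulk."""
    vals, vecs = np.linalg.eigh(corr)
    _, upper = marchenko_pastur_edges(q)
    noise = vals <= upper
    cleaned = vals.copy()
    if noise.any():
        # Replace the noisy eigenvalues by their average, which preserves the trace.
        cleaned[noise] = vals[noise].mean()
    # Rebuild the matrix from the original eigenvectors and cleaned eigenvalues.
    return (vecs * cleaned) @ vecs.T

# Synthetic data with no true correlations (assumption for the example).
rng = np.random.default_rng(0)
N, T = 100, 400
X = rng.standard_normal((T, N))
E = np.corrcoef(X, rowvar=False)          # empirical correlation matrix

lo, hi = marchenko_pastur_edges(N / T)
vals = np.linalg.eigvalsh(E)
cleaned = clip_eigenvalues(E, N / T)
print(f"MP edges: [{lo:.3f}, {hi:.3f}]")
print(f"empirical eigenvalue range: [{vals.min():.3f}, {vals.max():.3f}]")
```

With no true correlations, the empirical eigenvalues should fall close to the predicted interval, and clipping pulls the matrix back toward the identity.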
Free Random Levy Variables and Financial Probabilities
We suggest that Free Random Variables, represented here by large random
matrices with spectral Levy disorder, may be relevant for several problems
related to the modeling of financial systems. In particular, we consider a
financial covariance matrix composed of asymmetric and free random Levy
matrices. We derive an algebraic equation for the resolvent and solve it to
extract the spectral density. The free eigenvalue spectrum is in remarkable
agreement with the one obtained from the covariance matrix of the SP500
financial market. Comment: 8 pages with 2 EPS figures; talk given by M.A. Nowak at NATO Advanced
Research Workshop "Applications of Physics to Economic Modeling", Prague,
8-10 February, 200
Multivariate type G Matérn stochastic partial differential equation random fields
For many applications with multivariate data, random field models capturing
departures from Gaussianity within realisations are appropriate. For this
reason, we formulate a new class of multivariate non-Gaussian models based on
systems of stochastic partial differential equations with additive type G noise
whose marginal covariance functions are of Matérn type. We consider four
increasingly flexible constructions of the noise, where the first two are
similar to existing copula-based models. In contrast to these, the latter two
constructions can model non-Gaussian spatial data without replicates.
Computationally efficient methods for likelihood-based parameter estimation and
probabilistic prediction are proposed, and the flexibility of the suggested
models is illustrated by numerical examples and two statistical applications.
A Random Matrix Approach to VARMA Processes
We apply random matrix theory to derive the spectral density of large sample
covariance matrices generated by multivariate VMA(q), VAR(q) and VARMA(q1,q2)
processes. In particular, we consider a limit where the number of random
variables N and the number of consecutive time measurements T are large but the
ratio N/T is fixed. In this regime the underlying random matrices are
asymptotically equivalent to Free Random Variables (FRV). We apply the FRV
calculus to calculate the eigenvalue density of the sample covariance for
several VARMA-type processes. We explicitly solve the VARMA(1,1) case and
demonstrate a perfect agreement between the analytical result and the spectra
obtained by Monte Carlo simulations. The proposed method is purely algebraic
and can be easily generalized to q1>1 and q2>1. Comment: 16 pages, 6 figures, submitted to New Journal of Physic
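The Monte Carlo side of such a comparison can be sketched as follows. The example simulates the simplest case, N independent AR(1) series (a VAR(1) process with a diagonal coefficient matrix), and computes the eigenvalues of the sample covariance at fixed N/T. The analytical FRV density is not reproduced. The function name, the coefficient a, the burn-in length and the sizes are assumptions chosen for illustration.

```python
import numpy as np

def simulate_var1(N, T, a, rng, burn_in=200):
    """Simulate N independent AR(1) series x_t = a * x_{t-1} + eps_t (diagonal VAR(1))."""
    eps = rng.standard_normal((T + burn_in, N))
    x = np.zeros_like(eps)
    for t in range(1, T + burn_in):
        x[t] = a * x[t - 1] + eps[t]
    # Discard the burn-in so the retained samples are close to stationary.
    return x[burn_in:]

rng = np.random.default_rng(1)
N, T, a = 100, 400, 0.5                   # ratio N/T = 0.25 held fixed
X = simulate_var1(N, T, a, rng)

C = X.T @ X / T                           # sample covariance (the process has zero mean)
eigs = np.linalg.eigvalsh(C)

# The mean eigenvalue estimates the stationary variance 1 / (1 - a^2).
print(f"mean eigenvalue: {eigs.mean():.3f}, stationary variance: {1 / (1 - a**2):.3f}")
print(f"eigenvalue range: [{eigs.min():.3f}, {eigs.max():.3f}]")
```

A histogram of `eigs`, averaged over many realisations, is the quantity one would compare against an analytical eigenvalue density.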
Fast and reliable MCMC for cosmological parameter estimation
Markov Chain Monte Carlo (MCMC) techniques are now widely used for
cosmological parameter estimation. Chains are generated to sample the posterior
probability distribution obtained following the Bayesian approach. An important
issue is how to optimize the efficiency of such sampling and how to diagnose
whether a finite-length chain has adequately sampled the underlying posterior
probability distribution. We show how the power spectrum of a single such
finite chain may be used as a convergence diagnostic by means of a fitting
function, and discuss strategies for optimizing the distribution for the
proposed steps. The methods developed are applied to current CMB and LSS data
interpreted using both a pure adiabatic cosmological model and a mixed
adiabatic/isocurvature cosmological model including possible correlations
between modes. For the latter application, because of the increased
dimensionality and the presence of degeneracies, the need for tuning MCMC
methods for maximum efficiency becomes particularly acute. Comment: 12 pages, 17 figures. Submitted to MNRA
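To make the idea of a chain power spectrum concrete, the sketch below runs a random-walk Metropolis sampler on a one-dimensional standard normal target and computes the periodogram of the resulting chain. The paper's fitting function is not reproduced here. The target, the step size, the chain length and the function names are assumptions for this toy example.

```python
import numpy as np

def metropolis_gaussian(n, step, rng):
    """Random-walk Metropolis chain targeting a standard normal density (toy target)."""
    x = 0.0
    out = np.empty(n)
    for i in range(n):
        prop = x + step * rng.standard_normal()
        # Accept with probability min(1, pi(prop) / pi(x)) for pi(x) ~ exp(-x^2 / 2).
        if np.log(rng.random()) < 0.5 * (x**2 - prop**2):
            x = prop
        out[i] = x
    return out

def chain_power_spectrum(chain):
    """Periodogram P(k) of a single chain, with the zero mode dropped."""
    x = np.asarray(chain, dtype=float)
    x = x - x.mean()
    n = x.size
    power = np.abs(np.fft.rfft(x)) ** 2 / n
    k = 2 * np.pi * np.arange(power.size) / n
    return k[1:], power[1:]

rng = np.random.default_rng(2)
chain = metropolis_gaussian(2**14, step=0.5, rng=rng)
k, P = chain_power_spectrum(chain)

low = P[:50].mean()     # small k: long-range behaviour of the chain
high = P[-50:].mean()   # large k: step-to-step behaviour
print(f"low-k power: {low:.2f}, high-k power: {high:.4f}")
```

A correlated chain shows much more power at small k than at large k. For a chain that is long compared with its correlation length, the spectrum flattens as k goes to zero.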
Inferring hidden states in Langevin dynamics on large networks: Average case performance
We present average performance results for dynamical inference problems in
large networks, where a set of nodes is hidden while the time trajectories of
the others are observed. Examples of this scenario can occur in signal
transduction and gene regulation networks. We focus on the linear stochastic
dynamics of continuous variables interacting via random Gaussian couplings of
generic symmetry. We analyze the inference error, given by the variance of the
posterior distribution over hidden paths, in the thermodynamic limit and as a
function of the system parameters and the ratio α between the number of
hidden and observed nodes. By applying Kalman filter recursions we find that
the posterior dynamics is governed by an "effective" drift that incorporates
the effect of the observations. We present two approaches for characterizing
the posterior variance that allow us to tackle, respectively, equilibrium and
nonequilibrium dynamics. The first appeals to Random Matrix Theory and reveals
average spectral properties of the inference error and typical posterior
relaxation times, the second is based on dynamical functionals and yields the
inference error as the solution of an algebraic equation. Comment: 20 pages, 5 figure
"Nonlinear" covariance matrix and portfolio theory for non-Gaussian multivariate distributions
This paper offers a precise analytical characterization of the distribution
of returns for a portfolio constituted of assets whose returns are described by
an arbitrary joint multivariate distribution. To this end, we introduce a
non-linear transformation that maps the returns onto Gaussian variables whose
covariance matrix provides a new measure of dependence between the non-normal
returns, generalizing the covariance matrix into a non-linear fractional
covariance matrix. This nonlinear covariance matrix is chiseled to the specific
fat tail structure of the underlying marginal distributions, thus ensuring
stability and good conditioning. The portfolio distribution is obtained as the
solution of a mapping to a so-called phi-q field theory in particle physics, of
which we offer an extensive treatment using Feynman diagrammatic techniques and
large deviation theory, which we illustrate in detail for multivariate Weibull
distributions. The main result of our theory is that minimizing the portfolio
variance (i.e. the relatively "small" risks) may often increase the large
risks, as measured by higher normalized cumulants. Extensive empirical tests
are presented on the foreign exchange market that satisfactorily validate the
theory. For "fat tail" distributions, we show that an adequate prediction of
the risks of a portfolio relies much more on the correct description of the
tail structure than on their correlations. Comment: Latex, 76 page
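The mapping of returns onto Gaussian variables can be illustrated with a generic, rank-based version, y = Φ^{-1}(F̂(x)), where F̂ is the empirical distribution function of each asset's returns. This is a sketch under stated assumptions and not the paper's parametric transformation: the `gaussianize` helper, the synthetic fat-tailed returns and the target correlation of 0.6 are all hypothetical.

```python
import numpy as np
from statistics import NormalDist

def gaussianize(x):
    """Map a sample onto normal scores via y = Phi^{-1}(F_hat(x)), F_hat = empirical CDF."""
    x = np.asarray(x, dtype=float)
    ranks = np.argsort(np.argsort(x)) + 1   # ranks 1..n
    p = ranks / (x.size + 1)                # probabilities strictly inside (0, 1)
    inv_cdf = NormalDist().inv_cdf
    return np.array([inv_cdf(v) for v in p])

# Toy fat-tailed returns: a monotone cubic distortion of correlated Gaussians (assumption).
rng = np.random.default_rng(3)
n = 5000
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=n)
returns = np.sign(z) * np.abs(z) ** 3

y = np.column_stack([gaussianize(returns[:, j]) for j in range(2)])
rho_raw = np.corrcoef(returns, rowvar=False)[0, 1]
rho_gauss = np.corrcoef(y, rowvar=False)[0, 1]
print(f"correlation of raw returns: {rho_raw:.3f}")
print(f"correlation of Gaussianized returns: {rho_gauss:.3f}")
```

Because the distortion is monotone, the correlation of the transformed variables recovers the dependence of the underlying Gaussians. The ordinary correlation of the raw returns is distorted by the tails.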
Superstatistical generalisations of Wishart-Laguerre ensembles of random matrices
Using Beck and Cohen's superstatistics, we introduce in a systematic way a family of generalized Wishart–Laguerre ensembles of random matrices with Dyson index β = 1, 2 and 4. The entries of the data matrix are Gaussian random variables whose variances η fluctuate from one sample to another according to a certain probability density f(η) and a single deformation parameter γ. Three superstatistical classes for f(η) are usually considered: χ²-, inverse χ²- and log-normal distributions. While the first class, already considered by two of the authors, leads to a power-law decay of the spectral density, we here introduce and solve exactly a superposition of Wishart–Laguerre ensembles with inverse χ²-distribution. The corresponding macroscopic spectral density is given by a γ-deformation of the semi-circle and Marčenko–Pastur laws, on a non-compact support with exponential tails. After discussing in detail the validity of Wigner's surmise in the Wishart–Laguerre class, we introduce a generalized γ-dependent surmise with stretched-exponential tails, which well approximates the individual level spacing distribution in the bulk. The analytical results are in excellent agreement with numerical simulations. To illustrate our findings we compare the χ²- and inverse χ²-classes to empirical data from financial covariance matrices.