    Cleaning large correlation matrices: tools from random matrix theory

    This review covers recent results concerning the estimation of large covariance matrices using tools from Random Matrix Theory (RMT). We introduce several RMT methods and analytical techniques, such as the Replica formalism and Free Probability, with an emphasis on the Marchenko-Pastur equation that provides information on the resolvent of multiplicatively corrupted noisy matrices. Special care is devoted to the statistics of the eigenvectors of the empirical correlation matrix, which turn out to be crucial for many applications. We show in particular how these results can be used to build consistent "Rotationally Invariant" estimators (RIE) for large correlation matrices when there is no prior on the structure of the underlying process. The last part of this review is dedicated to some real-world applications within financial markets as a case in point. We establish empirically the efficacy of the RIE framework, which is found to be superior in this case to all previously proposed methods. The case of additively (rather than multiplicatively) corrupted noisy matrices is also dealt with in a special Appendix. Several open problems and interesting technical developments are discussed throughout the paper. Comment: 165 pages, article submitted to Physics Reports

    Free Random Levy Variables and Financial Probabilities

    We suggest that Free Random Variables, represented here by large random matrices with spectral Levy disorder, may be relevant for several problems related to the modeling of financial systems. In particular, we consider a financial covariance matrix composed of asymmetric and free random Levy matrices. We derive an algebraic equation for the resolvent and solve it to extract the spectral density. The free eigenvalue spectrum is in remarkable agreement with the one obtained from the covariance matrix of the SP500 financial market. Comment: 8 pages with 2 EPS figures; talk given by M.A. Nowak at NATO Advanced Research Workshop "Applications of Physics to Economic Modeling", Prague, 8-10 February, 200

    Multivariate type G Matérn stochastic partial differential equation random fields

    For many applications with multivariate data, random field models capturing departures from Gaussianity within realisations are appropriate. For this reason, we formulate a new class of multivariate non-Gaussian models based on systems of stochastic partial differential equations with additive type G noise whose marginal covariance functions are of Matérn type. We consider four increasingly flexible constructions of the noise, where the first two are similar to existing copula-based models. In contrast to these, the latter two constructions can model non-Gaussian spatial data without replicates. Computationally efficient methods for likelihood-based parameter estimation and probabilistic prediction are proposed, and the flexibility of the suggested models is illustrated by numerical examples and two statistical applications.

    A Random Matrix Approach to VARMA Processes

    We apply random matrix theory to derive the spectral density of large sample covariance matrices generated by multivariate VMA(q), VAR(q) and VARMA(q1,q2) processes. In particular, we consider a limit where the number of random variables N and the number of consecutive time measurements T are large but the ratio N/T is fixed. In this regime the underlying random matrices are asymptotically equivalent to Free Random Variables (FRV). We apply the FRV calculus to calculate the eigenvalue density of the sample covariance for several VARMA-type processes. We explicitly solve the VARMA(1,1) case and demonstrate perfect agreement between the analytical result and the spectra obtained by Monte Carlo simulations. The proposed method is purely algebraic and can be easily generalized to q1>1 and q2>1. Comment: 16 pages, 6 figures, submitted to New Journal of Physics
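The large-N, large-T limit at fixed N/T described above can be illustrated in its simplest setting. The sketch below is not the VARMA calculation from the paper: it assumes uncorrelated, unit-variance Gaussian data (the white-noise baseline), for which the limiting eigenvalue density is the Marchenko-Pastur law with support edges (1 ± sqrt(q))^2, q = N/T. The function names `mp_edges` and `sample_cov_eigs` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def mp_edges(q):
    """Support edges of the Marchenko-Pastur law for unit-variance
    white noise, with q = N/T <= 1."""
    return (1.0 - np.sqrt(q)) ** 2, (1.0 + np.sqrt(q)) ** 2

def sample_cov_eigs(n_vars, n_obs, seed=0):
    """Eigenvalues of the sample covariance of an n_vars x n_obs matrix
    of i.i.d. standard normal entries (assumed white-noise data)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_vars, n_obs))
    cov = x @ x.T / n_obs          # N x N sample covariance
    return np.linalg.eigvalsh(cov)  # real eigenvalues, ascending order

# Monte Carlo spectrum versus the analytical support for q = 200/800 = 0.25
eigs = sample_cov_eigs(200, 800)
lo, hi = mp_edges(200 / 800)
print(f"analytical support: [{lo:.3f}, {hi:.3f}]")
print(f"simulated range:    [{eigs.min():.3f}, {eigs.max():.3f}]")
```

At finite N the simulated extreme eigenvalues sit close to, but not exactly on, the analytical edges; the gap shrinks as N and T grow at fixed ratio.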

    Fast and reliable MCMC for cosmological parameter estimation

    Markov Chain Monte Carlo (MCMC) techniques are now widely used for cosmological parameter estimation. Chains are generated to sample the posterior probability distribution obtained following the Bayesian approach. An important issue is how to optimize the efficiency of such sampling and how to diagnose whether a finite-length chain has adequately sampled the underlying posterior probability distribution. We show how the power spectrum of a single such finite chain may be used as a convergence diagnostic by means of a fitting function, and discuss strategies for optimizing the distribution for the proposed steps. The methods developed are applied to current CMB and LSS data interpreted using both a pure adiabatic cosmological model and a mixed adiabatic/isocurvature cosmological model including possible correlations between modes. For the latter application, because of the increased dimensionality and the presence of degeneracies, the need for tuning MCMC methods for maximum efficiency becomes particularly acute. Comment: 12 pages, 17 figures. Submitted to MNRAS
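The idea of reading convergence information off the power spectrum of a single chain can be sketched as follows. This is only an illustration under stated assumptions: an AR(1) sequence stands in for a correlated MCMC chain, the normalisation |FFT|^2 / n is one common convention, and the paper's fitting function is not reproduced here. The helper names `chain_power_spectrum` and `ar1_chain` are hypothetical.

```python
import numpy as np

def chain_power_spectrum(chain):
    """Periodogram of a 1-D chain: |FFT|^2 / n at the non-negative
    frequencies, after subtracting the chain mean."""
    x = np.asarray(chain, dtype=float)
    x = x - x.mean()
    return np.abs(np.fft.rfft(x)) ** 2 / x.size

def ar1_chain(n, phi, seed=0):
    """AR(1) sequence x[t] = phi * x[t-1] + noise, used here as a
    stand-in for a correlated chain (an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = noise[0]
    for t in range(1, n):
        x[t] = phi * x[t - 1] + noise[t]
    return x

n = 4096
p_white = chain_power_spectrum(ar1_chain(n, phi=0.0))  # uncorrelated draws
p_corr = chain_power_spectrum(ar1_chain(n, phi=0.9))   # strongly correlated

# Uncorrelated draws give a flat spectrum; correlated steps pile power
# into the lowest wavenumbers.
print("white: low-k", p_white[1:50].mean(), "high-k", p_white[-50:].mean())
print("corr:  low-k", p_corr[1:50].mean(), "high-k", p_corr[-50:].mean())
```

A spectrum that is flat at the lowest wavenumbers indicates that the chain is long compared with its correlation length, which is the regime in which the sample mean and variance can be trusted.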

    Inferring hidden states in Langevin dynamics on large networks: Average case performance

    We present average performance results for dynamical inference problems in large networks, where a set of nodes is hidden while the time trajectories of the others are observed. Examples of this scenario can occur in signal transduction and gene regulation networks. We focus on the linear stochastic dynamics of continuous variables interacting via random Gaussian couplings of generic symmetry. We analyze the inference error, given by the variance of the posterior distribution over hidden paths, in the thermodynamic limit and as a function of the system parameters and the ratio α between the number of hidden and observed nodes. By applying Kalman filter recursions we find that the posterior dynamics is governed by an "effective" drift that incorporates the effect of the observations. We present two approaches for characterizing the posterior variance that allow us to tackle, respectively, equilibrium and nonequilibrium dynamics. The first appeals to Random Matrix Theory and reveals average spectral properties of the inference error and typical posterior relaxation times; the second is based on dynamical functionals and yields the inference error as the solution of an algebraic equation. Comment: 20 pages, 5 figures

    "Nonlinear" covariance matrix and portfolio theory for non-Gaussian multivariate distributions

    This paper offers a precise analytical characterization of the distribution of returns for a portfolio constituted of assets whose returns are described by an arbitrary joint multivariate distribution. To this end, we introduce a non-linear transformation that maps the returns onto Gaussian variables whose covariance matrix provides a new measure of dependence between the non-normal returns, generalizing the covariance matrix into a non-linear fractional covariance matrix. This nonlinear covariance matrix is chiseled to the specific fat tail structure of the underlying marginal distributions, thus ensuring stability and good conditioning. The portfolio distribution is obtained as the solution of a mapping to a so-called phi-q field theory in particle physics, of which we offer an extensive treatment using Feynman diagrammatic techniques and large deviation theory, which we illustrate in detail for multivariate Weibull distributions. The main result of our theory is that minimizing the portfolio variance (i.e. the relatively "small" risks) may often increase the large risks, as measured by higher normalized cumulants. Extensive empirical tests are presented on the foreign exchange market that validate the theory satisfactorily. For "fat tail" distributions, we show that an adequate prediction of the risks of a portfolio relies much more on the correct description of the tail structure than on their correlations. Comment: LaTeX, 76 pages

    Superstatistical generalisations of Wishart-Laguerre ensembles of random matrices

    Using Beck and Cohen's superstatistics, we introduce in a systematic way a family of generalized Wishart–Laguerre ensembles of random matrices with Dyson index β = 1, 2 and 4. The entries of the data matrix are Gaussian random variables whose variances η fluctuate from one sample to another according to a certain probability density f(η) and a single deformation parameter γ. Three superstatistical classes for f(η) are usually considered: χ²-, inverse χ²- and log-normal distributions. While the first class, already considered by two of the authors, leads to a power-law decay of the spectral density, we here introduce and solve exactly a superposition of Wishart–Laguerre ensembles with inverse χ²-distribution. The corresponding macroscopic spectral density is given by a γ-deformation of the semi-circle and Marčenko–Pastur laws, on a non-compact support with exponential tails. After discussing in detail the validity of Wigner's surmise in the Wishart–Laguerre class, we introduce a generalized γ-dependent surmise with stretched-exponential tails, which well approximates the individual level spacing distribution in the bulk. The analytical results are in excellent agreement with numerical simulations. To illustrate our findings we compare the χ²- and inverse χ²-classes to empirical data from financial covariance matrices.