11 research outputs found

    Asymptotic Normality of the Maximum Pseudolikelihood Estimator for Fully Visible Boltzmann Machines

    Full text link
    Boltzmann machines (BMs) are a class of binary neural networks for which numerous methods of estimation have been proposed. Recently, it has been shown that in the fully visible case of the BM, the method of maximum pseudolikelihood estimation (MPLE) yields parameter estimates that are consistent in the probabilistic sense. In this article, we investigate the properties of MPLE for fully visible BMs further and prove that MPLE also yields an asymptotically normal parameter estimator. These results can be used to construct confidence intervals and to test statistical hypotheses. We support our theoretical results by showing that the estimator behaves as expected in a simulation study.
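
    As a concrete illustration of the estimator studied here, the following is a minimal sketch of MPLE for a fully visible Boltzmann machine with +/-1 units, obtained by jointly maximising each unit's conditional likelihood given all the others. The dense parameterisation, the variable names, and the use of a generic quasi-Newton optimiser are illustrative assumptions, not the authors' implementation.

    import numpy as np
    from scipy.optimize import minimize

    def neg_log_pseudolikelihood(theta, X):
        """Average negative log-pseudolikelihood of data X (n x d, entries in {-1, +1})."""
        n, d = X.shape
        W = theta[:d * d].reshape(d, d)
        W = 0.5 * (W + W.T)           # symmetric couplings
        np.fill_diagonal(W, 0.0)      # no self-interactions
        b = theta[d * d:]
        H = X @ W + b                 # conditional field acting on each unit
        # p(x_i | x_-i) = sigmoid(2 * x_i * h_i); -log sigmoid(z) = logaddexp(0, -z)
        return np.sum(np.logaddexp(0.0, -2.0 * X * H)) / n

    def fit_mple(X):
        d = X.shape[1]
        theta0 = np.zeros(d * d + d)
        res = minimize(neg_log_pseudolikelihood, theta0, args=(X,), method="L-BFGS-B")
        W = 0.5 * (res.x[:d * d].reshape(d, d) + res.x[:d * d].reshape(d, d).T)
        np.fill_diagonal(W, 0.0)
        return W, res.x[d * d:]       # estimated couplings and biases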

    Quasi-pseudolikelihood in Markov network structure learning

    Get PDF
    Probabilistic graphical models are a versatile tool for doing statistical inference with complex models. The main impediment to their use, especially with more elaborate models, is the heavy computational cost incurred. The development of approximations that enable the use of graphical models in various tasks while requiring fewer computational resources is therefore an important area of research. In this thesis, we test one such recently proposed family of approximations, called quasi-pseudolikelihood (QPL). Graphical models come in two main variants: directed models and undirected models, of which the latter are also called Markov networks or Markov random fields. Here we focus solely on the undirected case with continuous-valued variables. The specific inference task the QPL approximations target is model structure learning, i.e., learning the model dependence structure from data. In the theoretical part of the thesis, we define the basic concepts that underpin the use of graphical models and derive the general QPL approximation. As a novel contribution, we show that one member of the QPL approximation family is not consistent in the general case: asymptotically, for this QPL version, there exists a case where the learned dependence structure does not converge to the true model structure. In the empirical part of the thesis, we test two members of the QPL family on simulated datasets. We generate datasets from Ising models and Sherrington-Kirkpatrick models and try to learn them using the QPL approximations. As a reference method, we use the well-established Graphical lasso (Glasso). Based on our results, the tested QPL approximations work well with relatively sparse dependence structures, while more densely connected models, especially those with weaker interaction strengths, present challenges that call for further research.
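
    For reference, here is a minimal sketch of the comparison baseline mentioned above, the Graphical lasso, applied to data simulated from a sparse Gaussian model; the toy tridiagonal precision matrix and the regularisation value are illustrative assumptions rather than the settings used in the thesis.

    import numpy as np
    from sklearn.covariance import GraphicalLasso

    rng = np.random.default_rng(0)

    # toy chain-structured model: tridiagonal precision => sparse dependence structure
    d = 5
    Q = np.eye(d) + 0.4 * (np.eye(d, k=1) + np.eye(d, k=-1))
    X = rng.multivariate_normal(np.zeros(d), np.linalg.inv(Q), size=2000)

    model = GraphicalLasso(alpha=0.05).fit(X)
    # nonzero off-diagonal entries of the estimated precision matrix define the learned structure
    edges = np.abs(model.precision_) > 1e-3
    np.fill_diagonal(edges, False)
    print(edges.astype(int))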

    Distributed Learning, Prediction and Detection in Probabilistic Graphs.

    Full text link
    Critical to high-dimensional statistical estimation is exploiting the structure in the data distribution. Probabilistic graphical models provide an efficient framework for representing complex joint distributions of random variables through their conditional dependency graph, and can be adapted to many high-dimensional machine learning applications. This dissertation develops the probabilistic graphical modeling technique for three statistical estimation problems arising in real-world applications: distributed and parallel learning in networks, missing-value prediction in recommender systems, and emerging topic detection in text corpora. The common theme behind all proposed methods is a combination of a parsimonious representation of uncertainties in the data, an optimization surrogate that leads to computationally efficient algorithms, and fundamental limits on estimation performance in high dimension. More specifically, the dissertation makes the following theoretical contributions: (1) We propose a distributed and parallel framework for learning the parameters in Gaussian graphical models that is free of iterative global message passing. The proposed distributed estimator is shown to be asymptotically consistent, to improve with increasing local neighborhood sizes, and to have a high-dimensional error rate comparable to that of the centralized maximum likelihood estimator. (2) We present a family of latent variable Gaussian graphical models whose marginal precision matrix has a “low-rank plus sparse” structure. Under mild conditions, we analyze the high-dimensional parameter error bounds for learning this family of models using regularized maximum likelihood estimation. (3) We consider a hypothesis testing framework for detecting emerging topics in topic models, and propose a novel surrogate test statistic for the standard likelihood ratio. By leveraging the theory of empirical processes, we prove asymptotic consistency for the proposed test and provide guarantees of the detection performance.
    PhD thesis, Electrical Engineering: Systems, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/110499/1/mengzs_1.pd
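
    To make the flavour of contribution (1) concrete, the sketch below illustrates the general node-wise principle behind message-passing-free estimation in a Gaussian graphical model: each node estimates its own row of the precision matrix from a local regression on a given neighbourhood. This is only an assumed illustration of the idea, not the dissertation's estimator, and the neighbourhood sets are taken as known.

    import numpy as np

    def local_precision_row(X, i, neighbours):
        """Estimate row i of the precision matrix using node i and its neighbourhood only."""
        y = X[:, i]
        Z = X[:, neighbours]
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid_var = np.mean((y - Z @ beta) ** 2)     # conditional variance of node i
        row = np.zeros(X.shape[1])
        row[i] = 1.0 / resid_var                     # Theta_ii = 1 / sigma_i^2
        row[neighbours] = -beta / resid_var          # Theta_ij = -beta_j / sigma_i^2
        return row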

    Collective behaviours in the stock market -- A maximum entropy approach

    Full text link
    Scale invariance, collective behaviours and structural reorganization are crucial for portfolio management (portfolio composition, hedging, alternative definitions of risk, etc.). This lack of any characteristic scale and such elaborate behaviours find their origin in the theory of complex systems. There are several mechanisms that generate scale invariance, but maximum entropy models are able to explain both scale invariance and collective behaviours. The study of the structure and collective modes of financial markets attracts more and more attention. It has been shown that some agent-based models are able to reproduce some stylized facts. Despite their partial success, there is still the problem of rule design. In this work, we used a statistical inverse approach to model the structure and co-movements in financial markets. Inverse models restrict the number of assumptions. We found that a pairwise maximum entropy model is consistent with the data and is able to describe the complex structure of financial systems. We considered the existence of a critical state, which is linked to how the market processes information, how it responds to exogenous inputs and how its structure changes. The considered data sets did not reveal a persistent critical state but rather oscillations between order and disorder. In this framework, we also showed that the collective modes are mostly dominated by pairwise co-movements and that univariate models are not good candidates to model crashes. The analysis also suggests a genuine adaptive process, since both the maximum variance of the log-likelihood and the accuracy of the predictive scheme vary through time. This approach may provide some clues to crash precursors and may shed light on how a shock spreads in a financial network and whether it will lead to a crash. The natural continuation of the present work could be the study of such a mechanism.
    Comment: 146 pages, PhD Thesis
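
    As a rough illustration of the inverse approach described above, the sketch below binarises returns into +/-1 orientations and recovers pairwise couplings of an Ising-like maximum entropy model with the naive mean-field inversion, taking the couplings from the off-diagonal of the inverted covariance matrix. The function name is made up and this simple inversion is only a stand-in for whatever inference scheme the thesis actually uses.

    import numpy as np

    def mean_field_couplings(returns):
        """returns: n_days x n_assets array of asset returns."""
        s = np.sign(returns)                  # daily orientation (+1 up, -1 down) of each asset
        s[s == 0] = 1
        C = np.cov(s, rowvar=False)           # covariance of the binarised orientations
        J = -np.linalg.inv(C)                 # naive mean-field inversion
        np.fill_diagonal(J, 0.0)              # self-couplings are not part of the model
        return J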

    Asymptotic normality of the maximum pseudolikelihood estimator for fully visible Boltzmann machines

    No full text
    Boltzmann machines (BMs) are a class of binary neural networks for which numerous methods of estimation have been proposed. Recently, it has been shown that in the fully visible case of the BM, the method of maximum pseudolikelihood estimation (MPLE) results in parameter estimates that are consistent in the probabilistic sense. In this brief, we investigate the properties of MPLE for fully visible BMs further, and prove that MPLE also yields an asymptotically normal parameter estimator. These results can be used to construct confidence intervals and to test statistical hypotheses. These constructions provide a closed-form alternative to current methods that require Monte Carlo simulation or resampling. We support our theoretical results by showing that the estimator behaves as expected in simulation studies.
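
    The closed-form construction alluded to here amounts to Wald-type intervals built from the asymptotic normal distribution; a minimal sketch follows, assuming theta_hat is the MPLE and Sigma_hat is any consistent estimate of its asymptotic covariance matrix (the specific covariance estimate is not taken from the paper).

    import numpy as np
    from scipy.stats import norm

    def wald_intervals(theta_hat, Sigma_hat, n, level=0.95):
        """Componentwise intervals theta_k +/- z * sqrt(Sigma_kk / n) from asymptotic normality."""
        z = norm.ppf(0.5 + level / 2.0)
        half_width = z * np.sqrt(np.diag(Sigma_hat) / n)
        return np.column_stack([theta_hat - half_width, theta_hat + half_width])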

    A comparison of the CAR and DAGAR spatial random effects models with an application to diabetics rate estimation in Belgium

    Get PDF
    When hierarchically modelling an epidemiological phenomenon on a finite collection of sites in space, one must always take a latent spatial effect into account in order to capture the correlation structure that links the phenomenon to the territory. In this work, we compare two autoregressive spatial models that can be used for this purpose: the classical CAR model and the more recent DAGAR model. Unlike the former, the latter has a desirable property: its ρ parameter can be naturally interpreted as the average neighbor pair correlation and, in addition, this parameter can be directly estimated when the effect is modelled using a DAGAR rather than a CAR structure. As an application, we model the diabetics rate in Belgium in 2014 and show the adequacy of these models in predicting the response variable when no covariates are available.
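
    For orientation, the sketch below builds the precision matrix of a standard proper CAR prior for the latent spatial effect, Q = tau * (D - rho * W), where W is the neighbourhood adjacency matrix of the sites and D holds the neighbour counts; the four-site toy adjacency is an illustrative assumption and the DAGAR counterpart is not shown.

    import numpy as np

    def car_precision(W, rho, tau):
        """Precision matrix of a proper CAR prior: Q = tau * (D - rho * W)."""
        D = np.diag(W.sum(axis=1))
        return tau * (D - rho * W)

    # four sites on a line: 1 - 2 - 3 - 4
    W = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
    Q = car_precision(W, rho=0.9, tau=1.0)
    cov = np.linalg.inv(Q)    # implied covariance of the latent spatial effect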

    A Statistical Approach to the Alignment of fMRI Data

    Get PDF
    Multi-subject functional Magnetic Resonance Imaging (fMRI) studies are critical. The anatomical and functional structure varies across subjects, so image alignment is necessary. We define a probabilistic model to describe functional alignment. By imposing a prior distribution, such as the matrix von Mises-Fisher distribution, on the orthogonal transformation parameter, anatomical information is embedded in the estimation of the parameters, i.e., by penalizing combinations of spatially distant voxels. Real applications show an improvement in the classification and interpretability of the results compared to various functional alignment methods.
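
    A stripped-down sketch of the kind of alignment involved: each subject's data are mapped onto a reference by an orthogonal transformation, here estimated by plain orthogonal Procrustes without the matrix von Mises-Fisher prior discussed above; the array shapes and function name are illustrative assumptions.

    import numpy as np
    from scipy.linalg import orthogonal_procrustes

    def align_to_reference(subject_data, reference):
        """subject_data, reference: time x voxel arrays for one subject and a template."""
        R, _ = orthogonal_procrustes(subject_data, reference)   # orthogonal map minimising ||subject @ R - reference||_F
        return subject_data @ R                                 # subject data aligned to the template space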