8 research outputs found

    Variational Bayesian Inference: Message Passing Schemes and Streamlined Multilevel Data Analysis

    University of Technology Sydney. Faculty of Science. Mean-field variational Bayes (MFVB) is a deterministic technique for approximating intractable integrals arising in Bayesian inference. It is typically used for making approximate inference for parameters in complex statistical models. Most of its foundational literature and applications are in Machine Learning. However, in the age of “Big Data”, and by extension large sample sizes, MFVB has become an important tool in Statistics. The approximating schemes afforded by MFVB rely on heavy algebraic derivations across the model. The emergence of Big Data has resulted in more complex statistical models, making the process of formulating an MFVB algorithm cumbersome. Fortunately, the MFVB updating scheme can be simplified by representing the parameters of the statistical model in a probabilistic graph. The derivations are made more efficient by decomposing the required computations into calculations that are local to each node in the graph. We focus on constructing variational Bayesian inference algorithms based on a modularised format known as variational message passing (VMP), which is founded upon the notion of messages passed between fragments on a factor graph. Primitive functions, which represent the localised messages over factor graph fragments, are derived and can be called upon for direct implementation into arbitrarily large statistical models. The MFVB and VMP approaches result in superficially different algorithms, but converge to identical posterior density function approximations because they are founded upon the same optimisation problem. For complex statistical models, VMP has the advantage that the iterative updates are adjusted into a modularised format by taking advantage of the localised computations afforded by variational Bayesian methods. The resulting algorithm is a sequence of fragment-based functions that represent a compartmentalisation of the required algebra and computer coding.
Despite the computational convenience of VMP algorithms over their MFVB counterparts, the speed of both classes is limited for multilevel data models, such as Gaussian response linear mixed models. Statistical inference on such models requires standard matrix operations, such as inversion and matrix-vector multiplication, on sparse matrices, which are difficult to achieve efficiently. Furthermore, computational storage issues restrict the size of such models. Streamlined matrix algebraic results are necessary for implementing fast frequentist and variational Bayesian inference, which is not inhibited by storage-greedy sparse matrix operations, on multilevel data models. This thesis develops factor graph fragment functions that can be used to build complex statistical models and achieves streamlined matrix algebraic derivations for multilevel data analysis.
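
The iterative MFVB updating scheme mentioned in this abstract can be illustrated on a much smaller problem. The sketch below is not taken from the thesis: it assumes a univariate model x_i ~ N(mu, sigma^2) with priors mu ~ N(mu0, s0_sq) and sigma^2 ~ Inverse-Gamma(A, B), and the factorisation q(mu, sigma^2) = q(mu) q(sigma^2). The function name `mfvb_normal` and all hyperparameter defaults are hypothetical choices made for illustration.

```python
import random


def mfvb_normal(x, mu0=0.0, s0_sq=1e8, A=0.01, B=0.01, tol=1e-10, max_iter=1000):
    """Coordinate-ascent MFVB for x_i ~ N(mu, sigma^2) (illustrative model).

    Returns the parameters of q(mu) = N(m, v) and
    q(sigma^2) = Inverse-Gamma(A_q, B_q).
    """
    n = len(x)
    sum_x = sum(x)
    A_q = A + 0.5 * n   # shape of q(sigma^2); fixed across iterations
    B_q = B             # rate of q(sigma^2); refined in the loop
    m, v = 0.0, 1.0
    for _ in range(max_iter):
        e_prec = A_q / B_q                            # E_q[1 / sigma^2]
        v = 1.0 / (n * e_prec + 1.0 / s0_sq)          # variance of q(mu)
        m = v * (e_prec * sum_x + mu0 / s0_sq)        # mean of q(mu)
        # E_q[sum_i (x_i - mu)^2] = sum_i (x_i - m)^2 + n * v
        B_new = B + 0.5 * (sum((xi - m) ** 2 for xi in x) + n * v)
        converged = abs(B_new - B_q) < tol * abs(B_q)
        B_q = B_new
        if converged:
            break
    return m, v, A_q, B_q


# Simulated data, used only to exercise the updates.
rng = random.Random(1)
x_demo = [rng.gauss(3.0, 2.0) for _ in range(500)]
m, v, A_q, B_q = mfvb_normal(x_demo)
print("q(mu) mean:", m, "variance:", v)
print("E_q[sigma^2]:", B_q / (A_q - 1.0))
```

Each update only needs expectations from the other factor, which is the locality that a message passing formulation exploits; the fragment-based functions of the thesis itself are not reproduced here.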

    Measure Transport with Kernel Stein Discrepancy

    Measure transport underpins several recent algorithms for posterior approximation in the Bayesian context, wherein a transport map is sought to minimise the Kullback-Leibler divergence (KLD) from the posterior to the approximation. The KLD is a strong mode of convergence, requiring absolute continuity of measures and placing restrictions on which transport maps can be permitted. Here we propose to minimise a kernel Stein discrepancy (KSD) instead, requiring only that the set of transport maps is dense in an $L^2$ sense and demonstrating how this condition can be validated. The consistency of the associated posterior approximation is established and empirical results suggest that KSD is a competitive and more flexible alternative to KLD for measure transport.
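
To make the objective concrete, the sketch below estimates a squared kernel Stein discrepancy between a set of samples and a one-dimensional target known only through its score function. It is an illustration under stated assumptions, not the paper's implementation: the Gaussian (RBF) kernel with bandwidth `h`, the V-statistic estimator, the standard normal target, and the names `stein_kernel` and `ksd_squared` are all choices made here.

```python
import math
import random


def stein_kernel(x, y, score, h=1.0):
    """Langevin Stein kernel in 1-D built from an RBF base kernel (assumed)."""
    d = x - y
    k = math.exp(-d * d / (2.0 * h * h))
    dk_dx = -d / (h * h) * k                       # derivative of k in x
    dk_dy = d / (h * h) * k                        # derivative of k in y
    d2k = (1.0 / (h * h) - d * d / h ** 4) * k     # mixed second derivative
    return (score(x) * score(y) * k
            + score(x) * dk_dy
            + score(y) * dk_dx
            + d2k)


def ksd_squared(samples, score, h=1.0):
    """V-statistic estimate of the squared KSD for the given samples."""
    n = len(samples)
    total = sum(stein_kernel(xi, xj, score, h)
                for xi in samples for xj in samples)
    return total / (n * n)


def std_normal_score(x):
    """Score (gradient of log density) of the N(0, 1) target."""
    return -x


rng = random.Random(0)
good = [rng.gauss(0.0, 1.0) for _ in range(200)]   # drawn from the target
bad = [x + 2.0 for x in good]                      # shifted away from it
ksd_good = ksd_squared(good, std_normal_score)
ksd_bad = ksd_squared(bad, std_normal_score)
print("KSD^2, matching samples:", ksd_good)
print("KSD^2, shifted samples: ", ksd_bad)
```

Only the score of the target enters the computation, so no normalising constant is needed. In a transport setting the samples would be the images of reference draws under a candidate map, and the map's parameters would be tuned to reduce this quantity.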

    Streamlined solutions to multilevel sparse matrix problems

    We define and solve classes of sparse matrix problems that arise in multilevel modelling and data analysis. The classes are indexed by the number of nested units, with two-level problems corresponding to the common situation, in which data on level-1 units are grouped within a two-level structure. We provide full solutions for two-level and three-level problems, and their derivations provide blueprints for the challenging, albeit rarer in applications, higher-level versions of the problem. While our linear system solutions are a concise recasting of existing results, our matrix inverse sub-block results are novel and facilitate streamlined computation of standard errors in frequentist inference as well as allowing streamlined mean field variational Bayesian inference for models containing higher-level random effects. doi: 10.1017/S144618112000006
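
The structure of a two-level problem can be sketched with ordinary block elimination. The example below is an assumption-laden illustration rather than the paper's algorithm: it takes a symmetric system with one global block `A11`, per-group diagonal blocks `A22_i` and coupling blocks `A12_i`, and solves it through the Schur complement without forming the full matrix. The function name `solve_two_level` and the simulated block sizes are hypothetical.

```python
import numpy as np


def solve_two_level(A11, A12_list, A22_list, a1, a2_list):
    """Solve a two-level block-arrowhead system via the Schur complement.

    The implied matrix has A11 in the top-left corner, A12_i along the first
    block row (A12_i.T along the first block column), and A22_i on the
    remaining block diagonal. Only the small blocks are ever stored.
    """
    S = np.array(A11, dtype=float)   # becomes the Schur complement
    r = np.array(a1, dtype=float)    # becomes the reduced right-hand side
    for A12, A22, a2 in zip(A12_list, A22_list, a2_list):
        # One small solve per group gives A22^{-1} [A12.T, a2].
        W = np.linalg.solve(A22, np.column_stack([A12.T, a2]))
        S -= A12 @ W[:, :-1]
        r -= A12 @ W[:, -1]
    x1 = np.linalg.solve(S, r)
    x2_list = [np.linalg.solve(A22, a2 - A12.T @ x1)
               for A12, A22, a2 in zip(A12_list, A22_list, a2_list)]
    return x1, x2_list


# Simulated blocks: p global unknowns, m groups with q unknowns each.
p, q, m_groups = 2, 3, 4
gen = np.random.default_rng(0)
A11 = 10.0 * np.eye(p)
A12_list = [0.3 * gen.standard_normal((p, q)) for _ in range(m_groups)]
A22_list = []
for _ in range(m_groups):
    M = gen.standard_normal((q, q))
    A22_list.append(M @ M.T + q * np.eye(q))   # symmetric positive definite
a1 = gen.standard_normal(p)
a2_list = [gen.standard_normal(q) for _ in range(m_groups)]

x1, x2_list = solve_two_level(A11, A12_list, A22_list, a1, a2_list)
print("global block solution:", x1)
```

The work and storage grow linearly in the number of groups, which is the practical point of avoiding the full sparse matrix. The matrix inverse sub-block results described in the abstract are not reproduced here.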

    A patient-centric modeling framework captures recovery from SARS-CoV-2 infection.

    The biology driving individual patient responses to severe acute respiratory syndrome coronavirus 2 infection remains ill understood. Here, we developed a patient-centric framework leveraging detailed longitudinal phenotyping data and covering a year after disease onset, from 215 infected individuals with differing disease severities. Our analyses revealed distinct 'systemic recovery' profiles, with specific progression and resolution of the inflammatory, immune cell, metabolic and clinical responses. In particular, we found a strong inter-patient and intra-patient temporal covariation of innate immune cell numbers, kynurenine metabolites and lipid metabolites, which highlighted candidate immunologic and metabolic pathways influencing the restoration of homeostasis, the risk of death and that of long COVID. Based on these data, we identified a composite signature predictive of systemic recovery, using a joint model on cellular and molecular parameters measured soon after disease onset. New predictions can be generated using the online tool http://shiny.mrc-bsu.cam.ac.uk/apps/covid-19-systemic-recovery-prediction-app, designed to test our findings prospectively.