15 research outputs found

General adjoint-differentiated Laplace approximation

Author: Margossian Charles C.
Publication date: 26/06/2023

The hierarchical prior used in latent Gaussian models (LGMs) induces a posterior geometry prone to frustrate inference algorithms. Marginalizing out the latent Gaussian variable using an integrated Laplace approximation removes the offending geometry, allowing us to do efficient inference on the hyperparameters. To use gradient-based inference we need to compute the approximate marginal likelihood and its gradient. The adjoint-differentiated Laplace approximation differentiates the marginal likelihood and scales well with the dimension of the hyperparameters. While this method can be applied to LGMs with any prior covariance, it only works for likelihoods with a diagonal Hessian. Furthermore, the algorithm requires methods which compute the first three derivatives of the likelihood, with current implementations relying on analytical derivatives. I propose a generalization which is applicable to a broader class of likelihoods and does not require analytical derivatives of the likelihood. Numerical experiments suggest the added flexibility comes at no computational cost: on a standard LGM, the new method is in fact slightly faster than the existing adjoint-differentiated Laplace approximation. I also apply the general method to an LGM with an unconventional likelihood. This example highlights the algorithm's potential, as well as persistent challenges.
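As a rough illustration of the approximate marginal likelihood mentioned in this abstract, here is a minimal sketch of a Laplace approximation for a latent Gaussian model with a diagonal likelihood Hessian. It is not the author's implementation and does not cover the adjoint differentiation step. The function names, the Gaussian test likelihood, the noise level `SIGMA`, and the covariance `K` are all assumptions chosen so the result can be checked against a closed form.

```python
import numpy as np

def laplace_log_marginal(y, K, log_lik, grad_log_lik, hess_diag_log_lik, n_newton=25):
    """Laplace approximation to log p(y | phi) when theta ~ N(0, K).

    Assumes the likelihood Hessian is diagonal, so only its diagonal is requested.
    """
    n = len(y)
    K_inv = np.linalg.inv(K)  # acceptable for a small demo only
    theta = np.zeros(n)

    # Newton iterations to find the mode of log p(y | theta) + log p(theta | phi).
    for _ in range(n_newton):
        grad = grad_log_lik(theta, y) - K_inv @ theta
        hess = np.diag(hess_diag_log_lik(theta, y)) - K_inv
        theta = theta - np.linalg.solve(hess, grad)

    hess = np.diag(hess_diag_log_lik(theta, y)) - K_inv
    _, logdet_K = np.linalg.slogdet(K)
    _, logdet_neg_hess = np.linalg.slogdet(-hess)
    log_prior = (-0.5 * theta @ K_inv @ theta - 0.5 * logdet_K
                 - 0.5 * n * np.log(2 * np.pi))
    # Gaussian integral around the mode: (2 pi)^(n/2) * det(-H)^(-1/2).
    return (log_lik(theta, y) + log_prior
            + 0.5 * n * np.log(2 * np.pi) - 0.5 * logdet_neg_hess)

SIGMA = 0.5  # hypothetical observation noise for the demo

def gauss_log_lik(theta, y):
    return np.sum(-0.5 * ((y - theta) / SIGMA) ** 2
                  - 0.5 * np.log(2 * np.pi * SIGMA ** 2))

def gauss_grad(theta, y):
    return (y - theta) / SIGMA ** 2

def gauss_hess_diag(theta, y):
    return np.full(len(y), -1.0 / SIGMA ** 2)

# Hypothetical data and prior covariance.
y = np.array([0.3, -1.2, 0.8])
K = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.5],
              [0.2, 0.5, 1.0]])

approx = laplace_log_marginal(y, K, gauss_log_lik, gauss_grad, gauss_hess_diag)

# With a Gaussian likelihood the marginal is y ~ N(0, K + SIGMA^2 I),
# so the Laplace approximation should match this value.
S = K + SIGMA ** 2 * np.eye(len(y))
exact = (-0.5 * y @ np.linalg.solve(S, y)
         - 0.5 * np.linalg.slogdet(S)[1]
         - 0.5 * len(y) * np.log(2 * np.pi))
```

The Gaussian likelihood is used only because it makes the approximation exact and therefore checkable; for a non-Gaussian likelihood the same function returns an approximation, and plain Newton steps may need safeguarding.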

arXiv.org e-Print Archive

Amortized Variational Inference: When and Why?

Author: Blei David M.
Margossian Charles C.
Publication date: 20/07/2023

Amortized variational inference (A-VI) is a method for approximating the intractable posterior distributions that arise in probabilistic models. The defining feature of A-VI is that it learns a global inference function that maps each observation to its local latent variable's approximate posterior. This stands in contrast to the more classical factorized (or mean-field) variational inference (F-VI), which directly learns the parameters of the approximating distribution for each latent variable. In deep generative models, A-VI is used as a computational trick to speed up inference for local latent variables. In this paper, we study A-VI as a general alternative to F-VI for approximate posterior inference. A-VI cannot produce an approximation with a lower Kullback-Leibler divergence than F-VI's optimal solution, because the amortized family is a subset of the factorized family. Thus a central theoretical problem is to characterize when A-VI still attains F-VI's optimal solution. We derive conditions on both the model and the inference function under which A-VI can theoretically achieve F-VI's optimum. We show that for a broad class of hierarchical models, including deep generative models, it is possible to close the gap between A-VI and F-VI. Further, for an even broader class of models, we establish when and how to expand the domain of the inference function to make amortization a feasible strategy. Finally, we prove that for certain models -- including hidden Markov models and Gaussian processes -- A-VI cannot match F-VI's solution, no matter how expressive the inference function is. We also study A-VI empirically. On several examples, we corroborate our theoretical results and investigate the performance of A-VI when varying the complexity of the inference function. When the gap between A-VI and F-VI can be closed, we find that the required complexity of the function need not scale with the number of observations, and that A-VI often converges faster than F-VI.
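The subset argument in this abstract can be written out as follows. The notation is an assumption introduced here for illustration and is not taken from the paper: $x_n$ are observations, $z_n$ local latent variables, $\nu_n$ free variational parameters, and $f_\phi$ the inference function.

```latex
\begin{align*}
\mathcal{Q}_{\mathrm{F}} &= \Big\{ q(z) = \prod_{n=1}^{N} q(z_n ; \nu_n) \;:\; \nu_1, \dots, \nu_N \text{ free} \Big\}, \\
\mathcal{Q}_{\mathrm{A}} &= \Big\{ q(z) = \prod_{n=1}^{N} q\big(z_n ; f_\phi(x_n)\big) \;:\; \phi \text{ free} \Big\}.
\end{align*}
Every member of $\mathcal{Q}_{\mathrm{A}}$ is obtained by setting $\nu_n = f_\phi(x_n)$,
so $\mathcal{Q}_{\mathrm{A}} \subseteq \mathcal{Q}_{\mathrm{F}}$ and therefore
\[
\min_{q \in \mathcal{Q}_{\mathrm{A}}} \mathrm{KL}\big(q(z) \,\|\, p(z \mid x)\big)
\;\ge\;
\min_{q \in \mathcal{Q}_{\mathrm{F}}} \mathrm{KL}\big(q(z) \,\|\, p(z \mid x)\big).
\]
```

Equality holds exactly when some $\phi$ satisfies $f_\phi(x_n) = \nu_n^\ast$ for every $n$, where $\nu_n^\ast$ denotes F-VI's optimal parameters; this is the condition the abstract describes as closing the gap.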

arXiv.org e-Print Archive

Bayesian workflow for disease transmission modeling in Stan

Author: Grinsztajn Léo
Margossian Charles C.
Riou Julien
Semenova Elizaveta
Publication date: 08/09/2021

This tutorial shows how to build, fit, and criticize disease transmission models in Stan, and should be useful to researchers interested in modeling the SARS-CoV-2 pandemic and other infectious diseases in a Bayesian framework. Bayesian modeling provides a principled way to quantify uncertainty and incorporate both data and prior knowledge into the model estimates. Stan is an expressive probabilistic programming language that abstracts the inference and allows users to focus on the modeling. As a result, Stan code is readable and easily extensible, which makes the modeler's work more transparent. Furthermore, Stan's main inference engine, Hamiltonian Monte Carlo sampling, is amenable to diagnostics, which means the user can verify whether the obtained inference is reliable. In this tutorial, we demonstrate how to formulate, fit, and diagnose a compartmental transmission model in Stan, first with a simple Susceptible-Infected-Recovered (SIR) model, then with a more elaborate transmission model used during the SARS-CoV-2 pandemic. We also cover advanced topics which can further help practitioners fit sophisticated models; notably, how to use simulations to probe the model and priors, and computational techniques to scale up models based on ordinary differential equations.
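For readers unfamiliar with the SIR model named in this abstract, the following is a minimal forward-simulation sketch in Python. It is not the tutorial's Stan code and performs no inference. The standard SIR equations are assumed, and the parameter values (`beta`, `gamma`, population of 1000) and the fixed-step Runge-Kutta integrator are illustrative choices.

```python
import numpy as np

def sir_rhs(state, beta, gamma, n_pop):
    """Right-hand side of the standard SIR ordinary differential equations."""
    s, i, r = state
    infection = beta * s * i / n_pop  # new infections per unit time
    recovery = gamma * i              # recoveries per unit time
    return np.array([-infection, infection - recovery, recovery])

def simulate_sir(s0, i0, r0, beta, gamma, n_days, steps_per_day=10):
    """Integrate the SIR model with fixed-step RK4; returns daily (S, I, R)."""
    n_pop = s0 + i0 + r0
    dt = 1.0 / steps_per_day
    state = np.array([s0, i0, r0], dtype=float)
    daily = [state.copy()]
    for _ in range(n_days):
        for _ in range(steps_per_day):
            k1 = sir_rhs(state, beta, gamma, n_pop)
            k2 = sir_rhs(state + 0.5 * dt * k1, beta, gamma, n_pop)
            k3 = sir_rhs(state + 0.5 * dt * k2, beta, gamma, n_pop)
            k4 = sir_rhs(state + dt * k3, beta, gamma, n_pop)
            state = state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        daily.append(state.copy())
    return np.array(daily)

# Hypothetical values: basic reproduction number beta / gamma = 2.5.
trajectory = simulate_sir(s0=999, i0=1, r0=0, beta=0.5, gamma=0.2, n_days=60)
peak_day = int(np.argmax(trajectory[:, 1]))  # day with the most infected
```

In a Bayesian workflow such a simulator would sit inside the likelihood, with `beta` and `gamma` given priors and estimated from case counts; the sketch only shows the deterministic dynamics.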

arXiv.org e-Print Archive

Oxford University Research Archive

Bern Open Repository and Information System (BORIS)

Nested $\widehat R$: Assessing the convergence of Markov chain Monte Carlo when running many short chains

Author: Gelman Andrew
Hoffman Matthew D.
Margossian Charles C.
Riou-Durand Lionel
Sountsov Pavel
Vehtari Aki
Publication date: 15/09/2022

The growing availability of hardware accelerators such as GPUs has generated interest in Markov chain Monte Carlo (MCMC) workflows which run a large number of chains in parallel. Each chain still needs to forget its initial state but the subsequent sampling phase can be almost arbitrarily short. To determine if the resulting short chains are reliable, we need to assess how close the Markov chains are to convergence to their stationary distribution. The $\widehat R$ statistic is a battle-tested convergence diagnostic but unfortunately can require long chains to work well. We present a nested design to overcome this challenge, and introduce tuning parameters to control the reliability, bias, and variance of convergence diagnostics.
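For context, here is a minimal sketch of the classical Gelman-Rubin $\widehat R$ that this abstract refers to as the battle-tested diagnostic. It is the plain (non-split, non-nested) version, not the nested design the paper proposes, and the simulated chains are hypothetical stand-ins for MCMC output.

```python
import numpy as np

def rhat(draws):
    """Classical Gelman-Rubin R-hat for one scalar quantity.

    draws: array of shape (n_chains, n_draws), warmup already discarded.
    """
    draws = np.asarray(draws, dtype=float)
    n_chains, n_draws = draws.shape
    chain_means = draws.mean(axis=1)
    within = draws.var(axis=1, ddof=1).mean()     # W: mean within-chain variance
    between = n_draws * chain_means.var(ddof=1)   # B: between-chain variance
    var_plus = (n_draws - 1) / n_draws * within + between / n_draws
    return np.sqrt(var_plus / within)

rng = np.random.default_rng(0)

# Four chains that all target the same distribution.
mixed = rng.normal(size=(4, 1000))
rhat_mixed = rhat(mixed)

# Four chains stuck around two different locations.
stuck = rng.normal(size=(4, 1000)) + np.array([[0.0], [3.0], [0.0], [3.0]])
rhat_stuck = rhat(stuck)
```

Values near 1 indicate that the chains agree, while values well above 1 indicate that they have not yet reached a common distribution. The within-chain variance in the denominator is estimated from each chain's own draws, which is one reason very short chains are a problem for this version.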

arXiv.org e-Print Archive

Bayesian workflow for disease transmission modeling in Stan [tutorial].

Author: Grinsztajn Léo
Margossian Charles C.
Riou Julien
Semenova Elizaveta
Publication venue: 'Wiley'
Publication date: 01/01/2021

This tutorial shows how to build, fit, and criticize disease transmission models in Stan, and should be useful to researchers interested in modeling the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic and other infectious diseases in a Bayesian framework. Bayesian modeling provides a principled way to quantify uncertainty and incorporate both data and prior knowledge into the model estimates. Stan is an expressive probabilistic programming language that abstracts the inference and allows users to focus on the modeling. As a result, Stan code is readable and easily extensible, which makes the modeler's work more transparent. Furthermore, Stan's main inference engine, Hamiltonian Monte Carlo sampling, is amenable to diagnostics, which means the user can verify whether the obtained inference is reliable. In this tutorial, we demonstrate how to formulate, fit, and diagnose a compartmental transmission model in Stan, first with a simple susceptible-infected-recovered model, then with a more elaborate transmission model used during the SARS-CoV-2 pandemic. We also cover advanced topics which can further help practitioners fit sophisticated models; notably, how to use simulations to probe the model and priors, and computational techniques to scale up models based on ordinary differential equations.

Oxford University Research Archive

Bern Open Repository and Information System (BORIS)

Estimation of SARS-CoV-2 mortality during the early stages of an epidemic: A modeling study in Hubei, China, and six regions in Europe.

Author: Althaus Christian L.
Counotte Michel J.
Hauser Anthony
Konstantinoudis Garyfallos
Low Nicola
Margossian Charles C.
Riou Julien
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/07/2020

BACKGROUND: As of 16 May 2020, more than 4.5 million cases and more than 300,000 deaths from disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) have been reported. Reliable estimates of mortality from SARS-CoV-2 infection are essential for understanding clinical prognosis, planning healthcare capacity, and epidemic forecasting. The case-fatality ratio (CFR), calculated from total numbers of reported cases and reported deaths, is the most commonly reported metric, but it can be a misleading measure of overall mortality. The objectives of this study were to (1) simulate the transmission dynamics of SARS-CoV-2 using publicly available surveillance data and (2) infer estimates of SARS-CoV-2 mortality adjusted for biases and examine the CFR, the symptomatic case-fatality ratio (sCFR), and the infection-fatality ratio (IFR) in different geographic locations. METHOD AND FINDINGS: We developed an age-stratified susceptible-exposed-infected-removed (SEIR) compartmental model describing the dynamics of transmission and mortality during the SARS-CoV-2 epidemic. Our model accounts for two biases: preferential ascertainment of severe cases and right-censoring of mortality. We fitted the transmission model to surveillance data from Hubei Province, China, and applied the same model to six regions in Europe: Austria, Bavaria (Germany), Baden-Württemberg (Germany), Lombardy (Italy), Spain, and Switzerland. In Hubei, the baseline estimates were as follows: CFR 2.4% (95% credible interval [CrI] 2.1%-2.8%), sCFR 3.7% (3.2%-4.2%), and IFR 2.9% (2.4%-3.5%). Estimated measures of mortality changed over time. Across the six locations in Europe, estimates of CFR varied widely. Estimates of sCFR and IFR, adjusted for bias, were more similar to each other but still showed some degree of heterogeneity. Estimates of IFR ranged from 0.5% (95% CrI 0.4%-0.6%) in Switzerland to 1.4% (1.1%-1.6%) in Lombardy, Italy. In all locations, mortality increased with age. Among individuals 80 years or older, estimates of the IFR suggest that the proportion of all those infected with SARS-CoV-2 who will die ranges from 20% (95% CrI 16%-26%) in Switzerland to 34% (95% CrI 28%-40%) in Spain. A limitation of the model is that count data by date of onset are required, and these are not available in all countries. CONCLUSIONS: We propose a comprehensive solution to the estimation of SARS-CoV-2 mortality from surveillance data during outbreaks. The CFR is not a good predictor of overall mortality from SARS-CoV-2 and should not be used for evaluation of policy or comparison across settings. Geographic differences in IFR suggest that a single IFR should not be applied to all settings to estimate the total size of the SARS-CoV-2 epidemic in different countries. The sCFR and IFR, adjusted for right-censoring and preferential ascertainment of severe cases, are measures that can be used to improve and monitor clinical and public health strategies to reduce the deaths from SARS-CoV-2 infection.
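The crude CFR described in this abstract is a simple ratio of reported deaths to reported cases. The sketch below applies it to the global figures quoted in the background. Those figures are stated as lower bounds ("more than"), so the result is only indicative, and the function name is a hypothetical helper. It does not reproduce the study's bias-adjusted sCFR or IFR estimates.

```python
def crude_cfr(reported_deaths: int, reported_cases: int) -> float:
    """Crude case-fatality ratio: reported deaths divided by reported cases."""
    if reported_cases <= 0:
        raise ValueError("reported_cases must be positive")
    return reported_deaths / reported_cases

# Rounded global counts quoted in the abstract (as of 16 May 2020).
global_cfr = crude_cfr(reported_deaths=300_000, reported_cases=4_500_000)
print(f"crude CFR: {global_cfr:.1%}")  # prints "crude CFR: 6.7%"
```

This ratio ignores both biases the study models: deaths among currently reported cases that have not yet occurred (right-censoring) and infections that were never reported because they were mild (preferential ascertainment of severe cases).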

Directory of Open Access Journals

Bern Open Repository and Information System (BORIS)