Robust Bayesian inference via optimal transport misfit measures: applications and algorithms

Abstract

Model misspecification constitutes a major obstacle to reliable inference in many problems. In the Bayesian setting, model misspecification can lead to inconsistency as well as overconfidence in the posterior distribution associated with any quantity of interest, i.e., under-reporting of uncertainty. This thesis develops a Bayesian framework to reduce the impact of a type of model misspecification arising in inference problems involving time series data: unmodeled time warping between the observed and modeled data. Inference problems involving dynamical systems, signal processing, and more generally functional data can be affected by this type of misspecification. Inverse problems in seismology are an important example of this class: inaccuracies in characterizing the complex, spatially heterogeneous propagation velocities of seismic waves can lead to error in their modeled time evolution. Data are insufficient to constrain these propagation velocities, and therefore we instead seek robustness to model error. Instrumental to our approach is the use of transport–Lagrangian (TL) distances as loss/misfit functions: such distances can be understood as “graph-space” optimal transport distances, and they naturally disregard certain features of the data that are more sensitive to time warping. We show that, compared to standard misfit functions, they produce posterior distributions that are both less biased and less dispersed. In particular, we use moment tensor inversion, a seismic inverse problem, as our primary motivating application and demonstrate improved inversion performance of the TL loss—by a variety of statistical and physical metrics—for a range of increasingly complex inversion and misspecification scenarios. At the same time, we address several broader methodological issues. First, in the absence of a tractable expression for a TL-based likelihood, we construct a consistent prior-to-posterior update using the notion of a Gibbs posterior. 
We then compare the impact of different loss functions on the Gibbs posterior through a broader exploration of what constitutes “good” inference in the misspecified setting, via several statistical scoring rules and rank statistics, as well as application-specific physical criteria. In an effort to link our generalized (Gibbs) Bayesian approach to a more traditional Bayesian setting, we also conduct an analytical and numerical investigation of statistical properties of the transport–Lagrangian distance between random noisy signals. As a complement to Bayesian inversion, we also demonstrate the utility of optimal transport distances for frequentist regression. We study the linear regression model with TL loss, describe the geometry of the associated mixed-integer optimization problem, and propose dedicated algorithms that exploit its underlying structure. We then compare TL linear regression with classical linear regression in several applications. Finally, we discuss potential generalizations of TL distances to include the notion of “shape” through time series embeddings, as well as possible extensions of the proposed framework to other forms of model misspecification.

Ph.D.
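
For orientation, the two central objects named in the abstract can be written out as follows. This is a sketch under standard definitions from the generalized-Bayes and optimal transport literature, and not necessarily the exact formulation used in the thesis. The symbols $\theta$ (parameters), $y$ (observed signal), $G$ (forward model), $\pi_0$ (prior), $w$ (loss scaling), $\lambda$ (time-versus-amplitude weight), $t_i$ (sample times), and $\sigma$ (a permutation of the $N$ sample indices) are notation assumed here for illustration.

```latex
% Gibbs posterior: the likelihood is replaced by an exponentiated loss (misfit).
% Assumption: the loss is taken to be the squared TL distance between
% the modeled signal G(\theta) and the data y.
\pi_w(\theta \mid y) \;\propto\; \exp\!\bigl(-w\,\ell(\theta; y)\bigr)\,\pi_0(\theta),
\qquad
\ell(\theta; y) = \mathrm{TL}_\lambda\bigl(G(\theta),\, y\bigr)^2 .

% Discrete TL distance between two signals f and g sampled at the same
% N times t_1, ..., t_N with uniform weights: an optimal transport
% (assignment) problem between the graphs {(t_i, f(t_i))} and {(t_i, g(t_i))}.
\mathrm{TL}_\lambda(f, g)^2
  \;=\; \min_{\sigma \in S_N} \frac{1}{N} \sum_{i=1}^{N}
  \Bigl( \lambda\,\lvert t_i - t_{\sigma(i)} \rvert^2
       + \lvert f(t_i) - g(t_{\sigma(i)}) \rvert^2 \Bigr).
```

In this form, a large $\lambda$ makes moving mass along the time axis expensive, so the identity matching is optimal and the distance reduces to the usual pointwise squared misfit. A small $\lambda$ lets samples be matched across nearby times at little cost, which is the source of the tolerance to time warping.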
