Spike-and-Slab Priors for Function Selection in Structured Additive Regression Models
Structured additive regression provides a general framework for complex
Gaussian and non-Gaussian regression models, with predictors comprising
arbitrary combinations of nonlinear functions and surfaces, spatial effects,
varying coefficients, random effects and further regression terms. The large
flexibility of structured additive regression makes function selection a
challenging and important task, aiming at (1) selecting the relevant
covariates, (2) choosing an appropriate and parsimonious representation of the
impact of covariates on the predictor and (3) determining the required
interactions. We propose a spike-and-slab prior structure for function
selection that allows single coefficients, as well as blocks of coefficients
representing specific model terms, to be included or excluded. A novel
multiplicative parameter expansion is required to obtain good mixing and
convergence properties in a Markov chain Monte Carlo simulation approach and is
shown to induce desirable shrinkage properties. In simulation studies and with
(real) benchmark classification data, we investigate sensitivity to
hyperparameter settings and compare performance to competitors. The flexibility
and applicability of our approach are demonstrated in an additive piecewise
exponential model with time-varying effects for right-censored survival times
of intensive care patients with sepsis. Geoadditive and additive mixed logit
model applications are discussed in an extensive appendix.
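As a rough illustration of the block-wise inclusion idea described in this abstract, the sketch below draws coefficients from a generic two-component spike-and-slab prior: each block of coefficients gets one inclusion indicator, and the whole block is drawn either from a wide "slab" or a narrow "spike" normal distribution. This is a textbook-style simplification, not the prior structure or the multiplicative parameter expansion proposed in the paper; the function name and all parameters (`inclusion_prob`, `slab_sd`, `spike_sd`) are assumptions made for illustration.

```python
import random


def draw_block_coefficients(block_sizes, inclusion_prob=0.5,
                            slab_sd=1.0, spike_sd=0.01, rng=None):
    """Draw coefficients from a generic block-wise spike-and-slab prior.

    Hypothetical helper: one Bernoulli inclusion indicator per block decides
    whether all coefficients of that block come from the slab (wide normal)
    or the spike (narrow normal concentrated near zero).
    """
    rng = rng or random.Random()
    indicators, blocks = [], []
    for size in block_sizes:
        # A single indicator per block: the model term enters or leaves as a whole.
        included = rng.random() < inclusion_prob
        sd = slab_sd if included else spike_sd
        indicators.append(included)
        blocks.append([rng.gauss(0.0, sd) for _ in range(size)])
    return indicators, blocks


if __name__ == "__main__":
    # Three model terms with 1, 5 and 3 coefficients (e.g. basis coefficients).
    indicators, blocks = draw_block_coefficients([1, 5, 3], rng=random.Random(42))
    for included, coefs in zip(indicators, blocks):
        label = "slab " if included else "spike"
        print(label, [round(c, 3) for c in coefs])
```

A block of size one corresponds to selecting a single coefficient, so the same mechanism covers both cases mentioned in the abstract.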
Composing Deep Learning and Bayesian Nonparametric Methods
Recent progress in Bayesian methods has largely focused on non-conjugate models that make extensive use of black-box functions: continuous functions implemented with neural networks. Using deep neural networks, Bayesian models can reasonably fit big data while at the same time capturing model uncertainty. This thesis targets a more challenging problem: how do we model general random objects, including discrete ones, using random functions? Our conclusion is that many (discrete) random objects are by nature a composition of Poisson processes and random functions. Thus, all discreteness is handled through the Poisson process, while random functions capture the remaining complexity of the object. Hence the title: composing deep learning and Bayesian nonparametric methods.
This conclusion is not a conjecture. In special cases such as latent feature models, we can prove this claim by working on infinite-dimensional spaces, and that is where Bayesian nonparametrics comes in. Moreover, we will impose some regularity assumptions on random objects, such as exchangeability. The representations then follow directly from representation theorems. We will see this twice in this thesis.
One may ask: when a random object is too simple, such as a non-negative random vector in the case of latent feature models, how can we exploit exchangeability? The answer is to aggregate infinitely many random objects, map them all together onto an infinite-dimensional space, and then assume exchangeability on that space. We demonstrate two examples of latent feature models by (1) concatenating them as an infinite sequence (Sections 2 and 3) and (2) stacking them as a 2D array (Section 4).
In addition, we will see that Bayesian nonparametric methods are useful for modeling discrete patterns in time series data. We will showcase two examples: (1) using variance Gamma processes to model change points (Section 5), and (2) using Chinese restaurant processes to model speech with switching speakers (Section 6).
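For readers unfamiliar with the Chinese restaurant process mentioned above, the following minimal sketch samples a random partition from it: customer n+1 joins an existing table with probability proportional to the number of customers already seated there, or opens a new table with probability proportional to a concentration parameter. It only illustrates the generic process and is not the speaker-switching model of the thesis; the function name and the `concentration` parameter name are assumptions.

```python
import random


def sample_crp(n_customers, concentration=1.0, rng=None):
    """Sample table assignments from a Chinese restaurant process.

    Hypothetical helper. With n customers already seated, the next customer
    joins table k with probability count_k / (n + concentration) and opens a
    new table with probability concentration / (n + concentration).
    """
    rng = rng or random.Random()
    assignments = []   # table index chosen by each customer, in arrival order
    table_counts = []  # number of customers currently at each table
    for n in range(n_customers):
        u = rng.random() * (n + concentration)
        chosen = len(table_counts)  # default: open a new table
        cumulative = 0.0
        for k, count in enumerate(table_counts):
            cumulative += count
            if u < cumulative:
                chosen = k
                break
        if chosen == len(table_counts):
            table_counts.append(0)
        table_counts[chosen] += 1
        assignments.append(chosen)
    return assignments, table_counts


if __name__ == "__main__":
    assignments, counts = sample_crp(20, concentration=1.5, rng=random.Random(7))
    print("assignments:", assignments)
    print("table sizes:", counts)
```

In a speaker-modeling reading of this sketch, tables would play the role of speakers and customers the role of speech segments; the number of tables is not fixed in advance, which is the property that makes the process attractive for such data.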
We are also aware that inference can be non-trivial in popular Bayesian nonparametric models. In Section 7, we present a novel solution for online inference in the popular HDP-HMM model.
Accelerating Asymptotically Exact MCMC for Computationally Intensive Models via Local Approximations
We construct a new framework for accelerating Markov chain Monte Carlo in
posterior sampling problems where standard methods are limited by the
computational cost of the likelihood, or of numerical models embedded therein.
Our approach introduces local approximations of these models into the
Metropolis-Hastings kernel, borrowing ideas from deterministic approximation
theory, optimization, and experimental design. Previous efforts at integrating
approximate models into inference typically sacrifice either the sampler's
exactness or efficiency; our work seeks to address these limitations by
exploiting useful convergence characteristics of local approximations. We prove
the ergodicity of our approximate Markov chain, showing that it samples
asymptotically from the \emph{exact} posterior distribution of interest. We
describe variations of the algorithm that employ either local polynomial
approximations or local Gaussian process regressors. Our theoretical results
reinforce the key observation underlying this paper: when the likelihood has
some \emph{local} regularity, the number of model evaluations per MCMC step can
be greatly reduced without biasing the Monte Carlo average. Numerical
experiments demonstrate multiple order-of-magnitude reductions in the number of
forward model evaluations used in representative ODE and PDE inference
problems, with both synthetic and real data. Comment: A major update of the theory and example
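To make the cost being addressed concrete, here is a minimal sketch of a plain random-walk Metropolis-Hastings sampler that counts how often the (possibly expensive) log-posterior is evaluated. It is the unmodified baseline kernel, with one model evaluation per step, and does not implement the local polynomial or Gaussian process approximations described in the abstract. The function name, the `step_sd` parameter and the standard normal toy target are assumptions chosen for illustration.

```python
import math
import random


def random_walk_metropolis(log_posterior, x0, n_steps, step_sd=0.5, rng=None):
    """Baseline random-walk Metropolis-Hastings for a scalar parameter.

    Hypothetical helper. Returns the chain and the number of log-posterior
    evaluations, which is the quantity local approximations aim to reduce.
    """
    rng = rng or random.Random()
    x, log_p = x0, log_posterior(x0)
    n_evals = 1
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_sd)
        log_p_new = log_posterior(proposal)  # one forward model call per step
        n_evals += 1
        # Symmetric proposal: accept with probability min(1, p_new / p_old).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples, n_evals


if __name__ == "__main__":
    # Toy target: standard normal log-density up to an additive constant.
    chain, n_evals = random_walk_metropolis(
        lambda x: -0.5 * x * x, x0=0.0, n_steps=5000, step_sd=1.0,
        rng=random.Random(0))
    print("posterior mean estimate:", sum(chain) / len(chain))
    print("log-posterior evaluations:", n_evals)
```

In this baseline the evaluation count grows linearly with the chain length; the abstract's claim is that, under local regularity of the likelihood, most of these evaluations can be replaced by a locally fitted surrogate without biasing the Monte Carlo average.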
Multiscale Methods for Random Composite Materials
Simulation of material behaviour is not only a vital tool in accelerating product development and increasing design efficiency but also in advancing our fundamental understanding of materials. While homogeneous, isotropic materials are often simple to simulate, advanced, anisotropic materials pose a more sizeable challenge. In simulating entire composite components, such as a 25 m aircraft wing made by stacking several 0.25 mm thick plies, finite element models typically exceed millions or even a billion unknowns. This problem is exacerbated by the inclusion of sub-millimeter manufacturing defects for two reasons. Firstly, a finer resolution is required, which makes the problem larger. Secondly, defects introduce randomness. Traditionally, this randomness or uncertainty has been quantified heuristically, since commercial codes are largely unsuccessful in solving problems of this size. This thesis develops a rigorous uncertainty quantification (UQ) framework enabled by a state-of-the-art finite element package \texttt{dune-composites}, also developed here, designed for but not limited to composite applications. A key feature of this open-source package is a robust, parallel and scalable preconditioner, \texttt{GenEO}, that guarantees constant iteration counts independent of problem size. It boasts near-perfect scaling properties in both a strong and a weak sense on over cores. It is numerically verified by solving industrially motivated problems containing upwards of 200 million unknowns. With the capability of solving expensive models in place, a novel stochastic framework is developed to quantify variability in part performance arising from localized out-of-plane defects. Theoretical part strength is determined for independent samples drawn from a distribution inferred from B-scans of wrinkles.
Supported by the literature, the results indicate a strong dependence between maximum misalignment angle and strength knockdown, based on which an engineering model is presented to allow rapid estimation of residual strength, bypassing expensive simulations. The engineering model itself is built from a large set of simulations of residual strength, each of which is computed using the following two-step approach. First, a novel parametric representation of wrinkles is developed, where the spread of parameters defines the wrinkle distribution. Second, expensive forward models are only solved for independent wrinkles using \texttt{dune-composites}. Besides scalability, the other key feature of \texttt{dune-composites}, the \texttt{GenEO} coarse space, doubles as an excellent multiscale basis, which is exploited to build high-quality reduced order models that are orders of magnitude smaller. This is important because it enables multiple coarse solves for the cost of one fine solve. In an MCMC framework, where many solves are wasted in arriving at the next independent sample, this is a sought-after quality because it greatly increases effective sample size for a fixed computational budget, thus providing a route to high-fidelity UQ. This thesis exploits both the new solvers and the multiscale methods developed here to design an efficient Bayesian framework to carry out previously intractable (large-scale) simulations calibrated by experimental data. These new capabilities provide the basis for future work on modelling random heterogeneous materials while also offering scope for building virtual test programs including nonlinear analyses, all of which can be implemented within a probabilistic setting.
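The notion of effective sample size used in this abstract can be illustrated with a small sketch. It estimates ESS as N / (1 + 2 * sum of autocorrelations), truncating the sum at the first non-positive autocorrelation. This is one common simple estimator and is assumed here purely for illustration; it is not taken from the thesis, and the function name is hypothetical.

```python
def effective_sample_size(chain):
    """Estimate the effective sample size of a scalar MCMC chain.

    Hypothetical helper using ESS = N / (1 + 2 * sum_k rho_k), where the sum
    over lag-k autocorrelations rho_k stops at the first non-positive value.
    """
    n = len(chain)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(chain) / n
    centered = [x - mean for x in chain]
    variance = sum(c * c for c in centered) / n
    if variance == 0.0:
        raise ValueError("chain is constant; ESS is undefined")
    rho_sum = 0.0
    for lag in range(1, n):
        # Biased autocovariance estimate at this lag (divides by n, not n - lag).
        acov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = acov / variance
        if rho <= 0.0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


if __name__ == "__main__":
    sticky = [0.0] * 50 + [1.0] * 50  # strongly correlated: few effective samples
    alternating = [1.0, -1.0] * 50    # negatively correlated at lag 1
    print("sticky chain ESS:", effective_sample_size(sticky))
    print("alternating chain ESS:", effective_sample_size(alternating))
```

A strongly autocorrelated chain yields an ESS far below its length, which is why cheaper coarse solves that allow more steps between retained samples can raise the ESS obtained for a fixed computational budget.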