8 research outputs found

    BASS: An R Package for Fitting and Performing Sensitivity Analysis of Bayesian Adaptive Spline Surfaces

    We present the R package BASS as a tool for nonparametric regression. The primary focus of the package is fitting fully Bayesian adaptive spline surface (BASS) models and performing global sensitivity analyses of these models. The BASS framework is similar to that of Bayesian multivariate adaptive regression splines (BMARS) from Denison, Mallick, and Smith (1998), but with many added features. The software is built to efficiently handle large amounts of data with many continuous or categorical predictors and with functional response. Under our Bayesian framework, most priors are automatic, but these can be modified by the user to focus on parsimony and the avoidance of overfitting. If directed to do so, the software uses parallel tempering to improve the reversible jump Markov chain Monte Carlo (RJMCMC) methods used to perform inference. We discuss the implementation of these features and present the performance of BASS in a number of analyses of simulated and real data.

    Generalized Bayesian MARS: Tools for Emulating Stochastic Computer Models

    The multivariate adaptive regression spline (MARS) approach of Friedman (1991) and its Bayesian counterpart (Francom et al. 2018) are effective tools for the emulation of computer models. The traditional assumption of Gaussian errors limits the usefulness of MARS, and of many popular alternatives, when dealing with stochastic computer models. We propose a generalized Bayesian MARS (GBMARS) framework which admits the broad class of generalized hyperbolic distributions as the induced likelihood function. This allows us to develop tools for the emulation of stochastic simulators which are parsimonious, scalable, interpretable and require minimal tuning, while providing powerful predictive and uncertainty quantification capabilities. GBMARS is capable of robust regression with t distributions, quantile regression with asymmetric Laplace distributions and a general form of "Normal-Wald" regression in which the shape of the error distribution and the structure of the mean function are learned simultaneously. We demonstrate the effectiveness of GBMARS on various stochastic computer models and we show that it compares favorably to several popular alternatives.
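
The link between quantile regression and the asymmetric Laplace distribution mentioned in this abstract can be illustrated with a small sketch. It is not the GBMARS implementation: it only shows, for a constant mean function, that maximizing an asymmetric Laplace likelihood in its location parameter is the same as minimizing the check (pinball) loss, which recovers a sample quantile. All function names and the toy data are hypothetical.

```python
import math


def check_loss(u, tau):
    """Check (pinball) loss rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def ald_negative_log_likelihood(ys, mu, tau, sigma=1.0):
    """Negative log-likelihood of an asymmetric Laplace distribution with
    density tau * (1 - tau) / sigma * exp(-rho_tau((y - mu) / sigma))."""
    const = -len(ys) * math.log(tau * (1.0 - tau) / sigma)
    return const + sum(check_loss((y - mu) / sigma, tau) for y in ys)


def ald_location_mle(ys, tau):
    """Location estimate for a constant mean function.

    The objective is piecewise linear in mu, so a minimizer can be found
    among the observed data points (assumption: a simple grid over the data
    is enough for this toy case)."""
    return min(ys, key=lambda mu: ald_negative_log_likelihood(ys, mu, tau))


# Hypothetical toy data: ten regular values and one large outlier.
ys = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 50.0]
median_fit = ald_location_mle(ys, 0.5)  # tau = 0.5 targets the median
upper_fit = ald_location_mle(ys, 0.9)   # tau = 0.9 targets the 90% quantile
```

With tau = 0.5 the estimate is the sample median, which is unaffected by the outlier. In a regression setting the constant `mu` would be replaced by a mean function such as a MARS basis expansion.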

    Discovering Active Subspaces for High-Dimensional Computer Models

    Dimension reduction techniques have long been an important topic in statistics, and active subspaces (AS) have received much attention this past decade in the computer experiments literature. The most common approach to estimating the AS is to use Monte Carlo with numerical gradient evaluation. While sensible in some settings, this approach has obvious drawbacks. Recent research has demonstrated that active subspace calculations can be obtained in closed form, conditional on a Gaussian process (GP) surrogate, which can be limiting in high-dimensional settings for computational reasons. In this paper, we produce the relevant calculations for a more general case when the model of interest is a linear combination of tensor products. These general equations can be applied to the GP, recovering previous results as a special case, or applied to the models constructed by other regression techniques including multivariate adaptive regression splines (MARS). Using a MARS surrogate has many advantages including improved scaling, better estimation of active subspaces in high dimensions and the ability to handle a large number of prior distributions in closed form. In one real-world example, we obtain the active subspace of a radiation-transport code with 240 inputs and 9,372 model runs in under half an hour.
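
The baseline this abstract refers to, Monte Carlo estimation of the active subspace with numerical gradients, can be sketched as follows. This is a minimal illustration of that baseline only, not the closed-form method the paper proposes. It assumes inputs uniform on the unit cube, central finite differences for the gradient, and power iteration for the leading direction. The toy model and all function names are hypothetical.

```python
import math
import random


def numerical_gradient(f, x, h=1e-6):
    """Central finite-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        dn = list(x)
        up[i] += h
        dn[i] -= h
        grad.append((f(up) - f(dn)) / (2.0 * h))
    return grad


def estimate_as_matrix(f, dim, n_samples=500, seed=0):
    """Monte Carlo estimate of C = E[grad f(x) grad f(x)^T],
    with x drawn uniformly on [0, 1]^dim (an assumed input distribution)."""
    rng = random.Random(seed)
    C = [[0.0] * dim for _ in range(dim)]
    for _ in range(n_samples):
        x = [rng.random() for _ in range(dim)]
        g = numerical_gradient(f, x)
        for i in range(dim):
            for j in range(dim):
                C[i][j] += g[i] * g[j] / n_samples
    return C


def leading_direction(C, n_iter=200):
    """Power iteration for the dominant eigenvector of the symmetric matrix C."""
    dim = len(C)
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(n_iter):
        w = [sum(C[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
    return v


# Hypothetical toy model that varies only along the direction A.
A = [3.0, 4.0, 0.0]


def toy_model(x):
    return math.sin(sum(a * xi for a, xi in zip(A, x)))


C = estimate_as_matrix(toy_model, dim=3)
direction = leading_direction(C)  # expected to align with A / ||A||
```

Each Monte Carlo sample costs 2 * dim model evaluations here, which hints at why this baseline becomes expensive for models with hundreds of inputs.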

    Emulation and Uncertainty Quantification for Models with Functional Response Using Bayesian Adaptive Splines

    When a computer code is used to simulate a complex system, a fundamental task is to assess the uncertainty of the simulator. In the case of computationally expensive simulators, this is often accomplished via a surrogate statistical model, a statistical output emulator. An effective emulator is one that provides good approximations to the computer code output for wide ranges of input values. In addition, an emulator should be able to handle large dimensional simulation output for a relevant number of inputs; it should flexibly capture heterogeneities in the variability of the response surface; it should be fast to evaluate for arbitrary combinations of input parameters; and it should provide an accurate quantification of the emulation uncertainty. In this work, we develop Bayesian adaptive spline methods for emulation of computer models that output functions. We introduce modifications to traditional Bayesian adaptive spline approaches that allow for fitting large amounts of data and allow for more efficient Markov chain Monte Carlo sampling. We develop a functional approach to sensitivity analysis that can be performed using this emulator. We present a sensitivity analysis of a computer model of the deformation of a protective plate used in pressure driven experiments. This example serves as an illustration of the ability of Bayesian adaptive spline emulators to fulfill all the necessities of computability, flexibility and reliable calculation on relevant measures of sensitivity. We extend the methods to emulation of an atmospheric dispersion simulator that outputs a plume in space and time based on inputs detailing the characteristics of the release, some of which are categorical. We achieve accurate emulation using Bayesian adaptive splines to model weights on empirical orthogonal functions. We extend the adaptive spline methodology to allow for categorical inputs. We use this emulator as well as appropriately identifiable simulator discrepancy and observational error models to calibrate the simulator using a dataset from an experimental release of particles from the Diablo Canyon Nuclear Power Plant in Central California. Since the release was controlled, these characteristics are known, allowing us to compare our findings to the truth. We further extend the methods to emulate a computer model that outputs misaligned functional data. We do this by modeling the aligned, or warped, data as well as the warping functions, using separate Bayesian adaptive spline models. We explore inference methods that treat these models jointly and separately, and establish methods to ensure that the warping functions are non-decreasing. These methods are applied to a high-energy-density physics model that outputs a curve representing energy as a function of time.