Efficient construction of Bayes optimal designs for stochastic process models
Stochastic process models are now commonly used to analyse complex
biological, ecological and industrial systems. Increasingly there is a need to
deliver accurate estimates of model parameters and assess model fit by
optimizing the timing of measurement of these processes. Standard methods to
construct Bayes optimal designs, such as the well-known Müller algorithm, are
computationally intensive even for relatively simple models. A key issue is
that, in determining the merit of a design, the utility function typically
requires summaries of many parameter posterior distributions, each determined
via a computer-intensive scheme such as MCMC. This paper describes a fast and
computationally efficient scheme to determine optimal designs for stochastic
process models. The algorithm compares favourably with other methods for
determining optimal designs and can require up to an order of magnitude fewer
utility function evaluations for the same accuracy in the optimal design
solution. It benefits from being embarrassingly parallel and is ideal for
running on multi-core computers. The method is illustrated by determining
different-sized optimal designs for three problems of increasing complexity.
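As a rough illustration of the kind of computation involved, the sketch below approximates the expected utility of a candidate design (a vector of measurement times) by Monte Carlo; the toy decay model, grid posterior and negative-posterior-variance utility are assumptions for illustration, not the paper's algorithm. Each prior draw requires its own posterior summary, which is exactly the embarrassingly parallel workload the method targets.

```python
# Sketch only: Monte Carlo estimate of U(d) = E_{theta, y | d}[ u(d, theta, y) ]
# for a design d = vector of measurement times. The decay model, grid posterior
# and negative-posterior-variance utility are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
theta_grid = np.linspace(0.05, 2.0, 400)            # grid over a 1-d decay rate
prior = np.ones_like(theta_grid) / len(theta_grid)  # uniform prior on the grid
sigma = 0.1                                         # known observation noise

def simulate(times, theta):
    return np.exp(-theta * times) + rng.normal(0.0, sigma, size=len(times))

def posterior_variance(times, y):
    """Grid posterior of theta given one simulated data set."""
    means = np.exp(-np.outer(theta_grid, times))          # (grid, n_times)
    loglik = -0.5 * np.sum((y - means) ** 2, axis=1) / sigma**2
    w = prior * np.exp(loglik - loglik.max())
    w /= w.sum()
    m = np.sum(w * theta_grid)
    return np.sum(w * (theta_grid - m) ** 2)

def expected_utility(times, n_draws=200):
    # Independent draws: this loop is embarrassingly parallel across cores
    # (e.g. multiprocessing/joblib over draws or over candidate designs).
    thetas = rng.uniform(0.05, 2.0, n_draws)              # prior draws
    u = [-posterior_variance(times, simulate(times, t)) for t in thetas]
    return np.mean(u)

# Compare two candidate three-point measurement schedules.
for d in (np.array([0.5, 1.0, 1.5]), np.array([0.2, 1.0, 5.0])):
    print(d, round(expected_utility(d), 5))
```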
Design Issues for Generalized Linear Models: A Review
Generalized linear models (GLMs) have been used quite effectively in the
modeling of a mean response under nonstandard conditions, where discrete as
well as continuous data distributions can be accommodated. The choice of design
for a GLM is a very important task in the development and building of an
adequate model. However, one major problem that handicaps the construction of a
GLM design is its dependence on the unknown parameters of the fitted model.
Several approaches have been proposed in the past 25 years to solve this
problem. These approaches, however, have provided only partial solutions that
apply in only some special cases, and the problem, in general, remains largely
unresolved. The purpose of this article is to focus attention on the
aforementioned dependence problem. We provide a survey of various existing
techniques dealing with the dependence problem. This survey includes
discussions concerning locally optimal designs, sequential designs, Bayesian
designs and the quantile dispersion graph approach for comparing designs for
GLMs.
Comment: Published at http://dx.doi.org/10.1214/088342306000000105 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
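A small sketch of the parameter-dependence problem the survey is organized around (an illustration under assumed parameter values, not material from the article): for a logistic regression, the GLM weights in the Fisher information depend on the unknown coefficients, so a locally D-optimal two-point design shifts with the parameter guess it is computed at.

```python
# The Fisher information of a logistic GLM depends on the unknown (b0, b1),
# so any "locally D-optimal" design must be computed at a parameter guess.
import numpy as np
from scipy.optimize import minimize

def neg_log_det_info(x_pts, beta):
    """Negative log|M| for an equally weighted two-point design at x_pts."""
    b0, b1 = beta
    M = np.zeros((2, 2))
    for x in x_pts:
        p = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))
        w = p * (1.0 - p)                 # GLM weight: depends on unknown beta
        f = np.array([1.0, x])
        M += 0.5 * w * np.outer(f, f)
    sign, logdet = np.linalg.slogdet(M)
    return -logdet if sign > 0 else np.inf

for beta_guess in [(0.0, 1.0), (0.0, 3.0)]:
    res = minimize(neg_log_det_info, x0=np.array([-1.0, 1.0]),
                   args=(beta_guess,), method="Nelder-Mead")
    print("guess", beta_guess, "-> locally D-optimal support points",
          np.round(np.sort(res.x), 2))
```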
Active Learning Via Sequential Design and Uncertainty Sampling
Classification is an important task in many fields including biomedical
research and machine learning. Traditionally, a classification rule is
constructed from a set of labeled data. Recently, owing to technological
innovation and automatic data collection schemes, we frequently encounter data
sets containing large numbers of unlabeled samples. Because labeling each of
them is usually costly and inefficient, how to use these unlabeled data in
the construction of a classifier becomes an important problem. In the machine
learning literature, active learning and semi-supervised learning are popular
frameworks for this situation: classification algorithms sequentially recruit
new unlabeled subjects based on the information learned in previous stages, and
these new subjects are then labeled and included as additional training
samples. From a statistical perspective, such methods can be viewed as a hybrid
of sequential design and stochastic approximation. In this paper, we study
sequential learning procedures for building efficient and effective
classifiers, in which only the selected subjects are labeled and included at
each learning stage. The proposed algorithm
combines the ideas of Bayesian sequential optimal design and uncertainty
sampling. Computational issues of the algorithm are discussed. Numerical
results using both synthetic data and real examples are reported.
Comment: 25 pages, 8 figures
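A minimal uncertainty-sampling loop in the spirit described above (an illustrative sketch, not the paper's Bayesian sequential design procedure; the synthetic data and logistic model are assumptions): at each stage the classifier is refit on the labeled pool and the unlabeled point whose predicted probability is closest to 0.5 is queried, labeled, and added to the training set.

```python
# Sketch of uncertainty sampling for sequential classifier construction.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = (X[:, 0] + 0.5 * X[:, 1] + 0.3 * rng.normal(size=500) > 0).astype(int)

# Small initial labeled pool containing both classes; the rest are "unlabeled".
labeled = list(np.where(y == 0)[0][:5]) + list(np.where(y == 1)[0][:5])
unlabeled = [i for i in range(500) if i not in labeled]

for stage in range(30):
    clf = LogisticRegression().fit(X[labeled], y[labeled])
    p = clf.predict_proba(X[unlabeled])[:, 1]
    query = unlabeled[int(np.argmin(np.abs(p - 0.5)))]  # most uncertain subject
    labeled.append(query)                               # oracle supplies its label
    unlabeled.remove(query)

print("labels used:", len(labeled), " accuracy on all data:", clf.score(X, y))
```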
Optimal Experimental Design for Constrained Inverse Problems
In this paper, we address the challenging problem of optimal experimental
design (OED) of constrained inverse problems. We consider two OED formulations
that reduce experimental costs by minimizing the number of
measurements. The first formulation assumes a fine discretization of the design
parameter space and uses sparsity promoting regularization to obtain an
efficient design. The second formulation parameterizes the design and seeks
optimal placement for these measurements by solving a small-dimensional
optimization problem. We consider both problems in a Bayes risk as well as an
empirical Bayes risk minimization framework. For the unconstrained inverse
state problem, we exploit the closed form solution for the inner problem to
efficiently compute derivatives for the outer OED problem. The empirical
formulation does not require an explicit solution of the inverse problem and
therefore allows constraints to be integrated efficiently. A key contribution is an
efficient optimization method for solving the resulting, typically
high-dimensional, bilevel optimization problem using derivative-based methods.
To overcome the non-differentiability of active set methods for
inequality-constrained problems, we use a relaxed interior point method. To
address the growing computational complexity of empirical Bayes OED, we
parallelize the computation over the training models. Numerical examples and
illustrations from tomographic reconstruction, for various data sets and under
different constraints, demonstrate the impact of constraints on the optimal
design and highlight the importance of OED for constrained problems.
Comment: 19 pages, 8 figures
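The following toy sketch illustrates the first (sparsity-promoting) formulation on a small unconstrained linear problem; the matrices, the Tikhonov inner solve, the l1 weight penalty and all sizes are assumptions for illustration, and the paper's constraints, bilevel solver and interior point treatment are not reproduced.

```python
# Sketch: design weights w_i >= 0 per candidate measurement, an inner weighted
# Tikhonov reconstruction with a closed-form solution, and an l1 penalty on w
# so that only a few measurements keep nonzero weight. The risk is "empirical":
# averaged over a set of training models.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, m, n_train = 20, 40, 15
A = rng.normal(size=(m, n))                        # forward operator
X_train = rng.normal(size=(n, n_train))            # training models
Y_train = A @ X_train + 0.05 * rng.normal(size=(m, n_train))
beta, alpha = 1e-2, 5e-2                           # Tikhonov and l1 strengths

def empirical_risk(w):
    W = np.diag(w)
    H = A.T @ W @ A + beta * np.eye(n)             # inner problem (closed form)
    risk = 0.0
    for j in range(n_train):
        x_hat = np.linalg.solve(H, A.T @ W @ Y_train[:, j])
        risk += np.mean((x_hat - X_train[:, j]) ** 2)
    return risk / n_train + alpha * np.sum(w)      # l1 term since w >= 0

res = minimize(empirical_risk, x0=np.ones(m), bounds=[(0.0, 5.0)] * m,
               method="L-BFGS-B")
print("measurements kept:", int(np.sum(res.x > 1e-3)), "of", m)
```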
Bayesian Modeling of Inconsistent Plastic Response due to Material Variability
The advent of fabrication techniques such as additive manufacturing has
focused attention on the considerable variability of material response due to
defects and other microstructural aspects. This variability motivates the
development of an enhanced design methodology that incorporates inherent
material variability to provide robust predictions of performance. In this
work, we develop plasticity models capable of representing the distribution of
mechanical responses observed in experiments using traditional plasticity
models of the mean response and recently developed uncertainty quantification
(UQ) techniques. We demonstrate that the new method provides predictive
realizations that are superior to more traditional ones, and we show how these
UQ techniques can be used in model selection and in assessing the quality of
calibrated physical parameters.
Comment: 21 pages, 6 composite figures. arXiv admin note: substantial text overlap with arXiv:1802.0148
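As a schematic of what "representing the distribution of mechanical responses" can look like in practice (purely illustrative numbers and a linear-hardening law assumed here, not the models or calibration of the paper), one can draw material parameters from a fitted distribution and propagate each draw to a stress-strain realization, yielding a predictive band rather than a single mean curve.

```python
# Illustrative only: material variability as a distribution over the parameters
# of a simple elastic / linear-hardening law, propagated to response curves.
import numpy as np

rng = np.random.default_rng(0)
E = 200e3                                   # elastic modulus [MPa], held fixed
strain = np.linspace(0.0, 0.02, 100)

def stress(strain, sigma_y, H):
    """Elastic up to the yield strain, then linear isotropic hardening."""
    eps_y = sigma_y / E
    return np.where(strain < eps_y, E * strain, sigma_y + H * (strain - eps_y))

# Assumed parameter distribution standing in for calibrated variability.
sigma_y_draws = rng.normal(350.0, 25.0, size=200)      # yield stress [MPa]
H_draws = rng.lognormal(np.log(2.0e3), 0.2, size=200)  # hardening modulus [MPa]

curves = np.array([stress(strain, sy, h)
                   for sy, h in zip(sigma_y_draws, H_draws)])
lo, hi = np.percentile(curves, [5, 95], axis=0)        # predictive band
print("stress at 2% strain, 5th-95th percentile [MPa]:",
      round(lo[-1], 1), "-", round(hi[-1], 1))
```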
Efficient Bayesian experimentation using an expected information gain lower bound
Experimental design is crucial for inference where limitations in the data
collection procedure are present due to cost or other restrictions. Optimal
experimental designs determine parameters that in some appropriate sense make
the data the most informative possible. In a Bayesian setting this is
translated to updating to the best possible posterior. Information theoretic
arguments have led to the formulation of the expected information gain as a
design criterion. This can be evaluated mainly by Monte Carlo sampling and
maximized by using stochastic approximation methods, both known for being
computationally expensive tasks. We propose a framework where a lower bound of
the expected information gain is used as an alternative design criterion. In
addition to alleviating the computational burden, this also addresses issues
concerning estimation bias. The problem of permeability inference in a large
contaminated area is used to demonstrate the validity of our approach, in which
we employ the massively parallel version of the multiphase multicomponent
simulator TOUGH2 to simulate contaminant transport, together with a Polynomial
Chaos approximation of the forward model that further accelerates the objective
function evaluations. The proposed methodology is demonstrated in a setting
where field measurements are available.
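For context, the expected information gain criterion referred to above, and the standard nested Monte Carlo estimator whose cost and bias motivate lower-bound surrogates, are (generic definitions, not the specific bound proposed in the paper):

```latex
% Expected information gain of a design d (generic definition):
\mathrm{EIG}(d)
  = \int p(y \mid d) \int p(\theta \mid y, d)
      \log \frac{p(\theta \mid y, d)}{p(\theta)} \, d\theta \, dy
  = \mathbb{E}_{\theta, y \mid d}\!\left[
      \log p(y \mid \theta, d) - \log p(y \mid d) \right],
% and its standard nested Monte Carlo estimator (biased for finite M):
\widehat{\mathrm{EIG}}(d)
  = \frac{1}{N} \sum_{n=1}^{N}
    \left[ \log p\!\left(y^{(n)} \mid \theta^{(n)}, d\right)
         - \log \frac{1}{M} \sum_{m=1}^{M}
             p\!\left(y^{(n)} \mid \tilde{\theta}^{(n,m)}, d\right) \right],
\qquad \theta^{(n)}, \tilde{\theta}^{(n,m)} \sim p(\theta),\;
       y^{(n)} \sim p\!\left(y \mid \theta^{(n)}, d\right).
```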
Probabilistic representation and inverse design of metamaterials based on a deep generative model with semi-supervised learning strategy
The research of metamaterials has achieved enormous success in the
manipulation of light in an artificially prescribed manner using delicately
designed sub-wavelength structures, so-called meta-atoms. Even though modern
numerical methods allow accurate calculation of the optical response of complex
structures, the inverse design of metamaterials is still a challenging task due
to the non-intuitive and non-unique relationship between physical structures
and optical responses. To better unveil this implicit relationship and thus
facilitate metamaterial design, we propose to represent metamaterials and model
the inverse design problem in a probabilistically generative manner. By
employing an encoder-decoder configuration, our deep generative model
compresses the meta-atom design and optical response into a latent space, where
similar designs and similar optical responses are automatically clustered
together. Therefore, by sampling in the latent space, the stochastic latent
variables function as codes, from which the candidate designs are generated
upon given requirements in a decoding process. With the effective latent
representation of metamaterials, we can elegantly model the complex
structure-performance relationship in an interpretable way, and solve the
one-to-many mapping issue that is intractable in a deterministic model.
Moreover, to alleviate the burden of numerical calculation in data collection,
we develop a semi-supervised learning strategy that allows our model to utilize
unlabeled data in addition to labeled data during training, simultaneously
optimizing the generative inverse design and deterministic forward prediction
in an end-to-end manner. On a data-driven basis, the proposed model can serve
as a comprehensive and efficient tool that accelerates the design,
characterization and even new discoveries in the research domain of metamaterials
and photonics in general.
Comment: 28 pages, 5 figures
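A minimal variational encoder-decoder in the spirit described above (a sketch only; the layer sizes, the 64-dimensional design vector, the 31-point spectrum and the loss weights are assumptions, not the paper's architecture): the encoder maps a design and its optical response to a latent Gaussian, and the decoder generates candidate designs from latent samples conditioned on a target response.

```python
# Sketch of a conditional VAE-style latent representation for inverse design.
import torch
import torch.nn as nn

D_DESIGN, D_SPEC, D_LAT = 64, 31, 8    # assumed sizes for illustration

class MetaVAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(D_DESIGN + D_SPEC, 128), nn.ReLU(),
                                 nn.Linear(128, 2 * D_LAT))       # mean, logvar
        self.dec = nn.Sequential(nn.Linear(D_LAT + D_SPEC, 128), nn.ReLU(),
                                 nn.Linear(128, D_DESIGN), nn.Sigmoid())

    def forward(self, design, spec):
        mu, logvar = self.enc(torch.cat([design, spec], -1)).chunk(2, -1)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)   # reparameterize
        recon = self.dec(torch.cat([z, spec], -1))
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return recon, kl

model = MetaVAE()
design = torch.rand(16, D_DESIGN)      # toy batch of meta-atom layouts
spec = torch.rand(16, D_SPEC)          # toy optical responses
recon, kl = model(design, spec)
loss = nn.functional.binary_cross_entropy(recon, design) + 0.1 * kl
loss.backward()

# Inverse design: sample latent codes and decode candidates for one target spectrum.
z = torch.randn(5, D_LAT)
candidates = model.dec(torch.cat([z, spec[:1].repeat(5, 1)], -1))
```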
Unbiased MLMC stochastic gradient-based optimization of Bayesian experimental designs
In this paper we propose an efficient stochastic optimization algorithm to
search for Bayesian experimental designs such that the expected information
gain is maximized. The gradient of the expected information gain with respect
to experimental design parameters is given by a nested expectation, for which
the standard Monte Carlo method using a fixed number of inner samples yields a
biased estimator. In this paper, applying the idea of randomized multilevel
Monte Carlo (MLMC) methods, we introduce an unbiased Monte Carlo estimator for
the gradient of the expected information gain with finite expected squared
$\ell_2$-norm and finite expected computational cost per sample. Our unbiased
estimator can be combined well with stochastic gradient descent algorithms,
which results in our proposal of an optimization algorithm to search for an
optimal Bayesian experimental design. Numerical experiments confirm that our
proposed algorithm works well not only for a simple test problem but also for a
more realistic pharmacokinetic problem.
Comment: major revision, 26 pages, 6 figures
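The debiasing idea behind such randomized MLMC estimators can be written compactly (this is the generic single-term construction; the paper's particular coupling, level distribution and estimator details are not reproduced here):

```latex
% Let I_l denote a nested Monte Carlo gradient estimate with 2^l inner samples
% (an assumed level definition) and Delta_l = I_l - I_{l-1}, Delta_0 = I_0.
% Draw a random level L with P(L = l) = p_l > 0 and output
Z \;=\; \frac{\Delta_L}{p_L},
\qquad
\mathbb{E}[Z]
  \;=\; \sum_{\ell=0}^{\infty} p_\ell \,\frac{\mathbb{E}[\Delta_\ell]}{p_\ell}
  \;=\; \sum_{\ell=0}^{\infty} \mathbb{E}[\Delta_\ell]
  \;=\; \lim_{\ell \to \infty} \mathbb{E}[I_\ell]
  \;=\; \nabla_d \,\mathrm{EIG}(d),
% so Z is an unbiased gradient estimate that can be fed directly into a
% stochastic gradient update d <- d + gamma_t Z.
```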
Bayesian Nonparametric Estimation for Dynamic Treatment Regimes with Sequential Transition Times
Dynamic treatment regimes in oncology and other disease areas often can be
characterized by an alternating sequence of treatments or other actions and
transition times between disease states. The sequence of transition states may
vary substantially from patient to patient, depending on how the regime plays
out, and in practice there often are many possible counterfactual outcome
sequences. For evaluating the regimes, the mean final overall time may be
expressed as a weighted average of the means of all possible sums of successive
transition times. A common example arises in cancer therapies where the
transition times between various sequences of treatments, disease remission,
disease progression, and death characterize overall survival time. For the
general setting, we propose estimating mean overall outcome time by assuming a
Bayesian nonparametric regression model for the logarithm of each transition
time. A dependent Dirichlet process prior with Gaussian process base measure
(DDP-GP) is assumed, and a joint posterior is obtained by Markov chain Monte
Carlo (MCMC) sampling. We provide general guidelines for constructing a prior
using empirical Bayes methods. We compare the proposed approach with inverse
probability of treatment weighting. These comparisons are done by simulation
studies of both single-stage and multi-stage regimes, with treatment assignment
depending on baseline covariates. The method is applied to analyze a dataset
arising from a clinical trial involving multi-stage chemotherapy regimes for
acute leukemia. An R program for implementing the DDP-GP-based Bayesian
nonparametric analysis is freely available at
https://www.ma.utexas.edu/users/yxu/
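Schematically, the weighted-average representation of the mean overall time mentioned above can be written as follows (the notation is ours for illustration, not the paper's):

```latex
% S = set of possible transition-state paths under a regime, p_s = probability
% of path s, tau_{s,k} = k-th transition time along that path. Then
\mathbb{E}[T]
  \;=\; \sum_{s \in S} p_s \,
        \mathbb{E}\!\left[\sum_{k=1}^{K_s} \tau_{s,k}\right]
  \;=\; \sum_{s \in S} p_s \sum_{k=1}^{K_s} \mathbb{E}\!\left[\tau_{s,k}\right],
% i.e. a weighted average of the means of all possible sums of successive
% transition times, with each E[tau_{s,k}] obtained from the regression model
% for the log transition times.
```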
Computer emulation with non-stationary Gaussian processes
Gaussian process (GP) models are widely used to emulate propagation of
uncertainty in computer experiments. GP emulation sits comfortably within an
analytically tractable Bayesian framework. Apart from propagating uncertainty
of the input variables, a GP emulator trained on finitely many runs of the
experiment also offers error bars for response surface estimates at unseen
input values. This helps select future input values where the experiment should
be run to minimize the uncertainty in the response surface estimation. However,
traditional GP emulators use stationary covariance functions, which perform
poorly and lead to sub-optimal selection of future input points when the
response surface has sharp local features, such as a jump discontinuity or an
isolated tall peak. We propose an easily implemented non-stationary GP
emulator, based on two stationary GPs, one nested into the other, and
demonstrate its superior ability in handling local features and selecting
future input points from the boundaries of such features.
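One easily implemented way to see what composing two GPs can buy (an illustrative input-warping toy under the stated assumptions, not the construction of the paper): a pilot stationary GP is fit, the magnitude of its fitted derivative defines a cumulative warping of the inputs, and a second stationary GP fit on the warped inputs concentrates resolution where the response changes rapidly, such as at a jump.

```python
# Toy: non-stationary behaviour from two stationary GP stages (input warping).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 40))[:, None]
y = np.where(x[:, 0] < 0.5, 0.0, 2.0) + 0.3 * np.sin(6 * x[:, 0])  # jump at 0.5
kernel = 1.0 * RBF(0.1) + WhiteKernel(1e-3)

# Stage 1: pilot stationary GP and the magnitude of its fitted derivative.
pilot = GaussianProcessRegressor(kernel=kernel).fit(x, y)
grid = np.linspace(0, 1, 200)[:, None]
slope = np.abs(np.gradient(pilot.predict(grid), grid[:, 0]))

# Stage 2: cumulative warping w(x), then a stationary GP on the warped inputs.
warp = np.cumsum(slope + 1e-3)
warp = (warp - warp[0]) / (warp[-1] - warp[0])
w_of = lambda xs: np.interp(xs[:, 0], grid[:, 0], warp)[:, None]
nonstat = GaussianProcessRegressor(kernel=kernel).fit(w_of(x), y)

true = np.where(grid[:, 0] < 0.5, 0.0, 2.0) + 0.3 * np.sin(6 * grid[:, 0])
print("RMSE, stationary GP:",
      round(float(np.sqrt(np.mean((pilot.predict(grid) - true) ** 2))), 3))
print("RMSE, warped-input GP:",
      round(float(np.sqrt(np.mean((nonstat.predict(w_of(grid)) - true) ** 2))), 3))
```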