Statistical validation and calibration of computer models
This thesis deals with modeling, validation and calibration problems in experiments on computer models. Computer models are mathematical representations of real systems, developed for understanding and investigating those systems. Before a computer model is used, it often needs to be validated by comparing the computer outputs with physical observations, and calibrated by adjusting internal model parameters to improve the agreement between the computer outputs and physical observations. As computer models become more powerful and popular, the complexity of input and output data raises new computational challenges and stimulates the development of novel statistical modeling methods.
One challenge is to deal with computer models with random inputs (random effects). Such computer models are common in engineering applications. For example, in a thermal experiment at Sandia National Laboratories (Dowding et al. 2008), the volumetric heat capacity and thermal conductivity are random input variables. If input variables are randomly sampled from particular distributions with unknown parameters, the existing methods in the literature are not directly applicable, because the joint likelihood requires integration over the random-variable distribution and this integral cannot always be expressed in closed form. In this research, we propose a new approach that combines the nonlinear mixed effects model with the Gaussian process model (kriging model). Different model formulations are also studied, using the thermal problem, to gain a better understanding of validation and calibration activities.
Another challenge comes from computer models with functional outputs. While many methods have been developed for modeling computer experiments with a single response, the literature on modeling computer experiments with functional responses is sparse. Dimension reduction techniques can be used to tame the complexity of functional responses; however, they generally involve two steps. Models are first fit at each individual input setting to reduce the dimensionality of the functional data; the estimated parameters of these models are then treated as new responses, which are modeled further for prediction. Alternatively, pointwise models are first constructed at each time point, and functional curves are then fit to the parameter estimates obtained from the fitted models. In this research, we first propose a functional regression model that relates functional responses to both design and time variables in a single step. Second, we propose a functional kriging model that performs variable selection by imposing a penalty function. We show that the proposed model outperforms dimension-reduction-based approaches and the kriging model without regularization. In addition, non-asymptotic theoretical bounds on the estimation error are presented.
Ph.D. Committee Chair: Tsui, Kwok-Leung; Committee Members: Goldsman, David; Hung, Ying; Shi, Jianjun; Vengazhiyil, Rosha
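Both strands of the thesis build on Gaussian process (kriging) emulation of computer-model output. As a point of reference, here is a minimal kriging-style predictor with a squared-exponential covariance; the kernel, its fixed hyperparameters, and the toy sine "computer model" are illustrative assumptions, not the thesis's actual formulation:

```python
import numpy as np

def rbf_kernel(X1, X2, length=0.3, var=1.0):
    # Squared-exponential (Gaussian) covariance, the usual kriging choice.
    d2 = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return var * np.exp(-d2 / (2.0 * length ** 2))

def krige(X_train, y_train, X_new, noise=1e-8):
    # Standard GP/kriging predictive mean: k(X_new, X) K^{-1} y.
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    Ks = rbf_kernel(X_new, X_train)
    alpha = np.linalg.solve(K, y_train)
    return Ks @ alpha

# Toy "computer model": a deterministic function of one input,
# evaluated at 8 design points on [0, 1].
X = np.linspace(0, 1, 8).reshape(-1, 1)
y = np.sin(2 * np.pi * X[:, 0])
pred = krige(X, y, np.array([[0.5]]))  # interpolate at an untried input
```

The small `noise` jitter keeps the covariance matrix well conditioned; for a stochastic simulator it would instead be estimated as a genuine nugget.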
Design of Experiments for Screening
The aim of this paper is to review methods of designing screening experiments, ranging from designs originally developed for physical experiments to those especially tailored to experiments on numerical models. The strengths and weaknesses of the various designs for screening variables in numerical models are discussed. First, classes of factorial designs for experiments to estimate main effects and interactions through a linear statistical model are described, specifically regular and nonregular fractional factorial designs, supersaturated designs and systematic fractional replicate designs. Generic issues of aliasing, bias and cancellation of factorial effects are discussed. Second, group screening experiments are considered, including factorial group screening and sequential bifurcation. Third, random sampling plans are discussed, including Latin hypercube sampling and sampling plans to estimate elementary effects. Fourth, a variety of modelling methods commonly employed with screening designs are briefly described. Finally, a novel study demonstrates six screening methods on two frequently used exemplars, and their performances are compared.
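Latin hypercube sampling, listed among the random sampling plans, is simple to sketch: each input's range is cut into n equal bins, and each bin is sampled exactly once per input. A minimal implementation (the jittering scheme and function name are our own, not from the paper):

```python
import numpy as np

def latin_hypercube(n, d, rng=None):
    # One point per bin in every dimension: for each of the d inputs,
    # a random permutation assigns the n bins to the n runs, and a
    # uniform jitter places the point inside its bin.
    rng = np.random.default_rng(rng)
    u = rng.random((n, d))  # position within each bin
    perms = np.column_stack([rng.permutation(n) for _ in range(d)])
    return (perms + u) / n  # design points in [0, 1)^d

design = latin_hypercube(10, 3, rng=0)  # 10 runs, 3 inputs
```

Projected onto any single input, the design covers all 10 bins exactly once, which is what makes it attractive for screening many inputs with few runs.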
Multi-Resolution Functional ANOVA for Large-Scale, Many-Input Computer Experiments
The Gaussian process is a standard tool for building emulators for both deterministic and stochastic computer experiments. However, the application of Gaussian process models is greatly limited in practice, particularly for the large-scale, many-input computer experiments that have become typical. We propose a multi-resolution functional ANOVA model as a computationally feasible emulation alternative. More generally, this model can be used for large-scale, many-input non-linear regression problems. An overlapping group lasso approach is used for estimation, ensuring computational feasibility in a large-scale, many-input setting. New results on consistency and inference for the (potentially overlapping) group lasso in a high-dimensional setting are developed and applied to the proposed multi-resolution functional ANOVA model. Importantly, these results allow us to quantify the uncertainty in our predictions. Numerical examples demonstrate that the proposed model enjoys marked computational advantages. Its data capabilities, in terms of both sample size and dimension, meet or exceed those of the best available emulation tools, while its emulation accuracy matches or exceeds theirs.
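The estimation step rests on the group lasso, which shrinks whole groups of coefficients to zero at once. A simplified sketch using proximal gradient descent on a non-overlapping group structure (the overlapping case in the paper requires extra machinery such as variable duplication; the data, penalty weight, and group layout below are invented for illustration):

```python
import numpy as np

def group_soft_threshold(v, t):
    # Proximal operator of t * ||v||_2: shrink the whole group toward zero,
    # setting it exactly to zero when its norm falls below the threshold.
    norm = np.linalg.norm(v)
    return np.zeros_like(v) if norm <= t else (1.0 - t / norm) * v

def group_lasso(X, y, groups, lam=0.1, iters=500):
    # Proximal gradient descent on
    #   (1/2n) ||y - X b||^2 + lam * sum_g ||b_g||_2 .
    n, p = X.shape
    step = n / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the loss
    b = np.zeros(p)
    for _ in range(iters):
        z = b - step * (X.T @ (X @ b - y) / n)  # gradient step
        for g in groups:
            b[g] = group_soft_threshold(z[g], step * lam)  # groupwise prox
    return b

# Toy problem: only the first group of inputs actually matters.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 6))
beta_true = np.array([2.0, -1.5, 0.0, 0.0, 0.0, 0.0])
y = X @ beta_true
groups = [np.arange(0, 2), np.arange(2, 4), np.arange(4, 6)]
b = group_lasso(X, y, groups, lam=0.05)
```

The groupwise thresholding is what produces selection at the level of whole ANOVA terms rather than individual coefficients.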
Observational-Interventional Priors for Dose-Response Learning
Controlled interventions provide the most direct source of information for learning causal effects. In particular, a dose-response curve can be learned by varying the treatment level and observing the corresponding outcomes. However, interventions can be expensive and time-consuming. Observational data, where the treatment is not controlled by a known mechanism, is sometimes available. Under some strong assumptions, observational data allows for the estimation of dose-response curves. Estimating such curves nonparametrically is hard: sample sizes for controlled interventions may be small, while in the observational case a large number of measured confounders may need to be marginalized. In this paper, we introduce a hierarchical Gaussian process prior that constructs a distribution over the dose-response curve by learning from observational data, and reshapes that distribution with a nonparametric affine transform learned from controlled interventions. This composition of functions learned from different sources is shown to speed up learning, which we demonstrate with a thorough sensitivity analysis and an application to modeling the effect of therapy on the cognitive skills of premature infants.
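The core idea, composing a curve learned from observational data with a transform calibrated on a handful of interventions, can be sketched in a toy setting. Here a plain GP posterior mean stands in for the hierarchical prior, and a parametric affine map a + b·f(d) stands in for the paper's nonparametric affine transform; the confounding offset and sample sizes are invented for illustration:

```python
import numpy as np

def gp_mean(X, y, Xs, length=0.5, noise=1e-4):
    # Posterior mean of a GP with squared-exponential covariance.
    def k(A, B):
        return np.exp(-(A[:, None] - B[None, :]) ** 2 / (2 * length ** 2))
    K = k(X, X) + noise * np.eye(len(X))
    return k(Xs, X) @ np.linalg.solve(K, y)

# Step 1: learn the curve's shape from (biased) observational data.
doses = np.linspace(0, 1, 20)
obs_response = np.sin(2 * doses) + 0.3  # confounding adds a constant offset
f_obs = lambda d: gp_mean(doses, obs_response, d)

# Step 2: reshape with an affine map a + b * f_obs(d) fitted to a few
# controlled interventions (here the unconfounded curve sin(2d)).
d_int = np.array([0.1, 0.5, 0.9])
y_int = np.sin(2 * d_int)
F = np.column_stack([np.ones(len(d_int)), f_obs(d_int)])
a, b = np.linalg.lstsq(F, y_int, rcond=None)[0]
dose_response = lambda d: a + b * f_obs(d)
```

The observational fit supplies the shape (many points, wrong level), while three interventional points suffice to pin down the two affine parameters, which is the sample-efficiency argument in the paper.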
Multivariate emulation of computer simulators: model selection and diagnostics with application to a humanitarian relief model
We present a common framework for Bayesian emulation methodologies for multivariate-output simulators, or computer models, that employ either parametric linear models or nonparametric Gaussian processes. Novel diagnostics suitable for multivariate covariance-separable emulators are developed, and techniques to improve the adequacy of an emulator are discussed and implemented. A variety of emulators are compared for a humanitarian relief simulator, modelling aid missions to Sicily after a volcanic eruption and earthquake, and a sensitivity analysis is conducted to determine the sensitivity of the simulator output to changes in the input variables. The results from parametric and nonparametric emulators are compared in terms of prediction accuracy, uncertainty quantification and scientific interpretability.