    Statistical validation and calibration of computer models

    This thesis deals with modeling, validation, and calibration problems in experiments on computer models. Computer models are mathematical representations of real systems, developed for understanding and investigating those systems. Before a computer model is used, it often needs to be validated by comparing its outputs with physical observations, and calibrated by adjusting internal model parameters to improve the agreement between the two. As computer models become more powerful and popular, the complexity of input and output data raises new computational challenges and stimulates the development of novel statistical modeling methods.

    One challenge is dealing with computer models with random inputs (random effects), which are very common in engineering applications. For example, in a thermal experiment at Sandia National Laboratories (Dowding et al. 2008), the volumetric heat capacity and thermal conductivity are random input variables. If the input variables are randomly sampled from particular distributions with unknown parameters, existing methods in the literature are not directly applicable: the joint likelihood requires integration over the random-variable distribution, and this integral cannot always be expressed in closed form. In this research, we propose a new approach that combines the nonlinear mixed effects model with the Gaussian process (Kriging) model. Different model formulations are also studied, using the thermal problem, to gain a better understanding of validation and calibration activities.

    Another challenge comes from computer models with functional outputs. While many methods have been developed for modeling computer experiments with a single response, the literature on modeling computer experiments with functional responses is sparse. Dimension reduction techniques can be used to tame the complexity of functional responses, but they generally involve two steps: models are first fit at each individual input setting to reduce the dimensionality of the functional data, and the estimated model parameters are then treated as new responses, which are modeled further for prediction. Alternatively, pointwise models are first constructed at each time point, and functional curves are then fit to the resulting parameter estimates. In this research, we first propose a functional regression model that relates functional responses to both design and time variables in a single step. Second, we propose a functional Kriging model that performs variable selection by imposing a penalty function. We show that the proposed model outperforms dimension-reduction-based approaches and the Kriging model without regularization. In addition, non-asymptotic theoretical bounds on the estimation error are presented.

    Ph.D. Committee Chair: Tsui, Kwok-Leung; Committee Member: Goldsman, David; Committee Member: Hung, Ying; Committee Member: Shi, Jianjun; Committee Member: Vengazhiyil, Roshan
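    To make the surrogate-modeling vocabulary above concrete, here is a minimal Kriging (Gaussian process regression) sketch in Python with scikit-learn. The two inputs, the test function, and the kernel settings are illustrative assumptions, not the thesis's formulation.

        # Minimal Kriging sketch: fit a GP surrogate to hypothetical computer-model
        # runs, then predict with uncertainty at new input settings, as one would
        # when comparing computer outputs against physical observations.
        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF, WhiteKernel

        rng = np.random.default_rng(0)

        # Hypothetical runs: inputs X (e.g. heat capacity, conductivity), outputs y.
        X = rng.uniform(0.0, 1.0, size=(30, 2))
        y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2 + rng.normal(0.0, 0.05, size=30)

        # RBF kernel plus a nugget (WhiteKernel) to absorb observation noise.
        gp = GaussianProcessRegressor(
            kernel=RBF(length_scale=0.5) + WhiteKernel(noise_level=1e-2),
            normalize_y=True,
        )
        gp.fit(X, y)

        # Predictive mean and standard deviation at new settings.
        mean, std = gp.predict(rng.uniform(0.0, 1.0, size=(5, 2)), return_std=True)
        print(mean.round(3), std.round(3))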

    Input variable selection in time-critical knowledge integration applications: A review, analysis, and recommendation paper

    This is the post-print version of the final paper published in Advanced Engineering Informatics. The published article is available from the link below. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. Copyright @ 2013 Elsevier B.V.

    The purpose of this research is twofold: first, to undertake a thorough appraisal of existing Input Variable Selection (IVS) methods within the context of time-critical and computation-resource-limited dimensionality reduction problems; second, to demonstrate improvements to, and the application of, a recently proposed time-critical sensitivity analysis method called EventTracker in an environmental science industrial use case, namely sub-surface drilling. Producing accurate, time-critical knowledge about the state of a system (effect) under computational and data acquisition (cause) constraints is a major challenge, especially when that knowledge is critical to system operation and the safety of operators or the integrity of costly equipment is at stake. Understanding and interpreting a chain of interrelated events, predicted or unpredicted, that may or may not result in a specific system state is the core challenge of this research. The main objective is to identify which set of input data signals has a significant impact on the set of system state information (i.e., the output). Through a cause-effect analysis technique, the proposed method supports the filtering of unsolicited data that would otherwise clog the communication and computational capabilities of a standard supervisory control and data acquisition (SCADA) system. The paper analyzes the performance of input variable selection techniques from a series of perspectives. It then expands the categorization and assessment of sensitivity analysis methods into a structured framework that takes into account the relationship between inputs and outputs, the nature of their time series, and the computational effort required. The outcome of this analysis is that established methods have only limited suitability for time-critical variable selection applications. By way of a geological drilling monitoring scenario, the suitability of the proposed EventTracker sensitivity analysis method for high-volume, time-critical input variable selection problems is demonstrated.
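    The EventTracker internals are not reproduced in this abstract, so the sketch below instead shows a common baseline from the IVS literature the paper reviews: rank candidate input signals by mutual information with the output and keep the top few. All data and the choice of k are hypothetical.

        # Mutual-information-based input variable selection (a generic IVS baseline,
        # not the EventTracker algorithm): score each input signal against the
        # output and keep the most informative ones.
        import numpy as np
        from sklearn.feature_selection import mutual_info_regression

        rng = np.random.default_rng(1)

        # Hypothetical sensor signals (columns) and one system-state output.
        X = rng.normal(size=(500, 10))
        y = 2.0 * X[:, 0] - 0.5 * X[:, 3] + rng.normal(scale=0.1, size=500)

        mi = mutual_info_regression(X, y, random_state=1)
        top_k = np.argsort(mi)[::-1][:3]   # indices of the 3 most informative inputs
        print("selected inputs:", top_k, "MI scores:", mi[top_k].round(3))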

    Canonical correlation analysis and DEA for azorean agriculture efficiency

    In this paper we document the application of canonical correlation analysis to variable aggregation, using the correlations of the original variables with the canonical variates. A case study with a small data set of 30 farms on Terceira Island is presented, in which 17 input variables and 2 output variables are used to measure DEA efficiency. Without any data reduction procedure, problems known collectively as the "curse of dimensionality" are expected. With the suggested data reduction procedures, it was possible to reach quite acceptable and domain-consistent conclusions.
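    A minimal sketch of the aggregation idea, assuming scikit-learn's CCA and synthetic farm data (the DEA step itself, a linear program per farm, is omitted): replace the 17 raw inputs with a few canonical variates, and use the correlations of the original inputs with those variates to interpret the aggregates.

        # CCA-based variable aggregation before DEA, on synthetic stand-in data
        # for 30 farms with 17 inputs and 2 outputs.
        import numpy as np
        from sklearn.cross_decomposition import CCA

        rng = np.random.default_rng(2)
        X = rng.uniform(1.0, 10.0, size=(30, 17))               # 17 input variables
        Y = X[:, :2] @ rng.uniform(size=(2, 2)) \
            + rng.normal(scale=0.1, size=(30, 2))               # 2 output variables

        cca = CCA(n_components=2)
        X_c, Y_c = cca.fit_transform(X, Y)                      # canonical variates

        # Correlations of the original inputs with the canonical variates guide
        # the interpretation of each aggregate (the paper's key step).
        loadings = np.corrcoef(np.hstack([X, X_c]).T)[:17, 17:]
        print(loadings.round(2))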

    An extended orthogonal forward regression algorithm for system identification using entropy

    In this paper, a fast algorithm for nonlinear dynamic stochastic system identification is presented. The algorithm extends the classical Orthogonal Forward Regression (OFR) algorithm so that, instead of using the Error Reduction Ratio (ERR) for term selection, a new optimality criterion, Shannon's Entropy Power Reduction Ratio (EPRR), is introduced to deal with both Gaussian and non-Gaussian signals. It is shown that the new algorithm is both fast and reliable, and examples are provided to illustrate the effectiveness of the new approach.
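    The classical OFR loop that the paper extends can be sketched compactly; the entropy-based EPRR criterion itself is not reproduced here, so term selection below uses the standard Error Reduction Ratio on synthetic data.

        # Classical OFR with ERR: greedily pick the candidate regressor whose
        # orthogonalized version explains the largest fraction of the output energy.
        import numpy as np

        def ofr_err(P, y, n_terms):
            selected, Q = [], []
            for _ in range(n_terms):
                best = (-1.0, None, None)                  # (ERR, index, column)
                for j in range(P.shape[1]):
                    if j in selected:
                        continue
                    q = P[:, j].copy()
                    for qk in Q:                           # Gram-Schmidt step
                        q -= (qk @ P[:, j]) / (qk @ qk) * qk
                    err = (q @ y) ** 2 / ((q @ q) * (y @ y))
                    if err > best[0]:
                        best = (err, j, q)
                selected.append(best[1])
                Q.append(best[2])
            return selected

        rng = np.random.default_rng(3)
        P = rng.normal(size=(200, 8))                      # candidate model terms
        y = 1.5 * P[:, 2] - 0.8 * P[:, 5] + rng.normal(scale=0.1, size=200)
        print(ofr_err(P, y, 2))                            # expect columns 2 and 5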

    Model structure selection using an integrated forward orthogonal search algorithm assisted by squared correlation and mutual information

    Model structure selection plays a key role in non-linear system identification. The first step in non-linear system identification is to determine which model terms should be included in the model. Once the significant model terms have been determined, a model selection criterion can be applied to select a suitable model subset. The well-known Orthogonal Least Squares (OLS) type algorithms are among the most efficient and commonly used techniques for model structure selection. However, it has been observed that OLS type algorithms may occasionally select incorrect model terms or yield a redundant model subset in the presence of particular noise structures or input signals. A very efficient Integrated Forward Orthogonal Search (IFOS) algorithm, assisted by the squared correlation and mutual information and incorporating a Generalised Cross-Validation (GCV) criterion and hypothesis tests, is introduced to overcome these limitations in model structure selection.
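    As a small illustration of the GCV criterion that IFOS incorporates, the sketch below scores hypothetical residual sums of squares from a forward search and picks the model size that minimizes GCV; the term-ranking step (squared correlation / mutual information) is abstracted away.

        # GCV stopping rule: GCV(k) = n * RSS_k / (n - k)^2 for a k-term model.
        import numpy as np

        def gcv(rss, n, k):
            return n * rss / (n - k) ** 2

        n = 200
        rss_per_size = [150.0, 40.0, 12.0, 11.5, 11.4]     # hypothetical RSS values
        scores = [gcv(r, n, k + 1) for k, r in enumerate(rss_per_size)]
        print("GCV-selected model size:", int(np.argmin(scores)) + 1)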

    A unified wavelet-based modelling framework for non-linear system identification: the WANARX model structure

    A new unified modelling framework based on the superposition of additive submodels, functional components, and wavelet decompositions is proposed for non-linear system identification. A non-linear model, which is often represented by a multivariate non-linear function, is initially decomposed into a number of functional components via the well-known analysis of variance (ANOVA) expansion, which can be viewed as a special form of the NARX (non-linear autoregressive with exogenous inputs) model for representing dynamic input–output systems. By expanding each functional component using wavelet decompositions, including the regular lattice frame decomposition, wavelet series, and multiresolution wavelet decompositions, the multivariate non-linear model can then be converted into a linear-in-the-parameters problem, which can be solved using least-squares type methods. An efficient model structure determination approach based upon a forward orthogonal least squares (OLS) algorithm, which involves a stepwise orthogonalization of the regressors and a forward selection of the relevant model terms based on the error reduction ratio (ERR), is employed to solve the linear-in-the-parameters problem in the present study. The new modelling structure is referred to as a wavelet-based ANOVA decomposition of the NARX model, or simply the WANARX model, and can be applied to represent high-order and high-dimensional non-linear systems.
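    The "linear-in-the-parameters" step can be sketched as follows, with simple polynomial expansions standing in for the wavelet decompositions purely for brevity, and plain least squares standing in for the OLS/ERR selection; the toy system and all settings are assumptions.

        # Build a NARX regressor matrix from lagged inputs/outputs plus crude
        # nonlinear expansions, then fit the parameters by least squares.
        import numpy as np

        def narx_regressors(u, y, lags=2):
            n = len(y)
            cols = []
            for l in range(1, lags + 1):                   # lagged y and u terms
                cols += [y[lags - l:n - l], u[lags - l:n - l]]
            P = np.column_stack(cols)
            return np.column_stack([P, P ** 2])            # stand-in for wavelet terms

        rng = np.random.default_rng(4)
        u = rng.normal(size=300)
        y = np.zeros(300)
        for t in range(2, 300):                            # simulate a toy NARX system
            y[t] = 0.5 * y[t - 1] + 0.3 * u[t - 1] ** 2 + 0.01 * rng.normal()

        P = narx_regressors(u, y, lags=2)
        theta, *_ = np.linalg.lstsq(P, y[2:], rcond=None)
        print(theta.round(3))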

    Nonlinear Dimension Reduction for Micro-array Data (Small n and Large p)
