
    Missing Observations in Split-Plot Central Composite Designs: The Loss in Relative A-, G-, and V- Efficiency

    The trace (A), maximum average prediction variance (G), and integrated average prediction variance (V) criteria are experimental design evaluation criteria based on the precision of estimates of parameters and responses. Central composite designs (CCDs) conducted within a split-plot structure (split-plot CCDs) consist of factorial, whole-plot axial, subplot axial, and center points, each of which plays a different role in model estimation. This work studies the relative A-, G-, and V-efficiency losses due to missing pairs of observations in split-plot CCDs under different ratios (d) of the whole-plot and subplot error variances. Three candidate designs of different sizes were considered; for each criterion, relative efficiency functions were formulated and used to investigate the efficiency of each design with some observations missing, relative to the full design. Maximum A-efficiency losses of 19.1, 10.6, and 15.7% were observed at d = 0.5 for three of the missing pairs, indicating a negative effect on the precision of the estimates of these designs' model parameters. Missing observations of the remaining pairs did not exhibit any negative effect on the designs' relative A-efficiency. Maximum G- and V-efficiency losses of 10.1, 16.1, and 0.1% and 0.1, 1.1, and 0.2%, respectively, were observed at d = 0.5 when certain pairs were missing, indicating a significant increase in the designs' maximum and average prediction variances. In all cases, the efficiency losses become insignificant as d increases. The study thus identifies the positive impact of correlated observations on the efficiency of experimental designs. Keywords: Missing Observations, Efficiency Loss, Prediction Variance
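    The three criteria above can be sketched numerically. The following is a minimal illustration, not the paper's method: it uses ordinary least squares (ignoring the split-plot variance ratio d, which would require a generalized least squares information matrix) and a made-up two-factor CCD rather than the paper's candidate designs.

```python
import numpy as np

def quadratic_model_matrix(D):
    """Second-order model matrix for two factors: 1, x1, x2, x1*x2, x1^2, x2^2."""
    x1, x2 = D[:, 0], D[:, 1]
    return np.column_stack([np.ones(len(D)), x1, x2, x1 * x2, x1**2, x2**2])

def a_g_v_values(X, grid):
    """A = trace((X'X)^-1); G = max scaled prediction variance (SPV) over a grid;
    V = average SPV over the same grid, SPV(x) = n * f(x)' (X'X)^-1 f(x)."""
    n = X.shape[0]
    M_inv = np.linalg.inv(X.T @ X)
    F = quadratic_model_matrix(grid)
    spv = n * np.einsum('ij,jk,ik->i', F, M_inv, F)
    return np.trace(M_inv), spv.max(), spv.mean()

# Toy CCD: 4 factorial, 4 axial (alpha = sqrt(2)), 3 center points
a = np.sqrt(2.0)
ccd = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1],
                [-a, 0], [a, 0], [0, -a], [0, a],
                [0, 0], [0, 0], [0, 0]], float)
u = np.linspace(-1, 1, 21)
grid = np.array([[p, q] for p in u for q in u])

A_full, G_full, V_full = a_g_v_values(quadratic_model_matrix(ccd), grid)
reduced = np.delete(ccd, [0, 1], axis=0)  # a missing pair of factorial points
A_miss, G_miss, V_miss = a_g_v_values(quadratic_model_matrix(reduced), grid)

# Relative A-efficiency loss (%): smaller trace is better, so losing runs
# can only inflate the trace and hence incur a positive loss.
loss_A = 100 * (1 - A_full / A_miss)
print(loss_A)
```

    Deleting design rows makes X'X smaller in the Loewner order, so the trace of its inverse (the A-value) can only grow, which is why the loss is nonnegative by construction.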

    Exploiting correlation in the construction of D-optimal response surface designs.

    Cost considerations and difficulties in performing completely randomized experiments often dictate the necessity of running response surface experiments in a bi-randomization format. The resulting compound symmetric error structure not only affects estimation and inference procedures but also has severe consequences for the optimality of the designs used. For this reason, it should be taken into account explicitly when constructing the design. In this paper, an exchange algorithm for constructing D-optimal bi-randomization designs is developed and the resulting designs are analyzed. Finally, the concept of bi-randomization experiments is refined, yielding very efficient designs which, in many cases, outperform D-optimal completely randomized experiments.
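    The backbone of such algorithms can be illustrated with a classic point-exchange loop for the completely randomized case. This is a sketch under simplifying assumptions, not the paper's algorithm: the bi-randomization version would replace X'X below with the generalized information matrix X'V⁻¹X for the compound symmetric V.

```python
import numpy as np
from itertools import product

# Full quadratic model in two factors; candidates on a 3-level grid.
def f(x):
    x1, x2 = x
    return np.array([1.0, x1, x2, x1 * x2, x1**2, x2**2])

candidates = [f(x) for x in product([-1.0, 0.0, 1.0], repeat=2)]

def log_det(rows):
    X = np.array(rows)
    sign, ld = np.linalg.slogdet(X.T @ X)
    return ld if sign > 0 else -np.inf

# Point exchange: start from a random n-run design and repeatedly swap a
# design row for any candidate that improves det(X'X), until no swap helps.
rng = np.random.default_rng(0)
n = 8
design = [candidates[i] for i in rng.choice(len(candidates), n, replace=False)]
improved = True
while improved:
    improved = False
    for i in range(n):
        best = log_det(design)
        for c in candidates:
            trial = design[:i] + [c] + design[i + 1:]
            if log_det(trial) > best + 1e-9:
                design, best, improved = trial, log_det(trial), True
print(log_det(design))  # log det(X'X) of the exchanged design
```

    The loop terminates because each accepted swap strictly increases the log determinant and the candidate set is finite; production implementations use rank-one determinant updates instead of refactoring X'X at every trial.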

    Sequential Design for Computer Experiments with a Flexible Bayesian Additive Model

    In computer experiments, a mathematical model implemented on a computer is used to represent complex physical phenomena. These models, known as computer simulators, enable experimental study of a virtual representation of the complex phenomena. Simulators can be thought of as complex functions that take many inputs and provide an output. Often these simulators are themselves expensive to compute, and may be approximated by "surrogate models" such as statistical regression models. In this paper we consider a new kind of surrogate model, a Bayesian ensemble of trees (Chipman et al. 2010), with the specific goal of learning enough about the simulator that a particular feature of the simulator can be estimated. We focus on identifying the simulator's global minimum. Utilizing the Bayesian version of the Expected Improvement criterion (Jones et al. 1998), we show that this ensemble is particularly effective when the simulator is ill-behaved, exhibiting nonstationarity or abrupt changes in the response. A number of illustrations of the approach are given, including a tidal power application.
    Comment: 21 pages
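    The Expected Improvement criterion referenced here has a closed form when the surrogate's prediction at a point is Gaussian. The sketch below shows that standard form for minimization; the paper's Bayesian version instead averages improvement over the tree ensemble's posterior draws, which this does not attempt.

```python
from math import erf, exp, pi, sqrt

def expected_improvement(mu, sigma, f_min):
    """Gaussian EI for minimization (Jones et al. 1998):
    EI = (f_min - mu) * Phi(z) + sigma * phi(z), with z = (f_min - mu) / sigma."""
    if sigma <= 0.0:
        return max(f_min - mu, 0.0)  # no uncertainty: improvement only if mu < f_min
    z = (f_min - mu) / sigma
    Phi = 0.5 * (1.0 + erf(z / sqrt(2.0)))       # standard normal CDF
    phi = exp(-0.5 * z * z) / sqrt(2.0 * pi)     # standard normal PDF
    return (f_min - mu) * Phi + sigma * phi

# A point predicted worse than the current best (mu > f_min) still earns
# positive EI when its uncertainty is large -- this is what drives exploration.
print(expected_improvement(mu=1.0, sigma=0.0, f_min=0.5))  # -> 0.0
print(expected_improvement(mu=1.0, sigma=2.0, f_min=0.5))  # positive
```

    Because EI is increasing in sigma for fixed mu, sequential designs built on it naturally balance exploiting low predicted values against exploring poorly understood regions.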

    Engineering design applications of surrogate-assisted optimization techniques

    The construction of models aimed at learning the behaviour of a system whose responses to inputs are expensive to measure is a branch of statistical science that has been around for a very long time. Geostatistics has pioneered a drive over the last half century towards a better understanding of the accuracy of such ‘surrogate’ models of the expensive function. Of particular interest to us here are some of the even more recent advances related to exploiting such formulations in an optimization context. While the classic goal of the modelling process has been to achieve a uniform prediction accuracy across the domain, an economical optimization process may aim to bias the distribution of the learning budget towards promising basins of attraction. This can only happen, of course, at the expense of the global exploration of the space, and thus finding the best balance may be viewed as an optimization problem in itself. We examine here a selection of the state-of-the-art solutions to this type of balancing exercise through the prism of several simple, illustrative problems, followed by two ‘real world’ applications: the design of a regional airliner wing and the multi-objective search for a low environmental impact house.

    Comparison of Optimality Criteria of Reduced Models for Response Surface Designs with Restricted Randomization

    In this work, D-, G-, and A-efficiencies and the scaled average prediction variance (IV) criterion are computed and compared for second-order split-plot central composite designs. These design optimality criteria are evaluated across the set of reduced split-plot central composite design models for three design variables under various ratios of the variance components (or degrees of correlation, d). It was observed that D, A, G, and IV for these models depend strongly on the values of d; they are robust to changes in the interaction terms and vary dramatically with the number of, and changes in, the squared terms.

    Hierarchical spatial models for predicting tree species assemblages across large domains

    Spatially explicit data layers of tree species assemblages, referred to as forest types or forest type groups, are a key component in large-scale assessments of forest sustainability, biodiversity, timber biomass, carbon sinks and forest health monitoring. This paper explores the utility of coupling georeferenced national forest inventory (NFI) data with readily available and spatially complete environmental predictor variables through spatially-varying multinomial logistic regression models to predict forest type groups across large forested landscapes. These models exploit underlying spatial associations within the NFI plot array and the spatially-varying impact of predictor variables to improve the accuracy of forest type group predictions. The richness of these models incurs onerous computational burdens, and we discuss dimension-reducing spatial processes that retain that richness in modeling. We illustrate using NFI data from Michigan, USA, where we provide a comprehensive analysis of this large study area and demonstrate improved prediction with associated measures of uncertainty.
    Comment: Published at http://dx.doi.org/10.1214/09-AOAS250 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Advocating better habitat use and selection models in bird ecology

    Studies of habitat use and habitat selection represent a basic aspect of bird ecology, owing to their importance in natural history, distribution, response to environmental change, management and conservation. In essence, one seeks a statistical model that identifies environmental variables linked to a species' presence. There is a wide array of analytical methods that identify important explanatory variables within a model, with higher explanatory and predictive power than classical regression approaches. However, some of these powerful models are not widespread in ornithological studies, partly because of their complex theory and, in some cases, difficulties in their implementation and interpretation. Here, I describe generalized linear models and five other statistical models for the analysis of bird habitat use and selection that outperform classical approaches: generalized additive models, mixed-effects models, occupancy models, binomial N-mixture models and decision trees (classification and regression trees, bagging, random forests and boosting). Each of these models has its benefits and drawbacks, but major advantages include dealing with non-normal distributions (the presence-absence and abundance data typically found in habitat use and selection studies), heterogeneous variances, non-linear and complex relationships among variables, lack of statistical independence and imperfect detection. To aid ornithologists in making use of the methods described, a readable description of each method is provided, as well as a flowchart and recommendations to help them decide on the most appropriate analysis. The use of these models in ornithological studies is encouraged, given their huge potential as statistical tools in bird ecology.
    Fil: Palacio, Facundo Xavier. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de La Plata. Facultad de Ciencias Naturales y Museo. División Zoología de Vertebrados. Sección Ornitología; Argentina.
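    The workhorse model for presence-absence data mentioned above, a binomial GLM with a logit link, can be fitted by iteratively reweighted least squares in a few lines. This is a generic sketch on simulated data, not from the paper; the covariate name and coefficient values are invented for illustration.

```python
import numpy as np

# Simulated presence/absence data against one hypothetical habitat covariate.
rng = np.random.default_rng(1)
n = 200
canopy = rng.uniform(0.0, 1.0, n)            # made-up covariate (canopy cover)
X = np.column_stack([np.ones(n), canopy])
true_beta = np.array([-2.0, 4.0])            # invented "true" coefficients
presence = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta)))

# Binomial GLM (logit link) fitted by iteratively reweighted least squares.
beta = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ beta))      # fitted occurrence probabilities
    w = p * (1.0 - p)                        # IRLS weights
    z = X @ beta + (presence - p) / w        # working response
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
print(beta)  # estimates should lie near the invented true coefficients
```

    The same fit is one call to a GLM routine in R (`glm(presence ~ canopy, family = binomial)`) or statsmodels; the point of spelling out IRLS is to show why the binomial family handles 0/1 responses and heterogeneous variances that ordinary least squares cannot.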

    Response Surface Split-Plot Designs: A Literature Review

    The fundamental principles of experiment design are factorization, replication, randomization, and local control of error. In many industrial experiments, however, departure from these principles is commonplace. Complete randomization is often not feasible because factor level settings are hard, impractical, or inconvenient to change, or because the resources available to execute under homogeneous conditions are limited. These restrictions in randomization result in split-plot experiments. We are also often interested in fitting second-order models, which leads to second-order split-plot experiments. Although response surface methodology has experienced phenomenal growth since its inception, second-order split-plot design has received only modest attention relative to other topics over the same period. Many graduate textbooks either ignore the subject or provide only a relatively basic treatment of it. The peer-reviewed literature on second-order split-plot designs, especially with blocking, is scarce, limited in examples, and often provides guidelines that are either too narrow or too general. This deficit of information leaves practitioners ill-prepared to face the many challenges associated with these types of designs. This article seeks to provide an overview of recent literature on response surface split-plot designs to help practitioners in dealing with these types of designs.