58 research outputs found

    Computer construction of experimental plans

    Experimental plans identify the treatment allocated to each unit and they are necessary for the supervision of most comparative experiments. Few computer programs have been written for constructing experimental plans but many for analysing data arising from designed experiments. In this thesis the construction of experimental plans is reviewed so as to determine requirements for a computer program. One program, DSIGNX, is described. Four main steps in the construction are identified: declaration, formation of the unrandomized plan (the design), randomization and output. The formation of the design is given most attention. The designs considered are those found to be important in agricultural experimentation and a basic objective is set that the 'proposed' program should construct most designs presented in standard texts (e.g. Cochran and Cox (1957)) together with important designs which have been developed recently. Topics discussed include block designs, factorial designs, orthogonal Latin squares and designs for experiments with non-independent observations. Some topics are discussed in extra detail; these include forming standard designs and selecting defining contrasts in symmetric factorial experiments, general procedures for orthogonal Latin squares and constructing serially balanced designs. Emphasis is placed on design generators, especially the design key and generalized cyclic generators, because of their versatility. These generators are shown to provide solutions to most balanced and partially balanced incomplete block designs and to provide efficient block designs and row and column designs. They are seen to be of fundamental importance in constructing factorial designs. Other versatile generators are described but no attempt is made to include all construction techniques. Methods for deriving one design from another or for combining two or more designs are shown to extend the usefulness of the generators. 
Optimal design procedures and the evaluation of designs are briefly discussed. Methods of randomization are described including automatic procedures based on defined block structures and some forms of restricted randomization for the levels of specified factors. Many procedures presented in the thesis have been included in a computer program DSIGNX. The facilities provided by the program and the language are described and illustrated by practical examples. Finally, the structure of the program and its method of working are described and simplified versions of the principal algorithms presented.
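
The cyclic-generator idea mentioned in the abstract above can be illustrated with a minimal Python sketch. This is an assumption-laden illustration, not DSIGNX's algorithm or language: the function names (`cyclic_blocks`, `pair_concurrences`, `randomize_plan`) and the initial block {0, 1, 3} modulo 7 are hypothetical choices made here. The sketch develops one initial block cyclically to form the unrandomized design, counts how often each pair of treatments shares a block, and then randomizes block order and the order of units within blocks.

```python
import random
from itertools import combinations


def cyclic_blocks(initial_block, v):
    """Develop an initial block cyclically modulo v (one block per shift)."""
    return [sorted((t + shift) % v for t in initial_block) for shift in range(v)]


def pair_concurrences(blocks):
    """Count how many blocks contain each unordered pair of treatments."""
    counts = {}
    for block in blocks:
        for pair in combinations(sorted(block), 2):
            counts[pair] = counts.get(pair, 0) + 1
    return counts


def randomize_plan(blocks, seed=None):
    """Randomize the order of treatments within blocks and the order of blocks."""
    rng = random.Random(seed)
    plan = [rng.sample(block, len(block)) for block in blocks]
    rng.shuffle(plan)
    return plan


# Illustrative choice: {0, 1, 3} is a difference set modulo 7, so the
# developed blocks form a balanced incomplete block design.
design = cyclic_blocks([0, 1, 3], 7)
concurrences = pair_concurrences(design)
plan = randomize_plan(design, seed=1)

for unit_block in plan:
    print(unit_block)
```

Because every one of the 21 treatment pairs occurs together in exactly one block, the design is balanced; other initial blocks would generally give partially balanced or unbalanced designs.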

    Optimal Row-Column Designs for Correlated Errors and Nested Row-Column Designs for Uncorrelated Errors

    In this dissertation the design problems are considered in the row-column setting for second order autonormal errors when the treatment effects are estimated by generalized least squares, and in the nested row-column setting for uncorrelated errors when the treatment effects are estimated by ordinary least squares. In the former case, universal optimality conditions are derived separately for designs in the plane and on the torus using more general linear models than those considered elsewhere in the literature. Examples of universally optimum planar designs are given, and a method is developed for the construction of optimum and near optimum designs, that produces several infinite series of universally optimum designs on the torus and near optimum designs in the plane. Efficiencies are calculated for planar versions of the torus designs, which are found to be highly efficient with respect to some commonly used optimality criterion. In the nested row-column setting, several methods of construction of balanced and partially balanced incomplete block designs with nested rows and columns are developed, from which many infinite series of designs are obtained. In particular, 149 balanced incomplete block designs with nested rows and columns are listed (80 appear to be new) for the number of treatments, v < 101, a prime power.

    Relations among partitions

    Combinatorialists often consider a balanced incomplete-block design to consist of a set of points, a set of blocks, and an incidence relation between them which satisfies certain conditions. To a statistician, such a design is a set of experimental units with two partitions, one into blocks and the other into treatments: it is the relation between these two partitions which gives the design its properties. The most common binary relations between partitions that occur in statistics are refinement, orthogonality and balance. When there are more than two partitions, the binary relations may not suffice to give all the properties of the system. I shall survey work in this area, including designs such as double Youden rectangles.

    Quantile estimation using auxiliary information with applications to soil texture data

    In the Major Land Resource Area (MLRA) 107 pilot project, a multi-phase probability sampling design for updating soil surveys was implemented in western Iowa. In general, multi-phase designs are used when a variable of interest is expensive to measure, but is strongly related to another (auxiliary) variable which is inexpensive to observe. In a multi-phase design, the auxiliary variable is observed for a sample and the study variable is observed for a relatively small sub-sample. In the estimation stage, the auxiliary information is used to improve estimators of distributional quantities relating to the study variable. In particular, we consider estimation of quantiles in this context. Chambers and Dunstan (1986) (CD) presented an estimator for a finite population distribution function which incorporates auxiliary information. A linear relationship between the study variable and the auxiliary information is assumed. The residuals in the linear model are assumed to be homoskedastic. We derive a Bahadur-like representation for the quantile estimator corresponding to the CD distribution function estimator. This expression is used to derive an expression for the asymptotic variance of the quantile estimator. We consider estimation of quantiles for soil texture profiles using data from the MLRA 107 pilot project. The laboratory determination of soil texture is the variable of interest. Auxiliary information is available in the form of field determinations of soil texture. Due to the multi-phase sampling design used for data collection, field determinations are available at more sites than laboratory determinations. The CD quantile estimator is modified to incorporate sampling weights and to allow heteroskedasticity in the assumed linear model. A Bayesian approach to this estimation problem is also considered. A hierarchical model is used to describe the relationships between observed data and unknown parameters. 
Soil horizon profiles are modeled as realizations of Markov chains. Transformed textures are modeled with Gaussian mixtures. The posterior distribution of soil texture profiles is numerically approximated using a Gibbs sampler. The hierarchical model provides a comprehensive framework which may be useful for analyzing other variables collected in the pilot project. The two approaches are compared using simulated and real data.
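
As a rough illustration of the model-based distribution function estimator described in the abstract above, here is a minimal Python sketch of the basic homoskedastic case only. It assumes a simple linear working model with intercept, y = a + b x + e, fitted by ordinary least squares on the sample; the sampling weights, heteroskedasticity and Bayesian extensions mentioned in the abstract are not covered. The function names (`fit_line`, `cd_cdf`, `cd_quantile`) and the toy data are hypothetical. The estimator averages observed indicators for sampled units with, for each non-sampled unit, the proportion of sample residuals that keep the predicted value at or below t; the quantile is taken as the smallest candidate value at which the estimated distribution function reaches p.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def cd_cdf(t, sample_x, sample_y, nonsample_x):
    """Estimate the finite population distribution function at t."""
    a, b = fit_line(sample_x, sample_y)
    residuals = [y - (a + b * x) for x, y in zip(sample_x, sample_y)]
    n = len(sample_y)
    population_size = n + len(nonsample_x)
    # Sampled units contribute their observed indicator.
    observed = sum(1 for y in sample_y if y <= t)
    # Each non-sampled unit contributes the fraction of sample residuals
    # for which prediction plus residual is at or below t.
    predicted = sum(
        sum(1 for e in residuals if a + b * x + e <= t) / n
        for x in nonsample_x
    )
    return (observed + predicted) / population_size


def cd_quantile(p, sample_x, sample_y, nonsample_x):
    """Smallest candidate value t whose estimated distribution function is >= p."""
    a, b = fit_line(sample_x, sample_y)
    residuals = [y - (a + b * x) for x, y in zip(sample_x, sample_y)]
    candidates = sorted(
        set(sample_y) | {a + b * x + e for x in nonsample_x for e in residuals}
    )
    for t in candidates:
        if cd_cdf(t, sample_x, sample_y, nonsample_x) >= p:
            return t
    return candidates[-1]


# Toy population (hypothetical): x = 1..10 with y = 2x; y is observed
# only for the four sampled units.
sample_x = [1, 4, 7, 10]
sample_y = [2, 8, 14, 20]
nonsample_x = [2, 3, 5, 6, 8, 9]

print(cd_cdf(10.5, sample_x, sample_y, nonsample_x))
print(cd_quantile(0.5, sample_x, sample_y, nonsample_x))
```

In this toy case the linear fit is exact, so the estimate recovers the true population distribution function; with real data the gain over a purely sample-based estimator depends on how well the working model holds.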

    Supervised learning methods for association detection, biomarker discovery, and pattern recognition in compositional omics data

    Rapid advances and reduced cost in high throughput sequencing (HTS) technologies have enabled widespread profiling of microbial metagenomes and microbiomes in humans to better understand associations between microbial communities and disease. Data generated using these technologies are vast, high-dimensional, and nuanced, including limitations in instrument sequencing capacities and measurements that are inherently relative rather than absolute. Unlike absolute measurements, these relative counts (referred to as compositional data) require special methods for analysis and interpretation. Unfortunately, compositional data methodology is esoteric and generally not well adapted to high throughput sequencing data. Because of this, HTS data are often analyzed with traditional statistical methods that do not properly account for the underlying compositional sample space. This practice may result in spurious associations being reported, which may limit study-to-study generalizations and reproducibility. In this thesis, building on existing literature in compositional data analysis and feature selection methodology, we develop a novel statistical association test and a powerful machine learning framework using robust pairwise logratios. Additionally, for each method, we developed freely available (GitHub) R packages (SelEnergyPermR and DiCoVarML) with functions to perform the core analysis of each method. In the first chapter we provide a basic overview of compositional data and its connection to HTS data. In the second chapter, we present the SelEnergyPerm method for detecting sparse associations in high dimensional metagenomic data. In the third chapter, building on the concept of differential compositional variation proposed in SelEnergyPerm, we present the DiCoVarML framework for supervised classification and biomarker discovery. 
In the final chapter, we apply the SelEnergyPerm method to test for an association between toxicant exposures and the composition of microbial communities in the nasal passage. Using a parsimonious logratio signature detected by SelEnergyPerm, we then perform integrative analysis, where we explore the connection between nasal microbiome dysbiosis and immune mediator expression in nasal lavage fluid.
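
To make the notion of pairwise logratios in the abstract above concrete, here is a minimal Python sketch. It is not the API of the SelEnergyPermR or DiCoVarML R packages; the function name `pairwise_logratios` and the toy counts are hypothetical, and zero counts (which real sequencing data contain and which need separate handling) are simply rejected. The sketch shows why logratios suit relative data: rescaling the counts to proportions leaves every pairwise logratio unchanged.

```python
import math
from itertools import combinations


def pairwise_logratios(parts):
    """Return log(parts[i] / parts[j]) for every pair of indices i < j."""
    if any(p <= 0 for p in parts):
        raise ValueError("all parts must be strictly positive")
    return {
        (i, j): math.log(parts[i] / parts[j])
        for i, j in combinations(range(len(parts)), 2)
    }


# Toy example (hypothetical): raw counts for three taxa and the same
# sample expressed as proportions.
counts = [10, 20, 70]
proportions = [c / sum(counts) for c in counts]

lr_counts = pairwise_logratios(counts)
lr_props = pairwise_logratios(proportions)

for pair in lr_counts:
    print(pair, round(lr_counts[pair], 4), round(lr_props[pair], 4))
```

A composition with D parts yields D(D-1)/2 pairwise logratios, which is why selecting a sparse subset of them matters in high-dimensional data.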

    On the addition of further treatments to Latin Square designs

    Statisticians have made use of Latin Squares for randomized trials in the design of comparative experiments since the 1920s. Through cross-disciplinary use of Group theory, Statistics and Computing Science the author looks at the applications of the Latin Square as a row-column design for scientific comparative experiments. The writer presents his argument, based on likelihood theory, for an F-test on Latin Square designs. A distinction between the combinatorial object and the row-column design known as the Latin Square is explicitly presented for the first time. Using statistical properties together with the tools of group actions on sets of block designs, the author brings new evidence to bear on well known issues such as (i) non-existence of two mutually orthogonal Latin Squares of size six and (ii) enumeration and classification of combinatorial layouts obtainable from superimposing two and three symbols on Latin Squares of size six. The possibility of devising non-parametric computer-intensive permutation tests in statistical experiments designed under 2 or 3 blocking constraints appears to have been explored for the first time by the author during the candidate's research period (see Appendix V: Part 2). The discovery that a projective plane does not determine all FIZ-inequivalent complete sets of Mutually Orthogonal Latin Squares is proved by fully enumerating the possibilities for those of size p < 7. The discovery of thousands of representatives of a class of balanced superimpositions of four treatments on Latin Squares of size six through a systematic computer search is reported. These results were presented at the 16th British Combinatorial Conference 1997. Indications of openings for further research are given at the end of the manuscript.
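
For readers unfamiliar with mutually orthogonal Latin squares, the following minimal Python sketch shows the standard construction for prime order p, where square k has entry (k*i + j) mod p in row i and column j. This is a textbook construction used here for illustration, not the enumeration method of the thesis; the function names (`latin_square`, `is_latin`, `are_orthogonal`) are hypothetical. Two squares are orthogonal when superimposing them yields every ordered pair of symbols exactly once.

```python
def latin_square(n, k):
    """Square with entry (k * i + j) mod n in row i, column j."""
    return [[(k * i + j) % n for j in range(n)] for i in range(n)]


def is_latin(square):
    """Check that every row and every column contains each symbol once."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok


def are_orthogonal(a, b):
    """Check that superimposing a and b gives all n*n ordered pairs."""
    n = len(a)
    pairs = {(a[i][j], b[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n


# For prime p the multipliers k = 1, ..., p - 1 give p - 1 squares that
# are pairwise orthogonal (a complete set).
p = 5
squares = [latin_square(p, k) for k in range(1, p)]
all_latin = all(is_latin(s) for s in squares)
all_orthogonal = all(
    are_orthogonal(squares[a], squares[b])
    for a in range(len(squares))
    for b in range(a + 1, len(squares))
)
print(all_latin, all_orthogonal)
```

The same formula does not carry over to size six: with n = 6 the multipliers 1 and 5 both give Latin squares, but the pair is not orthogonal. That only shows this particular construction failing; the non-existence result for size six cited in the abstract requires a full argument or enumeration.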

    A holistic evaluation concept for long-term structural health monitoring

    [no abstract]

    Subject Index Volumes 1–200
