
    Bowker's Test for Symmetry and Modifications within the Algebraic Framework

    Categorical data occur in a wide range of statistical applications. If the data are observed in matched pairs, it is often of interest to examine the differences between the responses. We concentrate on tests of axial symmetry in two-way tables. A commonly used procedure is the Bowker test, which is a generalization of the McNemar test. The test decision is based on a χ²-approximation, which might not be adequate, for example if the table is sparse. Therefore, modifications of the test statistic have been proposed. We suggest a test of symmetry based on Bowker's test and Markov chain Monte Carlo methods following the algorithm of Diaconis and Sturmfels (1998). We carry out a simulation study to determine and compare the performance of the simulation test, the Bowker test and two modifications.
    Keywords: computational commutative algebra, Diaconis-Sturmfels algorithm, matched-pairs data, MCMC, Metropolis-Hastings algorithm, test for symmetry
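The classical statistic the abstract builds on is standard: for a k×k table N = (n_ij), Bowker's statistic is B = Σ_{i<j} (n_ij − n_ji)² / (n_ij + n_ji), asymptotically χ² with k(k−1)/2 degrees of freedom under axial symmetry. A minimal sketch of this baseline test (not the authors' MCMC modification; function names are illustrative):

```python
import numpy as np
from scipy.stats import chi2

def bowker_test(table):
    """Bowker's test of axial symmetry for a k x k contingency table.

    B = sum_{i<j} (n_ij - n_ji)^2 / (n_ij + n_ji), asymptotically
    chi-square with k(k-1)/2 df; pairs with n_ij + n_ji == 0 are
    skipped, which reduces the degrees of freedom accordingly.
    """
    t = np.asarray(table, dtype=float)
    k = t.shape[0]
    stat, df = 0.0, 0
    for i in range(k):
        for j in range(i + 1, k):
            s = t[i, j] + t[j, i]
            if s > 0:  # off-diagonal pair contributes only if observed
                stat += (t[i, j] - t[j, i]) ** 2 / s
                df += 1
    return stat, df, chi2.sf(stat, df)
```

For k = 2 this reduces to the (uncorrected) McNemar test; the sparse-table case where the χ²-approximation degrades is exactly where the abstract's MCMC alternative is aimed.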


    Strategies for Multi-Response Parameter Design using Loss Functions and Joint Optimization Plots

    The development of high-quality products or production processes can often be greatly improved by statistically planned and analysed experiments. Taguchi methods proved to be a milestone in this field, suggesting optimal design settings for a single measured response. However, these often fail to meet the needs of today's products and manufacturing processes, which require simultaneous optimization over several quality characteristics. Current extensions for handling multiple responses assume that all responses are weighted beforehand in terms of costs due to deviations from desired target settings. Such information is usually unavailable, especially with manufacturing processes. As an alternative solution, we propose strategies that use sequences of possible weights assigned to each of the multiple responses. For each weighting a design factor combination is derived, which minimizes a respective estimated multivariate loss function and is optimal with respect to some compromise of the responses. This compromise can be graphically displayed to the engineer, who can thereby gain much more insight into the production process and draw more valuable conclusions.
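The core idea of sweeping a sequence of weights can be sketched as follows: for each candidate factor setting with predicted responses, compute a weighted quadratic loss around the targets and pick the minimizer; repeating this over a weight sequence traces out the compromises a joint optimization plot would display. This is a hedged illustration, not the authors' estimated loss function; all names and data are hypothetical:

```python
import numpy as np

def weighted_loss(yhat, target, w):
    """Weighted quadratic loss: sum_r w_r * (yhat_r - target_r)^2."""
    d = np.asarray(yhat, float) - np.asarray(target, float)
    return float(np.sum(np.asarray(w, float) * d ** 2))

def optimal_setting(settings, predictions, target, w):
    """Return the candidate setting minimizing the weighted loss.

    settings    : labels of candidate design factor combinations
    predictions : predicted multi-response vector per candidate
    """
    losses = [weighted_loss(p, target, w) for p in predictions]
    return settings[int(np.argmin(losses))]
```

Running `optimal_setting` for each weight vector in a sequence shows how the recommended factor combination shifts as one response is prioritized over another.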

    Outlier identification rules for generalized linear models

    Observations which seem to deviate strongly from the main part of the data may occur in every statistical analysis. These observations, usually labelled as outliers, may cause completely misleading results when standard methods are used, and may also contain information about special events or dependencies. Therefore it is of interest to identify them. We discuss outliers in situations where a generalized linear model is assumed as the null model for the regular data, and we introduce rules for their identification. For the special cases of a loglinear Poisson model and a logistic regression model, some one-step identifiers based on robust and non-robust estimators are proposed and compared.
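To make the idea of a one-step identifier concrete, a simple (non-robust) variant for a loglinear Poisson null model flags observations whose Pearson residuals, computed from the fitted means, exceed a cutoff. This is a hedged sketch of the general principle, not the authors' specific rules; the function names and the cutoff are illustrative:

```python
import numpy as np

def pearson_residuals_poisson(y, mu):
    """Pearson residuals under a Poisson model: (y_i - mu_i) / sqrt(mu_i),
    since Var(Y_i) = mu_i for Poisson counts."""
    y, mu = np.asarray(y, float), np.asarray(mu, float)
    return (y - mu) / np.sqrt(mu)

def flag_outliers(y, mu, cutoff=3.0):
    """Indices of observations whose absolute Pearson residual
    exceeds the cutoff (a simple one-step identification rule)."""
    return np.flatnonzero(np.abs(pearson_residuals_poisson(y, mu)) > cutoff)
```

A robust variant would replace the fitted means `mu` by estimates computed from a robust fit, so that the outliers themselves do not mask their own detection.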

    Numerical algebraic fan of a design for statistical model building

    In this article we develop methods for the analysis of non-standard experimental designs by using techniques from algebraic statistics. Our work is motivated by a thermal spraying process used to produce a particle coating on a surface, e.g. for wear protection or durable medical instruments. In this application non-standard designs occur as intermediate results from initial standard designs in a two-stage production process. We investigate algebraic methods to derive better identifiable models, with particular emphasis on the second stage of two-stage processes. Ideas from algebraic statistics are explored, where the design, as a finite set of distinct experimental settings, is expressed as the solution of a system of polynomials. Thereby the design is identified with a polynomial ideal, and features and properties of the ideal are explored and provide insight into the structure of models identifiable by the design [Pistone et al., 2001, Riccomagno, 2009]. Holliday et al. [1999] apply these ideas to a problem from the automotive industry with an incomplete standard factorial design, and Bates et al. [2003] to the question of finding good polynomial metamodels for computer experiments. In our thermal spraying application, designs for the controllable process parameters are run and properties of particles in flight are measured as intermediate responses. The final output describes the coating properties, which are very time-consuming and expensive to measure as the specimen has to be destroyed. It is desirable to predict coating properties either on the basis of process parameters and/or from particle properties. Rudak et al. [2012] provide a first comparison of different modeling approaches. There are still open questions: which models are identifiable with the different choices of input (process parameters, particle properties, or both)? Is it better to base the second model between particle and coating properties on estimated expected values or on the observations themselves?
The present article is a contribution in this direction. In the second stage especially, the particle properties used as input variables are observed values arising from the originally chosen design for the controllable factors. The resulting design on the particle-property level can be tackled with algebraic statistics to determine identifiable models. However, it turns out that the resulting models contain elements which are identifiable only due to small deviations of the design from more regular points, leading to unwanted, unstable model results. We tackle this problem with tools from algebraic statistics. Because the data in the second stage are very noisy, we extend existing theory by switching from symbolic, exact computations to numerical computations in the calculation of the design ideal and of its fan. Specifically, instead of polynomials whose solutions are the design points, we identify a design with a set of polynomials which "almost vanish" at the design points, using results and algorithms from Fassino [2010]. The paper is organized as follows. In Section 2 three different approaches towards the modeling of a final output in a two-stage process are introduced and compared. The algebraic treatment and reasoning is the same whatever the approach. Section 3 contains the theoretical background of algebraic statistics for experimental design, always exemplified for the special application. Section 4 is the case study itself.
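The basic identifiability fact underlying the algebraic approach can be illustrated numerically: a polynomial model with monomial terms x^a is identifiable by a design D exactly when the evaluation matrix [d^a]_{d in D} has full column rank, and near-rank-deficiency signals terms identified only through tiny deviations of the design from more regular points, which is the instability the numerical (approximate-vanishing) treatment addresses. A hedged sketch of this rank check (illustrative code, not the design-ideal or fan computation of the paper):

```python
import numpy as np

def evaluation_matrix(design, exponents):
    """Matrix with one column per monomial x^a, evaluated at the
    design points: entry (i, j) = prod_k design[i, k] ** a_j[k]."""
    D = np.asarray(design, float)
    return np.column_stack(
        [np.prod(D ** np.array(a), axis=1) for a in exponents]
    )

def identifiable(design, exponents, tol=1e-8):
    """True iff the monomial model given by `exponents` is
    identifiable by `design` (evaluation matrix has full column rank).
    A tolerance close to the noise level exposes terms identified
    only through small, unstable design perturbations."""
    M = evaluation_matrix(design, exponents)
    return int(np.linalg.matrix_rank(M, tol=tol)) == len(exponents)
```

For a 2² factorial design, the model {1, x1, x2, x1*x2} is identifiable, while adding x1² is not, since x1² coincides with the intercept column on the points ±1.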