55 research outputs found

    Bowker's Test for Symmetry and Modifications within the Algebraic Framework

    Categorical data occur in a wide range of statistical applications. If the data are observed in matched pairs, it is often of interest to examine the differences between the responses. We concentrate on tests of axial symmetry in two-way tables. A commonly used procedure is the Bowker test, which is a generalization of the McNemar test. The test decision is based on a χ²-approximation, which might not be adequate, for example if the table is sparse. Therefore modifications of the test statistic have been proposed. We suggest a test of symmetry based on Bowker's test and Markov Chain Monte Carlo methods following the algorithm of Diaconis and Sturmfels (1998). We carry out a simulation study to determine and compare the performance of the simulation test, the Bowker test and two modifications.
    Keywords: Computational commutative algebra, Diaconis-Sturmfels algorithm, matched-pairs data, MCMC, Metropolis-Hastings algorithm, test for symmetry
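
As a rough illustration of the classical Bowker statistic mentioned in this abstract, the sketch below sums (n_ij - n_ji)^2 / (n_ij + n_ji) over the off-diagonal pairs of a square table. The function name `bowker_statistic` is hypothetical, and skipping pairs whose combined count is zero (and not counting them toward the degrees of freedom) is an assumption of this sketch, not something the abstract specifies. It does not implement the MCMC test or the modifications studied in the paper.

```python
def bowker_statistic(table):
    """Return (statistic, degrees_of_freedom) for a square contingency table.

    `table` is a list of k rows, each a list of k non-negative counts.
    """
    k = len(table)
    if any(len(row) != k for row in table):
        raise ValueError("table must be square")

    stat = 0.0
    df = 0
    for i in range(k):
        for j in range(i + 1, k):
            total = table[i][j] + table[j][i]
            if total == 0:
                # Assumption: empty symmetric pairs contribute nothing
                # and are not counted in the degrees of freedom.
                continue
            stat += (table[i][j] - table[j][i]) ** 2 / total
            df += 1
    return stat, df


# For a 2x2 table the statistic reduces to the McNemar statistic.
print(bowker_statistic([[10, 5], [15, 20]]))  # (5.0, 1)
```

The statistic would then be compared with a χ² distribution with the returned degrees of freedom, which is the approximation the abstract describes as potentially inadequate for sparse tables.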

    Outlier identification rules for generalized linear models

    Observations which seem to deviate strongly from the main part of the data may occur in every statistical analysis. These observations, usually labelled as outliers, may cause completely misleading results when using standard methods and may also contain information about special events or dependencies. Therefore it is of interest to identify them. We discuss outliers in situations where a generalized linear model is assumed as the null model for the regular data and introduce rules for their identification. For the special cases of a loglinear Poisson model and a logistic regression model, some one-step identifiers based on robust and non-robust estimators are proposed and compared.

    MCD-RoSIS - A robust procedure for variable selection

    Consider the task of estimating a regression function for describing the relationship between a response and a vector of p predictors. Often only a small subset of all given candidate predictors actually affects the response, while the rest might inhibit the analysis. Procedures for variable selection aim to identify the true predictors. A method for variable selection when the dimension p of the regressor space is much larger than the sample size n is Sure Independence Screening (SIS), recently proposed by Fan and Lv (2008). The number of predictors is to be reduced to a value less than the number of observations before conducting the regression analysis. As SIS is based on nonrobust estimators, outliers in the data might lead to the elimination of true predictors. Hence, Gather and Guddat (2008) propose a robustified version of SIS called RoSIS which is based on robust estimators. Here, we give a modification of RoSIS by using the MCD estimator in the new algorithm. The new procedure MCD-RoSIS leads to better results, especially under collinearity. In a simulation study we compare the performance of SIS, RoSIS and MCD-RoSIS with respect to their robustness against different types of data contamination as well as different degrees of collinearity.
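
The screening step of plain (non-robust) SIS can be sketched as ranking predictors by the absolute value of their marginal correlation with the response and keeping the top d, with d smaller than n. The names `abs_correlation` and `sis_screen` are hypothetical, and the sketch uses the ordinary Pearson correlation; RoSIS and MCD-RoSIS replace this with robust estimators, which are not shown here.

```python
import math


def abs_correlation(x, y):
    """Absolute Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A constant column carries no marginal information.
        return 0.0
    return abs(sxy) / math.sqrt(sxx * syy)


def sis_screen(columns, y, d):
    """Return the sorted indices of the d predictors with the largest
    absolute marginal correlation with the response y.

    `columns` is a list of predictor columns, each of length len(y).
    """
    scores = [abs_correlation(col, y) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:d])


# Toy data: predictors 0 and 2 are strongly related to y, predictor 1 is not.
y = [1, 2, 3, 4, 5]
columns = [
    [2, 4, 6, 8, 10],
    [1, -1, 1, -1, 1],
    [5, 4, 3, 2, 1.5],
]
print(sis_screen(columns, y, d=2))
```

Because the Pearson correlation is sensitive to outliers, a few contaminated observations can push a true predictor out of the top d, which is the weakness the abstract gives as the motivation for the robust variants.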

    Total Interaction Index: A Variance-based Sensitivity Index for Function Decomposition

    http://mucm.ac.uk/UCM2012/Forms/Downloads/Posters/Fruth.pdf

    Crossed-Derivative Based Sensitivity Measures for Interaction Screening

    Global sensitivity analysis is used to quantify the influence of input variables on a numerical model output. Sobol' indices are now classical sensitivity measures. However, their estimation requires a large number of model evaluations, especially when interaction effects are of interest. Derivative-based global sensitivity measures (DGSM) have recently shown their efficiency for the identification of non-influential inputs. In this paper, we define crossed DGSM, based on second-order derivatives of model output. By using an L2-Poincaré inequality, we provide a crossed-DGSM based maximal bound for the superset importance (i.e. total Sobol' indices of an interaction between two inputs). In order to apply this result, we discuss how to estimate the Poincaré constant for various probability distributions. Several analytical and numerical tests show the performance of the bound and allow us to develop a generic strategy for interaction screening.

    Data-driven Kriging models based on FANOVA-decomposition

    Kriging models have been widely used in computer experiments for the analysis of time-consuming computer codes. Based on kernels, they are flexible and can be tuned to many situations. In this paper, we construct kernels that reproduce the computer code complexity by mimicking its interaction structure. While the standard tensor-product kernel implicitly assumes that all interactions are active, the new kernels are suited for a general interaction structure, and will take advantage of the absence of interaction between some inputs. The methodology is twofold. First, the interaction structure is estimated from the data, using an initial standard Kriging model, and represented by a so-called FANOVA graph. New FANOVA-based sensitivity indices are introduced to detect active interactions. Then this graph is used to derive the form of the kernel, and the corresponding Kriging model is estimated by maximum likelihood. The performance of the overall procedure is illustrated by several 3-dimensional and 6-dimensional simulated and real examples. A substantial improvement is observed when the computer code has a relatively high level of complexity.