1,342 research outputs found

    MATS: Inference for potentially Singular and Heteroscedastic MANOVA

    In many experiments in the life sciences, several endpoints are recorded per subject. The analysis of such multivariate data is usually based on MANOVA models assuming multivariate normality and covariance homogeneity. These assumptions, however, are often not met in practice. Furthermore, test statistics should be invariant under scale transformations of the data, since the endpoints may be measured on different scales. In the context of high-dimensional data, Srivastava and Kubokawa (2013) proposed such a test statistic for a specific one-way model, which, however, relies on the assumption of a common non-singular covariance matrix. We modify and extend this test statistic to factorial MANOVA designs, incorporating general heteroscedastic models. In particular, our only distributional assumption is the existence of the group-wise covariance matrices, which may even be singular. We base inference on quantiles of resampling distributions, and derive confidence regions and ellipsoids based on these quantiles. In a simulation study, we extensively analyze the behavior of these procedures. Finally, the methods are applied to a data set containing information on the 2016 presidential elections in the USA with unequal and singular empirical covariance matrices.

    Impact of unbalancedness and heteroscedasticity on classic parametric significance tests of two-way fixed-effects ANOVA tests

    Classic parametric statistical tests, like the analysis of variance (ANOVA), are powerful tools used for comparing population means. These tests produce accurate results provided the data satisfy underlying assumptions such as homoscedasticity and balancedness; otherwise, biased results are obtained. However, these assumptions are rarely satisfied in real life, so alternative procedures must be explored. This thesis investigates the impact of heteroscedasticity and unbalancedness on effect sizes in two-way fixed-effects ANOVA models. A real-life dataset, from which three different samples were simulated, was used to investigate the changes in effect sizes under the influence of unequal variances and unbalancedness. The parametric bootstrap approach was proposed in case of unequal variances and non-normality. The results indicated that heteroscedasticity significantly inflates effect sizes, while unbalancedness has a non-significant impact on effect sizes in two-way ANOVA models. However, the impact worsens when the data are both unbalanced and heteroscedastic. Statistics. M. Sc. (Statistics)

    A robustness study of parametric and non-parametric tests in model-based multifactor dimensionality reduction for epistasis detection

    Background: Applying a statistical method implies identifying underlying (model) assumptions and checking their validity in the particular context. One of these contexts is association modeling for epistasis detection. Here, depending on the technique used, violation of model assumptions may result in increased type I error, power loss, or biased parameter estimates. Remedial measures for violated underlying conditions or assumptions include data transformation or selecting a more relaxed modeling or testing strategy. Model-Based Multifactor Dimensionality Reduction (MB-MDR) for epistasis detection relies on association testing between a trait and a factor consisting of multilocus genotype information. For quantitative traits, the framework is essentially Analysis of Variance (ANOVA), which decomposes the variability in the trait amongst the different factors. In this study, we assess, through simulations, the cumulative effect of deviations from normality and homoscedasticity on the overall performance of quantitative MB-MDR to detect 2-locus epistasis signals in the absence of main effects. Methodology: Our simulation study focuses on pure epistasis models with varying degrees of genetic influence on a quantitative trait. Conditional on a multilocus genotype, we consider quantitative trait distributions that are normal, chi-square or Student's t with constant or non-constant phenotypic variances. All data are analyzed with MB-MDR using the built-in Student's t-test for association, as well as a novel MB-MDR implementation based on Welch's t-test. Traits are either left untransformed or are transformed into new traits via logarithmic, standardization or rank-based transformations, prior to MB-MDR modeling. Results: Our simulation results show that MB-MDR controls type I error and false positive rates irrespective of the association test considered. Empirically-based power estimates for MB-MDR with Welch's t-tests are generally lower than those for MB-MDR with Student's t-tests. Trait transformations involving ranks tend to lead to increased power compared to the other considered data transformations. Conclusions: When performing MB-MDR screening for gene-gene interactions with quantitative traits, we recommend first rank-transforming traits to normality and then applying MB-MDR modeling with Student's t-tests as internal tests for association.
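    The two association tests and the rank-based transformation named in this abstract can be illustrated with a minimal sketch. It is not the MB-MDR implementation: the function names are hypothetical, and the `(rank - 0.5) / n` offset in the inverse normal transform is one common convention assumed here, not something the abstract specifies.

```python
import math
from statistics import NormalDist, mean, variance


def student_t(x, y):
    """Pooled-variance (Student) t statistic and its degrees of freedom."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    t = (mean(x) - mean(y)) / math.sqrt(pooled * (1 / nx + 1 / ny))
    return t, nx + ny - 2


def welch_t(x, y):
    """Welch t statistic with Welch-Satterthwaite degrees of freedom."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x) / nx, variance(y) / ny
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df


def rank_to_normal(values):
    """Rank-based inverse normal transform; ties receive average ranks.

    Assumption: ranks are mapped to probabilities via (rank - 0.5) / n.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the tie run
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - 0.5) / n) for r in ranks]
```

    With equal group sizes and equal sample variances the two statistics coincide; they diverge as the group variances or sizes become unequal, which is the setting the simulation study examines.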

    Some solutions to the multivariate Behrens-Fisher problem for dissimilarity-based analyses

    The essence of the generalised multivariate Behrens–Fisher problem (BFP) is how to test the null hypothesis of equality of mean vectors for two or more populations when their dispersion matrices differ. Solutions to the BFP usually assume variables are multivariate normal and do not handle high‐dimensional data. In ecology, species' count data are often high‐dimensional, non‐normal and heterogeneous. Also, interest lies in analysing compositional dissimilarities among whole communities in non‐Euclidean (semi‐metric or non‐metric) multivariate space. Hence, dissimilarity‐based tests by permutation (e.g., PERMANOVA, ANOSIM) are used to detect differences among groups of multivariate samples. Such tests are not robust, however, to heterogeneity of dispersions in the space of the chosen dissimilarity measure, most conspicuously for unbalanced designs. Here, we propose a modification to the PERMANOVA test statistic, coupled with either permutation or bootstrap resampling methods, as a solution to the BFP for dissimilarity‐based tests. Empirical simulations demonstrate that the type I error remains close to nominal significance levels under classical scenarios known to cause problems for the un‐modified test. Furthermore, the permutation approach is found to be more powerful than the (more conservative) bootstrap for detecting changes in community structure for real ecological datasets. The utility of the approach is shown through analysis of 809 species of benthic soft‐sediment invertebrates from 101 sites in five areas spanning 1960 km along the Norwegian continental shelf, based on the Jaccard dissimilarity measure.
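    For orientation, the unmodified dissimilarity-based permutation test that this abstract takes as its starting point can be sketched as follows. This is the classical one-way PERMANOVA pseudo-F computed from a distance matrix, not the modified statistic the authors propose; the function names, the default of 999 permutations, and the fixed seed are illustrative assumptions.

```python
import random


def pseudo_f(dist, labels):
    """Classical one-way PERMANOVA pseudo-F from a symmetric distance matrix."""
    n = len(labels)
    groups = sorted(set(labels))
    a = len(groups)
    # Total sum of squares: squared distances over all pairs, divided by n.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares: same quantity per group, divided by group size.
    ss_within = 0.0
    for g in groups:
        idx = [i for i in range(n) if labels[i] == g]
        pairs = sum(dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:])
        ss_within += pairs / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permutation_p_value(dist, labels, n_perm=999, seed=0):
    """Permutation p-value: shuffle group labels and recompute pseudo-F."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    # The observed labelling counts as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)
```

    Because the reference distribution comes from exchanging labels across groups, this version presumes exchangeability under the null, which is the property that heterogeneous dispersions in unbalanced designs undermine.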