
    Multiple testing correction in linear mixed models.

    Background: Multiple hypothesis testing is a major issue in genome-wide association studies (GWAS), which often analyze millions of markers. The permutation test is considered the gold standard for multiple testing correction because it accurately accounts for the correlation structure of the genome. Recently, the linear mixed model (LMM) has become standard practice in GWAS, addressing population structure and insufficient power. However, none of the current multiple testing approaches is applicable to the LMM.
    Results: We were able to estimate per-marker thresholds as accurately as the gold-standard approach in real and simulated datasets, while reducing the required time from months to hours. We applied our approach to mouse, yeast, and human datasets to demonstrate its accuracy and efficiency.
    Conclusions: We provide an efficient and accurate multiple testing correction approach for linear mixed models. We further provide intuition about the relationships between per-marker threshold, genetic relatedness, and heritability, based on our observations in real data.
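    The gold-standard permutation procedure that this paper's method approximates can be sketched in a few lines: permute the phenotype, record the genome-wide maximum association statistic each time, and take an upper quantile of those maxima as the family-wise threshold. This is a hedged toy sketch, not the paper's LMM-aware algorithm; all sizes, names, and the use of simple correlation as the test statistic are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_ind, n_snp = 200, 500                                     # toy sizes, not from the paper
G = rng.integers(0, 3, size=(n_ind, n_snp)).astype(float)   # illustrative genotypes 0/1/2
y = rng.normal(size=n_ind)                                  # phenotype under the null

Gz = (G - G.mean(axis=0)) / G.std(axis=0)                   # standardize markers once

def max_abs_corr(pheno):
    """Largest |correlation| between any marker and the phenotype."""
    pz = (pheno - pheno.mean()) / pheno.std()
    return np.max(np.abs(Gz.T @ pz)) / len(pheno)

# Null distribution of the genome-wide maximum via phenotype permutation.
n_perm = 1000
null_max = np.array([max_abs_corr(rng.permutation(y)) for _ in range(n_perm)])

# 5% family-wise threshold: 95th percentile of the permuted maxima.
threshold = np.quantile(null_max, 0.95)
print(f"5% family-wise threshold on |r|: {threshold:.3f}")
```

Because every permutation rescans the whole genome, the cost scales as permutations times markers, which is what makes the naive procedure take months on real GWAS data.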

    Generalized fiducial inference for normal linear mixed models

    While linear mixed modeling methods are foundational concepts introduced in any statistical education, adequate general methods for interval estimation in models with more than a few variance components are lacking, especially in the unbalanced setting. Generalized fiducial inference provides a possible framework to fill this methodological gap. Within the framework of generalized fiducial inference, combined with sequential Monte Carlo methods, we present an approach to interval estimation for both balanced and unbalanced Gaussian linear mixed models. We compare the proposed method to classical and Bayesian results from the literature in a simulation study of two-fold nested models and two-factor crossed designs with an interaction term. The proposed method is found to be competitive or better when evaluated on the frequentist criteria of empirical coverage and average confidence interval length for small sample sizes. A MATLAB implementation of the proposed algorithm is available from the authors.
    Comment: Published at http://dx.doi.org/10.1214/12-AOS1030 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
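    For contrast with the fiducial approach, the classical point estimators it is compared against are easy to state in the simplest member of this model class, a balanced one-way random effects model: the ANOVA method-of-moments estimates from the between- and within-group mean squares. This is a hedged toy sketch of that classical baseline only, not the paper's fiducial or sequential Monte Carlo algorithm; the simulation setup and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
a, n = 30, 10                          # groups, replicates per group (balanced, toy)
sigma_a2, sigma_e2 = 2.0, 1.0          # true variance components (illustrative)

u = rng.normal(0, np.sqrt(sigma_a2), size=a)                  # random group effects
y = u[:, None] + rng.normal(0, np.sqrt(sigma_e2), size=(a, n))

group_means = y.mean(axis=1)
grand_mean = y.mean()

msb = n * np.sum((group_means - grand_mean) ** 2) / (a - 1)    # between-group mean square
msw = np.sum((y - group_means[:, None]) ** 2) / (a * (n - 1))  # within-group mean square

sigma_e2_hat = msw                     # E[MSW] = sigma_e^2
sigma_a2_hat = (msb - msw) / n         # E[MSB] = n * sigma_a^2 + sigma_e^2
print(sigma_a2_hat, sigma_e2_hat)
```

The well-known defect of this estimator, that sigma_a2_hat can go negative when MSB < MSW, is one reason interval estimation for variance components is hard and motivates frameworks like the fiducial one.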

    A Note on the Identifiability of Generalized Linear Mixed Models

    I present here a simple proof that, under general regularity conditions, the standard parametrization of the generalized linear mixed model is identifiable. The proof relies only on the assumptions that generalized linear mixed models place on the first- and second-order moments, together with some mild regularity conditions, and therefore extends to quasi-likelihood-based generalized linear models. In particular, binomial and Poisson mixed models with a dispersion parameter are identifiable under the standard parametrization.
    Comment: 9 pages, no figures.

    Variational approximation for mixtures of linear mixed models

    Mixtures of linear mixed models (MLMMs) are useful for clustering grouped data and can be estimated by likelihood maximization through the EM algorithm. The conventional approach to determining a suitable number of components is to compare different mixture models using penalized log-likelihood criteria such as BIC. We propose fitting MLMMs with variational methods, which can perform parameter estimation and model selection simultaneously. A variational approximation is described in which the variational lower bound and parameter updates are in closed form, allowing fast evaluation. A new variational greedy algorithm is developed for model selection and learning of the mixture components. This approach initializes the algorithm automatically and returns a plausible number of mixture components. In cases of weak identifiability of certain model parameters, we use hierarchical centering to reparametrize the model and show empirically that variational algorithms gain efficiency in much the same way as MCMC algorithms do. Relatedly, we prove that the approximate rate of convergence of variational algorithms under Gaussian approximation equals that of the corresponding Gibbs sampler, which suggests that reparametrizations can lead to improved convergence in variational algorithms as well.
    Comment: 36 pages, 5 figures, 2 tables, submitted to JCG
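    The conventional BIC-based selection that this paper contrasts with can be sketched concretely: fit the mixture for several component counts and keep the count with the lowest BIC. As a hedged stand-in, plain Gaussian mixtures (via scikit-learn) replace MLMMs here to keep the sketch self-contained; the data and all names are illustrative.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
# Two well-separated toy clusters standing in for grouped data.
X = np.vstack([rng.normal(-3.0, 1.0, size=(150, 2)),
               rng.normal(+3.0, 1.0, size=(150, 2))])

bics = {}
for k in range(1, 5):
    gm = GaussianMixture(n_components=k, random_state=0).fit(X)
    bics[k] = gm.bic(X)            # BIC penalizes log-likelihood by model size

best_k = min(bics, key=bics.get)   # smallest BIC wins
print(best_k)
```

The variational approach in the paper avoids this loop entirely: a single run both estimates the parameters and settles on the number of components.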

    Linear quantile mixed models

    Dependent data arise in many studies. For example, children with the same parents or living in neighbouring geographic areas tend to be more alike in many characteristics than individuals chosen at random from the population at large, and observations taken repeatedly on the same individual are likely to be more similar than observations from different individuals. Frequently adopted sampling designs, such as cluster, multilevel, spatial, and repeated-measures (longitudinal or panel) designs, may induce this dependence, which the analysis of the data must properly take into account. In a previous publication (Geraci and Bottai, Biostatistics 2007), we proposed a conditional quantile regression model for continuous responses in which a random intercept was included along with fixed-coefficient predictors to account for between-subject dependence in longitudinal data analysis. Conditional on the random intercept, the response was assumed to follow an asymmetric Laplace distribution. The approach hinged on the link between the minimization of weighted least absolute deviations, typically used in quantile regression, and the maximization of a Laplace likelihood. As a follow-up to that study, we extend those models here to more complex dependence structures, modelled by including multiple random effects in the linear conditional quantile functions. Unlike the Gibbs-sampling expectation-maximization approach proposed previously, estimation of the fixed regression coefficients and of the random-effects covariance matrix is based on a combination of Gaussian quadrature approximations and optimization algorithms. The former include Gauss-Hermite and Gauss-Laguerre quadratures for normal and double-exponential (i.e., symmetric Laplace) random effects, respectively; the latter include a gradient search algorithm and general-purpose optimizers. As a result, some of the computational burden associated with large Gibbs sample sizes is avoided. We also briefly discuss an estimation approach based on generalized Clarke derivatives. Finally, a simulation study is presented and some preliminary results are shown.
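    The link the paper builds on, that minimizing an asymmetric absolute-deviation ("check") loss fits a conditional quantile, can be shown in a fixed-effects-only sketch via the standard linear-programming formulation of quantile regression. This is a hedged illustration of the underlying idea, not the paper's mixed-effects estimator; the heteroscedastic toy data and all names are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(3)
n = 500
x = rng.uniform(0, 10, n)
y = 1.0 + 2.0 * x + rng.normal(0, 1 + 0.3 * x)   # noise spread grows with x

tau = 0.9                                        # 90th conditional percentile
X = np.column_stack([np.ones(n), x])
p = X.shape[1]

# Split residuals r = u - v with u, v >= 0; the check loss becomes linear:
#   minimize  tau * sum(u) + (1 - tau) * sum(v)
#   subject to  X @ beta + u - v = y
c = np.concatenate([np.zeros(p), tau * np.ones(n), (1 - tau) * np.ones(n)])
A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
bounds = [(None, None)] * p + [(0, None)] * (2 * n)
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
b0, b1 = res.x[:p]
print(b0, b1)
```

With heteroscedastic noise, the fitted 0.9-quantile slope exceeds the mean-regression slope of 2, since the upper quantile absorbs part of the growing noise spread.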

    Model Selection in Linear Mixed Models

    Linear mixed effects models are highly flexible in handling a broad range of data types and are therefore widely used in applications. A key part of data analysis is model selection, which often aims to choose a parsimonious model with other desirable properties from a possibly very large set of candidate statistical models. Over the last 5-10 years, the literature on model selection in linear mixed models has grown extremely rapidly. The problem is much more complicated than in linear regression because selection on the covariance structure is not straightforward, owing to computational issues and boundary problems arising from positive semidefinite constraints on covariance matrices. To obtain a better understanding of the available methods, their properties, and the relationships between them, we review a large body of literature on linear mixed model selection. We arrange, implement, discuss, and compare model selection methods based on four major approaches: information criteria such as AIC or BIC, shrinkage methods based on penalized loss functions such as the LASSO, the Fence procedure, and Bayesian techniques.
    Comment: Published at http://dx.doi.org/10.1214/12-STS410 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
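    The first of the four approaches, information-criterion selection, can be sketched concretely. As a hedged stand-in, ordinary least squares fits replace the mixed-model fits here so the example stays self-contained and sidesteps the covariance-boundary issues the review discusses; the data, candidate set, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 300
x1, x2, x3 = rng.normal(size=(3, n))
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(size=n)   # x3 is irrelevant by construction

def aic_ols(X, y):
    """Gaussian AIC (up to an additive constant) for an OLS fit with design X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    k = X.shape[1] + 1                   # regression coefficients + error variance
    return len(y) * np.log(rss / len(y)) + 2 * k

ones = np.ones(n)
candidates = {
    "x1":       np.column_stack([ones, x1]),
    "x1+x2":    np.column_stack([ones, x1, x2]),
    "x1+x2+x3": np.column_stack([ones, x1, x2, x3]),
}
aics = {name: aic_ols(X, y) for name, X in candidates.items()}
best = min(aics, key=aics.get)           # smallest AIC wins
print(best)
```

In the mixed-model setting the same comparison is complicated by the question of how to count parameters for the random effects, which is one of the issues the review takes up.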