Multiple testing correction in linear mixed models.
Background: Multiple hypothesis testing is a major issue in genome-wide association studies (GWAS), which often analyze millions of markers. The permutation test is considered the gold standard in multiple testing correction, as it accurately accounts for the correlation structure of the genome. Recently, the linear mixed model (LMM) has become standard practice in GWAS, addressing issues of population structure and insufficient power. However, none of the current multiple testing approaches is applicable to the LMM.
Results: We were able to estimate per-marker thresholds as accurately as the gold-standard approach in real and simulated datasets, while reducing the time required from months to hours. We applied our approach to mouse, yeast, and human datasets to demonstrate its accuracy and efficiency.
Conclusions: We provide an efficient and accurate multiple testing correction approach for linear mixed models. We further provide intuition about the relationships between the per-marker threshold, genetic relatedness, and heritability, based on our observations in real data.
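The permutation approach the abstract treats as the gold standard can be sketched for a plain (non-LMM) association scan; the simulated genotypes, phenotype, and helper name below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 500                                     # individuals, markers
X = rng.integers(0, 3, size=(n, m)).astype(float)   # genotypes coded 0/1/2
y = rng.normal(size=n)                              # phenotype (null model here)

def max_abs_corr(y, X):
    """Genome-wide maximum absolute marker-phenotype correlation."""
    Xc = (X - X.mean(0)) / X.std(0)
    yc = (y - y.mean()) / y.std()
    return np.abs(Xc.T @ yc / len(y)).max()

# Permutation null: shuffle phenotypes, record the genome-wide max statistic.
null_max = np.array([max_abs_corr(rng.permutation(y), X) for _ in range(200)])

# The 95th percentile of the null maxima gives a 5% family-wise threshold
# that respects the correlation structure among the markers.
threshold = np.quantile(null_max, 0.95)
```

Each permutation re-scans all markers, which is why this route costs months at GWAS scale and why the per-marker thresholds the abstract estimates directly are attractive.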
Generalized fiducial inference for normal linear mixed models
While linear mixed modeling methods are foundational concepts introduced in
any statistical education, adequate general methods for interval estimation
involving models with more than a few variance components are lacking,
especially in the unbalanced setting. Generalized fiducial inference provides a
possible framework that accommodates this absence of methodology. Under the
fabric of generalized fiducial inference along with sequential Monte Carlo
methods, we present an approach for interval estimation for both balanced and
unbalanced Gaussian linear mixed models. We compare the proposed method to
classical and Bayesian results in the literature in a simulation study of
two-fold nested models and two-factor crossed designs with an interaction term.
The proposed method is found to be competitive or better when evaluated based
on frequentist criteria of empirical coverage and average length of confidence
intervals for small sample sizes. A MATLAB implementation of the proposed
algorithm is available from the authors.
Comment: Published at http://dx.doi.org/10.1214/12-AOS1030 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
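The frequentist criteria named above, empirical coverage and average interval length, can be illustrated with a toy simulation; the known-variance normal-mean setting and all names are assumptions for illustration, not the paper's nested or crossed designs:

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, n, reps = 5.0, 1.0, 20, 2000

covered, lengths = 0, []
for _ in range(reps):
    x = rng.normal(mu, sigma, n)
    half = 1.96 * sigma / np.sqrt(n)        # known-variance 95% interval half-width
    lo, hi = x.mean() - half, x.mean() + half
    covered += (lo <= mu <= hi)
    lengths.append(hi - lo)

coverage = covered / reps                   # empirical coverage, near the nominal 0.95
avg_len = float(np.mean(lengths))           # average confidence-interval length
```

A method is "competitive or better" in this sense when its coverage stays near nominal while its intervals are no longer than the competitors'.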
D-optimal designs for multi-response linear mixed models
Linear mixed models have become popular in many statistical applications during recent years. However, design issues for multi-response linear mixed models are rarely discussed. The main purpose of this paper is to investigate D-optimal designs for multi-response linear mixed models. We provide two equivalence theorems to characterize the optimal designs for the estimation of the fixed effects and the prediction of random effects, respectively. Two examples of D-optimal designs for multi-response linear mixed models are given for illustration.
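The D-criterion that such designs maximize can be sketched numerically; the two candidate designs and the compound-symmetry covariance below are illustrative assumptions, not the paper's equivalence theorems:

```python
import numpy as np

def d_criterion(X, V):
    """D-criterion for the fixed effects: det of the information matrix X' V^{-1} X."""
    return float(np.linalg.det(X.T @ np.linalg.solve(V, X)))

# Two candidate 4-run designs for a straight line E[y] = b0 + b1 * x on [-1, 1].
X_spread = np.column_stack([np.ones(4), [-1.0, -1.0, 1.0, 1.0]])
X_middle = np.column_stack([np.ones(4), [-0.5, 0.0, 0.0, 0.5]])

# Compound-symmetry covariance induced by a random intercept (variance 0.5).
V = np.eye(4) + 0.5 * np.ones((4, 4))

spread_score = d_criterion(X_spread, V)
middle_score = d_criterion(X_middle, V)   # the spread design scores higher
```

A D-optimal design maximizes this determinant over all candidate designs, which shrinks the volume of the confidence ellipsoid for the fixed effects.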
A Note on the Identifiability of Generalized Linear Mixed Models
I present here a simple proof that, under general regularity conditions, the standard parametrization of the generalized linear mixed model is identifiable. The proof is based on the assumptions that generalized linear mixed models place on the first- and second-order moments, together with some mild regularity conditions, and is therefore extensible to quasi-likelihood-based generalized linear models. In particular, binomial and Poisson mixed models with a dispersion parameter are identifiable under the standard parametrization.
Comment: 9 pages, no figures
Variational approximation for mixtures of linear mixed models
Mixtures of linear mixed models (MLMMs) are useful for clustering grouped
data and can be estimated by likelihood maximization through the EM algorithm.
The conventional approach to determining a suitable number of components is to
compare different mixture models using penalized log-likelihood criteria such
as BIC. We propose fitting MLMMs with variational methods, which can perform
parameter estimation and model selection simultaneously. A variational
approximation is described where the variational lower bound and parameter
updates are in closed form, allowing fast evaluation. A new variational greedy
algorithm is developed for model selection and learning of the mixture
components. This approach allows an automatic initialization of the algorithm
and returns a plausible number of mixture components automatically. In cases of
weak identifiability of certain model parameters, we use hierarchical centering
to reparametrize the model and show empirically that there is a gain in
efficiency by variational algorithms similar to that in MCMC algorithms.
Related to this, we prove that the approximate rate of convergence of
variational algorithms by Gaussian approximation is equal to that of the
corresponding Gibbs sampler which suggests that reparametrizations can lead to
improved convergence in variational algorithms as well.
Comment: 36 pages, 5 figures, 2 tables, submitted to JCG
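The conventional BIC route that the abstract contrasts with variational selection can be sketched for a plain univariate Gaussian mixture (not an MLMM); the EM implementation and the simulated data are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
# Clearly bimodal data: two Gaussian components at -3 and +3.
x = np.concatenate([rng.normal(-3, 1, 300), rng.normal(3, 1, 300)])

def fit_gmm_1d(x, k, iters=200):
    """Tiny EM for a k-component univariate Gaussian mixture; returns the log-likelihood."""
    mu = np.quantile(x, np.linspace(0.1, 0.9, k))
    sd = np.full(k, x.std())
    w = np.full(k, 1.0 / k)
    for _ in range(iters):
        dens = w * np.exp(-0.5 * ((x[:, None] - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
        r = dens / dens.sum(1, keepdims=True)                  # E-step: responsibilities
        nk = r.sum(0)
        w, mu = nk / len(x), (r * x[:, None]).sum(0) / nk      # M-step: weights, means
        sd = np.sqrt((r * (x[:, None] - mu) ** 2).sum(0) / nk) # M-step: scales
    return np.log(dens.sum(1)).sum()

def bic(loglik, k, n):
    p = 3 * k - 1                    # k weights (minus one constraint), k means, k scales
    return -2 * loglik + p * np.log(n)

# The conventional approach: fit each candidate k separately, pick the smallest BIC.
bics = {k: bic(fit_gmm_1d(x, k), k, len(x)) for k in (1, 2)}
best = min(bics, key=bics.get)
```

The variational greedy algorithm in the paper avoids exactly this fit-every-k loop by estimating parameters and selecting the number of components in one pass.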
Linear quantile mixed models
Dependent data arise in many studies. For example, children with the same parents or living in neighbouring geographic areas tend to be more alike in many characteristics than individuals chosen at random from the population at large; observations taken repeatedly on the same individual are likely to be more similar than observations from different individuals. Frequently adopted sampling designs, such as cluster, multilevel, spatial, and repeated measures (or longitudinal or panel) designs, may induce this dependence, which the analysis of the data needs to take into account. In a previous publication (Geraci and Bottai, Biostatistics 2007), we proposed a conditional quantile regression model for continuous responses in which a random intercept was included along with fixed-coefficient predictors to account for between-subjects dependence in the context of longitudinal data analysis. Conditional on the random intercept, the response was assumed to follow an asymmetric Laplace distribution. The approach hinged on the link between the minimization of weighted least absolute deviations, typically used in quantile regression, and the maximization of the Laplace likelihood. As a follow-up to that study, here we extend those models to more complex dependence structures in the data, modelled by including multiple random effects in the linear conditional quantile functions. Unlike the Gibbs sampling expectation-maximization approach proposed previously, the estimation of the fixed regression coefficients and of the random effects covariance matrix is based on a combination of Gaussian quadrature approximations and optimization algorithms. The former include Gauss-Hermite and Gauss-Laguerre quadratures for, respectively, normal and double exponential (i.e., symmetric Laplace) random effects; the latter include a gradient search algorithm and general-purpose optimizers.
As a result, some of the computational burden associated with large Gibbs sample sizes is avoided. We also briefly discuss an estimation approach based on generalized Clarke derivatives. Finally, a simulation study is presented and some preliminary results are shown.
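The Gauss-Hermite step mentioned above, integrating a function against a normal random effect, can be sketched as follows; the helper name and the test function are illustrative assumptions, not the authors' estimation routine:

```python
import numpy as np

# Gauss-Hermite nodes and weights: sum_i w_i f(z_i) approximates the integral
# of f(z) exp(-z^2) dz, exactly for polynomials up to degree 2 * 20 - 1.
nodes, weights = np.polynomial.hermite.hermgauss(20)

def expect_normal(f, mu=0.0, sigma=1.0):
    """Approximate E[f(U)] for U ~ N(mu, sigma^2) via u = mu + sqrt(2) * sigma * z."""
    u = mu + np.sqrt(2.0) * sigma * nodes
    return float((weights * f(u)).sum() / np.sqrt(np.pi))

# Sanity check: E[U^2] = sigma^2 = 2.25 when mu = 0 and sigma = 1.5.
second_moment = expect_normal(lambda u: u ** 2, mu=0.0, sigma=1.5)
```

In the paper's setting the integrand f would be the conditional (asymmetric Laplace) likelihood, so the random effects are integrated out at a fixed, small cost per likelihood evaluation instead of by Gibbs sampling.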
Model Selection in Linear Mixed Models
Linear mixed effects models are highly flexible in handling a broad range of
data types and are therefore widely used in applications. A key part in the
analysis of data is model selection, which often aims to choose a parsimonious
model with other desirable properties from a possibly very large set of
candidate statistical models. Over the last 5-10 years the literature on model
selection in linear mixed models has grown extremely rapidly. The problem is
much more complicated than in linear regression because selection on the
covariance structure is not straightforward due to computational issues and
boundary problems arising from positive semidefinite constraints on covariance
matrices. To obtain a better understanding of the available methods, their
properties and the relationships between them, we review a large body of
literature on linear mixed model selection. We arrange, implement, discuss and
compare model selection methods based on four major approaches: information
criteria such as AIC or BIC, shrinkage methods based on penalized loss
functions such as LASSO, the Fence procedure, and Bayesian techniques.
Comment: Published at http://dx.doi.org/10.1214/12-STS410 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
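Of the four approaches, the information-criterion route is the simplest to sketch; the OLS stand-in below (no random effects, so the covariance boundary issues do not arise) and all names are illustrative assumptions, not any specific reviewed method:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 150
X = rng.normal(size=(n, 4))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(size=n)   # only the first two predictors matter

def aic_ols(y, X):
    """Gaussian AIC for an OLS fit: -2 * loglik + 2 * (number of parameters)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / len(y)                       # ML estimate of the error variance
    loglik = -0.5 * len(y) * (np.log(2 * np.pi * sigma2) + 1)
    return -2 * loglik + 2 * (X.shape[1] + 1)             # +1 parameter for sigma^2

# Score nested candidate fixed-effect sets {x1}, {x1, x2}, {x1, x2, x3}, ...
scores = {k: aic_ols(y, X[:, :k]) for k in range(1, 5)}
best = min(scores, key=scores.get)                        # AIC should reject the 1-predictor model
```

For a true mixed model the same comparison is complicated in the ways the abstract notes: the effective number of parameters is ambiguous, and variance components on the boundary of the positive-semidefinite cone break the usual asymptotics.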
