    Analysis of factorial experiments using mixed-effects models: options for estimation, prediction and inference

    In linear mixed-effects modelling of experiments, estimation of variance components, prediction of random effects, and computation of denominator degrees of freedom associated with inference on fixed effects are important elements of the analysis. This thesis investigates alternatives to the likelihood-based procedures for analysis of factorial experiments with normally distributed observations. Consistent methods, such as the maximum likelihood method, can be disadvantageous in cases where only small samples are available. Moreover, the algorithms used in linear mixed-effects models can be computationally demanding in large datasets. In this thesis, Henderson's method 3, a non-iterative variance component estimation method, was considered for estimation of the variance components in a two-way mixed linear model with three variance components. The variance component estimator corresponding to one of the random effects was improved by perturbing the standard unbiased estimator. The improved variance component estimator performed better in terms of mean square error. In an application on a quantitative trait loci (QTL) study, the modified estimator was compared to the restricted maximum likelihood (REML) estimator on data from a European wild boar × domestic pig intercross. The modified estimator was shown to approximate the results obtained from the REML method very closely. For balanced and unbalanced data in two-way models with and without interaction, the generalized prediction intervals for the random effects were derived. The coverage probabilities of the proposed intervals were compared with those based on the REML method and the approximate methods of Satterthwaite (1946) and Kenward and Roger (1997). The coverage of the proposed intervals was closer to the chosen nominal level than coverage of prediction intervals based on the REML method. 
With a focus on Type I error, the implications of the available options in the mixed procedure of SAS and the lmer function of R for inference on the fixed effects were examined. With the default setting of SAS, the frequency of Type I error was higher than with R. The Type I error rate in SAS was close to the nominal value when negative estimates of the variance components were allowed. Both software packages occasionally produced inaccurate results.
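
The abstract above refers to the Satterthwaite (1946) approximation for denominator degrees of freedom. As a rough illustration only, and not the thesis's own procedure, the textbook form of that approximation for a linear combination of independent mean squares could be sketched as follows. The function name `satterthwaite_df` and its arguments are hypothetical, chosen for this sketch.

```python
def satterthwaite_df(mean_squares, dfs, coefs):
    """Approximate degrees of freedom for sum(c_i * MS_i).

    Textbook Satterthwaite formula:
        df ~= (sum c_i MS_i)^2 / sum((c_i MS_i)^2 / df_i)
    All argument names are illustrative assumptions, not taken from the thesis.
    """
    # Squared value of the linear combination of mean squares
    numerator = sum(c * ms for c, ms in zip(coefs, mean_squares)) ** 2
    # Each term is weighted by the inverse of its own degrees of freedom
    denominator = sum(
        (c * ms) ** 2 / df for c, ms, df in zip(coefs, mean_squares, dfs)
    )
    return numerator / denominator
```

With a single mean square the formula returns that mean square's own degrees of freedom, which is a quick sanity check on any implementation.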

    Analysing cluster randomised controlled trials using GLMM, GEE1, GEE2, and QIF: results from four case studies

    Background: Using four case studies, we aim to provide practical guidance and recommendations for the analysis of cluster randomised controlled trials. Methods: Four modelling approaches (Generalized Linear Mixed Models with parameters estimated by maximum likelihood/restricted maximum likelihood; Generalized Linear Models with parameters estimated by Generalized Estimating Equations (first order or second order) and Quadratic Inference Function) for analysing correlated individual participant level outcomes in cluster randomised controlled trials were identified after we reviewed the literature. We systematically searched the online bibliography databases of MEDLINE, EMBASE, PsycINFO (via OVID), CINAHL (via EBSCO), and SCOPUS. We identified the above-mentioned four statistical analytical approaches and applied them to four case studies of cluster randomised controlled trials with the number of clusters ranging from 10 to 100, and individual participants ranging from 748 to 9,207. Results were obtained for both continuous and binary outcomes using R and SAS statistical packages. Results: The intracluster correlation coefficient (ICC) estimates for the case studies were less than 0.05 and are consistent with the observed ICC values commonly reported in primary care and community-based cluster randomised controlled trials. In most cases, the four methods produced similar results. However, in a few analyses, quadratic inference function produced different results compared to the generalized linear mixed model, first-order generalized estimating equations, and second-order generalized estimating equations, especially in trials with small to moderate numbers of clusters. Conclusion: This paper demonstrates the analysis of cluster randomised controlled trials with four modelling approaches. 
The results obtained were similar in most cases; however, for trials with few clusters we recommend that the quadratic inference function be used with caution and that, where possible, a small sample correction be applied. The generalisability of our results is limited to studies with similar features to our case studies, for example, studies with a similar-sized ICC. It is important to conduct simulation studies to comprehensively evaluate the performance of the four modelling approaches.
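
The intracluster correlation coefficient mentioned in this abstract can be estimated in several ways; the paper does not say which estimator it used. A minimal sketch of one common choice, the one-way ANOVA estimator, is given below, together with the standard design effect 1 + (m - 1) × ICC. The function names `anova_icc` and `design_effect` are hypothetical.

```python
import numpy as np

def anova_icc(values, clusters):
    """One-way ANOVA estimator of the intracluster correlation coefficient.

    A generic textbook estimator, shown as an assumption; the paper may have
    used a different one (for example, a model-based estimate).
    """
    values = np.asarray(values, dtype=float)
    labels, inverse = np.unique(np.asarray(clusters), return_inverse=True)
    k = labels.size                      # number of clusters
    n = values.size                      # total number of participants
    sizes = np.bincount(inverse)         # participants per cluster
    means = np.bincount(inverse, weights=values) / sizes
    grand_mean = values.mean()

    # Between-cluster and within-cluster mean squares
    msb = np.sum(sizes * (means - grand_mean) ** 2) / (k - 1)
    msw = np.sum((values - means[inverse]) ** 2) / (n - k)

    # Cluster size adjusted for unequal cluster sizes
    n0 = (n - np.sum(sizes ** 2) / n) / (k - 1)
    return (msb - msw) / (msb + (n0 - 1) * msw)

def design_effect(mean_cluster_size, icc):
    """Variance inflation due to clustering: 1 + (m - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc
```

For example, with an ICC of 0.05 and 20 participants per cluster, `design_effect(20, 0.05)` gives an inflation factor of about 1.95, which shows why even small ICC values matter for the analysis.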

    Likelihood Inference for Models with Unobservables: Another View

    There have been controversies among statisticians on (i) what to model and (ii) how to make inferences from models with unobservables. One such controversy concerns the difference between estimation methods for the marginal means not necessarily having a probabilistic basis and statistical models having unobservables with a probabilistic basis. Another concerns likelihood-based inference for statistical models with unobservables. This needs an extended-likelihood framework, and we show how one such extension, hierarchical likelihood, allows this to be done. Modeling of unobservables leads to rich classes of new probabilistic models from which likelihood-type inferences can be made naturally with hierarchical likelihood. Comment: This paper discussed in: [arXiv:1010.0804], [arXiv:1010.0807], [arXiv:1010.0810]. Rejoinder at [arXiv:1010.0814]. Published at http://dx.doi.org/10.1214/09-STS277 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Penalized additive regression for space-time data: a Bayesian perspective

    We propose extensions of penalized spline generalized additive models for analysing space-time regression data and study them from a Bayesian perspective. Non-linear effects of continuous covariates and time trends are modelled through Bayesian versions of penalized splines, while correlated spatial effects follow a Markov random field prior. This allows all functions and effects to be treated within a unified general framework by assigning appropriate priors with different forms and degrees of smoothness. Inference can be performed either with full (FB) or empirical Bayes (EB) posterior analysis. FB inference using MCMC techniques is a slight extension of our own previous work. For EB inference, a computationally efficient solution is developed on the basis of a generalized linear mixed model representation. The second approach can be viewed as posterior mode estimation and is closely related to penalized likelihood estimation in a frequentist setting. Variance components, corresponding to smoothing parameters, are then estimated by using marginal likelihood. We carefully compare both inferential procedures in simulation studies and illustrate them through real data applications. The methodology is available in the open domain statistical package BayesX and as an S-plus/R function.
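
The link between penalized splines and mixed models described in this abstract can be illustrated with a deliberately simplified sketch. It assumes a Gaussian response and a truncated-line basis rather than the Bayesian P-splines of the paper, and it takes the smoothing parameter `lam` as given instead of estimating it by marginal likelihood. In the mixed-model reading, the intercept and slope are fixed effects, the knot coefficients are random effects, and `lam` plays the role of the ratio of error variance to random-effect variance. The function name `fit_penalized_spline` is hypothetical.

```python
import numpy as np

def fit_penalized_spline(x, y, knots, lam):
    """Penalized least squares fit with a truncated-line spline basis.

    Simplified illustration (an assumption, not the paper's BayesX method):
      y = X beta + Z u + e, with penalty lam * ||u||^2 on the knot coefficients.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    knots = np.asarray(knots, dtype=float)

    X = np.column_stack([np.ones_like(x), x])            # unpenalized part
    Z = np.maximum(x[:, None] - knots[None, :], 0.0)     # (x - knot)_+ basis
    C = np.hstack([X, Z])

    # Penalize only the knot coefficients, not the intercept and slope
    D = np.diag([0.0, 0.0] + [1.0] * knots.size)

    coef = np.linalg.solve(C.T @ C + lam * D, C.T @ y)
    return coef, C @ coef
```

A large `lam` shrinks the knot coefficients towards zero, so the fit approaches a straight line; a small `lam` lets the curve follow the data more closely.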

    Linear mixed models with endogenous covariates: modeling sequential treatment effects with application to a mobile health study

    Mobile health is a rapidly developing field in which behavioral treatments are delivered to individuals via wearables or smartphones to facilitate health-related behavior change. Micro-randomized trials (MRTs) are an experimental design for developing mobile health interventions. In an MRT the treatments are randomized numerous times for each individual over the course of the trial. Along with assessing treatment effects, behavioral scientists aim to understand between-person heterogeneity in the treatment effect. A natural approach is the familiar linear mixed model. However, directly applying linear mixed models is problematic because potential moderators of the treatment effect are frequently endogenous, that is, they may depend on prior treatment. We discuss model interpretation and biases that arise in the absence of additional assumptions when endogenous covariates are included in a linear mixed model. In particular, when there are endogenous covariates, the coefficients no longer have the customary marginal interpretation. However, these coefficients still have a conditional-on-the-random-effect interpretation. We provide an additional assumption that, if true, allows scientists to use standard software to fit linear mixed models with endogenous covariates, and person-specific predictions of effects can be provided. As an illustration, we assess the effect of activity suggestion in the HeartSteps MRT and analyze the between-person treatment effect heterogeneity.

    A classical approach for the analysis of generalized linear mixed models.

    Thesis (M.Sc.)-University of Natal, Durban, 2004. Generalized linear mixed models (GLMMs) accommodate the study of overdispersion and correlation inherent in hierarchically structured data. These models are an extension of generalized linear models (GLMs) and linear mixed models (LMMs). The linear predictor of a GLM is extended to include an unobserved, albeit realized, vector of Gaussian distributed random effects. Conditional on these random effects, responses are assumed to be independent. The objective function for parameter estimation is an integrated quasi-likelihood (IQL) function which is often intractable since it may consist of high-dimensional integrals. Therefore, an exact maximum likelihood analysis is not feasible. The penalized quasi-likelihood (PQL) function, derived from a first-order Laplace expansion to the IQL about the optimum value of the random effects and under the assumption of slowly varying weights, is an approximate technique for statistical inference in GLMMs. Replacing the conditional weighted quasi-deviance function in the Laplace-approximated IQL by the generalized chi-squared statistic leads to a corrected profile quasi-likelihood function for the restricted maximum likelihood (REML) estimation of dispersion components by Fisher scoring. Evaluation of mean parameters, for fixed dispersion components, by iterative weighted least squares (IWLS) yields joint estimates of fixed effects and random effects. Thus, the PQL criterion involves repeated fitting of a Gaussian LMM with a linked response vector and a conditional iterated weight matrix. In some instances, PQL estimates fail to converge to a neighbourhood of their true values. Bias-corrected PQL estimators (CPQL) have hence been proposed, using asymptotic analysis and simulation. The pseudo-likelihood algorithm is an alternative estimation procedure for GLMMs. 
Global score statistics for hypothesis testing of overdispersion, correlation and heterogeneity in GLMMs have been developed, as well as individual score statistics for testing null dispersion components separately. A conditional mean squared error of prediction (CMSEP) has also been considered as a general measure of predictive uncertainty. Local influence measures for testing the robustness of parameter estimates, by inducing minor perturbations into GLMMs, are recent advances in the study of these models. Commercial statistical software is available for the analysis of GLMMs.
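
The iterative weighted least squares (IWLS) step named in this abstract can be illustrated in its simplest setting. The sketch below fits an ordinary logistic GLM with no random effects, so it is not PQL itself; it only shows the working response and the iterated weights that PQL reuses when it repeatedly fits a Gaussian LMM. The function name `iwls_logistic` and its parameters are hypothetical.

```python
import numpy as np

def iwls_logistic(X, y, n_iter=25, tol=1e-10):
    """IWLS for a logistic GLM with fixed effects only (illustrative assumption).

    Each pass builds a working response z and weights w from the current
    linear predictor, then solves a weighted least squares problem.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta                        # linear predictor
        mu = 1.0 / (1.0 + np.exp(-eta))       # inverse logit link
        w = mu * (1.0 - mu)                   # iterated weights
        z = eta + (y - mu) / w                # working (linearized) response
        XtW = X.T * w                         # X' W without forming diag(w)
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta
```

In an intercept-only model the fitted intercept converges to the logit of the sample proportion, which gives a simple check that the iteration is behaving as expected.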

    Structured additive regression for multicategorical space-time data: A mixed model approach

    In many practical situations, simple regression models suffer from the fact that the dependence of responses on covariates cannot be sufficiently described by a purely parametric predictor. For example, effects of continuous covariates may be nonlinear or complex interactions between covariates may be present. A specific problem of space-time data is that observations are in general spatially and/or temporally correlated. Moreover, unobserved heterogeneity between individuals or units may be present. While, in recent years, there has been a lot of work in this area dealing with univariate response models, only limited attention has been given to models for multicategorical space-time data. We propose a general class of structured additive regression models (STAR) for multicategorical responses, allowing for a flexible semiparametric predictor. This class includes models for multinomial responses with unordered categories as well as models for ordinal responses. Non-linear effects of continuous covariates, time trends and interactions between continuous covariates are modelled through Bayesian versions of penalized splines and flexible seasonal components. Spatial effects can be estimated based on Markov random fields, stationary Gaussian random fields or two-dimensional penalized splines. We present our approach from a Bayesian perspective, allowing all functions and effects to be treated within a unified general framework by assigning appropriate priors with different forms and degrees of smoothness. Inference is performed on the basis of a multicategorical linear mixed model representation. This can be viewed as posterior mode estimation and is closely related to penalized likelihood estimation in a frequentist setting. Variance components, corresponding to inverse smoothing parameters, are then estimated by using restricted maximum likelihood. Numerically efficient algorithms allow computations even for fairly large data sets. 
As a typical example, we present results on an analysis of data from a forest health survey.