1,084 research outputs found

    A goodness-of-fit test for the random-effects distribution in mixed models

    In this paper, we develop a simple diagnostic test for the random-effects distribution in mixed models. The test is based on the gradient function, a graphical tool proposed by Verbeke and Molenberghs to check the impact of assumptions about the random-effects distribution in mixed models on inferences. Inference is conducted through the bootstrap. The proposed test is easy to implement and applicable in a general class of mixed models. The operating characteristics of the test are evaluated in a simulation study, and the method is further illustrated using two real data analyses

    Bayesian Generalized Linear Mixed Effects Models Using Normal-Independent Distributions: Formulation and Applications

    A standard assumption is that the random effects of Generalized Linear Mixed Effects Models (GLMMs) follow the normal distribution. However, this assumption has been found to be quite unrealistic and sometimes too restrictive as revealed in many real-life situations. A common case of departures from normality includes the presence of outliers leading to heavy-tailed distributed random effects. This work, therefore, aims to develop a robust GLMM framework by replacing the normality assumption on the random effects by the distributions belonging to the Normal-Independent (NI) class. The resulting models are called the Normal-Independent GLMM (NI-GLMM). The four special cases of the NI class considered in these models’ formulations include the normal, Student-t, Slash and contaminated normal distributions. A full Bayesian technique was adopted for estimation and inference. A real-life data set on cotton bolls was used to demonstrate the performance of the proposed NI-GLMM methodology

    Linear Mixed Models with Marginally Symmetric Nonparametric Random Effects

    Linear mixed models (LMMs) are used as an important tool in the data analysis of repeated measures and longitudinal studies. The most common form of LMMs utilizes a normal distribution to model the random effects. Such assumptions can often lead to misspecification errors when the random effects are not normal. One approach to remedy the misspecification errors is to utilize a point-mass distribution to model the random effects; this is known as the nonparametric maximum likelihood-fitted (NPML) model. The NPML model is flexible but requires a large number of parameters to characterize the random-effects distribution. It is often natural to assume that the random-effects distribution be at least marginally symmetric. The marginally symmetric NPML (MSNPML) random-effects model is introduced, which assumes a marginally symmetric point-mass distribution for the random effects. Under the symmetry assumption, the MSNPML model utilizes half the number of parameters to characterize the same number of point masses as the NPML model; thus the model confers an advantage in economy and parsimony. An EM-type algorithm is presented for the maximum likelihood (ML) estimation of LMMs with MSNPML random effects; the algorithm is shown to monotonically increase the log-likelihood and is proven to be convergent to a stationary point of the log-likelihood function in the case of convergence. Furthermore, it is shown that the ML estimator is consistent and asymptotically normal under certain conditions, and the estimation of quantities such as the random-effects covariance matrix and individual a posteriori expectations is demonstrated

    Reduced Bayesian Hierarchical Models: Estimating Health Effects of Simultaneous Exposure to Multiple Pollutants

    Quantifying the health effects associated with simultaneous exposure to many air pollutants is now a research priority of the US EPA. Bayesian hierarchical models (BHM) have been extensively used in multisite time series studies of air pollution and health to estimate health effects of a single pollutant adjusted for potential confounding of other pollutants and other time-varying factors. However, when the scientific goal is to estimate the impacts of many pollutants jointly, a straightforward application of BHM is challenged by the need to specify a random-effect distribution on a high-dimensional vector of nuisance parameters, which often do not have an easy interpretation. In this paper we introduce a new BHM formulation, which we call reduced BHM, aimed at analyzing clustered data sets in the presence of a large number of random effects that are not of primary scientific interest. At the first stage of the reduced BHM, we calculate the integrated likelihood of the parameter of interest (e.g. excess number of deaths attributed to simultaneous exposure to high levels of many pollutants). At the second stage, we specify a flexible random-effect distribution directly on the parameter of interest. The reduced BHM overcomes many of the challenges in the specification and implementation of full BHM in the context of a large number of nuisance parameters. In simulation studies we show that the reduced BHM performs comparably to the full BHM in many scenarios, and even performs better in some cases. Methods are applied to estimate location-specific and overall relative risks of cardiovascular hospital admissions associated with simultaneous exposure to elevated levels of particulate matter and ozone in 51 US counties during the period 1999-2005

    How to Control for Many Covariates? Reliable Estimators Based on the Propensity Score

    We investigate the finite sample properties of a large number of estimators for the average treatment effect on the treated that are suitable when adjustment for observable covariates is required, like inverse probability weighting, kernel and other variants of matching, as well as different parametric models. The simulation design used is based on real data usually employed for the evaluation of labour market programmes in Germany. We vary several dimensions of the design that are of practical importance, like sample size, the type of the outcome variable, and aspects of the selection process. We find that trimming individual observations with too much weight as well as the choice of tuning parameters is important for all estimators. The key conclusion from our simulations is that a particular radius matching estimator combined with regression performs best overall, in particular when robustness to misspecifications of the propensity score is considered an important property.
    Keywords: propensity score matching, kernel matching, inverse probability weighting, selection on observables, empirical Monte Carlo study, finite sample properties
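
As a rough illustration of inverse probability weighting for the average treatment effect on the treated, with trimming of heavily weighted observations, the following minimal sketch may help. It assumes the propensity scores are already available, and the function name `att_ipw`, the parameter `max_weight_share`, and the specific trimming rule (dropping controls whose share of the total control weight exceeds a threshold) are hypothetical choices for illustration, not necessarily those studied in the paper.

```python
def att_ipw(y, d, p, max_weight_share=None):
    """Normalized IPW estimate of the ATT (illustrative sketch only).

    y: outcomes; d: treatment indicators (1 = treated, 0 = control);
    p: propensity scores, assumed known or estimated beforehand.
    max_weight_share: hypothetical trimming threshold; controls whose
    share of the total control weight exceeds it are dropped.
    """
    treated = [yi for yi, di in zip(y, d) if di == 1]
    # Each control is reweighted by the odds p / (1 - p) so that the
    # weighted controls resemble the treated group.
    controls = [(yi, pi / (1.0 - pi)) for yi, di, pi in zip(y, d, p) if di == 0]

    if max_weight_share is not None:
        total = sum(w for _, w in controls)
        controls = [(yi, w) for yi, w in controls if w / total <= max_weight_share]

    # Normalize the remaining weights so they sum to one.
    total = sum(w for _, w in controls)
    counterfactual = sum(yi * w for yi, w in controls) / total
    return sum(treated) / len(treated) - counterfactual


# Toy data: the last control has a propensity score of 0.9 and would
# dominate the weighted control mean if it were not trimmed.
y = [3, 5, 1, 2, 10]
d = [1, 1, 0, 0, 0]
p = [0.5, 0.5, 0.5, 0.5, 0.9]
untrimmed = att_ipw(y, d, p)
trimmed = att_ipw(y, d, p, max_weight_share=0.5)
```

In this toy example a single control carries most of the total weight, so the untrimmed and trimmed estimates differ markedly, which is the kind of sensitivity the abstract points to.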

    GENERALIZED LINEAR MIXED MODEL ESTIMATION USING PROC GLIMMIX: RESULTS FROM SIMULATIONS WHEN THE DATA AND MODEL MATCH, AND WHEN THE MODEL IS MISSPECIFIED

    A simulation study was conducted to determine how well SAS® PROC GLIMMIX (SAS Institute, Cary, NC), statistical software to fit generalized linear mixed models (GLMMs), performed for a simple GLMM, using its default settings, as a naïve user would do. Data were generated from a wide variety of distributions with the same sets of linear predictors, and under several conditions. Then, the data sets were analyzed by using the correct model (the generating model and estimating model were the same) and, subsequently, by misspecifying the estimating model, all using default settings. The data generation model was a randomized complete block design where the model parameters and sample sizes were adjusted to yield 80% power for the F-test on treatment means given a 30-block experiment with block-by-treatment interaction and with additional treatment replications within each block. Convergence rates were low for the exponential and Poisson distributions, even when the generating and estimating models matched. The normal and lognormal distributions converged 100% of the time; convergence rates for other distributions varied. As expected, reducing the number of blocks from 30 to five and increasing replications within blocks to keep total N the same reduced power to 40% or less. Except for the exponential distribution, estimates of treatment means and variance parameters were accurate with only slight biases. Misspecifying the estimating model by omitting the block-by-treatment random effect made F-tests too liberal. Since omitting that term from the model, effectively ignoring a process involved in giving rise to the data, produces symptoms of over-dispersion, several potential remedies were investigated. For all distributions, the historically recommended variance stabilizing transformation was applied, and then the transformed data were fit using a linear mixed model. For one-parameter members of the exponential family an over-dispersion parameter was included in the estimating model. The negative binomial distribution was also examined as the estimating model distribution. None of these remedial steps corrected the over-dispersion problem created by misspecifying the linear predictor, although using a variance stabilizing transformation did improve convergence rates on most distributions investigated

    Modeling Agreement between Binary Classifications of Multiple Raters in R and SAS

    Cancer screening and diagnostic tests often are classified using a binary outcome such as diseased or not diseased. Recently large-scale studies have been conducted to assess agreement between many raters. Measures of agreement using the class of generalized linear mixed models were implemented efficiently in four recently introduced R and SAS packages in large-scale agreement studies incorporating binary classifications. Simulation studies were conducted to compare the performance across the packages and apply the agreement methods to two cancer studies