490 research outputs found

    Partially observed information and inference about non-Gaussian mixed linear models

    In mixed linear models with nonnormal data, the Gaussian Fisher information matrix is called a quasi-information matrix (QUIM). The QUIM plays an important role in evaluating the asymptotic covariance matrix of the estimators of the model parameters, including the variance components. Traditionally, there are two ways to estimate the information matrix: the estimated information matrix and the observed one. Because the analytic form of the QUIM involves parameters other than the variance components, for example, the third and fourth moments of the random effects, the estimated QUIM is not available. On the other hand, because of the dependence and nonnormality of the data, the observed QUIM is inconsistent. We propose an estimator of the QUIM that consists partially of an observed form and partially of an estimated one. We show that this estimator is consistent and computationally very easy to operate. The method is used to derive large sample tests of statistical hypotheses that involve the variance components in a non-Gaussian mixed linear model. Finite sample performance of the test is studied by simulations and compared with the delete-group jackknife method that applies to a special case of non-Gaussian mixed linear models. Comment: Published at http://dx.doi.org/10.1214/009053605000000543 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    The subset argument and consistency of MLE in GLMM: Answer to an open problem and beyond

    We give an answer to an open problem regarding consistency of the maximum likelihood estimators (MLEs) in generalized linear mixed models (GLMMs) involving crossed random effects. The solution to the open problem introduces an interesting, nonstandard approach to proving consistency of the MLEs in cases of dependent observations. Using the new technique, we extend the results to MLEs under a general GLMM. An example is used to further illustrate the technique. Comment: Published at http://dx.doi.org/10.1214/13-AOS1084 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Mean squared error of empirical predictor

    The term "empirical predictor" refers to a two-stage predictor of a linear combination of fixed and random effects. In the first stage, a predictor is obtained but it involves unknown parameters; thus, in the second stage, the unknown parameters are replaced by their estimators. In this paper, we consider mean squared errors (MSE) of empirical predictors under a general setup, where ML or REML estimators are used for the second stage. We obtain a second-order approximation to the MSE as well as an estimator of the MSE correct to the same order. The general results are applied to mixed linear models to obtain a second-order approximation to the MSE of the empirical best linear unbiased predictor (EBLUP) of a linear mixed effect and an estimator of the MSE of EBLUP whose bias is correct to second order. The general mixed linear model includes the mixed ANOVA model and the longitudinal model as special cases.
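The two-stage construction described in this abstract can be illustrated with a minimal sketch. It assumes a balanced one-way random effects model, y_ij = mu + a_i + e_ij, which is not taken from the paper, and it uses ANOVA-type moment estimators of the variance components as a simple stand-in for the ML or REML estimators the abstract refers to. The function name `eblup_one_way` is hypothetical.

```python
import numpy as np

def eblup_one_way(y):
    """Empirical predictor of mu + a_i in a balanced one-way random effects model.

    y: (k, m) array, k groups with m observations each (assumed layout).
    """
    y = np.asarray(y, dtype=float)
    k, m = y.shape
    group_means = y.mean(axis=1)
    grand_mean = y.mean()

    # Stage 2 ingredients: moment estimators of the variance components
    # (assumption: used here in place of ML/REML for simplicity).
    msb = m * np.sum((group_means - grand_mean) ** 2) / (k - 1)
    msw = np.sum((y - group_means[:, None]) ** 2) / (k * (m - 1))
    sigma_e2 = msw
    sigma_a2 = max(0.0, (msb - msw) / m)  # truncate at zero

    # Stage 1 form: with known variances, the predictor shrinks each group
    # mean toward the grand mean by gamma = sigma_a2 / (sigma_a2 + sigma_e2/m).
    denom = sigma_a2 + sigma_e2 / m
    gamma = sigma_a2 / denom if denom > 0 else 0.0

    # Plugging the estimated variances into the stage 1 form gives the
    # empirical predictor.
    return grand_mean + gamma * (group_means - grand_mean)

pred_no_noise = eblup_one_way([[1, 1], [2, 2], [3, 3]])
pred_no_signal = eblup_one_way([[1, 3], [1, 3], [1, 3]])
```

When the within-group variance is zero the predictor returns the group means unchanged, and when the estimated between-group variance is zero it collapses every group to the grand mean. The sketch does not attempt the second-order MSE approximation that is the subject of the paper.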

    Fence methods for mixed model selection

    Many model search strategies involve trading off model fit with model complexity in a penalized goodness of fit measure. Asymptotic properties for these types of procedures in settings like linear regression and ARMA time series have been studied, but these do not naturally extend to nonstandard situations such as mixed effects models, where a simple definition of the sample size is not meaningful. This paper introduces a new class of strategies, known as fence methods, for mixed model selection, which includes linear and generalized linear mixed models. The idea involves a procedure to isolate a subgroup of what are known as correct models (of which the optimal model is a member). This is accomplished by constructing a statistical fence, or barrier, to carefully eliminate incorrect models. Once the fence is constructed, the optimal model is selected from among those within the fence according to a criterion which can be made flexible. In addition, we propose two variations of the fence. The first is a stepwise procedure to handle situations of many predictors; the second is an adaptive approach for choosing a tuning constant. We give sufficient conditions for consistency of the fence and its variations, a desirable property for a good model selection procedure. The methods are illustrated through simulation studies and real data analysis. Comment: Published at http://dx.doi.org/10.1214/07-AOS517 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Reliability of environmental sampling culture results using the negative binomial intraclass correlation coefficient.

    The Intraclass Correlation Coefficient (ICC) is commonly used to estimate the similarity between quantitative measures obtained from different sources. Overdispersed data are traditionally transformed so that a linear mixed model (LMM) based ICC can be estimated. A common transformation is the natural logarithm. The reliability of environmental sampling of fecal slurry on freestall pens has been estimated for Mycobacterium avium subsp. paratuberculosis using the natural logarithm transformed culture results. Recently, the negative binomial ICC was defined based on a generalized linear mixed model for negative binomial distributed data. The current study reports on the negative binomial ICC estimate, which includes fixed effects, using culture results of environmental samples. Simulations using a wide variety of inputs and negative binomial distribution parameters (r; p) showed better performance of the new negative binomial ICC compared to the ICC based on LMM, even when the negative binomial data were logarithm and square root transformed. A second comparison that targeted a wider range of ICC values showed that the mean of the estimated ICC closely approximated the true ICC.
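As a point of reference for the traditional approach mentioned in this abstract, the following is a minimal sketch of an ICC computed on log-transformed counts. It assumes a balanced one-way layout and uses the classical one-way ANOVA estimator rather than a fitted LMM; the function name `anova_icc`, the example counts, and the use of log(1 + count) to cope with zeros are all assumptions made for illustration. It does not implement the negative binomial ICC, whose definition is not given in the abstract.

```python
import numpy as np

def anova_icc(values):
    """One-way ANOVA estimator of the ICC for a balanced design.

    values: (k, m) array, k sources with m repeated measures each.
    """
    values = np.asarray(values, dtype=float)
    k, m = values.shape
    group_means = values.mean(axis=1)
    grand_mean = values.mean()
    # Between-source and within-source mean squares.
    msb = m * np.sum((group_means - grand_mean) ** 2) / (k - 1)
    msw = np.sum((values - group_means[:, None]) ** 2) / (k * (m - 1))
    return (msb - msw) / (msb + (m - 1) * msw)

# Hypothetical culture counts: 4 sampling sites, 3 repeated samples each.
counts = np.array([[0, 2, 1],
                   [15, 22, 18],
                   [3, 5, 4],
                   [40, 35, 52]])
icc_log = anova_icc(np.log1p(counts))  # log(1 + count) is an assumed choice
```

The estimator equals 1 when repeated measures within each source are identical, and can be negative when the within-source variation exceeds the between-source variation.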

    High-dimensional genome-wide association study and misspecified mixed model analysis

    We study the behavior of the restricted maximum likelihood (REML) estimator under a misspecified linear mixed model (LMM) that has received much attention in recent genome-wide association studies. The asymptotic analysis establishes consistency of the REML estimator of the variance of the errors in the LMM, and convergence in probability of the REML estimator of the variance of the random effects in the LMM to a certain limit, which is equal to the true variance of the random effects multiplied by the limiting proportion of the nonzero random effects present in the LMM. The asymptotic results also establish the convergence rate (in probability) of the REML estimators as well as a result regarding convergence of the asymptotic conditional variance of the REML estimator. The asymptotic results are fully supported by the results of empirical studies, which include extensive simulation studies that compare the performance of the REML estimator (under the misspecified LMM) with other existing methods. Comment: 3 figures

    The Environmental Impact of Plastic Waste

    The pollution caused by disposable plastic products is becoming increasingly serious, and “plastic limit” has become a global consensus. This article discusses the pollution problem from four aspects: (1) integrating the relevant indicators into a multiple regression model to estimate the maximum amount of disposable plastic waste that can be generated in the future without causing further damage to the environment; (2) establishing an environmental safety level evaluation model and analyzing the impact of plastic waste on environmental safety; (3) setting the lowest target level that global waste can achieve at this stage and conducting a correlation analysis on the impact of humans, enterprises, and the environment; and (4) selecting several countries based on their comprehensive strengths, comparing their plastic production, economic strength, and environment, and exploring their responsibilities.

    Robust Estimation Of Multivariate Failure Data With Time-Modulated Frailty

    A time-modulated frailty model is proposed for analyzing multivariate failure data. The effect of frailties, which may not be constant over time, is discussed. We assume a parametric model for the baseline hazard, but avoid the parametric assumption for the frailty distribution. The well-known connection between survival times and the Poisson regression model is used. The parameters of interest are estimated by generalized estimating equations (GEE) or by penalized GEE. Simulation studies show that the procedure successfully detects the effect of time-modulated frailty. The method is also applied to a placebo-controlled randomized clinical trial of gamma interferon, a study of chronic granulomatous disease (CGD).

    Iterative estimating equations: Linear convergence and asymptotic properties

    We propose an iterative estimating equations procedure for analysis of longitudinal data. We show that, under very mild conditions, the probability that the procedure converges at an exponential rate tends to one as the sample size increases to infinity. Furthermore, we show that the limiting estimator is consistent and asymptotically efficient, as expected. The method applies to semiparametric regression models with unspecified covariances among the observations. In the special case of linear models, the procedure reduces to iterative reweighted least squares. Finite sample performance of the procedure is studied by simulations, and compared with other methods. A numerical example from a medical study is considered to illustrate the application of the method. Comment: Published at http://dx.doi.org/10.1214/009053607000000208 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
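The linear-model special case mentioned in this abstract can be sketched as follows. This is only an illustration of iterative reweighted least squares with an unspecified covariance, not the authors' procedure: it assumes balanced longitudinal data (every subject observed at the same T time points), and the function name `iterative_wls`, the array layout, and the simulated data are all hypothetical.

```python
import numpy as np

def iterative_wls(X, y, tol=1e-8, max_iter=100):
    """Alternate a weighted least squares update of beta with a moment
    estimate of the T x T within-subject covariance V.

    X: (n, T, p) covariates, y: (n, T) responses for n subjects.
    """
    n, T, p = X.shape
    V = np.eye(T)          # start from working independence (plain OLS)
    beta = np.zeros(p)
    for it in range(max_iter):
        Vinv = np.linalg.inv(V)
        # beta = (sum_i X_i' V^-1 X_i)^-1 sum_i X_i' V^-1 y_i
        A = np.einsum('itp,ts,isq->pq', X, Vinv, X)
        b = np.einsum('itp,ts,is->p', X, Vinv, y)
        beta_new = np.linalg.solve(A, b)
        # Re-estimate V from the current residuals (requires n > T).
        resid = y - X @ beta_new
        V = resid.T @ resid / n
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new, V, it + 1
        beta = beta_new
    return beta, V, max_iter

# Simulated example with an AR(1)-type covariance (assumed for illustration).
rng = np.random.default_rng(0)
n, T = 500, 4
beta_true = np.array([1.0, -2.0])
X = rng.normal(size=(n, T, 2))
idx = np.arange(T)
V_true = 0.5 ** np.abs(np.subtract.outer(idx, idx))
errors = rng.multivariate_normal(np.zeros(T), V_true, size=n)
y = X @ beta_true + errors

beta_hat, V_hat, n_iter = iterative_wls(X, y)
```

Starting from the identity matrix makes the first update ordinary least squares; later iterations reweight by the inverse of the covariance estimated from the residuals.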