490 research outputs found
Partially observed information and inference about non-Gaussian mixed linear models
In mixed linear models with nonnormal data, the Gaussian Fisher information
matrix is called a quasi-information matrix (QUIM). The QUIM plays an important
role in evaluating the asymptotic covariance matrix of the estimators of the
model parameters, including the variance components. Traditionally, there are
two ways to estimate the information matrix: the estimated information matrix
and the observed one. Because the analytic form of the QUIM involves parameters
other than the variance components, for example, the third and fourth moments
of the random effects, the estimated QUIM is not available. On the other hand,
because of the dependence and nonnormality of the data, the observed QUIM is
inconsistent. We propose an estimator of the QUIM that consists partially of an
observed form and partially of an estimated one. We show that this estimator is
consistent and computationally very easy to operate. The method is used to
derive large sample tests of statistical hypotheses that involve the variance
components in a non-Gaussian mixed linear model. Finite sample performance of
the test is studied by simulations and compared with the delete-group jackknife
method that applies to a special case of non-Gaussian mixed linear models.
Comment: Published at http://dx.doi.org/10.1214/009053605000000543 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
The subset argument and consistency of MLE in GLMM: Answer to an open problem and beyond
We give an answer to an open problem regarding consistency of the maximum
likelihood estimators (MLEs) in generalized linear mixed models (GLMMs)
involving crossed random effects. The solution to the open problem introduces
an interesting, nonstandard approach to proving consistency of the MLEs in
cases of dependent observations. Using the new technique, we extend the results
to MLEs under a general GLMM. An example is used to further illustrate the
technique.
Comment: Published at http://dx.doi.org/10.1214/13-AOS1084 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
Mean squared error of empirical predictor
The term "empirical predictor" refers to a two-stage predictor of a linear
combination of fixed and random effects. In the first stage, a predictor is
obtained but it involves unknown parameters; thus, in the second stage, the
unknown parameters are replaced by their estimators. In this paper, we consider
mean squared errors (MSE) of empirical predictors under a general setup, where
ML or REML estimators are used for the second stage. We obtain a second-order
approximation to the MSE as well as an estimator of the MSE correct to the same
order. The general results are applied to mixed linear models to obtain a
second-order approximation to the MSE of the empirical best linear unbiased
predictor (EBLUP) of a linear mixed effect and an estimator of the MSE of EBLUP
whose bias is correct to second order. The general mixed linear model includes
the mixed ANOVA model and the longitudinal model as special cases.
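The two-stage construction described in this abstract can be written compactly. The notation below is an assumption introduced for illustration (it does not appear in the abstract): $\zeta$ is the linear combination of fixed and random effects being predicted, $\theta$ is the vector of unknown parameters, and $\hat\theta$ is its ML or REML estimator.

```latex
% Stage 1: a predictor that still depends on the unknown parameters \theta
\tilde\zeta = \tilde\zeta(\theta)

% Stage 2: the empirical predictor, with \theta replaced by its estimator
\hat\zeta = \tilde\zeta(\hat\theta)

% Quantity being approximated to second order
\mathrm{MSE}(\hat\zeta) = \mathrm{E}\,(\hat\zeta - \zeta)^2
```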
Fence methods for mixed model selection
Many model search strategies involve trading off model fit with model
complexity in a penalized goodness of fit measure. Asymptotic properties for
these types of procedures in settings like linear regression and ARMA time
series have been studied, but these do not naturally extend to nonstandard
situations such as mixed effects models, where a simple definition of the sample
size is not meaningful. This paper introduces a new class of strategies, known
as fence methods, for mixed model selection, which includes linear and
generalized linear mixed models. The idea involves a procedure to isolate a
subgroup of what are known as correct models (of which the optimal model is a
member). This is accomplished by constructing a statistical fence, or barrier,
to carefully eliminate incorrect models. Once the fence is constructed, the
optimal model is selected from among those within the fence according to a
criterion which can be made flexible. In addition, we propose two variations of
the fence. The first is a stepwise procedure to handle situations of many
predictors; the second is an adaptive approach for choosing a tuning constant.
We give sufficient conditions for consistency of the fence and its variations, a
desirable property for a good model selection procedure. The methods are
illustrated through simulation studies and real data analysis.
Comment: Published at http://dx.doi.org/10.1214/07-AOS517 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
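The fence idea in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: a model M is kept if its lack-of-fit Q(M) is within c standard deviations of the best-fitting model's Q, and the selection criterion inside the fence is assumed to be "smallest dimension, ties broken by fit". The `Candidate` class, the `sd_diff` input, and the numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    lack_of_fit: float  # Q(M): smaller means better fit
    dimension: int      # number of parameters in M
    sd_diff: float      # assumed estimate of the sd of Q(M) - Q(best-fitting model)


def fence_select(candidates, c=1.0):
    """Return (chosen, inside): the selected model and all models inside the fence."""
    if not candidates:
        raise ValueError("need at least one candidate model")
    # The best-fitting model defines the baseline of the fence.
    baseline = min(candidates, key=lambda m: m.lack_of_fit)
    # Keep every model whose lack-of-fit is within the barrier.
    inside = [m for m in candidates
              if m.lack_of_fit <= baseline.lack_of_fit + c * m.sd_diff]
    # Assumed criterion: smallest dimension, then smallest lack-of-fit.
    chosen = min(inside, key=lambda m: (m.dimension, m.lack_of_fit))
    return chosen, inside


# Hypothetical candidates for illustration only.
models = [
    Candidate("x1", 58.0, 2, 4.0),
    Candidate("x1+x2", 41.5, 3, 2.0),
    Candidate("x1+x2+x3", 40.2, 4, 1.5),
    Candidate("x1+x2+x3+x4", 40.0, 5, 0.0),
]
chosen, inside = fence_select(models, c=1.0)
print(chosen.name, [m.name for m in inside])
```

The tuning constant `c` controls how wide the fence is: a larger value admits more (and simpler) models, which is why the abstract proposes an adaptive way to choose it.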
Reliability of environmental sampling culture results using the negative binomial intraclass correlation coefficient.
The Intraclass Correlation Coefficient (ICC) is commonly used to estimate the similarity between quantitative measures obtained from different sources. Overdispersed data are traditionally transformed so that a linear mixed model (LMM) based ICC can be estimated. A common transformation is the natural logarithm. The reliability of environmental sampling of fecal slurry on freestall pens has been estimated for Mycobacterium avium subsp. paratuberculosis using the natural logarithm transformed culture results. Recently, the negative binomial ICC was defined based on a generalized linear mixed model for negative binomial distributed data. The current study reports on the negative binomial ICC estimate, which includes fixed effects, using culture results of environmental samples. Simulations using a wide variety of inputs and negative binomial distribution parameters (r; p) showed better performance of the new negative binomial ICC compared to the ICC based on LMM, even when the negative binomial data were logarithm and square root transformed. A second comparison that targeted a wider range of ICC values showed that the mean of the estimated ICC closely approximated the true ICC.
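For orientation, the traditional LMM-based ICC that this abstract uses as a comparator can be sketched as the ratio of between-source variance to total variance. The code below is a simplified illustration, not the negative binomial ICC the study reports: it assumes a balanced one-way layout with no fixed effects and uses ANOVA moment estimators. The function name and the example counts are hypothetical.

```python
import math
from statistics import mean


def anova_icc(groups):
    """One-way ANOVA estimate of the ICC for balanced groups of equal size k."""
    n = len(groups)
    k = len(groups[0]) if groups else 0
    if n < 2 or k < 2 or any(len(g) != k for g in groups):
        raise ValueError("need at least two balanced groups of size >= 2")
    grand = mean([x for g in groups for x in g])
    group_means = [mean(g) for g in groups]
    # Mean square between groups and mean square within groups.
    msb = k * sum((m - grand) ** 2 for m in group_means) / (n - 1)
    msw = sum((x - m) ** 2
              for g, m in zip(groups, group_means) for x in g) / (n * (k - 1))
    # Moment estimator of the between-group variance, truncated at zero.
    var_between = max((msb - msw) / k, 0.0)
    total = var_between + msw
    if total == 0:
        raise ValueError("all observations are identical; ICC is undefined")
    return var_between / total


# Hypothetical culture counts (two samples per source), log-transformed first.
counts = [[12, 15], [40, 55], [150, 110]]
log_counts = [[math.log(c) for c in g] for g in counts]
print(round(anova_icc(log_counts), 3))
```

The log transform here only works for strictly positive counts; handling zeros would need an additional assumption (such as an offset) that the abstract does not specify.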
High-dimensional genome-wide association study and misspecified mixed model analysis
We study the behavior of the restricted maximum likelihood (REML) estimator under
a misspecified linear mixed model (LMM) that has received much attention in
recent genome-wide association studies. The asymptotic analysis establishes
consistency of the REML estimator of the variance of the errors in the LMM, and
convergence in probability of the REML estimator of the variance of the random
effects in the LMM to a certain limit, which is equal to the true variance of
the random effects multiplied by the limiting proportion of the nonzero random
effects present in the LMM. The asymptotic results also establish the convergence
rate (in probability) of the REML estimators as well as a result regarding
convergence of the asymptotic conditional variance of the REML estimator. The
asymptotic results are fully supported by the results of empirical studies,
which include extensive simulation studies that compare the performance of the
REML estimator (under the misspecified LMM) with other existing methods.
Comment: 3 figures
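The two limits stated in this abstract can be summarized as follows. The symbols are assumptions introduced here for illustration: $\sigma_\varepsilon^2$ is the true error variance, $\sigma_\alpha^2$ the true variance of the (nonzero) random effects, $\omega$ the limiting proportion of nonzero random effects, and hats denote the REML estimators under the misspecified LMM.

```latex
\hat\sigma_\varepsilon^2 \xrightarrow{\;P\;} \sigma_\varepsilon^2,
\qquad
\hat\sigma_\alpha^2 \xrightarrow{\;P\;} \omega\,\sigma_\alpha^2 .

% Illustrative arithmetic with made-up values:
% if \omega = 0.1 and \sigma_\alpha^2 = 0.5, the limit of
% \hat\sigma_\alpha^2 is 0.1 \times 0.5 = 0.05.
```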
The Environmental Impact of Plastic Waste
The pollution caused by disposable plastic products is becoming increasingly serious, and “plastic limit” has become a global consensus. This article discusses the pollution problem from the following aspects: integrating all relevant important indicators to establish a multiple regression model of the maximum amount of disposable plastic waste, in order to estimate the maximum amount of disposable waste that can be generated in the future without causing further damage to the environment; establishing an environmental safety level evaluation model and analyzing the impact of plastic waste on environmental safety; attempting to set the lowest level target that global waste can achieve at this stage, and conducting a correlation analysis of the impact of humans, enterprises, and the environment; and selecting several countries based on their comprehensive strengths, conducting a comparative analysis of their plastic production, economic strength, and environment, and exploring their responsibilities.
Robust Estimation Of Multivariate Failure Data With Time-Modulated Frailty
A time-modulated frailty model is proposed for analyzing multivariate failure data. The effect of frailties, which may not be constant over time, is discussed. We assume a parametric model for the baseline hazard, but avoid the parametric assumption for the frailty distribution. The well-known connection between survival times and the Poisson regression model is used. The parameters of interest are estimated by generalized estimating equations (GEE) or by penalized GEE. Simulation studies show that the procedure successfully detects the effect of time-modulated frailty. The method is also applied to a placebo-controlled randomized clinical trial of gamma interferon, a study of chronic granulomatous disease (CGD).
Iterative estimating equations: Linear convergence and asymptotic properties
We propose an iterative estimating equations procedure for analysis of
longitudinal data. We show that, under very mild conditions, the probability
that the procedure converges at an exponential rate tends to one as the sample
size increases to infinity. Furthermore, we show that the limiting estimator is
consistent and asymptotically efficient, as expected. The method applies to
semiparametric regression models with unspecified covariances among the
observations. In the special case of linear models, the procedure reduces to
iterative reweighted least squares. Finite sample performance of the procedure
is studied by simulations, and compared with other methods. A numerical example
from a medical study is considered to illustrate the application of the method.
Comment: Published at http://dx.doi.org/10.1214/009053607000000208 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
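The alternating structure behind an iterative estimating equations procedure in the linear case can be sketched as iteratively reweighted least squares. The code below is a deliberately simplified illustration, not the procedure from the paper: it assumes a single covariate, a scalar coefficient `beta`, and a diagonal covariance with one unknown variance per measurement occasion. The function name, the simulated data, and all parameter values are hypothetical.

```python
import random


def iee_diagonal(x, y, tol=1e-10, max_iter=200):
    """Alternate between weighted least squares for beta and moment updates of
    the occasion-specific variances v[t].

    x, y: lists of per-subject lists (n subjects, T occasions each).
    Assumed model: y[i][t] = beta * x[i][t] + e[i][t], Var(e[i][t]) = v[t].
    Returns (beta, v, iterations_used).
    """
    n, T = len(y), len(y[0])
    v = [1.0] * T  # equal weights first, so the first step is ordinary least squares
    beta = 0.0
    for iteration in range(1, max_iter + 1):
        # Step 1: weighted least squares for beta given the current variances.
        num = sum(x[i][t] * y[i][t] / v[t] for i in range(n) for t in range(T))
        den = sum(x[i][t] ** 2 / v[t] for i in range(n) for t in range(T))
        new_beta = num / den
        # Step 2: moment estimate of each variance from the current residuals.
        # (A perfect fit at some occasion would give v[t] == 0; not handled here.)
        v = [sum((y[i][t] - new_beta * x[i][t]) ** 2 for i in range(n)) / n
             for t in range(T)]
        if abs(new_beta - beta) < tol:
            return new_beta, v, iteration
        beta = new_beta
    return beta, v, max_iter


# Simulated data for illustration: true beta = 2, noise sd differs by occasion.
rng = random.Random(0)
sds = [0.5, 1.0, 2.0]
x = [[rng.uniform(0.5, 1.5) for _ in sds] for _ in range(200)]
y = [[2.0 * xi + rng.gauss(0.0, s) for xi, s in zip(row, sds)] for row in x]
beta_hat, v_hat, n_iter = iee_diagonal(x, y)
print(beta_hat, v_hat, n_iter)
```

Each pass reuses the latest variance estimates as weights, so noisier occasions are progressively downweighted; a full unstructured covariance would replace the per-occasion variances with a matrix and the scalar division with a linear solve.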