Applied sciences, including longitudinal and clustered studies in
biomedicine, increasingly require the analysis of ultra-high dimensional linear
mixed effects models, where important fixed effect variables must be selected
from a vast pool of available candidates. However, all existing literature
assumes that the available covariates and random effect components are
independent of the model error, an assumption that is often violated in
practice (endogeneity). In this paper,
we first investigate this important issue in ultra-high dimensional linear
mixed effects models with particular focus on the fixed effects selection. We
study the effects of different types of endogeneity on existing regularization
methods and prove their inconsistencies. Then, we propose a new profiled
focused generalized method of moments (PFGMM) approach to consistently select
fixed effects under 'error-covariate' endogeneity, i.e., in the presence of
correlation between the model error and covariates. Our proposal is proved to
be oracle consistent with probability tending to one and works well under most
other types of endogeneity as well. Additionally, we propose and illustrate a
few consistent parameter estimators, including those of the variance
components, along with variable selection through PFGMM. Empirical simulations
and an interesting real data example further support the claimed utility of our
proposal.

Comment: To appear in Statistica Sinica (2020)