On Robustness in some extended regression models
Generalized Linear Models extend classical regression models to
non-normal response variables and allows a non-linear relation
between the mean of the responses and the predictors. In addition,
when the responses are correlated or show overdispersion, one can
add a linear combination of random components to the linear
predictor. The resulting models are known as Generalized Linear
Mixed Models. Traditional estimation methods in these classes of
models rely on distributional assumptions about the random
components, as well as the implicit assumption that the
explanatory variables are uncorrelated with the error term. In
Chapters 2 and 3 we investigate, using the Change-of-Variance
Function, the behavior of the asymptotic variance-covariance
matrix of the class of M-estimators when the distribution of the
random components is slightly contaminated. In Chapter 4 we study
a different concept of robustness for classical models that
contain explanatory variables correlated with the error term. For
these models we propose an instrumental variables estimator and
study its robustness by means of its Influence Function.
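For reference, the Influence Function mentioned above is usually defined as follows. This is the standard textbook definition, not a formula taken from the thesis; the symbols T (the estimator viewed as a functional), F (the model distribution), and \delta_x (a point mass at x) are notation assumed here.

```latex
\mathrm{IF}(x; T, F) \;=\; \lim_{\varepsilon \downarrow 0}
  \frac{T\bigl((1-\varepsilon)F + \varepsilon\,\delta_x\bigr) - T(F)}{\varepsilon}
```

An estimator has a bounded Influence Function when this quantity stays bounded in x, so that a single outlying observation can shift the estimate by only a limited amount.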
We extend the definitions of Change-of-Variance Function to
Generalized Linear Models and Generalized Linear Mixed Models. We
use them to analyze in detail the sensitivity of the asymptotic
variance of the maximum likelihood estimator. For the first class
of models, we found that, in general, a contamination of the
distribution can seriously affect the asymptotic variance of the
estimators. For the second class, we focus on the Poisson-Gamma
model and two mixed-effects Binomial models. We found that the
effect of a contamination in the mixing distribution on the
asymptotic variance of the maximum likelihood estimator remains
bounded for both models. A simulation study was performed in all
cases to illustrate the relevance of our results.
Finally, we propose a robust instrumental variables estimator
based on high breakdown point S-estimators of location and
scatter. The resulting estimator has bounded Influence Function
and satisfies the usual asymptotic properties for suitable choices
of the S-estimator used. We also derive an estimate for the
asymptotic covariance matrix of our estimator which is robust
against outliers and leverage points. We illustrate our results
using a real data example.
Robust and Efficient Methods for Bayesian Finite Population Inference.
Bayesian model-based approaches provide data-driven estimates of population quantities of interest from complex survey data to achieve balance between bias correction and efficiency. We focus on the issue of accommodating sample weights equal to the inverse of the probabilities of inclusion. In settings with highly variable weights, weight "trimming" is often employed in an ad-hoc manner to decrease variance, while possibly increasing bias. We consider three model-based methods to provide principled bias-variance tradeoffs.
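The ad-hoc trimming described above can be illustrated with a minimal sketch. It is not the thesis's method: the function names, the cap value, and the toy data are all hypothetical, and the rescaling step is one common convention assumed here.

```python
def weighted_mean(y, w):
    """Weighted mean: sum(w_i * y_i) / sum(w_i)."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)


def trim_weights(w, cap):
    """Cap each weight at `cap`, then rescale so the total weight is unchanged.

    After rescaling, individual weights may exceed `cap` again; only the
    relative spread of the weights is reduced.
    """
    capped = [min(wi, cap) for wi in w]
    scale = sum(w) / sum(capped)
    return [c * scale for c in capped]


# Hypothetical toy sample: one unit has a very small inclusion probability.
incl_prob = [0.5, 0.5, 0.25, 0.02]
y = [1.0, 2.0, 3.0, 10.0]

# Sample weights are the inverses of the inclusion probabilities.
w = [1.0 / p for p in incl_prob]

full = weighted_mean(y, w)          # fully weighted estimate
w_trim = trim_weights(w, cap=10.0)  # hypothetical cap, chosen ad hoc
trimmed = weighted_mean(y, w_trim)  # less variable weights, possibly biased
```

In this toy sample the trimmed estimate moves away from the fully weighted one because the heavily weighted unit loses influence, which is the bias that trimming accepts in exchange for lower variance.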
Weighted estimators can be developed in a model-based framework by including interactions between the quantity of interest and the weights; weight pooling builds a variable selection model that drops interactions on various weight values; and estimation proceeds using the posterior distribution of model averages. The extension considers a weight pooling linear spline model that uses a linear spline to capture regression coefficient patterns for all strata, and collapses together the strata with minor differences. Our model achieves robustness when weights are needed to guard against model misspecification, and efficiency when weight-coefficient interactions can be ignored. We also model interactions between the weights and estimators of interest as random effects, reducing overall RMSEs by shrinking interactions toward zero when such shrinkage is supported by the data. We adapt a flexible Laplace prior distribution to gain robustness against model misspecification. We find that weight smoothing models with Laplace priors approximate unweighted estimates when weighting is not necessary, and can greatly reduce the RMSE if a strong pattern exists in the data in the linear model setting. Under logistic regression with the same sample size, the estimates are still robust, but with less gain in efficiency. Finally, we adapt a Dirichlet process mixture (DPM) model that can approximate highly skewed and multimodal distributions, often with few components. The extended weighted DPM version defines the DP prior as a mixture of DP random basis measures that is a function of covariates, extends applications to regression, and creates a natural link to survey weights. We also investigate its application to provide a new approach for quantile regression inference with complex survey design. Simulation results suggest a great reduction in RMSE from the weighted DPM method under most of the scenarios.
PhD, Biostatistics, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/111372/1/xiaxi_1.pd
- …