51 research outputs found

    On Robustness in some extended regression models

    Generalized Linear Models extend classical regression models to non-normal response variables and allow a non-linear relation between the mean of the responses and the predictors. In addition, when the responses are correlated or show overdispersion, one can add a linear combination of random components to the linear predictor. The resulting models are known as Generalized Linear Mixed Models. Traditional estimation methods in these classes of models rely on distributional assumptions about the random components, as well as on the implicit assumption that the explanatory variables are uncorrelated with the error term. In Chapters 2 and 3 we investigate, using the Change-of-Variance Function, the behavior of the asymptotic variance-covariance matrix of the class of M-estimators when the distribution of the random components is slightly contaminated. In Chapter 4 we study a different concept of robustness for classical models that contain explanatory variables correlated with the error term. For these models we propose an instrumental variables estimator and study its robustness by means of its Influence Function. We extend the definition of the Change-of-Variance Function to Generalized Linear Models and Generalized Linear Mixed Models, and use it to analyze in detail the sensitivity of the asymptotic variance of the maximum likelihood estimator. For the first class of models, we found that, in general, a contamination of the distribution can seriously affect the asymptotic variance of the estimators. For the second class, we focus on the Poisson-Gamma model and two mixed-effects Binomial models. We found that the effect of a contamination in the mixing distribution on the asymptotic variance of the maximum likelihood estimator remains bounded for these models. A simulation study was performed in all cases to illustrate the relevance of our results.
Finally, we propose a robust instrumental variables estimator based on high breakdown point S-estimators of location and scatter. The resulting estimator has a bounded Influence Function and satisfies the usual asymptotic properties for suitable choices of the S-estimator used. We also derive an estimate of the asymptotic covariance matrix of our estimator that is robust against outliers and leverage points. We illustrate our results using a real data example.

    Robust and Efficient Methods for Bayesian Finite Population Inference.

    Bayesian model-based approaches provide data-driven estimates of population quantities of interest from complex survey data and aim to balance bias correction against efficiency. We focus on the issue of accommodating sample weights equal to the inverse of the probabilities of inclusion. In settings with highly variable weights, weight "trimming" is often employed in an ad-hoc manner to decrease variance, while possibly increasing bias. We consider three model-based methods that provide principled bias-variance tradeoffs. Weighted estimators can be developed in a model-based framework by including interactions between the quantity of interest and the weights; weight pooling builds a variable selection model that drops interactions on various weight values; and estimation proceeds using the posterior distribution of model averages. The extension considers a weight pooling linear spline model that uses a linear spline to capture regression coefficient patterns for all strata, and collapses together the strata with minor differences. Our model achieves robustness when weights are needed to guard against model misspecification, and efficiency when weight-coefficient interactions can be ignored. We also model interactions between the weights and estimators of interest as random effects, reducing overall RMSEs by shrinking interactions toward zero when such shrinkage is supported by the data. We adapt a flexible Laplace prior distribution to gain robustness against model misspecification. We find that weight smoothing models with Laplace priors approximate unweighted estimates when weighting is not necessary, and can greatly reduce the RMSE if a strong pattern exists in the data in the linear model setting. Under logistic regression with the same sample size, the estimates are still robust, but with less gain in efficiency. Finally, we adapt a Dirichlet process mixture (DPM) model that can approximate highly skewed and multimodal distributions, often with few components.
The extended weighted DPM version defines the DP prior as a mixture of DP random basis measures that is a function of covariates, extends applications to regression, and creates a natural link to survey weights. We also investigate its application to provide a new approach for quantile regression inference with complex survey design. Simulation results suggest a great reduction in RMSE from the weighted DPM method under most of the scenarios.
PhD, Biostatistics, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/111372/1/xiaxi_1.pd
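
The ad-hoc weight "trimming" that the abstract above contrasts with its model-based methods can be illustrated with a minimal sketch. This is not the thesis's method: the inclusion probabilities, outcomes, and cap value are hypothetical toy numbers, and the helper names (`weighted_mean`, `trim_weights`, `kish_effective_sample_size`) are made up for this illustration. It only shows how capping inverse-probability weights makes them less variable while shifting the estimate.

```python
def weighted_mean(values, weights):
    """Weighted mean of the outcomes: sum(w * y) / sum(w)."""
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


def trim_weights(weights, cap):
    """Cap each weight at `cap`, then rescale so the total weight is unchanged."""
    capped = [min(w, cap) for w in weights]
    scale = sum(weights) / sum(capped)
    return [w * scale for w in capped]


def kish_effective_sample_size(weights):
    """Kish's approximation (sum w)^2 / sum(w^2); smaller means more variable weights."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


# Hypothetical toy sample (assumption, not data from the thesis).
inclusion_probs = [0.5, 0.5, 0.25, 0.02]
outcomes = [10.0, 12.0, 11.0, 30.0]

# Sample weights are the inverse of the inclusion probabilities.
weights = [1.0 / p for p in inclusion_probs]

# Ad-hoc trimming with an arbitrary, hypothetical cap.
trimmed = trim_weights(weights, cap=10.0)

full_estimate = weighted_mean(outcomes, weights)
trimmed_estimate = weighted_mean(outcomes, trimmed)

print("fully weighted estimate:", full_estimate)
print("trimmed estimate:", trimmed_estimate)
print("effective sample size (full):", kish_effective_sample_size(weights))
print("effective sample size (trimmed):", kish_effective_sample_size(trimmed))
```

In this toy sample the single unit with a very large weight dominates the fully weighted estimate. Trimming raises the effective sample size, a rough proxy for lower variance, but pulls the estimate away from the fully weighted value, which is the bias-variance tradeoff the abstract describes.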