Enhancing Statistical Inference of Generalized Linear Regression Models Under Data Uncertainty

Abstract

This dissertation addresses the challenges of data uncertainty in statistical inference. It focuses on two primary research topics: the impact of uncertainty on covariates (influential factors) and the impact of uncertainty on the response variable in generalized regression models. Data uncertainty brings opportunities as well as challenges for estimation and prediction. The first topic studied is data uncertainty affecting the response variable, particularly when data heterogeneity is present. To address this challenge, a finite mixture Weibull regression modeling method is proposed. This approach explicitly accounts for potential data heterogeneity by introducing a latent variable to represent sub-populations and by employing the Expectation-Maximization (EM) algorithm in the estimation of the latent variable. The impact of measurement error on covariates has been explored extensively in existing research. However, two significant challenges remain. The first is the presence of mixture error, where classical and Berkson errors coexist and introduce bias into model inference. The second is the computational and mathematical burden associated with generalized linear regression models. Overcoming these difficulties is important for improving the accuracy, efficiency, and interpretability of generalized linear regression models in the presence of measurement error. This dissertation investigates two specific types of generalized linear regression models: Poisson regression for count data modeling and Weibull regression for time-to-event data analysis. Both models are analyzed in the setting where covariates are affected by mixture error, and a new approach is proposed for each model to address the impact of that error.
For the Poisson regression model, an error-structure adapted quasi-likelihood estimation method is proposed, while for the Weibull regression model, an error-structure adapted Markov Chain Monte Carlo (MCMC) estimation method is proposed. Both methods are designed to address the challenges posed by mixture error. To demonstrate the effectiveness of the proposed methods, numerical case studies were conducted using data from a Valley Fever investigation and the Framingham Heart Study. The research presented in this dissertation provides a portfolio of solutions for improving the estimation and prediction of generalized linear regression in the presence of data uncertainty. The effectiveness and efficiency of the proposed methodologies are demonstrated and justified through numerical simulation case studies.

Release after 08/18/202
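
To make the notion of mixture error concrete, the following is a minimal simulation sketch and not the dissertation's method. It assumes one commonly used formulation in which a latent variable L drives both the true covariate X = L + U_b (Berkson part) and the observed covariate W = L + U_c (classical part), with all components normal. The function names, parameter values, and the plain Newton-Raphson Poisson fit are illustrative assumptions. Under these assumptions, a naive Poisson regression that uses W in place of X should yield a slope that is shrunk toward zero.

```python
import numpy as np


def simulate_mixture_error(n, sigma_l=1.0, sigma_b=0.5, sigma_c=0.5, seed=0):
    """Simulate a covariate with both Berkson and classical error (assumed model).

    latent L ~ N(0, sigma_l^2)
    true   X = L + U_b, U_b ~ N(0, sigma_b^2)   (Berkson component)
    obs.   W = L + U_c, U_c ~ N(0, sigma_c^2)   (classical component)
    """
    rng = np.random.default_rng(seed)
    latent = rng.normal(0.0, sigma_l, n)
    x_true = latent + rng.normal(0.0, sigma_b, n)
    w_obs = latent + rng.normal(0.0, sigma_c, n)
    return x_true, w_obs


def fit_poisson(x, y, n_iter=25):
    """Fit log E[y] = b0 + b1 * x by Newton-Raphson on the Poisson log-likelihood."""
    design = np.column_stack([np.ones_like(x), x])
    # Start at the intercept-only fit to keep the first Newton step small.
    beta = np.array([np.log(y.mean()), 0.0])
    for _ in range(n_iter):
        mu = np.exp(design @ beta)
        score = design.T @ (y - mu)
        info = design.T @ (design * mu[:, None])
        beta = beta + np.linalg.solve(info, score)
    return beta


def compare_fits(n=20000, beta0=0.5, beta1=0.3, seed=0):
    """Return (fit using true X, naive fit using error-prone W)."""
    x_true, w_obs = simulate_mixture_error(n, seed=seed)
    rng = np.random.default_rng(seed + 1)
    y = rng.poisson(np.exp(beta0 + beta1 * x_true)).astype(float)
    return fit_poisson(x_true, y), fit_poisson(w_obs, y)


if __name__ == "__main__":
    oracle, naive = compare_fits()
    print("slope using true X:    ", oracle[1])
    print("slope using observed W:", naive[1])
```

In this normal setup the naive slope is expected to be attenuated by roughly sigma_l^2 / (sigma_l^2 + sigma_c^2), which is the kind of bias that error-structure adapted estimators aim to correct.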
