7 research outputs found

    Bayesian Influence Diagnostic Methods for Parametric Regression Models

    The goals of assessing the influence of individual observations in statistical analysis are not only to identify influential observations such as outliers and high-leverage points, but also to determine the importance of each observation in the analysis for a better model fit. Thus, assessing the influence of individual observations on a model, choosing an appropriate dimensionality of a model and selecting the best model for a given dataset are important and highly relevant problems in any formal statistical analysis. Recently, Bayesian methodologies have received enormous attention in biomedical research because they can fit the vast array of complex models that modern data demand. As the demand for Bayesian data analysis and modeling increases, we need good diagnostic methods for model assessment and selection. In this dissertation, we develop Bayesian diagnostic measures based on case deletion to assess the influence of each observation on model fit and model complexity. First, we propose Bayesian case influence diagnostics for complex survival models. In detail, we develop case-deletion influence diagnostics for both the joint and marginal posterior distributions based on the Kullback-Leibler divergence. Second, we introduce three types of Bayesian case influence measures based on case deletion, namely the Φ-divergence, Cook's posterior mode distance and Cook's posterior mean distance, to evaluate the effects of deleting a set of observations in general Bayesian parametric models. We also examine the statistical properties of these three Bayesian case influence measures and their applications to the identification of influential sets and to model complexity. In any deletion diagnostic, the "size matters" issue persists; it is fundamental to influence analysis because the size of the deletion diagnostic is associated with the size of the perturbation. Cook's distance, in particular, is a monotonic function of the size of the perturbation. Thus, we develop a scaled version of Cook's distance to address the size issue for deletion diagnostics in general parametric models.
    Doctor of Philosophy
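
To make the case-deletion idea above concrete, here is a minimal sketch of a Kullback-Leibler case-deletion diagnostic estimated from posterior draws. It is not the dissertation's implementation. It relies on the standard identity that deleting observation i rescales the full posterior by 1/f(y_i | θ), so KL(p(θ | D) || p(θ | D without i)) = log E[1/f(y_i | θ)] + E[log f(y_i | θ)], with both expectations taken under the full posterior. The toy model (normal mean, known unit variance, flat prior), the function name `kl_case_deletion`, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def kl_case_deletion(log_lik):
    """KL(full posterior || case-deleted posterior) for each observation.

    log_lik: array of shape (S, n) holding log f(y_i | theta_s) for
    S posterior draws and n observations (hypothetical input layout).
    """
    n_draws = log_lik.shape[0]
    neg = -log_lik
    # Stable log of the posterior mean of 1 / f(y_i | theta) (log-sum-exp).
    shift = neg.max(axis=0)
    log_mean_inv = shift + np.log(np.exp(neg - shift).sum(axis=0)) - np.log(n_draws)
    # Add the posterior mean of log f(y_i | theta).
    return log_mean_inv + log_lik.mean(axis=0)

# Toy data (assumed): 30 well-behaved points plus one planted outlier at index 30.
rng = np.random.default_rng(0)
y = np.append(rng.normal(0.0, 1.0, size=30), 6.0)
n = y.size

# Flat prior, known sigma = 1: the posterior of the mean is N(ybar, 1/n).
theta = rng.normal(y.mean(), 1.0 / np.sqrt(n), size=5000)

# Pointwise log-likelihood matrix, shape (draws, observations).
log_lik = -0.5 * np.log(2.0 * np.pi) - 0.5 * (y[None, :] - theta[:, None]) ** 2

kl = kl_case_deletion(log_lik)
most_influential = int(np.argmax(kl))
print(most_influential, kl[most_influential])
```

The estimator only needs the pointwise log-likelihood matrix, so the model does not have to be refitted once per deleted case. The importance-weight average can be unstable when a single observation dominates the posterior, which is one reason to treat this as a sketch rather than a production diagnostic.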

    The Generalized Method of Moments for Mixture and Mixed Models

    Mixture models can be found in a wide variety of statistical applications. However, undertaking statistical inference in mixture models, especially non-parametric mixture models, can be challenging. A general, or non-parametric, mixture model effectively has an infinite-dimensional parameter space. In frequentist statistics, the maximum likelihood estimator with an infinite-dimensional parameter may not be consistent or efficient, in the sense that the Cramér-Rao bound is not attained even asymptotically. In Bayesian statistics, a prior on an infinite-dimensional space is not well defined and can be highly informative even with large amounts of data. In this thesis, we mainly consider mixture and mixed-effects models in which the mixing distribution is non-parametric. Following the dimensionality reduction idea in [Marriott, 2002], we propose a reparameterization-approximation framework with a complete orthonormal basis in a Hilbert space. The parameters in the reparameterized models are interpreted as the generalized moments of a mixing distribution. We consider different orthonormal bases, including the families of orthogonal polynomials and the eigenfunctions of positive self-adjoint integral operators. We also study the approximation errors of the truncation approximations of the reparameterized models in some special cases. The generalized moments in the truncated approximations of the reparameterized models have a natural parameter space, called the generalized moment space. We study the geometric properties of the generalized moment space and obtain two important geometric properties: the positive representation and the gradient characterization. The positive representation reveals the identifiability of the mixing distribution by its generalized moments and provides an upper bound on the number of support points of the mixing distribution.
The gradient characterization, in turn, provides the foundation for the class of gradient-based algorithms when the feasible set is the generalized moment space. Next, we aim to fit a non-parametric mixture model by a set of generalized moment conditions, which come from the proposed reparameterization-approximation procedure. We propose a new estimation method, called the generalized method of moments (GMM) for mixture models. The proposed estimation method involves minimizing a quadratic objective function over the generalized moment space. The proposed estimators can be easily computed through gradient-based algorithms. We show the convergence rate of the mean squared error of the proposed estimators as the sample size goes to infinity. Moreover, we design the quadratic objective function to ensure that the proposed estimators are robust to outliers. Compared to other existing estimation methods for mixture models, the GMM for mixture models is more computationally friendly and more robust to outliers. Lastly, we consider the hypothesis testing problem on the regression parameter in a mixed-effects model with univariate random effects. Through our new procedures, we obtain a series of estimating equations parameterized in the regression parameter and the generalized moments of the random-effects distribution. These parameters are estimated under the framework of the generalized method of moments. In the case that the number of generalized moments diverges with the sample size and the dimension of the regression parameter is fixed, we compute the convergence rate of the generalized method of moments estimators for mixed-effects models with univariate random effects. Since the regularity conditions in [Wilks, 1938] fail in our context, it is challenging to construct an asymptotically $\chi^2$ test statistic. We propose using ensemble inference, in which an asymptotically $\chi^2$ test statistic is constructed from a series of estimators obtained from the generalized estimating equations with different working correlation matrices.
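
For orientation, the two generic ingredients named in the abstract above can be written out in their textbook forms. The notation below is an assumption, not the thesis's own: $Q$ denotes the mixing distribution, $g_n$ a vector of sample moment conditions, $W$ a positive-definite weighting matrix, and $\mathcal{M}$ the feasible set (here, the generalized moment space). The specific basis functions and moment conditions of the thesis are not given in the abstract and are not reproduced here.

```latex
% Non-parametric mixture density with mixing distribution Q
f(x; Q) = \int f(x; \theta)\, \mathrm{d}Q(\theta)

% Generic GMM estimator: minimize a quadratic form in the sample
% moment conditions g_n over the feasible set \mathcal{M}
\hat{m} = \operatorname*{arg\,min}_{m \in \mathcal{M}} \; g_n(m)^{\top} W \, g_n(m)
```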

    Timely and reliable evaluation of the effects of interventions: a framework for adaptive meta-analysis (FAME)

    Most systematic reviews are retrospective and use aggregate data (AD) from publications, meaning they can be unreliable, lag behind therapeutic developments and fail to influence ongoing or new trials. Commonly, the potential influence of unpublished or ongoing trials is overlooked when interpreting results, or when determining the value of updating the meta-analysis or the need to collect individual participant data (IPD). Therefore, we developed a Framework for Adaptive Meta-analysis (FAME) to determine prospectively the earliest opportunity for reliable AD meta-analysis. We illustrate FAME using two systematic reviews in men with metastatic (M1) and non-metastatic (M0) hormone-sensitive prostate cancer (HSPC).

    A Semiparametric Bayesian to Poisson Mixed-Effects Model for Epileptics Data
