255 research outputs found

    Combining isotonic regression and EM algorithm to predict genetic risk under monotonicity constraint

    In certain genetic studies, clinicians and genetic counselors are interested in estimating the cumulative risk of a disease for individuals with and without a rare deleterious mutation. Estimating the cumulative risk is difficult, however, when the estimates are based on family history data. Often, the genetic mutation status in many family members is unknown; instead, only estimated probabilities of a patient having a certain mutation status are available. Also, ages of disease onset are subject to right censoring. Existing methods to estimate the cumulative risk using such family-based data only provide estimation at individual time points, and are not guaranteed to be monotonic or nonnegative. In this paper, we develop a novel method that combines Expectation-Maximization and isotonic regression to estimate the cumulative risk across the entire support. Our estimator is monotonic, satisfies self-consistent estimating equations and has high power in detecting differences between the cumulative risks of different populations. Application of our estimator to a Parkinson's disease (PD) study provides the age-at-onset distribution of PD in PARK2 mutation carriers and noncarriers, and reveals a significant difference between the distribution in compound heterozygous carriers compared to noncarriers, but not between heterozygous carriers and noncarriers. Comment: Published at http://dx.doi.org/10.1214/14-AOAS730 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Heap: a command for estimating discrete outcome variable models in the presence of heaping at known points

    Self-reported survey data are often plagued by the presence of heaping. Accounting for this measurement error is crucial for the identification and consistent estimation of the underlying model (parameters) from such data. In this article, we introduce two commands. The first command, heapmph, estimates the parameters of a discrete-time mixed proportional hazard model with gamma unobserved heterogeneity, allowing for fixed and individual-specific censoring and different-sized heap points. The second command, heapop, extends the framework to ordered choice outcomes, subject to heaping. We also provide suitable specification tests.

    On Estimation and Inference under Order Restrictions.

    The aim of statistical analysis and inference is to draw meaningful conclusions. In the case where there is prior knowledge of stochastic orderings or inequalities, it is desirable to incorporate this information in the estimation. This avoids possible unrealistic estimates, and may also lead to a gain in efficiency. In this dissertation we first present the constrained nonparametric maximum likelihood estimator (C-NPMLE) of the survivor functions in one- and two-sample settings. Dykstra (1982) also considered the C-NPMLE for such problems; however, as we show, Dykstra's method has an error and does not always give the C-NPMLE. We corrected this error, and simulation shows an improvement in efficiency compared to Dykstra's estimator. Confidence intervals based on bootstrap methods are proposed. Uniqueness and consistency of the proposed estimators are established. Second, we propose a new estimator, the pointwise C-NPMLE, which is defined at each time t by the estimates of the survivor functions subject to constraints at t only. The estimator is shown to be non-increasing in t, and the consistency and the asymptotic distribution of the estimators are presented. In the development of this estimator and the characterization of its properties, we transform the problem into one that uses the profile likelihood; we adapt the pool-adjacent-violators algorithm, in which pooling is defined in a special way. Different methods to construct confidence intervals are also proposed. The estimator is shown to have good properties compared to other potential estimators. Finally, we propose a new method to construct confidence intervals (CIs) for G independent normal means under the linear ordering constraint. The method is based on defining intermediate random variables that are related to the original observations and using the CIs of the means of these intermediate random variables to restrict the original CIs from the separate groups. This method is extended to the case with three or more groups, and the simulation studies show that the proposed CIs have coverage rates close to nominal levels with reduced average widths. Ph.D., Biostatistics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/89689/1/yongpark_1.pd

    A Simple GMM Estimator for the Semi-Parametric Mixed Proportional Hazard Model

    Ridder and Woutersen (2003) have shown that under a weak condition on the baseline hazard there exist root-N consistent estimators of the parameters in a semiparametric Mixed Proportional Hazard model with a parametric baseline hazard and unspecified distribution of the unobserved heterogeneity. We extend the Linear Rank Estimator (LRE) of Tsiatis (1990) and Robins and Tsiatis (1991) to this class of models. The optimal LRE is a two-step estimator. We propose a simple first-step estimator that is close to optimal if there is no unobserved heterogeneity. The efficiency gain associated with the optimal LRE increases with the degree of unobserved heterogeneity. Keywords: mixed proportional hazard, linear rank estimation, counting process

    Marginal Proportional Hazards Models for Clustered Interval-Censored Data with Time-Dependent Covariates

    The Botswana Combination Prevention Project was a cluster-randomized HIV prevention trial whose follow-up period coincided with Botswana's national adoption of a universal test-and-treat strategy for HIV management. Of interest is whether, and to what extent, this change in policy (i) modified the observed preventative effects of the study intervention and (ii) was associated with a reduction in the population-level incidence of HIV in Botswana. To address these questions, we propose a stratified proportional hazards model for clustered interval-censored data with time-dependent covariates and develop a composite expectation maximization algorithm that facilitates estimation of model parameters without placing parametric assumptions on either the baseline hazard functions or the within-cluster dependence structure. We show that the resulting estimators for the regression parameters are consistent and asymptotically normal. We also propose and provide theoretical justification for the use of the profile composite likelihood function to construct a robust sandwich estimator for the variance. We characterize the finite-sample performance and robustness of these estimators through extensive simulation studies. Finally, we conclude by applying this stratified proportional hazards model to a re-analysis of the Botswana Combination Prevention Project, with the national adoption of a universal test-and-treat strategy now modeled as a time-dependent covariate.

    A simple GMM estimator for the semi-parametric mixed proportional hazard model

    Ridder and Woutersen (2003) have shown that under a weak condition on the baseline hazard, there exist root-N consistent estimators of the parameters in a semiparametric Mixed Proportional Hazard model with a parametric baseline hazard and unspecified distribution of the unobserved heterogeneity. We extend the Linear Rank Estimator (LRE) of Tsiatis (1990) and Robins and Tsiatis (1991) to this class of models. The optimal LRE is a two-step estimator. We propose a simple one-step estimator that is close to optimal if there is no unobserved heterogeneity. The efficiency gain associated with the optimal LRE increases with the degree of unobserved heterogeneity.