Combining isotonic regression and EM algorithm to predict genetic risk under monotonicity constraint
In certain genetic studies, clinicians and genetic counselors are interested
in estimating the cumulative risk of a disease for individuals with and without
a rare deleterious mutation. Estimating the cumulative risk is difficult,
however, when the estimates are based on family history data. Often, the
genetic mutation status in many family members is unknown; instead, only
estimated probabilities of a patient having a certain mutation status are
available. Also, ages of disease-onset are subject to right censoring. Existing
methods to estimate the cumulative risk using such family-based data only
provide estimation at individual time points, and are not guaranteed to be
monotonic or nonnegative. In this paper, we develop a novel method that
combines Expectation-Maximization and isotonic regression to estimate the
cumulative risk across the entire support. Our estimator is monotonic,
satisfies self-consistent estimating equations and has high power in detecting
differences between the cumulative risks of different populations. Application
of our estimator to a Parkinson's disease (PD) study provides the age-at-onset
distribution of PD in PARK2 mutation carriers and noncarriers, and reveals a
significant difference between the distribution in compound heterozygous
carriers compared to noncarriers, but not between heterozygous carriers and
noncarriers. Comment: Published at http://dx.doi.org/10.1214/14-AOAS730 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Heap : a command for estimating discrete outcome variable models in the presence of heaping at known points
Self-reported survey data are often plagued by the presence of heaping. Accounting for this measurement error is crucial for the identification and consistent estimation of the underlying model (parameters) from such data. In this article, we introduce two commands. The first command, heapmph, estimates the parameters of a discrete-time mixed proportional hazard model with gamma unobserved heterogeneity, allowing for fixed and individual-specific censoring and different-sized heap points. The second command, heapop, extends the framework to ordered choice outcomes subject to heaping. We also provide suitable specification tests.
On Estimation and Inference under Order Restrictions.
The aim of statistical analysis and inference is to draw meaningful conclusions. In the case where there is prior knowledge of stochastic orderings or inequalities, it is desirable to incorporate this information in the estimation. This avoids possible unrealistic estimates, and may also lead to gain in efficiency.
In this dissertation we first present the constrained nonparametric maximum likelihood estimator (C-NPMLE) of the survivor functions in one- and two-sample settings. Dykstra (1982) also considered the C-NPMLE for such problems; however, as we show, Dykstra's method contains an error and does not always give the C-NPMLE. We correct this error, and simulations show an improvement in efficiency compared to Dykstra's estimator. Confidence intervals based on bootstrap methods are proposed. Uniqueness and consistency of the proposed estimators are established.
Second, we propose a new estimator, the pointwise C-NPMLE, which is defined at each time t by the estimates of the survivor functions subject to constraints at t only. The estimator is shown to be non-increasing in t, and the consistency and the asymptotic distribution of the estimators are presented. In the development of this estimator and the characterization of its properties, we transform the problem into one that uses the profile likelihood; we adapt the pool-adjacent-violators algorithm, in which pooling is defined in a special way. Different methods to construct confidence intervals are also proposed. The estimator is shown to have good properties compared to other potential estimators.
Finally, we propose a new method to construct confidence intervals (CIs) for G independent normal means under the linear ordering constraint. The method is based on defining intermediate random variables that are related to the original observations and using the CIs of the means of these intermediate random variables to restrict the original CIs from the separate groups. This method is extended to the case with three or more groups and the simulation studies show that the proposed CIs have coverage rates close to nominal levels with reduced average widths.
Ph.D., Biostatistics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/89689/1/yongpark_1.pd
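For orientation, the pool-adjacent-violators algorithm that the dissertation abstract above says it adapts can be sketched in its standard textbook form. This is only the generic weighted least-squares version for a non-decreasing fit; it is not the dissertation's specially defined pooling, and the function name `pava` and its arguments are illustrative assumptions.

```python
def pava(values, weights=None):
    """Generic pool-adjacent-violators sketch (hypothetical helper).

    Returns the weighted least-squares non-decreasing fit to `values`.
    """
    if weights is None:
        weights = [1.0] * len(values)
    # Each block holds [weighted mean, total weight, number of points pooled].
    blocks = []
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Pool backwards while the last two blocks violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    # Expand each block back to one fitted value per original point.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


print(pava([1, 3, 2, 4]))  # [1.0, 2.5, 2.5, 4.0]
```

A non-increasing fit, as would be needed for a survivor function, can be obtained from the same sketch by negating the inputs and then negating the fitted values.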
A Simple GMM Estimator for the Semi-Parametric Mixed Proportional Hazard Model
Ridder and Woutersen (2003) have shown that under a weak condition on the baseline hazard there exist root-N consistent estimators of the parameters in a semiparametric Mixed Proportional Hazard model with a parametric baseline hazard and unspecified distribution of the unobserved heterogeneity. We extend the Linear Rank Estimator (LRE) of Tsiatis (1990) and Robins and Tsiatis (1991) to this class of models. The optimal LRE is a two-step estimator. We propose a simple first-step estimator that is close to optimal if there is no unobserved heterogeneity. The efficiency gain associated with the optimal LRE increases with the degree of unobserved heterogeneity.
Keywords: mixed proportional hazard, linear rank estimation, counting process
Marginal Proportional Hazards Models for Clustered Interval-Censored Data with Time-Dependent Covariates
The Botswana Combination Prevention Project was a cluster-randomized HIV prevention trial whose follow-up period coincided with Botswana's national adoption of a universal test-and-treat strategy for HIV management. Of interest is whether, and to what extent, this change in policy (i) modified the observed preventative effects of the study intervention and (ii) was associated with a reduction in the population-level incidence of HIV in Botswana. To address these questions, we propose a stratified proportional hazards model for clustered interval-censored data with time-dependent covariates and develop a composite expectation maximization algorithm that facilitates estimation of model parameters without placing parametric assumptions on either the baseline hazard functions or the within-cluster dependence structure. We show that the resulting estimators for the regression parameters are consistent and asymptotically normal. We also propose and provide theoretical justification for the use of the profile composite likelihood function to construct a robust sandwich estimator for the variance. We characterize the finite-sample performance and robustness of these estimators through extensive simulation studies. Finally, we conclude by applying this stratified proportional hazards model to a re-analysis of the Botswana Combination Prevention Project, with the national adoption of a universal test-and-treat strategy now modeled as a time-dependent covariate.
A simple GMM estimator for the semi-parametric mixed proportional hazard model
Ridder and Woutersen (2003) have shown that under a weak condition on the baseline hazard, there exist root-N consistent estimators of the parameters in a semiparametric Mixed Proportional Hazard model with a parametric baseline hazard and unspecified distribution of the unobserved heterogeneity. We extend the Linear Rank Estimator (LRE) of Tsiatis (1990) and Robins and Tsiatis (1991) to this class of models. The optimal LRE is a two-step estimator. We propose a simple one-step estimator that is close to optimal if there is no unobserved heterogeneity. The efficiency gain associated with the optimal LRE increases with the degree of unobserved heterogeneity.