
    A Hybrid Newton-Type Method for the Linear Regression in Case-cohort Studies

    Case-cohort designs are increasingly used in large epidemiological cohort studies. Nan, Yu, and Kalbfleisch (2004) provided asymptotic results for censored linear regression models in case-cohort studies. In this article, we consider computational aspects of their proposed rank-based estimating methods. We show that the rank-based discontinuous estimating functions for case-cohort studies are monotone, a property established for cohort data in the literature, when generalized Gehan-type weights are used. Although the estimation problem can be formulated as a linear programming problem, as for cohort data, its scale quickly becomes unmanageable even for moderate sample sizes. We instead propose a Newton-type iterative method to search for an approximate root of the discontinuous monotone estimating function. Simulation results demonstrate the good performance of the proposed method.
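    The root-finding idea can be sketched generically: bracket the root of the monotone (possibly discontinuous) estimating function, take Newton-type steps using a finite-difference slope, and fall back to bisection whenever a step leaves the bracket. The one-dimensional `hybrid_root` helper and the step-function `psi` below are illustrative assumptions, not the authors' algorithm:

```python
import numpy as np

def hybrid_root(psi, lo, hi, tol=1e-8, max_iter=100, h=1e-4):
    """Approximate root of a monotone, possibly discontinuous estimating
    function psi on [lo, hi], where psi(lo) and psi(hi) bracket zero.
    Newton-type steps use a finite-difference slope; whenever a step
    would leave the current bracket, fall back to bisection.
    Generic illustration only, not the paper's exact algorithm."""
    f_lo, f_hi = psi(lo), psi(hi)
    if f_lo * f_hi > 0:
        raise ValueError("psi(lo) and psi(hi) must bracket zero")
    x = 0.5 * (lo + hi)
    for _ in range(max_iter):
        fx = psi(x)
        # Shrink the bracket using monotonicity.
        if fx * f_lo > 0:
            lo, f_lo = x, fx
        else:
            hi, f_hi = x, fx
        if hi - lo < tol:
            break
        # Finite-difference slope as a Newton surrogate; it is zero on
        # flat pieces of a step function, which triggers the fallback.
        slope = (psi(x + h) - psi(x - h)) / (2 * h)
        x_new = x - fx / slope if slope > 0 else None
        x = x_new if (x_new is not None and lo < x_new < hi) else 0.5 * (lo + hi)
    return 0.5 * (lo + hi)

# Example: a discontinuous monotone step function with root at 1.3.
psi = lambda b: np.sign(b - 1.3) * (1 + np.floor(abs(b - 1.3) * 5))
root = hybrid_root(psi, -5.0, 5.0)  # approximately 1.3
```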

    Semiparametric Regression Models with Missing Data: the Mathematics in the Work of Robins et al.

    This review is an attempt to understand the landmark papers of Robins, Rotnitzky, and Zhao (1994) and Robins and Rotnitzky (1992). We revisit their main results and corresponding proofs using the theory outlined in the monograph by Bickel, Klaassen, Ritov, and Wellner (1993). We also discuss an illustrative example to show the details of applying these theoretical results.
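    A central estimator in Robins, Rotnitzky, and Zhao (1994), for the mean of an outcome missing at random, is the augmented inverse probability weighted (AIPW) form, which combines a missingness model with an outcome regression. A minimal simulation sketch, assuming a known missingness probability and a linear working outcome model (all data-generating values below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50000
x = rng.normal(size=n)
y = 2.0 + 1.5 * x + rng.normal(size=n)       # true mean of y is 2.0
pi = 1 / (1 + np.exp(-(0.5 + x)))            # P(observed | x): missing at random
r = rng.binomial(1, pi)                      # r = 1 when y is observed

# Working outcome model m(x) = E[y | x], fit on complete cases only.
X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X[r == 1], y[r == 1], rcond=None)[0]
m = X @ beta

# AIPW estimate: IPW term plus an augmentation term built from m(x).
mu_aipw = np.mean(r * y / pi - (r - pi) / pi * m)
mu_cc = y[r == 1].mean()                     # naive complete-case mean (biased up)
```

The complete-case mean is biased because subjects with large x (hence large y) are more likely to be observed, while the AIPW estimate recovers the true mean and is doubly robust: it stays consistent if either the missingness model or the outcome model is correct.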

    Regression Calibration in Semiparametric Accelerated Failure Time Models

    In large cohort studies, it often happens that some covariates are expensive to measure and hence only measured on a validation set, whereas relatively cheap but error-prone measurements of the covariates are available for all subjects. The regression calibration (RC) estimation method (Prentice, 1982, Biometrika 69, 331–342) is a popular approach for analyzing such data and has been applied to the Cox model by Wang et al. (1997, Biometrics 53, 131–145) under normal measurement error and rare disease assumptions. In this article, we consider the RC estimation method for the semiparametric accelerated failure time model with covariates subject to measurement error. Asymptotic properties of the proposed method are investigated under a two-phase sampling scheme for validation data that are selected via stratified random sampling, resulting in observations that are neither independent nor identically distributed. We show that the estimates converge to well-defined parameters. In particular, unbiased estimation is feasible under additive normal measurement error models for normal covariates and under Berkson error models. The proposed method performs well in finite-sample simulation studies. We also apply the proposed method to a depression mortality study.
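    The basic RC recipe can be illustrated without censoring: fit a calibration model E[X | W] on the validation subset, impute the calibrated covariate for every subject, and fit the outcome model with the imputed values. A sketch under simplifying assumptions (a linear model for the log failure time with no censoring, a stand-in for the AFT setting; all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_val = 5000, 500
x = rng.normal(size=n)                        # true covariate
w = x + rng.normal(scale=0.7, size=n)         # error-prone surrogate, all subjects
log_t = 1.0 + 0.8 * x + rng.normal(scale=0.5, size=n)  # log failure time

# Calibration model E[x | w], fit on the validation subset where x is known
# (here, illustratively, the first n_val subjects).
val = np.arange(n_val)
A = np.column_stack([np.ones(n_val), w[val]])
gamma = np.linalg.lstsq(A, x[val], rcond=None)[0]
x_hat = gamma[0] + gamma[1] * w               # calibrated covariate for everyone

def slope(z):
    Z = np.column_stack([np.ones(n), z])
    return np.linalg.lstsq(Z, log_t, rcond=None)[0][1]

beta_naive = slope(w)      # attenuated toward zero by measurement error
beta_rc = slope(x_hat)     # approximately recovers the true slope 0.8
```

The naive fit is attenuated by the factor var(x)/var(w); substituting the calibrated covariate undoes that attenuation in this linear setting.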

    Marginal and Conditional Distribution Estimation from Double-Sampled Semi-Competing Risks Data

    Informative dropout is a vexing problem for any biomedical study. Most existing statistical methods attempt to correct estimation bias related to this phenomenon by specifying unverifiable assumptions about the dropout mechanism. We consider a cohort study in Africa that uses an outreach programme to ascertain the vital status of dropout subjects. These data can be used to identify a number of relevant distributions. However, as only a subset of dropout subjects were followed, vital status ascertainment was incomplete. We use semi-competing risk methods as our analysis framework to address this specific case where the terminal event is incompletely ascertained, and we consider various procedures for estimating the marginal distribution of dropout and the marginal and conditional distributions of survival. We also consider model selection and estimation efficiency in our setting. Performance of the proposed methods is demonstrated via simulations, asymptotic theory, and analysis of the study data.
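    Marginal distribution estimation from right-censored dropout times typically starts from the product-limit (Kaplan-Meier) estimator. A minimal generic implementation (a standard building block, not the paper's semi-competing risks procedure):

```python
import numpy as np

def kaplan_meier(time, event):
    """Product-limit estimate of the survival function at each observed
    event time.  time: follow-up times; event: 1 if the event (e.g.
    dropout) occurred, 0 if censored."""
    order = np.argsort(time)
    time, event = time[order], event[order]
    uniq = np.unique(time[event == 1])
    surv, s = [], 1.0
    for t in uniq:
        at_risk = np.sum(time >= t)                  # still under observation at t-
        d = np.sum((time == t) & (event == 1))       # events exactly at t
        s *= 1 - d / at_risk
        surv.append(s)
    return uniq, np.array(surv)

# Toy data: 8 subjects, mixture of events (1) and censorings (0).
t = np.array([2., 3., 3., 5., 8., 8., 9., 12.])
e = np.array([1, 1, 0, 1, 1, 0, 0, 1])
times, s = kaplan_meier(t, e)
```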

    Subgroup Identification Using the personalized Package

    A plethora of disparate statistical methods have been proposed for subgroup identification to help tailor treatment decisions for patients. However, a majority of them do not have corresponding R packages, and the few that do pertain to particular statistical methods or offer few means of evaluating whether meaningful subgroups have been found. Recently, the work of Chen, Tian, Cai, and Yu (2017) unified many of these subgroup identification methods into one general, consistent framework. The goal of the personalized package is to provide a corresponding unified software framework for subgroup identification analyses that provides not only estimation of subgroups but also evaluation of treatment effects within estimated subgroups. The personalized package allows for a variety of subgroup identification methods for many types of outcomes commonly encountered in medical settings. The package is built to incorporate the entire subgroup identification analysis pipeline, including propensity score diagnostics, subgroup estimation, analysis of the treatment effects within subgroups, and evaluation of identified subgroups. In this framework, different methods can be accessed with little change in the analysis code. Similarly, new methods can easily be incorporated into the package. Besides familiar statistical models, the package also allows flexible machine learning tools to be leveraged in subgroup identification. Further estimation improvements can be obtained via efficiency augmentation.
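    One simple member of this family of methods, in a randomized trial with treatment probability 1/2, is the modified-outcome idea: transform the outcome so that its conditional mean equals the treatment effect, regress the transformed outcome on covariates to obtain a benefit score, and recommend treatment where the predicted benefit is positive. A generic Python sketch of that idea only (not the personalized package's API, which is R; all simulated values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20000
x = rng.normal(size=n)
t = rng.binomial(1, 0.5, size=n)              # randomized treatment, P(T=1) = 1/2
delta = 1.0 * x                               # true effect: treatment helps when x > 0
y = 0.5 + delta * t + rng.normal(size=n)

# Modified outcome: with P(T=1) = 1/2, E[y_star | x] equals the treatment
# effect delta(x), so a regression of y_star on x estimates the benefit score.
y_star = 2 * (2 * t - 1) * y
X = np.column_stack([np.ones(n), x])
coef = np.linalg.lstsq(X, y_star, rcond=None)[0]
benefit = X @ coef
subgroup = benefit > 0                        # recommend treatment when benefit > 0

# Evaluate: observed treatment-arm contrast among those recommended treatment.
effect_in_subgroup = (y[subgroup & (t == 1)].mean()
                      - y[subgroup & (t == 0)].mean())
```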

    Prediction of Coronary Artery Disease Risk Based on Multiple Longitudinal Biomarkers

    In the last decade, few topics in the area of cardiovascular disease (CVD) research have received as much attention as risk prediction. One of the well-documented risk factors for CVD is high blood pressure (BP). Traditional CVD risk prediction models consider BP levels measured at a single time, and such models form the basis for current clinical guidelines for CVD prevention. However, in clinical practice, BP levels are often observed and recorded in a longitudinal fashion, and information on BP trajectories can be a powerful predictor of CVD events. We consider joint modeling of time to coronary artery disease and individual longitudinal measures of systolic and diastolic BP in a primary care cohort with up to 20 years of follow-up. We applied novel prediction metrics to assess the predictive performance of joint models. The predictive performance of the proposed joint models and competing models was assessed via simulations and illustrated using the primary care cohort.
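    Discrimination of a survival risk score is commonly summarized by a concordance index. A minimal implementation of Harrell's C (a standard summary metric, not the novel dynamic prediction metrics the article studies):

```python
import numpy as np

def concordance_index(time, event, risk):
    """Harrell's C: among usable pairs, the fraction where the subject
    with the higher predicted risk fails first (ties in risk count 1/2).
    A pair (i, j) is usable when i's failure is observed and j is still
    at risk beyond time[i]."""
    n = len(time)
    num = den = 0.0
    for i in range(n):
        if event[i] != 1:
            continue
        for j in range(n):
            if time[j] > time[i]:
                den += 1
                if risk[i] > risk[j]:
                    num += 1
                elif risk[i] == risk[j]:
                    num += 0.5
    return num / den

# Toy data: 5 subjects, one censored (event = 0).
t = np.array([5., 3., 9., 7., 2.])
e = np.array([1, 1, 0, 1, 1])
risk = np.array([0.3, 0.8, 0.1, 0.4, 0.9])
c = concordance_index(t, e, risk)
```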

    Joint Models for Multiple Longitudinal Processes and Time-to-event Outcome

    Joint models are statistical tools for estimating the association between time-to-event and longitudinal outcomes. One challenge to the application of joint models is their computational complexity. Common estimation methods for joint models include a two-stage method and Bayesian and maximum-likelihood methods. In this work, we consider joint models of a time-to-event outcome and multiple longitudinal processes and develop a maximum-likelihood estimation method using the expectation–maximization algorithm. We assess the performance of the proposed method via simulations and apply the methodology to a data set to determine the association between longitudinal systolic and diastolic blood pressure measures and time to coronary artery disease.
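    The two-stage method mentioned above can be sketched: first fit a per-subject trend to each longitudinal marker, then plug the estimated subject-level features into a survival model. A toy sketch, assuming a single marker, per-subject least-squares slopes, and an exponential hazard (the EM-based joint likelihood developed in the article is more involved; all simulated values are illustrative):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n, m = 300, 6                                  # subjects, visits per subject

# Stage 1: per-subject least-squares trend of the longitudinal marker.
true_slope = rng.normal(0.0, 1.0, size=n)
visits = np.linspace(0, 5, m)
slopes = np.empty(n)
for i in range(n):
    marker = 1.0 + true_slope[i] * visits + rng.normal(scale=0.3, size=m)
    slopes[i] = np.polyfit(visits, marker, 1)[0]

# Simulate exponential event times whose hazard depends on the true slope.
hazard = np.exp(-1.0 + 0.5 * true_slope)
t_event = rng.exponential(1 / hazard)
t_cens = rng.exponential(5.0, size=n)
time = np.minimum(t_event, t_cens)
event = (t_event <= t_cens).astype(float)

# Stage 2: exponential-hazard MLE using the estimated slopes as covariates.
def neg_loglik(theta):
    log_h = theta[0] + theta[1] * slopes
    return -np.sum(event * log_h - np.exp(log_h) * time)

theta_hat = minimize(neg_loglik, x0=np.zeros(2)).x  # near (-1.0, 0.5)
```

Plugging in estimated rather than true slopes introduces some attenuation; the joint-likelihood EM approach avoids this by modeling the longitudinal and event processes together.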