28 research outputs found

    Statistical learning methods for subgroup discovery with survival outcome

    In clinical trials, it is important to understand and characterize heterogeneity in disease and treatment response among patients so that precision medicine can target specific subsets of patients defined by baseline characteristics. Feature variables, such as demographic characteristics and genetic, genomic, and environmental information, combined with a patient's survival outcome, can be used to explore such latent heterogeneity. In the first project, we propose a mixture model to explore each patient's latent survival pattern, where the mixing probabilities for latent groups are modeled through a multinomial distribution. The Bayesian information criterion (BIC) is used to select the number of latent groups. Furthermore, we incorporate variable selection with the adaptive lasso into inference so that only a few feature variables are selected to characterize the latent heterogeneity. We show that our adaptive lasso estimator has oracle properties when the number of parameters diverges with the sample size. The finite sample performance is evaluated by simulation studies under different scenarios, and the proposed method is illustrated with data from a breast cancer clinical trial (IBCSG) and data from an assay of free light chain.
In the second project, we develop a mixture survival tree model for direct risk classification. We assume that the patients can be classified into a pre-specified number of risk groups, where each group has a distinct survival profile. Our proposed tree-based method is devised to estimate latent group membership using the Expectation Maximization (EM) algorithm. The observed data log-likelihood function is used as the splitting criterion in recursive partitioning. We examine the monotone likelihood property of the proposed algorithm. The finite sample performance is evaluated by extensive simulation studies, and the proposed method is illustrated by a case study in breast cancer.
In the third project, we study the unobserved heterogeneity in patients' treatment response. We consider a semi-parametric approach to directly classify patients into different latent subgroups, where each subgroup of patients demonstrates a distinct average treatment effect. A random forest algorithm is developed to learn how the baseline covariates determine the unobserved heterogeneity in patients. In each individual tree, the EM algorithm is incorporated to handle the unobserved subgroup membership. The observed data log-likelihood function is used as the splitting criterion in recursive partitioning. A variable importance measure is derived to help identify important features related to subgroup membership assignment. We evaluate the numerical performance of our proposed random forest model via extensive simulation studies and provide an application to a Phase III randomized clinical trial in patients with hematological malignancies. Doctor of Philosophy

    Maternal and fetal genetic effects on birth weight and their relevance to cardio-metabolic risk factors.

    Birth weight variation is influenced by fetal and maternal genetic and non-genetic factors, and has been reproducibly associated with future cardio-metabolic health outcomes. In expanded genome-wide association analyses of own birth weight (n = 321,223) and offspring birth weight (n = 230,069 mothers), we identified 190 independent association signals (129 of which are novel). We used structural equation modeling to decompose the contributions of direct fetal and indirect maternal genetic effects, then applied Mendelian randomization to illuminate causal pathways. For example, both indirect maternal and direct fetal genetic effects drive the observational relationship between lower birth weight and higher later blood pressure: maternal blood pressure-raising alleles reduce offspring birth weight, but only direct fetal effects of these alleles, once inherited, increase later offspring blood pressure. Using maternal birth weight-lowering genotypes to proxy for an adverse intrauterine environment provided no evidence that it causally raises offspring blood pressure, indicating that the inverse birth weight-blood pressure association is attributable to genetic effects, and not to intrauterine programming. The Fenland Study is funded by the Medical Research Council (MC_U106179471) and Wellcome Trust

    Mixture survival trees for cancer risk classification

    In oncology studies, it is important to understand and characterize disease heterogeneity among patients so that patients can be classified into different risk groups and high-risk patients can be identified at the right time. This information can then be used to identify a more homogeneous patient population for developing precision medicine. In this paper, we propose a mixture survival tree approach for direct risk classification. We assume that the patients can be classified into a pre-specified number of risk groups, where each group has a distinct survival profile. Our proposed tree-based methods are devised to estimate latent group membership using an EM algorithm. The observed data log-likelihood function is used as the splitting criterion in recursive partitioning. The finite sample performance is evaluated by extensive simulation studies, and the proposed method is illustrated by a case study in breast cancer.
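The role of the EM algorithm and the observed data log-likelihood mentioned in the abstract above can be illustrated with a much simpler model than the authors' tree. The sketch below is not their method: it assumes exponential survival times within each latent risk group, independent right censoring, and no covariates or splitting. All function names (`log_likelihood`, `em_step`, `fit`, `simulate`) and the simulation settings are hypothetical choices made for illustration.

```python
import math
import random


def log_likelihood(times, events, weights, rates):
    """Observed-data log-likelihood of an exponential mixture with right censoring."""
    total = 0.0
    for t, d in zip(times, events):
        # An event (d = 1) contributes the density, a censored time (d = 0)
        # contributes the survival function of each component.
        total += math.log(sum(w * (lam ** d) * math.exp(-lam * t)
                              for w, lam in zip(weights, rates)))
    return total


def em_step(times, events, weights, rates):
    """One EM iteration: posterior group probabilities, then closed-form updates."""
    k, n = len(weights), len(times)
    resp_sum = [0.0] * k   # sum of posterior membership probabilities
    event_sum = [0.0] * k  # responsibility-weighted number of events
    time_sum = [0.0] * k   # responsibility-weighted follow-up time
    for t, d in zip(times, events):
        joint = [w * (lam ** d) * math.exp(-lam * t)
                 for w, lam in zip(weights, rates)]
        norm = sum(joint)
        for j in range(k):
            r = joint[j] / norm  # E-step: P(group j | observed data)
            resp_sum[j] += r
            event_sum[j] += r * d
            time_sum[j] += r * t
    # M-step: weighted versions of the usual exponential estimates.
    new_weights = [s / n for s in resp_sum]
    new_rates = [event_sum[j] / time_sum[j] for j in range(k)]
    return new_weights, new_rates


def fit(times, events, weights, rates, n_iter=50):
    """Run EM and record the observed-data log-likelihood after every iteration."""
    history = [log_likelihood(times, events, weights, rates)]
    for _ in range(n_iter):
        weights, rates = em_step(times, events, weights, rates)
        history.append(log_likelihood(times, events, weights, rates))
    return weights, rates, history


def simulate(n, seed=0):
    """Simulate two latent risk groups with independent exponential censoring."""
    rng = random.Random(seed)
    times, events = [], []
    for _ in range(n):
        rate = 2.0 if rng.random() < 0.4 else 0.2  # high-risk vs low-risk group
        t = rng.expovariate(rate)                  # latent survival time
        c = rng.expovariate(0.1)                   # censoring time
        times.append(min(t, c))
        events.append(1 if t <= c else 0)
    return times, events


times, events = simulate(500)
weights, rates, history = fit(times, events, [0.5, 0.5], [1.0, 0.5])
print("mixing weights:", weights)
print("hazard rates:", rates)
print("log-likelihood gain:", history[-1] - history[0])
```

Because each M-step maximizes the expected complete-data log-likelihood exactly, the values in `history` should never decrease, up to floating-point error. In a tree-based setting, a quantity of this kind could be compared before and after a candidate split, but how the authors do so is not specified in the abstract.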

    Deterministic Restriction on Pluripotent State Dissolution by Cell-Cycle Pathways

    Summary: During differentiation, human embryonic stem cells (hESCs) shut down the regulatory network conferring pluripotency in a process we designated pluripotent state dissolution (PSD). In a high-throughput RNAi screen using an inclusive set of differentiation conditions, we identify centrally important and context-dependent processes regulating PSD in hESCs, including histone acetylation, chromatin remodeling, RNA splicing, and signaling pathways. Strikingly, we detected a strong and specific enrichment of cell-cycle genes involved in DNA replication and G2 phase progression. Genetic and chemical perturbation studies demonstrate that the S and G2 phases attenuate PSD because they possess an intrinsic propensity toward the pluripotent state that is independent of G1 phase. Our data therefore functionally establish that pluripotency control is hardwired to the cell-cycle machinery, where S and G2 phase-specific pathways deterministically restrict PSD, whereas the absence of such pathways in G1 phase potentially permits the initiation of differentiation.