19 research outputs found

    Application and Extension of Weighted Quantile Sum Regression for the Development of a Clinical Risk Prediction Tool

    In clinical settings, the diagnosis of medical conditions is often aided by measurement of various serum biomarkers through the use of laboratory tests. These biomarkers provide information about different aspects of a patient’s health and the overall function of different organs. In this dissertation, we develop and validate a weighted composite index that aggregates the information from a variety of health biomarkers covering multiple organ systems. The index can be used for predicting all-cause mortality and could also be used as a holistic measure of overall physiological health status. We refer to it as the Health Status Metric (HSM). Validation analysis shows that the HSM is predictive of long-term mortality risk and exhibits a robust association with concurrent chronic conditions, recent hospital utilization, and self-rated health. We develop the HSM using Weighted Quantile Sum (WQS) regression (Gennings et al., 2013; Carrico, 2013), a novel penalized regression technique that imposes nonnegativity and unit-sum constraints on the coefficients used to weight index components. In this dissertation, we develop a number of extensions to the WQS regression technique and apply them to the construction of the HSM. We introduce a new guided approach for the standardization of index components which accounts for potential nonlinear relationships with the outcome of interest. An extended version of the WQS that accommodates interaction effects among index components is also developed and implemented. In addition, we demonstrate that ensemble learning methods borrowed from the field of machine learning can be used to improve the predictive power of the WQS index. Specifically, we show that the use of techniques such as weighted bagging, the random subspace method and stacked generalization in conjunction with the WQS model can produce an index with substantially enhanced predictive accuracy. Finally, practical applications of the HSM are explored. 
A comparative study is performed to evaluate the feasibility and effectiveness of a number of ‘real-time’ imputation strategies in potential software applications for computing the HSM. In addition, the efficacy of the HSM as a predictor of hospital readmission is assessed in a cohort of emergency department patients.
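
The index construction described in the abstract above (components scored into quantiles, then combined with nonnegative weights that sum to one) can be sketched as follows. This is a minimal illustration only, not the dissertation's implementation: the function names, the biomarker names, and the weights are all hypothetical, and the step that estimates the weights from outcome data is not shown.

```python
from bisect import bisect_right
from statistics import quantiles


def quantile_scores(values, n_quantiles=4):
    """Map each value to its quantile bin, scored 0 .. n_quantiles - 1."""
    cuts = quantiles(values, n=n_quantiles)  # n_quantiles - 1 cut points
    return [bisect_right(cuts, v) for v in values]


def wqs_index(component_scores, weights):
    """Weighted sum of quantile scores, one index value per subject.

    component_scores: dict mapping component name -> list of quantile scores
    weights: dict mapping component name -> weight (hypothetical values here;
             in WQS regression they are estimated under the constraints below)
    """
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be nonnegative")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")
    n_subjects = len(next(iter(component_scores.values())))
    return [
        sum(weights[name] * component_scores[name][i] for name in weights)
        for i in range(n_subjects)
    ]


# Hypothetical example: two biomarkers already scored into quartiles (0-3).
scores = {"marker_a": [0, 3], "marker_b": [2, 1]}
index = wqs_index(scores, {"marker_a": 0.75, "marker_b": 0.25})
```

Because the weights are nonnegative and sum to one, the index stays on the same 0 to 3 scale as the quartile scores, and each weight can be read as a component's relative contribution.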

    A novel comprehensive clinical stratification model to refine prognosis of glioblastoma patients undergoing surgical resection

    Despite recent discoveries in genetics and molecular fields, glioblastoma (GBM) prognosis remains unfavorable, with fewer than 10% of patients alive 5 years after diagnosis. Numerous studies have focused on the search for biological biomarkers to stratify GBM patients. We addressed this issue in our study by using clinical/molecular and imaging data, which are generally available to neurosurgical departments, to create a prognostic score that can be useful to stratify GBM patients undergoing surgical resection. By using the random forest approach [CART analysis (classification and regression tree)] on survival time data of 465 cases, we developed a new prediction score resulting in 10 groups based on extent of resection (EOR), age, tumor volumetric features, intraoperative protocols and tumor molecular classes. The resulting tree was trimmed according to similarities in the relative hazard ratios amongst groups, giving rise to a 5-group classification tree. These 5 groups were different in terms of overall survival (OS) (p < 0.000). The score performance in predicting death was defined by a Harrell’s c-index of 0.79 (95% confidence interval [0.76–0.81]). The proposed score could be useful in a clinical setting to refine the prognosis of GBM patients after surgery and prior to postoperative treatment.
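
For readers unfamiliar with the performance measure quoted in the abstract above, Harrell's c-index can be sketched as follows. This is a minimal, quadratic-time illustration assuming the common definition (a pair is comparable when the subject with the shorter time had an observed event; tied risk scores count as half); it is not the study's code, and the function name and toy data are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher risk score
    belongs to the subject with the shorter observed survival time.

    times: follow-up times
    events: 1 if death was observed, 0 if censored
    risk_scores: higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Comparable pair: subject i had an observed event strictly
            # before subject j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): the third subject is censored.
c = harrell_c_index([1, 2, 3, 4], [1, 1, 0, 1], [4, 1, 2, 3])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of risk against survival time.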

    Evaluating amino acid isotopic biomarkers of added sugar and animal protein intakes

    Thesis (Ph.D.) University of Alaska Fairbanks, 2023. Diet is an important risk factor for chronic disease, but identifying precise relationships between diet and disease remains challenging because of the high error in self-reported dietary measurements. Objective biomarkers of diet can help account for this error. Natural abundance amino acid carbon isotope ratios (AA CIRs) and nitrogen isotope ratios (AA NIRs) are candidate biomarkers of multiple dietary intakes in the US, due to their natural variation in the diet. In this dissertation, I evaluate whether AA CIRs and AA NIRs are sensitive and specific biomarkers of intakes, such as added sugar and animal protein, in the context of two controlled feeding studies. In the first controlled feeding study, participants resided at the Phoenix branch of the National Institute of Diabetes and Digestive and Kidney Diseases and consumed study diets for a 12-week duration. These diets had the same macronutrient profile but varied in the presence or absence of 3 intakes with isotopic variation: sugar-sweetened beverages (SSBs), meat, and fish. I demonstrate that most nonessential AA CIRs, in plasma and red blood cells, have specific responses to SSB intake and that essential AA CIRs have specific responses to meat intake. I also estimate the turnover rates of AA CIRs. Next, I show that both fish and meat intake influence AA NIRs in plasma and red blood cells. In the second controlled feeding study, participants were recruited in Phoenix, Arizona across sex, age, and BMI groups. Participants consumed study diets reflecting their usual intake for a 15-day period, and I present the correlations between various intakes and AA CIRs from serum collected at the end of the feeding period. I find a moderate correlation between the alanine CIR and added sugar intake and also multiple moderate to high correlations between protein-related intakes and all AA CIRs.
I present a model of added sugar intake, which has modest explanatory power. The work in this dissertation further attests to the biomarker potential of AA CIRs and NIRs in the US, and it suggests that AA CIRs are especially promising biomarkers of SSB and animal protein intakes.
Chapter 1: General introduction -- Chapter 2: The carbon isotope ratios of nonessential amino acids identify sugar-sweetened beverage (SSB) consumers in a 12-wk inpatient feeding study of 32 adult men with varying SSB and meat exposures -- Chapter 3: Amino acid nitrogen isotope ratios respond to fish and meat intake in a 12-week inpatient feeding study of men -- Chapter 4: Evaluating a model of added sugar intake based on amino acid carbon isotope ratios in a controlled feeding study of U.S. adults -- Chapter 5: General conclusions -- Appendices

    Tree-based methods for survival analysis and high-dimensional data

    Machine learning techniques have garnered significant popularity due to their capacity to handle high-dimensional data. Tree-based methods are among the most popular machine learning approaches. My dissertation aims at improving existing tree-based methods and developing a statistical framework for understanding the proposed methods. It contains three topics: recursively imputed survival trees, reinforcement learning trees, and reinforcement learning trees for right-censored survival data. A central idea of my dissertation is to increase the chance of using signaled variables as splitting rules during tree construction without losing randomness/diversity, so that a more accurate model can be built. However, the methods achieve this in different ways. The recursively imputed survival tree recursively imputes censored observations and refits the survival tree model. This approach allows better use of the censored observations during tree construction; it also changes the dynamics of splitting rule selection so that signaled variables are emphasized more in the refitted model. Reinforcement learning trees take a direct approach to emphasizing signaled variables in tree construction. An embedded model is fitted at each internal node while searching for splitting rules. The variable with the largest variable importance measure is used as the splitting variable. A new theoretical framework is proposed to show the consistency and convergence rate of this new approach. In the third topic, we further extend reinforcement learning trees to right-censored survival data. The Brier score is used to calculate the variable importance measures. We also show a desirable property of the proposed method that can help correct the bias of variable importance measures when correlated variables are present in the model. Doctor of Philosophy
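
The Brier score mentioned in the abstract above can be illustrated with a simplified sketch. It assumes every event time is fully observed, so it deliberately omits the censoring adjustment that right-censored data would require, and it does not reproduce how the dissertation turns the score into a variable importance measure. The function name and inputs are hypothetical.

```python
def brier_score(survival_probs, event_times, horizon):
    """Mean squared difference between the predicted probability of
    surviving past `horizon` and the observed status at `horizon`.

    Simplifying assumption: all event times are observed (no censoring).
    """
    if len(survival_probs) != len(event_times):
        raise ValueError("inputs must have the same length")
    if not survival_probs:
        raise ValueError("inputs must not be empty")
    total = 0.0
    for prob, time in zip(survival_probs, event_times):
        still_event_free = 1.0 if time > horizon else 0.0
        total += (still_event_free - prob) ** 2
    return total / len(survival_probs)


# Uninformative predictions of 0.5 give a score of 0.25.
baseline = brier_score([0.5, 0.5], [5.0, 1.0], horizon=3.0)
```

Lower values indicate better-calibrated predictions; 0 is a perfect score at the chosen horizon.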

    Developing clinical prediction models for diabetes classification and progression

    Patients with type 1 and type 2 diabetes have very different treatment and care requirements. Overlapping phenotypes and lack of clear classification guidelines make it difficult for clinicians to differentiate between type 1 and type 2 diabetes at diagnosis. The rate of glycaemic deterioration is highly variable in patients with type 2 diabetes but there is no single test to accurately identify which patients will progress rapidly to requiring insulin therapy. Incorrect treatment and care decisions in diabetes can have life-threatening consequences. The aim of this thesis is to develop clinical prediction models that can be incorporated into routine clinical practice to assist clinicians with the classification and care of patients diagnosed with diabetes. We addressed the problem first by integrating features previously associated with classification of type 1 and type 2 diabetes to develop a diagnostic model using logistic regression to identify, at diagnosis, patients with type 1 diabetes. The high performance achieved by this model was comparable to that of machine learning algorithms. In patients diagnosed with type 2 diabetes, we found that patients who were GADA positive and had genetic susceptibility to type 1 diabetes progressed more rapidly to requiring insulin therapy. We built upon this finding to develop a prognostic model integrating predictive features of glycaemic deterioration to predict early insulin requirement in adults diagnosed with type 2 diabetes. The three main findings of this thesis have the potential to change the way that patients with diabetes are managed in clinical practice. Use of the diagnostic model developed to identify patients with type 1 diabetes has the potential to reduce misclassification. Classifying patients according to the model has the benefit of being more akin to the treatment needs of the patient rather than the aetiopathological definitions used in current clinical guidelines.
The design of the model lends itself to implementing a triage-based approach to diabetes subtype diagnosis. Our second main finding alters the clinical implications of a positive GADA test in patients diagnosed with type 2 diabetes. For identifying patients likely to progress rapidly to insulin, genetic testing is only beneficial in patients who test positive for GADA. In clinical practice, a two-step screening process could be implemented: only patients who test positive for GADA in the first step would go on for genetic testing. The prognostic model can be used in clinical practice to predict a patient’s rate of glycaemic deterioration leading to a requirement for insulin. The availability of this data will enable clinical practices to more effectively manage their patient lists, prioritising more intensive follow up for those patients who are at high risk of rapid progression. Patients are likely to benefit from tailored treatment. Another key clinical use of the prognostic model is the identification of patients who would benefit most from GADA testing, sparing patients inconvenience and saving costs for the health service.
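
The two-step screening process proposed in the abstract above can be sketched as simple control flow. This is an illustration only, under the assumption that the genetic test is an expensive step worth deferring; the function name, the callable, and the returned score are hypothetical and carry no clinical meaning.

```python
from typing import Callable, Optional


def two_step_screen(gada_positive: bool,
                    run_genetic_test: Callable[[], float]) -> Optional[float]:
    """Run the genetic test only for patients who are GADA positive.

    Returns the genetic test result, or None when step two is skipped.
    """
    if not gada_positive:
        return None  # step one negative: no genetic testing ordered
    return run_genetic_test()  # step two: only reached after a positive GADA


# Hypothetical usage: the lambda stands in for a real laboratory request.
result = two_step_screen(True, lambda: 0.8)
```

Passing the genetic test as a callable makes the saving explicit: for GADA-negative patients the second test is never requested at all.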

    Tumor heterogeneity in glioblastoma: a real-life brain teaser
