1,806 research outputs found

    Cost-effective survival prediction for patients with advanced prostate cancer using clinical trial and real-world hospital registry datasets

    Get PDF
    Introduction Predictive survival modeling offers systematic tools for clinical decision-making and individualized tailoring of treatment strategies to improve patient outcomes while reducing overall healthcare costs. In 2015, a number of machine learning and statistical models were benchmarked in the DREAM 9.5 Prostate Cancer Challenge, based on open clinical trial data for metastatic castration resistant prostate cancer (mCRPC). However, applying these models into clinical practice poses a practical challenge due to the inclusion of a large number of model variables, some of which are not routinely monitored or are expensive to measure. Objectives To develop cost-specified variable selection algorithms for constructing cost-effective prognostic models of overall survival that still preserve sufficient model performance for clinical decision making. Methods Penalized Cox regression models were used for the survival prediction. For the variable selection, we implemented two algorithms: (i) LASSO regularization approach; and (ii) a greedy cost-specified variable selection algorithm. The models were compared in three cohorts of mCRPC patients from randomized clinical trials (RCT), as well as in a real-world cohort (RWC) of advanced prostate cancer patients treated at the Turku University Hospital. Hospital laboratory expenses were utilized as a reference for computing the costs of introducing new variables into the models. Results Compared to measuring the full set of clinical variables, economic costs could be reduced by half without a significant loss of model performance. The greedy algorithm outperformed the LASSO-based variable selection with the lowest tested budgets. The overall top performance was higher with the LASSO algorithm. Conclusion The cost-specified variable selection offers significant budget optimization capability for the real-world survival prediction without compromising the predictive power of the model.Peer reviewe

    Cost-effective survival prediction for patients with advanced prostate cancer using clinical trial and real-world hospital registry datasets

    Get PDF
    IntroductionPredictive survival modeling offers systematic tools for clinical decision-making and individualized tailoring of treatment strategies to improve patient outcomes while reducing overall healthcare costs. In 2015, a number of machine learning and statistical models were benchmarked in the DREAM 9.5 Prostate Cancer Challenge, based on open clinical trial data for metastatic castration resistant prostate cancer (mCRPC). However, applying these models into clinical practice poses a practical challenge due to the inclusion of a large number of model variables, some of which are not routinely monitored or are expensive to measure.ObjectivesTo develop cost-specified variable selection algorithms for constructing cost-effective prognostic models of overall survival that still preserve sufficient model performance for clinical decision making.MethodsPenalized Cox regression models were used for the survival prediction. For the variable selection, we implemented two algorithms: (i) LASSO regularization approach; and (ii) a greedy cost-specified variable selection algorithm. The models were compared in three cohorts of mCRPC patients from randomized clinical trials (RCT), as well as in a real-world cohort (RWC) of advanced prostate cancer patients treated at the Turku University Hospital. Hospital laboratory expenses were utilized as a reference for computing the costs of introducing new variables into the models.ResultsCompared to measuring the full set of clinical variables, economic costs could be reduced by half without a significant loss of model performance. The greedy algorithm outperformed the LASSO-based variable selection with the lowest tested budgets. The overall top performance was higher with the LASSO algorithm.ConclusionThe cost-specified variable selection offers significant budget optimization capability for the real-world survival prediction without compromising the predictive power of the model.</div

    Oscar : Optimal subset cardinality regression using the L0-pseudonorm with applications to prognostic modelling of prostate cancer

    Get PDF
    Author summaryFeature subset selection has become a crucial part of building biomedical models, due to the abundance of available predictors in many applications, yet there remains an uncertainty of their importance and generalization ability. Regularized regression methods have become popular approaches to tackle this challenge by balancing the model goodness-of-fit against the increasing complexity of the model in terms of coefficients that deviate from zero. Regularization norms are pivotal in formulating the model complexity, and currently L-1-norm (LASSO), L-2-norm (Ridge Regression) and their hybrid (Elastic Net) dominate the field. In this paper, we present a novel methodology that is based on the L-0-pseudonorm, also known as the best subset selection, which has largely gone overlooked due to its challenging discrete nature. Our methodology makes use of a continuous transformation of the discrete optimization problem, and provides effective solvers implemented in a user friendly R software package. We exemplify the use of oscar-package in the context of prostate cancer prognostic prediction using both real-world hospital registry and clinical cohort data. By benchmarking the methodology against existing regularization methods, we illustrate the advantages of the L-0-pseudonorm for better clinical applicability, selection of grouped features, and demonstrate its applicability in high-dimensional transcriptomics datasets.In many real-world applications, such as those based on electronic health records, prognostic prediction of patient survival is based on heterogeneous sets of clinical laboratory measurements. To address the trade-off between the predictive accuracy of a prognostic model and the costs related to its clinical implementation, we propose an optimized L-0-pseudonorm approach to learn sparse solutions in multivariable regression. The model sparsity is maintained by restricting the number of nonzero coefficients in the model with a cardinality constraint, which makes the optimization problem NP-hard. In addition, we generalize the cardinality constraint for grouped feature selection, which makes it possible to identify key sets of predictors that may be measured together in a kit in clinical practice. We demonstrate the operation of our cardinality constraint-based feature subset selection method, named OSCAR, in the context of prognostic prediction of prostate cancer patients, where it enables one to determine the key explanatory predictors at different levels of model sparsity. We further explore how the model sparsity affects the model accuracy and implementation cost. Lastly, we demonstrate generalization of the presented methodology to high-dimensional transcriptomics data.Peer reviewe

    And now for real:outcomes of castration-resistant prostate cancer patients in the Netherlands

    Get PDF

    And now for real:outcomes of castration-resistant prostate cancer patients in the Netherlands

    Get PDF

    Machine learning and computational methods to identify molecular and clinical markers for complex diseases – case studies in cancer and obesity

    Get PDF
    In biomedical research, applied machine learning and bioinformatics are the essential disciplines heavily involved in translating data-driven findings into medical practice. This task is especially accomplished by developing computational tools and algorithms assisting in detection and clarification of underlying causes of the diseases. The continuous advancements in high-throughput technologies coupled with the recently promoted data sharing policies have contributed to presence of a massive wealth of data with remarkable potential to improve human health care. In concordance with this massive boost in data production, innovative data analysis tools and methods are required to meet the growing demand. The data analyzed by bioinformaticians and computational biology experts can be broadly divided into molecular and conventional clinical data categories. The aim of this thesis was to develop novel statistical and machine learning tools and to incorporate the existing state-of-the-art methods to analyze bio-clinical data with medical applications. The findings of the studies demonstrate the impact of computational approaches in clinical decision making by improving patients risk stratification and prediction of disease outcomes. This thesis is comprised of five studies explaining method development for 1) genomic data, 2) conventional clinical data and 3) integration of genomic and clinical data. With genomic data, the main focus is detection of differentially expressed genes as the most common task in transcriptome profiling projects. In addition to reviewing available differential expression tools, a data-adaptive statistical method called Reproducibility Optimized Test Statistic (ROTS) is proposed for detecting differential expression in RNA-sequencing studies. In order to prove the efficacy of ROTS in real biomedical applications, the method is used to identify prognostic markers in clear cell renal cell carcinoma (ccRCC). In addition to previously known markers, novel genes with potential prognostic and therapeutic role in ccRCC are detected. For conventional clinical data, ensemble based predictive models are developed to provide clinical decision support in treatment of patients with metastatic castration resistant prostate cancer (mCRPC). The proposed predictive models cover treatment and survival stratification tasks for both trial-based and realworld patient cohorts. Finally, genomic and conventional clinical data are integrated to demonstrate the importance of inclusion of genomic data in predictive ability of clinical models. Again, utilizing ensemble-based learners, a novel model is proposed to predict adulthood obesity using both genetic and social-environmental factors. Overall, the ultimate objective of this work is to demonstrate the importance of clinical bioinformatics and machine learning for bio-clinical marker discovery in complex disease with high heterogeneity. In case of cancer, the interpretability of clinical models strongly depends on predictive markers with high reproducibility supported by validation data. The discovery of these markers would increase chance of early detection and improve prognosis assessment and treatment choice

    Modeling and prediction of advanced prostate cancer

    Get PDF
    Background: Prostate cancer (PCa) is the most commonly diagnosed cancer and second leading cause of cancer-related deaths for men in Western countries. The advanced form of the disease is life-threatening with few options for curative therapies. The development of novel therapeutic alternatives would greatly benefit from a more comprehensive and tailored mathematical and statistical methodology. In particular, statistical inference of treatment effects and the prediction of time-dependent effects in both preclinical and clinical studies remains a challenging yet interesting opportunity for applied mathematicians. Such methods are likely to improve the reproducibility and translatability of results and offer possibility for novel holistic insights into disease progression, diagnosis, and prognosis. Methods: Several novel statistical and mathematical techniques were developed over the course of this thesis work for the in vivo modeling of PCa treatment responses. A matching-based, blinded randomized allocation procedure for preclinical experiments was developed that provides assistance for the statistical design of animal intervention studies, e.g., through power analysis and accounting for the stratification of individuals. For the post-intervention testing of treatment effects, two novel mixed-effects models were developed that aim to address the characteristic challenges of preclinical longitudinal experiments, including the heterogeneous response profiles observed in animal studies. Subsequently, a Finnish clinical PCa hospital registry cohort was inspected with a strong emphasis on prostate-specific antigen (PSA), the most commonly used PCa marker. After exploring the PSA trends using penalized splines, a generalized mixed-effects prediction model was implemented with a focus on the ultra-sensitive range of the PSA assay. Finally, for metastatic, aggressive PCa, an ensemble Cox regression methodology was developed for overall survival prediction in the DREAM 9.5 mCRPC Challenge based on open datasets from controlled clinical trials. Results: The advantages of the improved experimental design and two proposed statistical models were demonstrated in terms of both increased statistical power and accuracy in simulated and real preclinical testing settings. Penalized regression models applied to the clinical patient datasets support the use of PSA in the ultra-sensitive range together with a model for relapse prediction. Furthermore, the novel ensemble-based Cox regression model that was developed for the overall survival prediction in advanced PCa outperformed the state-of-the-art benchmark and all other models submitted to the Challenge and provided novel predictors of disease progression and treatment responses. Conclusions: The methods and results provide preclinical researchers and clinicians with novel tools for comprehensive modeling and prediction of PCa. All methodology is available as open source R statistical software packages and/or web-based graphical user interfaces

    Economic Evaluation of Potential Applications of Gene Expression Profiling in Clinical Oncology

    Get PDF
    Histopathological analysis of tumor is currently the main tool used to guide cancer management. Gene expression profiling may provide additional valuable information for both classification and prognostication of individual tumors. A number of gene expression profiling assays have been developed recently to inform therapy decisions in women with early stage breast cancer and help identify the primary tumor site in patients with metastatic cancer of unknown primary. The impact of these assays on health and economic outcomes, if introduced into general practice, has not been determined. I aimed to conduct an economic evaluation of regulatory-approved gene expression profiling assays for breast cancer and cancer of unknown primary for the purpose of determining whether these technologies represent value for money from the perspective of the Canadian health care system. I developed decision-analytic models to project the lifetime clinical and economic consequences of early stage breast cancer and metastatic cancer of unknown primary. I used Manitoba Cancer Registry and Manitoba administrative health databases to model current “real-world” Canadian clinical practices. I applied available data about gene expression profiling assays from secondary sources on these models to predict the impact of these assays on current clinical and economic outcomes. In the base case, gene expression profiling assays in early stage breast cancer and cancer of unknown primary resulted in incremental cost effectiveness ratios of less than $100,000 per quality-adjusted life-year gained. These results were most sensitive to the uncertainty associated with the accuracy of the assay, patient-physician response to gene expression profiling information and patient survival. The potential application of these gene expression profiling assays in clinical oncology appears to be cost-effective in the Canadian healthcare system. Field evaluation of these assays to establish their impact on cancer management and patient survival may have a large societal impact and should be initiated in Canada to ensure their clinical utility and cost-effectiveness. The use of Canadian provincial administrative population data in decision modeling is useful to quantify uncertainty about gene expression profiling assays and guide the use of novel funding models such as conditional funding alongside a field evaluation
    corecore