6 research outputs found

    Machine learning prediction of breast cancer survival using age, sex, length of stay, mode of diagnosis and location of cancer

    Breast cancer is one of the leading causes of death in females, and survival depends on early diagnosis and treatment. This paper applied machine learning techniques to predict breast cancer survival (dead or alive) using age, sex, length of stay, mode of diagnosis and location of cancer as predictors (independent variables). The data were obtained from the outpatient department of the University of Ilorin Teaching Hospital, Ilorin, Nigeria. The sample size of 300 consists of 175 females and 25 males who were admitted to the hospital and treated for breast cancer. The patients were later discharged or died. Adaptive boosting (AdaBoost) performed best among the data mining models used for classification in all three cases, where the target class is the average over classes, alive, or dead. AdaBoost achieved a classification accuracy of 98.3% and an area under the curve (AUC) of 99.9%. Furthermore, a probe of the AdaBoost prediction showed that the probability of death due to breast cancer is 0.47; length of stay contributed heavily to this high probability, location of breast cancer and mode of diagnosis contributed minimally, and age and sex contributed insignificantly. The high probability of breast cancer mortality predicted in this paper is a cause for concern, as early detection of breast cancer, routine breast examination and breast cancer awareness are crucial in increasing the probability of survival. The results can be used to design a decision support system that can increase the chances of breast cancer survival.
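    The abstract above reports classification accuracy and AUC. As a rough illustration of what those two metrics measure, here is a minimal sketch in plain Python. The function names, labels and scores are hypothetical and are not taken from the paper; AUC is computed as the fraction of (positive, negative) pairs that the score ranks correctly, with ties counted as half.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = died, 0 = alive); not the study's data.
y_true = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

toy_accuracy = accuracy(y_true, y_pred)
toy_auc = roc_auc(y_true, scores)
```

    Accuracy depends on the chosen threshold, while AUC depends only on how the scores rank the cases, which is why the two can differ for the same model.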

    Machine learning to improve interpretability of clinical, radiological and panel-based genomic data of glioma grade 4 patients undergoing surgical resection

    Background: Glioma grade 4 (GG4) tumors, including astrocytoma IDH-mutant grade 4 and the astrocytoma IDH wt, are the most common and aggressive primary tumors of the central nervous system. Surgery followed by the Stupp protocol remains the first-line treatment for GG4 tumors. Although the Stupp combination can prolong survival, the prognosis of treated adult patients with GG4 remains unfavorable. The introduction of innovative multi-parametric prognostic models may allow refinement of the prognosis of these patients. Here, Machine Learning (ML) was applied to investigate the contribution of different available data (e.g. clinical data, radiological data, or panel-based sequencing data such as presence of somatic mutations and amplification) to predicting overall survival (OS) in a mono-institutional GG4 cohort. Methods: By next-generation sequencing, using a panel of 523 genes, we analyzed copy number variations and the types and distribution of nonsynonymous mutations in 102 cases, including 39 carmustine wafer (CW) treated cases. We also calculated tumor mutational burden (TMB). ML was applied using eXtreme Gradient Boosting for survival (XGBoost-Surv) to integrate clinical and radiological information with genomic data. Results: By ML modeling (concordance index (c-index) = 0.682 for the best model), the role of radiological parameters, including extent of resection, preoperative volume and residual volume, in predicting OS was confirmed. An association between CW application and longer OS was also shown. Regarding gene mutations, a role in predicting OS was defined for mutations of BRAF and of other genes involved in the PI3K-AKT-mTOR signaling pathway. Moreover, an association between high TMB and shorter OS was suggested. Consistently, when a cutoff of 1.7 mutations/megabase was applied, cases with higher TMB showed significantly shorter OS than cases with lower TMB. Conclusions: The contribution of tumor volumetric data, somatic gene mutations and TMB in predicting OS of GG4 patients was defined by ML modeling.
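    The abstract above scores its survival model with a concordance index. As a hedged illustration only, the sketch below computes Harrell's c-index on right-censored data in plain Python. The function name and the toy values are hypothetical, tied event times are ignored for simplicity, and this is not the authors' XGBoost-Surv pipeline.

```python
def concordance_index(times, events, risks):
    """Harrell's c-index: share of comparable pairs ranked correctly by risk.

    times  : observed follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    risks  : model risk score (higher should mean shorter survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i has an observed event
            # strictly before patient j's observed time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy cohort (times in months); not data from the study.
times = [5, 10, 12, 20]
events = [1, 0, 1, 1]
risks = [0.9, 0.4, 0.1, 0.2]

toy_c_index = concordance_index(times, events, risks)
```

    A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a reported 0.682 indicates moderate discrimination.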

    Can machine learning methods contribute as a decision support system in sequential oligometastatic radioablation therapy?

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics. Cancer treatment is among the major medical challenges of this century. Sequential oligometastatic radio-ablation (SOMA) is a novel treatment method that aims at ablating recurring metastases in a single session with a targeted high dose of radiation. To know whether SOMA is the best possible treatment method for a patient, the benefits of each available therapy need to be understood and evaluated. The ability to model complex systems, such as cancer treatment, is the strength of machine learning techniques. These techniques have already improved the understanding of numerous medical therapies. In some cases, they can serve as medical support systems if they deliver reliable results that doctors can trust and understand. The results obtained from applying numerous machine learning techniques to the data of SOMA-treated patients show that some techniques are favorable in some cases. The Random Forest algorithm proved superior at different classification tasks. Regression problems, in contrast, posed a great challenge, as the amount of data is very limited. Finally, SHAP values, a novel machine learning interpretation technique, provided valuable insights into the rationale of each algorithm. They showed that the machine learning algorithms could learn patterns aligned with human intuition in the problems presented. SHAP values show great potential in bridging the gap between complex machine learning algorithms and their interpretability. They display how an algorithm learns from the data and derives results. This opens up exciting possibilities for applying machine learning algorithms in the real world.

    Preface

    Machine learning explainability in breast cancer survival

    Machine Learning (ML) can improve the diagnosis, treatment decisions, and understanding of cancer. However, the low explainability of how “black box” ML methods produce their output hinders their clinical adoption. In this paper, we used data from the Netherlands Cancer Registry to generate an ML-based model to predict 10-year overall survival of breast cancer patients. Then, we used Local Interpretable Model-Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP) to interpret the model's predictions. We found that, overall, LIME and SHAP tend to be consistent when explaining the contribution of different features. Nevertheless, the feature ranges where they disagree can also be of interest, since they can help us identify “turning points” where features go from favoring survived to favoring deceased (or vice versa). Explainability techniques can pave the way for better acceptance of ML techniques. However, their evaluation and translation to real-life scenarios need to be researched further.
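    SHAP, mentioned in the abstract above, builds on Shapley values from cooperative game theory. The sketch below computes exact Shapley values for a tiny hypothetical model by enumerating all feature coalitions. It assumes that an "absent" feature is replaced by a baseline value, which is a simplification; the SHAP library uses faster approximations, and none of the names or numbers here come from the paper.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley value of each feature for one prediction.

    Features outside a coalition take their baseline value (an assumed
    simplification of how a missing feature is represented).
    """
    n = len(instance)

    def value(subset):
        x = [instance[i] if i in subset else baseline[i] for i in range(n)]
        return predict(x)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                # Marginal contribution of feature i to coalition s.
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis


def toy_model(x):
    """Hypothetical model with two linear terms and one interaction."""
    return 2 * x[0] + 3 * x[1] + x[0] * x[2]


instance = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phis = shapley_values(toy_model, instance, baseline)
```

    The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term is split evenly between the two features involved.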