7,576 research outputs found

    Cost-Sensitive Learning for Recurrence Prediction of Breast Cancer

    Breast cancer is one of the leading causes of cancer death and accounts for 10.4% of all cancer incidence among women. Predicting breast cancer recurrence remains a challenging research problem. Data mining techniques have recently received considerable attention, especially for constructing prognosis models from survival data. However, existing data mining techniques may not handle censored data effectively, and censored instances are often discarded when classification techniques are applied to prognosis. In this paper, we propose a cost-sensitive learning approach that incorporates censored data into prognostic assessment to improve recurrence prediction. The proposed approach employs an outcome inference mechanism to infer the probabilistic outcome of each censored instance, and adopts cost-proportionate rejection sampling and a committee machine strategy to take these instances into account during classification model learning. We empirically evaluate the proposed approach for breast cancer recurrence prediction, using a censored-data-discarding method (i.e., building the recurrence prediction model from uncensored data only) and the Kaplan-Meier method (a common prognosis method) as performance benchmarks. Overall, our evaluation results suggest that the proposed approach outperforms both benchmarks as measured by precision, recall, and F1 score.
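The abstract names cost-proportionate rejection sampling combined with a committee of models. The sketch below illustrates only the general sampling idea: each instance is kept with probability proportional to its cost, and one independent sample is drawn per committee member. The function names, the toy patients, and the cost values are hypothetical. How the paper derives costs from the inferred outcome probabilities is not stated in the abstract, so this should not be read as the authors' implementation.

```python
import random

def rejection_sample(instances, costs, rng):
    """Keep each instance with probability cost / max(costs)."""
    z = max(costs)  # normalizing constant; any value >= the largest cost works
    if z <= 0:
        return []
    return [x for x, c in zip(instances, costs) if rng.random() < c / z]

def committee_samples(instances, costs, n_members, seed=0):
    """Draw one independent rejection sample per committee member."""
    rng = random.Random(seed)
    return [rejection_sample(instances, costs, rng) for _ in range(n_members)]

# Hypothetical toy data: (patient_id, inferred_label) pairs with made-up costs.
instances = [("p1", 1), ("p2", 0), ("p3", 1), ("p4", 0)]
costs = [1.0, 0.2, 0.6, 0.0]
samples = committee_samples(instances, costs, n_members=5)
```

Each element of `samples` could then train one classifier, with the committee's predictions aggregated afterwards. An instance whose cost equals the maximum is always kept, and an instance with zero cost is never kept.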

    Prognostic value of routine laboratory variables in prediction of breast cancer recurrence.

    The prognostic value of routine laboratory variables in breast cancer has been largely overlooked. Based on laboratory tests commonly performed in clinical practice, we aimed to develop a new model to predict disease-free survival (DFS) after surgical removal of primary breast cancer. In a cohort of 1,596 breast cancer patients, we analyzed the associations of 33 laboratory variables with patient DFS. Based on 3 significant laboratory variables (hemoglobin, alkaline phosphatase, and international normalized ratio), together with important demographic and clinical variables, we developed a prognostic model that achieved an area under the curve of 0.79. We categorized patients into 3 risk groups according to the prognostic index developed from the final model. Compared with patients in the low-risk group, those in the medium- and high-risk groups had a significantly increased risk of recurrence, with hazard ratios (HR) of 1.75 (95% confidence interval [CI] 1.30-2.38) and 4.66 (95% CI 3.54-6.14), respectively. The results from the training set were validated in the testing set. Overall, our prognostic model incorporating readily available routine laboratory tests is powerful in identifying breast cancer patients who are at high risk of recurrence. Further study is warranted to validate its clinical application.

    Enhanced Breast Cancer Relapse Prediction Based on Ensemble Learning Approaches

    Predicting progression and deciding on the best follow-up techniques for breast cancer patients is difficult because the illness is diverse and characterized by varying relapse risks. Due to its prevalence, breast cancer has become the top cause of mortality among women worldwide, making diagnosis and prognosis particularly challenging areas of medical study. In addition, the fear of a cancer relapse is a major factor influencing cancer patients' quality of life. The study aims to help doctors determine the likelihood of a breast cancer relapse by applying ensemble learning techniques. In this research, artificial neural networks (ANN) and deep neural networks (DNN) combined through weighted averaging, minority voting, and majority voting are investigated for performance enhancements on the breast cancer recurrence dataset sourced from the UCI-ML repository. The empirical analysis shows that the proposed ensemble learning approach achieves improved accuracy, precision, sensitivity, specificity, and F1-score of 96.21%, 96.59%, 98.84%, 84.62%, and 97.41%, respectively. The findings of this study can aid doctors in making more informed treatment decisions, thereby improving patient outcomes.
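Two of the combination rules named in the abstract, weighted averaging and majority voting, can be sketched generically as follows. The probabilities, weights, and the 0.5 threshold are made-up illustrative values, not taken from the study, and the function names are hypothetical. The abstract does not define "minority voting", so it is not shown.

```python
def weighted_average(member_probs, weights):
    """Combine per-model relapse probabilities using per-model weights."""
    total = sum(weights)
    n_patients = len(member_probs[0])
    return [
        sum(w * probs[i] for probs, w in zip(member_probs, weights)) / total
        for i in range(n_patients)
    ]

def majority_vote(member_probs, threshold=0.5):
    """Predict relapse (1) when more than half of the models vote for it."""
    n_models = len(member_probs)
    n_patients = len(member_probs[0])
    preds = []
    for i in range(n_patients):
        votes = sum(1 for probs in member_probs if probs[i] >= threshold)
        preds.append(1 if 2 * votes > n_models else 0)
    return preds

# Hypothetical outputs of three models for two patients.
member_probs = [[0.9, 0.2], [0.6, 0.4], [0.3, 0.7]]
averaged = weighted_average(member_probs, weights=[1, 1, 2])
voted = majority_vote(member_probs)
```

With these toy inputs the two rules can disagree on borderline patients, which is one reason such studies compare several combination schemes.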

    Machine learning algorithms to predict breast cancer recurrence using structured and unstructured sources from electronic health records

    Recurrence is a critical aspect of breast cancer (BC) that is inexorably tied to mortality. Reuse of healthcare data through Machine Learning (ML) algorithms offers great opportunities to improve the stratification of patients at risk of cancer recurrence. We hypothesized that combining features from structured and unstructured sources would provide better prediction results for 5-year cancer recurrence than either source alone. We collected and preprocessed clinical data from a cohort of BC patients, resulting in 823 valid subjects for analysis. We derived three sets of features: structured information, features from free text, and a combination of both. We evaluated the performance of five ML algorithms to predict 5-year cancer recurrence and selected the best-performing one to test our hypothesis. The XGB (eXtreme Gradient Boosting) model yielded the best performance among the five evaluated algorithms, with precision = 0.900, recall = 0.907, F1-score = 0.897, and area under the receiver operating characteristic curve (AUROC) = 0.807. The best prediction results were achieved with the structured dataset, followed by the unstructured dataset, while the combined dataset achieved the poorest performance. ML algorithms for BC recurrence prediction are valuable tools to improve patient risk stratification, help with post-cancer monitoring, and plan more effective follow-up. Structured data provides the best results when fed to ML algorithms. However, an approach based on natural language processing offers comparable results while potentially requiring less mapping effort.
    Funding: European Union | Ref. 875406; Fondo Europeo de Desarrollo Regional (FEDER); Xunta de Galici

    Current status of sentinel lymph node biopsy in solid malignancies

    Lymphatic mapping and sentinel lymph node biopsy were first reported in 1977 by Cabanas for penile cancer. Since that time, the technique has been rapidly assimilated into clinical practice. The sentinel node concept has been validated in cutaneous melanoma and breast cancer. However, follow-up data on patients from randomised trials are needed to establish the clinical significance of sentinel lymph node biopsy before the procedure is accepted as a standard of care. This technique has the potential to be utilised in all solid tumours, such as colon, gastric, oesophageal, lung, gynaecologic, and head and neck cancer. This paper reviews the current status of sentinel lymph node biopsy in solid tumours.

    Predicting Response to Platin Chemotherapy Agents with Biochemically-inspired Machine Learning

    Selection of effective genes that accurately predict chemotherapy response could improve cancer outcomes. We compare optimized gene signatures for cisplatin, carboplatin, and oxaliplatin response in the same cell lines, and validate each with cancer patient data. Supervised support vector machine learning was used to derive gene sets whose expression was related to cell line GI50 values by backwards feature selection with cross-validation. Specific genes and functional pathways distinguishing sensitive from resistant cell lines are identified by contrasting signatures obtained at extreme vs. median GI50 thresholds. Ensembles of gene signatures at different thresholds are combined to reduce dependence on specific GI50 values for predicting drug response. The most accurate gene signatures for each platin are: cisplatin: BARD1, BCL2, BCL2L1, CDKN2C, FAAP24, FEN1, MAP3K1, MAPK13, MAPK3, NFKB1, NFKB2, SLC22A5, SLC31A2, TLR4, TWIST1; carboplatin: AKT1, EIF3K, ERCC1, GNGT1, GSR, MTHFR, NEDD4L, NLRP1, NRAS, RAF1, SGK1, TIGD1, TP53, VEGFB, VEGFC; oxaliplatin: BRAF, FCGR2A, IGF1, MSH2, NAGK, NFE2L2, NQO1, PANK3, SLC47A1, SLCO1B1, UGT1A1. TCGA bladder, ovarian and colorectal cancer patients were used to test cisplatin, carboplatin and oxaliplatin signatures (respectively), resulting in 71.0%, 60.2% and 54.5% accuracy in predicting disease recurrence and 59%, 61% and 72% accuracy in predicting remission. One cisplatin signature predicted 100% of recurrence in non-smoking bladder cancer patients (57% disease-free; N=19), and 79% recurrence in smokers (62% disease-free; N=35). This approach should be adaptable to other studies of chemotherapy response, independent of drug or cancer types.

    Network modeling of patients' biomolecular profiles for clinical phenotype/outcome prediction

    Methods for phenotype and outcome prediction are largely based on inductive supervised models that use selected biomarkers to make predictions, without explicitly considering the functional relationships between individuals. We introduce a novel network-based approach named Patient-Net (P-Net) in which biomolecular profiles of patients are modeled in a graph-structured space that represents gene expression relationships between patients. A kernel-based semi-supervised transductive algorithm is then applied to the graph to explore its overall topology and to predict the phenotype/clinical outcome of patients. Experimental tests involving several publicly available datasets of patients afflicted with pancreatic, breast, colon and colorectal cancer show that our proposed method is competitive with state-of-the-art supervised and semi-supervised predictive systems. Importantly, P-Net also provides interpretable models that can be easily visualized to gain clues about the relationships between patients, and to formulate hypotheses about their stratification.