3 research outputs found

    Cost-Sensitive Learning for Recurrence Prediction of Breast Cancer

    Breast cancer is one of the leading causes of cancer death and accounts for 10.4% of all cancer incidences among women. Predicting breast cancer recurrence remains a challenging research problem. Data mining techniques have recently received considerable attention, especially for constructing prognosis models from survival data. However, existing data mining techniques may not handle censored data effectively: censored instances are often discarded when classification techniques are applied to prognosis. In this paper, we propose a cost-sensitive learning approach that includes the censored data in prognostic assessment and improves recurrence prediction capability. The proposed approach employs an outcome inference mechanism to infer the possible probabilistic outcome of each censored instance, and adopts cost-proportionate rejection sampling and a committee machine strategy to take these instances with probabilistic outcomes into account during the classification model learning process. We empirically evaluate the effectiveness of the proposed approach for breast cancer recurrence prediction, using a censored-data-discarding method (i.e., building the recurrence prediction model from uncensored data only) and the Kaplan-Meier method (a common prognosis method) as performance benchmarks. Overall, our evaluation results suggest that the proposed approach outperforms its benchmark techniques, as measured by precision, recall and F1 score.
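
The sampling step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each censored instance arrives with an already inferred recurrence probability, that such an instance is expanded into two weighted examples (one per possible outcome), and that uncensored instances carry weight 1. The function names and the data layout are hypothetical.

```python
import random


def rejection_sample(examples, rng):
    """Cost-proportionate rejection sampling.

    examples: list of (features, label, weight) tuples.
    Each example is kept with probability weight / max_weight.
    """
    if not examples:
        return []
    z = max(w for _, _, w in examples)  # normalising constant (largest weight)
    if z <= 0:
        return []
    return [(x, y) for x, y, w in examples if rng.random() < w / z]


def expand_censored(features, p_recur):
    """Assumption: a censored instance becomes two weighted examples,
    one per possible outcome, weighted by the inferred probability."""
    return [(features, 1, p_recur), (features, 0, 1.0 - p_recur)]


def build_committee_samples(uncensored, censored, n_members, seed=0):
    """Draw one rejection sample per committee member.

    uncensored: list of (features, label) with known outcome (weight 1.0).
    censored:   list of (features, p_recur) with inferred probability.
    """
    rng = random.Random(seed)
    weighted = [(x, y, 1.0) for x, y in uncensored]
    for x, p in censored:
        weighted.extend(expand_censored(x, p))
    return [rejection_sample(weighted, rng) for _ in range(n_members)]
```

Each returned sample would then train one base classifier, and the committee's predictions would be combined (for example by majority vote). How the members are combined is not stated in the abstract, so that step is left out here.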

    Extracting Predictor Variables to Construct Breast Cancer Survivability Model with Class Imbalance Problem

    Applying data mining methods as a decision support system is of great benefit for predicting the survival of new patients. It also offers health researchers great potential for investigating the relationship between risk factors and cancer survival. However, because datasets associated with breast cancer survival are imbalanced, achieving accurate survival prognosis models is a challenge for researchers. This study aims to develop a predictive model for the 5-year survivability of breast cancer patients and to discover relationships between certain predictive variables and survival. The dataset was obtained from the SEER database. First, the effectiveness of two synthetic oversampling methods, Borderline SMOTE and the Density-based Synthetic Oversampling method (DSO), is investigated for solving the class imbalance problem. Then a combination of particle swarm optimization (PSO) and correlation-based feature selection (CFS) is used to identify the most important predictive variables. Finally, to build a predictive model, three classifiers, decision tree (C4.5), Bayesian Network, and Logistic Regression, are applied to the cleaned dataset. Accuracy, sensitivity, specificity, and G-mean are used to evaluate the performance of the proposed hybrid approach, and the area under the ROC curve (AUC) is used to evaluate the performance of the feature selection method. Results show that, among all combinations, DSO + PSO_CFS + C4.5 achieves the best performance in terms of accuracy, sensitivity, G-mean and AUC, with values of 94.33%, 0.930, 0.939 and 0.939, respectively.
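
The assessment metrics named in this abstract can be computed from a binary confusion matrix. The sketch below uses the standard definitions, with G-mean taken as the geometric mean of sensitivity and specificity. The function name and the example counts are hypothetical and are not taken from the study.

```python
import math


def imbalance_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, specificity and G-mean
    from binary confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    g_mean = math.sqrt(sensitivity * specificity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": g_mean,
    }


# Hypothetical counts for illustration only (not from the SEER experiments).
metrics = imbalance_metrics(tp=90, fn=10, tn=80, fp=20)
```

G-mean is useful on imbalanced data because it falls towards zero whenever either class is predicted poorly, whereas accuracy can stay high when a classifier favours the majority class.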

    The role of molecular mechanisms of neoangiogenesis as tumor markers in the treatment individualization of breast cancer patients

    SUMMARY: Breast cancer is the leading cause of cancer-related mortality among women worldwide. The World Health Organization has estimated that, in this century, one in eight women on the planet will be affected by this disease. Despite the progress made in diagnosis and treatment, morbidity and mortality from breast cancer remain high, so new approaches to treating this disease are needed. Today's oncology protocols are too rigid and approximate, and they somewhat neglect the specifics of each patient and the biology of each patient's tumor. The aim is therefore individualization/personalization of therapy that matches the biological profile of each individual patient, which would lead to the introduction of new oncological treatment protocols and the improvement of existing ones. This requires a multidisciplinary approach in which experts from different fields collaborate, and which includes data mining systems for data processing as well as promising but so far insufficiently investigated treatment modalities, such as electroporation, electrochemotherapy and phytotherapy. The first part of the research is a prospective study involving patients diagnosed with breast cancer at the Clinical Center (CC) Kragujevac over a five-year follow-up period. During surgery performed routinely at the Clinic for General and Chest Surgery at CC Kragujevac, samples of breast carcinoma and peritumoral tissue were taken, and, in addition to standard pathohistological examinations, further analyses were performed: determination of the concentration of metalloproteinase 9 (MMP-9); examination of gene expression of the neoangiogenesis parameters VEGF-A, HIF-1, CXCL-12 and iNOS (Quantitative/Real Time PCR); and examination of protein expression by the immunofluorescence method (VEGF165b and CXCR-4). The second part of the research is an in vitro examination of new forms of therapy, electroporation and electrochemotherapy, on cancer (MDA-MB-231, MCF-7, SW-480, HCT-116) and healthy (MRC-5, HUVEC and hAoSm) immortalized cell lines. The cytotoxic effects of electroporation and electrochemotherapy on the examined cell lines were monitored in real time using the xCELLigence system (Real Time Cell Analysis, RTCA), and the type of cell death was analyzed by the acridine orange/ethidium bromide microscopic method. To find new antineoplastic treatments for breast cancer, the anti-invasive effect of leaf extracts of the invasive plants Robinia pseudoacacia (L) and Amorpha fruticosa (L) on MRC-5 and MDA-MB-231 cells was examined by monitoring the relative expression of the MMP-9, VEGF-A, HIF-1α, CXCL-12 and iNOS genes (Quantitative/Real Time PCR method). For data mining, software based on machine learning techniques was used, trained for the specific problem of predicting recurrence and metastasis, the most important prognostic parameters of disease outcome.