23 research outputs found

    A Review on Data Mining Techniques for Prediction of Breast Cancer Recurrence

    Breast cancer is the most common type of cancer in women worldwide. It may be detected early using mammograms, possibly before it has spread. Recurrent breast cancer can occur months or years after initial treatment. The cancer may return in the same place as the original cancer (local recurrence), or it may spread to other areas of the body (distant recurrence). Early-stage treatment aims not only to cure breast cancer but also to help prevent its recurrence. Data mining algorithms can assist in predicting early-stage breast cancer, which has long been a difficult research problem. The proposed research aims to identify the most effective algorithm for predicting breast cancer recurrence and to improve the accuracy of the algorithms. Data mining applications and techniques such as clustering, classification, association rules, prediction, neural networks, and decision trees can be used to analyze large volumes of data

    "Web-Mobile" Application, Derived from Artificial Intelligence Methods, for Predicting Testosterone Deficiency in Men

    Testosterone (T) is considered the most important sex hormone in male physiology and has a significant impact on physical and psychological parameters, making it an important indicator of good general health and quality of life. Some patients may present with testosterone deficiency (TD), a condition defined as the presence of persistently low serum T levels together with clinical symptoms (Shalender et al., 2010)

    Breast cancer classification using machine learning techniques: a comparative study

    Background: Breast cancer is the second deadliest disease affecting women worldwide, after lung cancer. Traditional approaches to breast cancer diagnosis are time-consuming and prone to human error in classification. To address these problems, many studies based on machine learning techniques have been proposed. These approaches have proven effective for data classification in many fields, especially healthcare. Methods: In this cross-sectional study, we conducted a practical comparison of the machine learning algorithms most used in the literature. We applied kernel and linear support vector machines, random forest, decision tree, multi-layer perceptron, logistic regression, and k-nearest neighbors to breast cancer tumor classification. The dataset used is the Wisconsin Diagnostic Breast Cancer dataset. Results: After comparing the efficiency of the machine learning algorithms, we found that multi-layer perceptron and logistic regression gave the best results, with an accuracy of 98% for breast cancer classification. Conclusion: Machine learning approaches are used extensively in medical prediction and decision support systems. This study showed that the multi-layer perceptron and logistic regression algorithms perform well (good accuracy, specificity, and sensitivity) compared with the other evaluated algorithms
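
    The three measures named in the conclusion (accuracy, sensitivity, specificity) can be sketched as follows. This is a minimal illustration, not the study's code: the function name `binary_metrics` and the label vectors are hypothetical, and 1 is assumed to mean malignant and 0 benign.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity and specificity for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    total = tp + tn + fp + fn
    nan = float("nan")
    return {
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / total if total else nan,
        # Share of malignant cases that were detected (true positive rate).
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Share of benign cases that were correctly cleared (true negative rate).
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Hypothetical labels, not data from the study: 1 = malignant, 0 = benign.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

    Reporting sensitivity and specificity alongside accuracy matters for diagnosis data, because a classifier can reach high accuracy on an imbalanced dataset while still missing malignant cases.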

    Breast Cancer Detection Via Wavelet Energy and Support Vector Machine

    © 2018 IEEE. Breast cancer is one of the most feared killers of women, and there are still no effective means of preventing or treating it. Research interest in the disease nevertheless continues to rise in academia. Traditional medical diagnosis relies mainly on observing the patient's symptoms to identify the disease, but its efficiency is poor and its scientific contribution is limited. With the rapid development of machine learning for data detection, computer-aided disease diagnosis has become a new and effective approach. This paper uses wavelet energy to extract breast cancer features and then builds a breast cancer prediction model that uses the data grouping capability of a support vector machine (SVM) to distinguish the characteristics of benign and malignant tumors. As a result, the accuracy of intelligent breast cancer diagnosis is improved and shown to be better than that of two state-of-the-art approaches
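
    The idea of wavelet energy features can be sketched as below. The abstract does not say which wavelet, how many decomposition levels, or what input representation the paper uses, so the Haar wavelet, the 1-D input, and the function names `haar_step` and `wavelet_energy_features` are all assumptions made for illustration only.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_energy_features(signal, levels):
    """Energy of the detail band at each level, then of the final approximation."""
    if len(signal) % (2 ** levels):
        raise ValueError("signal length must be divisible by 2**levels")
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in approx))
    return features


# Hypothetical 1-D signal (for example, one row of image intensities).
signal = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0]
features = wavelet_energy_features(signal, levels=2)
print(features)
```

    Because the Haar transform used here is orthonormal, the band energies sum to the energy of the original signal. A feature vector like this one is the kind of input that would then be passed to a classifier such as an SVM.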

    Advancements in Multi-Layer Perceptron Training to Improve Classification Accuracy

    Neural networks are popular classification tools used in medical diagnosis for early disease detection. The performance of a neural network is highly dependent on the training process, in which the individual weights between neurons are adjusted to obtain better classification results. Researchers have proposed and used many gradient-based and meta-heuristic training algorithms to improve the training performance of neural networks. However, both gradient-based and meta-heuristic algorithms have limitations when they are used individually. Hybrid algorithms are useful for overcoming these limitations and improving the performance of the Multi-Layer Perceptron network. This study presents a review of advancements in the Multi-Layer Perceptron network training process aimed at improving classification performance

    A Comparative Analysis of Data Mining Techniques on Breast Cancer Diagnosis Data using WEKA Toolbox

    Breast cancer is considered the second most common cancer in women compared to all other cancers. It is fatal in less than half of all cases and is the main cause of mortality in women. It accounts for 16% of all cancer mortalities worldwide. Early diagnosis of breast cancer increases the chance of recovery. Data mining techniques can be utilized in the early diagnosis of breast cancer. In this paper, an academic experimental breast cancer dataset is used to perform a practical data mining experiment using the Waikato Environment for Knowledge Analysis (WEKA) tool. The WEKA Java application is a rich resource for collecting performance metrics during the execution of experiments. Pre-processing and feature extraction are used to optimize the data. The classification process used in this study was summarized through thirteen experiments. Additionally, 10 experiments using different classification algorithms were conducted. The algorithms introduced were: Naïve Bayes, Logistic Regression, Lazy IBK (Instance-Based learning with parameter K), Lazy Kstar, Lazy Locally Weighted Learner, Rules ZeroR, Decision Stump, Decision Trees J48, Random Forest and Random Trees. The process of producing a predictive model was automated using classification accuracy. Further, several classification experiments on the Wisconsin Diagnostic Breast Cancer and Wisconsin Breast Cancer datasets were conducted to compare the success rates of the different methods. The results indicate that the Lazy IBK (k-NN) classifier can reach 98% accuracy in comparison with the other classifiers. The main advantages of the study were the compactness of using 13 different data mining models and 10 different performance measurements, and the plotting of classification error figures

    Expert cancer model using supervised algorithms with a LASSO selection approach

    Breast cancer is currently one of the most critical contributors to mortality in the medical field. Each year, a large number of men and women die of cancer owing to the lack of early diagnosis systems and proper treatment. To tackle this issue, various data mining approaches have been analyzed to build an effective model that helps identify the different stages of deadly cancers. The study proposes an early cancer detection model based on five supervised algorithms: logistic regression (LR), decision tree (DT), random forest (RF), support vector machine (SVM), and k-nearest neighbor (KNN). After appropriate preprocessing of the dataset, the least absolute shrinkage and selection operator (LASSO) was used for feature selection (FS) with a 10-fold cross-validation (CV) approach. Employing LASSO with 10-fold cross-validation is a novel step introduced in this research. Afterwards, different performance evaluation metrics were measured to assess the predictions of the proposed algorithms. The results indicate that the highest accuracy, approximately 99.41%, was obtained by the RF classifier combined with LASSO. Finally, a comprehensive comparison was carried out on the Wisconsin breast cancer (diagnostic) dataset (WBCD) against several recent works that use all features

    Performance Analysis of a new Filter and Wrapper Sequence for the Survivability Prediction of Breast Cancer Patients

    Feature selection is an essential preprocessing step for removing redundant or irrelevant features from multidimensional data to improve predictive performance. Clinical datasets are increasingly large and multidimensional, and not every feature helps with the required predictions. Feature selection techniques are therefore used to determine a relevant feature set that can improve the performance of a learning algorithm. This study presents a performance analysis of a new filter and wrapper sequence: the intersection of two filter methods, Mutual Information and Chi-Square, followed by one of two wrapper methods, Sequential Forward Selection or Sequential Backward Selection. The aim is to obtain a more informative feature set for improved prediction of the survivability of breast cancer patients from the clinical breast cancer dataset SEER. The improvement in performance due to this filter and wrapper sequence, in terms of Accuracy, False Positive Rate, False Negative Rate and Area under the Receiver Operating Characteristic curve, is tested using the machine learning algorithms Logistic Regression, K-Nearest Neighbour, Decision Tree, Random Forest, Support Vector Machine and Multilayer Perceptron. The performance analysis favours Sequential Backward Selection over Sequential Forward Selection within the new filter and wrapper sequence for the SEER dataset
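
    The filter-then-wrapper sequence described above can be sketched as follows, under stated assumptions. The feature names, the filter scores, the top-4 cut-off, and the toy `evaluate` function are all hypothetical; a real wrapper would score each subset with a cross-validated classifier, and the abstract does not specify the stopping rule, so the one used here (stop when removing any feature lowers the score) is an assumption.

```python
def top_k(scores, k):
    """Names of the k highest-scoring features under one filter method."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def sequential_backward_selection(features, evaluate, min_features=1):
    """Greedily drop one feature at a time while the score does not decrease."""
    current = list(features)
    best_score = evaluate(current)
    while len(current) > min_features:
        # Try removing each remaining feature in turn.
        candidates = [[f for f in current if f != drop] for drop in current]
        scored = [(evaluate(c), c) for c in candidates]
        top_score, top_subset = max(scored, key=lambda pair: pair[0])
        if top_score < best_score:
            break  # every removal hurts, so stop
        best_score, current = top_score, top_subset
    return current, best_score


# Hypothetical filter scores (not values from the study).
mutual_info = {"age": 0.30, "tumor_size": 0.45, "grade": 0.40,
               "marital_status": 0.05, "nodes": 0.50}
chi_square = {"age": 12.0, "tumor_size": 30.0, "grade": 8.0,
              "marital_status": 20.0, "nodes": 41.0}

# Filter stage: keep only features ranked in the top 4 by BOTH filters.
candidates = top_k(mutual_info, 4) & top_k(chi_square, 4)


def evaluate(subset):
    """Toy stand-in for a cross-validated model score (an assumption)."""
    useful = {"nodes": 0.20, "tumor_size": 0.15}
    return 0.6 + sum(useful.get(f, 0.0) for f in subset) - 0.01 * len(subset)


# Wrapper stage: backward selection over the intersected set.
selected, score = sequential_backward_selection(sorted(candidates), evaluate)
print(selected, round(score, 2))
```

    Intersecting the two filters first shrinks the search space, which keeps the more expensive wrapper stage cheap.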

    Random Forest-Based Prediction of Stroke Outcome

    We investigate the clinical, biochemical and neuroimaging factors associated with the outcome of stroke patients in order to build a machine learning model that predicts mortality and morbidity 3 months after admission. The dataset consisted of prospectively registered patients with ischemic stroke (IS) and non-traumatic intracerebral hemorrhage (ICH) admitted to the Stroke Unit of a European tertiary hospital. We identified the main variables for a Random Forest (RF) model, generating a predictive model that can estimate patient mortality/morbidity for the following groups: (1) IS + ICH, (2) IS, and (3) ICH. A total of 6022 patients were included: 4922 (mean age 71.9 ± 13.8 years) with IS and 1100 (mean age 73.3 ± 13.1 years) with ICH. NIHSS at 24 and 48 h and axillary temperature at admission were the most important variables for patient evolution at 3 months. The IS + ICH group was the most stable for mortality prediction [area under the receiver operating characteristic curve (AUC) of 0.904 ± 0.025]. The IS group gave similar results, although variability between experiments was slightly higher (AUC of 0.909 ± 0.032). The ICH group was the one in which RF had the most difficulty making adequate predictions (AUC of 0.9837 vs. 0.7104). There were no major differences between the IS and IS + ICH groups in morbidity prediction (AUC of 0.738 and 0.755). A Shapiro-Wilk test, with the null hypothesis that the data follow a normal distribution, rejected normality with W = 0.93546 (p-value < 2.2e−16). Because the conditions required for a parametric test do not hold, we performed a paired Wilcoxon test under the null hypothesis that all groups have the same performance. The null hypothesis was rejected with a p-value < 2.2e−16, so there are statistically significant differences between the IS and ICH groups.
In conclusion, RF machine learning algorithms can be used effectively in stroke patients for long-term outcome prediction of mortality and morbidity. This study was partially supported by grants from the Spanish Ministry of Science and Innovation (SAF2017-84267-R), Xunta de Galicia (Axencia Galega de Innovación (GAIN): IN607A2018/3), Instituto de Salud Carlos III (ISCIII) (PI17/00540, PI17/01103), Spanish Research Network on Cerebrovascular Diseases RETICS-INVICTUS PLUS (RD16/0019) and by the European Union FEDER program. T. Sobrino (CPII17/00027) and F. Campos (CPII19/00020) are recipients of research contracts from the Miguel Servet Program (Instituto de Salud Carlos III). General Directorate of Culture, Education and University Management of Xunta de Galicia (ED431G/01,252 ED431D 2017/16), "Galician Network for Colorectal Cancer Research" (Ref. ED431D 2017/23), Competitive Reference Groups (ED431C 2018/49), Spanish Ministry of Economy and Competitiveness via funding of the unique installation BIOCAI (UNLC08-1E-002, UNLC13-13–3503), European Regional Development Funds (FEDER).