302 research outputs found

    An improvement in support vector machine classification model using grey relational analysis for cancer diagnosis

    To further improve classifier accuracy for cancer diagnosis, a hybrid model called GRA-SVM, which combines a Support Vector Machine (SVM) classifier with Grey Relational Analysis (GRA) filter feature selection, is proposed and tested on the Wisconsin Breast Cancer Dataset (WBCD) and the BUPA Disorder Dataset. The performance of GRA-SVM is compared with that of SVM in terms of accuracy, sensitivity, specificity, and Area Under the Curve (AUC). The experimental results reveal that GRA-SVM improves SVM accuracy by about 0.48 using only two features on the WBCD dataset. On the BUPA dataset, GRA-SVM improves SVM accuracy by about 0.97 using four features. Besides improving accuracy, GRA-SVM also produces a ranking scheme that indicates the priority of each feature. Based on these benefits, GRA-SVM is recommended as a new approach for obtaining more accurate results in cancer diagnosis.
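
    The feature-ranking step described in this abstract can be illustrated with a minimal sketch of grey relational analysis. It assumes the class label is used as the reference sequence, min-max normalisation, and the conventional distinguishing coefficient of 0.5; the abstract specifies none of these, and the function names and toy data are hypothetical.

```python
def min_max(values):
    """Scale a sequence to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def grey_relational_grades(features, reference, zeta=0.5):
    """Return {feature name: grey relational grade} against the reference.

    features  -- dict mapping a feature name to its list of values
    reference -- reference sequence (assumed here to be the class label)
    zeta      -- distinguishing coefficient, conventionally 0.5
    """
    ref = min_max(reference)
    # Absolute differences between the reference and each normalised feature.
    deltas = {
        name: [abs(r - x) for r, x in zip(ref, min_max(column))]
        for name, column in features.items()
    }
    every_delta = [d for column in deltas.values() for d in column]
    d_min, d_max = min(every_delta), max(every_delta)
    if d_max == 0:
        # Every feature matches the reference exactly.
        return {name: 1.0 for name in deltas}
    # The grade is the mean grey relational coefficient over all samples.
    return {
        name: sum((d_min + zeta * d_max) / (d + zeta * d_max) for d in column)
        / len(column)
        for name, column in deltas.items()
    }


# Toy data (hypothetical): one feature tracks the label, the other does not.
labels = [0, 0, 1, 1, 0, 1]
toy_features = {
    "f_strong": [1.0, 1.2, 3.9, 4.1, 0.9, 4.0],
    "f_weak": [2.0, 3.5, 2.1, 3.4, 2.9, 2.2],
}
grades = grey_relational_grades(toy_features, labels)
ranking = sorted(grades, key=grades.get, reverse=True)
```

    In a pipeline like the one described, the top-ranked features would then be passed to the SVM; that step is omitted here.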

    Imbalanced data classification using support vector machine based on simulated annealing for enhancing penalty parameter

    For pattern classification and regression problems, the support vector machine (SVM) is a prominent and computationally powerful machine learning method. It has effectively addressed practical problems across a wide range of domains. SVM has a key parameter called the penalty factor C. The choice of this parameter has a substantial impact on the classification precision of SVM, as unsuitable parameter settings can lead to poor classification results. The penalty factor C is required to achieve an adequate trade-off between classification errors and generalisation performance. Hence, building an SVM model with good performance requires parameter optimisation. The simulated annealing (SA) algorithm is employed to formulate a hybrid method for tuning SVM parameters. In addition, the aim is to improve the system's effectiveness by obtaining the optimal penalty parameter while balancing classification performance. Our experiments with many UCI datasets indicate that the proposed technique can attain improved classification precision.
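
    The idea of tuning the penalty factor C with simulated annealing can be sketched as follows. This is not the authors' implementation: the search over log10(C), the Gaussian step size, the geometric cooling schedule, and the `toy_cv_score` surrogate (a stand-in for a cross-validated SVM score) are all assumptions made for illustration.

```python
import math
import random


def anneal_penalty_c(score, log_c_bounds=(-3.0, 3.0), steps=200,
                     t0=1.0, cooling=0.97, seed=0):
    """Search for a C that maximises score(C) using simulated annealing.

    score        -- callable taking C and returning a value to maximise
    log_c_bounds -- search range for log10(C)
    Returns (best_c, best_score).
    """
    rng = random.Random(seed)
    lo, hi = log_c_bounds
    current = rng.uniform(lo, hi)
    current_score = score(10 ** current)
    best, best_score = current, current_score
    temp = t0
    for _ in range(steps):
        # Propose a nearby value of log10(C), clipped to the bounds.
        candidate = min(hi, max(lo, current + rng.gauss(0.0, 0.5)))
        candidate_score = score(10 ** candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature falls.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = current, current_score
        temp *= cooling
    return 10 ** best, best_score


def toy_cv_score(c):
    """Hypothetical surrogate for cross-validated performance, peaking at C = 10."""
    return 1.0 - 0.05 * (math.log10(c) - 1.0) ** 2


best_c, best_score = anneal_penalty_c(toy_cv_score)
```

    In practice `score` would train and validate an SVM for each candidate C, and for imbalanced data it would likely use a class-balanced metric rather than plain accuracy.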

    Comparison of Machine Learning Classification Methods in Hepatitis C Virus

    The hepatitis C virus (HCV) is considered a major public health problem. Around 120-130 million people, or 3% of the world's total population, are infected with HCV. Without treatment, most acute infections become chronic, followed by liver diseases such as cirrhosis and liver cancer. The data parameters used in this study were albumin (ALB), bilirubin (BIL), choline esterase (CHE), γ-glutamyl-transferase (GGT), aspartate amino-transferase (AST), alanine amino-transferase (ALT), cholesterol (CHOL), creatinine (CREA), protein (PROT), and alkaline phosphatase (ALP). This research proposes a methodology based on machine learning classification methods, including k-nearest neighbors (KNN), naïve Bayes, neural network (NN), and random forest (RF). The aim of this study is to evaluate the accuracy of these machine learning classification algorithms in detecting HCV. The results show that NN achieves the highest accuracy, 95.12%, compared with 89.43%, 90.24%, and 94.31% for KNN, naïve Bayes, and RF, respectively.

    Improving KNN by Gases Brownian Motion Optimization Algorithm to Breast Cancer Detection

    In the last decade, information technology and artificial intelligence algorithms have been widely applied to collecting information on cancer patients and to detecting cancer through various proposed detection algorithms. The K-Nearest-Neighbor (KNN) classification algorithm is one of the most popular detection algorithms, but it faces two challenges: determining the value of k, and a computational cost proportional to the size of the data and the sample selected for training. In this paper, the Gases Brownian Motion Optimization (GBMO) algorithm is used to improve KNN performance for breast cancer detection. To this end, each gas molecule encodes information such as a selected subset of features to which KNN is applied and the value of k. GBMO has a lower time-complexity order than other algorithms and has also been observed to perform better than other optimization algorithms in other applications. GBMO and three well-known meta-heuristic algorithms, namely Genetic Algorithm (GA), Particle Swarm Optimization (PSO), and Imperialist Competitive Algorithm (ICA), were implemented on five benchmark functions and their results compared. GBMO+KNN was then applied to three benchmark breast cancer datasets from UCI, and the results were compared with those of other existing cancer detection algorithms. These comparisons show that the proposed detection algorithm significantly improves classification accuracy.
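
    The candidate encoding mentioned in this abstract (a feature subset plus a value of k) can be made concrete with a small sketch. It shows only how one candidate might be scored; the GBMO search itself is not implemented. Leave-one-out accuracy as the fitness, squared Euclidean distance, and all names and toy data are assumptions, not details taken from the paper.

```python
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest rows (squared Euclidean distance)."""
    neighbours = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)), label)
        for row, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


def candidate_fitness(candidate, x, y):
    """Leave-one-out accuracy of KNN for one (feature_mask, k) candidate."""
    mask, k = candidate
    columns = [i for i, keep in enumerate(mask) if keep]
    if not columns or k < 1:
        return 0.0  # An empty feature subset or invalid k scores zero.
    reduced = [[row[i] for i in columns] for row in x]
    correct = 0
    for i, query in enumerate(reduced):
        # Hold out sample i and classify it from the remaining samples.
        train_x = reduced[:i] + reduced[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_x, train_y, query, k) == y[i]:
            correct += 1
    return correct / len(reduced)


# Toy data (hypothetical): column 0 separates the classes, column 1 does not.
toy_x = [[0.1, 5.0], [0.2, 1.0], [0.3, 9.0],
         [0.9, 1.5], [1.0, 8.5], [1.1, 5.5]]
toy_y = [0, 0, 0, 1, 1, 1]
fitness_informative = candidate_fitness(([1, 0], 1), toy_x, toy_y)
fitness_noise = candidate_fitness(([0, 1], 1), toy_x, toy_y)
```

    An optimizer such as GBMO, GA, or PSO would propose many such candidates and keep those with the highest fitness.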

    Data Mining Analysis for Hepatitis Disease Prediction Using the Naive Bayes and Support Vector Machine Methods

    The problem of hepatitis prediction has previously been addressed using the Support Vector Machine (SVM) method. Hepatitis is an inflammatory disease of the liver caused by a viral infection that attacks and damages liver cells and impairs liver function. Hepatitis is a precursor of liver cancer. The data have 20 attributes, consisting of 19 predictor attributes and 1 output attribute used to differentiate the examination results. The Invene dataset from the University of California, Irvine (UCI) Machine Learning Repository, with 583 records, is used; after missing values are replaced, only 153 records are used to evaluate the SVM approach proposed in this study. The simulation results show that the developed model achieves dimensionality reduction and identification of liver cancer. One fairly popular algorithm is Naïve Bayes. In this study, the Support Vector Machine (SVM) classification algorithm is also used to build a predictive classification model for hepatitis.

    Analysis and Design of Detection for Liver Cancer using Particle Swarm Optimization and Decision Tree

    Liver cancer is a major cause of death worldwide. According to the World Health Organization (WHO), 9.6 million people die of cancer worldwide every year. Liver cancer is the eighth leading cause of death in women and the fifth in men, as reported by the American Cancer Society. The death rate due to cancer is projected to increase by 45 percent between 2008 and 2030. The most common cancers are lung, breast, liver, and colorectal. Approximately 782,000 people die of liver cancer each year. The most effective way to reduce the death rate from liver cancer is to treat the disease at an early stage. Early treatment depends on early diagnosis, which in turn depends on reliable diagnostic methods. CT imaging is one of the most common and important techniques, and it serves as an imaging tool for evaluating patients with suspected liver cancer. The diagnosis of liver cancer has historically been made manually by a skilled radiologist, who relied on expertise and personal judgement to reach a conclusion. The main objective of this paper is to develop automatic methods based on a machine learning approach for accurate detection of liver cancer, in order to help radiologists in clinical practice. The paper's primary contribution is to liver cancer lesion classification and automatic detection for clinical diagnosis. Approaches based on PSO and DPSO are presented for detecting liver cancer lesions. Wavelet-based statistical and morphological features were extracted and classified with the C4.5 decision tree classifier.

    Gene selection and classification in autism gene expression data

    Autism spectrum disorders (ASD) are neurodevelopmental disorders that are currently diagnosed on the basis of abnormal stereotyped behaviour as well as observable deficits in communication and social functioning. Although a variety of candidate genes have been attributed to the disorder, no single gene is applicable to more than 1–2% of the general ASD population. Despite extensive efforts, definitive genes that contribute to autism susceptibility have yet to be identified. The major problems in dealing with autism gene expression datasets include the limited number of samples and substantial noise due to experimental measurement errors and natural variation. In this study, a systematic combination of three important filters, namely t-test (TT), Wilcoxon Rank Sum (WRS), and Feature Correlation (COR), is applied along with an efficient wrapper algorithm based on geometric binary particle swarm optimization-support vector machine (GBPSO-SVM), with the aim of selecting and classifying the genes most strongly associated with autism. A new approach based on the criteria of median ratio, mean ratio, and variance deviations is also applied to reduce the initial dataset before filtering. The results showed that the most discriminative genes identified in the first and last selection steps included a recurring gene (CAPS2), which was identified as the highest-risk ASD gene. The fused gene subsets selected by the GBPSO-SVM algorithm increased the classification accuracy to about 92.10%, which is higher than the accuracies reported in the literature for the same autism dataset. Notably, an ensemble using random forest (RF) showed better performance than previous studies. However, the ensemble approach that used SVM as an integrator of the fused genes from the output branches of GBPSO-SVM outperformed the RF integrator. The overall improvement was ascribed to the selection strategies used to reduce the dataset and to the use of the efficient wrapper-based GBPSO-SVM algorithm.
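
    The filter stage described in this abstract can be illustrated with a minimal sketch of one of the three filters, the t-test. It ranks genes by the absolute value of Welch's t statistic between cases and controls and keeps the top few. The use of Welch's variant, the 0/1 label coding, and the gene names and expression values are all hypothetical; the WRS and COR filters and the GBPSO-SVM wrapper are not shown.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if standard_error == 0:
        return 0.0
    return (mean(a) - mean(b)) / standard_error


def top_genes_by_t(expression, labels, keep):
    """Return the `keep` gene names with the largest absolute t statistic.

    expression -- dict mapping a gene name to per-sample expression values
    labels     -- per-sample class labels, 1 for cases and 0 for controls
    """
    scores = {}
    for gene, values in expression.items():
        cases = [v for v, label in zip(values, labels) if label == 1]
        controls = [v for v, label in zip(values, labels) if label == 0]
        scores[gene] = abs(welch_t(cases, controls))
    return sorted(scores, key=scores.get, reverse=True)[:keep]


# Toy expression matrix with hypothetical gene names and values.
sample_labels = [1, 1, 1, 0, 0, 0]
toy_expression = {
    "gene_a": [5.1, 5.3, 4.9, 2.0, 2.2, 1.9],
    "gene_b": [3.0, 3.4, 2.6, 3.1, 2.9, 3.2],
    "gene_c": [1.0, 2.0, 3.0, 2.5, 3.5, 4.5],
}
selected = top_genes_by_t(toy_expression, sample_labels, keep=2)
```

    A filter like this is cheap to compute, which is why it is typically run before a more expensive wrapper search over the surviving genes.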