21,714 research outputs found

    Multiclass Cancer Classification by Using Fuzzy Support Vector Machine and Binary Decision Tree With Gene Selection

    We investigate multiclass cancer classification with gene selection from gene expression data. Two multiclass classifiers with gene selection are proposed: a fuzzy support vector machine (FSVM) with gene selection, and an SVM-based binary classification tree with gene selection. Using the F test and SVM-based recursive feature elimination (SVM-RFE) as gene selection methods, we test three combinations in our experiments: the SVM-based binary classification tree with the F test, the SVM-based binary classification tree with SVM-RFE, and FSVM with SVM-RFE. To accelerate computation, the strongest genes are also preselected. The proposed techniques are applied to breast cancer data, small round blue-cell tumor data, and acute leukemia data. Compared with existing multiclass cancer classifiers and with the two binary classification tree variants described in this paper, FSVM with SVM-RFE finds the most important genes affecting certain types of cancer with high recognition accuracy.
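    The F test named in the abstract above can be illustrated with a minimal sketch. It assumes the standard one-way ANOVA F statistic computed per gene and ranks genes by that score. The function names and the toy expression matrix are hypothetical, and the paper's exact preselection procedure is not reproduced here.

```python
from statistics import mean


def f_statistic(values, labels):
    """One-way ANOVA F statistic for one gene across the class labels."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    grand_mean = mean(values)
    # Variation of the class means around the grand mean.
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Variation of the samples around their own class mean.
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))


def rank_genes(expression, labels, top):
    """Return the indices of the `top` genes with the largest F statistic.

    `expression` is a list of samples; each sample is a list of gene values.
    """
    n_genes = len(expression[0])
    scores = [
        f_statistic([sample[j] for sample in expression], labels)
        for j in range(n_genes)
    ]
    order = sorted(range(n_genes), key=lambda j: scores[j], reverse=True)
    return order[:top]


# Hypothetical toy data: 6 samples, 3 genes, 3 classes.
expression = [
    [1.0, 5.0, 0.2],
    [1.2, 5.1, 0.9],
    [3.0, 5.0, 0.1],
    [3.1, 4.9, 0.8],
    [5.0, 5.2, 0.3],
    [5.2, 5.0, 0.7],
]
labels = ["A", "A", "B", "B", "C", "C"]
top_genes = rank_genes(expression, labels, top=2)
```

    In this toy data, gene 0 separates the three classes cleanly and therefore ranks first. A filter of this kind is cheap to compute, which is why it is a plausible preselection step before a costlier wrapper method such as SVM-RFE.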

    A Hybrid Approach Support Vector Machine (SVM) – Neuro Fuzzy for Fast Data Classification

    Over the past decade, the support vector machine (SVM) has been widely used as a machine learning method in several application domains, because it performs well on data classification problems, particularly non-linear ones. Nevertheless, several studies indicate that SVM still has shortcomings, especially high time complexity in the testing phase, caused by the growing number of support vectors for high-dimensional data. To address this problem, we propose a hybrid approach, SVM – Neuro Fuzzy (SVMNF), in which neuro-fuzzy is used to avoid the influence of the support vectors in the testing phase of SVM. Our approach is also equipped with feature selection that reduces the number of data attributes in the testing phase, which improves computation time. In our evaluation on real benchmark datasets, our approach outperformed SVM in the testing phase on data classification problems without significantly affecting the accuracy of SVM.

    Constructing L2-SVM-based fuzzy classifiers in high-dimensional space with automatic model selection and fuzzy rule ranking

    In this paper, a new scheme for constructing parsimonious fuzzy classifiers is proposed based on the L2-support vector machine (L2-SVM) technique with model selection and feature ranking performed simultaneously in an integrated manner, in which fuzzy rules are optimally generated from data by L2-SVM learning. In order to identify the most influential fuzzy rules induced from the SVM learning, two novel indexes for fuzzy rule ranking are proposed and named as α-values and ω-values of fuzzy rules in this paper. The α-values are defined as the Lagrangian multipliers of the L2-SVM and adopted to evaluate the output contribution of fuzzy rules, while the ω-values are developed by considering both the rule base structure and the output contribution of fuzzy rules. As a prototype-based classifier, the L2-SVM-based fuzzy classifier evades the curse of dimensionality in high-dimensional space in the sense that the number of support vectors, which equals the number of induced fuzzy rules, is not related to the dimensionality. Experimental results on high-dimensional benchmark problems have shown that by using the proposed scheme the most influential fuzzy rules can be effectively induced and selected, and at the same time feature ranking results can also be obtained to construct parsimonious fuzzy classifiers with better generalization performance than the well-known algorithms in literature. © 2007 IEEE

    An Efficient Fuzzy Clustering-Based Approach for Intrusion Detection

    The need to increase accuracy in detecting sophisticated cyber attacks poses a great challenge not only to the research community but also to corporations. So far, many approaches have been proposed to cope with this threat. Among them, data mining has made remarkable contributions to the intrusion detection problem. However, the generalization ability of data mining-based methods remains limited, and hence detecting sophisticated attacks remains a tough task. In this vein, we present a novel method based on both clustering and classification for developing an efficient intrusion detection system (IDS). The key idea is to use information extracted by fuzzy clustering when building an IDS. To this end, we first present the cornerstones for constructing additional cluster features for a training set. Then, we give an algorithm that generates an IDS based on these cluster features and the original input features. Finally, we show experimentally that our method outperforms several well-known methods. Comment: 15th East-European Conference on Advances in Databases and Information Systems (ADBIS 11), Vienna, Austria (2011)

    Entropy Based Fuzzy Support Vector Machine (EFSVM) for Imbalanced Microarray Data Classification

    DNA microarrays are data containing gene expression with small sample sizes and a very large number of features. Furthermore, class imbalance is a common problem in microarray data. It occurs when a dataset is dominated by a majority class that has significantly more instances than the minority classes. A classification method is therefore needed that can handle both high-dimensional and imbalanced data. SVM is a classification method capable of handling large or small samples, nonlinearity, high dimensionality, overfitting, and local minimum issues. SVM has been widely applied to DNA microarray data classification and has been shown to perform best among machine learning methods. However, imbalanced data is a problem for SVM because it treats all samples as equally important, so the results are biased against the minority class. To overcome imbalanced data, EFSVM is proposed. This method applies a fuzzy membership to each input point and reformulates the SVM so that different input points make different contributions to the classifier. Samples with higher class certainty, as measured by entropy, are assigned larger fuzzy memberships. Minority-class samples are given large fuzzy memberships, and EFSVM pays more attention to samples with larger fuzzy membership. Because DNA microarray data is high-dimensional with a very large number of features, feature selection is performed first using FCBF. The results show that feature selection improves classification performance. Overall, EFSVM attains the highest AUC compared with SVM and FSVM, although on data that has undergone feature selection, FSVM and EFSVM give nearly identical results.
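    The entropy-based membership idea in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions, not the exact EFSVM formulation: class certainty is estimated from the class proportions among each sample's k nearest neighbours, minority samples receive membership 1, and majority samples receive a membership that shrinks as neighbourhood entropy grows. The function names, the two-class setting, and the toy points are all hypothetical.

```python
import math


def knn_entropy(points, labels, i, k):
    """Entropy of the class proportions among the k nearest neighbours of sample i."""
    dists = sorted(
        (math.dist(points[i], points[j]), j)
        for j in range(len(points))
        if j != i
    )
    neighbours = [labels[j] for _, j in dists[:k]]
    entropy = 0.0
    for cls in set(neighbours):
        p = neighbours.count(cls) / k
        entropy -= p * math.log(p)
    return entropy


def fuzzy_memberships(points, labels, minority, k=3):
    """Assumed scheme: minority samples get 1.0; majority samples get
    1 - H / ln(2), so a low-entropy (high-certainty) sample gets a larger value."""
    max_entropy = math.log(2)  # upper bound on entropy for two classes
    memberships = []
    for i, label in enumerate(labels):
        if label == minority:
            memberships.append(1.0)
        else:
            h = knn_entropy(points, labels, i, k)
            memberships.append(1.0 - h / max_entropy)
    return memberships


# Hypothetical toy data: class 0 is the majority, class 1 the minority.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (3.9, 4.0),
          (4.0, 4.0), (4.2, 4.1)]
labels = [0, 0, 0, 0, 0, 1, 1]
memberships = fuzzy_memberships(points, labels, minority=1, k=3)
```

    Here the majority sample at (3.9, 4.0) sits among minority samples, so its neighbourhood entropy is high and its membership is small, while majority samples deep inside their own cluster keep membership 1. An odd k avoids a membership of exactly zero in the two-class case. Weights of this kind would typically be passed to a weighted SVM solver as per-sample weights.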

    FSL-BM: Fuzzy Supervised Learning with Binary Meta-Feature for Classification

    This paper introduces a novel real-time Fuzzy Supervised Learning with Binary Meta-Feature (FSL-BM) method for big data classification tasks. The study of real-time algorithms addresses several major concerns, namely accuracy, memory consumption, the ability to stretch assumptions, and time complexity. Attaining a fast computational model that provides fuzzy logic and supervised learning is one of the main challenges in machine learning. In this paper, we present the FSL-BM algorithm as an efficient solution for supervised learning with fuzzy logic processing, using a binary meta-feature representation together with Hamming distance and a hash function to relax assumptions. While many studies during the last decade focused on reducing time complexity and increasing accuracy, the novel contribution of the proposed solution comes from integrating Hamming distance, a hash function, binary meta-features, and binary classification to provide a real-time supervised method. The hash table (HT) component gives fast access to existing indices and therefore generates new indices in constant time, which supersedes existing fuzzy supervised algorithms with better or comparable results. To summarize, the main contribution of this technique for real-time fuzzy supervised learning is to represent the hypothesis through binary input as a meta-feature space and to create the fuzzy supervised hash table to train and validate the model. Comment: FICC201

    A Review of Fault Diagnosing Methods in Power Transmission Systems

    Transient stability is important in power systems. Disturbances such as faults need to be isolated to restore transient stability. This paper presents a comprehensive review of fault diagnosing methods in power transmission systems. Typically, voltage and current samples are used for analysis. Three tasks (fault detection, classification, and location) are presented separately to convey a more logical and comprehensive understanding of the concepts. Feature extraction and transformations with dimensionality reduction methods are discussed. Fault classification and location techniques largely use artificial intelligence (AI) and signal processing methods. After the discussion of overall methods and concepts, advancements and future aspects are discussed. Generalized strengths and weaknesses of different AI and machine learning-based algorithms are assessed. A comparison of different fault detection, classification, and location methods is also presented, considering features, inputs, complexity, system used, and results. This paper may serve as a guideline for researchers to understand different methods and techniques in this field.