21,714 research outputs found
Multiclass Cancer Classification by Using Fuzzy Support Vector Machine and Binary Decision Tree With Gene Selection
We investigate multiclass cancer classification with gene selection from gene expression data. Two multiclass classifiers with gene selection are proposed: a fuzzy support vector machine (FSVM) with gene selection and an SVM-based binary classification tree with gene selection. Using the F test and SVM-based recursive feature elimination (SVM-RFE) as gene selection methods, our experiments test three combinations: the SVM-based binary classification tree with the F test, the SVM-based binary classification tree with SVM-RFE, and FSVM with SVM-RFE. To accelerate computation, the strongest genes are preselected. The proposed techniques are applied to breast cancer data, small round blue-cell tumor data, and acute leukemia data. Compared with existing multiclass cancer classifiers and with the two binary classification tree variants described in this paper, FSVM with SVM-RFE can find the most important genes affecting certain types of cancer with high recognition accuracy.
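The F-test preselection step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the F test is the usual one-way ANOVA F statistic computed per gene across tumor classes; the function names and the toy expression matrix are hypothetical and not taken from the paper.

```python
from statistics import mean


def f_score(values, labels):
    """One-way ANOVA F statistic for a single gene across classes."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    # Between-class and within-class mean squares.
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values()) / (k - 1)
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g) / (n - k)
    return between / within if within > 0 else float("inf")


def preselect_genes(X, labels, top):
    """Return the indices of the `top` genes with the largest F statistic."""
    n_genes = len(X[0])
    scores = [f_score([row[j] for row in X], labels) for j in range(n_genes)]
    return sorted(range(n_genes), key=lambda j: scores[j], reverse=True)[:top]


# Toy expression matrix: 6 samples x 3 genes, three classes (made-up numbers).
X = [
    [1.0, 5.0, 0.2],
    [1.2, 5.1, 0.9],
    [3.0, 5.0, 0.4],
    [3.1, 4.9, 0.8],
    [5.0, 5.2, 0.3],
    [5.2, 5.0, 0.7],
]
labels = ["A", "A", "B", "B", "C", "C"]
selected = preselect_genes(X, labels, top=1)
```

In this toy data only the first gene separates the three classes, so it is the one kept. A wrapper method such as SVM-RFE would then run on the reduced gene set.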
A Hybrid Approach Support Vector Machine (SVM) – Neuro Fuzzy for Fast Data Classification
In the recent decade, the support vector machine (SVM) has been widely used as a machine learning method in several application domains, because it performs well on data classification problems, particularly nonlinear ones. Nevertheless, several studies indicate that SVM still has some inadequacies, especially high time complexity in the testing phase, caused by the growing number of support vectors for high-dimensional data. To address this problem, we propose a hybrid approach, SVM – Neuro Fuzzy (SVMNF), in which neuro fuzzy is used to avoid the influence of support vectors in the testing phase of SVM. Our approach also includes feature selection, which reduces the number of data attributes in the testing phase and so improves computation time. In our evaluation on real benchmark datasets, our approach outperformed SVM in the testing phase for data classification problems without significantly affecting the accuracy of SVM.
Constructing L2-SVM-based fuzzy classifiers in high-dimensional space with automatic model selection and fuzzy rule ranking
In this paper, a new scheme for constructing parsimonious fuzzy classifiers is proposed based on the L2-support vector machine (L2-SVM) technique with model selection and feature ranking performed simultaneously in an integrated manner, in which fuzzy rules are optimally generated from data by L2-SVM learning. In order to identify the most influential fuzzy rules induced from the SVM learning, two novel indexes for fuzzy rule ranking are proposed and named as α-values and ω-values of fuzzy rules in this paper. The α-values are defined as the Lagrangian multipliers of the L2-SVM and adopted to evaluate the output contribution of fuzzy rules, while the ω-values are developed by considering both the rule base structure and the output contribution of fuzzy rules. As a prototype-based classifier, the L2-SVM-based fuzzy classifier evades the curse of dimensionality in high-dimensional space in the sense that the number of support vectors, which equals the number of induced fuzzy rules, is not related to the dimensionality. Experimental results on high-dimensional benchmark problems have shown that by using the proposed scheme the most influential fuzzy rules can be effectively induced and selected, and at the same time feature ranking results can also be obtained to construct parsimonious fuzzy classifiers with better generalization performance than the well-known algorithms in literature. © 2007 IEEE
An Efficient Fuzzy Clustering-Based Approach for Intrusion Detection
The need to increase accuracy in detecting sophisticated cyber attacks poses
a great challenge not only to the research community but also to corporations.
So far, many approaches have been proposed to cope with this threat. Among
them, data mining has brought on remarkable contributions to the intrusion
detection problem. However, the generalization ability of data mining-based
methods remains limited, and hence detecting sophisticated attacks remains a
tough task. In this vein, we present a novel method based on both clustering
and classification for developing an efficient intrusion detection system
(IDS). The key idea is to take useful information exploited from fuzzy
clustering into account for the process of building an IDS. To this aim, we
first present cornerstones to construct additional cluster features for a
training set. Then, we come up with an algorithm to generate an IDS based on
such cluster features and the original input features. Finally, we
show experimentally that our method outperforms several well-known methods.
Comment: 15th East-European Conference on Advances in Databases and
Information Systems (ADBIS 11), Vienna, Austria (2011)
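One way to picture "additional cluster features" is to append fuzzy membership degrees to each input record. The sketch below assumes the standard fuzzy c-means membership formula and cluster centers that have already been fitted; the abstract does not say how the paper actually constructs its cluster features, so the function names and toy values are hypothetical.

```python
import math


def fcm_memberships(x, centers, m=2.0):
    """Fuzzy c-means membership of point x in each cluster (sums to 1)."""
    dists = [math.dist(x, c) for c in centers]
    # A point lying exactly on a center belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]


def add_cluster_features(X, centers, m=2.0):
    """Append the membership degrees to the original input features."""
    return [list(x) + fcm_memberships(x, centers, m) for x in X]


# Assumed, already-fitted cluster centers and two toy records.
centers = [(0.0, 0.0), (4.0, 0.0)]
X = [(1.0, 0.0), (4.0, 0.0)]
augmented = add_cluster_features(X, centers)
```

With fuzzifier m = 2, the point (1, 0) is at distance 1 and 3 from the two centers, which gives memberships of 0.9 and 0.1. Any classifier can then be trained on the augmented rows.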
Entropy Based Fuzzy Support Vector Machine (EFSVM) for Imbalanced Microarray Data Classification
DNA microarray data contain gene expression measurements with a small sample size but a very large number of features. In addition, class imbalance is a common problem in microarray data. A classification method is therefore needed that can handle both high dimensionality and class imbalance. SVM is a classification method that can handle large or small samples, nonlinearity, high dimensionality, overfitting, and local-minimum problems. SVM has also been widely applied to DNA microarray data classification, where it has been found to give the best performance among machine learning methods. However, imbalanced data is a weakness for SVM: because SVM treats all samples as equally important, its results are biased against the minority class. One method that can handle imbalanced data is EFSVM. EFSVM achieves the highest AUC compared with SVM and FSVM. Because DNA microarray data are high dimensional with a very large number of features, feature selection must be performed first. In this study, imbalanced DNA microarray data are classified with EFSVM after feature selection using FCBF. The classification results show that feature selection improves classification performance. Adding entropy-based fuzzy membership yields the highest performance compared with SVM and FSVM, although on data that has undergone feature selection FSVM and EFSVM give nearly the same results.
DNA microarrays are data containing gene expression with small sample sizes and a high number of features. Furthermore, class imbalance is a common problem in microarray data. It occurs when a dataset is dominated by a majority class that has significantly more instances than the minority classes. A classification method is therefore needed that can handle high-dimensional and imbalanced data. SVM is a classification method capable of handling large or small samples, nonlinearity, high dimensionality, overfitting, and local-minimum issues. SVM has been widely applied to DNA microarray data classification, and it has been shown to provide the best performance among machine learning methods. However, imbalanced data is a problem because SVM treats all samples as equally important, so the results are biased against the minority class. To overcome the imbalance, EFSVM is proposed. This method applies a fuzzy membership to each input point and reformulates the SVM so that different input points make different contributions to the classifier. Samples with higher class certainty, as measured by entropy, are assigned larger fuzzy memberships. To reflect the importance of the minority class, its samples receive large fuzzy memberships, and EFSVM pays more attention to samples with larger fuzzy membership. Because DNA microarray data are high dimensional with a very large number of features, feature selection using FCBF is performed first. Based on the overall results, EFSVM has the highest AUC value compared with SVM and FSVM.
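The entropy-based membership idea described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis's exact scheme: local class probabilities are estimated from the k nearest neighbours, minority samples always receive membership 1, and majority samples receive `1 - beta * H`, where H is the normalized binary entropy. The names `k`, `beta`, and `entropy_memberships` are hypothetical.

```python
import math


def class_entropy(p_pos):
    """Binary entropy (natural log) of a local class probability."""
    if p_pos in (0.0, 1.0):
        return 0.0
    p_neg = 1.0 - p_pos
    return -p_pos * math.log(p_pos) - p_neg * math.log(p_neg)


def entropy_memberships(X, y, k=3, beta=0.5, minority=1):
    """Assign a fuzzy membership in (0, 1] to every training sample."""
    memberships = []
    for i, x in enumerate(X):
        if y[i] == minority:
            # Assumption: minority samples keep full weight.
            memberships.append(1.0)
            continue
        # k nearest neighbours of x, excluding x itself.
        others = [j for j in range(len(X)) if j != i]
        neighbours = sorted(others, key=lambda j: math.dist(x, X[j]))[:k]
        p_pos = sum(1 for j in neighbours if y[j] == minority) / k
        # Normalize entropy to [0, 1]; higher certainty gives larger membership.
        h = class_entropy(p_pos) / math.log(2)
        memberships.append(1.0 - beta * h)
    return memberships


# Toy 1-D data: five majority samples (label 0), two minority samples (label 1).
X = [(0.0,), (0.1,), (0.2,), (0.3,), (0.9,), (1.0,), (1.1,)]
y = [0, 0, 0, 0, 0, 1, 1]
weights = entropy_memberships(X, y, k=3, beta=0.5)
```

Here the majority sample at 0.9 sits next to the minority cluster, so its neighbourhood is mixed and its membership drops below 1. The resulting weights would scale each sample's slack penalty in a fuzzy SVM.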
FSL-BM: Fuzzy Supervised Learning with Binary Meta-Feature for Classification
This paper introduces a novel real-time Fuzzy Supervised Learning with Binary
Meta-Feature (FSL-BM) algorithm for big data classification tasks. The study of
real-time algorithms addresses several major concerns, namely accuracy, memory
consumption, the ability to stretch assumptions, and time complexity. Attaining
a fast computational model providing fuzzy logic and supervised learning is one
of the main challenges in machine learning. In this research paper, we
present the FSL-BM algorithm as an efficient solution for supervised learning
with fuzzy logic processing, using a binary meta-feature representation with
Hamming distance and a hash function to relax assumptions. While many studies
focused on reducing time complexity and increasing accuracy during the last
decade, the novel contribution of this proposed solution comes through the
integration of Hamming distance, a hash function, binary meta-features, and
binary classification to provide a real-time supervised method. The Hash Table
(HT) component gives fast access to existing indices and, therefore, generation
of new indices in constant time, which supersedes existing fuzzy supervised
algorithms with better or comparable results. To summarize, the main
contribution of this technique for real-time Fuzzy Supervised Learning is to
represent hypotheses through binary input as a meta-feature space and to create
the Fuzzy Supervised Hash table to train and validate the model.
Comment: FICC201
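The combination of binary meta-features, a hash table, and Hamming distance can be pictured with a small sketch. It is not the FSL-BM algorithm itself: the thresholding rule in `binarize`, the class name, and the use of normalized label counts as fuzzy degrees are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def binarize(x, thresholds):
    """Hypothetical meta-feature: one bit per attribute, set when it exceeds its threshold."""
    return tuple(int(v > t) for v, t in zip(x, thresholds))


def hamming(a, b):
    """Number of positions at which two binary codes differ."""
    return sum(u != v for u, v in zip(a, b))


class BinaryHashClassifier:
    def __init__(self, thresholds):
        self.thresholds = thresholds
        self.table = defaultdict(Counter)  # binary code -> label counts

    def fit(self, X, y):
        for x, label in zip(X, y):
            self.table[binarize(x, self.thresholds)][label] += 1
        return self

    def predict_proba(self, x):
        code = binarize(x, self.thresholds)
        if code not in self.table:
            # Unseen code: fall back to the stored code at minimum Hamming
            # distance (a linear scan in this sketch).
            code = min(self.table, key=lambda c: hamming(c, code))
        counts = self.table[code]
        total = sum(counts.values())
        # Normalized label counts act as fuzzy membership degrees.
        return {label: n / total for label, n in counts.items()}


# Toy data: three attributes, all thresholded at 0.5 (made-up values).
X = [(0.1, 0.2, 0.3), (0.2, 0.1, 0.4), (0.9, 0.8, 0.7), (0.8, 0.9, 0.6), (0.9, 0.7, 0.8)]
y = ["low", "low", "high", "high", "low"]
model = BinaryHashClassifier(thresholds=(0.5, 0.5, 0.5)).fit(X, y)
seen = model.predict_proba((0.7, 0.9, 0.6))    # code (1, 1, 1) is in the table
unseen = model.predict_proba((0.9, 0.1, 0.1))  # code (1, 0, 0) is not
```

An exact hit costs one hash lookup on average, which matches the constant-time access the abstract attributes to the hash table component. Only the fallback for unseen codes scans the stored keys.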
A Review of Fault Diagnosing Methods in Power Transmission Systems
Transient stability is important in power systems. Disturbances such as faults need to be segregated to restore transient stability. This paper presents a comprehensive review of fault diagnosing methods in power transmission systems. Typically, voltage and current samples are used for analysis. Three tasks (fault detection, classification, and location) are presented separately to convey a more logical and comprehensive understanding of the concepts. Feature extraction and transformation with dimensionality reduction methods are discussed. Fault classification and location techniques largely use artificial intelligence (AI) and signal processing methods. After the discussion of overall methods and concepts, advancements and future aspects are discussed. Generalized strengths and weaknesses of different AI and machine learning-based algorithms are assessed. A comparison of different fault detection, classification, and location methods is also presented, considering features, inputs, complexity, system used, and results. This paper may serve as a guideline for researchers to understand different methods and techniques in this field.