2,572 research outputs found
On the design of an ECOC-compliant genetic algorithm
Genetic Algorithms (GA) have been previously applied to Error-Correcting Output Codes (ECOC) in state-of-the-art works in order to find a suitable coding matrix. Nevertheless, none of the presented techniques directly take into account the properties of the ECOC matrix. As a result the considered search space is unnecessarily large. In this paper, a novel Genetic strategy to optimize the ECOC coding step is presented. This novel strategy redefines the usual crossover and mutation operators in order to take into account the theoretical properties of the ECOC framework. Thus, it reduces the search space and lets the algorithm to converge faster. In addition, a novel operator that is able to enlarge the code in a smart way is introduced. The novel methodology is tested on several UCI datasets and four challenging computer vision problems. Furthermore, the analysis of the results done in terms of performance, code length and number of Support Vectors shows that the optimization process is able to find very efficient codes, in terms of the trade-off between classification performance and the number of classifiers. Finally, classification performance per dichotomizer results shows that the novel proposal is able to obtain similar or even better results while defining a more compact number of dichotomies and SVs compared to state-of-the-art approaches
Optimization of distributions differences for classification
In this paper we introduce a new classification algorithm called Optimization
of Distributions Differences (ODD). The algorithm aims to find a transformation
from the feature space to a new space where the instances in the same class are
as close as possible to one another while the gravity centers of these classes
are as far as possible from one another. This aim is formulated as a
multiobjective optimization problem that is solved by a hybrid of an
evolutionary strategy and the Quasi-Newton method. The choice of the
transformation function is flexible and could be any continuous space function.
We experiment with a linear and a non-linear transformation in this paper. We
show that the algorithm can outperform 6 other state-of-the-art classification
methods, namely naive Bayes, support vector machines, linear discriminant
analysis, multi-layer perceptrons, decision trees, and k-nearest neighbors, in
12 standard classification datasets. Our results show that the method is less
sensitive to the imbalanced number of instances comparing to these methods. We
also show that ODD maintains its performance better than other classification
methods in these datasets, hence, offers a better generalization ability
Unbalanced load flow with hybrid wavelet transform and support vector machine based Error-Correcting Output Codes for power quality disturbances classification including wind energy
Purpose. The most common methods to designa multiclass classification consist to determine a set of binary classifiers and to combine them. In this paper support vector machine with Error-Correcting Output Codes (ECOC-SVM) classifier is proposed to classify and characterize the power qualitydisturbances such as harmonic distortion,voltage sag, and voltage swell include wind farms generator in power transmission systems. Firstly three phases unbalanced load flow analysis is executed to calculate difference electric network characteristics, levels of voltage, active and reactive power. After, discrete wavelet transform is combined with the probabilistic ECOC-SVM model to construct the classifier. Finally, the ECOC-SVM classifies and identifies the disturbance type according tothe energy deviation of the discrete wavelet transform. The proposedmethod gives satisfactory accuracy with 99.2% compared with well known methods and shows that each power quality disturbances has specific deviations from the pure sinusoidal waveform,this is good at recognizing and specifies the type of disturbance generated from the wind
power generator.ΠΠ°ΠΈΠ±ΠΎΠ»Π΅Π΅ ΡΠ°ΡΠΏΡΠΎΡΡΡΠ°Π½Π΅Π½Π½ΡΠ΅ ΠΌΠ΅ΡΠΎΠ΄Ρ ΠΏΠΎΡΡΡΠΎΠ΅Π½ΠΈΡ ΠΌΡΠ»ΡΡΠΈΠΊΠ»Π°ΡΡΠΎΠ²ΠΎΠΉ ΠΊΠ»Π°ΡΡΠΈΡΠΈΠΊΠ°ΡΠΈΠΈ Π·Π°ΠΊΠ»ΡΡΠ°ΡΡΡΡ Π² ΠΎΠΏΡΠ΅Π΄Π΅Π»Π΅Π½ΠΈΠΈ Π½Π°Π±ΠΎΡΠ° Π΄Π²ΠΎΠΈΡΠ½ΡΡ
ΠΊΠ»Π°ΡΡΠΈΡΠΈΠΊΠ°ΡΠΎΡΠΎΠ² ΠΈ ΠΈΡ
ΠΎΠ±ΡΠ΅Π΄ΠΈΠ½Π΅Π½ΠΈΠΈ. Π Π΄Π°Π½Π½ΠΎΠΉ ΡΡΠ°ΡΡΠ΅ ΠΏΡΠ΅Π΄Π»ΠΎΠΆΠ΅Π½Π° ΠΌΠ°ΡΠΈΠ½Π° ΠΎΠΏΠΎΡΠ½ΡΡ
Π²Π΅ΠΊΡΠΎΡΠΎΠ² Ρ ΠΊΠ»Π°ΡΡΠΈΡΠΈΠΊΠ°ΡΠΎΡΠΎΠΌ Π²ΡΡ
ΠΎΠ΄Π½ΡΡ
ΠΊΠΎΠ΄ΠΎΠ² ΠΈΡΠΏΡΠ°Π²Π»Π΅Π½ΠΈΡ ΠΎΡΠΈΠ±ΠΎΠΊ(ECOC-SVM) Ρ ΡΠ΅Π»ΡΡ ΠΊΠ»Π°ΡΡΠΈΡΠΈΡΠΈΡΠΎΠ²Π°ΡΡ ΠΈ Ρ
Π°ΡΠ°ΠΊΡΠ΅ΡΠΈΠ·ΠΎΠ²Π°ΡΡ ΡΠ°ΠΊΠΈΠ΅ Π½Π°ΡΡΡΠ΅Π½ΠΈΡ ΠΊΠ°ΡΠ΅ΡΡΠ²Π° ΡΠ»Π΅ΠΊΡΡΠΎΡΠ½Π΅ΡΠ³ΠΈΠΈ, ΠΊΠ°ΠΊ Π³Π°ΡΠΌΠΎΠ½ΠΈΡΠ΅ΡΠΊΠΈΠ΅ ΠΈΡΠΊΠ°ΠΆΠ΅Π½ΠΈΡ, ΠΏΠ°Π΄Π΅Π½ΠΈΠ΅ Π½Π°ΠΏΡΡΠΆΠ΅Π½ΠΈΡ ΠΈ ΡΠΊΠ°ΡΠΎΠΊ Π½Π°ΠΏΡΡΠΆΠ΅Π½ΠΈΡ, Π²ΠΊΠ»ΡΡΠ°Ρ Π³Π΅Π½Π΅ΡΠ°ΡΠΎΡ Π²Π΅ΡΡΠΎΠ²ΡΡ
ΡΠ»Π΅ΠΊΡΡΠΎΡΡΠ°Π½ΡΠΈΠΉ Π² ΡΠΈΡΡΠ΅ΠΌΠ°Ρ
ΠΏΠ΅ΡΠ΅Π΄Π°ΡΠΈ ΡΠ»Π΅ΠΊΡΡΠΎΡΠ½Π΅ΡΠ³ΠΈΠΈ. Π‘Π½Π°ΡΠ°Π»Π° Π²ΡΠΏΠΎΠ»Π½ΡΠ΅ΡΡΡ Π°Π½Π°Π»ΠΈΠ· ΠΏΠΎΡΠΎΠΊΠ° Π½Π΅ΡΠΈΠΌΠΌΠ΅ΡΡΠΈΡΠ½ΠΎΠΉ Π½Π°Π³ΡΡΠ·ΠΊΠΈ ΡΡΠ΅Ρ
ΡΠ°Π· Π΄Π»Ρ ΡΠ°ΡΡΠ΅ΡΠ° ΡΠ°Π·Π½ΠΎΡΡΠ½ΡΡ
Ρ
Π°ΡΠ°ΠΊΡΠ΅ΡΠΈΡΡΠΈΠΊ ΡΠ»Π΅ΠΊΡΡΠΈΡΠ΅ΡΠΊΠΎΠΉ ΡΠ΅ΡΠΈ, ΡΡΠΎΠ²Π½Π΅ΠΉ Π½Π°ΠΏΡΡΠΆΠ΅Π½ΠΈΡ, Π°ΠΊΡΠΈΠ²Π½ΠΎΠΉ ΠΈ ΡΠ΅Π°ΠΊΡΠΈΠ²Π½ΠΎΠΉ ΠΌΠΎΡΠ½ΠΎΡΡΠΈ. ΠΠΎΡΠ»Π΅ ΡΡΠΎΠ³ΠΎ Π΄ΠΈΡΠΊΡΠ΅ΡΠ½ΠΎΠ΅ Π²Π΅ΠΉΠ²Π»Π΅Ρ-ΠΏΡΠ΅ΠΎΠ±ΡΠ°Π·ΠΎΠ²Π°Π½ΠΈΠ΅ ΠΎΠ±ΡΠ΅Π΄ΠΈΠ½ΡΠ΅ΡΡΡ Ρ Π²Π΅ΡΠΎΡΡΠ½ΠΎΡΡΠ½ΠΎΠΉ ΠΌΠΎΠ΄Π΅Π»ΡΡ ECOC-SVM Π΄Π»Ρ ΠΏΠΎΡΡΡΠΎΠ΅Π½ΠΈΡ ΠΊΠ»Π°ΡΡΠΈΡΠΈΠΊΠ°ΡΠΎΡΠ°. ΠΠ°ΠΊΠΎΠ½Π΅Ρ, ECOC-SVM ΠΊΠ»Π°ΡΡΠΈΡΠΈΡΠΈΡΡΠ΅Ρ ΠΈ ΠΈΠ΄Π΅Π½ΡΠΈΡΠΈΡΠΈΡΡΠ΅Ρ ΡΠΈΠΏ Π²ΠΎΠ·ΠΌΡΡΠ΅Π½ΠΈΡ Π² ΡΠΎΠΎΡΠ²Π΅ΡΡΡΠ²ΠΈΠΈ Ρ ΠΎΡΠΊΠ»ΠΎΠ½Π΅Π½ΠΈΠ΅ΠΌ ΡΠ½Π΅ΡΠ³ΠΈΠΈ Π΄ΠΈΡΠΊΡΠ΅ΡΠ½ΠΎΠ³ΠΎ Π²Π΅ΠΉΠ²Π»Π΅Ρ-ΠΏΡΠ΅ΠΎΠ±ΡΠ°Π·ΠΎΠ²Π°Π½ΠΈΡ. ΠΡΠ΅Π΄Π»ΠΎΠΆΠ΅Π½Π½ΡΠΉ ΠΌΠ΅ΡΠΎΠ΄ Π΄Π°Π΅Ρ ΡΠ΄ΠΎΠ²Π»Π΅ΡΠ²ΠΎΡΠΈΡΠ΅Π»ΡΠ½ΡΡ ΡΠΎΡΠ½ΠΎΡΡΡ 99,2% ΠΏΠΎ ΡΡΠ°Π²Π½Π΅Π½ΠΈΡ Ρ Ρ
ΠΎΡΠΎΡΠΎ ΠΈΠ·Π²Π΅ΡΡΠ½ΡΠΌΠΈ ΠΌΠ΅ΡΠΎΠ΄Π°ΠΌΠΈ ΠΈ ΠΏΠΎΠΊΠ°Π·ΡΠ²Π°Π΅Ρ, ΡΡΠΎ ΠΊΠ°ΠΆΠ΄ΠΎΠ΅ Π½Π°ΡΡΡΠ΅Π½ΠΈΠ΅ ΠΊΠ°ΡΠ΅ΡΡΠ²Π° ΡΠ»Π΅ΠΊΡΡΠΎΡΠ½Π΅ΡΠ³ΠΈΠΈ ΠΈΠΌΠ΅Π΅Ρ ΠΎΠΏΡΠ΅Π΄Π΅Π»Π΅Π½Π½ΡΠ΅ ΠΎΡΠΊΠ»ΠΎΠ½Π΅Π½ΠΈΡ ΠΎΡ ΡΠΈΡΡΠΎ ΡΠΈΠ½ΡΡΠΎΠΈΠ΄Π°Π»ΡΠ½ΠΎΠΉ ΡΠΎΡΠΌΡ Π²ΠΎΠ»Π½Ρ, ΡΡΠΎ ΡΠΏΠΎΡΠΎΠ±ΡΡΠ²ΡΠ΅Ρ ΡΠ°ΡΠΏΠΎΠ·Π½Π°Π²Π°Π½ΠΈΡ ΠΈ ΠΎΠΏΡΠ΅Π΄Π΅Π»Π΅Π½ΠΈΡ ΡΠΈΠΏΠ° Π²ΠΎΠ·ΠΌΡΡΠ΅Π½ΠΈΡ, Π³Π΅Π½Π΅ΡΠΈΡΡΠ΅ΠΌΠΎΠ³ΠΎ Π²Π΅ΡΡΠΎΠ²ΡΠΌ Π³Π΅Π½Π΅ΡΠ°ΡΠΎΡΠΎΠΌ
Soft Methodology for Cost-and-error Sensitive Classification
Many real-world data mining applications need varying cost for different
types of classification errors and thus call for cost-sensitive classification
algorithms. Existing algorithms for cost-sensitive classification are
successful in terms of minimizing the cost, but can result in a high error rate
as the trade-off. The high error rate holds back the practical use of those
algorithms. In this paper, we propose a novel cost-sensitive classification
methodology that takes both the cost and the error rate into account. The
methodology, called soft cost-sensitive classification, is established from a
multicriteria optimization problem of the cost and the error rate, and can be
viewed as regularizing cost-sensitive classification with the error rate. The
simple methodology allows immediate improvements of existing cost-sensitive
classification algorithms. Experiments on the benchmark and the real-world data
sets show that our proposed methodology indeed achieves lower test error rates
and similar (sometimes lower) test costs than existing cost-sensitive
classification algorithms. We also demonstrate that the methodology can be
extended for considering the weighted error rate instead of the original error
rate. This extension is useful for tackling unbalanced classification problems.Comment: A shorter version appeared in KDD '1
Elephant Search with Deep Learning for Microarray Data Analysis
Even though there is a plethora of research in Microarray gene expression
data analysis, still, it poses challenges for researchers to effectively and
efficiently analyze the large yet complex expression of genes. The feature
(gene) selection method is of paramount importance for understanding the
differences in biological and non-biological variation between samples. In
order to address this problem, a novel elephant search (ES) based optimization
is proposed to select best gene expressions from the large volume of microarray
data. Further, a promising machine learning method is envisioned to leverage
such high dimensional and complex microarray dataset for extracting hidden
patterns inside to make a meaningful prediction and most accurate
classification. In particular, stochastic gradient descent based Deep learning
(DL) with softmax activation function is then used on the reduced features
(genes) for better classification of different samples according to their gene
expression levels. The experiments are carried out on nine most popular Cancer
microarray gene selection datasets, obtained from UCI machine learning
repository. The empirical results obtained by the proposed elephant search
based deep learning (ESDL) approach are compared with most recent published
article for its suitability in future Bioinformatics research.Comment: 12 pages, 5 Tabl
- β¦