    On the design of an ECOC-compliant genetic algorithm

    Genetic Algorithms (GA) have previously been applied to Error-Correcting Output Codes (ECOC) in state-of-the-art works in order to find a suitable coding matrix. Nevertheless, none of the presented techniques directly takes into account the properties of the ECOC matrix. As a result, the considered search space is unnecessarily large. In this paper, a novel genetic strategy to optimize the ECOC coding step is presented. This strategy redefines the usual crossover and mutation operators to take into account the theoretical properties of the ECOC framework. Thus, it reduces the search space and lets the algorithm converge faster. In addition, a novel operator that is able to enlarge the code in a smart way is introduced. The methodology is tested on several UCI datasets and four challenging computer vision problems. Furthermore, the analysis of the results in terms of performance, code length and number of Support Vectors (SVs) shows that the optimization process is able to find very efficient codes, in terms of the trade-off between classification performance and the number of classifiers. Finally, the classification performance per dichotomizer shows that the proposal obtains similar or even better results while defining a more compact set of dichotomies and SVs compared to state-of-the-art approaches.
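For readers unfamiliar with ECOC, the following is a minimal sketch of what a coding matrix is and how it is decoded. It is not the paper's genetic algorithm: the 4-class exhaustive code, the +1/-1 encoding, and the names `CODING_MATRIX`, `decode`, and `min_row_distance` are assumptions chosen for illustration. Each row is a class codeword, each column is one binary classifier (dichotomizer), and a prediction is the class whose codeword is closest in Hamming distance to the dichotomizer outputs.

```python
from itertools import combinations
from typing import List, Sequence

# Hypothetical exhaustive code for 4 classes: one row (codeword) per class,
# one column per dichotomizer. No column is constant, and no two columns are
# equal or complementary, which are the usual validity properties of an
# ECOC matrix.
CODING_MATRIX: List[List[int]] = [
    [+1, +1, +1, +1, +1, +1, +1],  # class 0
    [-1, -1, -1, -1, +1, +1, +1],  # class 1
    [-1, -1, +1, +1, -1, -1, +1],  # class 2
    [-1, +1, -1, +1, -1, +1, -1],  # class 3
]


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions where two codewords disagree."""
    return sum(1 for x, y in zip(a, b) if x != y)


def decode(outputs: Sequence[int], matrix: List[List[int]] = CODING_MATRIX) -> int:
    """Return the index of the class whose codeword is nearest to `outputs`."""
    distances = [hamming_distance(outputs, row) for row in matrix]
    return distances.index(min(distances))


def min_row_distance(matrix: List[List[int]]) -> int:
    """Smallest Hamming distance between any two class codewords."""
    return min(hamming_distance(a, b) for a, b in combinations(matrix, 2))
```

A code with minimum row distance d can correct up to floor((d - 1) / 2) dichotomizer errors, which is why the number of columns (classifiers) trades off against robustness.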

    Optimization of distributions differences for classification

    In this paper we introduce a new classification algorithm called Optimization of Distributions Differences (ODD). The algorithm aims to find a transformation from the feature space to a new space where the instances in the same class are as close as possible to one another while the gravity centers of these classes are as far as possible from one another. This aim is formulated as a multiobjective optimization problem that is solved by a hybrid of an evolutionary strategy and the Quasi-Newton method. The choice of the transformation function is flexible and could be any continuous space function. We experiment with a linear and a non-linear transformation in this paper. We show that the algorithm can outperform 6 other state-of-the-art classification methods, namely naive Bayes, support vector machines, linear discriminant analysis, multi-layer perceptrons, decision trees, and k-nearest neighbors, on 12 standard classification datasets. Our results show that the method is less sensitive to an imbalanced number of instances compared to these methods. We also show that ODD maintains its performance better than other classification methods on these datasets and hence offers a better generalization ability.
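The two competing goals described in this abstract can be sketched as a pair of numbers computed for a candidate linear transformation. This is an illustrative reading of the abstract, not the paper's exact formulation: the function name `odd_objectives`, the use of Euclidean distance, and the averaging are all assumptions, and the optimizer (evolutionary strategy plus Quasi-Newton) is omitted.

```python
from itertools import combinations
from math import dist
from typing import Dict, Hashable, List, Sequence, Tuple

Vector = Sequence[float]


def transform(W: Sequence[Vector], x: Vector) -> List[float]:
    """Apply the linear map W (a list of rows) to the feature vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def centroid(points: Sequence[Vector]) -> List[float]:
    """Gravity center (coordinate-wise mean) of a set of points."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def odd_objectives(
    W: Sequence[Vector], X: Sequence[Vector], y: Sequence[Hashable]
) -> Tuple[float, float]:
    """Return (within, between) for the transformed data.

    within  : mean distance of each instance to its class center (minimize)
    between : mean distance between pairs of class centers (maximize)
    """
    by_class: Dict[Hashable, List[List[float]]] = {}
    for x, label in zip(X, y):
        by_class.setdefault(label, []).append(transform(W, x))
    centers = {label: centroid(pts) for label, pts in by_class.items()}
    within = sum(
        dist(p, centers[label]) for label, pts in by_class.items() for p in pts
    ) / len(X)
    pairs = list(combinations(centers.values(), 2))
    between = sum(dist(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
    return within, between
```

A multiobjective optimizer would search over `W` for transformations that lower the first value without lowering the second; for example, projecting away a direction that only carries within-class spread shrinks `within` while leaving `between` unchanged.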

    Unbalanced load flow with hybrid wavelet transform and support vector machine based Error-Correcting Output Codes for power quality disturbances classification including wind energy

    Purpose. The most common methods for designing a multiclass classifier consist of determining a set of binary classifiers and combining them. In this paper, a support vector machine with Error-Correcting Output Codes (ECOC-SVM) classifier is proposed to classify and characterize power quality disturbances such as harmonic distortion, voltage sag, and voltage swell, including those from wind farm generators in power transmission systems. First, a three-phase unbalanced load flow analysis is executed to calculate different electric network characteristics, voltage levels, and active and reactive power. Next, the discrete wavelet transform is combined with the probabilistic ECOC-SVM model to construct the classifier. Finally, the ECOC-SVM classifies and identifies the disturbance type according to the energy deviation of the discrete wavelet transform. The proposed method achieves a satisfactory accuracy of 99.2% compared with well-known methods and shows that each power quality disturbance has specific deviations from the pure sinusoidal waveform, which helps in recognizing and specifying the type of disturbance generated by the wind power generator.
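The "energy deviation of the discrete wavelet transform" mentioned in this abstract can be illustrated with a small sketch. The abstract does not name the wavelet, the number of decomposition levels, or the signal parameters, so the Haar wavelet, three levels, the 256-sample test signals, and the names `haar_band_energies` and `energy_deviation` are all assumptions. The idea shown is only that a disturbed waveform has a different per-band energy profile than a pure sinusoid, and that difference can serve as a feature vector for a classifier.

```python
import math
from typing import List, Sequence


def haar_band_energies(signal: Sequence[float], levels: int) -> List[float]:
    """Energy of the Haar detail coefficients at each level, followed by the
    energy of the final approximation. len(signal) must be divisible by
    2 ** levels."""
    approx = list(signal)
    energies = []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return energies


def energy_deviation(
    signal: Sequence[float], reference: Sequence[float], levels: int
) -> List[float]:
    """Per-band energy of `signal` minus per-band energy of `reference`."""
    sig = haar_band_energies(signal, levels)
    ref = haar_band_energies(reference, levels)
    return [s - r for s, r in zip(sig, ref)]


# Hypothetical test signals: four cycles of a unit sinusoid, and the same
# waveform with a voltage sag (amplitude halved) in the middle.
N = 256
pure = [math.sin(2 * math.pi * 4 * i / N) for i in range(N)]
sag = [0.5 * v if 96 <= i < 160 else v for i, v in enumerate(pure)]
features = energy_deviation(sag, pure, levels=3)
```

Because the Haar transform is orthonormal, the band energies sum to the total signal energy, so `features` sums to the energy lost during the sag. In a pipeline like the one described, such a vector would be passed to the ECOC-SVM classifier.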

    Soft Methodology for Cost-and-error Sensitive Classification

    Many real-world data mining applications need varying costs for different types of classification errors and thus call for cost-sensitive classification algorithms. Existing algorithms for cost-sensitive classification are successful in terms of minimizing the cost, but can result in a high error rate as the trade-off. The high error rate holds back the practical use of those algorithms. In this paper, we propose a novel cost-sensitive classification methodology that takes both the cost and the error rate into account. The methodology, called soft cost-sensitive classification, is established from a multicriteria optimization problem of the cost and the error rate, and can be viewed as regularizing cost-sensitive classification with the error rate. The simple methodology allows immediate improvements of existing cost-sensitive classification algorithms. Experiments on benchmark and real-world data sets show that our proposed methodology indeed achieves lower test error rates and similar (sometimes lower) test costs than existing cost-sensitive classification algorithms. We also demonstrate that the methodology can be extended to consider the weighted error rate instead of the original error rate. This extension is useful for tackling unbalanced classification problems. Comment: A shorter version appeared in KDD '1

    Elephant Search with Deep Learning for Microarray Data Analysis

    Even though there is a plethora of research on microarray gene expression data analysis, it still poses challenges for researchers to effectively and efficiently analyze the large yet complex expression of genes. The feature (gene) selection method is of paramount importance for understanding the differences in biological and non-biological variation between samples. To address this problem, a novel elephant search (ES) based optimization is proposed to select the best gene expressions from the large volume of microarray data. Further, a promising machine learning method is envisioned to leverage such high-dimensional and complex microarray datasets for extracting hidden patterns in order to make meaningful predictions and the most accurate classification. In particular, stochastic gradient descent based deep learning (DL) with a softmax activation function is then used on the reduced features (genes) for better classification of different samples according to their gene expression levels. The experiments are carried out on nine of the most popular cancer microarray gene selection datasets, obtained from the UCI machine learning repository. The empirical results obtained by the proposed elephant search based deep learning (ESDL) approach are compared with the most recently published article for its suitability in future bioinformatics research. Comment: 12 pages, 5 Tabl