520 research outputs found

    PSO-based method for SVM classification on skewed data-sets

    Support Vector Machines (SVM) have shown excellent generalization power in classification problems. However, on skewed data-sets, SVM learns a biased model that degrades classifier performance, and the damage grows as the imbalance ratio increases. In this paper, a new external balancing method for applying SVM to skewed data-sets is developed. In the first phase of the method, the separating hyperplane is computed. The support vectors are then used to generate the initial population of a PSO algorithm, which improves the population of artificial instances and eliminates noisy instances. Experimental results demonstrate the ability of the proposed method to improve the performance of SVM on imbalanced data-sets. (Proyecto UAEM 3771/2014/CI)
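
The PSO step described above can be sketched generically. Below is a minimal, hypothetical PSO minimiser (standard inertia/cognitive/social update, not the paper's exact algorithm); in the paper's setting a particle would encode candidate synthetic instances, and the fitness would score how well they rebalance the SVM's separating hyperplane.

```python
import random

def pso(fitness, dim, n_particles=20, iters=50, seed=0):
    """Minimal PSO minimiser (illustrative sketch, not the paper's algorithm)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # per-particle best position
    pbest_f = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]       # global best
    w, c1, c2 = 0.7, 1.5, 1.5                      # inertia, cognitive, social
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            f = fitness(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
    return gbest, gbest_f
```

For the paper's use case, `fitness` would be replaced by a measure of instance quality near the decision boundary rather than the toy objective used here.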

    An Oversampling Mechanism for Multimajority Datasets using SMOTE and Darwinian Particle Swarm Optimisation

    Data skewness continues to be one of the leading factors that adversely impact the performance of machine learning algorithms. One approach to reducing this negative effect is to pre-process the original dataset with data-level resampling strategies, which come in two forms: oversampling and undersampling. This article proposes an oversampling strategy for tackling multiclass imbalanced datasets. The approach optimises the state-of-the-art oversampling technique SMOTE with Darwinian Particle Swarm Optimisation. The proposed method, DOSMOTE, generates optimised synthetic samples for balancing the datasets, and is most effective on multimajority datasets. An experimental study on selected multimajority datasets measures the effectiveness of the proposed approach. The results show that the proposed method compares favourably with conventional oversampling strategies.
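
SMOTE's core interpolation step, which DOSMOTE then optimises, can be sketched as follows. This is a hedged, stdlib-only sketch (the helper name is illustrative): each synthetic point is placed on the segment between a minority sample and one of its k nearest minority neighbours.

```python
import random

def smote_like(minority, n_new, k=3, seed=0):
    """Illustrative SMOTE-style interpolation over a list of feature vectors."""
    rng = random.Random(seed)

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x (excluding x itself)
        neigh = sorted((m for m in minority if m is not x),
                       key=lambda m: sqdist(x, m))[:k]
        nb = rng.choice(neigh)
        lam = rng.random()                      # interpolation factor in [0, 1)
        synthetic.append([xi + lam * (ni - xi) for xi, ni in zip(x, nb)])
    return synthetic
```

DOSMOTE, as described in the abstract, would additionally tune where these synthetic points land using Darwinian PSO rather than accepting the random interpolation above.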

    Imbalanced data classification using support vector machine based on simulated annealing for enhancing penalty parameter

    For pattern classification and regression problems, the support vector machine (SVM) is a well-known and computationally powerful machine learning method. It has been applied effectively to concrete problems across a wide range of domains. SVM has a key parameter, the penalty factor C, whose choice has a substantial impact on classification precision: unsuitable parameter settings can produce substandard classification outcomes. The penalty factor C is required to achieve an adequate trade-off between classification errors and generalisation performance. Hence, building an SVM model with good performance requires parameter optimisation. Here, the simulated annealing (SA) algorithm is employed to formulate a hybrid method for tuning SVM parameters, with the intent of improving system efficacy by obtaining the optimal penalty parameter while balancing classification performance. Experiments with many UCI datasets indicate that the recommended technique attains enhanced classification precision.
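
The SA search over the penalty factor can be sketched as follows, assuming a caller-supplied `val_error` function that returns cross-validation error for a candidate log10(C). The neighbourhood move, cooling schedule, and bounds here are illustrative assumptions, not the paper's exact scheme.

```python
import math
import random

def sa_tune_logC(val_error, lo=-3.0, hi=3.0, iters=200, seed=0):
    """Simulated annealing over log10(C); returns (best C, best error)."""
    rng = random.Random(seed)
    x = rng.uniform(lo, hi)                     # current log10(C)
    fx = val_error(x)
    best, best_f = x, fx
    T = 1.0                                     # initial temperature
    for _ in range(iters):
        cand = min(hi, max(lo, x + rng.gauss(0, 0.5)))   # neighbour move
        fc = val_error(cand)
        # accept improvements, or worse moves with Boltzmann probability
        if fc < fx or rng.random() < math.exp(-(fc - fx) / T):
            x, fx = cand, fc
            if fx < best_f:
                best, best_f = x, fx
        T *= 0.98                               # geometric cooling
    return 10 ** best, best_f
```

In real use, `val_error` would train an SVM with the candidate C and return its cross-validated error; the quadratic toy objective in the test below only demonstrates the search mechanics.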

    Evolving interval-based representation for multiple classifier fusion.

    Designing an ensemble of classifiers is one of the popular research topics in machine learning, since an ensemble can give better results than any constituent member. Furthermore, the performance of an ensemble can be improved using selection or adaptation. In the former, the optimal set of base classifiers, meta-classifier, original features, or meta-data is selected to obtain a better ensemble than using all the classifiers and features. In the latter, the base classifiers or the combining algorithms working on the outputs of the base classifiers are made to adapt to a particular problem; adaptation here means that the parameters of these algorithms are trained to be optimal for each problem. In this study, we propose a novel evolving combining algorithm using the adaptation approach for ensemble systems. Instead of using a numerical value when computing the representation for each class, we propose an interval-based representation for the class. The optimal values of the representation are found through Particle Swarm Optimization. During classification, a test instance is assigned to the class whose interval-based representation is closest to the base classifiers' predictions. Experiments conducted on a number of popular datasets confirm that the proposed method is better than the well-known ensemble systems using Decision Template and Sum Rule as combiners, L2-loss Linear Support Vector Machine, Multiple Layer Neural Network, and the ensemble selection methods based on GA-Meta-data, META-DES, and ACO.
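
The interval-based matching idea can be illustrated as follows: each class is represented by one interval per base-classifier output, and a test instance is assigned to the class whose intervals are closest to the predictions (zero distance inside an interval, otherwise the gap to the nearest endpoint). The function names and the specific distance definition are assumptions for illustration; in the paper, the interval endpoints themselves are learned by PSO.

```python
def interval_distance(probs, intervals):
    """Distance from a vector of base-classifier outputs to a class's
    interval representation: 0 inside each interval, else gap to endpoint."""
    d = 0.0
    for p, (lo, hi) in zip(probs, intervals):
        if p < lo:
            d += lo - p
        elif p > hi:
            d += p - hi
    return d

def classify(probs, class_intervals):
    """Assign the class whose interval representation is closest."""
    return min(class_intervals,
               key=lambda c: interval_distance(probs, class_intervals[c]))
```

A worked example: with class "a" represented by intervals [(0.7, 1.0), (0.0, 0.3)] over two classifier outputs, a prediction vector (0.9, 0.1) lies inside both intervals and so matches "a" at distance zero.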

    A Generalized Wine Quality Prediction Framework by Evolutionary Algorithms

    Wine is an exciting and complex product with distinctive qualities that make it different from other manufactured products, so the testing approach used to determine wine quality is complex and diverse. Several elements influence wine quality, but the views of experts exert the greatest influence on how people perceive it. Those views are highly subjective, may not match the taste of consumers, and the experts may not always be available for wine testing. To overcome this, many machine learning approaches have been proposed and have drawn the attention of the wine industry; however, they focus on a particular classifier applied to a specific wine dataset. In this paper, we first propose a generalized wine quality prediction framework that provides a mechanism for finding a useful hybrid model for wine quality prediction. Second, based on this framework, we propose a generalized wine quality prediction algorithm using genetic algorithms. It encodes the classifiers and their hyperparameters into a chromosome, evaluates the fitness of a chromosome by the average accuracy of the employed classifiers, and performs genetic operations to generate new offspring. The evolution process continues until the stopping criteria are reached. As a result, the proposed approach can automatically find an appropriate hybrid set of classifiers and their hyperparameters that optimizes the prediction result, independent of the dataset. Finally, experiments on wine datasets demonstrate the merits and effectiveness of the proposed approach.
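
The encode-evaluate-evolve loop described above can be sketched with a toy fitness. The classifier names and the stubbed accuracy table below are placeholders (a real run would train each classifier with cross-validation on the wine data), and the crossover/mutation operators are generic choices, not the paper's.

```python
import random

# Illustrative classifier ids; the paper's actual pool is not reproduced here.
CLASSIFIERS = ["svm", "rf", "knn"]

def fitness(chrom):
    """Stand-in for cross-validated accuracy of the (classifier, hyperparam)
    pair encoded in the chromosome; "rf" with hyperparam 0.5 is the optimum."""
    clf, hp = chrom
    return {"svm": 0.80, "rf": 0.85, "knn": 0.75}[clf] - 0.1 * abs(hp - 0.5)

def evolve(pop_size=20, gens=30, seed=0):
    rng = random.Random(seed)
    # chromosome = (classifier id, hyperparameter in [0, 1])
    pop = [(rng.choice(CLASSIFIERS), rng.random()) for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]            # keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # crossover: inherit a parent's classifier, average hyperparams
            child = (rng.choice([a[0], b[0]]), (a[1] + b[1]) / 2)
            if rng.random() < 0.2:              # mutation: fresh random gene
                child = (rng.choice(CLASSIFIERS), rng.random())
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)
```

Swapping the stub `fitness` for real cross-validated accuracy turns this toy loop into the kind of hybrid-model search the abstract describes.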