520 research outputs found

    PSO-based method for SVM classification on skewed data-sets

    Support Vector Machines (SVM) have shown excellent generalization power in classification problems. However, on skewed data-sets, SVM learns a biased model that degrades classifier performance, and the damage grows as the imbalance ratio increases. In this paper, a new external balancing method for applying SVM to skewed data-sets is developed. In the first phase of the method, the separating hyperplane is computed. The support vectors are then used to generate the initial population of a PSO algorithm, which improves the population of artificial instances and eliminates noisy instances. Experimental results demonstrate the ability of the proposed method to improve the performance of SVM on imbalanced data-sets. (Proyecto UAEM 3771/2014/CI)
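
The PSO step described above can be sketched generically. Below is a minimal, hypothetical PSO minimiser (standard inertia/cognitive/social update, not the paper's exact algorithm); in the paper's setting a particle would encode candidate synthetic instances, and the fitness would score how well they rebalance the SVM's separating hyperplane.

```python
import random

def pso(fitness, dim, n_particles=20, iters=50, seed=0):
    """Minimal PSO minimiser (illustrative sketch, not the paper's algorithm)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # per-particle best position
    pbest_f = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]       # global best
    w, c1, c2 = 0.7, 1.5, 1.5                      # inertia, cognitive, social
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            f = fitness(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
    return gbest, gbest_f
```

For the paper's use case, `fitness` would be replaced by a measure of instance quality near the decision boundary rather than the toy objective used here.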

    An Oversampling Mechanism for Multimajority Datasets using SMOTE and Darwinian Particle Swarm Optimisation

    Data skewness continues to be one of the leading factors that adversely impact the performance of machine learning algorithms. One approach to reducing this negative effect is to pre-process the original dataset with data-level resampling strategies, which come in two forms: oversampling and undersampling. This article proposes an oversampling strategy for tackling multiclass imbalanced datasets. The approach optimises the state-of-the-art oversampling technique SMOTE with Darwinian Particle Swarm Optimisation. The proposed method, DOSMOTE, generates optimised synthetic samples for balancing the datasets, and is most effective on multimajority datasets. An experimental study on selected multimajority datasets measures the effectiveness of the proposed approach. The results show that the proposed method compares favourably with conventional oversampling strategies.
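
SMOTE's core interpolation step, which DOSMOTE then optimises, can be sketched as follows. This is a hedged, stdlib-only sketch (the helper name is illustrative): each synthetic point is placed on the segment between a minority sample and one of its k nearest minority neighbours.

```python
import random

def smote_like(minority, n_new, k=3, seed=0):
    """Illustrative SMOTE-style interpolation over a list of feature vectors."""
    rng = random.Random(seed)

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x (excluding x itself)
        neigh = sorted((m for m in minority if m is not x),
                       key=lambda m: sqdist(x, m))[:k]
        nb = rng.choice(neigh)
        lam = rng.random()                      # interpolation factor in [0, 1)
        synthetic.append([xi + lam * (ni - xi) for xi, ni in zip(x, nb)])
    return synthetic
```

DOSMOTE, as described in the abstract, would additionally tune where these synthetic points land using Darwinian PSO rather than accepting the random interpolation above.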

    Imbalanced data classification using support vector machine based on simulated annealing for enhancing penalty parameter

    For pattern classification and regression problems, the support vector machine (SVM) is a well-known and computationally powerful machine learning method. It has been applied effectively to concrete problems across a wide range of domains. SVM has a key parameter, the penalty factor C, whose choice has a substantial impact on classification precision: unsuitable parameter settings can produce substandard classification outcomes. The penalty factor C is required to achieve an adequate trade-off between classification errors and generalisation performance. Hence, building an SVM model with good performance requires parameter optimisation. Here, the simulated annealing (SA) algorithm is employed to formulate a hybrid method for tuning SVM parameters, with the intent of improving system efficacy by obtaining the optimal penalty parameter while balancing classification performance. Experiments with many UCI datasets indicate that the recommended technique attains enhanced classification precision.
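
The SA search over the penalty factor can be sketched as follows, assuming a caller-supplied `val_error` function that returns cross-validation error for a candidate log10(C). The neighbourhood move, cooling schedule, and bounds here are illustrative assumptions, not the paper's exact scheme.

```python
import math
import random

def sa_tune_logC(val_error, lo=-3.0, hi=3.0, iters=200, seed=0):
    """Simulated annealing over log10(C); returns (best C, best error)."""
    rng = random.Random(seed)
    x = rng.uniform(lo, hi)                     # current log10(C)
    fx = val_error(x)
    best, best_f = x, fx
    T = 1.0                                     # initial temperature
    for _ in range(iters):
        cand = min(hi, max(lo, x + rng.gauss(0, 0.5)))   # neighbour move
        fc = val_error(cand)
        # accept improvements, or worse moves with Boltzmann probability
        if fc < fx or rng.random() < math.exp(-(fc - fx) / T):
            x, fx = cand, fc
            if fx < best_f:
                best, best_f = x, fx
        T *= 0.98                               # geometric cooling
    return 10 ** best, best_f
```

In real use, `val_error` would train an SVM with the candidate C and return its cross-validated error; the quadratic toy objective in the test below only demonstrates the search mechanics.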

    Evolving interval-based representation for multiple classifier fusion.

    Designing an ensemble of classifiers is one of the popular research topics in machine learning, since an ensemble can give better results than any constituent member. Furthermore, the performance of an ensemble can be improved using selection or adaptation. In the former, the optimal set of base classifiers, meta-classifier, original features, or meta-data is selected to obtain a better ensemble than using all the classifiers and features. In the latter, the base classifiers or the combining algorithms working on the outputs of the base classifiers are made to adapt to a particular problem; adaptation here means that the parameters of these algorithms are trained to be optimal for each problem. In this study, we propose a novel evolving combining algorithm using the adaptation approach for ensemble systems. Instead of using a numerical value when computing the representation for each class, we propose an interval-based representation for the class. The optimal values of the representation are found through Particle Swarm Optimization. During classification, a test instance is assigned to the class whose interval-based representation is closest to the base classifiers' predictions. Experiments conducted on a number of popular datasets confirm that the proposed method is better than the well-known ensemble systems using Decision Template and Sum Rule as combiners, L2-loss Linear Support Vector Machine, Multiple Layer Neural Network, and the ensemble selection methods based on GA-Meta-data, META-DES, and ACO.
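
The interval-based matching idea can be illustrated as follows: each class is represented by one interval per base-classifier output, and a test instance is assigned to the class whose intervals are closest to the predictions (zero distance inside an interval, otherwise the gap to the nearest endpoint). The function names and the specific distance definition are assumptions for illustration; in the paper, the interval endpoints themselves are learned by PSO.

```python
def interval_distance(probs, intervals):
    """Distance from a vector of base-classifier outputs to a class's
    interval representation: 0 inside each interval, else gap to endpoint."""
    d = 0.0
    for p, (lo, hi) in zip(probs, intervals):
        if p < lo:
            d += lo - p
        elif p > hi:
            d += p - hi
    return d

def classify(probs, class_intervals):
    """Assign the class whose interval representation is closest."""
    return min(class_intervals,
               key=lambda c: interval_distance(probs, class_intervals[c]))
```

A worked example: with class "a" represented by intervals [(0.7, 1.0), (0.0, 0.3)] over two classifier outputs, a prediction vector (0.9, 0.1) lies inside both intervals and so matches "a" at distance zero.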

    A Generalized Wine Quality Prediction Framework by Evolutionary Algorithms

    Wine is an exciting and complex product with distinctive qualities that make it different from other manufactured products, so the testing approach used to determine wine quality is complex and diverse. Several elements influence wine quality, but the views of experts exert the greatest influence on how people perceive it. Those views are highly subjective, may not match the taste of consumers, and the experts may not always be available for wine testing. To overcome this, many machine learning approaches have been proposed and have drawn the attention of the wine industry; however, they focus on a particular classifier applied to a specific wine dataset. In this paper, we first propose a generalized wine quality prediction framework that provides a mechanism for finding a useful hybrid model for wine quality prediction. Second, based on this framework, we propose a generalized wine quality prediction algorithm using genetic algorithms. It encodes the classifiers and their hyperparameters into a chromosome, evaluates the fitness of a chromosome by the average accuracy of the employed classifiers, and performs genetic operations to generate new offspring. The evolution process continues until the stopping criteria are reached. As a result, the proposed approach can automatically find an appropriate hybrid set of classifiers and their hyperparameters that optimizes the prediction result, independent of the dataset. Finally, experiments on wine datasets demonstrate the merits and effectiveness of the proposed approach.
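
The encode-evaluate-evolve loop described above can be sketched with a toy fitness. The classifier names and the stubbed accuracy table below are placeholders (a real run would train each classifier with cross-validation on the wine data), and the crossover/mutation operators are generic choices, not the paper's.

```python
import random

# Illustrative classifier ids; the paper's actual pool is not reproduced here.
CLASSIFIERS = ["svm", "rf", "knn"]

def fitness(chrom):
    """Stand-in for cross-validated accuracy of the (classifier, hyperparam)
    pair encoded in the chromosome; "rf" with hyperparam 0.5 is the optimum."""
    clf, hp = chrom
    return {"svm": 0.80, "rf": 0.85, "knn": 0.75}[clf] - 0.1 * abs(hp - 0.5)

def evolve(pop_size=20, gens=30, seed=0):
    rng = random.Random(seed)
    # chromosome = (classifier id, hyperparameter in [0, 1])
    pop = [(rng.choice(CLASSIFIERS), rng.random()) for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]            # keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # crossover: inherit a parent's classifier, average hyperparams
            child = (rng.choice([a[0], b[0]]), (a[1] + b[1]) / 2)
            if rng.random() < 0.2:              # mutation: fresh random gene
                child = (rng.choice(CLASSIFIERS), rng.random())
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)
```

Swapping the stub `fitness` for real cross-validated accuracy turns this toy loop into the kind of hybrid-model search the abstract describes.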