7 research outputs found

GSA to Obtain SVM Kernel Parameter for Thyroid Nodule Classification

Author: Musdholifah Aina
Pramudita Dias Aziz
Publication venue: 'Universitas Gadjah Mada'
Publication date: 31/01/2020

Support Vector Machine (SVM) is one of the most popular methods for classification problems because it yields a globally optimal solution. However, selecting appropriate parameters and kernel values remains an obstacle. The problem can be addressed by searching for the best parameter values during the SVM optimization process. This research uses the Gravitational Search Algorithm (GSA), an optimization algorithm inspired by mass interaction and Newton's law of gravity, to optimize the SVM parameters, hybridizing GSA and SVM to increase system accuracy. The proposed approach was implemented to improve the classification performance for thyroid nodules. The data used in this research are ultrasonography images of thyroid nodules obtained from RSUP Dr. Sardjito, Yogyakarta. The approach was evaluated by comparing the accuracy obtained with the default SVM parameters against that of the proposed method. The experimental results showed that using GSA with SVM can increase system accuracy. In the polynomial kernel the accuracy rose from 58.5366 % to 89.4309 %, and from 41.4634 % to 98.374 % in Polynomial kerne
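The GSA search described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `gsa_minimize`, the constants `g0` and `alpha`, and the toy objective `toy_cv_error` (a stand-in for a cross-validated SVM error over log-scaled C and gamma) are all assumptions made for illustration.

```python
import math
import random

def toy_cv_error(params):
    # Hypothetical stand-in for a cross-validated SVM error surface over
    # (log2 C, log2 gamma); a real study would train and score an SVM here.
    log_c, log_gamma = params
    return (log_c - 3.0) ** 2 + (log_gamma + 7.0) ** 2

def gsa_minimize(objective, bounds, n_agents=20, n_iter=100,
                 g0=100.0, alpha=20.0, seed=0):
    """Basic Gravitational Search Algorithm for minimization (sketch)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_agents)]
    vel = [[0.0] * dim for _ in range(n_agents)]
    best_x, best_f = None, math.inf

    for t in range(n_iter):
        fit = [objective(p) for p in pos]
        for p, f in zip(pos, fit):
            if f < best_f:
                best_x, best_f = list(p), f

        # Lower error means heavier mass; masses are normalized to sum to 1.
        best, worst = min(fit), max(fit)
        if worst == best:
            mass = [1.0 / n_agents] * n_agents
        else:
            raw = [(worst - f) / (worst - best) for f in fit]
            total = sum(raw)
            mass = [m / total for m in raw]

        # The gravitational constant decays over time, and only the
        # k_best heaviest agents attract the others.
        g = g0 * math.exp(-alpha * t / n_iter)
        k_best = max(1, round(n_agents * (1 - t / n_iter)))
        heavy = sorted(range(n_agents), key=lambda i: fit[i])[:k_best]

        for i in range(n_agents):
            acc = [0.0] * dim
            for j in heavy:
                if j == i:
                    continue
                dist = math.dist(pos[i], pos[j])
                for d in range(dim):
                    pull = g * mass[j] * (pos[j][d] - pos[i][d]) / (dist + 1e-12)
                    acc[d] += rng.random() * pull
            for d, (lo, hi) in enumerate(bounds):
                vel[i][d] = rng.random() * vel[i][d] + acc[d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))

    # Score the final positions as well before returning the best seen.
    for p in pos:
        f = objective(p)
        if f < best_f:
            best_x, best_f = list(p), f
    return best_x, best_f

best_params, best_error = gsa_minimize(toy_cv_error, [(-5.0, 15.0), (-15.0, 3.0)])
```

To tune a real classifier, `toy_cv_error` would be replaced by a function that trains an SVM with the candidate parameters and returns its validation error; the search ranges above are assumed, not taken from the paper.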

IJCCS (Indonesian Journal of Computing and Cybernetics Systems)

Standardizing catch per unit effort by machine learning techniques in longline fisheries: a case study of bigeye tuna in the Atlantic Ocean

Author: Dai Yang
Fan Wei
Shi Huiming
Yang Shenglong
Publication venue: Instituto Oceanográfico - Universidade de São Paulo
Publication date: 01/06/2021

Support vector machine (SVM) has been shown to perform better in catch per unit of effort (CPUE) standardization than other methods. SVM performance depends strongly on parameter selection, which has not been discussed for CPUE standardization. Analyzing the influence of parameter selection on SVM performance for CPUE standardization could improve model construction and performance, and thus provide useful information for stock assessment and management. We applied SVM to standardize longline catch per unit of fishing effort for bigeye tuna (Thunnus obesus) in the tropical fishing area of the Atlantic Ocean and evaluated three parameter optimization methods: a Grid Search method and two improved hybrid algorithms, namely SVM combined with particle swarm optimization (PSO-SVM) and with genetic algorithms (GA-SVM), in order to increase the strength of SVM. The mean absolute error (MAE), mean square error (MSE), three types of correlation coefficients, and the normalized mean square error (NMSE) were computed to compare algorithm performance. The PSO-SVM and GA-SVM algorithms scored particularly well on these indicators in the training data and dataset, with PSO-SVM marginally better than GA-SVM. The Grid Search algorithm scored best on these indicators in the testing data. In general, PSO was appropriate for optimizing the SVM parameters in CPUE standardization. The standardized CPUE was unstable and low from 2007 to 2011, increased during 2011-2013, then decreased from 2015 to 2017. The abundance index was lower than before 2000 and showed a decreasing trend in recent years
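The error measures named in this abstract can be sketched as follows. The abstract does not define NMSE, so the version here (MSE divided by the population variance of the observed values) is an assumed convention, and the sample numbers are invented for illustration rather than taken from the study.

```python
from statistics import fmean, pvariance

def mae(observed, predicted):
    # Mean absolute error.
    return fmean(abs(o - p) for o, p in zip(observed, predicted))

def mse(observed, predicted):
    # Mean square error.
    return fmean((o - p) ** 2 for o, p in zip(observed, predicted))

def nmse(observed, predicted):
    # Assumed convention: MSE normalized by the variance of the observations.
    return mse(observed, predicted) / pvariance(observed)

# Illustrative values only, not CPUE data from the paper.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
print(mae(observed, predicted), mse(observed, predicted), nmse(observed, predicted))
```

Under this convention an NMSE near 0 indicates a close fit, while a value of 1 is what a model predicting the mean of the observations would score.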

Directory of Open Access Journals

Cadernos Espinosanos (E-Journal)

Customer Churn Prediction of Telecom Company Using Machine Learning Algorithms

Author: Chong Angela Yi Wen
Chuah Wen Xu
Khaw Khai Wah
Yeong Wai Chung
Publication venue: 'Penerbit UTHM'
Publication date: 03/10/2023

Telecommunications has become a significant part of our everyday lives. Since the Covid-19 pandemic, the telecommunication industry has become crucial, and it now enjoys growth opportunities. This study uses six supervised machine learning algorithms, KNN, Random Forest (RF), AdaBoost, Logistic Regression (LR), XGBoost, and Support Vector Machine (SVM), to predict the customer churn of a telecom company in California. The first goal is to identify the classifier that predicts customer churn most effectively. With an accuracy of 79.67%, precision of 64.67%, recall of 51.87%, and F1-score of 57.57%, XGBoost is the most effective classifier overall in this study. The second goal is to identify the characteristics of customers who are most likely to leave the telecom company. These characteristics were identified from customers' demographics and account information. Lastly, the study provides the company with advice on how to retain customers: personalize the customer experience, implement a customer loyalty program, and apply AI in customer relationship management
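As a quick consistency check on the reported figures, the F1-score is the harmonic mean of precision and recall. The sketch below uses only the precision and recall quoted in the abstract; the function name `f1_score` is chosen here for illustration.

```python
def f1_score(precision, recall):
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

# Precision and recall reported for XGBoost in the abstract.
f1 = f1_score(0.6467, 0.5187)
print(f"{f1:.2%}")  # prints 57.57%
```

The result agrees with the F1-score of 57.57% stated in the abstract.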

Journals of Universiti Tun Hussein Onn Malaysia (UTHM)

Development of cognitive workload models to detect driving impairment

Author: Becerra Sánchez Enriqueta Patricia
Publication venue: Universitat Politècnica de Catalunya
Publication date: 20/09/2021

Thesis written in Spanish.

Driving a vehicle is a complex activity exposed to continuous changes such as speed limits and vehicular traffic. Drivers require a high degree of concentration when performing this activity, which increases the mental demand known as cognitive workload, so that the slightest negligence can cause a vehicular accident. In fact, human error is the leading contributing factor in over 90% of road accidents. In recent years, subjects' cognitive workload levels while driving a vehicle have been predicted using subjective and vehicle performance tools. Other research has emphasized the use and analysis of physiological information, where electroencephalographic (EEG) signals are the most widely used to identify cognitive states because of their high precision. Although significant progress has been made in this area, these investigations have been based on traditional techniques or on data analysis from a specific source because of the information's complexity. A new trend has opened in the study of the internal behavior of subjects by implementing machine learning techniques to analyze information from various sources. However, several challenges remain in this new line of research.

This doctoral thesis presents a new model, called GALoRSI-SVMRBF (Genetic Algorithms and Logistic Regression for the Structuring of Information-Support Vector Machine with Radial Basis Function Kernel), to predict states of low and high cognitive workload in subjects facing vehicle driving scenarios. GALoRSI-SVMRBF is developed using machine learning algorithms based on information from EEG signals. Information collected from NASA-TLX, instant online self-assessment, and the error rate measure is also used in the model. First, GALoRSI-SVMRBF proposes a new method for pattern recognition based on feature selection that combines statistical tests, genetic algorithms, and logistic regression. This method consists mainly of selecting an EEG dataset and exploring the information to identify the key features that recognize cognitive states. The selected data are defined as an index for pattern recognition and used to structure a new dataset capable of optimizing the model's learning and classification process. Second, the methodology and development of a classifier for the prediction model are presented, implementing machine learning algorithms. The classifier is developed mainly in two phases, defined as training and testing. Once the prediction model has been developed, this thesis presents the validation phase of GALoRSI-SVMRBF. The validation consists of evaluating the model's adaptability to new datasets while maintaining a high prediction rate. Finally, an analysis of the performance of GALoRSI-SVMRBF is presented. The objective is to establish the model's scope and limitations, evaluating various performance metrics to find the optimal configuration for GALoRSI-SVMRBF.

We found that GALoRSI-SVMRBF successfully predicts low and high cognitive workload of subjects while driving a vehicle. In general, the model uses the information extracted from multiple EEG signals, reducing the original dataset by more than 50% and maximizing its predictive capacity, achieving a precision rate of >90% in the classification of the information. The experiments in this thesis showed that obtaining a high prediction percentage depends on several factors, from applying a useful data collection technique to the last step of the prediction model.

Driving a vehicle is a complex activity exposed to demands that change continuously because of factors such as the speed limit, obstacles on the road, and vehicular traffic, among others. When performing this activity, drivers require a high degree of concentration, which increases the amount of mental demand known as workload. In recent years, mechanisms have been proposed to monitor and/or predict subjects' cognitive workload levels while driving a vehicle, focusing on the use of subjective and vehicle performance tools. Other research has emphasized the use and analysis of physiological information, with electroencephalographic (EEG) signals being the most widely used to identify cognitive states because of their high precision. Despite the great progress made, these investigations have been based on traditional techniques or on the analysis of information from specific sources to identify the subject's internal state, yielding overtrained or robust models and increasing the analysis time, which affects model performance.

This doctoral thesis presents a new model, called GALoRSI-SVMRBF (Genetic Algorithms and Logistic Regression for the Structuring of Information-Support Vector Machine with Radial Basis Function Kernel), to predict states of low and high cognitive workload in subjects facing vehicle driving scenarios. GALoRSI-SVMRBF was developed using machine learning algorithms and statistical techniques based on information from EEG signals. First, GALoRSI-SVMRBF creates a database by extracting, through statistical techniques, the features that will be used in the model. Next, it proposes a new method for pattern recognition based on feature selection that combines statistical tests, genetic algorithms, and logistic regression. This method consists mainly of selecting an EEG dataset and exploring combinations of the information to identify the key features that contribute to recognizing two cognitive states. The selected information is then defined as an index for pattern recognition and used to structure a new dataset that supports information from one or multiple channels, in order to optimize the model's learning and classification process. Finally, the classifier of the prediction model is developed, which consists of two stages defined as training and testing. We found that GALoRSI-SVMRBF successfully predicts low and high cognitive workload of subjects while driving a vehicle. In general, it was observed that the model uses the information extracted from one or multiple EEG signals, achieving a precision rate of >90% in the classification of the information.

Postprint (published version

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa

Evolving machine learning and deep learning models using evolutionary algorithms

Author: Xie Hailun

Despite their great success in data mining, machine learning and deep learning models are still subject to material obstacles when tackling real-life challenges, such as feature selection, initialization sensitivity, and hyperparameter optimization. The prevalence of these obstacles has severely constrained conventional machine learning and deep learning methods from fulfilling their potential. In this research, three evolving machine learning models and one evolving deep learning model are proposed to eliminate the above bottlenecks, i.e. improving model initialization, enhancing feature representation, and optimizing model configuration, respectively, through hybridization between advanced evolutionary algorithms and conventional ML and DL methods. Specifically, two Firefly Algorithm based evolutionary clustering models are proposed to optimize cluster centroids in K-means and overcome initialization sensitivity as well as local stagnation. Secondly, a Particle Swarm Optimization based evolving feature selection model is developed for automatic identification of the most effective feature subset and reduction of feature dimensionality for tackling classification problems. Lastly, a Grey Wolf Optimizer based evolving Convolutional Neural Network-Long Short-Term Memory method is devised for automatic generation of the optimal topological and learning configurations for Convolutional Neural Network-Long Short-Term Memory networks to undertake multivariate time series prediction problems. Moreover, a variety of tailored search strategies are proposed to eliminate the intrinsic limitations embedded in the search mechanisms of the three employed evolutionary algorithms, i.e. the dictation of the global best signal in Particle Swarm Optimization, the constraint of the diagonal movement in Firefly Algorithm, and the acute contraction of search territory in Grey Wolf Optimizer, respectively. 
The remedy strategies include the diversification of guiding signals, the adaptive nonlinear search parameters, the hybrid position updating mechanisms, as well as the enhancement of population leaders. As such, the enhanced Particle Swarm Optimization, Firefly Algorithm, and Grey Wolf Optimizer variants are more likely to attain global optimality on complex search landscapes embedded in data mining problems, owing to the elevated search diversity as well as the achievement of advanced trade-offs between exploration and exploitation
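The "dictation of the global best signal" mentioned in this abstract is easiest to see in the standard Particle Swarm Optimization update, sketched below. This is the textbook baseline, not the enhanced variant proposed in the thesis; the function name `pso_minimize`, the coefficient values, and the `sphere` test function are assumptions chosen for illustration.

```python
import random

def sphere(x):
    # Simple illustrative objective with its minimum at the origin.
    return sum(v * v for v in x)

def pso_minimize(objective, bounds, n_particles=15, n_iter=60,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Textbook global-best PSO for minimization (sketch)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * len(bounds) for _ in range(n_particles)]
    pbest = [list(p) for p in pos]
    pbest_f = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = list(pbest[g]), pbest_f[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                # Every particle is pulled toward the single global best,
                # which is the guiding signal the abstract refers to.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            f = objective(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = list(pos[i]), f
                if f < gbest_f:
                    gbest, gbest_f = list(pos[i]), f
    return gbest, gbest_f

best_x, best_f = pso_minimize(sphere, [(-5.0, 5.0)] * 3)
```

Because a single `gbest` steers every velocity update, the swarm can collapse onto one region early; diversifying the guiding signals, as the abstract describes, is one way to counter that.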

Northumbria University Research Portal