Search CORE

693 research outputs found

An Evolutionary Optimization Algorithm for Automated Classical Machine Learning

Author: Zahedi Leila
Publication venue: FIU Digital Commons
Publication date: 29/06/2022

Machine learning is an evolving branch of computational algorithms that allow computers to learn from experience, make predictions, and solve different problems without being explicitly programmed. However, building a useful machine learning model is a challenging process, requiring human expertise to perform various tasks properly and ensure that machine learning's primary objective (determining the best and most predictive model) is achieved. These tasks include pre-processing, feature selection, and model selection. Many machine learning models developed by experts are designed manually and by trial and error. In other words, even experts need time and resources to create good predictive machine learning models. The idea of automated machine learning (AutoML) is to automate a machine learning pipeline to relieve the burden of substantial development costs and manual processes. The algorithms leveraged in these systems have different hyper-parameters, and different input datasets have different features. In both cases, the final performance of the model is closely related to the final selected configuration of features and hyper-parameters. That is why feature selection and hyper-parameter tuning are considered crucial tasks in AutoML. The challenges regarding the computationally expensive nature of tuning hyper-parameters and optimally selecting features create significant opportunities for filling the research gaps in the AutoML field. This dissertation explores how to select the features and tune the hyper-parameters of conventional machine learning algorithms efficiently and automatically. To address the challenges in the AutoML area, novel algorithms for hyper-parameter tuning and feature selection are proposed. The hyper-parameter tuning algorithm aims to provide the optimal set of hyper-parameters in three conventional machine learning models (Random Forest, XGBoost and Support Vector Machine) to obtain the best performance scores. The feature selection algorithm, in turn, looks for the optimal subset of features to achieve the highest performance. Afterward, a hybrid framework is designed for both hyper-parameter tuning and feature selection. The proposed framework can discover a configuration of features and hyper-parameters that is close to optimal. The proposed framework includes the following components: (1) an automatic feature selection component based on artificial bee colony algorithms and machine learning training, and (2) an automatic hyper-parameter tuning component based on artificial bee colony algorithms and machine learning training for faster training and convergence of the learning models. The whole framework has been evaluated using four real-world datasets in different applications. This framework is an attempt to alleviate the challenges of hyper-parameter tuning and feature selection by using efficient algorithms. However, distributed processing, distributed learning, parallel computing, and other big data solutions are not taken into consideration in this framework.
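
The artificial bee colony search named in the abstract above can be illustrated with a minimal sketch. This is a generic textbook-style variant (employed, onlooker, and scout phases), not the dissertation's algorithm. The function abc_minimize, its parameters, and the objective toy_error are hypothetical; toy_error only stands in for the validation error of a model as a function of two hyper-parameters.

```python
import random


def abc_minimize(objective, bounds, n_sources=10, limit=5, iterations=50, seed=0):
    """Generic artificial bee colony search over a box-bounded space (illustrative)."""
    rng = random.Random(seed)
    dim = len(bounds)

    def random_source():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    sources = [random_source() for _ in range(n_sources)]
    costs = [objective(s) for s in sources]
    trials = [0] * n_sources  # consecutive failed improvements per source

    def try_improve(i):
        # Perturb one coordinate of source i relative to a random partner k.
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        cand = sources[i][:]
        cand[d] += rng.uniform(-1.0, 1.0) * (sources[i][d] - sources[k][d])
        lo, hi = bounds[d]
        cand[d] = min(max(cand[d], lo), hi)
        cost = objective(cand)
        if cost < costs[i]:  # greedy selection
            sources[i], costs[i], trials[i] = cand, cost, 0
        else:
            trials[i] += 1

    best_i = min(range(n_sources), key=costs.__getitem__)
    best, best_cost = sources[best_i][:], costs[best_i]

    for _ in range(iterations):
        # Employed bees: one local move per food source.
        for i in range(n_sources):
            try_improve(i)
        # Onlooker bees: revisit sources in proportion to fitness.
        # The 1 / (1 + cost) fitness assumes a non-negative cost.
        fitness = [1.0 / (1.0 + c) for c in costs]
        for _ in range(n_sources):
            try_improve(rng.choices(range(n_sources), weights=fitness)[0])
        # Memorize the best source before scouts can abandon it.
        best_i = min(range(n_sources), key=costs.__getitem__)
        if costs[best_i] < best_cost:
            best, best_cost = sources[best_i][:], costs[best_i]
        # Scout bees: replace sources that stopped improving.
        for i in range(n_sources):
            if trials[i] > limit:
                sources[i] = random_source()
                costs[i] = objective(sources[i])
                trials[i] = 0

    return best, best_cost


def toy_error(params):
    # Hypothetical stand-in for a validation error; minimum at (1, -2).
    return (params[0] - 1.0) ** 2 + (params[1] + 2.0) ** 2


if __name__ == "__main__":
    print(abc_minimize(toy_error, [(-3.0, 3.0), (-3.0, 3.0)]))
```

In a real tuning run the objective would train and validate a model for each candidate configuration, which is what makes the search expensive.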

DigitalCommons@Florida International University

Metaheuristic design of feedforward neural networks: a review of two decades of research

Author: Varun Kumar Ojha
Ajith Abraham
Václav Snášel
Publication venue: Elsevier BV
Publication date: 01/01/2017

Over the past two decades, feedforward neural network (FNN) optimization has been a key interest among researchers and practitioners in multiple disciplines. FNN optimization is often viewed from various perspectives: the optimization of weights, network architecture, activation nodes, learning parameters, learning environment, etc. Researchers adopted these different viewpoints mainly to improve the FNN's generalization ability. Gradient-descent algorithms such as backpropagation have been widely applied to optimize FNNs. Their success is evident from the FNN's application to numerous real-world problems. However, due to the limitations of gradient-based optimization methods, metaheuristic algorithms, including evolutionary algorithms, swarm intelligence, etc., are still being widely explored by researchers aiming to obtain a generalized FNN for a given problem. This article attempts to summarize a broad spectrum of FNN optimization methodologies, including conventional and metaheuristic approaches. It also tries to connect various research directions that emerged from FNN optimization practices, such as evolving neural networks (NNs), cooperative coevolution NNs, complex-valued NNs, deep learning, extreme learning machines, quantum NNs, etc. Additionally, it presents interesting research challenges for future work to cope with the present information processing era.
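
As one illustration of the weight-optimization perspective described above, the sketch below trains a tiny 2-2-1 network on XOR with differential evolution (the DE/rand/1/bin scheme) instead of gradient descent. Differential evolution is chosen here only as a representative metaheuristic; the network layout, the parameter values, and the names forward, mse, and differential_evolution are assumptions made for this example and are not taken from the article.

```python
import math
import random

XOR = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
N_WEIGHTS = 9  # 4 hidden weights + 2 hidden biases + 2 output weights + 1 output bias


def forward(w, x):
    """2-2-1 network: tanh hidden units, sigmoid output."""
    h0 = math.tanh(w[0] * x[0] + w[1] * x[1] + w[4])
    h1 = math.tanh(w[2] * x[0] + w[3] * x[1] + w[5])
    z = w[6] * h0 + w[7] * h1 + w[8]
    z = max(-60.0, min(60.0, z))  # clamp to keep exp() in range
    return 1.0 / (1.0 + math.exp(-z))


def mse(w):
    return sum((forward(w, x) - y) ** 2 for x, y in XOR) / len(XOR)


def differential_evolution(loss, dim, pop_size=20, generations=200,
                           f=0.7, cr=0.9, seed=0):
    """DE/rand/1/bin with greedy one-to-one replacement (illustrative)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    costs = [loss(p) for p in pop]
    history = [min(costs)]  # best loss after each generation
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one mutated coordinate
            trial = [
                pop[a][d] + f * (pop[b][d] - pop[c][d])
                if (rng.random() < cr or d == forced) else pop[i][d]
                for d in range(dim)
            ]
            trial_cost = loss(trial)
            if trial_cost <= costs[i]:  # keep the trial only if it is no worse
                pop[i], costs[i] = trial, trial_cost
        history.append(min(costs))
    best = min(range(pop_size), key=costs.__getitem__)
    return pop[best], costs[best], history


if __name__ == "__main__":
    weights, loss, history = differential_evolution(mse, N_WEIGHTS)
    print(history[0], loss)
```

Because replacement is greedy, the best loss in the population never increases from one generation to the next, and no gradient of the network is ever computed.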

arXiv.org e-Print Archive

Central Archive at the University of Reading

Repository for Publications and Research Data

Crossref

DSpace at VSB Technical University of Ostrava

Integrated bio-search approaches with multi-objective algorithms for optimization and classification problem

Author: Basir Mohammad Aizat
Hussin Mohamed Saifullah
Yusof Yuhanis
Publication venue: Universitas Ahmad Dahlan
Publication date: 01/10/2020

Optimal selection of features is crucial but very difficult to achieve, particularly for classification tasks. This is because traditional feature selection methods operate independently and generate collections of irrelevant features, which in turn degrades classification accuracy. The goal of this paper is to leverage the potential of bio-inspired search algorithms, together with a wrapper, in optimizing multi-objective algorithms, namely ENORA and NSGA-II, to generate an optimal set of features. The main steps are to find the ideal combination of ENORA and NSGA-II with suitable bio-search algorithms, for which multiple subset generation has been implemented. The next step is to validate the optimum feature set by conducting a subset evaluation. Eight comparison datasets of various sizes were deliberately selected for evaluation. The results show that the ideal combination of multi-objective algorithms, namely ENORA and NSGA-II, with the selected bio-inspired search algorithm is promising for achieving a better optimal solution (i.e., the best features with higher classification accuracy) for the selected datasets. This finding implies that bio-inspired wrapper/filter algorithms can boost the efficiency of ENORA and NSGA-II for the task of selecting and classifying features.
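
Multi-objective algorithms such as NSGA-II rank candidates by Pareto dominance rather than by a single score. The sketch below shows only that core idea for feature subsets, assuming two objectives to minimize (classification error and number of features). The subset names and their values are invented for illustration, and the full NSGA-II and ENORA procedures are not reproduced.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates."""
    return sorted(
        name for name, objectives in candidates.items()
        if not any(dominates(other, objectives) for other in candidates.values())
    )


# Hypothetical feature subsets: (classification error, number of features).
subsets = {
    "A": (0.10, 8),
    "B": (0.12, 5),
    "C": (0.12, 6),  # dominated by B: same error, more features
    "D": (0.20, 3),
    "E": (0.25, 3),  # dominated by D: same size, higher error
}

if __name__ == "__main__":
    print(pareto_front(subsets))  # ['A', 'B', 'D']
```

Subsets A, B, and D each represent a different trade-off between accuracy and subset size, so none of them can be discarded without a preference between the two objectives.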

Journal of Education and Learning (EduLearn)

TELKOMNIKA (Telecommunication Computing Electronics and Control)

UAD Journal Management System

A framework for feature selection through boosting

Author: Alsahaf Ahmad
Azzopardi George
Petkov Nicolai
Shenoy Vikram
Publication venue: Elsevier BV
Publication date: 01/01/2022

As the dimensions of datasets in predictive modelling continue to grow, feature selection becomes increasingly practical. Datasets with complex feature interactions and high levels of redundancy still present a challenge to existing feature selection methods. We propose a novel framework for feature selection that relies on boosting, or sample re-weighting, to select sets of informative features in classification problems. The method uses as its basis the feature rankings derived from fast and scalable tree-boosting models, such as XGBoost. We compare the proposed method to standard feature selection algorithms on 9 benchmark datasets. We show that the proposed approach reaches higher accuracies with fewer features on most of the tested datasets, and that the selected features have lower redundancy.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Applications of Nature-Inspired Algorithms for Dimension Reduction: Enabling Efficient Data Analytics

Author: A Adeli
A Imteaj
A Zhang
AH Hamamoto
AI Hafez
C Yan
E Hancer
F Harfouchi
F Xie
FG Mohammadi
H Peng
H Rao
H Shi
H Wang
J Kaur
J Pierezan
K Ahmed
K Kira
L Ke
M Gong
M Kumari
M Tubishat
MH Amini
MM Kabir
Mohamed Abd Elaziz
MR Mozafar
N Kozodoi
P Moradi
Q Al-Tashi
Q-T Bui
R Hang
R Vanaja
RR Chhikara
S Arora
S Gupta
S Khan
S Roy
S Tabakhi
V Rostami
X-L Li
X-Y Liu
Y Cao
Y Dong
Y Pathak
Y Xue
Y Zhang
Publication venue: FIU Digital Commons
Publication date: 01/08/2019

In [1], we have explored the theoretical aspects of feature selection and evolutionary algorithms. In this chapter, we focus on optimization algorithms for enhancing the data analytics process, i.e., we propose to explore applications of nature-inspired algorithms in data science. Feature selection optimization is a hybrid approach that leverages feature selection techniques and evolutionary algorithms to optimize the selected features. Prior works solve this problem iteratively to converge to an optimal feature subset. Feature selection optimization is not specific to any domain. Data scientists mainly attempt to find advanced ways to analyze data with high computational efficiency and low time complexity, leading to efficient data analytics. As the volume of generated, measured, and sensed data from various sources increases, the analysis, manipulation, and illustration of data grow exponentially. Owing to large-scale data sets, the curse of dimensionality (CoD) is one of the NP-hard problems in data science. Hence, several efforts have focused on leveraging evolutionary algorithms (EAs) to address the complex issues in large-scale data analytics problems. Dimension reduction, together with EAs, lends itself to addressing CoD and solving complex problems efficiently in terms of time complexity. In this chapter, we first provide a brief overview of previous studies that focused on solving CoD using a feature extraction optimization process. We then discuss practical examples of research studies that have successfully tackled application domains such as image processing, sentiment analysis, network traffic/anomaly analysis, credit score analysis, and other benchmark function/data set analysis.

arXiv.org e-Print Archive

Crossref

DigitalCommons@Florida International University

Oversampling technique in student performance classification from engineering course

Author: Punlumjeak Wattana
Rachburee Nachirat
Publication venue: Institute of Advanced Engineering and Science
Publication date: 01/08/2021

The first year is important for an engineering student's academic planning. All first-year subjects are essential as an engineering foundation. Student performance prediction helps academics improve students' performance, and students can check their performance themselves. If they are aware that their performance is low, they can take steps to improve it. This research focused on combining oversampling of minority class data with various kinds of classifier models. The oversampling techniques were SMOTE, Borderline-SMOTE, SVMSMOTE, and ADASYN, and four classifiers were applied: MLP, gradient boosting, AdaBoost, and random forest. The results showed that Borderline-SMOTE gave the best result for minority class prediction with several classifiers.
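
The oversampling techniques listed above all build on the basic SMOTE idea: a synthetic minority sample is placed on the line segment between a minority point and one of its nearest minority neighbours. The sketch below is a simplified, dependency-free illustration of that interpolation step only. The function name smote, its parameters, and the sample data are assumptions for this example; it does not reproduce Borderline-SMOTE, SVMSMOTE, ADASYN, or the study's setup.

```python
import math
import random


def smote(minority, n_synthetic, k=2, seed=0):
    """Create synthetic minority samples by interpolating between a randomly
    chosen minority point and one of its k nearest minority neighbours.
    Requires at least two minority points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        x = minority[i]
        # k nearest neighbours of x among the other minority points.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        u = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Hypothetical two-feature minority class.
minority_points = [(1.0, 1.0), (2.0, 1.5), (1.5, 3.0), (3.0, 2.0)]

if __name__ == "__main__":
    for point in smote(minority_points, 3):
        print(point)
```

Every synthetic point is a convex combination of two real minority points, so it stays inside the region the minority class already occupies. The variants named in the abstract differ mainly in which minority points they choose to interpolate from.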

ZENODO

Institute of Advanced Engineering and Science

Predicting Arrhythmia Based on Machine Learning Using Improved Harris Hawk Algorithm

Author: Nitesh Sureja et al.
Publication venue: Auricle Global Society of Education and Research
Publication date: 02/11/2023

Arrhythmia disease is widely recognized as a prominent and lethal ailment on a global scale, resulting in a significant number of fatalities annually. The timely identification of this ailment is crucial for preserving individuals' lives. Machine Learning (ML), a branch of artificial intelligence (AI), has emerged as a highly efficient and cost-effective method for illness detection. The objective of this work is to develop an ML model capable of accurately predicting heart illness by using the Arrhythmia disease dataset, with the purpose of achieving optimal performance. The performance of the model is greatly influenced by the selection of the machine learning method and the features in the dataset used for training. In order to mitigate the issue of overfitting caused by the high dimensionality of the features in the Arrhythmia dataset, the dataset was reduced to a lower-dimensional subspace via the improved Harris hawk optimization algorithm (iHHO). The Harris hawk algorithm exhibits a rapid convergence rate and possesses a notable degree of adaptability in its ability to identify optimal characteristics. The performance of the models created with the feature-selected dataset using various machine learning techniques was evaluated and compared. In this work, a total of seven classifiers (SVM, GB, GNB, RF, LR, DT, and KNN) are used to classify the data produced by the iHHO algorithm. The results clearly show an improvement of 3%, 4%, 4%, 9%, 8%, 3%, and 9% with the classifiers KNN, RF, GB, SVM, LR, DT, and GNB respectively.

International Journal on Recent and Innovation Trends in Computing and Communication

Software Reliability Prediction using Correlation Constrained Multi-Objective Evolutionary Optimization Algorithm

Author: Yadav Neha
Yadav Vibhash
Publication venue: Faculty of Electrical Engineering, J.J. Strossmayer University of Osijek
Publication date: 01/01/2023

Software reliability frameworks are extremely effective for estimating the probability of software failure over time. Numerous approaches for predicting software dependability have been presented, but none of them has proven effective. Predicting the number of software faults throughout the research and testing phases is a serious problem. There are several software metrics, such as object-oriented design metrics, public and private attributes, methods, previous bug metrics, and software change metrics. Many researchers have identified these metrics and used them to predict software reliability, but none of them contributed to identifying relations among these metrics and exploring the most optimal metrics. Therefore, this paper proposes a correlation-constrained multi-objective evolutionary optimization algorithm (CCMOEO) for software reliability prediction. CCMOEO is an effective optimization approach for estimating the variables of popular reliability growth models. To obtain the highest classification effectiveness, the suggested CCMOEO approach overcomes modeling uncertainties by integrating various metrics with multiple objective functions. The hypothesized models were formulated using evaluation results on five distinct datasets in this research. The prediction was evaluated on seven different machine learning algorithms, i.e., linear support vector machine (LSVM), radial support vector machine (RSVM), decision tree, random forest, gradient boosting, k-nearest neighbor, and linear regression. The result analysis shows that random forest achieved better performance.

HRČAK - Portal of Croatian Scientific and Professional Journals

A modified mayfly-SVM approach for early detection of type 2 diabetes mellitus

Author: Patil Kanishk
Patil Ratna
Rawandale Shitalkumar Adhar
Tamane Sharvari
Publication venue: Institute of Advanced Engineering and Science
Publication date: 01/02/2022

Diabetes mellitus is a chronic disease that badly affects many people around the world. Early diagnosis of this disease is of paramount importance, as physicians and patients can work towards prevention and mitigation of future complications. Hence, there is a need to develop a system that diagnoses type 2 diabetes mellitus (T2DM) at an early stage. Recently, a large number of studies have emerged with prediction models to diagnose T2DM. Most importantly, the published literature lacks multi-class studies. Therefore, the primary objective of the study is the development of a multi-class predictive model that takes advantage of routinely available clinical data in diagnosing T2DM using machine learning algorithms. In this work, a modified mayfly-support vector machine is implemented to detect the prediabetic stage accurately. To assess the effectiveness of the proposed model, a comparative study was undertaken against T2DM prediction models developed by other researchers over the last five years. The proposed model was validated on data collected from local hospitals and on the benchmark PIMA dataset available in the UCI repository. The study reveals that the modified Mayfly-SVM has a considerable edge over metaheuristic optimization algorithms in local as well as global searching capabilities and has attained a maximum test accuracy of 94.5% on PIMA.

ZENODO

Institute of Advanced Engineering and Science