Optimizing Weights And Biases in MLP Using Whale Optimization Algorithm
Artificial Neural Networks are intelligent, non-parametric mathematical models inspired by the human nervous system. They have been widely studied and applied to classification, pattern recognition, and forecasting problems. The main challenges in training an Artificial Neural Network are its learning process, its nonlinear nature, and the unknown best set of controlling parameters (weights and biases). When Artificial Neural Networks are trained with conventional training algorithms, they suffer from local optima stagnation and slow convergence, which makes stochastic optimization algorithms a compelling alternative. This thesis proposes a trainer based on the recently proposed Whale Optimization Algorithm (WOA), which has been shown to solve a wide range of optimization problems and outperform existing algorithms. This track record motivated our attempt to benchmark its performance in training feed-forward neural networks. We tested the proposed WOA-MLP trainer on a set of 20 datasets of varying difficulty and verified the results by comparing WOA-MLP against back-propagation and six evolutionary techniques. The results show that the proposed trainer outperforms the current algorithms on the majority of datasets in terms of local optima avoidance and convergence speed.
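The core idea is to treat all of the MLP's weights and biases as one flat vector whose fitness is the training error, and to move a population of candidate vectors with WOA's encircling and spiral updates. The sketch below is not the thesis code; the toy dataset, layer sizes, and hyperparameters are illustrative assumptions, and the WOA update follows the standard formulation with spiral constant b = 1.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary-classification data (an assumption for illustration).
X = rng.normal(size=(100, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

n_in, n_hid, n_out = 4, 6, 1
dim = n_in * n_hid + n_hid + n_hid * n_out + n_out  # all weights and biases

def unpack(v):
    """Split a flat candidate vector into the MLP's weights and biases."""
    i = 0
    W1 = v[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
    b1 = v[i:i + n_hid]; i += n_hid
    W2 = v[i:i + n_hid * n_out].reshape(n_hid, n_out); i += n_hid * n_out
    b2 = v[i:]
    return W1, b1, W2, b2

def fitness(v):
    """Training MSE of the MLP encoded by v (lower is better)."""
    W1, b1, W2, b2 = unpack(v)
    h = np.tanh(X @ W1 + b1)
    out = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))  # sigmoid output
    return np.mean((out.ravel() - y) ** 2)

pop_size, iters = 20, 200
whales = rng.uniform(-1, 1, size=(pop_size, dim))
fits = np.array([fitness(w) for w in whales])
best = whales[fits.argmin()].copy()
best_f = fits.min()

for t in range(iters):
    a = 2 - 2 * t / iters  # control parameter decreasing linearly from 2 to 0
    for i in range(pop_size):
        A = 2 * a * rng.random() - a
        C = 2 * rng.random()
        if rng.random() < 0.5:
            if abs(A) < 1:  # exploitation: encircle the best whale
                whales[i] = best - A * np.abs(C * best - whales[i])
            else:           # exploration: move relative to a random whale
                rand = whales[rng.integers(pop_size)]
                whales[i] = rand - A * np.abs(C * rand - whales[i])
        else:               # bubble-net spiral move toward the best whale
            l = rng.uniform(-1, 1)
            whales[i] = (np.abs(best - whales[i])
                         * np.exp(l) * np.cos(2 * np.pi * l) + best)
        f = fitness(whales[i])
        if f < best_f:      # greedily keep the best solution found so far
            best_f, best = f, whales[i].copy()

print("best training MSE:", best_f)
```

Because the fitness function only needs forward passes, the same loop works unchanged for non-differentiable activations or error measures.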
Training Artificial Neural Networks by Coordinate Search Algorithm
Training Artificial Neural Networks poses a challenging and critical problem
in machine learning. Despite the effectiveness of gradient-based learning
methods, such as Stochastic Gradient Descent (SGD), in training neural
networks, they do have several limitations. For instance, they require
differentiable activation functions and cannot optimize a model against
several independent, non-differentiable loss functions simultaneously; with a
gradient-free optimization algorithm, a metric such as the F1-score, normally
reserved for testing, can be used directly during training. Furthermore,
gradient-free methods make it feasible to train a DNN with only a small
training dataset. To address these concerns, we propose an efficient version of the
gradient-free Coordinate Search (CS) algorithm, an instance of General Pattern
Search methods, for training neural networks. The proposed algorithm can be
used with non-differentiable activation functions and tailored to
multi-objective/multi-loss problems. Finding the optimal values for weights of
ANNs is a large-scale optimization problem. Therefore, instead of finding the
optimal value for each variable individually, which is the common technique in
classical CS, we accelerate optimization and convergence by bundling the
weights; this strategy is, in effect, a form of dimension reduction. Based
on the experimental results, the proposed method, in some cases, outperforms
the gradient-based approach, particularly in situations with insufficient
labeled training data. The performance plots demonstrate a high convergence
rate, highlighting the capability of our suggested method to find a reasonable
solution with fewer function calls. As of now, the only practical and efficient
way of training ANNs with hundreds of thousands of weights is gradient-based
algorithms such as SGD or Adam. In this paper, we introduce an alternative
method for training ANNs.
Comment: 7 pages, 9 figures
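To illustrate, a minimal gradient-free coordinate search with bundled variables might look like the following. This is a hedged sketch, not the authors' implementation: the quadratic objective stands in for an arbitrary (possibly non-differentiable) loss, and the bundle size and step-halving schedule are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy least-squares objective standing in for a non-differentiable loss.
A_mat = rng.normal(size=(50, 30))
target = rng.normal(size=50)

def loss(w):
    return np.mean((A_mat @ w - target) ** 2)

# Group the 30 variables into bundles of 5 that move together, which
# reduces the effective dimension of the search from 30 to 6.
dim, bundle_size = 30, 5
bundles = [np.arange(i, min(i + bundle_size, dim))
           for i in range(0, dim, bundle_size)]

w = np.zeros(dim)
step = 1.0
best = loss(w)

for it in range(200):
    improved = False
    for idx in bundles:
        for sign in (+1.0, -1.0):   # poll both directions for each bundle
            trial = w.copy()
            trial[idx] += sign * step
            f = loss(trial)
            if f < best:            # accept the first improving move
                w, best, improved = trial, f, True
                break
    if not improved:
        step *= 0.5                 # contract the mesh, as in pattern search
    if step < 1e-6:
        break

print("final loss:", best)
```

Each outer iteration costs at most two function calls per bundle, so coarser bundles trade search resolution for fewer evaluations.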
Study and Observation of the Variations of Accuracies for Handwritten Digits Recognition with Various Hidden Layers and Epochs using Neural Network Algorithm
In recent years, Artificial Neural Networks (ANN) have been applied to a vast
range of fields including business, medicine, and engineering. The most
popular areas where ANN is employed nowadays are pattern and sequence
recognition, novelty detection, character recognition, regression analysis,
speech recognition, image compression, stock market prediction, electronic
noses, security, loan applications, data processing, robotics, and control. The
benefits associated with these broad applications have led to the increasing
popularity of ANN in the 21st century. ANN confers many benefits, such as organic
learning, nonlinear data processing, fault tolerance, and self-repairing
compared to other conventional approaches. The primary objective of this paper
is to analyze the influence of the hidden layers of a neural network on the
overall performance of the network. To demonstrate this influence, we applied
neural networks with different numbers of hidden layers to the MNIST dataset.
Another goal is to observe the variation in the accuracy of the ANN for
different numbers of hidden layers and epochs, and to compare and contrast the
results.
Comment: To be published in the 4th IEEE International Conference on
Electrical Engineering and Information & Communication Technology (iCEEiCT
2018)
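As a rough illustration of such an experiment, the following hedged sketch sweeps the number of hidden layers and epochs and prints test accuracy. It substitutes scikit-learn's small digits dataset for MNIST so it runs without downloads; the layer width (64 units) and the epoch grid are assumptions, not the paper's settings.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Small digits dataset as a stand-in for MNIST (8x8 images, 10 classes).
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X / 16.0, y, random_state=0)

for n_layers in (1, 2, 3):            # vary network depth
    for epochs in (20, 50, 100):      # vary training epochs
        clf = MLPClassifier(hidden_layer_sizes=(64,) * n_layers,
                            max_iter=epochs, random_state=0)
        clf.fit(X_train, y_train)
        acc = clf.score(X_test, y_test)
        print(f"layers={n_layers} epochs={epochs} accuracy={acc:.3f}")
```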