Optimizing Weights And Biases in MLP Using Whale Optimization Algorithm
Artificial Neural Networks are intelligent, non-parametric mathematical models inspired by the human nervous system. They have been widely studied and applied to classification, pattern recognition, and forecasting problems. The main challenges in training an Artificial Neural Network are its learning process, its nonlinear nature, and the unknown best set of controlling parameters (weights and biases). When Artificial Neural Networks are trained with conventional training algorithms, they suffer from local optima stagnation and slow convergence, which makes stochastic optimization algorithms a compelling alternative. This thesis proposes a trainer based on the recently proposed Whale Optimization Algorithm (WOA), which has been shown to solve a wide range of optimization problems and outperform existing algorithms. This track record motivated our attempt to benchmark its performance in training feed-forward neural networks. We tested the proposed WOA-MLP trainer on a set of 20 datasets of varying difficulty and verified the results by comparing WOA-MLP against back-propagation and six evolutionary techniques. The results show that the proposed trainer outperforms the current algorithms on the majority of datasets in terms of local optima avoidance and convergence speed.
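The core idea is to treat all of the MLP's weights and biases as one flat vector whose fitness is the training error, and to move a population of candidate vectors with WOA's encircling and spiral updates. The sketch below is not the thesis code; the toy dataset, layer sizes, and hyperparameters are illustrative assumptions, and the WOA update follows the standard formulation with spiral constant b = 1.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary-classification data (an assumption for illustration).
X = rng.normal(size=(100, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

n_in, n_hid, n_out = 4, 6, 1
dim = n_in * n_hid + n_hid + n_hid * n_out + n_out  # all weights and biases

def unpack(v):
    """Split a flat candidate vector into the MLP's weights and biases."""
    i = 0
    W1 = v[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
    b1 = v[i:i + n_hid]; i += n_hid
    W2 = v[i:i + n_hid * n_out].reshape(n_hid, n_out); i += n_hid * n_out
    b2 = v[i:]
    return W1, b1, W2, b2

def fitness(v):
    """Training MSE of the MLP encoded by v (lower is better)."""
    W1, b1, W2, b2 = unpack(v)
    h = np.tanh(X @ W1 + b1)
    out = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))  # sigmoid output
    return np.mean((out.ravel() - y) ** 2)

pop_size, iters = 20, 200
whales = rng.uniform(-1, 1, size=(pop_size, dim))
fits = np.array([fitness(w) for w in whales])
best = whales[fits.argmin()].copy()
best_f = fits.min()

for t in range(iters):
    a = 2 - 2 * t / iters  # control parameter decreasing linearly from 2 to 0
    for i in range(pop_size):
        A = 2 * a * rng.random() - a
        C = 2 * rng.random()
        if rng.random() < 0.5:
            if abs(A) < 1:  # exploitation: encircle the best whale
                whales[i] = best - A * np.abs(C * best - whales[i])
            else:           # exploration: move relative to a random whale
                rand = whales[rng.integers(pop_size)]
                whales[i] = rand - A * np.abs(C * rand - whales[i])
        else:               # bubble-net spiral move toward the best whale
            l = rng.uniform(-1, 1)
            whales[i] = (np.abs(best - whales[i])
                         * np.exp(l) * np.cos(2 * np.pi * l) + best)
        f = fitness(whales[i])
        if f < best_f:      # greedily keep the best solution found so far
            best_f, best = f, whales[i].copy()

print("best training MSE:", best_f)
```

Because the fitness function only needs forward passes, the same loop works unchanged for non-differentiable activations or error measures.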
Training Artificial Neural Networks by Coordinate Search Algorithm
Training Artificial Neural Networks poses a challenging and critical problem
in machine learning. Despite the effectiveness of gradient-based learning
methods, such as Stochastic Gradient Descent (SGD), in training neural
networks, they do have several limitations. For instance, they require
differentiable activation functions and cannot optimize a model against
several independent, non-differentiable loss functions simultaneously; with a
gradient-free optimization algorithm, a metric such as the F1-score, normally
reserved for testing, can be used directly during training. Furthermore,
gradient-free methods make it feasible to train a DNN with only a small
training dataset. To address these concerns, we propose an efficient version of the
gradient-free Coordinate Search (CS) algorithm, an instance of General Pattern
Search methods, for training neural networks. The proposed algorithm can be
used with non-differentiable activation functions and tailored to
multi-objective/multi-loss problems. Finding the optimal values for weights of
ANNs is a large-scale optimization problem. Therefore, instead of finding the
optimal value for each variable individually, which is the common technique in
classical CS, we accelerate optimization and convergence by bundling the
weights; this strategy is, in effect, a form of dimension reduction. Based
on the experimental results, the proposed method, in some cases, outperforms
the gradient-based approach, particularly in situations with insufficient
labeled training data. The performance plots demonstrate a high convergence
rate, highlighting the capability of our suggested method to find a reasonable
solution with fewer function calls. As of now, the only practical and efficient
way of training ANNs with hundreds of thousands of weights is gradient-based
algorithms such as SGD or Adam. In this paper, we introduce an alternative
method for training ANNs.
Comment: 7 pages, 9 figures
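To illustrate, a minimal gradient-free coordinate search with bundled variables might look like the following. This is a hedged sketch, not the authors' implementation: the quadratic objective stands in for an arbitrary (possibly non-differentiable) loss, and the bundle size and step-halving schedule are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy least-squares objective standing in for a non-differentiable loss.
A_mat = rng.normal(size=(50, 30))
target = rng.normal(size=50)

def loss(w):
    return np.mean((A_mat @ w - target) ** 2)

# Group the 30 variables into bundles of 5 that move together, which
# reduces the effective dimension of the search from 30 to 6.
dim, bundle_size = 30, 5
bundles = [np.arange(i, min(i + bundle_size, dim))
           for i in range(0, dim, bundle_size)]

w = np.zeros(dim)
step = 1.0
best = loss(w)

for it in range(200):
    improved = False
    for idx in bundles:
        for sign in (+1.0, -1.0):   # poll both directions for each bundle
            trial = w.copy()
            trial[idx] += sign * step
            f = loss(trial)
            if f < best:            # accept the first improving move
                w, best, improved = trial, f, True
                break
    if not improved:
        step *= 0.5                 # contract the mesh, as in pattern search
    if step < 1e-6:
        break

print("final loss:", best)
```

Each outer iteration costs at most two function calls per bundle, so coarser bundles trade search resolution for fewer evaluations.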
Study and Observation of the Variations of Accuracies for Handwritten Digits Recognition with Various Hidden Layers and Epochs using Neural Network Algorithm
In recent years, Artificial Neural Networks (ANN) have been applied to a vast
range of fields including business, medicine, and engineering. The most
popular areas where ANN is employed nowadays are pattern and sequence
recognition, novelty detection, character recognition, regression analysis,
speech recognition, image compression, stock market prediction, electronic
noses, security, loan applications, data processing, robotics, and control. The
benefits associated with these broad applications have led to the increasing
popularity of ANN in the 21st century. ANN confers many benefits, such as organic
learning, nonlinear data processing, fault tolerance, and self-repairing
compared to other conventional approaches. The primary objective of this paper
is to analyze the influence of the hidden layers of a neural network on the
overall performance of the network. To demonstrate this influence, we applied
neural networks with different numbers of hidden layers to the MNIST dataset.
Another goal is to observe the variation in the accuracy of the ANN for
different numbers of hidden layers and epochs, and to compare and contrast the
results.
Comment: To be published in the 4th IEEE International Conference on
Electrical Engineering and Information & Communication Technology (iCEEiCT
2018)
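As a rough illustration of such an experiment, the following hedged sketch sweeps the number of hidden layers and epochs and prints test accuracy. It substitutes scikit-learn's small digits dataset for MNIST so it runs without downloads; the layer width (64 units) and the epoch grid are assumptions, not the paper's settings.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Small digits dataset as a stand-in for MNIST (8x8 images, 10 classes).
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X / 16.0, y, random_state=0)

for n_layers in (1, 2, 3):            # vary network depth
    for epochs in (20, 50, 100):      # vary training epochs
        clf = MLPClassifier(hidden_layer_sizes=(64,) * n_layers,
                            max_iter=epochs, random_state=0)
        clf.fit(X_train, y_train)
        acc = clf.score(X_test, y_test)
        print(f"layers={n_layers} epochs={epochs} accuracy={acc:.3f}")
```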