Bayesian Neural Architecture Search using A Training-Free Performance Metric
Recurrent neural networks (RNNs) are a powerful approach for time series
prediction. However, their performance is strongly affected by their
architecture and hyperparameter settings. The architecture optimization of RNNs
is a time-consuming task, where the search space is typically a mixture of
real, integer and categorical values. To allow for shrinking and expanding the
size of the network, the representation of architectures often has a variable
length. In this paper, we propose to tackle the architecture optimization
problem with a variant of the Bayesian Optimization (BO) algorithm. To reduce
the evaluation time of candidate architectures, the Mean Absolute Error Random
Sampling (MRS), a training-free method to estimate the network performance, is
adopted as the objective function for BO. Also, we propose three fixed-length
encoding schemes to cope with the variable-length architecture representation.
The result is a new perspective on accurate and efficient design of RNNs, which
we validate on three problems. Our findings show that 1) the BO algorithm can
explore different network architectures using the proposed encoding schemes and
successfully design well-performing architectures, and 2) using MRS
significantly reduces the optimization time without compromising performance
compared to the architectures obtained from the actual training procedure.
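The core of MRS is that an architecture can be scored without any training: draw several random weight settings, measure the prediction error of each, and use the best (or a statistic of) those errors as a proxy for trainability. The sketch below illustrates that idea on a toy single-layer RNN; the function names (`mrs_score`, `rnn_forward`), the weight scale, and the choice of the minimum MAE as the score are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def rnn_forward(x, Wx, Wh, Wo):
    """Run a minimal single-layer tanh RNN over a 1-D series and
    return its one-step-ahead predictions."""
    h = np.zeros(Wh.shape[0])
    preds = []
    for t in range(len(x) - 1):
        h = np.tanh(Wx * x[t] + Wh @ h)   # recurrent state update
        preds.append(Wo @ h)              # linear readout
    return np.array(preds)

def mrs_score(x, hidden_size=8, n_samples=30, seed=0):
    """MRS-style training-free score: the best one-step MAE achieved
    by the architecture over n_samples random (untrained) weight draws.
    Lower is better."""
    rng = np.random.default_rng(seed)
    best = np.inf
    for _ in range(n_samples):
        Wx = rng.normal(scale=0.5, size=hidden_size)
        Wh = rng.normal(scale=0.5, size=(hidden_size, hidden_size))
        Wo = rng.normal(scale=0.5, size=hidden_size)
        preds = rnn_forward(x, Wx, Wh, Wo)
        best = min(best, float(np.mean(np.abs(preds - x[1:]))))
    return best
```

In a BO loop, `mrs_score` would replace the expensive train-and-validate step as the objective evaluated for each candidate architecture, which is where the reported time savings come from.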
A sequential handwriting recognition model based on a dynamically configurable CRNN
Handwriting recognition refers to recognizing handwritten input that includes character(s) or digit(s) based on an image. Because most real-life applications of handwriting recognition involve sequential text in various languages, there is a need to develop a dynamic handwriting recognition system. Inspired by neuroevolutionary techniques, this paper proposes a Dynamically Configurable Convolutional Recurrent Neural Network (DC-CRNN) for the handwriting recognition sequence modeling task. The proposed DC-CRNN is based on the Salp Swarm Optimization Algorithm (SSA), which generates the optimal structure and hyperparameters for Convolutional Recurrent Neural Networks (CRNNs). In addition, we investigate two types of encoding techniques used to translate the output of optimization to a CRNN recognizer. Finally, we propose a novel hybridization of SSA with Late Acceptance Hill-Climbing (LAHC) to improve the exploitation process. We conducted our experiments on two well-known datasets, IAM and IFN/ENIT, which cover the English and Arabic languages respectively. The experimental results show that LAHC significantly improves the SSA search process, and the proposed DC-CRNN outperforms handcrafted CRNN methods.
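LAHC, the local-search component named in the abstract, differs from plain hill-climbing in one rule: a candidate is accepted if it beats either the current cost or the cost recorded a fixed number of iterations ago, which lets the search escape shallow local optima. The sketch below shows that acceptance rule on a generic minimization problem; the function signature and parameters (`history_len`, `n_iters`) are illustrative assumptions, and the paper applies LAHC to CRNN configurations inside SSA rather than to a standalone objective like this.

```python
import random

def lahc(start, cost, neighbor, history_len=50, n_iters=2000, seed=0):
    """Late Acceptance Hill-Climbing: accept a neighbor if its cost beats
    the current cost OR the cost held history_len iterations earlier."""
    rng = random.Random(seed)
    cur, cur_cost = start, cost(start)
    hist = [cur_cost] * history_len      # circular "late acceptance" buffer
    best, best_cost = cur, cur_cost
    for i in range(n_iters):
        cand = neighbor(cur, rng)
        c = cost(cand)
        v = i % history_len
        if c <= cur_cost or c <= hist[v]:  # the late-acceptance rule
            cur, cur_cost = cand, c
            if c < best_cost:
                best, best_cost = cand, c
        hist[v] = cur_cost               # record the (possibly updated) cost
    return best, best_cost
```

For example, minimizing `(v - 3)**2` from `start=0.0` with a small random-step `neighbor` converges close to `v = 3`; in the DC-CRNN setting, `cost` would instead be the recognizer's validation error for a candidate structure proposed by SSA.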