293 research outputs found
Accelerating recurrent neural network training using sequence bucketing and multi-GPU data parallelization
An efficient algorithm for recurrent neural network training is presented.
The approach increases the training speed for tasks where the length of the input
sequence may vary significantly. The proposed approach is based on the optimal
batch bucketing by input sequence length and data parallelization on multiple
graphics processing units. The baseline training performance without sequence
bucketing is compared with the proposed solution for different numbers of
buckets. An example is given for the online handwriting recognition task using
an LSTM recurrent neural network. The evaluation is performed in terms of the
wall clock time, number of epochs, and validation loss value.
Comment: 4 pages, 5 figures, Comments, 2016 IEEE First International
Conference on Data Stream Mining & Processing (DSMP), Lviv, 201
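The bucketing idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not say how the "optimal" bucket boundaries are chosen, so this version simply assumes equal-count buckets over length-sorted data. The function name `bucket_batches` and its parameters are hypothetical, and multi-GPU data parallelization is left out.

```python
import random

def bucket_batches(sequences, num_buckets, batch_size, pad_value=0, seed=0):
    """Group sequences into length buckets and return padded batches.

    Assumption: buckets hold roughly equal numbers of sequences; the paper's
    own bucket-selection rule is not given in the abstract.
    Returns a list of (padded_batch, true_lengths) tuples.
    """
    rng = random.Random(seed)
    # Sort indices by length so each bucket holds sequences of similar length.
    order = sorted(range(len(sequences)), key=lambda i: len(sequences[i]))
    bucket_size = -(-len(order) // num_buckets)  # ceiling division
    buckets = [order[i:i + bucket_size] for i in range(0, len(order), bucket_size)]

    batches = []
    for bucket in buckets:
        rng.shuffle(bucket)  # keep some randomness inside each bucket
        for start in range(0, len(bucket), batch_size):
            idx = bucket[start:start + batch_size]
            lengths = [len(sequences[i]) for i in idx]
            max_len = max(lengths)
            # Pad only up to the longest sequence in this batch.
            padded = [list(sequences[i]) + [pad_value] * (max_len - len(sequences[i]))
                      for i in idx]
            batches.append((padded, lengths))
    rng.shuffle(batches)  # avoid feeding batches in order of increasing length
    return batches

# Toy data: 40 sequences with lengths 1..40.
seqs = [[1] * n for n in range(1, 41)]
for padded, lengths in bucket_batches(seqs, num_buckets=4, batch_size=5)[:2]:
    print(sorted(lengths), "-> padded to", len(padded[0]))
```

Because each batch is drawn from a single bucket, its sequences have similar lengths and little computation is spent on padding, which is the source of the speed-up the abstract describes.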
Multiple classifiers fusion and CNN feature extraction for handwritten digits recognition
Handwritten digit recognition has been treated as a multi-class classification problem in the machine learning context, where each of the ten digits (0-9) is viewed as a class and the task is to train a classifier that can effectively discriminate among the ten classes. In practice, the performance of a single classifier trained with a standard learning algorithm commonly varies across data sets, which indicates that the same learning algorithm may train strong classifiers on some data sets but weak classifiers on others. The same classifier may also perform differently on different test sets, especially because image instances can be highly diverse when different people write the same digits in different styles. To address this issue, the development of ensemble learning approaches has been necessary to improve overall performance and make it more stable across data sets. In this paper, we propose a framework that involves CNN-based feature extraction from the MNIST data set and algebraic fusion of multiple classifiers trained on different feature sets, which are prepared through feature selection applied to the original feature set extracted using the CNN. The experimental results show that the classifier fusion can achieve a classification accuracy of ≥ 98%
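As a rough illustration of algebraic fusion, the sketch below combines the class-probability outputs of several classifiers with the mean rule and the max rule. The abstract does not state which algebraic rules the paper uses, so both are assumptions, and the helper names (`fuse_mean`, `fuse_max`, `predict`) and the probability values are hypothetical. Three classes are used instead of ten to keep the example short.

```python
def fuse_mean(prob_lists):
    """Mean rule: average each class probability over all classifiers."""
    n = len(prob_lists)
    num_classes = len(prob_lists[0])
    return [sum(p[c] for p in prob_lists) / n for c in range(num_classes)]

def fuse_max(prob_lists):
    """Max rule: take the highest probability any classifier gives each class."""
    num_classes = len(prob_lists[0])
    return [max(p[c] for p in prob_lists) for c in range(num_classes)]

def predict(fused):
    """Return the index of the class with the highest fused score."""
    return max(range(len(fused)), key=lambda c: fused[c])

# Made-up outputs of three classifiers for one image, over three classes.
outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.4, 0.3],
]
print("mean rule picks class", predict(fuse_mean(outputs)))
print("max rule picks class", predict(fuse_max(outputs)))
```

The two rules can disagree, as they do on this toy input: the mean rule rewards classes that most classifiers support, while the max rule follows the single most confident classifier.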