    Accelerating recurrent neural network training using sequence bucketing and multi-GPU data parallelization

    An efficient algorithm for recurrent neural network training is presented. The approach increases training speed for tasks where the length of the input sequence may vary significantly. It is based on optimal batch bucketing by input sequence length and data parallelization across multiple graphics processing units. The baseline training performance without sequence bucketing is compared with the proposed solution for different numbers of buckets. An example is given for the online handwriting recognition task using an LSTM recurrent neural network. The evaluation is performed in terms of wall clock time, number of epochs, and validation loss value. Comment: 4 pages, 5 figures, Comments, 2016 IEEE First International Conference on Data Stream Mining & Processing (DSMP), Lviv, 201
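The bucketing idea in this abstract can be illustrated with a minimal sketch. It assumes equal-count buckets over length-sorted sequences, which is a simple stand-in and not necessarily the optimal bucketing the paper proposes; the function names `bucket_by_length`, `pad_batch`, and `padding_cost` are hypothetical.

```python
def bucket_by_length(sequences, num_buckets):
    """Group sequences of similar length (assumption: equal-count buckets)."""
    ordered = sorted(sequences, key=len)
    if not ordered:
        return []
    size = -(-len(ordered) // num_buckets)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def pad_batch(batch, pad_value=0):
    """Pad every sequence in a batch to the length of the longest one."""
    longest = max(len(seq) for seq in batch)
    return [list(seq) + [pad_value] * (longest - len(seq)) for seq in batch]


def padding_cost(buckets):
    """Count the padded positions needed when each bucket is padded separately."""
    return sum(
        max(len(seq) for seq in bucket) * len(bucket) - sum(len(seq) for seq in bucket)
        for bucket in buckets
    )


# Toy data: six sequences whose lengths vary widely.
sequences = [[1] * n for n in (2, 9, 3, 10, 1, 8)]
no_bucketing = padding_cost(bucket_by_length(sequences, 1))
two_buckets = padding_cost(bucket_by_length(sequences, 2))
```

On this toy data, splitting into two buckets cuts the number of padded positions, which is the wasted computation that bucketing aims to reduce. Distributing the resulting batches across several GPUs is outside the scope of this sketch.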

    Multiple classifiers fusion and CNN feature extraction for handwritten digits recognition

    Handwritten digit recognition has been treated as a multi-class classification problem in the machine learning context, where each of the ten digits (0-9) is viewed as a class and the task is to train a classifier that can effectively discriminate among the ten classes. In practice, the performance of a single classifier trained with a standard learning algorithm commonly varies across data sets: the same learning algorithm may train strong classifiers on some data sets but weak classifiers on others. The same classifier may also perform differently on different test sets, especially since image instances can be highly diverse due to the different handwriting styles of different people writing the same digits. To address this issue, ensemble learning approaches are needed to improve overall performance and make it more stable across data sets. In this paper, we propose a framework that involves CNN-based feature extraction from the MNIST data set and algebraic fusion of multiple classifiers trained on different feature sets, which are prepared through feature selection applied to the original feature set extracted using the CNN. The experimental results show that the classifier fusion can achieve a classification accuracy of ≥ 98%.
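As a rough illustration of algebraic fusion, the sketch below combines the class-probability vectors of several classifiers with common algebraic rules (mean, product, max, min). The abstract does not say which rules the authors use, so the choice of rules, the function names `fuse` and `predict`, and the toy probabilities are all assumptions.

```python
import math


def fuse(prob_vectors, rule="mean"):
    """Combine per-classifier probability vectors with an algebraic rule."""
    rules = {
        "mean": lambda column: sum(column) / len(column),
        "product": math.prod,
        "max": max,
        "min": min,
    }
    combine = rules[rule]
    # zip(*...) yields, for each class, the scores given by all classifiers.
    scores = [combine(column) for column in zip(*prob_vectors)]
    total = sum(scores)
    # Renormalize so the fused scores sum to 1 (skipped if all scores are 0).
    return [score / total for score in scores] if total else scores


def predict(prob_vectors, rule="mean"):
    """Return the index of the class with the highest fused score."""
    fused = fuse(prob_vectors, rule)
    return max(range(len(fused)), key=fused.__getitem__)


# Toy outputs of three hypothetical classifiers over three classes.
outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.5, 0.4, 0.1],
]
fused_mean = fuse(outputs, "mean")
label_mean = predict(outputs, "mean")
label_product = predict(outputs, "product")
```

In this toy case two of the three classifiers individually prefer class 0, yet the mean and product rules both select class 1, because the second classifier is much more confident. Fusing scores rather than counting votes is what makes this kind of difference possible.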