2 research outputs found

    Spoken Persian digits recognition using deep learning

    Classification of isolated digits is a fundamental challenge for many speech classification systems. Previous work on spoken digits has been limited to the numbers 0 to 9. In this paper, we propose two deep learning-based models for recognizing spoken numbers in the range 0 to 599. The first is a Convolutional Neural Network (CNN) that takes as input the Mel spectrogram computed from the audio data. The second builds on recent advances in deep sequential models: a Transformer followed by a Long Short-Term Memory (LSTM) network and a classifier. We also collected a dataset of audio recordings contributed by 145 people, covering the numerical range from 0 to 599. Experimental results on the collected dataset indicate a validation accuracy of 98.03%.
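
    As a rough illustration of the Mel spectrogram input mentioned in the abstract, the sketch below computes one from a raw signal using only the Python standard library. The abstract does not state the paper's frame size, hop length, or number of Mel bands, so `n_fft`, `hop`, `n_mels`, the Hann window, the HTK-style Mel formula, and all function names are assumptions made for this example, not the authors' configuration.

```python
import cmath
import math


def hz_to_mel(f_hz):
    # HTK-style Mel scale (an assumed convention; others exist).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the Mel scale."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_mels filters need n_mels + 2 edge frequencies.
    edges_hz = [mel_to_hz(mel_max * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(n_mels):
        lo, mid, hi = edges_hz[m], edges_hz[m + 1], edges_hz[m + 2]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame):
    """Power spectrum of one Hann-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / n))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def mel_spectrogram(signal, sample_rate, n_fft=256, hop=128, n_mels=20):
    """Return a list of frames, each a list of n_mels band energies."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        power = power_spectrum(signal[start:start + n_fft])
        frames.append([sum(w * p for w, p in zip(row, power)) for row in bank])
    return frames


if __name__ == "__main__":
    # Synthetic 1 kHz tone sampled at 8 kHz, standing in for a recording.
    rate = 8000
    tone = [math.sin(2.0 * math.pi * 1000.0 * t / rate) for t in range(1024)]
    spec = mel_spectrogram(tone, rate)
    print(len(spec), "frames x", len(spec[0]), "Mel bands")
```

    In practice the resulting frames-by-bands array would usually be log-scaled and passed to a CNN as a single-channel image. A library such as librosa or torchaudio would replace the naive DFT used here.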

    Amharic spoken digits recognition using convolutional neural network

    The authors would like to acknowledge and thank the participants who contributed to the collection of voice samples. Peer reviewed