2 research outputs found

Spoken Persian digits recognition using deep learning

Authors: Kourosh Kiani; Razieh Rastgoo; Sahar Zarbafi
Publication venue: Semnan University
Publication date: 01/10/2023

Classification of isolated digits is a fundamental challenge for many speech classification systems. Previous work on spoken digits has been limited to the numbers 0 to 9. In this paper, we propose two deep learning-based models for spoken digit recognition in the range 0 to 599. The first is a Convolutional Neural Network (CNN) that takes as input the Mel spectrogram obtained from the audio data. The second builds on recent advances in deep sequential models, using a Transformer followed by a Long Short-Term Memory (LSTM) network and a classifier. We also collected a dataset of audio recordings contributed by 145 people, covering the numerical range from 0 to 599. Experimental results on the collected dataset indicate a validation accuracy of 98.03%.

Directory of Open Access Journals

Amharic spoken digits recognition using convolutional neural network

Authors: Abate Solomon Teferra; Adjeisah Michael; Ayall Tewodros Alemu; Brhanemeskel Getnet Mezgebu; Liu Huawen; Zhou Changjun
Publication date: 04/05/2024

The authors would like to acknowledge and thank the participants in the collection of voice samples. Peer reviewed

Aberdeen University Research