Additive Margin SincNet for Speaker Recognition
Speaker recognition is a challenging task with essential applications such as
authentication, automation, and security. SincNet is a recent deep-learning
model that has produced promising results on this task. When training deep
learning systems, the choice of loss function is critical to network
performance. The Softmax loss is widely used in deep learning methods, but it
is not the best choice for all kinds of problems. For distance-based problems,
a newer Softmax-based loss function called Additive Margin Softmax
(AM-Softmax) is proving to be a better choice than the traditional Softmax.
AM-Softmax introduces a margin of separation between classes that forces
samples from the same class closer to each other while maximizing the distance
between classes. In this paper, we propose a new approach for speaker
recognition systems called AM-SincNet, which is based on SincNet but uses an
improved AM-Softmax layer. The proposed method is evaluated on the TIMIT
dataset and obtains an improvement of approximately 40% in Frame Error Rate
compared to SincNet.
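The additive-margin idea described in the abstract can be sketched in a few lines: the margin m is subtracted from the target class's cosine similarity before the scaled softmax cross-entropy, so a sample must be at least m "closer" to its own class to achieve the same loss. This is a minimal NumPy illustration, not the paper's implementation; the function name `am_softmax_loss` and the defaults s=30.0 and m=0.35 are assumptions for the sketch.

```python
import numpy as np

def am_softmax_loss(embeddings, weights, labels, s=30.0, m=0.35):
    """AM-Softmax sketch: subtract margin m from the target-class cosine
    similarity, scale by s, then apply softmax cross-entropy.
    Note: s and m are assumed illustrative defaults, not the paper's values."""
    # L2-normalize embeddings (rows) and class weight vectors (columns),
    # so their dot products are cosine similarities.
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=0, keepdims=True)
    cos = e @ w                                   # (batch, classes)
    margin_cos = cos.copy()
    # Enforce the margin only on each sample's own class.
    margin_cos[np.arange(len(labels)), labels] -= m
    logits = s * margin_cos
    # Numerically stable log-softmax cross-entropy.
    logits -= logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()
```

Because the margin lowers only the target-class logit, the loss with m > 0 is strictly larger than plain (scaled) Softmax for the same embeddings, which is exactly the pressure that pushes same-class samples together and different classes apart.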
Speaker Recognition Using Convolutional Neural Network and Neutrosophic
Speaker recognition, the process of identifying a person from their voice, is widely used in many applications. Although much research has been performed in this domain, some challenges have not yet been addressed. In this research, Neutrosophic (NS) theory and convolutional neural networks (CNNs) are used to improve the accuracy of speaker recognition systems. To do this, a spectrogram is first created from the speech signal and then transferred to the NS domain. Next, the alpha correction operator is applied repeatedly until the entropy remains constant across iterations. Finally, a convolutional neural network architecture is proposed to classify spectrograms in the NS domain. Two datasets, TIMIT and Aurora2, are used to evaluate the effectiveness of the proposed method. The precision of the proposed method is 93.79% on TIMIT and 95.24% on Aurora2, demonstrating that the proposed model outperforms competitive models.
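The spectrogram-to-NS-domain step described above can be sketched as follows. This is a simplified sketch of one common neutrosophic-image formulation (truth T from normalized intensity, indeterminacy I from deviation from a local mean, and an alpha-mean correction that smooths T where I is high, iterated until the entropy of T stabilizes); the paper's exact T/I/F definitions, alpha operator, and thresholds may differ, and all function names, the 3×3 window, and alpha=0.85 are illustrative assumptions.

```python
import numpy as np

def local_mean(img, k=3):
    """k x k box filter built from padded shifted sums (plain NumPy)."""
    p = k // 2
    padded = np.pad(img, p, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def entropy(t, bins=64):
    """Shannon entropy (bits) of the histogram of T over [0, 1]."""
    hist, _ = np.histogram(t, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

def to_ns_domain(spec, alpha=0.85, tol=1e-3, max_iter=50):
    """Map a spectrogram to a (T, I) neutrosophic pair and apply an
    alpha-mean correction until the entropy of T stops changing.
    Illustrative sketch only; parameter values are assumptions."""
    g = (spec - spec.min()) / (spec.max() - spec.min() + 1e-12)
    t = g.copy()                                  # truth channel
    i = np.abs(t - local_mean(t))                 # indeterminacy channel
    i = i / (i.max() + 1e-12)
    prev_h = entropy(t)
    for _ in range(max_iter):
        # Alpha-mean: replace T by its local mean where indeterminacy is high.
        smoothed = local_mean(t)
        t = np.where(i >= alpha, smoothed, t)
        i = np.abs(t - local_mean(t))
        i = i / (i.max() + 1e-12)
        h = entropy(t)
        if abs(h - prev_h) < tol:                 # entropy has stabilized
            break
        prev_h = h
    return t, i
```

The resulting (T, I) maps, rather than the raw spectrogram, would then be fed to the CNN classifier; the iteration-until-constant-entropy loop mirrors the repeated alpha correction the abstract describes.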