10 research outputs found

    Enhancement in Speaker Identification through Feature Fusion using Advanced Dilated Convolution Neural Network

    Accurate speaker identification poses several challenges, and extracting discriminative features is vital to the task. Speaker identification is now widely investigated using deep learning. Complex and noisy speech data degrade the performance of Mel Frequency Cepstral Coefficients (MFCC); hence, MFCC fail to represent the speaker characteristics accurately. In this work, a novel text-independent speaker identification system is developed that enhances performance by fusing Log-MelSpectrum and excitation features. The excitation information arises from the vibration of the vocal folds and is represented using the Linear Prediction (LP) residual. The features extracted from the excitation are residual phase, sharpness, Energy of Excitation (EoE), and Strength of Excitation (SoE). The extracted features are processed with a dilated convolutional neural network (dilated CNN) to perform the identification task. Extensive evaluation shows that fusing the excitation features gives better results than existing methods. The accuracy reaches 94.12% for 11 complex classes and 91.34% for 80 speakers, and the Equal Error Rate (EER) is reduced to 1.16% for the proposed model. The proposed model is tested on the LibriSpeech corpus using MATLAB 2021b and outperforms the existing baseline models, achieving an accuracy improvement of 1.34% over the baseline system.
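    The LP residual mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the textbook autocorrelation method with the Levinson-Durbin recursion; the abstract does not say how the authors compute the residual, and the function names `autocorrelation`, `lp_coefficients`, and `lp_residual` are hypothetical, chosen only for this illustration.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation r[k] = sum_n x[n] * x[n + k] for k = 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def lp_coefficients(x, order):
    """Estimate predictor coefficients a_1..a_p with Levinson-Durbin.

    Convention (an assumption of this sketch): x_hat[n] = sum_k a_k * x[n - k].
    Returns (coefficients, final prediction error). Requires a non-zero signal.
    """
    r = autocorrelation(x, order)
    a = [0.0] * (order + 1)          # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def lp_residual(x, order):
    """LP residual e[n] = x[n] - sum_k a_k * x[n - k] (samples before n = 0 taken as 0)."""
    a, _ = lp_coefficients(x, order)
    residual = []
    for n in range(len(x)):
        pred = sum(a[k - 1] * x[n - k] for k in range(1, order + 1) if n - k >= 0)
        residual.append(x[n] - pred)
    return residual
```

    For a signal that follows a first-order recursion exactly, such as `x = [0.9 ** n for n in range(200)]`, `lp_residual(x, 1)` is close to zero everywhere except the first sample. In speech, what remains after the prediction is taken to approximate the excitation, which is why the features listed in the abstract are derived from it.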

    Novel Data-Driven Approach Based on Capsule Network for Intelligent Multi-Fault Detection in Electric Motors

    Speaker Recognition Using Convolutional Neural Network and Neutrosophic

    Speaker recognition is the process of recognizing persons by their voice and is widely used in many applications. Although much research has been done in this domain, some challenges have not yet been addressed. In this research, Neutrosophic (NS) theory and convolutional neural networks (CNN) are used to improve the accuracy of speaker recognition systems. First, the spectrogram is created from the speech signal and transferred to the NS domain. Next, the alpha correction operator is applied repeatedly until the entropy remains constant across subsequent iterations. Finally, a convolutional neural network architecture is proposed to classify spectrograms in the NS domain. Two datasets, TIMIT and Aurora2, are used to evaluate the effectiveness of the proposed method. The precision of the proposed method on TIMIT and Aurora2 is 93.79% and 95.24%, respectively, demonstrating that the proposed model outperforms competitive models.

    Noise-Boosted Convolutional Neural Network for Edge-based Motor Fault Diagnosis with Limited Samples

    Convolutional neural networks (CNNs) have been widely applied in motor fault diagnosis. However, obtaining high recognition accuracy typically requires massive training data that are transmitted to a cloud or local server for training, which may raise security and privacy problems. In this study, a noise-boosted CNN (NBCNN) model is developed to achieve accelerated training and improved recognition accuracy with limited training samples. First, the NBCNN model with a noise-injection fully connected layer is established. Then, a strategy for noise selection and injection is proposed to obtain an optimal match among the data, model, and noise. Finally, the optimal injected noise accelerates the convergence of model training and improves the accuracy of motor fault diagnosis. The effectiveness and superiority of the proposed NBCNN model over the conventional CNN without noise injection and the state-of-the-art models are validated on two benchmark datasets. In addition, the algorithm is deployed on an edge device, and the results show that training the developed NBCNN can be up to nine times faster than training the conventional CNN. The proposed method shows remarkable potential for distributed model training, federated learning, and real-time motor fault diagnosis.
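    As a rough illustration of what a noise-injection fully connected layer might look like, the sketch below adds zero-mean Gaussian noise to the outputs of a dense layer during training only. This is an assumption made for illustration: the abstract does not state where the noise enters, which distribution is used, or how the noise level is selected, and the names `dense`, `noisy_dense`, and `sigma` are hypothetical.

```python
import random


def dense(x, weights, bias):
    """Plain fully connected layer: y_j = sum_i x_i * weights[i][j] + bias[j]."""
    return [
        sum(xi * weights[i][j] for i, xi in enumerate(x)) + bias[j]
        for j in range(len(bias))
    ]


def noisy_dense(x, weights, bias, sigma, rng, training=True):
    """Dense layer with additive Gaussian noise on its outputs.

    Assumption of this sketch: noise is N(0, sigma^2), added per output unit,
    and switched off at inference time (training=False).
    """
    y = dense(x, weights, bias)
    if not training or sigma == 0.0:
        return y
    return [yj + rng.gauss(0.0, sigma) for yj in y]


# Small fixed example used to show the behaviour.
x = [1.0, 2.0]
weights = [[1.0, 0.0], [0.5, -1.0]]
bias = [0.0, 1.0]
clean = dense(x, weights, bias)                                   # [2.0, -1.0]
noisy = noisy_dense(x, weights, bias, sigma=0.1, rng=random.Random(0))
```

    Passing an explicit `random.Random` instance keeps the injected noise reproducible, which would make it easier to compare different values of `sigma` under otherwise identical conditions.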

    Applications in security and evasions in machine learning: a survey

    In recent years, machine learning (ML) has become an important means of providing security and privacy in various applications. ML is used to address serious issues such as real-time attack detection, data leakage vulnerability assessment, and many more. ML supports the demanding security and privacy requirements of the current landscape across a range of areas such as real-time decision-making, big data processing, reduced cycle time for learning, cost-efficiency, and error-free processing. Therefore, in this paper, we review state-of-the-art approaches in which ML can be applied effectively to meet current real-world security requirements. We examine different security applications in which ML models play an essential role and compare their accuracy results along several dimensions. Analyzing ML algorithms in security applications provides a blueprint for an interdisciplinary research area. Even with current sophisticated technology and tools, attackers can evade ML models by mounting adversarial attacks. It is therefore necessary to assess the vulnerability of ML models at development time so that they can cope with adversarial attacks. Accordingly, we also analyze the different types of adversarial attacks on ML models. To visualize the security properties, we present the threat model and defense strategies against adversarial attack methods. Moreover, we illustrate adversarial attacks according to the attacker's knowledge of the model and identify the points in the model at which attacks may be mounted. Finally, we investigate different properties of the adversarial attacks.

    GMM and CNN Hybrid Method for Short Utterance Speaker Recognition
