15 research outputs found

    Noise reduction optimization of sound sensor based on a Conditional Generation Adversarial Network

    To address the problems of traditional speech-signal noise-elimination methods, such as residual noise, poor real-time performance, and narrow applicability, a new method is proposed to eliminate network voice noise based on deep learning with a conditional generative adversarial network. By using the perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI) measures as the loss function in the neural network, the flexibility of the whole network was optimized and the training process of the model was simplified. The experimental results indicate that, in noisy environments, especially a restaurant, the proposed noise reduction scheme improves the STOI score by 26.23% and the PESQ score by 17.18%, respectively, compared with the traditional Wiener noise reduction algorithm. Therefore, the sound sensor's noise reduction scheme based on our approach achieves a remarkable noise reduction effect, transmits more useful information, and has stronger practicability.

    Employing Emotion Cues to Verify Speakers in Emotional Talking Environments

    Usually, people talk neutrally in environments free of abnormal talking conditions such as stress and emotion. Other emotional conditions, such as happiness, anger, and sadness, can also affect a person's talking tone; such emotions are directly influenced by the patient's health status. In neutral talking environments, speakers can be verified easily; in emotional talking environments, they cannot be verified as easily as in neutral ones. Consequently, speaker verification systems do not perform as well in emotional talking environments as they do in neutral ones. In this work, a two-stage approach has been employed and evaluated to improve speaker verification performance in emotional talking environments. This approach employs speaker emotion cues (a text-independent, emotion-dependent speaker verification problem) based on both Hidden Markov Models (HMMs) and Suprasegmental Hidden Markov Models (SPHMMs) as classifiers. The approach comprises two cascaded stages that combine and integrate an emotion recognizer and a speaker recognizer into one recognizer. The architecture has been tested on two different and separate emotional speech databases: our collected database and the Emotional Prosody Speech and Transcripts database. The results of this work show that the proposed approach gives promising results, with a significant improvement over previous studies and over other approaches, such as the emotion-independent and emotion-dependent speaker verification approaches based completely on HMMs.
    Comment: Journal of Intelligent Systems, Special Issue on Intelligent Healthcare Systems, De Gruyter, 201