15 research outputs found

    Noise reduction optimization of sound sensor based on a Conditional Generation Adversarial Network

    To address the problems of traditional speech-signal noise-elimination methods, such as residual noise, poor real-time performance, and narrow applicability, a new method is proposed to eliminate network voice noise based on deep learning with a conditional generative adversarial network. By using the perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI) measures as the loss function in the neural network, the flexibility of the whole network was optimized and the training process of the model was simplified. The experimental results indicate that, in noisy environments, especially a restaurant, the proposed noise reduction scheme improves the STOI score by 26.23% and the PESQ score by 17.18%, respectively, compared with the traditional Wiener noise reduction algorithm. Therefore, the sound sensor's noise reduction scheme based on our approach achieves a remarkable noise reduction effect, transmits more useful information, and has stronger practicability.

    Employing Emotion Cues to Verify Speakers in Emotional Talking Environments

    Usually, people talk neutrally in environments free of abnormal talking conditions such as stress and emotion. Other emotional conditions, such as happiness, anger, and sadness, can also affect a person's talking tone; such emotions are directly influenced by the patient's health status. In neutral talking environments, speakers can be verified easily; in emotional talking environments, they cannot be verified as easily as in neutral ones. Consequently, speaker verification systems do not perform as well in emotional talking environments as they do in neutral ones. In this work, a two-stage approach has been employed and evaluated to improve speaker verification performance in emotional talking environments. This approach employs speaker emotion cues (a text-independent, emotion-dependent speaker verification problem) based on both Hidden Markov Models (HMMs) and Suprasegmental Hidden Markov Models (SPHMMs) as classifiers. The approach comprises two cascaded stages that combine and integrate an emotion recognizer and a speaker recognizer into one recognizer. The architecture has been tested on two different and separate emotional speech databases: our collected database and the Emotional Prosody Speech and Transcripts database. The results of this work show that the proposed approach gives promising results, with a significant improvement over previous studies and over other approaches, such as the emotion-independent and emotion-dependent speaker verification approaches based completely on HMMs.
    Comment: Journal of Intelligent Systems, Special Issue on Intelligent Healthcare Systems, De Gruyter, 201