Search CORE

5 research outputs found

Nonintrusive Quality Assessment of Noise Suppressed Speech With Mel-Filtered Energies and Support Vector Regression

Author: Chia Liang-Tien
Emmanuel Sabu
Lin Weisi
McLoughlin Ian Vince
Narwaria Manish
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011
Field of study

Objective speech quality assessment is a challenging task which aims to emulate human judgment in the complex and time consuming task of subjective assessment. It is difficult to perform in line with the human perception due the complex and nonlinear nature of the human auditory system. The challenge lies in representing speech signals using appropriate features and subsequently mapping these features into a quality score. This paper proposes a nonintrusive metric for the quality assessment of noise-suppressed speech. The originality of the proposed approach lies primarily in the use of Mel filter bank energies (FBEs) as features and the use of support vector regression (SVR) for feature mapping. We utilize the sensitivity of FBEs to noise in order to obtain an effective representation of speech towards quality assessment. In addition, the use of SVR exploits the advantages of kernels which allow the regression algorithm to learn complex data patterns via nonlinear transformation for an effective and generalized mapping of features into the quality score. Extensive experiments conducted using two third party databases with different noise-suppressed speech signals show the effectiveness of the proposed approach

Crossref

Kent Academic Repository

DR-NTU (Digital Repository of NTU)

Recommended from our members

Mulsemedia: State of the art, perspectives, and challenges

Author: Boyd-Davis S.
Cater J. P.
Christian Timmerer
Dinh H. Q.
Gheorghita Ghinea
Heilig M. L.
Ho C.
Klatzky R. L.
Lin W.
Nothdurft H.-C.
Pereira F.
Pyo S.
Rainer B.
Revonsuo A.
Stephen R. Gulliver
T.
Waltl M.
Weisi Lin
Yarbus A. L.
Yazdani A.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/10/2014
Field of study

Mulsemedia-multiple sensorial media-captures a wide variety of research efforts and applications. This article presents a historic perspective on mulsemedia work and reviews current developments in the area. These take place across the traditional multimedia spectrum-from virtual reality applications to computer games-as well as efforts in the arts, gastronomy, and therapy, to mention a few. We also describe standardization efforts, via the MPEG-V standard, and identify future developments and exciting challenges the community needs to overcome

Central Archive at the University of Reading

Crossref

Brunel University Research Archive

"Can you hear me now?":Automatic assessment of background noise intrusiveness and speech intelligibility in telecommunications

Author: Ullmann Raphaël Marc
Publication venue: Lausanne, EPFL
Publication date: 02/06/2016
Field of study

This thesis deals with signal-based methods that predict how listeners perceive speech quality in telecommunications. Such tools, called objective quality measures, are of great interest in the telecommunications industry to evaluate how new or deployed systems affect the end-user quality of experience. Two widely used measures, ITU-T Recommendations P.862 âPESQâ and P.863 âPOLQAâ, predict the overall listening quality of a speech signal as it would be rated by an average listener, but do not provide further insight into the composition of that score. This is in contrast to modern telecommunication systems, in which components such as noise reduction or speech coding process speech and non-speech signal parts differently. Therefore, there has been a growing interest for objective measures that assess different quality features of speech signals, allowing for a more nuanced analysis of how these components affect quality. In this context, the present thesis addresses the objective assessment of two quality features: background noise intrusiveness and speech intelligibility. The perception of background noise is investigated with newly collected datasets, including signals that go beyond the traditional telephone bandwidth, as well as Lombard (effortful) speech. We analyze listener scores for noise intrusiveness, and their relation to scores for perceived speech distortion and overall quality. We then propose a novel objective measure of noise intrusiveness that uses a sparse representation of noise as a model of high-level auditory coding. The proposed approach is shown to yield results that highly correlate with listener scores, without requiring training data. With respect to speech intelligibility, we focus on the case where the signal is degraded by strong background noises or very low bit-rate coding. Considering that listeners use prior linguistic knowledge in assessing intelligibility, we propose an objective measure that works at the phoneme level and performs a comparison of phoneme class-conditional probability estimations. The proposed approach is evaluated on a large corpus of recordings from public safety communication systems that use low bit-rate coding, and further extended to the assessment of synthetic speech, showing its applicability to a large range of distortion types. The effectiveness of both measures is evaluated with standardized performance metrics, using corpora that follow established recommendations for subjective listening tests

Infoscience - École polytechnique fédérale de Lausanne