16 research outputs found

    New single-ended objective measure for non-intrusive speech quality evaluation

    This article proposes a new output-based method for non-intrusive assessment of the speech quality of voice communication systems and evaluates its performance. The method requires access to the processed (degraded) speech only. It measures perception-motivated objective auditory distances between the voiced parts of the output speech and appropriately matching references extracted from a pre-formulated codebook. The codebook is formed by optimally clustering a large number of parametric speech vectors extracted from a database of clean speech records. The auditory distances are then mapped into objective listening-quality Mean Opinion Scores (MOS). An efficient data-mining tool known as the self-organizing map (SOM) performs the required clustering and the mapping/reference-matching processes. To obtain a perception-based, speaker-independent parametric representation of the speech, three domain transformation techniques have been investigated. The first is based on a perceptual linear prediction (PLP) model, the second uses Bark spectrum (BS) analysis and the third uses mel-frequency cepstrum coefficients (MFCC). Reported evaluation results show that the proposed method correlates highly with subjective listening quality scores, yielding accuracy similar to that of ITU-T P.563 while maintaining relatively low computational complexity. Results also demonstrate that the method outperforms PESQ in a number of distortion conditions, such as speech degraded by channel impairments.
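The reference-matching step described in this abstract can be illustrated with a minimal sketch. It rests on loud assumptions: plain Euclidean distance stands in for the perception-motivated auditory distance, a tiny hand-written codebook stands in for the SOM-trained one, and the function names are hypothetical. The final mapping from distance to MOS is not reproduced because the abstract does not specify it.

```python
import math

def nearest_reference_distance(frame, codebook):
    """Distance from one feature vector to its closest codebook entry.

    Assumption: Euclidean distance is only a stand-in for the
    perceptual auditory distance used in the article.
    """
    return min(math.dist(frame, ref) for ref in codebook)

def mean_codebook_distance(frames, codebook):
    """Average nearest-reference distance over all (voiced) frames."""
    distances = [nearest_reference_distance(f, codebook) for f in frames]
    return sum(distances) / len(distances)

# Hypothetical 2-D feature vectors; real PLP/BS/MFCC vectors are longer.
codebook = [(0.0, 0.0), (3.0, 4.0)]
degraded_frames = [(0.0, 0.0), (3.0, 0.0)]
score = mean_codebook_distance(degraded_frames, codebook)
```

A larger average distance means the degraded frames sit further from any clean-speech reference, which is the quantity the article then maps to a quality score.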

    Call Quality and Its Parameter Measurement in Telecommunication Networks


    Quality of media traffic over Lossy internet protocol networks: Measurement and improvement.

    Voice over Internet Protocol (VoIP) is an active area of research in the world of communication. The high revenue made by telecommunication companies is a motivation to develop solutions that transmit voice over media other than the traditional circuit-switched network. However, while IP networks can carry data traffic very well due to their best-effort nature, they are not designed to carry real-time applications such as voice. As such, the speech signal can suffer several degradations before it reaches its destination. Therefore, it is important for legal, commercial, and technical reasons to measure the quality of VoIP applications accurately and non-intrusively. Several methods have been proposed to measure speech quality: some are subjective, some are intrusive and others are non-intrusive. One of the non-intrusive methods is the E-model, standardised by the International Telecommunication Union-Telecommunication Standardisation Sector (ITU-T). Although the E-model is non-intrusive, calibrating its parameters depends on subjective tests that are time-consuming, expensive and hard to conduct; consequently, it is applicable to only a limited number of conditions and speech coders. It is also less accurate than intrusive methods such as Perceptual Evaluation of Speech Quality (PESQ) because it does not consider the contents of the received signal. In this thesis an approach to extend the E-model based on PESQ is proposed. Using this method the E-model can be extended to new network conditions and applied to new speech coders without the need for subjective tests. The modified E-model calibrated using PESQ is compared with the E-model calibrated using subjective tests to prove its effectiveness.
    During the above extension, the relation between quality estimation using the E-model and PESQ is investigated and a correction formula is proposed to correct the deviation in speech quality estimation. Another extension to the E-model, intended to improve its accuracy in comparison with PESQ, looks into the content of the degraded signal and classifies each lost packet as either voiced or unvoiced based on the received surrounding packets. The accuracy of the proposed method is evaluated by comparing the estimates of the new method, which takes packet class into consideration, with the measurements provided by PESQ as a more accurate, intrusive method for measuring speech quality. The two extensions are combined to offer a method for estimating the quality of VoIP applications accurately and non-intrusively, without the need for time-consuming, expensive and hard-to-conduct subjective tests. Finally, the applicability of the E-model or the modified E-model in measuring the quality of services in Service Oriented Computing (SOC) is illustrated.
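For readers unfamiliar with the E-model mentioned in this abstract, the sketch below shows its standard form as given in ITU-T G.107: impairments are subtracted from a basic signal-to-noise term to give a rating factor R, which is then converted to an estimated MOS. It illustrates the published recommendation only, not the thesis's PESQ-based calibration or its correction formula. The function names and the numeric inputs are hypothetical.

```python
def rating_factor(ro, is_, id_, ie_eff, a=0.0):
    """E-model rating factor: R = Ro - Is - Id - Ie,eff + A (ITU-T G.107).

    ro      basic signal-to-noise ratio term
    is_     simultaneous impairment factor
    id_     delay impairment factor
    ie_eff  effective equipment impairment (codec and packet loss)
    a       advantage factor
    """
    return ro - is_ - id_ - ie_eff + a

def r_to_mos(r):
    """Convert a rating factor R to an estimated MOS (ITU-T G.107)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6

# Hypothetical impairment values chosen only to exercise the formulas.
r = rating_factor(ro=94.0, is_=1.0, id_=2.0, ie_eff=11.0)
mos = r_to_mos(r)
```

Because every term except the received signal's content enters through tabulated impairment factors, the model needs no reference signal, which is why the abstract describes it as non-intrusive but dependent on calibration.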

    Reverberation: models, estimation and application

    The use of reverberation models is required in many applications such as acoustic measurements, speech dereverberation and robust automatic speech recognition. The aim of this thesis is to investigate different models and propose a perceptually relevant reverberation model with suitable parameter estimation techniques for different applications. Reverberation can be modelled in both the time and frequency domains. The model parameters give direct information about both physical and perceptual characteristics. These characteristics create a multidimensional parameter space of reverberation, which can to a large extent be captured by a time-frequency domain model. In this thesis, the relationship between physical and perceptual model parameters is discussed. In the first application, an intrusive technique is proposed to measure reverberation, reverberance (the perception of reverberation) and colouration. The room decay rate parameter is of particular interest. In practical applications, a blind estimate of the decay rate of acoustic energy in a room is required. A statistical model for the distribution of the decay rate of the reverberant signal, named the eagleMax distribution, is proposed. The eagleMax distribution describes the reverberant speech decay rate as a random variable that is the maximum of the room decay rate and the anechoic speech decay rate. Three methods were developed to estimate the mean room decay rate from the eagleMax distribution alone. The estimated room decay rates form a reverberation model that is discussed in the context of room acoustic measurements, speech dereverberation and robust automatic speech recognition individually.
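The definition of the eagleMax distribution given in this abstract can be written compactly. The symbols below are chosen here for illustration and are not necessarily the thesis's notation. The second line additionally assumes that the two decay rates are statistically independent, which the abstract does not state.

```latex
% Reverberant decay rate as the maximum of room and anechoic-speech decay rates
\lambda_{\mathrm{rev}} = \max\bigl(\lambda_{\mathrm{room}},\, \lambda_{\mathrm{speech}}\bigr)

% Under the added independence assumption, the CDF of the maximum factorises
F_{\mathrm{rev}}(x) = \Pr(\lambda_{\mathrm{room}} \le x)\,\Pr(\lambda_{\mathrm{speech}} \le x)
                    = F_{\mathrm{room}}(x)\, F_{\mathrm{speech}}(x)
```

Under that assumption, estimating the mean room decay rate blindly amounts to separating the room factor from the speech factor using only observations of the reverberant decay rate.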

    Performance metrics, configuration strategies and traffic identification for group network application.

    Fu, Zhengjia. Thesis (M.Phil.)--Chinese University of Hong Kong, 2008. Includes bibliographical references (p. 64-70). Abstracts in English and Chinese.
    Abstract
    Acknowledgement
    Chapter 1 --- Introduction
    Chapter 2 --- Design for group network communication
    Chapter 2.1 --- Performance metrics of network Voice Conference: GMOS
    Chapter 2.2 --- Conference Leader Selection strategies
    Chapter 2.3 --- Experiment Description
    Chapter 2.4 --- Data analysis and results
    Chapter 2.5 --- Applications of Proposals to Voice Conference
    Chapter 3 --- P2P Application Identification
    Chapter 3.1 --- Periodic Group Communication Patterns
    Chapter 3.1.1 --- Terminology for Behavioral Patterns
    Chapter 3.1.2 --- Pattern 1: Gossip of Buffer Maps
    Chapter 3.1.3 --- Pattern 2: Content flow control
    Chapter 3.1.4 --- Pattern 3: Synchronized Link Activation and Deactivation
    Chapter 3.2 --- Identification Based on behavioral signatures
    Chapter 3.2.1 --- Algorithm Overview
    Chapter 3.2.2 --- Sequence Generation (SG1): Time Series for the Gossip Pattern
    Chapter 3.2.3 --- Transform Time-domain Sequence to Frequency-domain Sequence
    Chapter 3.2.4 --- Sequence Generation (SG2): Time Series for Content Flow Control Pattern
    Chapter 3.2.5 --- Sequence Generation (SG3): Time Series for Synchronized Start and Finish of Flows
    Chapter 3.2.6 --- Analyzer step
    Chapter 3.3 --- Behavioral signatures of popular P2P applications
    Chapter 3.4 --- Experiment Results
    Chapter 3.5 --- Discussion
    Chapter 4 --- Related Work
    Chapter 5 --- Conclusion
    Bibliography