New single-ended objective measure for non-intrusive speech quality evaluation
This article proposes a new output-based method for non-intrusive assessment of the speech quality of voice communication systems and evaluates its performance. The method requires access to the processed (degraded) speech only, and is based on measuring perception-motivated objective auditory distances between the voiced parts of the output speech and appropriately matching references extracted from a pre-formulated codebook. The codebook is formed by optimally clustering a large number of parametric speech vectors extracted from a database of clean speech records. The auditory distances are then mapped into objective Mean Opinion listening quality scores. An efficient data-mining tool known as the self-organizing map (SOM) achieves the required clustering and mapping/reference matching processes. In order to obtain a perception-based, speaker-independent parametric representation of the speech, three domain transformation techniques have been investigated. The first technique is based on a perceptual linear prediction (PLP) model, the second utilises a Bark spectrum (BS) analysis and the third utilises mel-frequency cepstrum coefficients (MFCC). Reported evaluation results show that the proposed method provides high correlation with subjective listening quality scores, yielding accuracy similar to that of ITU-T P.563 while maintaining a relatively low computational complexity. Results also demonstrate that the method outperforms PESQ in a number of distortion conditions, such as those of speech degraded by channel impairments.
Operating System Based Perceptual Evaluation of Call Quality in Radio Telecommunications Networks. Development of call quality assessment at mobile terminals using the Symbian operating system, comparison with traditional approaches and proposals for a tariff regime relating call charging to perceived speech quality.
Call quality has been crucial from the inception of telecommunication networks.
Operators need to monitor call quality from the end-user's perspective, in order to retain
subscribers and reduce subscriber "churn". Operators worry not only about call quality and
interconnect revenue loss, but also about network connectivity issues in areas where mobile
network gateways are prevalent. Bandwidth quality as experienced by the end-user is equally
important in helping operators to reduce churn.
The parameters that network operators use to improve call quality are mainly from the
end-user's perspective. These parameters are usually ASR (answer seizure ratio), PDD (post-dial
delay), NER (network efficiency ratio), the number of calls for which these parameters
have been analyzed, and the number of successful calls. Operators use these parameters to evaluate and
optimize the network to meet their quality requirements.
Analysis of speech quality is a major arena for research. Traditionally, users' perception
of speech quality has been measured offline using subjective listening tests. Such tests are,
however, slow, tedious and costly. An alternative method is therefore needed; one that can be
automatically computed on the subscriber's handset, be available to the operator as well as to
subscribers and, at the same time, provide results that are comparable with conventional
subjective scores. QMeter®, a set of tools for signal and bandwidth measurement that have
been developed bearing in mind all the parameters that influence call and bandwidth quality
experienced by the end-user, addresses these issues and, additionally, facilitates dynamic tariff
propositions which enhance the credibility of the operator.
This research focuses on call quality parameters from the end-user's perspective. The
call parameters used in the research are signal strength, successful call rate, normal drop call
rate, and hand-over drop rate. Signal strength is measured every five milliseconds during an
active call, and the average signal strength is calculated for each successful call. The successful call
rate, normal drop rate and hand-over drop rate are used to obtain a measure of the overall
call quality. Call quality with respect to bundles of 10 calls is proposed.
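The per-bundle bookkeeping described above can be illustrated with a minimal sketch. It only computes the three rates for one bundle of 10 calls and the average signal strength of a call; how the thesis combines these into a single quality figure is not stated in the abstract, so no combined score is attempted. The outcome labels and function names are hypothetical, chosen for this example.

```python
from collections import Counter
from statistics import mean

BUNDLE_SIZE = 10  # the abstract proposes bundles of 10 calls

# Hypothetical outcome labels, introduced only for this sketch.
OUTCOMES = ("success", "normal_drop", "handover_drop")


def bundle_rates(outcomes):
    """Return the fraction of calls with each outcome in one bundle."""
    if len(outcomes) != BUNDLE_SIZE:
        raise ValueError(f"expected {BUNDLE_SIZE} calls, got {len(outcomes)}")
    unknown = set(outcomes) - set(OUTCOMES)
    if unknown:
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")
    counts = Counter(outcomes)
    return {label: counts[label] / BUNDLE_SIZE for label in OUTCOMES}


def average_signal_strength(samples):
    """Arithmetic mean of the signal-strength samples taken during one call.

    The samples are assumed to be in whatever unit the handset reports.
    """
    return mean(samples)
```

For example, `bundle_rates(["success"] * 8 + ["normal_drop", "handover_drop"])` gives a success rate of 0.8 and a rate of 0.1 for each drop type.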
An attempt is made to visualize these parameters for better understanding of where the
quality is bad, good and excellent. This will help operators, as well as user groups, to measure
quality and coverage.
Operators boast about their bandwidth but in reality, to know the locations where speed
has to be improved, they need a tool that can effectively measure speed from the end-user's
perspective. BM (bandwidth meter), a tool developed as a part of this research, measures the
average speed of data sessions and stores the information for analysis at different locations.
To address issues of quality in the subscriber segment, this research proposes the
varying of tariffs based on call and bandwidth quality. Call charging based on call quality as
perceived by the end-user is proposed, both to satisfy subscribers and help operators to improve
customer satisfaction and increase average revenue per user. Tariff redemption procedures are
put forward for bundles of 10 calls and 10 data sessions. In addition to the varying of tariffs,
quality escalation processes are proposed. Deploying such tools on selected or random samples
of users will result in substantial improvement in user loyalty which, in turn, will bring
operational and economic advantages.
Quality of media traffic over Lossy internet protocol networks: Measurement and improvement.
Voice over Internet Protocol (VoIP) is an active area of research in the world of
communication. The high revenue made by the telecommunication companies is a
motivation to develop solutions that transmit voice over other media rather than
the traditional circuit-switched network.
However, while IP networks can carry data traffic very well due to their best-effort
nature, they are not designed to carry real-time applications such as voice.
As such, several degradations can happen to the speech signal before it reaches its
destination. Therefore, it is important for legal, commercial, and technical reasons
to measure the quality of VoIP applications accurately and non-intrusively.
Several methods have been proposed to measure speech quality: some of these
methods are subjective, others are intrusive, and others are non-intrusive.
One of the non-intrusive methods for measuring speech quality is the E-model,
standardised by the International Telecommunication Union Telecommunication Standardisation
Sector (ITU-T).
Although the E-model is a non-intrusive method for measuring speech quality,
it depends on time-consuming, expensive and hard-to-conduct subjective
tests to calibrate its parameters. Consequently, it is applicable to a limited number
of conditions and speech coders. It is also less accurate than intrusive methods
such as Perceptual Evaluation of Speech Quality (PESQ) because it does not consider
the contents of the received signal.
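For context, the E-model summarises transmission impairments in a rating factor R, which is then converted to an estimated mean opinion score. The abstract does not spell this step out; the sketch below shows the R-to-MOS conversion as given in ITU-T G.107, with a function name chosen here for illustration. It is not the modified E-model proposed in the thesis.

```python
def r_to_mos(r):
    """Convert an E-model rating factor R to an estimated MOS (ITU-T G.107 mapping)."""
    if r <= 0:
        return 1.0   # floor of the MOS scale used by the mapping
    if r >= 100:
        return 4.5   # ceiling of the MOS scale used by the mapping
    # Cubic mapping between the two limits.
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6


if __name__ == "__main__":
    for r in (50, 70, 80, 93.2):
        print(f"R = {r:5.1f} -> MOS = {r_to_mos(r):.2f}")
```

Because the mapping depends only on R, any calibration of the impairment terms that feed R (by subjective tests or, as proposed in the thesis, by PESQ) changes the estimated MOS without changing this final step.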
In this thesis an approach to extend the E-model based on PESQ is proposed.
Using this method the E-model can be extended to new network conditions and
applied to new speech coders without the need for the subjective tests. The modified
E-model calibrated using PESQ is compared with the E-model calibrated using
subjective tests to prove its effectiveness.
During the above extension the relation between quality estimation using the
E-model and PESQ is investigated and a correction formula is proposed to correct
the deviation in speech quality estimation.
Another extension to the E-model to improve its accuracy in comparison with
the PESQ looks into the content of the degraded signal and classifies packet loss
into either Voiced or Unvoiced based on the received surrounding packets. The accuracy
of the proposed method is evaluated by comparing the estimation of the new
method that takes packet class into consideration with the measurement provided
by PESQ as a more accurate, intrusive method for measuring the speech quality.
The above two extensions for quality estimation of the E-model are combined
to offer a method for estimating the quality of VoIP applications accurately and non-intrusively,
without the need for time-consuming, expensive and hard-to-conduct
subjective tests.
Finally, the applicability of the E-model or the modified E-model in measuring
the quality of services in Service Oriented Computing (SOC) is illustrated.
Reverberation: models, estimation and application
The use of reverberation models is required in many applications such as acoustic measurements,
speech dereverberation and robust automatic speech recognition. The aim of this thesis is to
investigate different models and propose a perceptually-relevant reverberation model with suitable
parameter estimation techniques for different applications.
Reverberation can be modelled in both the time and frequency domain. The model parameters
give direct information of both physical and perceptual characteristics. These characteristics
create a multidimensional parameter space of reverberation, which can be to a large extent captured
by a time-frequency domain model. In this thesis, the relationship between physical and perceptual
model parameters will be discussed. In the first application, an intrusive technique is proposed to
measure the reverberation or reverberance (the perception of reverberation) and the colouration. The
room decay rate parameter is of particular interest.
In practical applications, a blind estimate of the decay rate of acoustic energy in a room
is required. A statistical model for the distribution of the decay rate of the reverberant signal
named the eagleMax distribution is proposed. The eagleMax distribution describes the reverberant
speech decay rates as a random variable that is the maximum of the room decay rates and anechoic
speech decay rates. Three methods were developed to estimate the mean room decay rate from
the eagleMax distributions alone. The estimated room decay rates form a reverberation model that
will be discussed in the context of room acoustic measurements, speech dereverberation and robust
automatic speech recognition individually.
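The description of the reverberant speech decay rate as the maximum of two random variables can be written out as a short worked relation. This is a sketch under an assumption the abstract does not state, namely that the room decay rate and the anechoic speech decay rate are statistically independent. The symbols $\lambda_r$ (room), $\lambda_s$ (anechoic speech) and $\lambda_x$ (reverberant speech) are chosen here for illustration.

```latex
\lambda_x = \max(\lambda_r, \lambda_s)
\quad\Longrightarrow\quad
F_{\lambda_x}(\lambda)
  = \Pr(\lambda_r \le \lambda,\ \lambda_s \le \lambda)
  = F_{\lambda_r}(\lambda)\, F_{\lambda_s}(\lambda)
  \quad \text{(assuming independence)}
```

Under that assumption, the distribution of the observed decay rates is determined by the two underlying distributions, which is why the room decay rate can in principle be estimated from the observed distribution alone.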
Performance metrics, configuration strategies and traffic identification for group network application.
Fu, Zhengjia. Thesis (M.Phil.)--Chinese University of Hong Kong, 2008. Includes bibliographical references (p. 64-70). Abstracts in English and Chinese.
Contents:
Abstract
Acknowledgement
Chapter 1: Introduction
Chapter 2: Design for group network communication
  2.1 Performance metrics of network Voice Conference: GMOS
  2.2 Conference Leader Selection strategies
  2.3 Experiment Description
  2.4 Data analysis and results
  2.5 Applications of Proposals to Voice Conference
Chapter 3: P2P Application Identification
  3.1 Periodic Group Communication Patterns
    3.1.1 Terminology for Behavioral Patterns
    3.1.2 Pattern 1: Gossip of Buffer Maps
    3.1.3 Pattern 2: Content flow control
    3.1.4 Pattern 3: Synchronized Link Activation and Deactivation
  3.2 Identification Based on behavioral signatures
    3.2.1 Algorithm Overview
    3.2.2 Sequence Generation (SG1): Time Series for the Gossip Pattern
    3.2.3 Transform Time-domain Sequence to Frequency-domain Sequence
    3.2.4 Sequence Generation (SG2): Time Series for Content Flow Control Pattern
    3.2.5 Sequence Generation (SG3): Time Series for Synchronized Start and Finish of Flows
    3.2.6 Analyzer step
  3.3 Behavioral signatures of popular P2P applications
  3.4 Experiment Results
  3.5 Discussion
Chapter 4: Related Work
Chapter 5: Conclusion
Bibliography