2,597 research outputs found
On the evaluation of the conversational speech quality in telecommunications
In this paper we propose an objective method to assess speech quality in the conversational context by taking into account the talking and listening speech qualities and the impact of delay. This approach is applied to the results of four subjective tests on the effects of echo, delay, packet loss and noise. The dataset is divided into training and validation sets. For the training set, a multiple linear regression is applied to determine a relationship between conversational, talking and listening speech qualities and the delay value. The multiple linear regression leads to an accurate estimation of the conversational scores with high correlation and low error between subjective and estimated scores, both on the training and validation sets. In addition, a validation is performed on the data of a subjective test found in the literature which confirms the reliability of the regression. The relationship is then applied at an objective level by replacing talking and listening subjective scores with talking and listening objective scores provided by existing objective models, fed by speech signals recorded during the subjective tests. The conversational model achieves high performance as revealed by comparison with the test results and with the existing standard methodology “E-model”, presented in the ITU-T (International Telecommunication Union) Recommendation G.107.
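The regression step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the coefficients or the exact form of the model, so the sketch assumes a plain linear combination with an intercept, and all data (including `true_coeffs`) are synthetic values invented for the example, not results from the paper.

```python
import numpy as np

def design_matrix(talk, listen, delay_ms):
    # Columns: intercept, talking quality, listening quality, delay.
    return np.column_stack([np.ones_like(talk), talk, listen, delay_ms])

def fit_conversational_model(talk, listen, delay_ms, conv):
    # Ordinary least squares: conv ~ b0 + b1*talk + b2*listen + b3*delay.
    coeffs, *_ = np.linalg.lstsq(design_matrix(talk, listen, delay_ms),
                                 conv, rcond=None)
    return coeffs

def predict(coeffs, talk, listen, delay_ms):
    return design_matrix(talk, listen, delay_ms) @ coeffs

# Synthetic scores, invented purely for illustration.
rng = np.random.default_rng(0)
n = 200
talk = rng.uniform(1.0, 5.0, n)         # talking quality (MOS-like scale)
listen = rng.uniform(1.0, 5.0, n)       # listening quality (MOS-like scale)
delay_ms = rng.uniform(0.0, 800.0, n)   # one-way delay in milliseconds
true_coeffs = np.array([0.5, 0.4, 0.5, -0.001])  # hypothetical values
conv = design_matrix(talk, listen, delay_ms) @ true_coeffs
conv = conv + rng.normal(0.0, 0.1, n)   # simulated rating noise

# Split into training and validation sets, as the abstract describes.
train, valid = slice(0, 150), slice(150, None)
coeffs = fit_conversational_model(talk[train], listen[train],
                                  delay_ms[train], conv[train])
estimated = predict(coeffs, talk[valid], listen[valid], delay_ms[valid])

# Correlation and error between "subjective" and estimated scores.
corr = np.corrcoef(conv[valid], estimated)[0, 1]
rmse = np.sqrt(np.mean((conv[valid] - estimated) ** 2))
print(f"validation correlation={corr:.3f}, RMSE={rmse:.3f}")
```

To move to an objective level in the sense of the abstract, the `talk` and `listen` inputs to `predict` would be replaced by scores from objective models while the fitted coefficients stay fixed.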
Predicting the Quality of Synthesized and Natural Speech Impaired by Packet Loss and Coding Using PESQ and P.563 Models
This paper investigates the impact of independent and dependent losses and coding on speech quality predictions
provided by PESQ (also known as ITU-T P.862) and P.563 models, when both naturally-produced and synthesized
speech are used. Two synthesized speech samples generated with two different Text-to-Speech systems
and one naturally-produced sample are investigated. In addition, we assess the variability of PESQ’s and P.563’s
predictions with respect to the type of speech used (naturally-produced or synthesized) and loss conditions as
well as their accuracy, by comparing the predictions with subjective assessments. The results show that there is
no difference between the impact of packet loss on naturally-produced speech and synthesized speech. On the
other hand, the impact of coding is different for the two types of stimuli. In addition, synthesized speech seems
to be insensitive to degradations introduced by most of the codecs investigated here. The reasons for these findings
are discussed. Finally, it is concluded that both models are capable of predicting the quality of transmitted
synthesized speech under the investigated conditions to a certain degree. As expected, PESQ achieves the
best performance over almost all of the investigated conditions.
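The distinction between independent and dependent losses can be illustrated with a small simulation. The abstract does not say how the loss patterns were generated; the sketch below assumes a Bernoulli process for independent losses and a two-state Markov (Gilbert-style) chain for dependent losses, parameterised by a hypothetical conditional loss probability `clp`.

```python
import random

def bernoulli_losses(n, p_loss, rng):
    # Independent losses: every packet is lost with the same probability.
    return [rng.random() < p_loss for _ in range(n)]

def gilbert_losses(n, p_loss, clp, rng):
    # Dependent losses: two-state Markov chain.
    # clp = P(loss | previous packet lost). The probability of a loss after
    # a received packet is chosen so the long-run loss rate equals p_loss.
    p_after_received = p_loss * (1.0 - clp) / (1.0 - p_loss)
    lost = rng.random() < p_loss
    pattern = []
    for _ in range(n):
        pattern.append(lost)
        lost = rng.random() < (clp if lost else p_after_received)
    return pattern

def mean_burst_length(pattern):
    # Average length of runs of consecutive lost packets.
    bursts, run = [], 0
    for lost in pattern:
        if lost:
            run += 1
        elif run:
            bursts.append(run)
            run = 0
    if run:
        bursts.append(run)
    return sum(bursts) / len(bursts) if bursts else 0.0

rng = random.Random(1)
independent = bernoulli_losses(20000, 0.10, rng)
dependent = gilbert_losses(20000, 0.10, 0.50, rng)
for name, pattern in (("independent", independent), ("dependent", dependent)):
    rate = sum(pattern) / len(pattern)
    print(f"{name}: loss rate={rate:.3f}, "
          f"mean burst={mean_burst_length(pattern):.2f}")
```

Both patterns have roughly the same overall loss rate, but the dependent pattern concentrates losses into longer bursts.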
Impact of Different Active-Speech-Ratios on PESQ’s Predictions in Case of Independent and Dependent Losses (in Presence of Receiver-Side Comfort-Noise)
This paper investigates PESQ’s behavior under independent and dependent loss conditions from an Active-Speech-Ratio perspective in the presence of receiver-side comfort noise. ITU-T Recommendation P.862.3 defines this reference signal characteristic only very broadly, which motivates a more in-depth investigation of its impact on speech quality prediction. We assess the variability of PESQ’s predictions with respect to Active-Speech-Ratios and loss conditions, as well as their accuracy, by comparing the predictions with subjective assessments. Our results show that an increase in the amount of speech in the reference signal (expressed by the Active-Speech-Ratio) may make the reference signal more sensitive to changes in packet loss. Interestingly, we found two additional effects in the investigated case: higher Active-Speech-Ratios may lead to a negative shift in the MOS domain and to a decline in the accuracy of PESQ’s predictions. Prediction accuracy could improve at higher packet loss rates.
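As a rough illustration of the Active-Speech-Ratio, the fraction of a reference signal that contains speech can be estimated frame by frame. The sketch below is an assumption-laden stand-in: it uses a crude fixed energy threshold instead of a proper voice activity detector, and the frame length, the `threshold_db` value and the test signal are all hypothetical choices, not the procedure used in the paper.

```python
import numpy as np

def active_speech_ratio(signal, sample_rate, frame_ms=20, threshold_db=-40.0):
    # Split the signal into non-overlapping frames and count those whose
    # RMS level (relative to full scale 1.0) exceeds the threshold.
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(signal) // frame_len
    frames = np.reshape(signal[:n_frames * frame_len], (n_frames, frame_len))
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    level_db = 20.0 * np.log10(np.maximum(rms, 1e-12))
    return float(np.mean(level_db > threshold_db))

# Synthetic reference: 1 s of a tone standing in for speech, then 1 s silence.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
speech_like = 0.5 * np.sin(2 * np.pi * 200 * t)
silence = np.zeros(sample_rate)
asr = active_speech_ratio(np.concatenate([speech_like, silence]), sample_rate)
print(f"Active-Speech-Ratio: {asr:.2f}")  # half of the frames are active
```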
Quality aspects of Internet telephony
Internet telephony has had a tremendous impact on how people communicate.
Many now maintain contact using some form of Internet telephony.
Therefore the motivation for this work has been to address the quality aspects
of real-world Internet telephony for both fixed and wireless telecommunication.
The focus has been on the quality aspects of voice communication,
since poor quality often leads to user dissatisfaction. The scope of the work
has been broad in order to address the main factors within IP-based voice
communication.
The first four chapters of this dissertation constitute the background
material. The first chapter outlines where Internet telephony is deployed
today. It also motivates the topics and techniques used in this research.
The second chapter provides the background on Internet telephony including
signalling, speech coding and voice internetworking. The third chapter
focuses solely on quality measures for packetised voice systems and finally
the fourth chapter is devoted to the history of voice research.
The appendix of this dissertation constitutes the research contributions.
It includes an examination of the access network, focusing on how calls are
multiplexed in wired and wireless systems. Subsequently in the wireless
case, we consider how to handover calls from 802.11 networks to the cellular
infrastructure. We then consider the Internet backbone where most of our
work is devoted to measurements specifically for Internet telephony. The
applications of these measurements have been estimating telephony arrival
processes, measuring call quality, and quantifying the trend in Internet telephony
quality over several years. We also consider the end systems, since
they are responsible for reconstructing a voice stream given loss and delay
constraints. Finally we estimate voice quality using the ITU proposal PESQ
and the packet loss process.
The main contribution of this work is a systematic examination of Internet
telephony. We describe several methods to enable adaptable solutions
for maintaining consistent voice quality. We have also found that relatively
small technical changes can lead to substantial user quality improvements.
A second contribution of this work is a suite of software tools designed to
ascertain voice quality in IP networks. Some of these tools are in use within
commercial systems today.
Root-MUSIC-based methods for blind network-assisted diversity multiple access
Packet collisions in wireless networks degrade the throughput and impede the system performance. The collided packets are typically corrupted and get discarded. Channelization methods avoid collisions through fixed assignment of communication resources to the system users, but they do not take into account the randomness of packet arrivals. Statistical multiplexing optimally adapts the allocation of resources to the instantaneous traffic demands of the users. However, it is only possible in the downlink wherein the data streams are managed by one station. Random-access methods mimic statistical multiplexing by dynamically assigning resources to users. A slot is wasted if the channel incurs a collision, and the collided packets have to be retransmitted.
First, we present a cross-layer design for providing multiple access to a shared wireless link. While retransmissions are controlled by the medium access control (MAC) layer, this creates sufficient diversity to recover the collided packets in the physical (PHY) layer. Both the number and identities of the involved transmitters in a collision are unknown to the receiver. The signal separation is done blindly using root-MUSIC-like algorithms. We solve the collision resolution problem in four network-operation modes: synchronous blocking mode, synchronous non-blocking mode, asynchronous blocking mode and asynchronous non-blocking mode.
Second, we evaluate the decoding performance of the algorithms in block-fading channels with additive white Gaussian noise. We analytically demonstrate the effect of signal-to-noise ratio and the number of retransmissions on the signal separation capability of the proposed methods for a given number of collided packets.
Third, we evaluate the network throughput and mean packet queueing delay for the proposed collision resolution algorithms analytically and numerically. We derive conditions for stability of the queueing network as a function of the mean packet arrival rates.
A Generic Algorithm for Mid-call Audio Codec Switching
We present and evaluate an algorithm that performs
in-call selection of the most appropriate audio codec given
prevailing conditions on the network path between the endpoints
of a voice call. We have studied the behaviour of different
codecs under varying network conditions, in doing so deriving
the impairment factors for non-ITU-T codecs so that the E-model
can be used to assess voice call quality for them. Moreover, we
have studied the drawbacks of codec switching from the end
user perception point of view; our switching algorithm seeks to
minimise this impact. We have tested our algorithm on different
packages that contain a selection of the most commonly used
codecs: G.711, SILK, iLBC, GSM and Speex. Our results show
that in many typical network scenarios, our mid-call codec
switching algorithm results in better Quality of Experience (QoE)
than would have been achieved had the initial codec been used
throughout the call.
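The idea of scoring codecs with the E-model and switching only when it pays off can be sketched as below. This is not the authors' algorithm. It assumes the default E-model rating of 93.2, the effective equipment impairment formula and the R-to-MOS mapping from ITU-T G.107, and a commonly used simplified delay impairment that is not part of G.107. The codecs `codec_a` and `codec_b`, their `ie` and `bpl` values, and the switching `margin` are hypothetical placeholders, not impairment factors derived in the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Codec:
    name: str
    ie: float   # equipment impairment factor (hypothetical value)
    bpl: float  # packet-loss robustness factor (hypothetical value)

def delay_impairment(delay_ms):
    # Simplified delay impairment; an approximation, not the full G.107 Id.
    return 0.024 * delay_ms + 0.11 * max(delay_ms - 177.3, 0.0)

def r_factor(codec, loss_pct, delay_ms, burst_r=1.0):
    # Effective equipment impairment under packet loss (G.107 form).
    ie_eff = codec.ie + (95.0 - codec.ie) * loss_pct / (
        loss_pct / burst_r + codec.bpl)
    return 93.2 - delay_impairment(delay_ms) - ie_eff

def r_to_mos(r):
    # G.107 mapping from the rating factor R to an estimated MOS.
    if r <= 0.0:
        return 1.0
    if r >= 100.0:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6

def choose_codec(current, candidates, loss_pct, delay_ms, margin=5.0):
    # Switch only if the best candidate beats the current codec by more
    # than `margin` R points, to limit the perceptual cost of switching.
    best = max(candidates, key=lambda c: r_factor(c, loss_pct, delay_ms))
    gain = (r_factor(best, loss_pct, delay_ms)
            - r_factor(current, loss_pct, delay_ms))
    return best if gain > margin else current

codec_a = Codec("codec_a", ie=0.0, bpl=5.0)    # high quality, loss-sensitive
codec_b = Codec("codec_b", ie=12.0, bpl=25.0)  # lower quality, loss-robust
for loss_pct in (0.0, 5.0):
    chosen = choose_codec(codec_a, [codec_a, codec_b], loss_pct, delay_ms=50.0)
    mos = r_to_mos(r_factor(chosen, loss_pct, 50.0))
    print(f"loss={loss_pct}%: use {chosen.name}, estimated MOS={mos:.2f}")
```

The margin acts as a simple hysteresis: small, short-lived differences in estimated quality do not trigger a switch, which reflects the abstract's concern that switching itself has a perceptual cost.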