
    Enhancement of perceived quality of service for voice over internet protocol systems

    Voice over Internet Protocol (VoIP) applications are becoming more and more popular in the telecommunication market. Packet-switched VoIP systems have many technical advantages over the conventional Public Switched Telephone Network (PSTN), including efficient and flexible use of bandwidth, lower cost and enhanced security. However, due to the IP network's "Best Effort" nature, voice quality is not naturally guaranteed in VoIP services. In fact, most current VoIP services cannot provide as good a voice quality as the PSTN. IP network impairments such as packet loss, delay and jitter affect perceived speech quality, as do application-layer impairment factors such as codec rate and audio features. Current perceived Quality of Service (QoS) methods are mainly designed to be used in a PSTN/TDM environment and their performance in a VoIP environment is unknown. It is a challenge to measure perceived speech quality correctly in VoIP systems and to enhance user-perceived speech quality for VoIP systems. The main goal of this project is to evaluate the accuracy of the existing ITU-T speech quality measurement method (Perceptual Evaluation of Speech Quality - PESQ) in mobile wireless systems in the context of VoIP, and to develop novel and efficient methods to enhance the user-perceived speech quality for emerging VoIP services, especially in the mobile VoIP environment. The main contributions of the thesis are threefold: (1) A new discovery of PESQ errors in the mobile VoIP environment. A detailed investigation of PESQ performance in the mobile VoIP environment was undertaken and included setting up a PESQ performance evaluation platform and testing over 1800 mobile-to-mobile and mobile-to-PSTN calls over a period of three months. The accuracy issues of the PESQ algorithm were investigated and the main problem causing inaccurate PESQ scores (improper time-alignment in the PESQ algorithm) was discovered.
Calibration issues for safe and proper PESQ testing in the mobile environment were also discussed in the thesis. (2) A new, simple-to-use VoIP jitter buffer algorithm. This was developed and implemented in a commercial mobile handset. The algorithm, called the "Play Late Algorithm", adaptively alters the playout delay inside a speech talkspurt without introducing unnecessary extra end-to-end delay. It can be used as a front-end to conventional static or adaptive jitter buffer algorithms to provide improved performance. Results show that the proposed algorithm can increase user-perceived quality without consuming too much processing power when tested in live wireless VoIP networks. (3) A new QoS enhancement scheme. The new scheme combines the strengths of adaptive codec bit rate (i.e. the AMR 8-mode bit rate) and speech priority marking (i.e. giving high priority to the beginning of a voiced segment). The results gathered on a simulation and emulation test platform show that the combined method provides a better user-perceived speech quality than separate adaptive sender bit rate or packet priority marking methods.
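
The abstract above lists jitter among the network impairments that a jitter buffer has to absorb. As a minimal sketch of how jitter is commonly quantified, the following Python function implements the running interarrival jitter estimate defined in RFC 3550 (the RTP specification). It is not the thesis's "Play Late Algorithm", whose details the abstract does not give; the function name and the use of plain lists of timestamps are assumptions made for illustration.

```python
def interarrival_jitter(send_times, recv_times):
    """Running RFC 3550 interarrival jitter estimate.

    send_times and recv_times are per-packet timestamps in the same unit
    (for example milliseconds). The function name and list-based interface
    are illustrative assumptions, not part of any library.
    """
    jitter = 0.0
    prev_transit = None
    for sent, received in zip(send_times, recv_times):
        transit = received - sent  # one-way transit time (up to a clock offset)
        if prev_transit is not None:
            d = abs(transit - prev_transit)  # change in transit time
            jitter += (d - jitter) / 16.0    # smoothing gain 1/16 from RFC 3550
        prev_transit = transit
    return jitter
```

A constant network delay yields an estimate of zero, so only variation in delay contributes; a receiver could use such an estimate as one input when sizing its playout buffer.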

    Speech quality prediction for voice over Internet protocol networks

    Merged with duplicate record 10026.1/878 on 03.01.2017 by CS (TIS). Merged with duplicate record 10026.1/1657 on 15.03.2017 by CS (TIS). This is a digitised version of a thesis that was deposited in the University Library. If you are the author please contact PEARL Admin ([email protected]) to discuss options. IP networks are on a steep slope of innovation that will make them the long-term carrier of all types of traffic, including voice. However, such networks are not designed to support real-time voice communication because their variable characteristics (e.g. due to delay, delay variation and packet loss) lead to a deterioration in voice quality. A major challenge in such networks is how to measure or predict voice quality accurately and efficiently for QoS monitoring and/or control purposes to ensure that technical and commercial requirements are met. Voice quality can be measured using either subjective or objective methods. Subjective measurement (e.g. MOS) is the benchmark for objective methods, but it is slow, time consuming and expensive. Objective measurement can be intrusive or non-intrusive. Intrusive methods (e.g. ITU PESQ) are more accurate, but are normally unsuitable for monitoring live traffic because they need reference data and must use the network. This makes non-intrusive methods (e.g. the ITU E-model) more attractive for monitoring voice quality under IP network impairments. However, current non-intrusive methods rely on subjective tests to derive model parameters and as a result are limited and do not meet the needs of new and emerging applications. The main goal of the project is to develop novel and efficient models for non-intrusive speech quality prediction to overcome the disadvantages of current subjective-based methods and to demonstrate their usefulness in new and emerging VoIP applications. The main contributions of the thesis are fourfold: (1) a detailed understanding of the relationships between voice quality, IP network impairments (e.g.
packet loss, jitter and delay) and relevant parameters associated with speech (e.g. codec type, gender and language) is provided. An understanding of the perceptual effects of these key parameters on voice quality is important as it provides a basis for the development of non-intrusive voice quality prediction models. A fundamental investigation of the impact of the parameters on perceived voice quality was carried out using the latest ITU algorithm for perceptual evaluation of speech quality, PESQ, and by exploiting the ITU E-model to obtain an objective measure of voice quality. (2) a new methodology to predict voice quality non-intrusively was developed. The method exploits the intrusive algorithm, PESQ, and a combined PESQ/E-model structure to provide a perceptually accurate prediction of both listening and conversational voice quality non-intrusively. This avoids time-consuming subjective tests and so removes one of the major obstacles in the development of models for voice quality prediction. The method is generic and as such has wide applicability in multimedia applications. Efficient regression-based models and robust artificial neural network-based learning models were developed for predicting voice quality non-intrusively for VoIP applications. (3) three applications of the new models were investigated: voice quality monitoring/prediction for real Internet VoIP traces, perceived-quality-driven playout buffer optimization and perceived-quality-driven QoS control. The neural network and regression models were both used to predict voice quality for real Internet VoIP traces based on international links. A new adaptive playout buffer algorithm and a perceptual-optimization playout buffer algorithm are presented. A QoS control scheme that combines the strengths of rate-adaptive and priority marking control schemes to provide superior QoS control in terms of measured perceived voice quality is also provided.
(4) a new methodology for Internet-based subjective speech quality measurement, which allows rapid assessment of voice quality for VoIP applications, is proposed and assessed using both objective and traditional MOS test methods.
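
The abstract above refers to the ITU E-model as a non-intrusive quality measure. As a hedged illustration of how an E-model rating is turned into a mean opinion score, the sketch below applies the R-to-MOS conversion published in ITU-T Recommendation G.107. It does not reproduce the thesis's regression or neural network models, and the function name `r_to_mos` is an assumption chosen for this example.

```python
def r_to_mos(r):
    """Convert an E-model rating factor R to an estimated MOS.

    Uses the conversion from ITU-T G.107: MOS is 1 for R <= 0, 4.5 for
    R >= 100, and a cubic polynomial in R in between. The function name
    is an illustrative assumption.
    """
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6
```

With the G.107 default rating of R = 93.2 (no additional impairments), the conversion gives a MOS of roughly 4.41, which is why narrowband telephony scores top out below 4.5.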

    VoIP Quality Assessment Technologies


    Video streaming


    A Utility-based QoS Model for Emerging Multimedia Applications

    Existing network QoS models do not sufficiently reflect the challenges faced by high-throughput, always-on, inelastic multimedia applications. In this paper, a utility-based QoS model is proposed as a user-layer extension to existing communication QoS models to better assess the requirements of multimedia applications and manage the QoS provisioning of multimedia flows. Network impairment utility functions are derived from user experiments and combined into application utility functions to evaluate the application quality. Simulation is used to demonstrate the validity of the proposed QoS model.

    Assessing the quality of audio and video components in desktop multimedia conferencing

    This thesis seeks to address the HCI (Human-Computer Interaction) research problem of how to establish the level of audio and video quality that end users require to successfully perform tasks via networked desktop videoconferencing. There are currently no established HCI methods of assessing the perceived quality of audio and video delivered in desktop videoconferencing. The transport of real-time speech and video information across new digital networks causes novel and different degradations, problems and issues to those common in the traditional telecommunications areas (telephone and television). Traditional assessment methods involve the use of very short test samples, are traditionally conducted outside a task-based environment, and focus on whether a degradation is noticed or not. But these methods cannot help establish what audio-visual quality is required by users to perform tasks successfully with the minimum of user cost, in interactive conferencing environments. This thesis addresses this research gap by investigating and developing a battery of assessment methods for networked videoconferencing, suitable for use in both field trials and laboratory-based studies. The development and use of these new methods helps identify the most critical variables (and levels of these variables) that affect perceived quality, and means by which network designers and HCI practitioners can address these problems are suggested. The output of the thesis therefore contributes both methodological (i.e. new rating scales and data-gathering methods) and substantive (i.e. explicit knowledge about quality requirements for certain tasks) knowledge to the HCI and networking research communities on the subjective quality requirements of real-time interaction in networked videoconferencing environments. 
Exploratory research is carried out through an interleaved series of field trials and controlled studies, advancing substantive and methodological knowledge in an incremental fashion. Initial studies use the ITU-recommended assessment methods, but these are found to be unsuitable for assessing networked speech and video quality for a number of reasons. Therefore later studies investigate and establish a novel polar rating scale, which can be used both as a static rating scale and as a dynamic continuous slider. These and further developments of the methods in future lab-based and real conferencing environments will enable subjective quality requirements and guidelines for different videoconferencing tasks to be established.

    Synchronization of streamed audio between multiple playback devices over an unmanaged IP network

    When designing and implementing a prototype supporting inter-destination media synchronization (synchronized playback between multiple devices receiving the same stream), there are many aspects that need to be considered, especially when working with unmanaged networks. Not only is a proper streaming protocol essential, but also a way to obtain and maintain the synchronization of the clocks of the devices. The thesis had a few constraints, namely that the server producing the stream should be written for the .NET platform and that the clients receiving it should use the media framework GStreamer. This framework provides methods for both synchronization and resynchronization. As the provided resynchronization methods introduced distortions in the audio, an alternative method was implemented. This method focused on minimizing the distortions, thus maintaining a smooth playback. After the prototype had been implemented, it was tested to see how well it performed under the influence of packet loss and delay. The accuracy of the synchronization was also tested under optimal conditions using two different time synchronization protocols. What could be concluded from this was that good synchronization could be maintained on unloaded networks using the proposed method, but when delay was introduced the prototype struggled more. This was mainly due to the use of the Network Time Protocol (NTP), which is known to perform badly on networks with asymmetric paths. When working with synchronized playback it is not enough just to obtain it; it also needs to be maintained. Implementing a prototype thus involves many parts, ranging from choosing a proper streaming protocol to handling glitch-free resynchronization of audio. Synchronization between multiple speakers has a wide area of application, ranging from home entertainment solutions to big malls where announcements should appear synchronized over the entire perimeter.
In order to achieve this, two main parts are involved: the streaming of the audio, and the actual synchronization. The streaming itself poses problems mostly because the prototype should work not only on dedicated networks but on all kinds of networks, such as the Internet. As the information over these networks is transmitted in packets, and the path from source to destination crosses many subnetworks, the packets may be delayed or even lost. This may create an audible distortion in the playback. The next part is the synchronization. This is most easily achieved by putting a time on each packet stating when in the future it should be played out. If all receivers then play it back at the specified time, synchronization is achieved. This however requires that all the receivers agree on when a specific time is; that is, the clocks at all the receivers must be synchronized. By using existing software and hardware solutions, such as the Network Time Protocol (NTP) or the Precision Time Protocol (PTP), this can be accomplished. The accuracy of the synchronization is therefore partly dependent on how well these solutions work. Another relevant aspect is how accurate the synchronization must be for the sound to be perceived as synchronized by humans. This is usually in the range of five milliseconds to a few tens of milliseconds, depending on the sound. When a global time has been distributed to all receivers, matters get more complicated as there is more than one clock to consider at each receiver. Apart from the previously mentioned clock, now called the 'system clock', there is also an audio clock, which is a hardware clock positioned on the sound card. This audio clock decides the rate at which media is played out. Altering the system clock to synchronize it to a common time is one thing, but altering the audio clock while media is being played will inevitably mean a jump in the playback, and thus a distortion.
Although an initial synchronization can be achieved, the two clocks will over time tick at slightly different rates, thus drifting away from each other. This creates a need for the audio clock to continuously correct itself to follow the system clock. In the media framework GStreamer, used for handling the media at the receivers, two alternatives to solve the correction problem were available. Quick evaluations of these two methods however showed that either audible glitches or 'oscillations' occurred in the sound when the clocks were corrected. A new method, which basically combines the two existing ones, was therefore implemented. With this method the audio clock is continuously corrected, but in a smaller and less aggressive way. Listening tests revealed much smaller, often inaudible, distortions, while the synchronization performance was on par with the existing methods. More thorough testing showed that the synchronization over networks with light traffic was in the microsecond range, thus far below the threshold of what will appear as synchronized. Under worse conditions (simulated hostile environments) the synchronization quickly reached unacceptable levels, though. This was due to the previously mentioned NTP, and not to the implemented method.
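
The abstract above describes correcting the audio clock continuously but gently so that it follows the system clock. A minimal sketch of that general idea, assuming a simple proportional correction with a hard limit, is given below. It is not the thesis's method and not a GStreamer API; the function name, the `gain` and `max_adjust` parameters and their default values are all hypothetical.

```python
def playback_rate(drift_s, gain=0.1, max_adjust=0.005):
    """Return a playback-rate multiplier that nudges the audio clock
    toward the system clock.

    drift_s is the system clock minus the audio clock, in seconds; a
    positive value means the audio clock is behind and playback should
    speed up slightly. gain and max_adjust are hypothetical tuning
    parameters: the correction is proportional to the drift but never
    exceeds max_adjust, so the rate changes stay small.
    """
    adjust = gain * drift_s
    adjust = max(-max_adjust, min(max_adjust, adjust))  # clamp the correction
    return 1.0 + adjust
```

Clamping trades convergence speed for smoothness: a large drift is removed over a longer time instead of through a single audible jump.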

    Quality aspects of Internet telephony

    Internet telephony has had a tremendous impact on how people communicate. Many now maintain contact using some form of Internet telephony. Therefore the motivation for this work has been to address the quality aspects of real-world Internet telephony for both fixed and wireless telecommunication. The focus has been on the quality aspects of voice communication, since poor quality often leads to user dissatisfaction. The scope of the work has been broad in order to address the main factors within IP-based voice communication. The first four chapters of this dissertation constitute the background material. The first chapter outlines where Internet telephony is deployed today. It also motivates the topics and techniques used in this research. The second chapter provides the background on Internet telephony including signalling, speech coding and voice internetworking. The third chapter focuses solely on quality measures for packetised voice systems and finally the fourth chapter is devoted to the history of voice research. The appendix of this dissertation constitutes the research contributions. It includes an examination of the access network, focusing on how calls are multiplexed in wired and wireless systems. Subsequently, in the wireless case, we consider how to hand over calls from 802.11 networks to the cellular infrastructure. We then consider the Internet backbone, where most of our work is devoted to measurements specifically for Internet telephony. The applications of these measurements have been estimating telephony arrival processes, measuring call quality, and quantifying the trend in Internet telephony quality over several years. We also consider the end systems, since they are responsible for reconstructing a voice stream given loss and delay constraints. Finally we estimate voice quality using the ITU proposal PESQ and the packet loss process. The main contribution of this work is a systematic examination of Internet telephony.
We describe several methods to enable adaptable solutions for maintaining consistent voice quality. We have also found that relatively small technical changes can lead to substantial user quality improvements. A second contribution of this work is a suite of software tools designed to ascertain voice quality in IP networks. Some of these tools are in use within commercial systems today.