
    Speech quality prediction for voice over Internet protocol networks

    IP networks are on a steep slope of innovation that will make them the long-term carrier of all types of traffic, including voice. However, such networks are not designed to support real-time voice communication because their variable characteristics (e.g. delay, delay variation and packet loss) lead to a deterioration in voice quality. A major challenge in such networks is how to measure or predict voice quality accurately and efficiently for QoS monitoring and/or control purposes to ensure that technical and commercial requirements are met. Voice quality can be measured using either subjective or objective methods. Subjective measurement (e.g. MOS) is the benchmark for objective methods, but it is slow, time-consuming and expensive. Objective measurement can be intrusive or non-intrusive. Intrusive methods (e.g. ITU PESQ) are more accurate, but are normally unsuitable for monitoring live traffic because they need reference data and must use the network. This makes non-intrusive methods (e.g. ITU E-model) more attractive for monitoring voice quality from IP network impairments. However, current non-intrusive methods rely on subjective tests to derive model parameters and as a result are limited and do not meet the needs of new and emerging applications. The main goal of the project is to develop novel and efficient models for non-intrusive speech quality prediction to overcome the disadvantages of current subjective-based methods and to demonstrate their usefulness in new and emerging VoIP applications. The main contributions of the thesis are fourfold: (1) a detailed understanding of the relationships between voice quality, IP network impairments (e.g.
packet loss, jitter and delay) and relevant parameters associated with speech (e.g. codec type, gender and language) is provided. An understanding of the perceptual effects of these key parameters on voice quality is important as it provides a basis for the development of non-intrusive voice quality prediction models. A fundamental investigation of the impact of the parameters on perceived voice quality was carried out using the latest ITU algorithm for perceptual evaluation of speech quality, PESQ, and by exploiting the ITU E-model to obtain an objective measure of voice quality. (2) a new methodology to predict voice quality non-intrusively was developed. The method exploits the intrusive algorithm, PESQ, and a combined PESQ/E-model structure to provide a perceptually accurate prediction of both listening and conversational voice quality non-intrusively. This avoids time-consuming subjective tests and so removes one of the major obstacles in the development of models for voice quality prediction. The method is generic and as such has wide applicability in multimedia applications. Efficient regression-based models and robust artificial neural network-based learning models were developed for predicting voice quality non-intrusively for VoIP applications. (3) three applications of the new models were investigated: voice quality monitoring/prediction for real Internet VoIP traces, perceived-quality-driven playout buffer optimization and perceived-quality-driven QoS control. The neural network and regression models were both used to predict voice quality for real Internet VoIP traces based on international links. A new adaptive playout buffer algorithm and a perceptual optimization playout buffer algorithm are presented. A QoS control scheme that combines the strengths of rate-adaptive and priority marking control schemes to provide superior QoS control in terms of measured perceived voice quality is also provided.
(4) a new methodology for Internet-based subjective speech quality measurement, which allows rapid assessment of voice quality for VoIP applications, is proposed and assessed using both objective and traditional MOS test methods.
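
    The abstract above relies on the ITU E-model, whose output is a transmission rating factor R that is then mapped to an estimated MOS. As a minimal sketch, the standard R-to-MOS mapping from ITU-T G.107 can be written as below. The function name `r_to_mos` is a hypothetical choice for illustration, and the snippet is not taken from the thesis; it only shows the final mapping step, not how R itself is derived from network impairments.

```python
def r_to_mos(r: float) -> float:
    """Map an E-model rating factor R to an estimated MOS (ITU-T G.107 mapping).

    The function name is illustrative; only the formula comes from the standard.
    """
    if r <= 0:
        return 1.0   # lower bound of the estimated MOS
    if r >= 100:
        return 4.5   # upper bound of the estimated MOS
    # Cubic mapping used for 0 < R < 100
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


if __name__ == "__main__":
    for r in (50.0, 70.0, 93.2):
        print(f"R = {r:5.1f} -> estimated MOS = {r_to_mos(r):.2f}")
```

    The cubic term vanishes at R = 60, so that point reduces to the linear part of the formula, which makes it a convenient sanity check.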

    Quality-Oriented Perceptual HEVC Based on the Spatiotemporal Saliency Detection Model

    Perceptual video coding (PVC) can provide a lower bitrate with the same visual quality compared with traditional H.265/high efficiency video coding (HEVC). In this work, a novel H.265/HEVC-compliant PVC framework is proposed based on a video saliency model. Firstly, an effective and efficient spatiotemporal saliency model is used to generate a video saliency map. Secondly, a perceptual coding scheme is developed based on the saliency map. A saliency-based quantization control algorithm is proposed to reduce the bitrate. Finally, the simulation results demonstrate the superiority of the proposed perceptual coding scheme in objective and subjective tests, achieving up to a 9.46% bitrate reduction with negligible subjective and objective quality loss. The proposed method is well suited to high-definition video applications, where it maintains high quality.
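
    To illustrate the general idea of saliency-based quantization control mentioned above, the sketch below assigns a lower quantization parameter (QP) to salient blocks and a higher one to non-salient blocks. This is not the algorithm from the paper: the linear mapping, the function name `saliency_to_qp`, and the parameters `base_qp` and `max_delta` are all assumptions made for illustration. The only fixed fact used is that HEVC QP values for 8-bit video lie in the range 0 to 51.

```python
def saliency_to_qp(saliency, base_qp=32, max_delta=4):
    """Map per-block saliency values in [0, 1] to per-block QP values.

    Hypothetical linear rule: saliency 1.0 gives base_qp - max_delta (finer
    quantization), saliency 0.0 gives base_qp + max_delta (coarser).
    """
    qps = []
    for s in saliency:
        s = min(max(s, 0.0), 1.0)                    # clamp saliency to [0, 1]
        delta = round(max_delta * (1.0 - 2.0 * s))   # +max_delta .. -max_delta
        qps.append(min(max(base_qp + delta, 0), 51)) # keep QP in the HEVC range
    return qps


if __name__ == "__main__":
    # Three blocks: background, moderately salient, highly salient
    print(saliency_to_qp([0.0, 0.5, 1.0]))
```

    Spending fewer bits on regions viewers are unlikely to look at is what allows a bitrate reduction with little perceived quality loss; the actual mapping and its parameters would need to be tuned against subjective tests.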

    Quality of experience in telemeetings and videoconferencing: a comprehensive survey

    Telemeetings such as audiovisual conferences or virtual meetings play an increasingly important role in our professional and private lives. For that reason, system developers and service providers will strive for an optimal experience for the user, while at the same time optimizing technical and financial resources. This leads to the discipline of Quality of Experience (QoE), an active field originating from the telecommunication and multimedia engineering domains, that strives for understanding, measuring, and designing the quality experience with multimedia technology. This paper provides the reader with an entry point to the large and still growing field of QoE of telemeetings, by taking a holistic perspective, considering both technical and non-technical aspects, and by focusing on current and near-future services. Addressing both researchers and practitioners, the paper first provides a comprehensive survey of factors and processes that contribute to the QoE of telemeetings, followed by an overview of relevant state-of-the-art methods for QoE assessment. To embed this knowledge into recent technology developments, the paper continues with an overview of current trends, focusing on the field of eXtended Reality (XR) applications for communication purposes. Given the complexity of telemeeting QoE and the current trends, new challenges for a QoE assessment of telemeetings are identified. To overcome these challenges, the paper presents a novel Profile Template for characterizing telemeetings from the holistic perspective endorsed in this paper

    A MODEL FOR PREDICTING THE PERFORMANCE OF IP VIDEOCONFERENCING

    With the incorporation of free desktop videoconferencing (DVC) software on the majority of the world's PCs in recent years, there has, inevitably, been considerable interest in using DVC over the Internet. The growing popularity of DVC increases the need for multimedia quality assessment. However, the task of predicting the perceived multimedia quality over Internet Protocol (IP) networks is complicated by the fact that the audio and video streams are susceptible to unique impairments due to the unpredictable nature of IP networks, different types of task scenarios, different levels of complexity, and other related factors. To date, a standard consensus on how to define IP media Quality of Service (QoS) has yet to be reached. The thesis addresses this problem by investigating a new approach to assess the quality of audio, video, and the overall audiovisual experience as perceived in low-cost DVC systems. The main aim of the thesis is to investigate current methods used to assess the perceived IP media quality, and then propose a model which will predict the quality of audiovisual experience from prevailing network parameters. This thesis investigates the effects of various traffic conditions, such as packet loss, jitter, and delay, and other factors that may influence end user acceptance when low-cost DVC is used over the Internet. It also investigates the interaction effects between the audio and video media, and the issues involving lip synchronisation error. The thesis provides empirical evidence that the subjective mean opinion score (MOS) of the perceived multimedia quality is unaffected by lip synchronisation error in low-cost DVC systems. The data-gathering approach that is advocated in this thesis involves both field and laboratory trials to enable the comparison of results between classroom-based experiments and real-world environments, and to provide actual real-world confirmation of the bench tests.
The subjective test method was employed since it has been proven to be more robust and suitable for the research studies, as compared to objective testing techniques. The MOS results, and the number of observations obtained, have enabled a set of criteria to be established that can be used to determine the acceptable QoS for given network conditions and task scenarios. Based upon these comprehensive findings, the final contribution of the thesis is the proposal of a new adaptive architecture method that is intended to enable the performance of IP based DVC of a particular session to be predicted for a given network condition

    Contribution to quality of user experience provision over wireless networks

    The widespread expansion of wireless networks has brought attractive new possibilities to end users. In addition to the mobility provided by unwired devices, it is worth noting the easy configuration process that a user has to follow to gain connectivity through a wireless network. Furthermore, the increasing bandwidth provided by the IEEE 802.11 family has made it possible to access highly demanding services such as multimedia communications. Multimedia traffic has unique characteristics that make it highly vulnerable to network impairments, such as packet losses, delay, or jitter. Voice over IP (VoIP) communications, video-conference, video-streaming, etc., are examples of these highly demanding services that need to meet very strict requirements in order to be served with acceptable levels of quality. Meeting these tough requirements will become extremely important, taking into account that consumer video traffic will be the predominant traffic in the Internet during the next years. In wired systems, these requirements are achieved by using Quality of Service (QoS) techniques, such as Differentiated Services (DiffServ), traffic engineering, etc. However, employing these methodologies in wireless networks is not as simple, because many other factors affect the quality of the provided service, e.g., fading, interference, etc. The IEEE 802.11g standard, which is the most widespread technology for Wireless Local Area Networks (WLANs), defines two different architecture schemes. On one hand, the infrastructure mode consists of a central point, which manages the network, assuming network control tasks such as IP assignment, routing, access security, etc. The rest of the nodes composing the network act as hosts, i.e., they send and receive traffic through the central point. On the other hand, the IEEE 802.11 ad-hoc configuration mode is less widespread than the infrastructure one.
Under this scheme, there is no central point in the network; instead, all the nodes composing the network assume both host and router roles, which permits the quick deployment of a network without a pre-existing infrastructure. This type of network, the so-called Mobile Ad-hoc NETwork (MANET), presents interesting characteristics for situations when the fast deployment of a communication system is needed, e.g., tactical networks, disaster events, or temporary networks. The benefits provided by MANETs are varied, including high node mobility, network coverage extension, and network reliability through the avoidance of single points of failure. The dynamic nature of these networks requires the nodes to react to topology changes as fast as possible. Moreover, as mentioned above, the transmission of multimedia traffic entails real-time constraints, which must be met to provide these services with acceptable levels of quality. For those reasons, efficient routing protocols are needed, capable of providing enough reliability to the network with the minimum impact on the quality of the service flowing through the nodes. Regarding quality measurements, the current trend is estimating what the end user actually perceives when consuming the service. This paradigm is called Quality of user Experience (QoE) and differs from the traditional Quality of Service (QoS) approach in the human perspective given to quality estimations. In order to measure the subjective opinion that a user has about a given service, different approaches can be taken. The most accurate methodology is performing subjective tests in which a panel of human testers rates the quality of the service under evaluation. This approach returns a quality score, the so-called Mean Opinion Score (MOS), for the considered service on a scale of 1 to 5. This methodology presents several drawbacks, such as its high cost and the impossibility of performing tests in real time.
For those reasons, several mathematical models have been presented in order to provide an estimation of the QoE (MOS) reached by different multimedia services. In this thesis, the focus is on evaluating and understanding the multimedia content transmission process in wireless networks from a QoE perspective. To this end, firstly, the QoE paradigm is explored, aiming at understanding how to evaluate the quality of a given multimedia service. Then, the influence of the impairments introduced by the wireless transmission channel on multimedia communications is analyzed. In addition, the operation of different WLAN schemes is evaluated in order to test their suitability to support highly demanding traffic such as multimedia transmission. Finally, as the main contribution of this thesis, new mechanisms or strategies to improve the quality of multimedia services distributed over IEEE 802.11 networks are presented. Specifically, the distribution of multimedia services over ad-hoc networks is studied in depth. Thus, a novel opportunistic routing protocol called JOKER (auto-adJustable Opportunistic acK/timEr-based Routing) is presented. This proposal permits better support for multimedia services while reducing the energy consumption in comparison with the standard ad-hoc routing protocols. Universidad Politécnica de Cartagena. Programa Oficial de Doctorado en Tecnologías de la Información y Comunicacione

    Finding perceptually optimal operating points of a real time interactive video-conferencing system

    This research aims to address issues faced by real time video-conferencing systems in locating a perceptually optimal operating point under various network and conversational conditions. In order to determine the perceptually optimal operating point of a video-conferencing system, we must first be able to conduct a fair assessment of the quality of the current operating point in the system and compare it with another operating point to determine if one is better than the other in terms of perceptual quality. However, at present no single objective quality metric can accurately and fully describe the perceptual quality of a real time video conversation. Hence there is a need for a controlled environment in which tests can be conducted and in which we can study different metrics and identify the best trade-offs between them. We begin by studying the components of a typical setup of a real time video-conferencing system and the impacts that various network and conversation conditions can have on the overall perceptual quality. We also look into different metrics available to measure those impacts. We then create a platform to perform black box testing on current video conferencing systems and observe how they handle changes in operating conditions. The platform is then used to conduct a brief evaluation of the performance of Skype, a popular commercial video-conferencing system. However, we are not able to modify the system parameters of Skype. The main contribution of this thesis is the design of a new testbed that provides a controlled environment in which tests can be conducted to determine the perceptually optimal operating point of a video conversation under specified network and conversation conditions. This testbed allows us to modify certain parameters, such as frame rate and frame size, which was not previously possible.
The testbed takes as input two recorded videos of the two speakers of a face-to-face conversation and the desired output video parameters, such as frame rate, frame size and delay. A video generation algorithm is designed as part of the testbed to handle modifications to the frame rate and frame size of the videos as well as delays inserted into the recorded video conversation to simulate the effects of network delays. The most important issue addressed is the generation of new frames to fill the gaps created by a change in frame rate or an inserted delay, unlike in the case of voice, where a period of silence can simply be used to handle these situations. The testbed uses a packetization strategy designed on the basis of an uneven packet transmission rate (UPTR) that handles the packetization of interleaved video and audio data; it also uses piggybacking to provide redundancy if required. Losses can be injected either randomly or based on packet traces collected via PlanetLab. The processed videos are then pieced together side-by-side to give the viewpoint of a third party observing the video conversation from the site of the first speaker. Hence the first speaker will be observed to have a faster reaction time, without network delays, than the second speaker, who is simulated to be located at the remote end. The video of the second speaker will also reflect the degradations in perceptual quality induced by the network conditions, whereas the first speaker will be of perfect quality. Hence with the testbed, we are able to generate output videos for different operating points under the same network and conversational conditions and are thus able to make comparisons between two operating points. With the testbed in place, we demonstrate how it can be used to evaluate the effects of various parameters on the overall perceptual quality.
Lastly, we demonstrate the results of applying to a video conversation an existing efficient search algorithm used for estimating the perceptually optimal mouth-to-ear delay (MED) of a Voice-over-IP (VoIP) conversation. This is achieved by using the network simulator designed to conduct a series of subjective and objective tests to identify the perceptually optimal MED under specific network and conversational conditions.

    Virtual transcendence experiences: Exploring technical and design challenges in multi-sensory environments

    In this paper, we introduce the concept of Virtual Transcendence Experience (VTE) as a response to the interactions of several users sharing several immersive experiences through different media channels. For that, we review the current body of knowledge that has led to the development of a VTE system. This is followed by a discussion of current technical and design challenges that could support the implementation of this concept. This discussion has informed the VTE framework (VTEf), which integrates different layers of experiences, including the role of each user and the technical challenges involved. We conclude this paper with suggestions for two scenarios and recommendations for the implementation of a system that could support VTEs.