20 research outputs found

    Quality of experience in telemeetings and videoconferencing: a comprehensive survey

    Telemeetings such as audiovisual conferences or virtual meetings play an increasingly important role in our professional and private lives. For that reason, system developers and service providers will strive for an optimal experience for the user, while at the same time optimizing technical and financial resources. This leads to the discipline of Quality of Experience (QoE), an active field originating from the telecommunication and multimedia engineering domains, that strives for understanding, measuring, and designing the quality experience with multimedia technology. This paper provides the reader with an entry point to the large and still growing field of QoE of telemeetings, by taking a holistic perspective, considering both technical and non-technical aspects, and by focusing on current and near-future services. Addressing both researchers and practitioners, the paper first provides a comprehensive survey of factors and processes that contribute to the QoE of telemeetings, followed by an overview of relevant state-of-the-art methods for QoE assessment. To embed this knowledge into recent technology developments, the paper continues with an overview of current trends, focusing on the field of eXtended Reality (XR) applications for communication purposes. Given the complexity of telemeeting QoE and the current trends, new challenges for a QoE assessment of telemeetings are identified. To overcome these challenges, the paper presents a novel Profile Template for characterizing telemeetings from the holistic perspective endorsed in this paper

    The contrast effect: QoE of mixed video-qualities at the same time

    In desktop multi-party video conferencing, the video streams of participants are delivered in different qualities, but we know little about how such a composition of the screen affects the quality of experience. Do the different video streams serve as indirect quality references, so that the perceived quality of one stream depends on the other streams in the same session? How does the perceived quality of each stream relate to the perceived quality of the overall session? To answer these questions, we conducted a crowdsourcing study in which we gathered over 5000 perceived quality ratings of overall sessions and individual streams. Our results show a contrast effect: high quality streams are rated better when more low quality streams are co-present, and vice versa. In turn, the quality p

    A QoE study of different stream and layout configurations in video conferencing under limited network conditions

    One particular problem of QoE research in video conferencing is that most past research concentrated on one-to-one video conferencing or simply on video consumption. However, video conferencing between two people (one-to-one) differs from video conferencing within a group (multi-party). In particular, the limitations of one participant might affect the QoE of the whole group. This possible effect, however, is not well studied. Therefore, this paper aims to better understand the impact of individual limitations on the group's QoE. To do so, we present a study of different video stream configurations and layouts for multi-party conferencing with respect to individual network limitations. For this, we conduct a user study with 20 participants in 5 groups in a semi-controlled setup. Such a setup combines supervising participants locally with using our software infrastructure deployed on the Internet. Furthermore, we use an asymmetric experiment design that puts every participant under a different condition, as this provides a more realistic scenario. Within our study, we look at three different factors: layout, video quality, and network limitations. To foster conversation between participants, each group engaged in a discussion about different survival questions. Our findings show that packet loss and the resulting distortions have a greater impact on the QoE than reducing the video quality by lowering its resolution. Furthermore, our findings indicate that participants are more satisfied with a visually equal layout (showing participants at a similar size) and a more balanced stream configuration

    QoE Estimation of WebRTC-based Audio-visual Conversations from Facial and Speech Features

    The utilization of user’s facial- and speech-related features for the estimation of the Quality of Experience (QoE) of multimedia services is still underinvestigated despite its potential. Currently, only the use of either facial or speech features individually has been proposed, and relevant limited experiments have been performed. To advance in this respect, in this study, we focused on WebRTC-based videoconferencing, where it is often possible to capture both the facial expressions and vocal speech characteristics of the users. First, we performed thorough statistical analysis to identify the most significant facial- and speech-related features for QoE estimation, which we extracted from the participants’ audio-video data collected during a subjective assessment. Second, we trained individual QoE estimation machine learning-based models on the separated facial and speech datasets. Finally, we employed data fusion techniques to combine the facial and speech datasets into a single dataset to enhance the QoE estimation performance due to the integrated knowledge provided by the fusion of facial and speech features. The obtained results demonstrate that the data fusion technique based on the Improved Centered Kernel Alignment (ICKA) allows for reaching a mean QoE estimation accuracy of 0.93, whereas the values of 0.78 and 0.86 are reached when using only facial or speech features, respectively

    Flexible media transport framework based on service composition for future network

    This work introduces common guidelines defined by several standardization bodies for future networks, based on the current mechanisms and protocols used to handle multimedia data, most of which reside in the application layer of the OSI reference model.

    Multi-party holomeetings: toward a new era of low-cost volumetric holographic meetings in virtual reality

    Fueled by advances in multi-party communications, the adoption of increasingly mature immersive technologies, and the COVID-19 pandemic, a new wave of social virtual reality (VR) platforms has emerged to support socialization, interaction, and collaboration among multiple remote users who are integrated into shared virtual environments. Social VR aims to increase levels of (co-)presence and interaction quality by overcoming the limitations of 2D windowed representations in traditional multi-party video conferencing tools, although most existing solutions rely on 3D avatars to represent users. This article presents a social VR platform that supports real-time volumetric holographic representations of users that are based on point clouds captured by off-the-shelf RGB-D sensors, and it analyzes the platform’s potential for conducting interactive holomeetings (i.e., holoconferencing scenarios). This work evaluates such a platform’s performance and readiness for conducting meetings with up to four users, and it provides insights into aspects of the user experience when using single-camera and low-cost capture systems in scenarios with both frontal and side viewpoints. 
Overall, the obtained results confirm the platform’s maturity and the potential of holographic communications for conducting interactive multi-party meetings, even when using low-cost systems and single-camera capture systems in scenarios where users are sitting or have a limited translational movement along the X, Y, and Z axes within the 3D virtual environment (commonly known as 3 Degrees of Freedom plus, 3DoF+). The authors would like to thank the members of the EU H2020 VR-Together consortium for their valuable contributions, especially Marc Martos and Mohamad Hjeij for their support in developing and evaluating tasks. This work has been partially funded by: the EU’s Horizon 2020 program, under agreement nº 762111 (VR-Together project); by ACCIÓ (Generalitat de Catalunya), under agreement COMRDI18-1-0008 (ViVIM project); and by Cisco Research and the Silicon Valley Community Foundation, under the grant Extended Reality Multipoint Control Unit (ID: 1779376). The work by Mario Montagud has been additionally funded by Spain’s Agencia Estatal de Investigación under grant RYC2020-030679-I (AEI / 10.13039/501100011033) and by Fondo Social Europeo. The work of David Rincón was supported by Spain’s Agencia Estatal de Investigación within the Ministerio de Ciencia e Innovación under Project PID2019-108713RB-C51 MCIN/AEI/10.13039/501100011033.

    Network utility maximization for delay-sensitive applications in unknown communication settings

    In the last decades, Internet traffic has greatly evolved. The advent of new Internet services and applications has, in fact, led to a significant growth in the amount of data transmitted, as well as to a transformation of the data type. As a matter of fact, nowadays the largest share of traffic consists of multimedia data, which do not represent classical Internet data. Due to the increasing amount of traffic, network resources might be scarce, and in such cases it becomes extremely important to optimize network transmission in order to provide a satisfying service to the users. Although methods for maximizing the network utility in scenarios with limited resources have been studied extensively, the evolution of Internet services continuously poses new challenges that require novel solution methods to meet the transmission requirements. In this thesis we propose novel solution methods for network utility maximization problems that arise in the context of today's network communications. In particular, we analyze problems related to delay-sensitive Internet applications and rate allocation in unknown network settings. In the first problem we study how to effectively allocate the transmission rates in a multiparty videoconference system. The main contribution of this chapter is an approximate fast rate allocation method that is able to adapt quickly to changes in the videoconference conditions. This fast adaptation cannot be achieved with classical network utility maximization solving methods, as they are usually based on iterative approaches. In this case we leverage the particular structure of the problem to design a novel distributed solving method which proves to be very effective when compared to baseline solutions. The next problem that we address is the design of a congestion control algorithm for delay-sensitive applications. 
One of the main problems of existing delay-based congestion control algorithms is that they tend to achieve an extremely low throughput when competing against loss-based algorithms. In order to overcome this difficulty we propose a novel adaptive controller based on a bandit problem approach. The adaptive controller tries to infer how the network responds, in terms of the rate-delay pair at equilibrium, when the delay sensitivity of an underlying delay-based congestion control algorithm is changed. Once the network response is inferred, the controller selects the sensitivity that leads to the best trade-off between the transmitting rate and the experienced delay. In the final problem, we analyze the design of an overlay rate allocation system to be used when the amount of available network resources is not known and the user congestion feedback cannot be used as a valid signal to reach the optimal rate allocation. Such a scenario appears when an Internet application wants to maximize a certain utility metric but, at the same time, must operate using a specific congestion control algorithm that is completely unaware of the application utility. To solve this problem we design a distributed system that coordinates the users in order to perform active learning on the amount of network resources. Adopting such a method proves to be the key to an effective maximization of the long-term application utility for the entire system

    Faces in the Clouds: Long-Duration, Multi-User, Cloud-Assisted Video Conferencing

    Multi-user video conferencing is a ubiquitous technology. Increasingly, end-hosts in a conference are assisted by cloud-based servers that improve the quality of experience for end users. This paper evaluates the impact of strategies for placement of such servers on user experience and deployment cost. We consider scenarios based upon the Amazon EC2 infrastructure as well as future scenarios in which cloud instances can be located at a larger number of possible sites across the planet. We compare a number of possible strategies for choosing which cloud locations should host services and how traffic should be routed through them. Our study is driven by real data to create demand scenarios with realistic geographical user distributions and diurnal behaviour. We conclude that on the EC2 infrastructure a well-chosen static selection of servers performs well, but as more cloud locations become available a dynamic choice of servers becomes important

    Inter-Destination Multimedia Synchronization; Schemes, Use Cases and Standardization

    Traditionally, the media consumption model has been a passive and isolated activity. However, the advent of media streaming technologies, interactive social applications, and synchronous communications, as well as the convergence between these three developments, point to an evolution towards dynamic shared media experiences. In this new model, geographically distributed groups of consumers, independently of their location and the nature of their end-devices, can be immersed in a common virtual networked environment in which they can share multimedia services, interact and collaborate in real-time within the context of simultaneous media content consumption. In most of these multimedia services and applications, apart from the well-known intra- and inter-stream synchronization techniques that are important inside the consumers' playout devices, the synchronization of the playout processes between several distributed receivers, known as multipoint, group or Inter-destination multimedia synchronization (IDMS), also becomes essential. Due to the increasing popularity of social networking, this type of multimedia synchronization has gained in popularity in recent years. Although Social TV is perhaps the most prominent use case in which IDMS is useful, in this paper we present up to 19 use cases for IDMS, each one having its own synchronization requirements. Different approaches used in the (recent) past by researchers to achieve IDMS are described and compared. As further proof of the significance of IDMS nowadays, the IDMS standardization efforts of relevant organizations (such as ETSI TISPAN and the IETF AVTCORE Group), which define architectures and protocols and in which the authors have been and are actively participating, are summarized. This work has been financed, partially, by Universitat Politecnica de Valencia (UPV), under its R&D Support Program in PAID-05-11-002-331 Project and in PAID-01-10, and by TNO, under its Future Internet Use Research & Innovation Program. 
The authors also want to thank Kevin Gross for providing some of the use cases included in Sect. 1.2. Montagud, M.; Boronat Segui, F.; Stokking, H.; Van Brandenburg, R. (2012). Inter-Destination Multimedia Synchronization; Schemes, Use Cases and Standardization. 