53 research outputs found
Machine Learning for Multimedia Communications
Machine learning is revolutionizing the way multimedia information is processed and transmitted to users. After intensive and powerful training, some impressive efficiency/accuracy improvements have been made all over the transmission pipeline. For example, the high model capacity of the learning-based architectures enables us to accurately model the image and video behavior such that tremendous compression gains can be achieved. Similarly, error concealment, streaming strategy or even user perception modeling have widely benefited from the recent learning-oriented developments. However, learning-based algorithms often imply drastic changes to the way data are represented or consumed, meaning that the overall pipeline can be affected even though a subpart of it is optimized. In this paper, we review the recent major advances that have been proposed all across the transmission chain, and we discuss their potential impact and the research challenges that they raise.
BOLA360: Near-optimal View and Bitrate Adaptation for 360-degree Video Streaming
Recent advances in omnidirectional cameras and AR/VR headsets have spurred
the adoption of 360-degree videos that are widely believed to be the future of
online video streaming. 360-degree videos allow users to wear a head-mounted
display (HMD) and experience the video as if they are physically present in the
scene. Streaming high-quality 360-degree videos at scale is an unsolved problem
that is more challenging than traditional (2D) video delivery. The data rate
required to stream 360-degree videos is an order of magnitude more than
traditional videos. Further, the penalty for rebuffering events where the video
freezes or displays a blank screen is more severe as it may cause
cybersickness. We propose an online adaptive bitrate (ABR) algorithm for
360-degree videos called BOLA360 that runs inside the client's video player and
orchestrates the download of video segments from the server so as to maximize
the quality-of-experience (QoE) of the user. BOLA360 conserves bandwidth by
downloading only those video segments that are likely to fall within the
field-of-view (FOV) of the user. In addition, BOLA360 continually adapts the
bitrate of the downloaded video segments so as to enable a smooth playback
without rebuffering. We prove that BOLA360 is near-optimal with respect to an
optimal offline algorithm that maximizes QoE. Further, we evaluate BOLA360 on a
wide range of network and user head movement profiles and show that it provides
more QoE than state-of-the-art algorithms. While ABR
algorithms for traditional (2D) videos have been well-studied over the last
decade, our work is the first ABR algorithm for 360-degree videos with both
theoretical and empirical guarantees on its performance.
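The idea of downloading only tiles likely to fall within the FOV, at bitrates that fit the available bandwidth, can be illustrated with a minimal sketch. This is not the BOLA360 algorithm, whose decision rule is not given in the abstract. It is a plain greedy allocation under assumptions made for illustration only: per-tile view probabilities are given, the bitrate ladder is sorted in ascending order, a per-segment bandwidth budget is known, and utility is the hypothetical function log(1 + bitrate). The names `allocate_tile_bitrates`, `view_prob`, `bitrates`, and `budget` are all illustrative.

```python
from math import log
from typing import List


def allocate_tile_bitrates(view_prob: List[float],
                           bitrates: List[float],
                           budget: float) -> List[int]:
    """Greedy FOV-aware allocation (illustrative, not BOLA360).

    view_prob[t] is the assumed probability that tile t is in the FOV.
    bitrates is an ascending ladder; budget caps the summed bitrate.
    Returns, per tile, an index into bitrates, or -1 if the tile is skipped.
    """
    choice = [-1] * len(view_prob)
    spent = 0.0
    while True:
        best_gain, best_tile = 0.0, None
        for t, p in enumerate(view_prob):
            nxt = choice[t] + 1
            if nxt >= len(bitrates):
                continue  # tile is already at the top of the ladder
            prev_rate = bitrates[choice[t]] if choice[t] >= 0 else 0.0
            extra = bitrates[nxt] - prev_rate
            if spent + extra > budget:
                continue  # upgrade does not fit in the budget
            # Expected utility gain per extra unit of bitrate
            # (hypothetical utility: log(1 + bitrate)).
            gain = p * (log(1 + bitrates[nxt]) - log(1 + prev_rate)) / extra
            if gain > best_gain:
                best_gain, best_tile = gain, t
        if best_tile is None:
            return choice  # no affordable upgrade with positive gain
        prev = bitrates[choice[best_tile]] if choice[best_tile] >= 0 else 0.0
        choice[best_tile] += 1
        spent += bitrates[choice[best_tile]] - prev


if __name__ == "__main__":
    # Three tiles: very likely, unlikely, and never in the FOV.
    print(allocate_tile_bitrates([0.9, 0.1, 0.0], [1.0, 2.0, 4.0], budget=5.0))
```

Tiles with zero view probability are never downloaded in this sketch, and likely tiles are upgraded first. A real player would also account for buffer level and rebuffering risk, which this sketch ignores.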
Objective assessment of region of interest-aware adaptive multimedia streaming quality
Adaptive multimedia streaming relies on controlled adjustment of content bitrate, and the consequent variation in video quality, in order to meet the bandwidth constraints of the communication link used for content delivery to the end-user. The values of the easy-to-measure network-related Quality of Service metrics have no direct relationship with the way moving images are perceived by the human viewer. Consequently, variations in the video stream bitrate are not clearly linked to similar variations in the user-perceived quality. This is especially true if some human visual system-based adaptation techniques are employed. As research has shown, there are certain image regions in each frame of a video sequence in which users are more interested than in others. This paper presents the Region of Interest-based Adaptive Scheme (ROIAS), which adjusts the regions within each frame of the streamed multimedia content differently, based on the user interest in them. ROIAS is presented and discussed in terms of the adjustment algorithms employed and their impact on the human-perceived video quality. Comparisons with existing approaches, including a constant quality adaptation scheme across the whole frame area, are performed employing two objective metrics which estimate user-perceived video quality.
Joint optimization of bitrate selection and beamforming for holographic video cooperative streaming in VLC systems
Holographic video streaming requires ultrahigh channel capacity, which might not be achieved by the existing radio frequency-based wireless networks. To address this challenge, we propose a holographic video cooperative streaming framework by integrating coordinated multipoint transmission and beamforming technologies in visible light communication (VLC) systems. This framework enables simultaneous video streaming with an ultrahigh data rate for multiple users in the VLC system, resulting in a more efficient and effective streaming process. By mathematically modeling the streaming framework, we formulate a joint bitrate selection and beamforming problem, aiming to maximize the average video quality experienced by all users. The problem is a non-convex mixed-integer problem and is NP-hard in general. We propose an algorithm with polynomial time complexity for the problem using an alternative optimization technique along with an appropriate rounding operation. Numerical results demonstrate the superiority of the proposed joint bitrate selection and beamforming solution over baselines.
Do Users Behave Similarly in VR? Investigation of the User Influence on the System Design
With the overarching goal of developing user-centric Virtual Reality (VR) systems, a new wave of studies focused on understanding how users interact in VR environments has recently emerged. Despite the intense efforts, however, current literature still does not provide the right framework to fully interpret and predict users' trajectories while navigating in VR scenes. This work advances the state-of-the-art on both the study of users' behaviour in VR and the user-centric system design. In more detail, we complement current datasets by presenting a publicly available dataset that provides navigation trajectories acquired for heterogeneous omnidirectional videos and different viewing platforms, namely head-mounted display, tablet, and laptop. We then present an exhaustive analysis on the collected data to better understand navigation in VR across users, content, and, for the first time, across viewing platforms. The novelty lies in the user-affinity metric, proposed in this work to investigate users' similarities when navigating within the content. The analysis reveals useful insights on the effect of device and content on the navigation, which could be valuable considerations from the system design perspective. As a case study of the importance of studying users' behaviour when designing VR systems, we finally propose a user-centric server optimisation. We formulate an integer linear program that seeks the best stored set of omnidirectional content that minimises encoding and storage cost while maximising the user's experience. This is posed while taking into account network dynamics, type of video content, and also user population interactivity. Experimental results prove that our solution outperforms common company recommendations in terms of experienced quality but also in terms of encoding and storage, achieving savings of up to 70%.
More importantly, we highlight a strong correlation between the storage cost and the user-affinity metric, showing the impact of the latter in the system architecture design.
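The trade-off behind choosing a stored set of representations can be illustrated on a toy instance. The sketch below is not the paper's integer linear program, whose variables and constraints are not given in the abstract. It replaces the ILP with an exhaustive search over subsets, which is only practical for a handful of candidates, and it assumes a hypothetical quality function log(1 + bitrate), user classes described by a population share and a bandwidth, and a hypothetical `weight` that trades quality against storage cost. All names are illustrative.

```python
from itertools import combinations
from math import log
from typing import List, Sequence, Tuple


def best_stored_set(bitrates: Sequence[float],
                    storage_cost: Sequence[float],
                    user_classes: List[Tuple[float, float]],
                    weight: float) -> Tuple[float, Tuple[int, ...]]:
    """Exhaustive search over subsets of candidate representations.

    user_classes holds (population_share, bandwidth) pairs. Each class
    plays the highest stored bitrate that fits its bandwidth, and gets
    zero quality if nothing fits. Returns (best score, chosen indices).
    """
    best_score, best_subset = float("-inf"), ()
    indices = range(len(bitrates))
    for k in range(len(bitrates) + 1):
        for subset in combinations(indices, k):
            quality = 0.0
            for share, bandwidth in user_classes:
                playable = [bitrates[i] for i in subset
                            if bitrates[i] <= bandwidth]
                if playable:
                    # Hypothetical quality model: log(1 + bitrate).
                    quality += share * log(1 + max(playable))
            cost = sum(storage_cost[i] for i in subset)
            score = quality - weight * cost
            if score > best_score:
                best_score, best_subset = score, subset
    return best_score, best_subset


if __name__ == "__main__":
    score, chosen = best_stored_set(
        bitrates=[1.0, 4.0, 8.0],
        storage_cost=[1.0, 4.0, 8.0],
        user_classes=[(0.5, 2.0), (0.5, 10.0)],
        weight=0.01,
    )
    print(chosen, round(score, 3))
```

In this toy instance the middle representation is dropped because neither user class benefits from it enough to justify its storage cost. At realistic scale the same objective would be handed to an ILP solver instead of enumerated.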