2,310 research outputs found
Machine Learning for Multimedia Communications
Machine learning is revolutionizing the way multimedia information is processed and transmitted to users. After intensive training, impressive efficiency and accuracy improvements have been made across the transmission pipeline. For example, the high model capacity of learning-based architectures enables accurate modeling of image and video behavior, such that tremendous compression gains can be achieved. Similarly, error concealment, streaming strategy, and even user perception modeling have widely benefited from recent learning-oriented developments. However, learning-based algorithms often imply drastic changes to the way data are represented or consumed, meaning that the overall pipeline can be affected even though only a subpart of it is optimized. In this paper, we review the recent major advances that have been proposed all across the transmission chain, and we discuss their potential impact and the research challenges that they raise.
Towards QoE-Driven Optimization of Multi-Dimensional Content Streaming
Whereas adaptive video streaming for 2D video is well established and frequently used in streaming services, adaptation for emerging higher-dimensional content, such as point clouds, is still a research issue. Moreover, how to optimize resource usage in streaming services that support multiple content types of different dimensions and levels of interactivity has so far not been sufficiently studied. Learning-based approaches aim to optimize the streaming experience according to user needs. They predict quality metrics and try to find the system parameters that maximize them given the current network conditions. With this paper, we show how to approach content and network adaptation driven by Quality of Experience (QoE) for multi-dimensional content. We describe the components required to create a system that adapts multiple streams of different content types simultaneously, identify research gaps, and propose potential next steps.
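The idea of choosing system parameters that maximize predicted quality under the current network conditions can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function name `pick_bitrates`, the stream names, the bitrate options, and the QoE scores (which stand in for the output of some learned predictor).

```python
from itertools import product


def pick_bitrates(streams, bandwidth_kbps):
    """Pick one bitrate per stream to maximize the summed predicted QoE.

    streams: dict mapping a stream name to a list of
             (bitrate_kbps, predicted_qoe) options (hypothetical values).
    Returns (best_total_qoe, {name: bitrate_kbps}), or (None, {}) when no
    combination fits within the bandwidth budget.
    """
    names = list(streams)
    best_score, best_choice = None, {}
    # Exhaustive search: fine for a handful of streams, not for large systems.
    for combo in product(*(streams[n] for n in names)):
        total_rate = sum(rate for rate, _ in combo)
        if total_rate > bandwidth_kbps:
            continue  # this combination exceeds the available bandwidth
        score = sum(qoe for _, qoe in combo)
        if best_score is None or score > best_score:
            best_score = score
            best_choice = {n: rate for n, (rate, _) in zip(names, combo)}
    return best_score, best_choice


# Hypothetical options for a 2D video stream and a point-cloud stream.
streams = {
    "video_2d": [(1000, 3.0), (3000, 4.0), (6000, 4.5)],
    "point_cloud": [(4000, 2.5), (8000, 3.8), (16000, 4.6)],
}
score, choice = pick_bitrates(streams, bandwidth_kbps=12000)
print(choice)
```

With these made-up numbers, the search gives the middle quality level to both streams rather than maximizing one at the expense of the other. A real system would replace the fixed scores with a QoE model and the exhaustive search with something that scales.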
User-Adaptive Editing for 360 degree Video Streaming with Deep Reinforcement Learning
The development of 360° video streaming is persistently hindered by how much bandwidth it requires. Spatially adapting the quality of the sphere to the user's Field of View (FoV) lowers the data rate but requires keeping the playback buffer small, predicting the user's motion, or making replacements to keep the buffered qualities up to date with the moving FoV, all three being uncertain and risky. We have previously shown that opportunistically regaining control of the FoV with active attention-driving techniques provides additional levers to ease streaming and improve Quality of Experience (QoE). Deep neural networks have recently been shown to achieve the best performance for video streaming adaptation and head motion prediction. This demo presents a step forward in the investigation of deep neural network approaches to obtain user-adaptive and network-adaptive 360° video streaming systems. In this demo, we show how snap-changes, an attention-driving technique, can be automatically modulated by the user's motion to improve the streaming QoE. The control of snap-changes is performed by a deep neural network trained on head motion traces with the Deep Reinforcement Learning strategy A3C.
BOLA360: Near-optimal View and Bitrate Adaptation for 360-degree Video Streaming
Recent advances in omnidirectional cameras and AR/VR headsets have spurred
the adoption of 360-degree videos that are widely believed to be the future of
online video streaming. 360-degree videos allow users to wear a head-mounted
display (HMD) and experience the video as if they are physically present in the
scene. Streaming high-quality 360-degree videos at scale is an unsolved problem
that is more challenging than traditional (2D) video delivery. The data rate
required to stream 360-degree videos is an order of magnitude more than
traditional videos. Further, the penalty for rebuffering events where the video
freezes or displays a blank screen is more severe as it may cause
cybersickness. We propose an online adaptive bitrate (ABR) algorithm for
360-degree videos called BOLA360 that runs inside the client's video player and
orchestrates the download of video segments from the server so as to maximize
the quality-of-experience (QoE) of the user. BOLA360 conserves bandwidth by
downloading only those video segments that are likely to fall within the
field-of-view (FOV) of the user. In addition, BOLA360 continually adapts the
bitrate of the downloaded video segments so as to enable a smooth playback
without rebuffering. We prove that BOLA360 is near-optimal with respect to an
optimal offline algorithm that maximizes QoE. Further, we evaluate BOLA360 on a
wide range of network and user head movement profiles and show that it provides
more QoE than state-of-the-art algorithms. While ABR
algorithms for traditional (2D) videos have been well-studied over the last
decade, our work is the first ABR algorithm for 360-degree videos with both
theoretical and empirical guarantees on its performance. Comment: 25 pages
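The two ideas in this abstract, downloading only the parts of the sphere likely to fall within the FOV and adapting their bitrate to the available bandwidth, can be illustrated with a simple greedy sketch. This is not the BOLA360 algorithm and carries none of its guarantees; the tile names, view probabilities, bitrate ladder, probability threshold, and logarithmic utility are all assumptions chosen for illustration.

```python
import math


def allocate_tiles(view_prob, ladder_kbps, budget_kbps, min_prob=0.1):
    """Greedy FoV-weighted bitrate allocation for one segment (illustrative).

    view_prob: dict tile -> assumed probability that the tile is in the FoV.
    ladder_kbps: ascending list of available per-tile bitrates.
    Returns dict tile -> chosen bitrate; tiles left out are not downloaded.
    """
    # Consider only tiles likely to be viewed, most likely first.
    tiles = sorted((t for t, p in view_prob.items() if p >= min_prob),
                   key=lambda t: -view_prob[t])
    level, spent = {}, 0
    # Give each candidate tile the lowest bitrate while the budget allows.
    for t in tiles:
        if spent + ladder_kbps[0] <= budget_kbps:
            level[t] = 0
            spent += ladder_kbps[0]
    # Repeatedly apply the upgrade with the best weighted gain per extra kbps.
    while True:
        best, best_ratio = None, 0.0
        for t, lv in level.items():
            if lv + 1 >= len(ladder_kbps):
                continue  # already at the top of the ladder
            extra = ladder_kbps[lv + 1] - ladder_kbps[lv]
            if spent + extra > budget_kbps:
                continue  # upgrade does not fit in the remaining budget
            gain = view_prob[t] * math.log(ladder_kbps[lv + 1] / ladder_kbps[lv])
            if gain / extra > best_ratio:
                best, best_ratio = t, gain / extra
        if best is None:
            break
        lv = level[best]
        spent += ladder_kbps[lv + 1] - ladder_kbps[lv]
        level[best] = lv + 1
    return {t: ladder_kbps[lv] for t, lv in level.items()}


# Hypothetical four-tile layout and head-motion prediction.
probs = {"front": 0.9, "left": 0.4, "right": 0.3, "back": 0.05}
plan = allocate_tiles(probs, ladder_kbps=[500, 1000, 2000], budget_kbps=4000)
print(plan)
```

In this example the unlikely "back" tile is skipped entirely and the most probable tile receives the highest bitrate. An online algorithm such as the one described above would additionally account for the playback buffer level to avoid rebuffering, which this sketch ignores.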
Scalable Multiuser Immersive Communications with Multi-numerology and Mini-slot
This paper studies multiuser immersive communications networks in which
different user equipment may demand various extended reality (XR) services. In
such heterogeneous networks, time-frequency resource allocation needs to be
more adaptive since XR services are usually multi-modal and latency-sensitive.
To this end, we develop a scalable time-frequency resource allocation method
based on multi-numerology and mini-slot. To appropriately determine the
discrete parameters of multi-numerology and mini-slot for multiuser immersive
communications, the proposed method first presents a novel flexible
time-frequency resource block configuration, then it leverages deep
reinforcement learning to maximize the total quality-of-experience (QoE) under
different users' QoE constraints. The results confirm the efficiency and
scalability of the proposed time-frequency resource allocation method.
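For readers unfamiliar with the discrete parameters mentioned here, the following sketch shows how a numerology index and a mini-slot length translate into time-frequency granularity in 5G NR. It assumes the standard NR relations (subcarrier spacing of 15 kHz scaled by a power of two, a 1 ms subframe, and 14 OFDM symbols per slot with normal cyclic prefix) and approximates mini-slot duration by ignoring cyclic-prefix length differences between symbols. The function names are hypothetical, and the configuration space actually used in the paper may differ.

```python
def numerology(mu):
    """Return (subcarrier spacing in kHz, slot duration in ms) for index mu."""
    scs_khz = 15 * 2 ** mu      # 15, 30, 60, 120 kHz for mu = 0, 1, 2, 3
    slot_ms = 1.0 / 2 ** mu     # a 1 ms subframe holds 2**mu slots
    return scs_khz, slot_ms


def mini_slot_ms(mu, n_symbols, symbols_per_slot=14):
    """Approximate duration of a mini-slot spanning n_symbols OFDM symbols."""
    _, slot_ms = numerology(mu)
    return slot_ms * n_symbols / symbols_per_slot


# Granularity options an allocator could choose between (illustrative).
for mu in range(4):
    scs, slot = numerology(mu)
    minis = [round(mini_slot_ms(mu, n), 4) for n in (2, 4, 7)]
    print(f"mu={mu}: {scs} kHz, slot {slot} ms, mini-slots {minis} ms")
```

Higher numerologies and shorter mini-slots shorten the scheduling interval, which suits latency-sensitive XR traffic, at the cost of wider subcarriers. This trade-off is what makes the parameter selection a discrete optimization problem.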