7 research outputs found
Contextual bandit learning-based viewport prediction for 360 video
Accurately predicting where the user of a Virtual Reality (VR) application will be looking in the near future improves the perceived quality of services such as adaptive tile-based streaming or personalized online training. However, because user behavior is unpredictable and differs between users, this remains a major challenge. In this work, we propose to use reinforcement learning, in particular contextual bandits, to solve this problem. The proposed solution tackles the prediction in two stages: (1) detection of movement; (2) prediction of direction. To demonstrate its potential for VR services, the method was deployed on an adaptive tile-based VR streaming testbed and benchmarked against a 3D trajectory extrapolation approach. Our results showed a significant improvement in prediction error compared to the benchmark. This reduced prediction error also resulted in an enhancement of the perceived video quality.
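The two-stage idea (first decide whether the head will move, then decide in which direction) can be illustrated with a minimal sketch. The abstract does not specify the bandit algorithm, the context features, or the reward, so everything below is an assumption for illustration: an epsilon-greedy bandit with one linear reward model per arm, a hypothetical class name `EpsilonGreedyContextualBandit`, and a hypothetical helper `predict_viewport_shift`.

```python
import random


class EpsilonGreedyContextualBandit:
    """Illustrative contextual bandit: one linear reward model per arm,
    trained by stochastic gradient steps, with epsilon-greedy exploration.
    This is an assumed stand-in, not the algorithm used in the paper."""

    def __init__(self, n_arms, n_features, epsilon=0.1, lr=0.05, seed=0):
        self.epsilon = epsilon
        self.lr = lr
        self.rng = random.Random(seed)
        self.weights = [[0.0] * n_features for _ in range(n_arms)]

    def _score(self, arm, context):
        # Estimated reward of an arm for the given context vector.
        return sum(w * x for w, x in zip(self.weights[arm], context))

    def select(self, context):
        # Explore with probability epsilon, otherwise pick the best-scoring arm.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.weights))
        scores = [self._score(a, context) for a in range(len(self.weights))]
        return scores.index(max(scores))

    def update(self, arm, context, reward):
        # Move the chosen arm's estimate towards the observed reward.
        error = reward - self._score(arm, context)
        self.weights[arm] = [
            w + self.lr * error * x for w, x in zip(self.weights[arm], context)
        ]


def predict_viewport_shift(move_bandit, direction_bandit, context):
    """Stage 1: detect movement (arm 0 = stay, arm 1 = move).
    Stage 2: only if movement is predicted, choose a direction arm."""
    if move_bandit.select(context) == 0:
        return 0, None
    return 1, direction_bandit.select(context)
```

In such a setup the context could hold recent head-orientation samples and the reward could reflect whether the predicted viewport overlapped the actual one; both choices are assumptions, not details taken from the paper.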
Application-level performance of cross-layer scheduling for social VR in 5G
Social VR aims at enabling people located at different places to communicate and interact with each other in a natural way. It poses extremely strong throughput and latency requirements on the underlying communication networks. This paper investigates the potential of using cross-layer design approaches for radio access scheduling in order to realize these challenging requirements in (beyond) 5G networks. In particular, we provide an in-depth simulation study of the performance/capacity gains that can be achieved by exploiting the end-to-end latency budget and/or video frame type as cross-layer information in the scheduling decisions, and show how the benefits depend on the actual social VR scenario. This study further reveals the importance of using application-level metrics such as PSNR or SSIM rather than traditional network-level metrics like the packet drop rate in the performance assessment.
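To make the contrast between the two kinds of metric concrete, the following sketch computes PSNR from its standard definition, 10 · log10(MAX² / MSE), next to a plain packet drop rate. The function names `psnr` and `packet_drop_rate`, the flat-list pixel representation, and the 8-bit default peak value are assumptions for illustration, not details from the paper.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized
    sequences of pixel values (application-level metric)."""
    if not reference or len(reference) != len(distorted):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames: no distortion
    return 10.0 * math.log10(max_value ** 2 / mse)


def packet_drop_rate(sent, dropped):
    """Fraction of packets dropped (network-level metric)."""
    return dropped / sent if sent else 0.0
```

The same drop rate can correspond to very different PSNR values depending on which video frames the lost packets belonged to, which is one reason an application-level metric can be more informative.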
Human-centric quality management of immersive multimedia applications
Augmented Reality (AR) and Virtual Reality (VR) multimodal systems are the latest trend within the field of multimedia. As they emulate the senses by means of omni-directional visuals, 360-degree sound, motion tracking and touch simulation, they are able to create a strong feeling of presence and interaction with the virtual environment. These experiences can be applied to virtual training (Industry 4.0), tele-surgery (healthcare) or remote learning (education). However, given the strong time and task sensitivity of these applications, it is of great importance to sustain the end-user quality, i.e. the Quality-of-Experience (QoE), at all times. Lack of synchronization and quality degradation need to be reduced to a minimum to avoid feelings of cybersickness or loss of immersiveness and concentration. This means that there is a need to shift the quality management from system-centered performance metrics towards a more human, QoE-centered approach. However, this requires novel techniques in the three areas of the QoE-management loop (monitoring, modelling and control). This position paper identifies open areas of research to fully enable human-centric management of immersive multimedia. To this end, four main dimensions are put forward: (1) Task and well-being driven subjective assessment; (2) Real-time QoE modelling; (3) Accurate viewport prediction; (4) Machine Learning (ML)-based quality optimization and content recreation. This paper discusses the state-of-the-art and provides possible solutions to tackle the open challenges.
Federated Prompt-based Decision Transformer for Customized VR Services in Mobile Edge Computing System
This paper investigates resource allocation to provide heterogeneous users with customized virtual reality (VR) services in a mobile edge computing (MEC) system. We first introduce a quality of experience (QoE) metric to measure user experience, which considers the MEC system's latency, user attention levels, and preferred resolutions. Then, a QoE maximization problem is formulated for resource allocation to ensure the highest possible user experience, which is cast as a reinforcement learning problem, aiming to learn a generalized policy applicable across diverse user environments for all MEC servers. To learn the generalized policy, we propose a framework that employs federated learning (FL) and prompt-based sequence modeling to pre-train a common decision model across MEC servers, which is named FedPromptDT. Using FL solves the problem of insufficient local MEC data while protecting user privacy during offline training. The design of prompts integrating user-environment cues and user-preferred allocation improves the model's adaptability to various user environments during online execution.