231 research outputs found
Spherical clustering of users navigating 360{\deg} content
In Virtual Reality (VR) applications, understanding how users explore the
omnidirectional content is important to optimize content creation, to develop
user-centric services, or even to detect disorders in medical applications.
Clustering users based on their common navigation patterns is a first direction
to understand users behaviour. However, classical clustering techniques fail in
identifying these common paths, since they are usually focused on minimizing a
simple distance metric. In this paper, we argue that minimizing the distance
metric does not necessarily guarantee to identify users that experience similar
navigation path in the VR domain. Therefore, we propose a graph-based method to
identify clusters of users who are attending the same portion of the spherical
content over time. The proposed solution takes into account the spherical
geometry of the content and aims at clustering users based on the actual
overlap of displayed content among users. Our method is tested on real VR user
navigation patterns. Results show that our solution leads to clusters in which
at least 85% of the content displayed by one user is shared among the other
users belonging to the same cluster.Comment: 5 pages, conference (Published in: ICASSP 2019 - 2019 IEEE
International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Bridge the Gap Between VQA and Human Behavior on Omnidirectional Video: A Large-Scale Dataset and a Deep Learning Model
Omnidirectional video enables spherical stimuli with the viewing range. Meanwhile, only the viewport region of omnidirectional
video can be seen by the observer through head movement (HM), and an even
smaller region within the viewport can be clearly perceived through eye
movement (EM). Thus, the subjective quality of omnidirectional video may be
correlated with HM and EM of human behavior. To fill in the gap between
subjective quality and human behavior, this paper proposes a large-scale visual
quality assessment (VQA) dataset of omnidirectional video, called VQA-OV, which
collects 60 reference sequences and 540 impaired sequences. Our VQA-OV dataset
provides not only the subjective quality scores of sequences but also the HM
and EM data of subjects. By mining our dataset, we find that the subjective
quality of omnidirectional video is indeed related to HM and EM. Hence, we
develop a deep learning model, which embeds HM and EM, for objective VQA on
omnidirectional video. Experimental results show that our model significantly
improves the state-of-the-art performance of VQA on omnidirectional video.Comment: Accepted by ACM MM 201
Streaming and User Behaviour in Omnidirectional Videos
Omnidirectional videos (ODVs) have gone beyond the passive paradigm of traditional video,
offering higher degrees of immersion and interaction. The revolutionary novelty of this technology is the possibility for users to interact with the surrounding environment, and to feel a
sense of engagement and presence in a virtual space. Users are clearly the main driving force of
immersive applications and consequentially the services need to be properly tailored to them.
In this context, this chapter highlights the importance of the new role of users in ODV streaming applications, and thus the need for understanding their behaviour while navigating within
ODVs. A comprehensive overview of the research efforts aimed at advancing ODV streaming
systems is also presented. In particular, the state-of-the-art solutions under examination in this
chapter are distinguished in terms of system-centric and user-centric streaming approaches: the
former approach comes from a quite straightforward extension of well-established solutions for
the 2D video pipeline while the latter one takes the benefit of understanding usersâ behaviour
and enable more personalised ODV streaming
Machine Learning for Multimedia Communications
Machine learning is revolutionizing the way multimedia information is processed and transmitted to users. After intensive and powerful training, some impressive efficiency/accuracy improvements have been made all over the transmission pipeline. For example, the high model capacity of the learning-based architectures enables us to accurately model the image and video behavior such that tremendous compression gains can be achieved. Similarly, error concealment, streaming strategy or even user perception modeling have widely benefited from the recent learningoriented developments. However, learning-based algorithms often imply drastic changes to the way data are represented or consumed, meaning that the overall pipeline can be affected even though a subpart of it is optimized. In this paper, we review the recent major advances that have been proposed all across the transmission chain, and we discuss their potential impact and the research challenges that they raise
Do Users Behave Similarly in VR? Investigation of the User Influence on the System Design
With the overarching goal of developing user-centric Virtual Reality (VR) systems, a new wave of studies focused on understanding how users interact in VR environments has recently emerged. Despite the intense efforts, however, current literature still does not provide the right framework to fully interpret and predict usersâ trajectories while navigating in VR scenes. This work advances the state-of-the-art on both the study of usersâ behaviour in VR and the user-centric system design. In more detail, we complement current datasets by presenting a publicly available dataset that provides navigation trajectories acquired for heterogeneous omnidirectional videos and different viewing platformsânamely, head-mounted display, tablet, and laptop. We then present an exhaustive analysis on the collected data to better understand navigation in VR across users, content, and, for the first time, across viewing platforms. The novelty lies in the user-affinity metric, proposed in this work to investigate usersâ similarities when navigating within the content. The analysis reveals useful insights on the effect of device and content on the navigation, which could be precious considerations from the system design perspective. As a case study of the importance of studying usersâ behaviour when designing VR systems, we finally propose a user-centric server optimisation. We formulate an integer linear program that seeks the best stored set of omnidirectional content that minimises encoding and storage cost while maximising the userâs experience. This is posed while taking into account network dynamics, type of video content, and also user population interactivity. Experimental results prove that our solution outperforms common company recommendations in terms of experienced quality but also in terms of encoding and storage, achieving a savings up to 70%. More importantly, we highlight a strong correlation between the storage cost and the user-affinity metric, showing the impact of the latter in the system architecture design
- âŠ