2,233 research outputs found

Streaming and User Behaviour in Omnidirectional Videos

Author: Guedes Alan
Rossi Silvia
Toni Laura
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 29/09/2022

Omnidirectional videos (ODVs) have gone beyond the passive paradigm of traditional video, offering higher degrees of immersion and interaction. The revolutionary novelty of this technology is the possibility for users to interact with the surrounding environment, and to feel a sense of engagement and presence in a virtual space. Users are clearly the main driving force of immersive applications, and consequently the services need to be properly tailored to them. In this context, this chapter highlights the importance of the new role of users in ODV streaming applications, and thus the need for understanding their behaviour while navigating within ODVs. A comprehensive overview of the research efforts aimed at advancing ODV streaming systems is also presented. In particular, the state-of-the-art solutions under examination in this chapter are distinguished in terms of system-centric and user-centric streaming approaches: the former is a fairly straightforward extension of well-established solutions for the 2D video pipeline, while the latter benefits from understanding users' behaviour and enables more personalised ODV streaming.

UCL Discovery

Detecting and removing visual distractors for video aesthetic enhancement

Author: Hu Shi-Min
Li Rui-Long
Wang Jue
Wu Xian
Zhang Fang-Lue
Zheng Zhao-Heng
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/08/2018

Personal videos often contain visual distractors, which are objects that are accidentally captured that can distract viewers from focusing on the main subjects. We propose a method to automatically detect and localize these distractors through learning from a manually labeled dataset. To achieve spatially and temporally coherent detection, we propose extracting features at the Temporal-Superpixel (TSP) level using a traditional SVM-based learning framework. We also experiment with end-to-end learning using Convolutional Neural Networks (CNNs), which achieves slightly higher performance than other methods. The classification result is further refined in a post-processing step based on graph-cut optimization. Experimental results show that our method achieves an accuracy of 81% and a recall of 86%. We demonstrate several ways of removing the detected distractors to improve the video quality, including video hole filling; video frame replacement; and camera path re-planning. The user study results show that our method can significantly improve the aesthetic quality of videos

Crossref

Online Research @ Cardiff

Human-centric quality management of immersive multimedia applications

Author: De Turck Filip
Torres Vega Maria
Van Damme Sam
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020

Augmented Reality (AR) and Virtual Reality (VR) multimodal systems are the latest trend within the field of multimedia. As they emulate the senses by means of omni-directional visuals, 360-degree sound, motion tracking and touch simulation, they are able to create a strong feeling of presence and interaction with the virtual environment. These experiences can be applied for virtual training (Industry 4.0), tele-surgery (healthcare) or remote learning (education). However, given the strong time and task sensitivity of these applications, it is of great importance to sustain the end-user quality, i.e. the Quality-of-Experience (QoE), at all times. Lack of synchronization and quality degradation need to be reduced to a minimum to avoid feelings of cybersickness or loss of immersiveness and concentration. This means that there is a need to shift the quality management from system-centered performance metrics towards a more human, QoE-centered approach. However, this requires novel techniques in the three areas of the QoE-management loop (monitoring, modelling and control). This position paper identifies open areas of research to fully enable human-centric driven management of immersive multimedia. To this end, four main dimensions are put forward: (1) Task and well-being driven subjective assessment; (2) Real-time QoE modelling; (3) Accurate viewport prediction; (4) Machine Learning (ML)-based quality optimization and content recreation. This paper discusses the state-of-the-art, and provides possible solutions to tackle the open challenges.

Crossref

Ghent University Academic Bibliography

Archivsystem Ask23

Machine Learning for Multimedia Communications

Author: Maugey T
Thomos N
Toni L
Publication venue: 'MDPI AG'
Publication date: 21/01/2022

Machine learning is revolutionizing the way multimedia information is processed and transmitted to users. After intensive and powerful training, some impressive efficiency/accuracy improvements have been made all over the transmission pipeline. For example, the high model capacity of the learning-based architectures enables us to accurately model the image and video behavior such that tremendous compression gains can be achieved. Similarly, error concealment, streaming strategy or even user perception modeling have widely benefited from the recent learning-oriented developments. However, learning-based algorithms often imply drastic changes to the way data are represented or consumed, meaning that the overall pipeline can be affected even though a subpart of it is optimized. In this paper, we review the recent major advances that have been proposed all across the transmission chain, and we discuss their potential impact and the research challenges that they raise

UCL Discovery

Efficient Video Quality Assessment Based on Spacetime Texture Representation

Author: Kevin Cannons
Peng Peng
Ze-nian Li

Most existing video quality metrics measure temporal distortions based on optical-flow estimation, which typically has limited descriptive power of visual dynamics and low efficiency. This paper presents a unified and efficient framework to measure temporal distortions based on a spacetime texture representation of motion. We first propose an effective motion-tuning scheme to capture temporal distortions along motion trajectories by exploiting the distributive characteristic of the spacetime texture. Then we reuse the motion descriptors to build a self-information based spatiotemporal saliency model to guide the spatial pooling. At last, a comprehensive quality metric is developed by combining the temporal distortion measure with spatial distortion measure. Our method demonstrates high efficiency and excellent correlation with the human perception of video quality

CiteSeerX