226 research outputs found

    Saliency Driven Perceptual Quality Metric for Omnidirectional Visual Content

    The problem of objectively measuring the perceptual quality of omnidirectional visual content arises in many immersive imaging applications, particularly in compression. The interactive nature of this type of content limits the performance of earlier methods designed for static images or for video with predefined dynamics. The non-deterministic impact must be addressed using a statistical approach. One way to describe, analyze and predict viewer interactions in omnidirectional imaging is through estimation of visual attention. We propose an objective metric to measure the perceptual quality of omnidirectional visual content that takes visual attention information into account.

    Do Users Behave Similarly in VR? Investigation of the User Influence on the System Design

    With the overarching goal of developing user-centric Virtual Reality (VR) systems, a new wave of studies focused on understanding how users interact in VR environments has recently emerged. Despite these intense efforts, however, the current literature still does not provide the right framework to fully interpret and predict users’ trajectories while navigating in VR scenes. This work advances the state of the art in both the study of users’ behaviour in VR and user-centric system design. In more detail, we complement current datasets by presenting a publicly available dataset that provides navigation trajectories acquired for heterogeneous omnidirectional videos and different viewing platforms, namely head-mounted display, tablet, and laptop. We then present an exhaustive analysis of the collected data to better understand navigation in VR across users, content, and, for the first time, across viewing platforms. The novelty lies in the user-affinity metric, proposed in this work to investigate users’ similarities when navigating within the content. The analysis reveals useful insights on the effect of device and content on navigation, which could be valuable considerations from the system design perspective. As a case study of the importance of studying users’ behaviour when designing VR systems, we finally propose a user-centric server optimisation. We formulate an integer linear program that seeks the best stored set of omnidirectional content that minimises encoding and storage cost while maximising the user’s experience. This is posed while taking into account network dynamics, type of video content, and user population interactivity. Experimental results show that our solution outperforms common company recommendations in terms of experienced quality as well as encoding and storage, achieving savings of up to 70%. More importantly, we highlight a strong correlation between the storage cost and the user-affinity metric, showing the impact of the latter on the system architecture design.

    AVQBits: adaptive video quality model based on bitstream information for various video applications

    The paper presents AVQBits, a versatile, bitstream-based video quality model. It can be applied in several contexts such as video service monitoring, evaluation of video encoding quality, of gaming video QoE, and even of omnidirectional video quality. In the paper, it is shown that AVQBits predictions closely match video quality ratings obtained in various subjective tests with human viewers, for videos up to 4K-UHD resolution (Ultra-High Definition, 3840 x 2160 pixels) and framerates up to 120 fps. With the different variants of AVQBits presented in the paper, video quality can be monitored either at the client side, in the network or directly after encoding. The no-reference AVQBits model was developed for different video services and types of input data, reflecting the increasing popularity of Video-on-Demand services and widespread use of HTTP-based adaptive streaming. At its core, AVQBits encompasses the standardized ITU-T P.1204.3 model, with further model instances that can either have restricted or extended input information, depending on the application context. Four different instances of AVQBits are presented, that is, a Mode 3 model with full access to the bitstream, a Mode 0 variant using only metadata such as codec type, framerate, resolution and bitrate as input, a Mode 1 model using Mode 0 information and frame-type and -size information, and a Hybrid Mode 0 model that is based on Mode 0 metadata and the decoded video pixel information. The models are trained on the authors’ own AVT-PNATS-UHD-1 dataset described in the paper. All models show a highly competitive performance when using AVT-VQDB-UHD-1 as validation dataset, e.g., with the Mode 0 variant yielding a Pearson correlation of 0.890, the Mode 1 model 0.901, the Hybrid no-reference Mode 0 model 0.928 and the model with full bitstream access 0.942. In addition, all four AVQBits variants are evaluated when applying them out-of-the-box to different media formats such as 360° video, high framerate (HFR) content, or gaming videos. The analysis shows that the ITU-T P.1204.3 and Hybrid Mode 0 instances of AVQBits for the considered use-cases either perform on par with or better than even state-of-the-art full-reference, pixel-based models. Furthermore, it is shown that the proposed Mode 0 and Mode 1 variants outperform commonly used no-reference models for the different application scopes. Also, a long-term integration model based on the standardized ITU-T P.1203.3 is presented to estimate ratings of overall audiovisual streaming Quality of Experience (QoE) for sessions of 30 s up to 5 min duration. In the paper, the AVQBits instances with their per-1-sec score output are evaluated as the video quality component of the proposed long-term integration model. All AVQBits variants as well as the long-term integration module are made publicly available for the community for further research.
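    The Pearson correlation values quoted in the abstract above measure how linearly a model's predicted scores track subjective ratings. The following is a minimal sketch of that computation, not the authors' evaluation code; the function name `pearson` and the sample scores are hypothetical and chosen only for illustration.

```python
import math


def pearson(predicted, subjective):
    """Pearson linear correlation between two equally long score lists."""
    if len(predicted) != len(subjective) or len(predicted) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_s = sum(subjective) / n
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((p - mean_p) * (s - mean_s) for p, s in zip(predicted, subjective))
    # Sums of squared deviations (unnormalised variances).
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_s = sum((s - mean_s) ** 2 for s in subjective)
    if var_p == 0 or var_s == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_s)


# Hypothetical example data: model predictions vs. mean opinion scores.
model_scores = [1.8, 2.9, 3.4, 4.1, 4.6]
mean_opinion_scores = [2.0, 2.7, 3.5, 4.3, 4.5]
correlation = pearson(model_scores, mean_opinion_scores)
```

    A value close to 1 indicates that predictions rise and fall together with the subjective ratings; the measure says nothing about absolute offset or scale, which is why quality-model evaluations often report it alongside an error measure.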

    Perceptual video quality assessment: the journey continues!

    Perceptual Video Quality Assessment (VQA) is one of the most fundamental and challenging problems in the field of Video Engineering. Along with video compression, it has become one of the two dominant theoretical and algorithmic technologies in television streaming and social media. Over the last two decades, the volume of video traffic over the internet has grown exponentially, powered by rapid advancements in cloud services, faster video compression technologies, and increased access to high-speed, low-latency wireless internet connectivity. This has given rise to issues related to delivering extraordinary volumes of picture and video data to an increasingly sophisticated and demanding global audience. Consequently, developing algorithms to measure the quality of pictures and videos as perceived by humans has become increasingly critical, since these algorithms can be used to perceptually optimize trade-offs between quality and bandwidth consumption. VQA models have evolved from algorithms developed for generic 2D videos to specialized algorithms explicitly designed for on-demand video streaming, user-generated content (UGC), virtual and augmented reality (VR and AR), cloud gaming, high dynamic range (HDR), and high frame rate (HFR) scenarios. Along the way, we also describe the advancement in algorithm design, beginning with traditional hand-crafted feature-based methods and finishing with current deep-learning models powering accurate VQA algorithms. We also discuss the evolution of Subjective Video Quality databases containing videos and human-annotated quality scores, which are the necessary tools to create, test, compare, and benchmark VQA algorithms. To finish, we discuss emerging trends in VQA algorithm design and general perspectives on the evolution of Video Quality Assessment in the foreseeable future.

    No-Reference Quality Assessment for Colored Point Cloud and Mesh Based on Natural Scene Statistics

    To improve the viewer's quality of experience and optimize processing systems in computer graphics applications, 3D quality assessment (3D-QA) has become an important task in the multimedia area. Point cloud and mesh are the two most widely used electronic representation formats of 3D models, the quality of which is quite sensitive to operations like simplification and compression. Therefore, many studies concerning point cloud quality assessment (PCQA) and mesh quality assessment (MQA) have been carried out to measure the visual quality degradations caused by lossy operations. However, a large part of previous studies utilizes full-reference (FR) metrics, which means they may fail to predict the accurate quality level of 3D models when the reference 3D model is not available. Furthermore, few 3D-QA metrics take color features into consideration, which significantly restricts their effectiveness and scope of application. In many quality assessment studies, natural scene statistics (NSS) have shown a good ability to quantify the distortion of natural scenes as statistical parameters. Therefore, we propose an NSS-based no-reference quality assessment metric for colored 3D models. In this paper, quality-aware features are extracted from the aspects of color and geometry directly from the 3D models. Then the statistical parameters are estimated using different distribution models to describe the characteristics of the 3D models. Our method is mainly validated on the colored point cloud quality assessment database (SJTU-PCQA) and the colored mesh quality assessment database (CMDM). The experimental results show that the proposed method outperforms all the state-of-the-art NR 3D-QA metrics and obtains an acceptable gap with the state-of-the-art FR 3D-QA metrics.

    Deployment, Coverage And Network Optimization In Wireless Video Sensor Networks For 3D Indoor Monitoring

    As a result of extensive research over the past decade or so, wireless sensor networks (WSNs) have evolved into a well-established technology for industrial, environmental and medical applications. However, traditional WSNs employ sensors such as thermal or photo light resistors that are often modeled with simple omni-directional sensing ranges, which focus only on scalar data within the sensing environment. In contrast, the sensing range of a wireless video sensor is directional and capable of providing more detailed video information about the sensing field. Additionally, with the introduction of modern features in non-fixed focus cameras such as pan, tilt and zoom (PTZ), the sensing range of a video sensor can be further regarded as fan-shaped in 2D and pyramid-shaped in 3D. These unique properties of wireless video sensors, together with the deployment restrictions of indoor monitoring, make the traditional sensor coverage, deployment and networking solutions built on 2D sensing models for WSNs ineffective and inapplicable to wireless video sensor network (WVSN) issues in 3D indoor space, thus calling for novel solutions. In this dissertation, we propose optimization techniques and develop solutions that address the coverage, deployment and network issues associated with wireless video sensor networks in a 3D indoor environment. We first model the general problem in a continuous 3D space to minimize the total number of video sensors required to monitor a given 3D indoor region. We then convert it into a discrete version of the problem by incorporating 3D grids, which can achieve arbitrary approximation precision by adjusting the grid granularity. Due in part to the uniqueness of the visual sensor's directional sensing range, we propose to exploit the directional feature to determine the optimal angular coverage of each deployed visual sensor. Thus, we propose to deploy the visual sensors from divergent directional angles and further extend k-coverage to "k-angular-coverage", while ensuring connectivity within the network. We then propose a series of mechanisms to handle obstacles in the 3D environment. We develop efficient greedy heuristic solutions that integrate all these aforementioned considerations one by one and can yield high-quality results. Based on this, we also propose enhanced depth-first search (DFS) algorithms that can not only further improve the solution quality, but also return optimal results if given enough time. Our extensive simulations demonstrate the superiority of both our greedy heuristic and enhanced DFS solutions. Finally, this dissertation discusses some future research directions such as in-network traffic routing and scheduling issues.
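    The discretized coverage problem described in the abstract above (choose as few sensors as possible so that every grid point is monitored) has the shape of a set-cover problem. The sketch below shows the classic greedy set-cover heuristic as an illustration of that idea only; it is not the dissertation's algorithm and ignores angular coverage, connectivity and obstacles. The function name `greedy_cover`, the candidate placement names and the grid-point identifiers are all hypothetical.

```python
def greedy_cover(grid_points, candidates):
    """Classic greedy set cover.

    grid_points: iterable of hashable grid-point identifiers to monitor.
    candidates: dict mapping a candidate sensor placement to the set of
                grid points that placement would cover.
    Returns (chosen placements in pick order, grid points left uncovered).
    """
    uncovered = set(grid_points)
    chosen = []
    while uncovered:
        # Pick the placement that covers the most still-uncovered points.
        # Iterating in sorted order keeps tie-breaking deterministic.
        best = max(sorted(candidates), key=lambda c: len(candidates[c] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:
            break  # no remaining placement helps; some points stay uncovered
        chosen.append(best)
        uncovered -= gain
    return chosen, uncovered


# Hypothetical toy instance: six grid points, four candidate placements.
points = {1, 2, 3, 4, 5, 6}
placements = {
    "cam_a": {1, 2, 3, 4},
    "cam_b": {3, 4, 5},
    "cam_c": {5, 6},
    "cam_d": {1, 6},
}
selected, missed = greedy_cover(points, placements)
```

    Greedy selection is fast but not guaranteed to find the minimum number of sensors, which is consistent with the abstract's note that a search-based method can improve on the greedy result given enough time.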