Blind Stereo Image Quality Assessment Inspired by Brain Sensory-Motor Fusion
The use of 3D and stereo imaging is rapidly increasing. Compression,
transmission, and processing could degrade the quality of stereo images.
Quality assessment of such images differs from that of their 2D counterparts.
Metrics that represent 3D perception by human visual system (HVS) are expected
to assess stereoscopic quality more accurately. In this paper, inspired by
the brain's sensory-motor fusion process, the two stereo images are fused. Then
from every fused image two synthesized images are extracted. Effects of
different distortions on statistical distributions of the synthesized images
are shown. Based on the observed statistical changes, features are extracted
from these synthesized images. These features can reveal type and severity of
distortions. Then, a stacked neural network model is proposed, which learns the
extracted features and accurately evaluates the quality of stereo images. This
model is tested on 3D images of popular databases. Experimental results show
the superiority of this method over state-of-the-art stereo image quality
assessment approaches.
Comment: 11 pages, 13 figures, 3 tables
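The fusion-then-statistics pipeline described in the abstract above could be sketched as follows. The simple averaging fusion, the global (rather than local) contrast normalization, and the particular moment features are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def fuse_views(left, right):
    # Stand-in for the sensory-motor fusion step: a plain average of the
    # two views (the paper's fusion is more elaborate).
    return 0.5 * (left.astype(float) + right.astype(float))

def mscn(img, eps=1e-6):
    # Mean-subtracted, contrast-normalized coefficients; a common basis for
    # no-reference statistical quality features. Global normalization is a
    # simplification of the usual local windowed version.
    return (img - img.mean()) / (img.std() + eps)

def stat_features(img):
    # Illustrative features: variance, skewness, and kurtosis of the MSCN
    # coefficients, whose distributions shift under distortions such as
    # blur or compression.
    c = mscn(img)
    m2 = (c ** 2).mean()
    m3 = (c ** 3).mean()
    m4 = (c ** 4).mean()
    return np.array([m2, m3 / (m2 ** 1.5 + 1e-12), m4 / (m2 ** 2 + 1e-12)])
```

Features like these would then be fed to the stacked neural network the paper proposes.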
Prediction of the Influence of Navigation Scan-path on Perceived Quality of Free-Viewpoint Videos
Free-Viewpoint Video (FVV) systems allow the viewers to freely change the
viewpoints of the scene. In such systems, view synthesis and compression are
the two main sources of artifacts influencing the perceived quality. To assess
this influence, quality evaluation studies are often carried out using
conventional displays and generating predefined navigation trajectories
mimicking the possible movement of the viewers when exploring the content.
Nevertheless, as different trajectories may lead to different conclusions in
terms of visual quality when benchmarking the performance of the systems,
methods to identify critical trajectories are needed. This paper aims at
exploring the impact of exploration trajectories (defined as Hypothetical
Rendering Trajectories: HRT) on perceived quality of FVV subjectively and
objectively, providing two main contributions. Firstly, a subjective assessment
test including different HRTs was carried out and analyzed. The results
demonstrate and quantify the influence of HRT in the perceived quality.
Secondly, we propose a new objective video quality assessment measure to
objectively predict the impact of HRT. This measure, based on Sketch-Token
representation, models how the categories of the contours change spatially and
temporally from a higher semantic level. Comparisons with existing quality
metrics for FVV highlight promising results for the automatic detection of the
most critical HRTs when benchmarking immersive systems.
Comment: 11 pages, 7 figures
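A toy stand-in for the contour-change idea in the abstract above: given per-pixel contour-category label maps for two consecutive rendered frames, temporal instability can be measured as the fraction of labels that change. The real measure operates on Sketch-Token class representations, so this is only a sketch.

```python
import numpy as np

def contour_change_rate(cat_prev, cat_next):
    # Fraction of pixels whose contour-category label differs between two
    # consecutive frames; high values flag temporally unstable contours,
    # which synthesis artifacts along an HRT tend to produce.
    cat_prev = np.asarray(cat_prev)
    cat_next = np.asarray(cat_next)
    return float((cat_prev != cat_next).mean())
```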
An Efficient Human Visual System Based Quality Metric for 3D Video
Stereoscopic video technologies have been introduced to the consumer market
in the past few years. A key factor in designing a 3D system is to understand
how different visual cues and distortions affect the perceptual quality of
stereoscopic video. The ultimate way to assess 3D video quality is through
subjective tests. However, subjective evaluation is time consuming, expensive,
and in some cases not possible. The other solution is developing objective
quality metrics, which attempt to model the Human Visual System (HVS) in order
to assess perceptual quality. Although several 2D quality metrics have been
proposed for still images and videos, in the case of 3D efforts are only at the
initial stages. In this paper, we propose a new full-reference quality metric
for 3D content. Our method mimics HVS by fusing information of both the left
and right views to construct the cyclopean view, as well as taking into account
the sensitivity of HVS to contrast and the disparity of the views. In addition,
a temporal pooling strategy is utilized to address the effect of temporal
variations of the quality in the video. Performance evaluations showed that our
3D quality metric quantifies quality degradation caused by several
representative types of distortions very accurately, with a Pearson
correlation coefficient of 90.8%, a performance competitive with
state-of-the-art 3D quality metrics.
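The temporal pooling step mentioned in the abstract above could look like the following sketch: averaging the worst fraction of per-frame scores, reflecting the HVS bias toward low-quality episodes. The worst-fraction strategy and its parameter are assumptions, not necessarily the paper's exact pooling.

```python
import numpy as np

def temporal_pool(frame_scores, worst_fraction=0.2):
    # Percentile-style temporal pooling: average the lowest-scoring fraction
    # of frames, so brief quality drops dominate the final video score.
    scores = np.sort(np.asarray(frame_scores, dtype=float))
    k = max(1, int(len(scores) * worst_fraction))
    return scores[:k].mean()
```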
Survey on Error Concealment Strategies and Subjective Testing of 3D Videos
Over the last decade, different technologies to visualize 3D scenes have been
introduced and improved. These technologies include stereoscopic, multi-view,
integral imaging and holographic types. Despite increasing consumer interest,
poor image quality, crosstalk and other side effects of 3D displays, and the
lack of defined broadcast standards have hampered the advancement of 3D
displays to the mass consumer market. Moreover, real-time transmission of 3DTV
sequences over packet-based networks may result in visual quality degradation
due to packet loss and related impairments. In conventional 2D video,
different extrapolation and directional interpolation strategies have been
used to conceal missing blocks, but in 3D this is still an emerging field of research. Few
studies have been carried out to define the assessment methods of stereoscopic
images and videos. From an industrial and commercial perspective, however,
subjective quality evaluation is the most direct way to evaluate human
perception of 3DTV systems. This paper reviews the state-of-the-art error
concealment strategies and the subjective evaluation of 3D videos and proposes
a low complexity frame loss concealment method for the video decoder.
Subjective testing on videos from prominent datasets, together with
comparisons against existing concealment methods, shows that the proposed
method efficiently conceals errors in stereoscopic videos in terms of
computation time, comfort, and distortion.
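A low-complexity frame loss concealment scheme of the kind surveyed above might, for a lost frame in one view, blend the temporally previous frame of that view with the co-located frame of the intact view. Both the blending and the weight are hypothetical, for illustration only; the paper's actual method may differ.

```python
import numpy as np

def conceal_lost_frame(prev_frame, other_view_frame, alpha=0.5):
    # Hypothetical concealment: weighted blend of the previous frame of the
    # lost view (temporal prediction) and the co-located frame of the other
    # view (inter-view prediction). alpha is an assumed mixing weight.
    return (alpha * prev_frame.astype(float)
            + (1.0 - alpha) * other_view_frame.astype(float))
```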
Causes of discomfort in stereoscopic content: a review
This paper reviews the causes of discomfort in viewing stereoscopic content.
These include objective factors, such as misaligned images, as well as
subjective factors, such as excessive disparity. Different approaches to the
measurement of visual discomfort are also reviewed, in relation to the
underlying physiological and psychophysical processes. The importance of
understanding these issues, in the context of new display technologies, is
emphasized.
Binocular Rivalry - Psychovisual Challenge in Stereoscopic Video Error Concealment
During Stereoscopic 3D (S3D) video transmission, one or both views can be
affected by bit errors and packet losses caused by adverse channel conditions,
delay or jitter. Typically, the Human Visual System (HVS) cannot align and
fuse stereoscopic content when one view is affected by artefacts caused by
compression, transmission, or rendering. The distorted patterns are perceived
as alterations of the original, producing a shimmering effect known as
binocular rivalry that is detrimental to a user's Quality of Experience
(QoE). This study attempts to quantify the effects of binocular rivalry for
stereoscopic videos. Existing error concealment approaches are applied to
sequences in which one or more frames are lost in one or both views. Then,
subjective testing is carried out on the error-concealed 3D video sequences.
The subjective evaluations were then combined and analysed using a standard
Student's t-test, quantifying the impact of binocular rivalry and allowing it
to be compared with that of monocular viewing.
The main focus is implementing error-resilient video communication, avoiding
the detrimental effects of binocular rivalry and improving the overall QoE of
viewers.
Comment: 11 pages, 9 figures
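The Student-type analysis described in the abstract above can be sketched with a two-sample Welch t statistic on subjective scores (e.g., concealed 3D vs. monocular viewing). This is a generic formulation, not the paper's exact statistical setup.

```python
import numpy as np

def welch_t(a, b):
    # Two-sample Welch t statistic with Welch-Satterthwaite degrees of
    # freedom; compares the mean opinion scores of two conditions without
    # assuming equal variances.
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    va = a.var(ddof=1) / len(a)
    vb = b.var(ddof=1) / len(b)
    t = (a.mean() - b.mean()) / np.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```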
Investigating Simulation-Based Metrics for Characterizing Linear Iterative Reconstruction in Digital Breast Tomosynthesis
Simulation-based image quality metrics are adapted and investigated for
characterizing the parameter dependences of linear iterative image
reconstruction for DBT. Three metrics based on 2D DBT simulation are
investigated: (1) a root-mean-square-error (RMSE) between the test phantom and
reconstructed image, (2) a gradient RMSE where the comparison is made after
taking a spatial gradient of both image and phantom, and (3) a
region-of-interest (ROI) Hotelling observer (HO) for
signal-known-exactly/background-known-exactly (SKE/BKE) and
signal-known-exactly/background-known-statistically (SKE/BKS) detection tasks.
Two simulation studies are performed using the aforementioned metrics, varying
voxel aspect ratio and regularization strength for two types of Tikhonov
regularized least-squares optimization. The RMSE metrics are applied to a 2D
test phantom and the ROI-HO metric is applied to two tasks relevant to DBT:
large, low contrast lesion detection and small, high contrast
microcalcification detection. The RMSE metric trends are compared with visual
assessment of the reconstructed test phantom. The ROI-HO metric trends are
compared with 3D reconstructed images from ACR phantom data acquired with a
Hologic Selenia Dimensions DBT system. Sensitivity of image RMSE to mean pixel
value is found to limit its applicability to the assessment of DBT image
reconstruction. Image gradient RMSE is insensitive to mean pixel value and
appears to track better with subjective visualization of the reconstructed
bar-pattern phantom. The ROI-HO metric shows an increasing trend with
regularization strength for both forms of Tikhonov-regularized least-squares;
however, this metric saturates at intermediate regularization strength
indicating a point of diminishing returns for signal detection. Visualization
with reconstructed ACR phantom images appears to show a similar dependence with
regularization strength.
Comment: The manuscript has been submitted to Medical Physics
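The first two metrics in the abstract above are straightforward to state in code: an image RMSE between test phantom and reconstruction, and a gradient RMSE computed after taking a spatial gradient of both, which removes the mean-pixel-value sensitivity the study reports. This is a direct, minimal reading of those definitions.

```python
import numpy as np

def rmse(recon, phantom):
    # Root-mean-square error between reconstructed image and test phantom.
    d = recon.astype(float) - phantom.astype(float)
    return np.sqrt((d ** 2).mean())

def gradient_rmse(recon, phantom):
    # RMSE after taking a spatial gradient of both images; a constant
    # intensity offset between recon and phantom no longer contributes.
    gr = np.stack(np.gradient(recon.astype(float)))
    gp = np.stack(np.gradient(phantom.astype(float)))
    return np.sqrt(((gr - gp) ** 2).mean())
```

A constant-offset reconstruction illustrates the difference: image RMSE equals the offset, while gradient RMSE is zero.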
Benchmark 3D eye-tracking dataset for visual saliency prediction on stereoscopic 3D video
Visual Attention Models (VAMs) predict the location of an image or video
regions that are most likely to attract human attention. Although saliency
detection is well explored for 2D image and video content, there are only few
attempts made to design 3D saliency prediction models. Newly proposed 3D visual
attention models have to be validated over large-scale video saliency
prediction datasets, which also contain results of eye-tracking information.
There are several publicly available eye-tracking datasets for 2D image and
video content. In the case of 3D, however, there is still a need for
large-scale video saliency datasets for the research community for validating
different 3D-VAMs. In this paper, we introduce a large-scale dataset
containing eye-tracking data collected from 24 subjects who watched 61
stereoscopic 3D videos (and their 2D versions) in a free-viewing test. We
evaluate the performance of the existing saliency detection methods over the
proposed dataset. In addition, we created an online benchmark for validating
the performance of existing 2D and 3D visual attention models and for
facilitating the addition of new VAMs. The benchmark currently contains 50
different VAMs.
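One of the standard scores for validating a VAM against eye-tracking data, as in the benchmark above, is the linear correlation coefficient (CC) between the predicted saliency map and the fixation density map; the benchmark's exact metric set is not stated here, so this is a representative example.

```python
import numpy as np

def saliency_cc(pred, fixation_density):
    # Linear correlation coefficient between a predicted saliency map and a
    # fixation density map; 1.0 means a perfect linear match.
    p = np.asarray(pred, dtype=float).ravel()
    f = np.asarray(fixation_density, dtype=float).ravel()
    p = (p - p.mean()) / (p.std() + 1e-12)
    f = (f - f.mean()) / (f.std() + 1e-12)
    return float((p * f).mean())
```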
Human Pose Forecasting via Deep Markov Models
Human pose forecasting is an important problem in computer vision with
applications to human-robot interaction, visual surveillance, and autonomous
driving. Usually, forecasting algorithms use 3D skeleton sequences and are
trained to forecast for a few milliseconds into the future. Long-range
forecasting is challenging due to the difficulty of estimating how long a
person continues an activity. To this end, our contributions are threefold: (i)
we propose a generative framework for poses using variational autoencoders
based on Deep Markov Models (DMMs); (ii) we evaluate our pose forecasts using a
pose-based action classifier, which we argue better reflects the subjective
quality of pose forecasts than distance in coordinate space; (iii) last, for
evaluation of the new model, we introduce a 480,000-frame video dataset called
Ikea Furniture Assembly (Ikea FA), which depicts humans repeatedly assembling
and disassembling furniture. We demonstrate promising results for our approach
on both Ikea FA and the existing NTU RGB+D dataset.
Comment: Accepted to DICTA'1
Perceptual Quality Assessment of Omnidirectional Images as Moving Camera Videos
Omnidirectional images (also referred to as static 360° panoramas)
impose viewing conditions much different from those of regular 2D images. How
humans perceive image distortions in immersive virtual reality (VR)
environments is an important problem that has received little attention. We argue
that, apart from the distorted panorama itself, two types of VR viewing
conditions are crucial in determining the viewing behaviors of users and the
perceived quality of the panorama: the starting point and the exploration time.
We first carry out a psychophysical experiment to investigate the interplay
among the VR viewing conditions, the user viewing behaviors, and the perceived
quality of 360° images. Then, we provide a thorough analysis of the
collected human data, leading to several interesting findings. Moreover, we
propose a computational framework for objective quality assessment of 360°
images, embodying viewing conditions and behaviors in a delightful way.
Specifically, we first transform an omnidirectional image to several video
representations using different user viewing behaviors under different viewing
conditions. We then leverage advanced 2D full-reference video quality models to
compute the perceived quality. We construct a set of specific quality measures
within the proposed framework, and demonstrate their promises on three VR
quality databases.
Comment: 11 pages, 11 figures, 9 tables. This paper has been accepted by IEEE
Transactions on Visualization and Computer Graphics
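The panorama-to-video transformation described in the abstract above can be crudely sketched by sliding a viewport along an assumed yaw scanpath over the equirectangular image, with horizontal wraparound. Real viewport rendering uses a gnomonic projection per head orientation; this flat horizontal crop is only a simplified illustration.

```python
import numpy as np

def panorama_to_video(equirect, yaw_path_deg, fov_deg=90):
    # For each yaw angle on a (hypothetical) scanpath, crop a horizontal
    # viewport from the equirectangular panorama, wrapping around at the
    # image edge, and stack the crops into a moving-camera "video".
    h, w = equirect.shape[:2]
    view_w = int(w * fov_deg / 360.0)
    frames = []
    for yaw in yaw_path_deg:
        start = int((yaw % 360) / 360.0 * w)
        cols = np.arange(start, start + view_w) % w
        frames.append(equirect[:, cols])
    return np.stack(frames)
```

The resulting frame sequence is what a 2D full-reference video quality model would then score.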