
    Towards Top-Down Stereoscopic Image Quality Assessment via Stereo Attention

    Stereoscopic image quality assessment (SIQA) plays a crucial role in evaluating and improving the visual experience of 3D content. Existing binocular-property and attention-based methods for SIQA have achieved promising performance. However, these bottom-up approaches are inadequate in exploiting the inherent characteristics of the human visual system (HVS). This paper presents a novel network for SIQA via stereo attention, employing a top-down perspective to guide the quality assessment process. Our method propagates guidance from high-level binocular signals down to low-level monocular signals, while binocular and monocular information is calibrated progressively throughout the processing pipeline. We design a generalized Stereo AttenTion (SAT) block to implement this top-down philosophy in stereo perception. The block uses the fusion-generated attention map as a high-level binocular modulator, influencing the representation of two low-level monocular features. Additionally, we introduce an Energy Coefficient (EC) to account for recent findings that binocular responses in the primate primary visual cortex are less than the sum of monocular responses. The adaptive EC can flexibly tune the magnitude of the binocular response, enhancing the formation of robust binocular features within our framework. To extract the most discriminative quality information from the summation and subtraction of the two branches of monocular features, we adopt a dual-pooling strategy that applies min-pooling and max-pooling to the respective branches. Experimental results highlight the superiority of our top-down method in simulating the properties of visual perception and advancing the state of the art in SIQA. The code of this work is available at https://github.com/Fanning-Zhang/SATNet (13 pages, 4 figures).
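    As a rough illustration of the mechanism the abstract describes, here is a minimal NumPy sketch of a top-down modulation step with an energy coefficient and dual pooling. The function names, the sigmoid attention map, and the assignment of max-pooling to the summation branch and min-pooling to the subtraction branch are all illustrative assumptions, not the authors' implementation (see the SATNet repository for the real code).

```python
import numpy as np

def sat_block(feat_left, feat_right, energy_coeff=0.8):
    """Hypothetical sketch of one Stereo AttenTion (SAT) step.

    A fusion-generated attention map (high-level binocular signal)
    modulates the two low-level monocular feature maps top-down.
    `energy_coeff` scales the binocular response so it stays below
    the plain sum of the monocular responses.
    """
    # Fuse the two views, then derive a sigmoid attention map from the fusion.
    fused = energy_coeff * (feat_left + feat_right)
    attention = 1.0 / (1.0 + np.exp(-fused))  # values in (0, 1)

    # Top-down modulation: the binocular map re-weights each monocular view.
    mod_left = attention * feat_left
    mod_right = attention * feat_right
    return mod_left, mod_right

def dual_pool(feat_left, feat_right):
    """Dual pooling: max-pool the summation branch and min-pool the
    subtraction branch (which branch gets which pooling is an assumption),
    yielding a compact two-value descriptor."""
    summation = feat_left + feat_right
    subtraction = feat_left - feat_right
    return np.array([summation.max(), subtraction.min()])

rng = np.random.default_rng(0)
L = rng.standard_normal((4, 4))
R = rng.standard_normal((4, 4))
mL, mR = sat_block(L, R)
desc = dual_pool(mL, mR)
print(desc.shape)  # (2,)
```

Because the attention map lies in (0, 1), the modulated features never exceed the monocular features in magnitude, which is one simple way to respect the sub-additive binocular response the EC is meant to model.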

    A blind stereoscopic image quality evaluator with segmented stacked autoencoders considering the whole visual perception route

    Most current blind stereoscopic image quality assessment (SIQA) algorithms cannot achieve reliable accuracy. One reason is that they lack deep architectures; the other is that they are designed on a relatively weak biological basis compared with current findings on the human visual system (HVS). In this paper, we propose a Deep Edge and COlor Signal INtegrity Evaluator (DECOSINE) based on the whole visual perception route from the eyes to the frontal lobe, with a particular focus on edge and color signal processing in retinal ganglion cells (RGC) and the lateral geniculate nucleus (LGN). Furthermore, to model the complex and deep structure of the visual cortex, a Segmented Stacked Auto-encoder (S-SAE) is used, which has not been applied to SIQA before. The S-SAE compensates for a weakness of deep learning-based SIQA metrics, which require very long training times. Experiments are conducted on popular SIQA databases, and the superiority of DECOSINE in terms of prediction accuracy and monotonicity is demonstrated. The experimental results show that our model of the whole visual perception route and the use of the S-SAE are effective for SIQA.

    End-to-end deep multi-score model for no-reference stereoscopic image quality assessment

    Deep learning-based quality metrics have recently brought significant improvements to Image Quality Assessment (IQA). In stereoscopic vision, information is distributed almost evenly between the left and right eyes, with a slight disparity. However, due to asymmetric distortion, the objective quality ratings for the left and right images can differ, necessitating the learning of distinct quality indicators for each view. Unlike existing stereoscopic IQA measures, which focus mainly on estimating a single global human score, we propose incorporating left, right, and stereoscopic objective scores to extract the corresponding properties of each view, thereby estimating stereoscopic image quality without reference. To this end, we use a deep multi-score Convolutional Neural Network (CNN). Our model is trained to perform four tasks: first, predict the quality of the left view; second, predict the quality of the right view; third and fourth, predict the quality of the stereo view and the global quality, respectively, with the global score serving as the final quality. Experiments are conducted on the Waterloo IVC 3D Phase 1 and Phase 2 databases. The results show the superiority of our method compared with the state of the art. The implementation code can be found at: https://github.com/o-messai/multi-score-SIQ
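    The four-head design can be pictured with a toy shared-feature model. The head names, the random linear weights, and the use of the "global" head as the final score are illustrative stand-ins for the paper's trained CNN, not its actual architecture.

```python
import numpy as np

rng = np.random.default_rng(42)

def multi_score_heads(shared_feat, weights):
    """Four task-specific linear heads on top of shared features:
    left-view, right-view, stereo-view, and global quality.
    The global score is taken as the final predicted quality."""
    scores = {name: float(w @ shared_feat) for name, w in weights.items()}
    return scores, scores["global"]

feat = rng.standard_normal(8)  # stand-in for shared CNN features
heads = {name: rng.standard_normal(8)
         for name in ("left", "right", "stereo", "global")}
all_scores, final_quality = multi_score_heads(feat, heads)
print(sorted(all_scores))  # ['global', 'left', 'right', 'stereo']
```

Training all four heads jointly lets each view keep its own quality indicator while the global head aggregates them, which is the core idea behind handling asymmetric distortion.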

    No reference quality assessment of stereo video based on saliency and sparsity

    With the popularity of video technology, stereoscopic video quality assessment (SVQA) has become increasingly important. Existing SVQA methods cannot achieve good performance because the information in the videos is not fully utilized. In this paper, we consider the various kinds of information in the videos together and construct a simple model, based on saliency and sparsity, to combine and analyze the diverse features. First, we utilize the 3-D saliency map of the sum map, which retains the basic information of the stereoscopic video, as a valid tool to evaluate video quality. Second, we use sparse representation to decompose the sum map of the 3-D saliency into coefficients, then calculate features based on the sparse coefficients to obtain an effective expression of the videos' content. Next, to reduce the correlation between the features, we feed them into a stacked auto-encoder, which maps the vectors to a higher-dimensional space under a sparsity constraint, and subsequently input them into a support vector machine to obtain the quality assessment scores. Throughout this process, we exploit saliency and sparsity to extract and simplify features. Experiments show that the proposed method's predictions fit well with the subjective scores.
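    The sparse-decomposition step can be illustrated with a minimal orthogonal matching pursuit over a random dictionary. The dictionary, the synthetic patch, and the coefficient statistics chosen as features are all illustrative assumptions, not the paper's learned dictionary or feature set.

```python
import numpy as np

def omp(D, x, n_nonzero=3):
    """Minimal orthogonal matching pursuit: approximate x as a sparse
    combination of dictionary atoms (columns of D)."""
    residual = x.copy()
    support = []
    coeffs = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx not in support:
            support.append(idx)
        # Re-fit all selected atoms jointly (the "orthogonal" step).
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        coeffs[:] = 0.0
        coeffs[support] = sol
        residual = x - D @ coeffs
    return coeffs

rng = np.random.default_rng(1)
D = rng.standard_normal((16, 32))
D /= np.linalg.norm(D, axis=0)            # unit-norm atoms
patch = 2.0 * D[:, 5] - 1.5 * D[:, 20]    # a patch that is truly sparse in D

c = omp(D, patch, n_nonzero=2)
# Simple statistics over the sparse coefficients as candidate features.
features = np.array([np.abs(c).mean(), np.abs(c).max(), float((c != 0).sum())])
```

Statistics of the sparse coefficients (energy, spread, support size) are one plausible way to turn a decomposition into a fixed-length feature vector before the auto-encoder and SVM stages.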

    Stereoscopic video quality assessment based on 3D convolutional neural networks

    Research on stereoscopic video quality assessment (SVQA) plays an important role in promoting the development of stereoscopic video systems. Existing SVQA metrics rely on hand-crafted features, an approach that is inaccurate and time-consuming given the diversity and complexity of stereoscopic video distortion. This paper introduces a 3D convolutional neural network (CNN) based SVQA framework that can model not only local spatio-temporal information but also global temporal information, taking cubic difference video patches as input. First, instead of using hand-crafted features, we design a 3D CNN architecture to automatically and effectively capture local spatio-temporal features. Then we employ a quality score fusion strategy that considers global temporal clues to obtain the final video-level predicted score. Extensive experiments on two public stereoscopic video quality datasets show that the proposed method correlates highly with human perception and outperforms state-of-the-art methods by a large margin. We also show that our 3D CNN features have more desirable properties for SVQA than the hand-crafted features of previous methods, and that our 3D CNN features combined with support vector regression (SVR) can further boost performance. In addition, requiring no complex preprocessing or GPU acceleration, the proposed method is computationally efficient and easy to use.
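    The input construction and score fusion can be sketched as follows. The patch size, the plain-mean fusion, and the stand-in per-patch scores are illustrative assumptions rather than the paper's exact design.

```python
import numpy as np

def difference_patches(video, patch=(4, 8, 8)):
    """Build cubic difference patches: subtract consecutive frames,
    then cut the difference volume into non-overlapping (T, H, W) cubes."""
    diff = np.diff(video, axis=0)  # frame-to-frame differences
    t, h, w = patch
    cubes = []
    for ti in range(0, diff.shape[0] - t + 1, t):
        for hi in range(0, diff.shape[1] - h + 1, h):
            for wi in range(0, diff.shape[2] - w + 1, w):
                cubes.append(diff[ti:ti + t, hi:hi + h, wi:wi + w])
    return np.stack(cubes)

def fuse_scores(patch_scores):
    """Global temporal fusion, sketched here as a simple mean over
    patch-level predictions (the paper's exact strategy may differ)."""
    return float(np.mean(patch_scores))

video = np.random.default_rng(2).standard_normal((9, 16, 16))
cubes = difference_patches(video)
scores = cubes.std(axis=(1, 2, 3))  # stand-in for per-patch CNN scores
print(cubes.shape)  # (8, 4, 8, 8)
final_score = fuse_scores(scores)
```

Feeding frame differences rather than raw frames is what lets a local 3D CNN see temporal change directly, while the fusion step restores a single video-level score.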

    Change blindness: eradication of gestalt strategies

    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task in which there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al., 2003, Vision Research 43, 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this, we changed the spatial positions of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it lends further weight to the argument that objects may be stored in and retrieved from a pre-attentional store during this task.

    Sparsity based stereoscopic image quality assessment

    In this work, we present a full-reference stereo image quality assessment algorithm based on the sparse representations of luminance images and depth maps. The primary challenge lies in dealing with the sparsity of disparity maps in conjunction with the sparsity of luminance images. Although analysing the sparsity of images is sufficient to characterize the quality of luminance images, the effectiveness of sparsity in quantifying depth quality is yet to be fully understood. We present a full-reference Sparsity-based Quality Assessment of Stereo Images (SQASI) aimed at this understanding.
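    One way to picture a sparsity-based full-reference comparison: compute crude sparse codes for the reference and distorted signals over a shared dictionary and compare them. The soft-thresholding rule and the cosine-similarity score below are illustrative stand-ins, not SQASI's actual formulation.

```python
import numpy as np

def sparse_codes(signal, D, thresh=0.5):
    """Crude sparse code: dictionary correlations with soft thresholding
    (a stand-in for a proper learned sparse decomposition)."""
    c = D.T @ signal
    return np.sign(c) * np.maximum(np.abs(c) - thresh, 0.0)

def sparsity_quality_index(ref, dist, D):
    """Score in [-1, 1]: cosine similarity of the two sparse codes.
    A value near 1 means the distorted image preserves the reference's
    sparse structure."""
    cr, cd = sparse_codes(ref, D), sparse_codes(dist, D)
    denom = np.linalg.norm(cr) * np.linalg.norm(cd)
    return float(cr @ cd / denom) if denom else 0.0

rng = np.random.default_rng(3)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)  # unit-norm atoms
ref = rng.standard_normal(64)
noisy = ref + 0.1 * rng.standard_normal(64)

print(round(sparsity_quality_index(ref, ref, D), 6))  # 1.0
```

The same comparison could be run separately on luminance codes and disparity-map codes and the two scores pooled, which mirrors the luminance-plus-depth structure the abstract describes.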

    Haptic-Enhanced Learning in Preclinical Operative Dentistry

    Background: Virtual reality haptic simulators represent a new paradigm in dental education that may potentially impact the rate and efficiency of basic skill acquisition, as well as pedagogically influence various aspects of students’ preclinical experience. However, the evidence to support their efficiency and inform their implementation is still limited. Objectives: This thesis set out to empirically examine how a haptic VR simulator (Simodont®) can enhance the preclinical dental education experience, particularly in the context of operative dentistry. We specify four distinct research themes: simulator validity (face, content and predictive), human factors in 3D stereoscopic display, motor skill acquisition, and curriculum integration. Methods: Chapter 3 explores the face and content validity of the Simodont® haptic dental simulator among a group of postgraduate dental students. Chapter 4 examines the predictive utility of Simodont® for subsequent preclinical and clinical performance; the results indicate the simulator's potential utility in predicting future clinical dental performance among undergraduate students. Chapter 5 investigates the role of stereopsis in dentistry from two different perspectives via two studies. Chapter 6 explores the effect of qualitatively different types of pedagogical feedback on the training, transfer and retention of basic manual dexterity dental skills; the results indicate that the acquisition and retention of basic dental motor skills in novice trainees is best optimised through a combination of instructor feedback and visual-display VR-driven feedback. A pedagogical model for integrating the haptic dental simulator into the dental curriculum is proposed in Chapter 7. Conclusion: The findings of this thesis provide new insights into the utility of haptic virtual reality simulators in undergraduate preclinical dental education. Haptic simulators have promising potential as a pedagogical tool in undergraduate dentistry, complementing existing simulation methods. Integration of haptic VR simulators into the dental curriculum must be informed by sound pedagogical principles and mapped to specific learning objectives.