
    Spatiotemporal Video Quality Assessment Method via Multiple Feature Mappings

    Video quality assessment (VQA) methods aim to evaluate the perceptual quality of videos in many applications, but improved accuracy often comes at the cost of increased computational complexity. The difficulty stems from the distorted videos that are of significant concern in the communication industry, whose distortions are two-fold, affecting both the spatial and the temporal content. The findings of this study indicate that the information in spatiotemporal slice (STS) images is useful for measuring video distortion. This paper focuses on developing a full-reference VQA algorithm that integrates several features of spatiotemporal slices (STSs) of frames to form a high-performance video quality estimator. Video quality is evaluated on several VQA databases through the following steps: (1) the reference and test video sequences are first arranged into a spatiotemporal slice representation; a collection of spatiotemporal feature maps is computed for each reference-test video pair, and these feature responses are processed with the Structural Similarity (SSIM) index to form a local frame quality score. (2) To further enhance the quality assessment, the spatial feature maps are combined with the spatiotemporal feature maps in a proposed VQA model named multiple map similarity feature deviation (MMSFD-STS). (3) A sequential pooling strategy assembles the per-frame quality indices into the video quality score. (4) Extensive evaluations on video quality databases show that the proposed VQA algorithm achieves better or competitive performance compared with other state-of-the-art methods.
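The slice-and-compare idea in step (1) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's MMSFD-STS implementation: the helper names are hypothetical, and the single-window SSIM below omits the local windowing, multiple feature maps, and deviation pooling the abstract describes.

```python
# Hypothetical sketch: extract spatiotemporal slice (STS) images from a video
# and compare reference vs. test slices with a simplified, single-window SSIM.
import numpy as np

def horizontal_sts(video, row):
    """Stack one row from every frame: (T, H, W) video -> (T, W) slice image."""
    return video[:, row, :]

def global_ssim(x, y, c1=6.5025, c2=58.5225):
    """Whole-image SSIM (no sliding window); c1, c2 are the usual
    constants for an 8-bit dynamic range (k1=0.01, k2=0.03, L=255)."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def sts_quality(ref, dist, rows=(16, 32, 48)):
    """Average SSIM over a few STS images as a crude frame-quality index."""
    return float(np.mean([global_ssim(horizontal_sts(ref, r),
                                      horizontal_sts(dist, r)) for r in rows]))

rng = np.random.default_rng(0)
ref = rng.uniform(0, 255, (30, 64, 64))    # T x H x W reference video
dist = ref + rng.normal(0, 5, ref.shape)   # mildly distorted copy
print(round(sts_quality(ref, ref), 6))     # 1.0 for identical videos
```

A distorted video scores below 1.0, and the per-frame indices produced this way would then feed the pooling stage of step (3).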

    Detecting changes in auditory events

    Change deafness is defined as the failure to detect the source of an above-threshold change in an auditory scene. A new paradigm recently demonstrated the phenomenon under conditions analogous to its visual counterpart, change blindness (Hall, Peck, Gaston, & Dickerson, 2015). This investigation examined the paradigm through two experiments involving the same four simultaneously presented events. Experiment 1 distributed the events across a virtual 120º span on the azimuth while the target event oscillated across a 60º space throughout each trial. Listeners were instructed to identify the target as soon as possible. The target's rate of change was manipulated across four velocities (80º/s, 40º/s, 24º/s, 8º/s). Results confirmed that error rates in all conditions differed from an isolated control task. The 8º/s condition displayed the highest error rates, providing strong evidence of change deafness, whereas error rates in the 80º/s, 40º/s, and 24º/s conditions did not significantly differ, providing inconclusive evidence. Response times did not vary across conditions. Experiment 2 compared these findings to a frequency-based filter manipulation and evaluated change deafness by comparing flickered (one-second and three-second initial presentation) and continuously changing target events, which oscillated between wide- and narrow-band filters. All conditions produced error rates that did not differ from the control task. The continuous condition produced longer response times, providing explicit evidence of change deafness, while rapid response times in the flicker conditions indicated the elimination of change deafness. The three-second presentation time in one flicker condition further reduced response times, demonstrating the impact of encoding. Both experiments support the assessed paradigm as an appropriate method for analyzing the occurrence of change deafness.

    Full-reference stereoscopic video quality assessment using a motion sensitive HVS model

    Stereoscopic video quality assessment has become a major research topic in recent years. Existing stereoscopic video quality metrics are predominantly stereoscopic image quality metrics extended to the time domain via, for example, temporal pooling. These approaches do not explicitly consider the motion sensitivity of the Human Visual System (HVS). To address this limitation, this paper introduces a novel HVS model inspired by physiological findings characterising the motion-sensitive response of complex cells in the primary visual cortex (V1 area). The proposed HVS model generalises previous HVS models, which characterised the behaviour of simple and complex cells but ignored motion sensitivity, by estimating optical flow to measure scene velocity at different scales and orientations. The local motion characteristics (direction and amplitude) are used to modulate the output of the complex cells. The model is applied to develop a new type of full-reference stereoscopic video quality metric that uniquely combines non-motion-sensitive and motion-sensitive energy terms to mimic the response of the HVS. A tailored two-stage multi-variate stepwise regression algorithm is introduced to determine the optimal contribution of each energy term. The two proposed stereoscopic video quality metrics are evaluated on three stereoscopic video datasets. Results indicate that they achieve average correlations with subjective scores of 0.9257 (PLCC), 0.9338 and 0.9120 (SRCC), 0.8622 and 0.8306 (KRCC), and outperform previous stereoscopic video quality metrics, including other recent HVS-based metrics.
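The motion-modulated energy idea can be illustrated with a toy sketch. This is an assumed simplification, not the authors' V1 model: it uses a single quadrature-pair (even/odd) energy at one scale and orientation, and a crude frame-difference amplitude in place of the multi-scale optical flow the paper estimates; both helper names are hypothetical.

```python
# Illustrative sketch: a complex-cell-style energy term whose output is
# modulated by local motion amplitude, here approximated by frame differencing.
import numpy as np

def complex_cell_energy(frame, kx=0.5, ky=0.0):
    """Quadrature-pair energy: project the frame onto a cosine and a sine
    grating at one spatial frequency/orientation and sum the squared responses."""
    h, w = frame.shape
    y, x = np.mgrid[0:h, 0:w]
    phase = kx * x + ky * y
    even = (frame * np.cos(phase)).mean()
    odd = (frame * np.sin(phase)).mean()
    return even ** 2 + odd ** 2

def motion_sensitive_energy(prev_frame, frame, gain=1.0):
    """Scale the static energy by mean motion amplitude between two frames."""
    amplitude = np.abs(frame - prev_frame).mean()
    return complex_cell_energy(frame) * (1.0 + gain * amplitude)
```

For a static scene the modulated term reduces to the plain complex-cell energy; any motion increases it, which is the qualitative behaviour the combined energy terms are meant to capture.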

    Quality Assessment of In-the-Wild Videos

    Quality assessment of in-the-wild videos is a challenging problem because of the absence of reference videos and the presence of shooting distortions. Knowledge of the human visual system can help establish methods for objective quality assessment of in-the-wild videos. In this work, we show that two prominent effects of the human visual system, namely content-dependency and temporal-memory effects, can be exploited for this purpose. We propose an objective no-reference video quality assessment method that integrates both effects into a deep neural network. For content-dependency, we extract features from a pre-trained image classification neural network for its inherent content-aware property. For temporal-memory effects, long-term dependencies, especially temporal hysteresis, are integrated into the network with a gated recurrent unit and a subjectively-inspired temporal pooling layer. To validate the performance of our method, experiments are conducted on three publicly available in-the-wild video quality assessment databases: KoNViD-1k, CVD2014, and LIVE-Qualcomm. Experimental results demonstrate that our proposed method outperforms five state-of-the-art methods by a large margin, specifically, 12.39%, 15.71%, 15.45%, and 18.09% overall performance improvements over the second-best method VBLIINDS, in terms of SROCC, KROCC, PLCC, and RMSE, respectively. Moreover, an ablation study verifies the crucial role of both the content-aware features and the modeling of temporal-memory effects. The PyTorch implementation of our method is released at https://github.com/lidq92/VSFA. Comment: 9 pages, 7 figures, 4 tables. ACM Multimedia 2019 camera ready.
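The temporal-hysteresis intuition behind the pooling layer can be sketched as follows. This is a minimal assumed form, not the released VSFA implementation: the idea is that viewers remember recent quality drops longer than improvements, so each frame's score is blended with the worst score in a short memory window. The function name and parameters are illustrative.

```python
# Hypothetical sketch of hysteresis-style temporal pooling: blend each frame's
# quality with the minimum (worst) quality over a trailing memory window.
import numpy as np

def hysteresis_pool(frame_scores, tau=5, alpha=0.5):
    """Pool per-frame quality scores into one video score.

    tau   -- length of the trailing memory window (in frames)
    alpha -- weight on the pessimistic memory term vs. the current score
    """
    scores = np.asarray(frame_scores, dtype=float)
    pooled = np.empty_like(scores)
    for t in range(len(scores)):
        memory = scores[max(0, t - tau):t + 1].min()  # worst recent quality
        pooled[t] = alpha * memory + (1 - alpha) * scores[t]
    return float(pooled.mean())
```

A transient quality drop pulls the pooled score below the plain temporal mean, mimicking the asymmetry of subjective judgments; the paper realises this effect inside the network rather than as a fixed post-hoc rule.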

    The role of temporal frequency in continuous flash suppression: A case for a unified framework

    In continuous flash suppression (CFS), a rapidly changing Mondrian sequence is presented to one eye in order to suppress a static target presented to the other eye. Targets generally remain suppressed for several seconds at a time, contributing to the widespread use of CFS in studies of unconscious visual processes. Nevertheless, the mechanisms underlying CFS suppression remain unclear, complicating its use and the interpretation of results obtained with the technique. As a starting point, this thesis examined the role of temporal frequency in CFS suppression using carefully controlled stimuli generated with Fourier Transform techniques. Because temporal frequency is a low-level stimulus attribute, manipulating it allowed us to evaluate the contributions of early visual processes and to test the general assumption that fast update rates drive CFS effectiveness. Three psychophysical studies are described in this thesis, addressing the temporal frequency tuning of CFS (Chapter 2), the relationship between the Mondrian pattern and temporal frequency content (Chapter 3), and the role of temporal frequency selectivity in CFS (Chapter 4). Contrary to conventional wisdom, the results showed that the suppression of static targets is largely driven by high spatial frequencies and low temporal frequencies. Faster masker rates, on the other hand, worked best with transient targets. Indicative of early, feature-selective processes, these findings are reminiscent of binocular rivalry suppression, demonstrating the possible use of a unified framework.
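The kind of temporal-frequency control described above can be illustrated with a small sketch. This is a generic construction under stated assumptions, not the thesis's actual stimuli: it band-pass filters white spatiotemporal noise along the time axis via the FFT, so the resulting dynamic pattern contains only a chosen band of temporal frequencies. The function name and parameters are hypothetical.

```python
# Hypothetical sketch: build a dynamic masker whose temporal-frequency content
# is restricted to a chosen band by Fourier filtering along the time axis.
import numpy as np

def temporal_bandpass_noise(n_frames, h, w, lo_hz, hi_hz, fps=60, seed=0):
    """White (T, H, W) noise band-pass filtered in temporal frequency."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((n_frames, h, w))
    spectrum = np.fft.rfft(noise, axis=0)              # per-pixel time spectra
    freqs = np.fft.rfftfreq(n_frames, d=1.0 / fps)     # bin frequencies in Hz
    keep = (freqs >= lo_hz) & (freqs <= hi_hz)
    spectrum[~keep] = 0.0                              # zero out-of-band bins
    return np.fft.irfft(spectrum, n=n_frames, axis=0)

# e.g. a 64-frame masker containing only 5-10 Hz temporal energy at 60 fps:
masker = temporal_bandpass_noise(64, 32, 32, 5.0, 10.0)
```

Sweeping `lo_hz`/`hi_hz` across such stimuli is one way to probe temporal-frequency tuning while holding other low-level properties fixed.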