Deep Local and Global Spatiotemporal Feature Aggregation for Blind Video Quality Assessment
In recent years, deep learning has achieved promising success for multimedia
quality assessment, especially for image quality assessment (IQA). However,
since videos exhibit more complex temporal characteristics than images, little
work has applied powerful deep convolutional neural networks (DCNNs) to video
quality assessment (VQA). In this paper, we propose an
efficient VQA method named Deep SpatioTemporal video Quality assessor (DeepSTQ)
to predict the perceptual quality of various distorted videos in a no-reference
manner. In the proposed DeepSTQ, we first extract local and global
spatiotemporal features by pre-trained deep learning models without fine-tuning
or training from scratch. The composite features are computed from both distorted
video frames and frame difference maps, each viewed globally and locally. The
aggregated features are then fed to a regression model to predict the
perceptual video quality. Finally, experimental results demonstrate that our
proposed DeepSTQ outperforms state-of-the-art quality assessment algorithms.
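The pipeline above (frames plus frame-difference maps, each described globally and locally, then aggregated) can be illustrated with a minimal sketch. The pre-trained DCNN feature extractor is replaced here by simple patch statistics purely for illustration; all function names are hypothetical, not from the paper.

```python
import numpy as np

def local_global_stats(img, grid=2):
    """Global mean/std plus the same stats over a grid of local patches."""
    feats = [img.mean(), img.std()]
    h, w = img.shape
    ph, pw = h // grid, w // grid
    for i in range(grid):
        for j in range(grid):
            patch = img[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
            feats += [patch.mean(), patch.std()]
    return np.array(feats)  # 2 + 2 * grid**2 values

def deepstq_style_features(frames, grid=2):
    """Aggregate local/global stats over frames and frame-difference maps."""
    frame_feats = np.mean([local_global_stats(f, grid) for f in frames], axis=0)
    diffs = [frames[k + 1] - frames[k] for k in range(len(frames) - 1)]
    diff_feats = np.mean([local_global_stats(d, grid) for d in diffs], axis=0)
    return np.concatenate([frame_feats, diff_feats])

# Toy "video": 5 grayscale frames of 32x32 noise
rng = np.random.default_rng(0)
video = [rng.standard_normal((32, 32)) for _ in range(5)]
feats = deepstq_style_features(video)
print(feats.shape)  # (20,) with grid=2: two groups of 2 + 2*4 stats
```

In the actual method the per-frame statistics would be replaced by activations of a frozen pre-trained network, and the final vector would be fed to a learned regressor.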
Blind Quality Assessment for Image Superresolution Using Deep Two-Stream Convolutional Networks
Numerous image superresolution (SR) algorithms have been proposed for
reconstructing high-resolution (HR) images from input images with lower spatial
resolutions. However, effectively evaluating the perceptual quality of SR
images remains a challenging research problem. In this paper, we propose a
no-reference/blind deep neural network-based SR image quality assessor
(DeepSRQ). To learn more discriminative feature representations of various
distorted SR images, the proposed DeepSRQ is a two-stream convolutional network
including two subcomponents for distorted structure and texture SR images.
Unlike traditional image distortions, SR artifacts degrade both image
structure and texture quality. Therefore, we choose a
two-stream scheme that captures different properties of SR inputs instead of
directly learning features from one image stream. Considering the human visual
system (HVS) characteristics, the structure stream focuses on extracting
features in structural degradations, while the texture stream focuses on the
change in textural distributions. In addition, to augment the training data and
ensure the category balance, we propose a stride-based adaptive cropping
approach for further improvement. Experimental results on three publicly
available SR image quality databases demonstrate the effectiveness and
generalization ability of our proposed DeepSRQ method compared with
state-of-the-art image quality assessment algorithms.
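The stride-based adaptive cropping idea can be sketched as follows. The paper does not spell out the exact stride rule, so this is one plausible reading, in which the stride adapts to the image size so that every image, whatever its resolution, yields the same number of training patches (helping category balance):

```python
import numpy as np

def adaptive_crops(img, patch=32, per_side=4):
    """Crop per_side x per_side patches, with the stride adapted to the
    image size so images of different resolutions all yield the same
    number of crops."""
    h, w = img.shape
    sh = (h - patch) // (per_side - 1)  # adaptive vertical stride
    sw = (w - patch) // (per_side - 1)  # adaptive horizontal stride
    crops = []
    for i in range(per_side):
        for j in range(per_side):
            y, x = i * sh, j * sw
            crops.append(img[y:y + patch, x:x + patch])
    return np.stack(crops)

small = np.zeros((96, 96))
large = np.zeros((200, 300))
print(adaptive_crops(small).shape, adaptive_crops(large).shape)
# both (16, 32, 32): the patch count is resolution-independent
```

In DeepSRQ each crop would then be split into its structure and texture representations and fed to the corresponding stream.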
No-Reference Quality Assessment for 360-degree Images by Analysis of Multi-frequency Information and Local-global Naturalness
360-degree/omnidirectional images (OIs) have attracted remarkable attention
due to the increasing applications of virtual reality (VR). Compared to
conventional 2D images, OIs provide a more immersive experience to consumers,
benefiting from their higher resolution and plentiful fields of view (FoVs).
Moreover, OIs are usually viewed through a head-mounted display (HMD) without
references. Therefore, an efficient blind quality assessment method, which is
specifically designed for 360-degree images, is urgently desired. In this
paper, motivated by the characteristics of the human visual system (HVS) and
the viewing process of VR visual contents, we propose a novel and effective
no-reference omnidirectional image quality assessment (NR OIQA) algorithm by
Multi-Frequency Information and Local-Global Naturalness (MFILGN).
Specifically, inspired by the frequency-dependent property of visual cortex, we
first decompose equirectangular projection (ERP) maps into
wavelet subbands. Then, the entropy intensities of low and high frequency
subbands are exploited to measure the multi-frequency information of OIs.
In addition to the global naturalness of ERP maps, motivated by
the FoVs actually browsed by viewers, we extract natural scene statistics features from each
viewport image as the measure of local naturalness. With the proposed
multi-frequency information measurement and local-global naturalness
measurement, we utilize support vector regression as the final image quality
regressor to train the quality evaluation model from visual quality-related
features to human ratings. To our knowledge, the proposed model is the first
no-reference quality assessment method for 360-degree images that combines
multi-frequency information and image naturalness. Experimental results on two
publicly available OIQA databases demonstrate that our proposed MFILGN
outperforms state-of-the-art approaches.
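The multi-frequency information measurement can be sketched with a single-level Haar decomposition and histogram entropy. MFILGN itself computes entropy on wavelet subbands of ERP maps; the specific wavelet (Haar) and entropy estimator used below are illustrative assumptions, not the paper's exact choices.

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D Haar transform: returns LL, LH, HL, HH subbands."""
    lo = (img[:, 0::2] + img[:, 1::2]) / 2.0   # row lowpass
    hi = (img[:, 0::2] - img[:, 1::2]) / 2.0   # row highpass
    LL = (lo[0::2] + lo[1::2]) / 2.0
    LH = (lo[0::2] - lo[1::2]) / 2.0
    HL = (hi[0::2] + hi[1::2]) / 2.0
    HH = (hi[0::2] - hi[1::2]) / 2.0
    return LL, LH, HL, HH

def subband_entropy(band, bins=64):
    """Shannon entropy (bits) of the subband coefficient histogram."""
    counts, _ = np.histogram(band, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
erp = rng.random((64, 128))  # stand-in for a grayscale ERP map
feats = [subband_entropy(b) for b in haar_dwt2(erp)]
print(len(feats))  # 4 entropy features, one per subband
```

These multi-frequency features would then be concatenated with the global (ERP) and local (per-viewport) natural scene statistics features and passed to a support vector regressor.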
Stereoscopic video quality prediction based on end-to-end dual stream deep neural networks
In this paper, we propose a no-reference stereoscopic video quality assessment (NR-SVQA) method based on an end-to-end dual stream deep neural network (DNN) that incorporates left- and right-view sub-networks. The end-to-end dual stream network takes image patch pairs from the pivotal frames of the left and right views as inputs and evaluates the perceptual quality of each patch pair. By combining multiple convolution, max-pooling and fully-connected layers with regression in one framework, distortion-related features are learned end-to-end in a purely data-driven manner. A spatiotemporal pooling strategy is then applied over these patch-pair scores to estimate the quality of the entire stereoscopic video. The proposed network architecture, which we name End-to-end Dual stream deep Neural network (EDN), is trained and tested on a well-known stereoscopic video dataset split by reference videos. Experimental results demonstrate that our proposed method outperforms state-of-the-art algorithms.
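The patch-pair scoring and pooling stages of this pipeline can be sketched as below. The dual stream DNN is replaced by a hypothetical `score_pair` stub, and simple mean pooling stands in for the paper's spatiotemporal pooling strategy; both are assumptions for illustration only.

```python
import numpy as np

def extract_patch_pairs(left, right, patch=32, stride=32):
    """Co-located patch pairs from a left/right pivotal frame pair."""
    h, w = left.shape
    pairs = []
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            pairs.append((left[y:y + patch, x:x + patch],
                          right[y:y + patch, x:x + patch]))
    return pairs

def score_pair(lp, rp):
    """Hypothetical stand-in for the dual stream DNN's per-pair score."""
    return float(np.abs(lp - rp).mean())

def video_quality(pivotal_frames):
    """Pool patch-pair scores over all pivotal frames (mean pooling here)."""
    scores = [score_pair(lp, rp)
              for left, right in pivotal_frames
              for lp, rp in extract_patch_pairs(left, right)]
    return float(np.mean(scores))

rng = np.random.default_rng(1)
frames = [(rng.random((64, 64)), rng.random((64, 64))) for _ in range(3)]
q = video_quality(frames)
print(round(q, 3))
```

In the actual EDN, each patch pair is fed through the left- and right-view sub-networks jointly, so the learned score reflects binocular distortion rather than a simple view difference.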