Visual Comfort Assessment for Stereoscopic Image Retargeting
In recent years, visual comfort assessment (VCA) for 3D/stereoscopic content
has attracted extensive attention. However, much less work has been done on the
perceptual evaluation of stereoscopic image retargeting. In this paper, we
first build a Stereoscopic Image Retargeting Database (SIRD), which contains
source images and retargeted images produced by four typical stereoscopic
retargeting methods. Then, a subjective experiment is conducted to assess
four aspects of visual distortion, i.e. visual comfort, image quality, depth
quality and the overall quality. Furthermore, we propose a Visual Comfort
Assessment metric for Stereoscopic Image Retargeting (VCA-SIR). Based on the
characteristics of stereoscopic retargeted images, the proposed model
introduces novel features such as disparity range, boundary disparity, and
disparity intensity distribution into the assessment model. Experimental
results demonstrate that VCA-SIR achieves high consistency with subjective
perception.
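The disparity-based features named above can be sketched as follows. This is a minimal illustration, not the paper's definition: the percentile window, the border-pixel definition of boundary disparity, and the histogram binning are all assumptions.

```python
import numpy as np

def comfort_features(disparity, n_bins=8):
    """Sketch of disparity-based comfort features of the kind the abstract
    names: disparity range, boundary disparity, and disparity intensity
    distribution. Exact definitions in VCA-SIR may differ."""
    d = np.asarray(disparity, dtype=float)
    # Disparity range: spread between extreme disparities (robust percentiles).
    disp_range = np.percentile(d, 95) - np.percentile(d, 5)
    # Boundary disparity: mean disparity along the image borders, where
    # stereoscopic window violations after retargeting cause discomfort.
    border = np.concatenate([d[0, :], d[-1, :], d[:, 0], d[:, -1]])
    boundary_disp = border.mean()
    # Disparity intensity distribution: normalized disparity histogram.
    hist, _ = np.histogram(d, bins=n_bins, range=(d.min(), d.max() + 1e-9))
    dist = hist / hist.sum()
    return disp_range, boundary_disp, dist
```

A regression model would then map such features (plus others) to a predicted comfort score.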
Deep Local and Global Spatiotemporal Feature Aggregation for Blind Video Quality Assessment
In recent years, deep learning has achieved promising success for multimedia
quality assessment, especially for image quality assessment (IQA). However,
since there exist more complex temporal characteristics in videos, very little
work has been done on video quality assessment (VQA) by exploiting powerful
deep convolutional neural networks (DCNNs). In this paper, we propose an
efficient VQA method named Deep SpatioTemporal video Quality assessor (DeepSTQ)
to predict the perceptual quality of various distorted videos in a no-reference
manner. In the proposed DeepSTQ, we first extract local and global
spatiotemporal features by pre-trained deep learning models without fine-tuning
or training from scratch. The composite features cover distorted video
frames as well as frame difference maps from both global and local views. Then,
feature aggregation is conducted by a regression model to predict the
perceptual video quality. Finally, experimental results demonstrate that the
proposed DeepSTQ outperforms state-of-the-art quality assessment algorithms.
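The pipeline the abstract describes (frozen pre-trained features over frames and frame-difference maps, pooled into a global descriptor, then regressed to a quality score) can be sketched roughly as below. The fixed random projection standing in for a pre-trained DCNN backbone, the choice of per-frame statistics, and the linear regressor are all assumptions for illustration.

```python
import numpy as np

# Fixed "pre-trained" projection standing in for a frozen DCNN backbone
# (DeepSTQ uses real pre-trained deep models without fine-tuning).
_W = np.random.default_rng(42).normal(size=(2, 16))

def video_descriptor(frames):
    """Aggregate local (per-frame) features into a global video descriptor,
    using both the frames and their frame-difference maps."""
    feats = []
    for t, frame in enumerate(frames):
        diff = frames[t] - frames[t - 1] if t > 0 else np.zeros_like(frame)
        stats = np.array([frame.std(), np.abs(diff).mean()])  # local stats
        feats.append(stats @ _W)                              # frozen embedding
    return np.asarray(feats).mean(axis=0)                     # global pooling

def fit_quality_regressor(descriptors, mos):
    """Regression-based aggregation: least-squares map from descriptors
    to (hypothetical) mean opinion scores."""
    X = np.c_[descriptors, np.ones(len(descriptors))]
    w, *_ = np.linalg.lstsq(X, mos, rcond=None)
    return lambda d: np.r_[d, 1.0] @ w
```

In the actual method the regressor sees far richer multi-layer CNN features; the structure (extract, pool, regress) is the point here.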
Deep Multi-Scale Features Learning for Distorted Image Quality Assessment
Image quality assessment (IQA) aims to estimate image visual quality as
perceived by humans. Although existing deep neural networks (DNNs) have shown
significant effectiveness in tackling the IQA problem, DNN-based quality
assessment models can still be improved by exploiting efficient multi-scale
features. In this paper, motivated by the human visual system (HVS), which
combines multi-scale features for perception, we propose pyramid feature
learning to build a DNN with hierarchical multi-scale features for
distorted image quality prediction. Our model is based on both residual maps
and distorted images in the luminance domain, and the proposed network
incorporates spatial pyramid pooling and a feature pyramid into its structure.
The network is optimized end-to-end with deep supervision. To validate the
effectiveness of the proposed method, extensive experiments are conducted on
four widely used image quality assessment databases, demonstrating the
superiority of our algorithm.
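Spatial pyramid pooling, one of the two multi-scale components the abstract names, can be sketched as follows. The pyramid levels and max-pooling choice are standard SPP conventions, not details confirmed by the abstract.

```python
import numpy as np

def spatial_pyramid_pool(fmap, levels=(1, 2, 4)):
    """Minimal sketch of spatial pyramid pooling (SPP): max-pool a feature
    map over grids of increasing resolution and concatenate the results,
    yielding a fixed-length multi-scale descriptor regardless of the
    input's spatial size."""
    C, H, W = fmap.shape
    pooled = []
    for n in levels:
        # Split the feature map into an n-by-n grid of cells.
        hs = np.linspace(0, H, n + 1).astype(int)
        ws = np.linspace(0, W, n + 1).astype(int)
        for i in range(n):
            for j in range(n):
                cell = fmap[:, hs[i]:hs[i + 1], ws[j]:ws[j + 1]]
                pooled.append(cell.max(axis=(1, 2)))  # per-channel max
    # Output length: C * sum(n * n for n in levels), independent of H, W.
    return np.concatenate(pooled)
```

The fixed output length is what lets a quality regressor accept distorted images of arbitrary resolution.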
Semantics-Aligned Representation Learning for Person Re-identification
Person re-identification (reID) aims to match person images to retrieve the
ones with the same identity. This is a challenging task, as the images to be
matched are generally semantically misaligned due to the diversity of human
poses and capture viewpoints, incompleteness of the visible bodies (due to
occlusion), etc. In this paper, we propose a framework that drives the reID
network to learn semantics-aligned feature representation through delicate
supervision designs. Specifically, we build a Semantics Aligning Network (SAN),
which consists of a base network as the encoder (SA-Enc) for reID and a decoder
(SA-Dec) for reconstructing/regressing the densely semantics-aligned full
texture image. We jointly train the SAN under the supervisions of person
re-identification and aligned texture generation. Moreover, at the decoder,
besides the reconstruction loss, we add Triplet ReID constraints over the
feature maps as the perceptual losses. The decoder is discarded in the
inference and thus our scheme is computationally efficient. Ablation studies
demonstrate the effectiveness of our design. We achieve state-of-the-art
performance on the benchmark datasets CUHK03, Market1501, MSMT17, and the
partial person reID dataset Partial REID. Code for our proposed method is
available at:
https://github.com/microsoft/Semantics-Aligned-Representation-Learning-for-Person-Re-identification.
Comment: Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20);
code has been released.
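The Triplet ReID constraint mentioned in the abstract is the standard triplet margin loss: pull same-identity features together and push different identities apart. A minimal sketch, where the margin value and the use of flattened feature vectors are assumptions:

```python
import numpy as np

def triplet_reid_loss(anchor, positive, negative, margin=0.3):
    """Triplet margin loss on feature vectors: the anchor-positive distance
    should be smaller than the anchor-negative distance by at least
    `margin`; otherwise the violation is penalized linearly."""
    d_pos = np.linalg.norm(anchor - positive)  # same identity
    d_neg = np.linalg.norm(anchor - negative)  # different identity
    return max(d_pos - d_neg + margin, 0.0)
```

In SAN this constraint is applied as a perceptual loss over decoder feature maps during training only, which is why the decoder can be discarded at inference.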
Blind Omnidirectional Image Quality Assessment with Viewport Oriented Graph Convolutional Networks
Quality assessment of omnidirectional images has become increasingly urgent
due to the rapid growth of virtual reality applications. Different from
traditional 2D images and videos, omnidirectional contents can provide
consumers with freely changeable viewports and a larger field of view covering
the spherical surface, which makes the objective
quality assessment of omnidirectional images more challenging. In this paper,
motivated by the characteristics of the human visual system (HVS) and the
viewing process of omnidirectional contents, we propose a novel Viewport
oriented Graph Convolution Network (VGCN) for blind omnidirectional image
quality assessment (IQA). Generally, observers tend to rate a 360-degree image
after viewing and aggregating information from different viewports while
browsing the spherical scenery. Therefore, in order to model
the mutual dependency of viewports in the omnidirectional image, we build a
spatial viewport graph. Specifically, the graph nodes are first defined as
selected viewports with a higher probability of being viewed, inspired by the
fact that the HVS is more sensitive to structural information. Then,
these nodes are connected by spatial relations to capture interactions among
them. Finally, reasoning on the proposed graph is performed via graph
convolutional networks. Moreover, we simultaneously obtain global quality using
the entire omnidirectional image without viewport sampling to boost the
performance according to the viewing experience. Experimental results
demonstrate that our proposed model outperforms state-of-the-art full-reference
and no-reference IQA metrics on two public omnidirectional IQA databases.
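A single graph-convolution step over such a viewport graph can be sketched as below. The symmetric normalization and ReLU follow the common Kipf-Welling formulation; the paper's exact propagation rule, viewport count, and edge definition are not confirmed by the abstract.

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph convolution over a viewport graph: nodes are sampled
    viewports, edges encode their spatial relations. Propagation uses the
    renormalized adjacency D^(-1/2) (A + I) D^(-1/2) followed by a linear
    map and ReLU."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # symmetric normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)

# Hypothetical toy graph: 4 viewports connected in a chain.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.eye(4)        # one-hot node features (placeholder for viewport features)
W = np.ones((4, 2))  # toy weight matrix
H = gcn_layer(A, X, W)
```

Stacking a few such layers lets each viewport's quality estimate absorb information from its spatial neighbours, which is the mutual dependency the abstract models.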
Holographic description of elastic photon-proton and photon-photon scattering
We investigate the elastic photon-proton and photon-photon scattering in a
holographic QCD model, focusing on the Regge regime. Considering contributions
of the Pomeron and Reggeon exchange, the total and differential cross sections
are calculated. While our model involves several parameters, by virtue of the
universality of the Pomeron and Reggeon, for most of them the values determined
in the preceding study on the proton-proton and proton-antiproton scattering
can be employed. Once the two adjustable parameters, the Pomeron-photon and
Reggeon-photon coupling constants, are determined from the experimental data on
the total cross sections, both cross sections can be predicted in a wide
kinematic region, from the GeV to the TeV scale. We show that the
total cross section data can be well described within the model, and our
predictions for the photon-proton differential cross section are consistent
with the data.
Comment: 13 pages, 3 figures.
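For context (this is the generic Pomeron-plus-Reggeon phenomenology, not necessarily the paper's own parameterization), total cross sections in the Regge regime are commonly described by the Donnachie-Landshoff form, with a soft-Pomeron intercept slightly above one and a Reggeon intercept near one half:

```latex
\sigma_{\mathrm{tot}}(s) = X\, s^{\epsilon} + Y\, s^{-\eta},
\qquad
\epsilon = \alpha_P(0) - 1 \approx 0.08,
\qquad
\eta = 1 - \alpha_R(0) \approx 0.45 ,
```

where the process-dependent normalizations X and Y play the role of the Pomeron-photon and Reggeon-photon couplings that the abstract fits to the total cross section data.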
- …