Search CORE

50,311 research outputs found

Hallucinated-IQA: No-Reference Image Quality Assessment via Adversarial Learning

Author: Lin Kwan-Yee
Wang Guanxiang
Publication venue
Publication date: 05/04/2018
Field of study

No-reference image quality assessment (NR-IQA) is a fundamental yet challenging task in low-level computer vision community. The difficulty is particularly pronounced for the limited information, for which the corresponding reference for comparison is typically absent. Although various feature extraction mechanisms have been leveraged from natural scene statistics to deep neural networks in previous methods, the performance bottleneck still exists. In this work, we propose a hallucination-guided quality regression network to address the issue. We firstly generate a hallucinated reference constrained on the distorted image, to compensate the absence of the true reference. Then, we pair the information of hallucinated reference with the distorted image, and forward them to the regressor to learn the perceptual discrepancy with the guidance of an implicit ranking relationship within the generator, and therefore produce the precise quality prediction. To demonstrate the effectiveness of our approach, comprehensive experiments are evaluated on four popular image quality assessment benchmarks. Our method significantly outperforms all the previous state-of-the-art methods by large margins. The code and model will be publicly available on the project page https://kwanyeelin.github.io/projects/HIQA/HIQA.html.Comment: Accepted to CVPR201

arXiv.org e-Print Archive

Adaptive Quantile Sparse Image (AQuaSI) Prior for Inverse Imaging Problems

Author: Köhler Thomas
Riess Christian
Schirrmacher Franziska
Publication venue
Publication date: 21/02/2020
Field of study

Inverse problems play a central role for many classical computer vision and image processing tasks. Many inverse problems are ill-posed, and hence require a prior to regularize the solution space. However, many of the existing priors, like total variation, are based on ad-hoc assumptions that have difficulties to represent the actual distribution of natural images. Thus, a key challenge in research on image processing is to find better suited priors to represent natural images. In this work, we propose the Adaptive Quantile Sparse Image (AQuaSI) prior. It is based on a quantile filter, can be used as a joint filter on guidance data, and be readily plugged into a wide range of numerical optimization algorithms. We demonstrate the efficacy of the proposed prior in joint RGB/depth upsampling, on RGB/NIR image restoration, and in a comparison with related regularization by denoising approaches

arXiv.org e-Print Archive

Towards Fine-grained Human Pose Transfer with Detail Replenishing Network

Author: Gao Wen
Gao Zhanning
Hua Xiansheng
Liu Chang
Ma Siwei
Ren Peiran
Wang Pan
Wang Shanshe
Yang Lingbo
Zhang Xinfeng
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 25/05/2020
Field of study

Human pose transfer (HPT) is an emerging research topic with huge potential in fashion design, media production, online advertising and virtual reality. For these applications, the visual realism of fine-grained appearance details is crucial for production quality and user engagement. However, existing HPT methods often suffer from three fundamental issues: detail deficiency, content ambiguity and style inconsistency, which severely degrade the visual quality and realism of generated images. Aiming towards real-world applications, we develop a more challenging yet practical HPT setting, termed as Fine-grained Human Pose Transfer (FHPT), with a higher focus on semantic fidelity and detail replenishment. Concretely, we analyze the potential design flaws of existing methods via an illustrative example, and establish the core FHPT methodology by combing the idea of content synthesis and feature transfer together in a mutually-guided fashion. Thereafter, we substantiate the proposed methodology with a Detail Replenishing Network (DRN) and a corresponding coarse-to-fine model training scheme. Moreover, we build up a complete suite of fine-grained evaluation protocols to address the challenges of FHPT in a comprehensive manner, including semantic analysis, structural detection and perceptual quality assessment. Extensive experiments on the DeepFashion benchmark dataset have verified the power of proposed benchmark against start-of-the-art works, with 12\%-14\% gain on top-10 retrieval recall, 5\% higher joint localization accuracy, and near 40\% gain on face identity preservation. Moreover, the evaluation results offer further insights to the subject matter, which could inspire many promising future works along this direction.Comment: IEEE TIP submissio

arXiv.org e-Print Archive

Learning to Calibrate Straight Lines for Fisheye Image Rectification

Author: Shen Weiming
Xia Gui-Song
Xue Nan
Xue Zhucun
Publication venue
Publication date: 01/05/2019
Field of study

This paper presents a new deep-learning based method to simultaneously calibrate the intrinsic parameters of fisheye lens and rectify the distorted images. Assuming that the distorted lines generated by fisheye projection should be straight after rectification, we propose a novel deep neural network to impose explicit geometry constraints onto processes of the fisheye lens calibration and the distorted image rectification. In addition, considering the nonlinearity of distortion distribution in fisheye images, the proposed network fully exploits multi-scale perception to equalize the rectification effects on the whole image. To train and evaluate the proposed model, we also create a new largescale dataset labeled with corresponding distortion parameters and well-annotated distorted lines. Compared with the state-of-the-art methods, our model achieves the best published rectification quality and the most accurate estimation of distortion parameters on a large set of synthetic and real fisheye images

arXiv.org e-Print Archive

Perception, Attention, and Resources: A Decision-Theoretic Approach to Graphics Rendering

Author: Horvitz Eric J.
Lengyel Jed
Publication venue
Publication date: 06/02/2013
Field of study

We describe work to control graphics rendering under limited computational resources by taking a decision-theoretic perspective on perceptual costs and computational savings of approximations. The work extends earlier work on the control of rendering by introducing methods and models for computing the expected cost associated with degradations of scene components. The expected cost is computed by considering the perceptual cost of degradations and a probability distribution over the attentional focus of viewers. We review the critical literature describing findings on visual search and attention, discuss the implications of the findings, and introduce models of expected perceptual cost. Finally, we discuss policies that harness information about the expected cost of scene components.Comment: Appears in Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence (UAI1997

arXiv.org e-Print Archive

Cube Padding for Weakly-Supervised Saliency Prediction in 360{\deg} Videos

Author: Chao Chun-Hung
Cheng Hsien-Tzu
Dong Jin-Dong
Liu Tyng-Luh
Sun Min
Wen Hao-Kai
Publication venue
Publication date: 04/06/2018
Field of study

Automatic saliency prediction in 360{\deg} videos is critical for viewpoint guidance applications (e.g., Facebook 360 Guide). We propose a spatial-temporal network which is (1) weakly-supervised trained and (2) tailor-made for 360{\deg} viewing sphere. Note that most existing methods are less scalable since they rely on annotated saliency map for training. Most importantly, they convert 360{\deg} sphere to 2D images (e.g., a single equirectangular image or multiple separate Normal Field-of-View (NFoV) images) which introduces distortion and image boundaries. In contrast, we propose a simple and effective Cube Padding (CP) technique as follows. Firstly, we render the 360{\deg} view on six faces of a cube using perspective projection. Thus, it introduces very little distortion. Then, we concatenate all six faces while utilizing the connectivity between faces on the cube for image padding (i.e., Cube Padding) in convolution, pooling, convolutional LSTM layers. In this way, CP introduces no image boundary while being applicable to almost all Convolutional Neural Network (CNN) structures. To evaluate our method, we propose Wild-360, a new 360{\deg} video saliency dataset, containing challenging videos with saliency heatmap annotations. In experiments, our method outperforms baseline methods in both speed and quality.Comment: CVPR 201

arXiv.org e-Print Archive

Blind Predicting Similar Quality Map for Image Quality Assessment

Author: Fu Sizhe
Hou Ming
Pan Da
Shi Ping
Ying Zefeng
Zhang Yuan
Publication venue
Publication date: 10/03/2019
Field of study

A key problem in blind image quality assessment (BIQA) is how to effectively model the properties of human visual system in a data-driven manner. In this paper, we propose a simple and efficient BIQA model based on a novel framework which consists of a fully convolutional neural network (FCNN) and a pooling network to solve this problem. In principle, FCNN is capable of predicting a pixel-by-pixel similar quality map only from a distorted image by using the intermediate similarity maps derived from conventional full-reference image quality assessment methods. The predicted pixel-by-pixel quality maps have good consistency with the distortion correlations between the reference and distorted images. Finally, a deep pooling network regresses the quality map into a score. Experiments have demonstrated that our predictions outperform many state-of-the-art BIQA methods

arXiv.org e-Print Archive

Cycle-IR: Deep Cyclic Image Retargeting

Author: Lin Chumin
Niu Xuejing
Tan Weimin
Yan Bo
Publication venue
Publication date: 09/05/2019
Field of study

Supervised deep learning techniques have achieved great success in various fields due to getting rid of the limitation of handcrafted representations. However, most previous image retargeting algorithms still employ fixed design principles such as using gradient map or handcrafted features to compute saliency map, which inevitably restricts its generality. Deep learning techniques may help to address this issue, but the challenging problem is that we need to build a large-scale image retargeting dataset for the training of deep retargeting models. However, building such a dataset requires huge human efforts. In this paper, we propose a novel deep cyclic image retargeting approach, called Cycle-IR, to firstly implement image retargeting with a single deep model, without relying on any explicit user annotations. Our idea is built on the reverse mapping from the retargeted images to the given input images. If the retargeted image has serious distortion or excessive loss of important visual information, the reverse mapping is unlikely to restore the input image well. We constrain this forward-reverse consistency by introducing a cyclic perception coherence loss. In addition, we propose a simple yet effective image retargeting network (IRNet) to implement the image retargeting process. Our IRNet contains a spatial and channel attention layer, which is able to discriminate visually important regions of input images effectively, especially in cluttered images. Given arbitrary sizes of input images and desired aspect ratios, our Cycle-IR can produce visually pleasing target images directly. Extensive experiments on the standard RetargetMe dataset show the superiority of our Cycle-IR. In addition, our Cycle-IR outperforms the Multiop method and obtains the best result in the user study. Code is available at https://github.com/mintanwei/Cycle-IR.Comment: 12 page

arXiv.org e-Print Archive

Attention-aware Multi-stroke Style Transfer

Author: Liu Weidong
Liu Yong-Jin
Ren Jianqiang
Wang Jun
Xie Xuansong
Yao Yuan
Publication venue
Publication date: 15/01/2019
Field of study

Neural style transfer has drawn considerable attention from both academic and industrial field. Although visual effect and efficiency have been significantly improved, existing methods are unable to coordinate spatial distribution of visual attention between the content image and stylized image, or render diverse level of detail via different brush strokes. In this paper, we tackle these limitations by developing an attention-aware multi-stroke style transfer model. We first propose to assemble self-attention mechanism into a style-agnostic reconstruction autoencoder framework, from which the attention map of a content image can be derived. By performing multi-scale style swap on content features and style features, we produce multiple feature maps reflecting different stroke patterns. A flexible fusion strategy is further presented to incorporate the salient characteristics from the attention map, which allows integrating multiple stroke patterns into different spatial regions of the output image harmoniously. We demonstrate the effectiveness of our method, as well as generate comparable stylized images with multiple stroke patterns against the state-of-the-art methods

arXiv.org e-Print Archive

Face Image Reflection Removal

Author: Duan Ling-Yu
Kot Alex C.
Li Haoliang
Shi Boxin
Wan Renjie
Publication venue
Publication date: 03/03/2019
Field of study

Face images captured through the glass are usually contaminated by reflections. The non-transmitted reflections make the reflection removal more challenging than for general scenes, because important facial features are completely occluded. In this paper, we propose and solve the face image reflection removal problem. We remove non-transmitted reflections by incorporating inpainting ideas into a guided reflection removal framework and recover facial features by considering various face-specific priors. We use a newly collected face reflection image dataset to train our model and compare with state-of-the-art methods. The proposed method shows advantages in estimating reflection-free face images for improving face recognition

arXiv.org e-Print Archive