Search CORE

7,624 research outputs found

Difficulty-aware Image Super Resolution via Deep Adaptive Dual-Network

Author: Qin Jinghui
Shi Yukai
Wen Wushao
Xie Ziwei
Publication date: 01/05/2019

Recently, deep learning based single image super-resolution (SR) approaches have made great progress. State-of-the-art SR methods usually adopt a feed-forward pipeline to establish a non-linear mapping between low-resolution (LR) and high-resolution (HR) images. However, because they treat all image regions equally without considering the diversity in difficulty, these approaches meet an upper bound for optimization. To address this issue, we propose a novel SR approach that processes each region within an image according to its difficulty. Specifically, we propose a dual-way SR network in which one way is trained to focus on easy image regions and the other is trained to handle hard image regions. To identify whether a region is easy or hard, we propose a novel image difficulty recognition network based on a PSNR prior. Our SR approach, which uses the region mask to adaptively enforce the dual-way SR network, yields superior results. Extensive experiments on several standard benchmarks (e.g., Set5, Set14, BSD100, and Urban100) show that our approach achieves state-of-the-art performance.

Comment: ICME2019 (Oral), code and results are available at: https://github.com/xzwlx/Difficulty-S

arXiv.org e-Print Archive
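
The region-wise routing described in the abstract above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a patch counts as hard when a baseline SR result scores below a PSNR threshold; the threshold value, the function names, and the hard 0/1 mask are hypothetical and are not taken from the paper, where a trained recognition network predicts the mask.

```python
import math

def psnr(ref, est, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

def difficulty_mask(hr_patches, sr_patches, threshold_db=30.0):
    """Label a patch hard (1) when the baseline result is below the threshold.

    threshold_db is an assumed value chosen for this toy example.
    """
    return [1 if psnr(hr, sr) < threshold_db else 0
            for hr, sr in zip(hr_patches, sr_patches)]

def fuse(easy_out, hard_out, mask):
    """Take the hard-branch output for hard patches, the easy-branch output otherwise."""
    return [hard if m else easy for easy, hard, m in zip(easy_out, hard_out, mask)]

# Two toy 4-pixel patches: the first is reconstructed well, the second poorly.
hr = [[100, 100, 100, 100], [0, 255, 0, 255]]
baseline = [[101, 99, 100, 100], [40, 200, 60, 190]]
mask = difficulty_mask(hr, baseline)
fused = fuse(["easy-0", "easy-1"], ["hard-0", "hard-1"], mask)
```

Here the second patch falls well below the assumed threshold, so only that patch is routed to the hard branch.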

NTIRE 2020 Challenge on Spectral Reconstruction from an RGB Image

Author: Arad Boaz
Ben-Shahar Ohad
Finlayson Graham
Givati Shai
Lin Yi-Tun
others
Timofte Radu
Publication date: 07/05/2020

This paper reviews the second challenge on spectral reconstruction from RGB images, i.e., the recovery of whole-scene hyperspectral (HS) information from a 3-channel RGB image. As in the previous challenge, two tracks were provided: (i) a "Clean" track, where HS images are estimated from noise-free RGB images that are themselves calculated numerically from the ground-truth HS images and supplied spectral sensitivity functions, and (ii) a "Real World" track, simulating capture by an uncalibrated and unknown camera, where the HS images are recovered from noisy JPEG-compressed RGB images. A new, larger-than-ever natural hyperspectral image data set is presented, containing a total of 510 HS images. The Clean and Real World tracks had 103 and 78 registered participants respectively, with 14 teams competing in the final testing phase. A description of the proposed methods, alongside their challenge scores and an extensive evaluation of top-performing methods, is also provided. They gauge the state of the art in spectral reconstruction from an RGB image.

arXiv.org e-Print Archive
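
For the "Clean" track described in the abstract above, the RGB inputs are computed numerically from the HS ground truth and spectral sensitivity functions. A minimal sketch of that projection for a single pixel follows; the 4-band spectrum and the sensitivity curves are made-up illustrative values, not the challenge's data, and the function name is hypothetical.

```python
def hs_to_rgb(spectrum, sensitivities):
    """Project one pixel's spectrum onto three channel sensitivity curves.

    Each output channel is the weighted sum over bands of
    sensitivity * spectral intensity.
    """
    return [sum(s * v for s, v in zip(curve, spectrum)) for curve in sensitivities]

# Hypothetical 4-band spectrum and sensitivity curves (rows: R, G, B).
spectrum = [0.2, 0.4, 0.6, 0.8]
sensitivities = [
    [0.0, 0.0, 0.5, 0.5],  # R responds to the two longest-wavelength bands
    [0.0, 1.0, 0.0, 0.0],  # G responds to the second band only
    [1.0, 0.0, 0.0, 0.0],  # B responds to the first band only
]
rgb = hs_to_rgb(spectrum, sensitivities)
```

Spectral reconstruction is the inverse of this many-to-three projection, which is why it is ill-posed: many different spectra map to the same RGB triple.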

DAVANet: Stereo Deblurring with View Aggregation

Author: Pan Jinshan
Ren Jimmy
Xie Haozhe
Zhang Jiawei
Zhou Shangchen
Zuo Wangmeng
Publication date: 10/04/2019

Nowadays stereo cameras are more commonly adopted in emerging devices such as dual-lens smartphones and unmanned aerial vehicles. However, they also suffer from blurry images in dynamic scenes which leads to visual discomfort and hampers further image processing. Previous works have succeeded in monocular deblurring, yet there are few studies on deblurring for stereoscopic images. By exploiting the two-view nature of stereo images, we propose a novel stereo image deblurring network with Depth Awareness and View Aggregation, named DAVANet. In our proposed network, 3D scene cues from the depth and varying information from two views are incorporated, which help to remove complex spatially-varying blur in dynamic scenes. Specifically, with our proposed fusion network, we integrate the bidirectional disparities estimation and deblurring into a unified framework. Moreover, we present a large-scale multi-scene dataset for stereo deblurring, containing 20,637 blurry-sharp stereo image pairs from 135 diverse sequences and their corresponding bidirectional disparities. The experimental results on our dataset demonstrate that DAVANet outperforms state-of-the-art methods in terms of accuracy, speed, and model size.

Comment: CVPR 2019 (Oral)

arXiv.org e-Print Archive

AIM 2020 Challenge on Real Image Super-Resolution: Methods and Results

Author: Ahn Keon-Hee
Almasri Feras
Arora Aditya
Baek JaeHyun
Cani Marie-Paule
Cao Liang
Cheng Kaihua
Choi Jun-Ho
Cong Xiaofeng
Debeir Olivier
Ding Errui
Fan Yanwen
Guan Wei
Gupta Rajat
Han Junyu
Han Yoseob
Hayat Munawar
Jia Yu
Jin Xin
Khan Fahad Shahbaz
Khan Salman
Kim Byung-Hoon
Kim Jun-Hyuk
Lee Jong-Seok
Li Baopu
Li Chu-Tak
Li Hao
Li Xiaobo
Li Xin
Lin Liang
Lin Yue
Liu Cen
Liu Jingtuo
Liu Sen
Liu Zhi-Song
Lu Hannan
Luo Zhipeng
Micheloni Christian
Pan Zhihong
Pang Yingxue
Peng Yunbo
Shen Yi
Shi Yukai
Siu Wan-Chi
Timofte Radu
Umer Rao Muhammad
Vandamme Thomas
Wang Li-Wen
Wei Pengxu
Wu Chenhuan
Wu Haoning
Wu Yaojun
Xi Teng
Xie Tangxin
Xu Dejia
Xu Zhenyu
Yang Xiaojun
Yang Zhijing
Yao Yuehan
Ye Chen
Zamir Syed Waqas
Zhang Gang
Zhang Jialiang
Zhao Shanshan
Zhao Tongtong
Zhong Haoyu
Zhou Bo
Zhou Yuanbo
Zou Xueyi
Zou Yan
Zuo Wangmeng
Publication date: 25/09/2020

This paper introduces the real image Super-Resolution (SR) challenge that was part of the Advances in Image Manipulation (AIM) workshop, held in conjunction with ECCV 2020. This challenge involves three tracks to super-resolve an input image for $\times 2$, $\times 3$ and $\times 4$ scaling factors, respectively. The goal is to attract more attention to realistic image degradation for the SR task, which is much more complicated and challenging, and contributes to real-world image super-resolution applications. 452 participants were registered for three tracks in total, and 24 teams submitted their results. They gauge the state-of-the-art approaches for real image SR in terms of PSNR and SSIM.

arXiv.org e-Print Archive

UG$^{2+}$ Track 2: A Collective Benchmark Effort for Evaluating and Advancing Image Understanding in Poor Visibility Environments

Author: Liu Jiaying
Ren Wenqi
Scheirer Walter J.
Wang Zhangyang
Yang Wenhan
Yuan Ye
Publication date: 31/03/2020

The UG$^{2+}$ challenge in IEEE CVPR 2019 aims to evoke a comprehensive discussion and exploration about how low-level vision techniques can benefit high-level automatic visual recognition in various scenarios. In its second track, we focus on object or face detection in poor visibility environments caused by bad weather (haze, rain) and low-light conditions. While existing enhancement methods are empirically expected to help the high-level end task, that is observed to not always be the case in practice. To provide a more thorough examination and fair comparison, we introduce three benchmark sets collected in real-world hazy, rainy, and low-light conditions, respectively, with objects/faces annotated. To the best of our knowledge, this is the first and currently largest effort of its kind. Baseline results by cascading existing enhancement and detection models are reported, indicating the highly challenging nature of our new data as well as the large room for further technical innovations. We expect a large participation from the broad research community to address these challenges together.

Comment: A summary paper on datasets, fact sheets, baseline results, challenge results, and winning methods in UG$^{2+}$ Challenge (Track 2). More materials are provided in http://www.ug2challenge.org/index.htm

arXiv.org e-Print Archive

Deep Learning-Based Video Coding: A Review and A Case Study

Author: Li Houqiang
Li Yue
Lin Jianping
Liu Dong
Wu Feng
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 29/04/2019

The past decade has witnessed great success of deep learning technology in many disciplines, especially in computer vision and image processing. However, deep learning-based video coding remains in its infancy. This paper reviews the representative works on using deep learning for image/video coding, which has been an actively developing research area since 2015. We divide the related works into two categories: new coding schemes that are built primarily upon deep networks (deep schemes), and deep network-based coding tools (deep tools) that are used within traditional coding schemes or together with traditional coding tools. For deep schemes, pixel probability modeling and auto-encoder are the two approaches, which can be viewed as a predictive coding scheme and a transform coding scheme, respectively. For deep tools, there have been several proposed techniques using deep learning to perform intra-picture prediction, inter-picture prediction, cross-channel prediction, probability distribution prediction, transform, post- or in-loop filtering, down- and up-sampling, as well as encoding optimizations. In the hope of advocating the research of deep learning-based video coding, we present a case study of our developed prototype video codec, namely Deep Learning Video Coding (DLVC). DLVC features two deep tools that are both based on convolutional neural network (CNN), namely CNN-based in-loop filter (CNN-ILF) and CNN-based block adaptive resolution coding (CNN-BARC). Both tools help improve the compression efficiency by a significant margin. With the two deep tools as well as other non-deep coding tools, DLVC is able to achieve on average 39.6% and 33.0% bit savings over HEVC, under random-access and low-delay configurations, respectively. The source code of DLVC has been released for future research.

arXiv.org e-Print Archive

When Autonomous Systems Meet Accuracy and Transferability through AI: A Survey

Author: Kurths Jürgen
Qian Feng
Sun Qiyu
Tang Yang
Wang Jianrui
Yen Gary G.
Zhang Chongzhen
Zhao Chaoqiang
Publication date: 24/05/2020

With widespread applications of artificial intelligence (AI), the capabilities of perception, understanding, decision-making and control for autonomous systems have improved significantly in the past years. When autonomous systems consider the performance of accuracy and transferability, several AI methods, like adversarial learning, reinforcement learning (RL) and meta-learning, show their powerful performance. Here, we review the learning-based approaches in autonomous systems from the perspectives of accuracy and transferability. Accuracy means that a well-trained model shows good results during the testing phase, in which the testing set shares the same task or data distribution with the training set. Transferability means that when a well-trained model is transferred to other testing domains, the accuracy is still good. Firstly, we introduce some basic concepts of transfer learning and then present some preliminaries of adversarial learning, RL and meta-learning. Secondly, we focus on reviewing the accuracy or transferability or both of them to show the advantages of adversarial learning, like generative adversarial networks (GANs), in typical computer vision tasks in autonomous systems, including image style transfer, image super-resolution, image deblurring/dehazing/rain removal, semantic segmentation, depth estimation, pedestrian detection and person re-identification (re-ID). Then, we further review the performance of RL and meta-learning from the aspects of accuracy or transferability or both of them in autonomous systems, involving pedestrian tracking, robot navigation and robotic manipulation. Finally, we discuss several challenges and future topics for using adversarial learning, RL and meta-learning in autonomous systems.

arXiv.org e-Print Archive

Towards Real Scene Super-Resolution with Raw Images

Author: Ma Yongrui
Sun Wenxiu
Xu Xiangyu
Publication date: 28/05/2019

Most existing super-resolution methods do not perform well in real scenarios due to a lack of realistic training data and information loss in the model input. To solve the first problem, we propose a new pipeline to generate realistic training data by simulating the imaging process of digital cameras. To remedy the information loss of the input, we develop a dual convolutional neural network to exploit the originally captured radiance information in raw images. In addition, we propose to learn a spatially-variant color transformation which helps achieve more effective color correction. Extensive experiments demonstrate that super-resolution with raw data helps recover fine details and clear structures, and more importantly, the proposed network and data generation pipeline achieve superior results for single image super-resolution in real scenarios.

Comment: Accepted in CVPR 2019, project page: https://sites.google.com/view/xiangyuxu/rawsr_cvpr1

arXiv.org e-Print Archive

cvpaper.challenge in 2016: Futuristic Computer Vision through 1,600 Papers Survey

Author: Abe Kaori
Fuchida Masataka
He Yun
Kanehara Yoshihiro
Kanezaki Asako
Kataoka Hirokatsu
Maruyama Shinya
Matsuzaki Yuta
Miyashita Yudai
Morita Shin'ichiro
Okayasu Kazushige
Shirakabe Soma
Suzuki Teppei
Takasawa Ryosuke
Ueta Shunya
Yabe Toshiyuki
Yatsuyanagi Hiroya
Publication date: 20/07/2017

The paper presents futuristic challenges discussed in the cvpaper.challenge. In 2015 and 2016, we thoroughly studied 1,600+ papers in several conferences/journals such as CVPR/ICCV/ECCV/NIPS/PAMI/IJCV.

arXiv.org e-Print Archive

Crafting a Toolchain for Image Restoration by Deep Reinforcement Learning

Author: Dong Chao
Lin Liang
Loy Chen Change
Yu Ke
Publication date: 09/04/2018

We investigate a novel approach for image restoration by reinforcement learning. Unlike existing studies that mostly train a single large network for a specialized task, we prepare a toolbox consisting of small-scale convolutional networks of different complexities and specialized in different tasks. Our method, RL-Restore, then learns a policy to select appropriate tools from the toolbox to progressively restore the quality of a corrupted image. We formulate a step-wise reward function proportional to how well the image is restored at each step to learn the action policy. We also devise a joint learning scheme to train the agent and tools for better performance in handling uncertainty. In comparison to conventional human-designed networks, RL-Restore is capable of restoring images corrupted with complex and unknown distortions in a more parameter-efficient manner using the dynamically formed toolchain.

Comment: To appear at CVPR 2018 (Spotlight). Project page: http://mmlab.ie.cuhk.edu.hk/projects/RL-Restore

arXiv.org e-Print Archive
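
The step-wise reward mentioned in the abstract above can be sketched as follows. The abstract only states that the reward is proportional to how well the image is restored at each step; measuring that as the PSNR gain between consecutive restoration states is an assumption made for this example, and the function names are hypothetical.

```python
import math

def psnr(ref, est, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

def stepwise_rewards(reference, states):
    """Reward for each step is the PSNR gain over the previous state (assumed form).

    states[0] is the corrupted input; states[t] is the image after t tools.
    """
    scores = [psnr(reference, s) for s in states]
    return [after - before for before, after in zip(scores, scores[1:])]

# Toy 4-pixel image whose error is halved by each of two applied tools.
reference = [100, 100, 100, 100]
states = [
    [80, 120, 80, 120],   # corrupted input
    [90, 110, 90, 110],   # after the first tool
    [95, 105, 95, 105],   # after the second tool
]
rewards = stepwise_rewards(reference, states)
```

A tool that degrades the image would receive a negative reward under this form, which is what lets a policy learn to stop or pick a different tool.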