Difficulty-aware Image Super Resolution via Deep Adaptive Dual-Network
Recently, deep learning-based single image super-resolution (SR) approaches
have made great progress. State-of-the-art SR methods usually adopt a
feed-forward pipeline to establish a non-linear mapping between low-resolution
(LR) and high-resolution (HR) images. However, because they treat all image
regions equally without considering differences in difficulty, these
approaches hit an upper bound in optimization. To address this issue, we
propose a novel SR approach that processes each region of an image
discriminately according to its difficulty. Specifically, we propose a
dual-way SR network in which one branch is trained to focus on easy image
regions and the other is trained to handle hard ones. To identify whether a
region is easy or hard, we propose a novel image difficulty recognition
network based on a PSNR prior. Our SR approach, which uses the resulting
region mask to adaptively steer the dual-way SR network, yields superior
results. Extensive experiments on several standard benchmarks (e.g., Set5,
Set14, BSD100, and Urban100) show that our approach achieves state-of-the-art
performance.
Comment: ICME 2019 (Oral); code and results are available at:
https://github.com/xzwlx/Difficulty-S
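The abstract does not spell out the difficulty recognition network, but the PSNR-prior idea can be illustrated with a minimal numpy sketch: regions where a cheap upsampler already scores high PSNR against the ground truth are labeled "easy". The function names, patch size, and the 30 dB threshold below are illustrative assumptions, not the paper's actual settings.

```python
import numpy as np

def psnr(ref, est, peak=1.0):
    """Peak signal-to-noise ratio between two same-shaped arrays."""
    mse = np.mean((ref - est) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def difficulty_mask(hr, upsampled, patch=8, thresh=30.0):
    """Label each patch x patch region easy (True) or hard (False) by the
    PSNR of a cheap upsampling result against the HR ground truth."""
    h, w = hr.shape
    mask = np.zeros((h // patch, w // patch), dtype=bool)
    for i in range(h // patch):
        for j in range(w // patch):
            sl = np.s_[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch]
            mask[i, j] = psnr(hr[sl], upsampled[sl]) >= thresh
    return mask
```

At inference time such a mask could route easy regions to the lightweight branch and hard regions to the heavier one.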
NTIRE 2020 Challenge on Spectral Reconstruction from an RGB Image
This paper reviews the second challenge on spectral reconstruction from RGB
images, i.e., the recovery of whole-scene hyperspectral (HS) information from
a 3-channel RGB image. As in the previous challenge, two tracks were provided:
(i) a "Clean" track, where HS images are estimated from noise-free RGB images
that are themselves computed numerically from the ground-truth HS images and
supplied spectral sensitivity functions; and (ii) a "Real World" track,
simulating capture by an uncalibrated and unknown camera, where the HS images
are recovered from noisy, JPEG-compressed RGB images. A new, larger-than-ever
natural hyperspectral image data set is presented, containing a total of 510
HS images. The Clean and Real World tracks had 103 and 78 registered
participants, respectively, with 14 teams competing in the final testing
phase. A description of the proposed methods, alongside their challenge scores
and an extensive evaluation of the top-performing methods, is also provided;
together they gauge the state of the art in spectral reconstruction from an
RGB image.
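The Clean-track RGBs are computed numerically from the ground-truth HS cube and the supplied spectral sensitivity functions; that projection is just a per-pixel weighted sum over bands. The sketch below assumes an H×W×B cube and a B×3 sensitivity matrix (band count and normalization are illustrative, not the challenge's actual data layout).

```python
import numpy as np

def hs_to_rgb(hs_cube, ssf):
    """Project an HxWxB hyperspectral cube to an HxWx3 RGB image using
    per-channel spectral sensitivity functions (a Bx3 matrix)."""
    return np.einsum("hwb,bc->hwc", hs_cube, ssf)
```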
DAVANet: Stereo Deblurring with View Aggregation
Stereo cameras are increasingly adopted in emerging devices such as dual-lens
smartphones and unmanned aerial vehicles. However, they also suffer from
blurry images in dynamic scenes, which cause visual discomfort and hamper
further image processing. Previous works have succeeded in monocular
deblurring, yet there are few studies on deblurring stereoscopic images. By
exploiting the two-view nature of stereo images, we propose a novel stereo
image deblurring network with Depth Awareness and View Aggregation, named
DAVANet. Our network incorporates 3D scene cues from depth and varying
information from the two views, which help remove the complex
spatially-varying blur in dynamic scenes. Specifically, with our proposed
fusion network, we integrate bidirectional disparity estimation and deblurring
into a unified framework. Moreover, we present a large-scale multi-scene
dataset for stereo deblurring, containing 20,637 blurry-sharp stereo image
pairs from 135 diverse sequences together with their bidirectional
disparities. The experimental results on our dataset demonstrate that DAVANet
outperforms state-of-the-art methods in terms of accuracy, speed, and model
size.
Comment: CVPR 2019 (Oral)
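View aggregation relies on aligning the two views with the estimated disparities. A minimal sketch of that alignment step, assuming a rectified pair where a pixel at column x in one view corresponds to column x - d in the other (nearest-neighbour sampling; the paper's actual warping module is not specified in the abstract):

```python
import numpy as np

def warp_with_disparity(src, disparity):
    """Warp one rectified view toward the other using a per-pixel
    horizontal disparity map, with nearest-neighbour sampling and
    border clamping."""
    h, w = src.shape[:2]
    xs = np.arange(w)[None, :] - np.round(disparity).astype(int)
    xs = np.clip(xs, 0, w - 1)
    rows = np.arange(h)[:, None]
    return src[rows, xs]
```

In a learned pipeline the warped view would then be fused with the target view's features rather than used directly.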
AIM 2020 Challenge on Real Image Super-Resolution: Methods and Results
This paper introduces the real image Super-Resolution (SR) challenge that was
part of the Advances in Image Manipulation (AIM) workshop, held in conjunction
with ECCV 2020. The challenge comprises three tracks, super-resolving an input
image by ×2, ×3, and ×4 scaling factors, respectively. Its goal is to draw
more attention to realistic image degradation for the SR task, which is far
more complicated and challenging, and to contribute to real-world image
super-resolution applications. A total of 452 participants registered across
the three tracks, and 24 teams submitted their results; these gauge the
state-of-the-art approaches for real image SR in terms of PSNR and SSIM.
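The challenge ranks submissions by PSNR and SSIM. As an illustration of the latter, here is a single-window ("global") SSIM sketch using the standard stabilizing constants; note that the usual evaluation protocol applies SSIM over local Gaussian windows and averages, so this simplified variant is for intuition only.

```python
import numpy as np

def ssim_global(x, y, peak=255.0):
    """Global (single-window) SSIM between two images, with the
    standard constants c1 = (0.01*peak)^2 and c2 = (0.03*peak)^2."""
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```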
UG2+ Track 2: A Collective Benchmark Effort for Evaluating and Advancing Image Understanding in Poor Visibility Environments
The UG2+ challenge in IEEE CVPR 2019 aims to evoke a comprehensive discussion
and exploration of how low-level vision techniques can benefit high-level
automatic visual recognition in various scenarios. In its second track, we
focus on object and face detection in poor-visibility environments caused by
bad weather (haze, rain) and low-light conditions. While existing enhancement
methods are empirically expected to help the high-level end task, this is
observed not always to be the case in practice. To provide a more thorough
examination and fair comparison, we introduce three benchmark sets collected
in real-world hazy, rainy, and low-light conditions, respectively, with
objects/faces annotated. To the best of our knowledge, this is the first and
currently largest effort of its kind. Baseline results obtained by cascading
existing enhancement and detection models are reported, indicating the highly
challenging nature of our new data as well as the large room for further
technical innovation. We expect broad participation from the research
community to address these challenges together.
Comment: A summary paper on datasets, fact sheets, baseline results, challenge
results, and winning methods in the UG2+ Challenge (Track 2). More materials
are provided at http://www.ug2challenge.org/index.htm
Deep Learning-Based Video Coding: A Review and A Case Study
The past decade has witnessed great success of deep learning technology in
many disciplines, especially in computer vision and image processing. However,
deep learning-based video coding remains in its infancy. This paper reviews
representative works on using deep learning for image/video coding, which has
been an actively developing research area since 2015. We divide the related
works into two categories: new coding schemes built primarily upon deep
networks (deep schemes), and deep network-based coding tools (deep tools) that
are used within traditional coding schemes or together with traditional coding
tools. For deep schemes, pixel probability modeling and auto-encoders are the
two main approaches, which can be viewed as predictive coding and transform
coding schemes, respectively. For deep tools, several techniques have been
proposed that use deep learning to perform intra-picture prediction,
inter-picture prediction, cross-channel prediction, probability distribution
prediction, transforms, post- or in-loop filtering, down- and up-sampling, as
well as encoding optimizations. To advocate research on deep learning-based
video coding, we present a case study of our prototype video codec, Deep
Learning Video Coding (DLVC). DLVC features two deep tools, both based on
convolutional neural networks (CNNs): a CNN-based in-loop filter (CNN-ILF) and
CNN-based block adaptive resolution coding (CNN-BARC). Both tools improve
compression efficiency by a significant margin. With the two deep tools as
well as other non-deep coding tools, DLVC achieves on average 39.6% and 33.0%
bit savings over HEVC under random-access and low-delay configurations,
respectively. The source code of DLVC has been released for future research.
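Block-adaptive tools such as CNN-BARC decide per block whether to code at full resolution or to down-sample before coding; the standard mechanism for such per-block decisions in HEVC-style encoders is rate-distortion cost comparison. A minimal sketch (the candidate structure and lambda values are illustrative, not DLVC's actual encoder interface):

```python
def rd_choose(candidates, lam):
    """HEVC-style mode decision: pick the candidate with the minimal
    rate-distortion cost J = D + lambda * R."""
    return min(candidates, key=lambda m: m["distortion"] + lam * m["rate"])
```

A larger lambda weights rate more heavily, so the cheaper down-sampled mode wins; a smaller lambda favors the lower-distortion full-resolution mode.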
When Autonomous Systems Meet Accuracy and Transferability through AI: A Survey
With the widespread application of artificial intelligence (AI), the
perception, understanding, decision-making, and control capabilities of
autonomous systems have improved significantly in recent years. When both
accuracy and transferability are considered, several AI methods, such as
adversarial learning, reinforcement learning (RL), and meta-learning, show
strong performance. Here, we review learning-based approaches in autonomous
systems from the perspectives of accuracy and transferability. Accuracy means
that a well-trained model performs well during the testing phase, in which the
testing set shares the same task or data distribution with the training set.
Transferability means that when a well-trained model is transferred to other
testing domains, its accuracy remains good. First, we introduce basic concepts
of transfer learning and then present preliminaries of adversarial learning,
RL, and meta-learning. Second, we review accuracy, transferability, or both to
show the advantages of adversarial learning, such as generative adversarial
networks (GANs), in typical computer vision tasks in autonomous systems,
including image style transfer, image super-resolution, image
deblurring/dehazing/rain removal, semantic segmentation, depth estimation,
pedestrian detection, and person re-identification (re-ID). We then review the
performance of RL and meta-learning, again in terms of accuracy,
transferability, or both, in autonomous systems, covering pedestrian tracking,
robot navigation, and robotic manipulation. Finally, we discuss several
challenges and future topics for the use of adversarial learning, RL, and
meta-learning in autonomous systems.
Towards Real Scene Super-Resolution with Raw Images
Most existing super-resolution methods do not perform well in real scenarios
due to the lack of realistic training data and the information loss in the
model input. To solve the first problem, we propose a new pipeline to generate
realistic training data by simulating the imaging process of digital cameras.
To remedy the information loss in the input, we develop a dual convolutional
neural network that exploits the radiance information originally captured in
raw images. In addition, we propose to learn a spatially-variant color
transformation that enables more effective color correction. Extensive
experiments demonstrate that super-resolution with raw data helps recover fine
details and clear structures and, more importantly, that the proposed network
and data generation pipeline achieve superior results for single image
super-resolution in real scenarios.
Comment: Accepted to CVPR 2019; project page:
https://sites.google.com/view/xiangyuxu/rawsr_cvpr1
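A spatially-variant color transformation can be thought of as predicting a separate color matrix per pixel instead of one global matrix. A minimal numpy sketch of applying such a transformation, assuming an H×W×3 image and an H×W×3×3 matrix field (how the paper predicts the matrices is not described in the abstract):

```python
import numpy as np

def apply_spatially_variant_color(img, mats):
    """Apply a per-pixel 3x3 colour matrix: img is HxWx3, mats is
    HxWx3x3, and out[y, x] = mats[y, x] @ img[y, x]."""
    return np.einsum("hwij,hwj->hwi", mats, img)
```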
Wavelet-Based Dual-Branch Network for Image Demoireing
When smartphone cameras are used to photograph digital screens, moiré patterns
usually result, severely degrading photo quality. In this paper, we design a
wavelet-based dual-branch network (WDNet) with a spatial attention mechanism
for image demoiréing. Existing image restoration methods working in the RGB
domain have difficulty distinguishing moiré patterns from true scene texture.
Unlike these methods, our network removes moiré patterns in the wavelet
domain, separating the frequencies of moiré patterns from the image content.
The network combines dense convolution modules with dilated convolution
modules that support large receptive fields. Extensive experiments demonstrate
the effectiveness of our method, and we further show that WDNet generalizes to
removing moiré artifacts from non-screen images. Although designed for image
demoiréing, WDNet has also been applied to two other low-level vision tasks,
outperforming state-of-the-art image deraining and raindrop-removal methods on
the Rain100H and Raindrop800 data sets, respectively.
Comment: Accepted to ECCV 202
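The wavelet-domain idea is that a decomposition splits an image into a low-frequency approximation and high-frequency detail sub-bands, where screen moiré energy tends to concentrate. A one-level averaging Haar split sketch for even-sized grayscale images (WDNet's actual transform and normalization are not given in the abstract):

```python
import numpy as np

def haar2d(img):
    """One-level 2-D Haar-style decomposition of an even-sized image
    into LL (low-frequency content) and LH/HL/HH detail sub-bands,
    using an averaging (unnormalised) variant."""
    a = img[0::2, 0::2]  # top-left of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 4.0
    lh = (a + b - c - d) / 4.0
    hl = (a - b + c - d) / 4.0
    hh = (a - b - c + d) / 4.0
    return ll, lh, hl, hh
```

A denoiser operating on the LH/HL/HH bands can suppress moiré frequencies while leaving the LL content largely untouched.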
cvpaper.challenge in 2016: Futuristic Computer Vision through 1,600 Papers Survey
This paper presents futuristic challenges discussed in cvpaper.challenge. In
2015 and 2016, we thoroughly studied 1,600+ papers from several conferences
and journals, including CVPR/ICCV/ECCV/NIPS/PAMI/IJCV.