NTIRE 2020 Challenge on Real Image Denoising: Dataset, Methods and Results
This paper reviews the NTIRE 2020 challenge on real image denoising with
focus on the newly introduced dataset, the proposed methods and their results.
The challenge is a new version of the previous NTIRE 2019 challenge on real
image denoising that was based on the SIDD benchmark. This challenge is based
on newly collected validation and testing image datasets, and is hence named
SIDD+. It has two tracks for quantitatively evaluating image
denoising performance in (1) the Bayer-pattern rawRGB and (2) the standard RGB
(sRGB) color spaces. Each track had ~250 registered participants. A total of 22
teams, proposing 24 methods, competed in the final phase of the challenge. The
proposed methods by the participating teams represent the current
state-of-the-art performance in image denoising targeting real noisy images.
The newly collected SIDD+ datasets are publicly available at:
https://bit.ly/siddplus_data
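Denoisers on the rawRGB track typically operate on the Bayer mosaic directly; a common preprocessing step packs the single-channel mosaic into four half-resolution color planes. A minimal sketch, assuming an RGGB phase (the actual SIDD+ phase may differ):

```python
import numpy as np

def pack_bayer(raw):
    """Pack an RGGB Bayer mosaic (H, W) into four half-resolution
    planes (4, H/2, W/2): R, G1, G2, B. Assumes an RGGB phase and
    even H and W; these are illustrative assumptions, not the
    challenge's documented layout."""
    return np.stack([raw[0::2, 0::2],   # R  (even rows, even cols)
                     raw[0::2, 1::2],   # G1 (even rows, odd cols)
                     raw[1::2, 0::2],   # G2 (odd rows, even cols)
                     raw[1::2, 1::2]])  # B  (odd rows, odd cols)

mosaic = np.arange(16, dtype=np.float32).reshape(4, 4)
planes = pack_bayer(mosaic)
print(planes.shape)  # (4, 2, 2)
```

Denoising in this packed layout keeps same-color samples in the same plane, so the network never mixes color channels across the mosaic.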
NTIRE 2020 Challenge on Spectral Reconstruction from an RGB Image
This paper reviews the second challenge on spectral reconstruction from RGB
images, i.e., the recovery of whole-scene hyperspectral (HS) information from a
3-channel RGB image. As in the previous challenge, two tracks were provided:
(i) a "Clean" track where HS images are estimated from noise-free RGBs, the RGB
images are themselves calculated numerically using the ground-truth HS images
and supplied spectral sensitivity functions (ii) a "Real World" track,
simulating capture by an uncalibrated and unknown camera, where the HS images
are recovered from noisy JPEG-compressed RGB images. A new, larger-than-ever,
natural hyperspectral image data set is presented, containing a total of 510 HS
images. The Clean and Real World tracks had 103 and 78 registered participants
respectively, with 14 teams competing in the final testing phase. A description
of the proposed methods, alongside their challenge scores and an extensive
evaluation of the top-performing methods is also provided. Together, these
results gauge the state-of-the-art in spectral reconstruction from an RGB
image.
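In the Clean track, the RGB inputs are generated numerically from the ground-truth hyperspectral cube and known sensitivity curves. A minimal sketch of that projection, with hypothetical shapes rather than the challenge's exact pipeline:

```python
import numpy as np

def hs_to_rgb(hs_cube, ssf):
    """Project a hyperspectral cube of B bands, shape (B, H, W), onto
    three channels using spectral sensitivity functions ssf of shape
    (3, B). Each output channel is a weighted sum over spectral bands."""
    return np.tensordot(ssf, hs_cube, axes=([1], [0]))  # -> (3, H, W)

rng = np.random.default_rng(0)
cube = rng.random((31, 8, 8))  # 31-band cube (illustrative band count)
ssf = rng.random((3, 31))      # hypothetical camera sensitivities
rgb = hs_to_rgb(cube, ssf)
print(rgb.shape)  # (3, 8, 8)
```

Spectral reconstruction methods learn the (ill-posed) inverse of this many-to-three projection.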
Triple Attention Mixed Link Network for Single Image Super Resolution
Single image super resolution is of great importance as a low-level computer
vision task. Recent approaches with deep convolutional neural networks have
achieved impressive performance. However, existing architectures are limited
by their relatively simple structure and weak representational power. In this
work, to significantly enhance the feature representation, we propose the
Triple Attention mixed link Network (TAN), which
consists of 1) three different aspects (i.e., kernel, spatial and channel) of
attention mechanisms and 2) a fusion of powerful residual and dense
connections (i.e., mixed link). Specifically, the multi-kernel network learns
hierarchical representations under different receptive fields. The output
features are recalibrated by the kernel and channel attentions and fed into
the next layer in a partly residual, partly dense manner, which filters the
information and enables the network to learn more powerful representations.
The features finally pass through the spatial attention in the reconstruction
network, which fuses local and global information, letting the network restore
more details and improving the quality of the reconstructed images.
Thanks to the diverse feature recalibrations and the advanced
information-flow topology, our proposed model performs strongly against
state-of-the-art methods on benchmark evaluations.
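One of the recalibrations named above, channel attention, can be sketched in a squeeze-and-excitation style; the weights, reduction ratio, and gating are illustrative assumptions, not TAN's exact formulation:

```python
import numpy as np

def channel_attention(features, w1, w2):
    """Squeeze-and-excitation-style channel attention (sketch).
    features: (C, H, W) feature map; w1: (C//r, C); w2: (C, C//r),
    where r is a hypothetical reduction ratio."""
    # Squeeze: global average pooling over spatial dims -> (C,)
    squeezed = features.mean(axis=(1, 2))
    # Excitation: bottleneck MLP, ReLU then sigmoid gating -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))
    # Recalibrate: scale each channel by its learned gate
    return features * gates[:, None, None]

rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 4, 4))
w1 = rng.standard_normal((2, 8))   # reduction ratio r = 4 here
w2 = rng.standard_normal((8, 2))
out = channel_attention(feats, w1, w2)
print(out.shape)  # (8, 4, 4)
```

Kernel and spatial attention follow the same recalibrate-by-learned-gates pattern, applied over kernel branches and spatial positions instead of channels.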
Efficient Deep Neural Network for Photo-realistic Image Super-Resolution
Recent progress in the deep learning-based models has improved
photo-realistic (or perceptual) single-image super-resolution significantly.
However, despite their powerful performance, many methods are difficult to
apply to real-world applications because of the heavy computational
requirements. To facilitate the use of a deep model under such demands, we
focus on keeping the network efficient while maintaining its performance. In
detail, we design an architecture that implements a cascading mechanism on a
residual network to boost the performance with limited resources via
multi-level feature fusion. In addition, our proposed model adopts group
convolution and recursive scheme in order to achieve extreme efficiency. We
further improve the perceptual quality of the output by employing the
adversarial learning paradigm and a multi-scale discriminator approach. The
performance of our method is investigated through extensive internal
experiments and benchmarks on various datasets. Our results show that our
models outperform recent methods of similar complexity on both traditional
pixel-based and perception-based tasks.
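The group convolution mentioned above cuts parameters roughly by the group count, which is where much of the efficiency comes from. A quick parameter count with illustrative sizes (not the paper's actual configuration):

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution. Each of the `groups` groups
    maps c_in/groups input channels to c_out/groups output channels."""
    assert c_in % groups == 0 and c_out % groups == 0
    return groups * (c_out // groups) * (c_in // groups) * k * k

standard = conv_params(64, 64, 3)            # one dense 3x3 conv
grouped = conv_params(64, 64, 3, groups=4)   # same shape, 4 groups
print(standard, grouped)  # 36864 9216
```

With 4 groups, the layer uses a quarter of the weights (and multiply-adds) of the dense convolution, at the cost of no cross-group channel mixing.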
MDCN: Multi-scale Dense Cross Network for Image Super-Resolution
Convolutional neural networks have been proven to be of great benefit for
single-image super-resolution (SISR). However, previous works do not make full
use of multi-scale features and ignore the inter-scale correlation between
different upsampling factors, resulting in sub-optimal performance. Instead of
blindly increasing the depth of the network, we are committed to mining image
features and learning the inter-scale correlation between different upsampling
factors. To achieve this, we propose a Multi-scale Dense Cross Network (MDCN),
which achieves great performance with fewer parameters and less execution time.
MDCN consists of multi-scale dense cross blocks (MDCBs), hierarchical feature
distillation block (HFDB), and dynamic reconstruction block (DRB). Among them,
MDCB aims to detect multi-scale features and maximize the use of image
feature flow at different scales, HFDB adaptively recalibrates channel-wise
feature responses to achieve feature distillation, and DRB attempts to
reconstruct SR images with different upsampling factors in a single model. It
is worth noting that all of these modules can run independently, which means
they can be selectively plugged into any CNN model to improve its
performance. Extensive experiments show that MDCN achieves competitive results
in SISR, especially in the reconstruction task with multiple upsampling
factors. The code will be provided at https://github.com/MIVRC/MDCN-PyTorch.
Comment: 15 pages, 15 figures
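A reconstruction block that serves several upsampling factors in one model typically ends in a sub-pixel (pixel-shuffle) step whose factor can vary. A minimal numpy sketch of that step, with illustrative shapes rather than MDCN's actual implementation:

```python
import numpy as np

def pixel_shuffle(x, s):
    """Rearrange a (C*s*s, H, W) tensor into (C, H*s, W*s): each group
    of s*s channels is scattered into an s x s spatial neighborhood,
    the standard sub-pixel upsampling layout."""
    c2, h, w = x.shape
    c = c2 // (s * s)
    x = x.reshape(c, s, s, h, w)
    x = x.transpose(0, 3, 1, 4, 2)  # -> (c, h, s, w, s)
    return x.reshape(c, h * s, w * s)

x = np.arange(4.0).reshape(4, 1, 1)  # 4 channels, 1x1 spatial
up = pixel_shuffle(x, 2)
print(up)  # [[[0. 1.] [2. 3.]]]
```

Because the convolutional trunk is factor-agnostic, only this final rearrangement (and the channel count feeding it) depends on the requested upsampling factor.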
cvpaper.challenge in 2015 - A review of CVPR2015 and DeepSurvey
The "cvpaper.challenge" is a group composed of members from AIST, Tokyo Denki
Univ. (TDU), and Univ. of Tsukuba that aims to systematically summarize papers
on computer vision, pattern recognition, and related fields. For this
particular review, we focused on reading all 602 conference papers
presented at the CVPR2015, the premier annual computer vision event held in
June 2015, in order to grasp the trends in the field. Further, we are proposing
"DeepSurvey" as a mechanism embodying the entire process from the reading
through all the papers, the generation of ideas, and to the writing of paper.Comment: Survey Pape
Unsupervised Learning of Monocular Depth Estimation with Bundle Adjustment, Super-Resolution and Clip Loss
We present a novel unsupervised learning framework for single view depth
estimation using monocular videos. It is well known in 3D vision that enlarging
the baseline can increase the depth estimation accuracy, and jointly optimizing
a set of camera poses and landmarks is essential. In previous monocular
unsupervised learning frameworks, only part of the photometric and geometric
constraints within a sequence are used as supervisory signals. This may result
in a short baseline and overfitting. Besides, previous works generally estimate
a low resolution depth from a low resolution input image. The low resolution
depth is then interpolated to recover the original resolution. This strategy
may generate large errors on object boundaries, as the depths of the
background and foreground are mixed to yield the high resolution depth. In
this paper, we
introduce a bundle adjustment framework and a super-resolution network to solve
the above two problems. In bundle adjustment, depths and poses of an image
sequence are jointly optimized, which increases the baseline by establishing
the relationship between farther frames. The super resolution network learns to
estimate a high resolution depth from a low resolution image. Additionally, we
introduce the clip loss to deal with moving objects and occlusion. Experimental
results on the KITTI dataset show that the proposed algorithm outperforms the
state-of-the-art unsupervised methods using monocular sequences, and achieves
comparable or even better results than unsupervised methods using stereo
sequences.
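The clip loss mentioned above caps large photometric errors so that moving objects and occlusions, which no static-scene model can explain, do not dominate training. A minimal sketch; the percentile-based threshold is an assumption for illustration, not the paper's exact rule:

```python
import numpy as np

def clipped_photometric_loss(errors, q=90):
    """Average per-pixel photometric error, clipped at the q-th
    percentile. Pixels with errors above the threshold (likely moving
    objects or occlusions) contribute only the capped value."""
    threshold = np.percentile(errors, q)
    return np.minimum(errors, threshold).mean()

errors = np.arange(10.0)  # toy per-pixel errors
print(clipped_photometric_loss(errors))  # 4.41
```

Without the clip, the single largest error here (9.0) would pull the mean upward; the cap bounds the influence any one pixel can have on the gradient.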
AIM 2020 Challenge on Real Image Super-Resolution: Methods and Results
This paper introduces the real image Super-Resolution (SR) challenge that was
part of the Advances in Image Manipulation (AIM) workshop, held in conjunction
with ECCV 2020. This challenge involves three tracks to super-resolve an input
image at scaling factors of 2, 3 and 4, respectively. The
goal is to attract more attention to realistic image degradation for the SR
task, which is much more complicated and challenging, and contributes to
real-world image super-resolution applications. 452 participants were
registered for three tracks in total, and 24 teams submitted their results.
These results gauge the state-of-the-art approaches for real image SR in
terms of PSNR and SSIM.
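PSNR, one of the two metrics named above, is a simple function of the mean squared error against the ground truth. A reference computation, assuming 8-bit images with peak value 255:

```python
import numpy as np

def psnr(ref, est, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference image and
    an estimate: 10 * log10(peak^2 / MSE). Identical images give +inf."""
    mse = np.mean((ref.astype(np.float64) - est.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

ref = np.zeros((4, 4), dtype=np.uint8)
est = np.full((4, 4), 16, dtype=np.uint8)
print(round(psnr(ref, est), 2))  # 24.05
```

SSIM, by contrast, compares local luminance, contrast, and structure statistics, which is why challenges report both: PSNR rewards pixel fidelity, SSIM rewards structural fidelity.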
DeepFaceLab: A simple, flexible and extensible face swapping framework
DeepFaceLab is an open-source deepfake system created by iperov for face
swapping, with more than 3,000 forks and 13,000 stars on GitHub. It provides
an imperative, easy-to-use pipeline that people can use without a
comprehensive understanding of deep learning frameworks or any model
implementation, while remaining a flexible, loosely coupled structure for
those who need to strengthen their own pipeline with other features without
writing complicated boilerplate code. In this paper, we detail the principles
that drive the implementation of DeepFaceLab and introduce its pipeline, every
aspect of which users can modify painlessly for their own customization
purposes. Notably, DeepFaceLab can achieve high-fidelity results that are
indeed indiscernible by mainstream forgery detection approaches. We
demonstrate the advantages of our system by comparing our approach with
current prevailing systems. For more information, please visit:
https://github.com/iperov/DeepFaceLab/
Deep Learning Convolutional Networks for Multiphoton Microscopy Vasculature Segmentation
Recently there has been an increasing trend to use deep learning frameworks
for both 2D consumer images and for 3D medical images. However, there has been
little effort to use deep frameworks for volumetric vascular segmentation. We
wanted to address this by providing a freely available dataset of 12 annotated
two-photon vasculature microscopy stacks. We demonstrated the use of a deep
learning framework consisting of both 2D and 3D convolutional filters
(ConvNets). Our hybrid 2D-3D architecture produced promising segmentation
results. We derived the architectures from Lee et al., who used the ZNN
framework initially designed for electron microscope image segmentation. We
hope that by sharing our volumetric vasculature datasets, we will inspire
other researchers to experiment with vasculature datasets and improve the
network architectures used.
Comment: 23 pages, 10 figures