On the Relation between Color Image Denoising and Classification
A large portion of the image denoising literature focuses on single-channel images and often validates the proposed methods experimentally on at most tens of images. In this paper, we investigate the interaction between denoising and classification at large scale. Inspired by classification models, we propose a novel deep learning architecture for color (multichannel) image denoising and report results on thousands of images from the ImageNet dataset as well as commonly used imagery. We study the importance of (sufficient) training data and how semantic class information can be traded for improved denoising results. As a result, our method improves PSNR by 0.34-0.51 dB on average over state-of-the-art methods on a large-scale dataset. We conclude that it is beneficial to incorporate denoising in classification models. Conversely, we also study how noise affects classification performance. In the end, we come to a number of interesting conclusions, some of them counter-intuitive.
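The gains above are reported in PSNR, which is a log-scale function of the mean squared error between a reference image and its estimate. For reference, a minimal NumPy implementation (assuming 8-bit images with peak value 255):

```python
import numpy as np

def psnr(reference: np.ndarray, estimate: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; higher is better."""
    mse = np.mean((reference.astype(np.float64) - estimate.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```

An average gain of 0.34-0.51 dB corresponds to reducing the MSE by roughly 7-11%.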
Bridging the Gap Between Computational Photography and Visual Recognition
What is the current state-of-the-art for image restoration and enhancement
applied to degraded images acquired under less-than-ideal circumstances? Can the application of such algorithms as a pre-processing step improve image interpretability for manual analysis or automatic visual recognition to classify scene content? While there have been important advances in the area of
computational photography to restore or enhance the visual quality of an image,
the capabilities of such techniques have not always translated in a useful way
to visual recognition tasks. Consequently, there is a pressing need for the
development of algorithms that are designed for the joint problem of improving
visual appearance and recognition, which will be an enabling factor for the
deployment of visual recognition tools in many real-world scenarios. To address
this, we introduce the UG^2 dataset as a large-scale benchmark composed of
video imagery captured under challenging conditions, and two enhancement tasks
designed to test algorithmic impact on visual quality and automatic object
recognition. Furthermore, we propose a set of metrics to evaluate the joint
improvement of such tasks as well as individual algorithmic advances, including
a novel psychophysics-based evaluation regime for human assessment and a
realistic set of quantitative measures for object recognition performance. We
introduce six new algorithms for image restoration or enhancement, which were
created as part of the IARPA sponsored UG^2 Challenge workshop held at CVPR
2018. Under the proposed evaluation regime, we present an in-depth analysis of
these algorithms and a host of deep learning-based and classic baseline
approaches. From the observed results, it is evident that we are in the early
days of building a bridge between computational photography and visual
recognition, leaving many opportunities for innovation in this area.
Comment: CVPR Prize Challenge: http://www.ug2challenge.org
200x Low-dose PET Reconstruction using Deep Learning
Positron emission tomography (PET) is widely used in various clinical applications, including the diagnosis of cancer, heart disease, and neurological disorders. The use of radioactive tracers in PET imaging raises concerns due to the risk of radiation exposure. To minimize this potential risk, efforts have been made to reduce the amount of radio-tracer used. However, lowering the dose results in a low signal-to-noise ratio (SNR) and loss of information, both of which heavily affect clinical diagnosis. Moreover, the ill-conditioning of low-dose PET image reconstruction makes it a difficult problem for iterative reconstruction algorithms. Previously proposed methods are typically complicated and slow, yet still cannot yield satisfactory results at significantly low doses. Here, we propose a deep learning method to address this issue: an encoder-decoder residual deep network with concatenated skip connections. Experiments show that the proposed method can reconstruct low-dose PET images at standard-dose quality using only one two-hundredth of the dose. Different cost functions for training the model are explored. A multi-slice input strategy is introduced to provide the network with more structural information and make it more robust to noise. Evaluation on ultra-low-dose clinical data shows that the proposed method achieves better results than state-of-the-art methods and reconstructs images of comparable quality using only 0.5% of the regular dose.
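As a rough illustration of the architecture described above, here is a minimal PyTorch sketch of an encoder-decoder network with concatenated skip connections, a multi-slice input, and a global residual path. The depth, channel widths, and three-slice input below are illustrative assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.BatchNorm2d(cout), nn.ReLU(inplace=True),
        nn.Conv2d(cout, cout, 3, padding=1), nn.BatchNorm2d(cout), nn.ReLU(inplace=True),
    )

class LowDosePETNet(nn.Module):
    """Encoder-decoder with concatenated skip connections and a global
    residual path: takes a stack of adjacent low-dose slices and predicts
    the center slice at standard-dose quality. Assumes H, W divisible by 4."""
    def __init__(self, in_slices=3, width=64):
        super().__init__()
        self.enc1 = conv_block(in_slices, width)
        self.enc2 = conv_block(width, width * 2)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(width * 2, width * 4)
        self.up2 = nn.ConvTranspose2d(width * 4, width * 2, 2, stride=2)
        self.dec2 = conv_block(width * 4, width * 2)  # concatenation doubles channels
        self.up1 = nn.ConvTranspose2d(width * 2, width, 2, stride=2)
        self.dec1 = conv_block(width * 2, width)
        self.out = nn.Conv2d(width, 1, 1)

    def forward(self, x):
        c = x.shape[1] // 2
        center = x[:, c : c + 1]                      # low-dose center slice
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return center + self.out(d1)                  # residual learning
```

The global residual path means the network only learns the correction from the low-dose center slice to its standard-dose counterpart, which is typically easier to optimize than predicting the image from scratch.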
CFSNet: Toward a Controllable Feature Space for Image Restoration
Deep learning methods have achieved great progress in image restoration, as measured by specific metrics (e.g., PSNR, SSIM). However, the perceptual quality of the restored image is relatively subjective, and users need to control the reconstruction result according to personal preferences or image characteristics, which cannot be done with existing deterministic networks. This motivates us to design a unified interactive framework for general image restoration tasks. Under this framework, users can control a continuous transition between different objectives, e.g., the perception-distortion trade-off in image super-resolution, or the trade-off between noise reduction and detail preservation. We achieve this goal by controlling the latent features of the designed network. Specifically, our proposed framework, named Controllable Feature Space Network (CFSNet), couples two branches based on different objectives. The framework adaptively learns the coupling coefficients of different layers and channels, which provides finer control over the restored image quality. Experiments on several typical image restoration tasks fully validate the benefits of the proposed method. Code is
available at https://github.com/qibao77/CFSNet.
Comment: Accepted by ICCV 2019
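A minimal sketch of the control mechanism as the abstract describes it: features from two branches trained toward different objectives are blended with per-channel coupling coefficients learned as a function of a user-set control scalar. The module below is illustrative only; the exact coupling design is in the paper and the code linked above:

```python
import torch
import torch.nn as nn

class CouplingModule(nn.Module):
    """Blends the features of two branches with learned per-channel
    coefficients conditioned on a user-set control scalar in [0, 1]."""
    def __init__(self, channels: int):
        super().__init__()
        # tiny condition network: control scalar -> per-channel weights
        self.to_alpha = nn.Sequential(
            nn.Linear(1, channels), nn.ReLU(inplace=True),
            nn.Linear(channels, channels), nn.Sigmoid(),
        )

    def forward(self, f_main, f_tune, control):
        # control: (B, 1) tensor, e.g. 0.0 = pure distortion objective,
        # 1.0 = pure perceptual objective; intermediate values interpolate
        alpha = self.to_alpha(control).unsqueeze(-1).unsqueeze(-1)  # (B, C, 1, 1)
        return alpha * f_tune + (1.0 - alpha) * f_main
```

Because alpha is produced per layer and per channel rather than applied globally, the trade-off can be tuned more finely than by simply interpolating the two branch outputs.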
Image Super-Resolution Using Deep Convolutional Networks
We propose a deep learning method for single image super-resolution (SR). Our method directly learns an end-to-end mapping between low- and high-resolution images. The mapping is represented as a deep convolutional neural network (CNN) that takes the low-resolution image as input and outputs the high-resolution one. We further show that traditional sparse-coding-based SR methods can also be viewed as a deep convolutional network. Unlike traditional methods, which handle each component separately, our method jointly optimizes all layers. Our deep CNN has a lightweight structure, yet demonstrates state-of-the-art restoration quality and achieves fast speed for practical on-line usage. We explore different network structures and parameter settings to achieve trade-offs between performance and speed. Moreover, we extend our network to cope with three color channels simultaneously and show better overall reconstruction quality.
Comment: 14 pages, 14 figures, journal
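The paper's base configuration is the well-known three-layer 9-1-5 network (64 and 32 filters) applied to a bicubically upscaled low-resolution input. A PyTorch sketch, using same-padding for simplicity where the original uses valid convolutions:

```python
import torch.nn as nn

class SRCNN(nn.Module):
    """Patch extraction (9x9), non-linear mapping (1x1), and
    reconstruction (5x5); the input is the low-resolution image
    upscaled to the target size with bicubic interpolation."""
    def __init__(self, channels: int = 1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, kernel_size=9, padding=4), nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, channels, kernel_size=5, padding=2),
        )

    def forward(self, x):
        return self.net(x)
```

Training minimizes the mean squared error between the network output and the ground-truth high-resolution image.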
Loss Functions for Neural Networks for Image Processing
Neural networks are becoming central in several areas of computer vision and image processing, and different architectures have been proposed to solve specific problems. The impact of the loss layer of neural networks, however, has not received much attention in the context of image processing: the default and virtually only choice is L2. In this paper, we bring attention to alternative choices for image restoration. In particular, we show the importance of perceptually motivated losses when the resulting image is to be evaluated by a human observer. We compare the performance of several losses and propose a novel, differentiable error function. We show that the quality of the results improves significantly with better loss functions, even when the network architecture is left unchanged.
Comment: This paper was published in IEEE Transactions on Computational Imaging on December 23, 2016.
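The loss the paper proposes mixes MS-SSIM with L1. As a single-scale simplification of that idea, the sketch below blends (1 - SSIM) with L1; the Gaussian window and constants follow the standard SSIM definition, the multi-scale pyramid is omitted, and alpha = 0.84 echoes the mixing weight reported in the paper:

```python
import torch
import torch.nn.functional as F

def _gaussian_kernel(size=11, sigma=1.5, channels=1):
    coords = torch.arange(size, dtype=torch.float32) - size // 2
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    g = (g / g.sum()).unsqueeze(0)                       # (1, size)
    return (g.t() @ g).expand(channels, 1, size, size)   # one window per channel

def ssim_l1_loss(pred, target, alpha=0.84, c1=0.01 ** 2, c2=0.03 ** 2):
    """Blend of (1 - SSIM) and L1 for images scaled to [0, 1]."""
    ch = pred.shape[1]
    k = _gaussian_kernel(channels=ch).to(pred)
    mu_x, mu_y = F.conv2d(pred, k, groups=ch), F.conv2d(target, k, groups=ch)
    var_x = F.conv2d(pred * pred, k, groups=ch) - mu_x ** 2
    var_y = F.conv2d(target * target, k, groups=ch) - mu_y ** 2
    cov = F.conv2d(pred * target, k, groups=ch) - mu_x * mu_y
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    return alpha * (1.0 - ssim.mean()) + (1.0 - alpha) * F.l1_loss(pred, target)
```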
An Integrated Autoencoder-Based Filter for Sparse Big Data
We propose a novel filter for sparse big data, called an integrated autoencoder (IAE), which utilizes auxiliary information to mitigate data sparsity. The proposed model achieves an appropriate balance between prediction accuracy, convergence speed, and complexity. We conduct experiments on a GPS trajectory dataset, and the results demonstrate that the IAE is more accurate and robust than several state-of-the-art methods.
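The abstract gives no architectural details, but one plausible reading of "utilizes auxiliary information" is an autoencoder that conditions both the encoder and the decoder on auxiliary features. The sketch below is purely illustrative: the dimensions, the GPS-flavored auxiliary features, and the fusion-by-concatenation design are all assumptions:

```python
import torch
import torch.nn as nn

class IntegratedAutoencoder(nn.Module):
    """Autoencoder that concatenates auxiliary features (e.g., time-of-day
    or road-segment attributes for GPS trajectories) to both the sparse
    input and the latent code, so the decoder can fill in missing entries."""
    def __init__(self, data_dim=512, aux_dim=16, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(data_dim + aux_dim, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, hidden // 2), nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.Linear(hidden // 2 + aux_dim, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, data_dim),
        )

    def forward(self, sparse_x, aux):
        z = self.encoder(torch.cat([sparse_x, aux], dim=-1))
        return self.decoder(torch.cat([z, aux], dim=-1))
```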
X-GANs: Image Reconstruction Made Easy for Extreme Cases
Image reconstruction, including image restoration and denoising, is a challenging problem in the field of image computing. We present a new method, called X-GANs, for the reconstruction of arbitrarily corrupted images based on a variant of conditional generative adversarial networks (conditional GANs). In our method, a novel generator and multi-scale discriminators are proposed, together with combined adversarial losses that integrate a VGG perceptual loss, an adversarial perceptual loss, and an elaborate corresponding-point loss based on an analysis of image features. Our conditional GANs enable a variety of applications in image reconstruction, including image denoising, image restoration from very sparse sampling, image inpainting, and image recovery from severely corrupted blocks or even color-noise-dominated images; these extreme cases have not been addressed by previous work. We significantly improve the accuracy and quality of image reconstruction. Extensive perceptual experiments on datasets ranging from human faces to natural scenes demonstrate that images reconstructed by the presented approach are considerably more realistic than those of alternative methods. Our method can also be extended to handle high-ratio image compression.
Comment: 9 pages, 12 figures
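A hedged sketch of the kind of combined generator objective listed above: a VGG feature (perceptual) term, an adversarial term, and a per-pixel "point" term. The weights are illustrative, the discriminator is assumed to output logits, and the paper's multi-scale discriminators and "adversarial perceptual loss" are collapsed into a single standard adversarial term:

```python
import torch
import torch.nn.functional as F
from torch import nn
from torchvision.models import vgg16, VGG16_Weights

class GeneratorLoss(nn.Module):
    """VGG-perceptual + adversarial + per-pixel L1 loss for the generator.
    Inputs are expected to be normalized with ImageNet statistics for the
    VGG term to be meaningful."""
    def __init__(self, w_vgg=1.0, w_adv=0.01, w_point=10.0):
        super().__init__()
        feats = vgg16(weights=VGG16_Weights.IMAGENET1K_V1).features[:16].eval()
        for p in feats.parameters():
            p.requires_grad_(False)   # VGG is a fixed feature extractor
        self.vgg = feats
        self.weights = (w_vgg, w_adv, w_point)

    def forward(self, fake, real, disc_logits_on_fake):
        w_vgg, w_adv, w_point = self.weights
        loss_vgg = F.l1_loss(self.vgg(fake), self.vgg(real))
        # non-saturating GAN loss: generator wants D(fake) labeled as real
        loss_adv = F.binary_cross_entropy_with_logits(
            disc_logits_on_fake, torch.ones_like(disc_logits_on_fake))
        loss_point = F.l1_loss(fake, real)
        return w_vgg * loss_vgg + w_adv * loss_adv + w_point * loss_point
```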
Segmentation-Aware Image Denoising without Knowing True Segmentation
Several recent works have discussed application-driven image restoration neural networks, which are capable of not only removing noise from images but also preserving their semantic-aware details, making them suitable as a pre-processing step for various high-level computer vision tasks. However, such approaches require extra annotations for their high-level vision tasks in order to train the joint pipeline with hybrid losses. The availability of those annotations is often limited to a few image sets, potentially restricting the general applicability of these methods to denoising unseen and unannotated images. Motivated by this, we propose a segmentation-aware image denoising model dubbed U-SAID, based on a novel unsupervised approach with a pixel-wise uncertainty loss. U-SAID does not need any ground-truth segmentation map and can thus be applied to any image dataset. It generates denoised images of comparable or even better quality, and the denoised results show stronger robustness in subsequent semantic segmentation tasks, compared to both its supervised counterpart and classical "application-agnostic" denoisers. Moreover, we demonstrate the superior generalizability of U-SAID in three ways, by plugging in its "universal" denoiser without fine-tuning: (1) denoising unseen types of images; (2) denoising as pre-processing for segmenting unseen noisy images; and (3) denoising for unseen high-level tasks. Extensive experiments demonstrate the effectiveness, robustness, and generalizability of the proposed U-SAID on various popular image sets.
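The abstract does not spell out the pixel-wise uncertainty loss, so the following is only one way to realize the idea without ground-truth masks: feed the denoised image through a fixed segmentation-style head and penalize the per-pixel entropy (uncertainty) of its class posteriors, alongside an ordinary reconstruction loss. The head, class count, and entropy form below are assumptions, not the paper's module:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PixelwiseUncertaintyLoss(nn.Module):
    """Pushes a denoiser toward outputs on which a fixed (untrained,
    frozen) segmentation-style head produces confident, low-entropy
    per-pixel class posteriors; no segmentation labels are needed."""
    def __init__(self, in_ch=3, n_classes=21):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, n_classes, 1),
        )
        for p in self.head.parameters():
            p.requires_grad_(False)   # gradients flow to the image, not the head

    def forward(self, denoised):
        probs = F.softmax(self.head(denoised), dim=1)
        entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)  # (B, H, W)
        return entropy.mean()
```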
DPW-SDNet: Dual Pixel-Wavelet Domain Deep CNNs for Soft Decoding of JPEG-Compressed Images
JPEG is one of the most widely used lossy compression methods. JPEG-compressed images usually suffer from compression artifacts, including blocking and blurring, especially at low bit-rates. Soft decoding is an effective solution for improving the quality of compressed images without changing the codec or introducing extra coding bits. Inspired by the excellent performance of deep convolutional neural networks (CNNs) on both low-level and high-level computer vision problems, we develop a dual pixel-wavelet domain deep CNN-based soft decoding network for JPEG-compressed images, namely DPW-SDNet. The pixel-domain deep network takes the four downsampled versions of the compressed image as a 4-channel input and outputs a pixel-domain prediction, while the wavelet-domain deep network uses the 1-level discrete wavelet transform (DWT) coefficients as a 4-channel input to produce a DWT-domain prediction. The pixel-domain and wavelet-domain estimates are combined to generate the final soft-decoded result. Experimental results demonstrate the superiority of the proposed DPW-SDNet over several state-of-the-art compression artifact reduction algorithms.
Comment: CVPRW 2018
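The two 4-channel inputs described above can be formed, for example, as follows: the pixel branch stacks the four 2x polyphase downsampled versions of the image, and the wavelet branch stacks the four sub-bands of a 1-level Haar DWT. A PyTorch sketch for grayscale inputs (the polyphase decomposition and the Haar wavelet are assumptions where the abstract is not specific):

```python
import torch
import torch.nn.functional as F

def pixel_branch_input(img):
    """Four 2x downsampled polyphase components of a (B, 1, H, W) image
    stacked into a (B, 4, H/2, W/2) input."""
    return F.pixel_unshuffle(img, downscale_factor=2)

def wavelet_branch_input(img):
    """1-level Haar DWT: LL, LH, HL, HH sub-bands stacked into a
    (B, 4, H/2, W/2) input."""
    h = torch.tensor([1.0, 1.0]) / 2 ** 0.5   # Haar low-pass filter
    g = torch.tensor([1.0, -1.0]) / 2 ** 0.5  # Haar high-pass filter
    filters = torch.stack([
        torch.outer(h, h), torch.outer(h, g),
        torch.outer(g, h), torch.outer(g, g),
    ]).unsqueeze(1)                            # (4, 1, 2, 2)
    return F.conv2d(img, filters.to(img), stride=2)
```

Each branch then predicts a restored signal in its own domain, and the wavelet-domain prediction is inverted back to pixels before the two estimates are combined.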