A Fully Progressive Approach to Single-Image Super-Resolution
Recent deep learning approaches to single image super-resolution have
achieved impressive results in terms of traditional error measures and
perceptual quality. However, in each case it remains challenging to achieve
high quality results for large upsampling factors. To this end, we propose a
method (ProSR) that is progressive both in architecture and training: the
network upsamples an image in intermediate steps, while the learning process is
organized from easy to hard, as is done in curriculum learning. To obtain more
photorealistic results, we design a generative adversarial network (GAN), named
ProGanSR, that follows the same progressive multi-scale design principle. This
not only allows the method to scale well to high upsampling factors (e.g., 8x) but
also constitutes a principled multi-scale approach that increases the
reconstruction quality for all upsampling factors simultaneously. In particular,
ProSR ranks 2nd in terms of SSIM and 4th in terms of PSNR in the NTIRE2018 SISR
challenge [34]. Compared to the top-ranking team, our model scores marginally
lower but runs 5 times faster.
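The idea of upsampling in intermediate steps can be illustrated with a minimal sketch. It assumes nearest-neighbour pixel repetition as a stand-in for the learned upsampling layers, which the abstract does not specify; the function names `upsample_2x` and `progressive_upsample` are hypothetical and not taken from ProSR.

```python
def upsample_2x(img):
    """Double width and height by repeating each pixel.

    Stand-in only: a real network would use learned layers here.
    """
    out = []
    for row in img:
        wide = [p for p in row for _ in range(2)]  # repeat each pixel horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat the row vertically
    return out


def progressive_upsample(img, factor):
    """Reach `factor` (a power of two) through successive 2x steps."""
    if factor < 1 or factor & (factor - 1):
        raise ValueError("factor must be a power of two")
    steps = factor.bit_length() - 1                # 8x -> 3 steps of 2x
    for _ in range(steps):
        img = upsample_2x(img)
    return img


small = [[1, 2],
         [3, 4]]
large = progressive_upsample(small, 8)             # 2x2 -> 16x16 in three steps
```

Each intermediate result (2x, 4x) is itself a valid output, which is what makes a curriculum from small to large factors possible in such a design.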
Incremental Learning Using a Grow-and-Prune Paradigm with Efficient Neural Networks
Deep neural networks (DNNs) have become a widely deployed model for numerous
machine learning applications. However, their fixed architecture, substantial
training cost, and significant model redundancy make it difficult to
efficiently update them to accommodate previously unseen data. To solve these
problems, we propose an incremental learning framework based on a
grow-and-prune neural network synthesis paradigm. When new data arrive, the
neural network first grows new connections based on the gradients to increase
the network capacity to accommodate new data. Then, the framework iteratively
prunes away connections based on the magnitude of weights to enhance network
compactness, and hence recover efficiency. Finally, the model rests at a
lightweight DNN that is both ready for inference and suitable for future
grow-and-prune updates. The proposed framework improves accuracy, shrinks
network size, and significantly reduces the additional training cost for
incoming data compared to conventional approaches, such as training from
scratch and network fine-tuning. For the LeNet-300-100 and LeNet-5 neural
network architectures derived for the MNIST dataset, the framework reduces
training cost by up to 64% (63%) and 67% (63%) compared to training from
scratch (network fine-tuning), respectively. For the ResNet-18 architecture
derived for the ImageNet dataset and DeepSpeech2 for the AN4 dataset, the
corresponding training cost reductions against training from scratch (network
fine-tuning) are 64% (60%) and 67% (62%), respectively. Our derived models
contain fewer network parameters but achieve higher accuracy relative to
conventional baselines.
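The two phases described in this abstract, gradient-based growth followed by magnitude-based pruning, can be sketched on a flat list of connections. This is a simplified illustration under stated assumptions: the connection mask, the fixed count `k`, and the function names `grow` and `prune` are hypothetical, and the training between the two phases is represented only by a hand-written list of updated weights.

```python
def grow(mask, grads, k):
    """Activate the k dormant connections with the largest gradient magnitude."""
    dormant = [i for i, m in enumerate(mask) if not m]
    dormant.sort(key=lambda i: abs(grads[i]), reverse=True)
    new_mask = list(mask)
    for i in dormant[:k]:
        new_mask[i] = True
    return new_mask


def prune(mask, weights, k):
    """Deactivate the k active connections with the smallest weight magnitude."""
    active = [i for i, m in enumerate(mask) if m]
    active.sort(key=lambda i: abs(weights[i]))
    new_mask = list(mask)
    for i in active[:k]:
        new_mask[i] = False
    return new_mask


mask = [True, False, True, False, True]        # which connections currently exist
grads = [0.1, -0.8, 0.2, 0.3, 0.0]             # assumed gradients on new data
grown = grow(mask, grads, 1)                   # connection 1 has the largest |gradient|

# Assumed weights after training the grown network (not computed here).
weights = [0.9, 0.6, -0.05, 0.0, 0.4]
pruned = prune(grown, weights, 1)              # connection 2 has the smallest |weight|
```

In this toy run the network ends with the same number of connections it started with, but a different set of them, which mirrors the abstract's claim that the model returns to a compact form after each update.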
A bio-inspired image coder with temporal scalability
We present a novel bio-inspired and dynamic coding scheme for static images.
Our coder aims at reproducing the main steps of the visual stimulus processing
in the mammalian retina taking into account its time behavior. The main novelty
of this work is to show how to exploit the time behavior of the retina cells to
ensure, in a simple way, scalability and bit allocation. To do so, our main
source of inspiration will be the biologically plausible retina model called
Virtual Retina. Following a similar structure, our model has two stages. The
first stage is an image transform which is performed by the outer layers in the
retina. Here it is modelled by filtering the image with a bank of difference of
Gaussians with time-delays. The second stage is a time-dependent
analog-to-digital conversion which is performed by the inner layers in the
retina. By design, our coder enables scalability and bit
allocation across time. Also, our decoded images do not show annoying artefacts
such as ringing and block effects. As a whole, this article shows how to
capture the main properties of a biological system, here the retina, in order
to design a new efficient coder.
Comment: 12 pages; Advanced Concepts for Intelligent Vision Systems (ACIVS
2011)
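The first stage mentioned in this abstract filters the image with differences of Gaussians. A minimal one-dimensional sketch of such a filter follows. The sigma values, the kernel radius, and the function names are illustrative assumptions, and the time delays and the bank of multiple scales described in the abstract are not modelled.

```python
import math


def gaussian_kernel(sigma, radius):
    """Sampled Gaussian on [-radius, radius], normalised to sum to 1."""
    vals = [math.exp(-(x * x) / (2 * sigma * sigma))
            for x in range(-radius, radius + 1)]
    total = sum(vals)
    return [v / total for v in vals]


def dog_kernel(sigma_center, sigma_surround, radius):
    """Difference of Gaussians: narrow centre minus wide surround."""
    center = gaussian_kernel(sigma_center, radius)
    surround = gaussian_kernel(sigma_surround, radius)
    return [c - s for c, s in zip(center, surround)]


def filter_valid(signal, kernel):
    """Correlate signal with kernel, keeping only fully overlapping positions."""
    n = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(n))
            for i in range(len(signal) - n + 1)]


kernel = dog_kernel(1.0, 2.0, 6)               # assumed sigmas and radius
flat = filter_valid([5.0] * 20, kernel)        # constant input
edge = filter_valid([0.0] * 10 + [1.0] * 10, kernel)  # step input
```

Because both Gaussians are normalised, the kernel sums to approximately zero, so a constant region produces a near-zero response while the step input does not.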