xUnit: Learning a Spatial Activation Function for Efficient Image Restoration
In recent years, deep neural networks (DNNs) achieved unprecedented
performance in many low-level vision tasks. However, state-of-the-art results
are typically achieved by very deep networks, which can reach tens of layers
with tens of millions of parameters. To make DNNs implementable on platforms
with limited resources, it is necessary to weaken the tradeoff between
performance and efficiency. In this paper, we propose a new activation unit,
which is particularly suitable for image restoration problems. In contrast to
the widespread per-pixel activation units, like ReLUs and sigmoids, our unit
implements a learnable nonlinear function with spatial connections. This
enables the net to capture much more complex features, thus requiring a
significantly smaller number of layers in order to reach the same performance.
We illustrate the effectiveness of our units through experiments with
state-of-the-art nets for denoising, de-raining, and super-resolution, which
are already considered to be very small. With our approach, we are able to
further reduce these models by nearly 50% without incurring any degradation in
performance.

Comment: Conference on Computer Vision and Pattern Recognition (CVPR), 201
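The idea of a spatial activation can be sketched as follows. This is a minimal numpy illustration, not the paper's actual xUnit (which uses a learned depthwise-convolution branch with batch normalization and a Gaussian gate); the fixed `kernel` here stands in for the learned spatial weights, and all names are chosen for this sketch.

```python
import numpy as np

def conv2d_same(x, k):
    """Naive single-channel 2D correlation with 'same' zero padding."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def xunit_like(x, kernel):
    """Spatial activation sketch: each pixel is multiplied by a gate
    computed from a *neighborhood* response, in contrast to a
    per-pixel nonlinearity such as ReLU, which looks at one pixel."""
    d = conv2d_same(np.maximum(x, 0.0), kernel)  # spatial branch on ReLU(x)
    gate = np.exp(-d ** 2)                       # multiplicative gate in (0, 1]
    return x * gate
```

Because the gate lies in (0, 1], the unit attenuates rather than amplifies each pixel, with the attenuation pattern decided by the surrounding neighborhood; in the paper this spatial branch is learned jointly with the network.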
Sparsity Aware Normalization for GANs
Generative adversarial networks (GANs) are known to benefit from
regularization or normalization of their critic (discriminator) network during
training. In this paper, we analyze the popular spectral normalization scheme,
find a significant drawback and introduce sparsity aware normalization (SAN), a
new alternative approach for stabilizing GAN training. As opposed to other
normalization methods, our approach explicitly accounts for the sparse nature
of the feature maps in convolutional networks with ReLU activations. We
illustrate the effectiveness of our method through extensive experiments with a
variety of network architectures. As we show, sparsity is particularly dominant
in critics used for image-to-image translation settings. In these cases our
approach improves upon existing methods, in fewer training epochs and with
smaller capacity networks, while requiring practically no computational
overhead.

Comment: AAAI Conference on Artificial Intelligence (AAAI-21)
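The spectral normalization scheme the paper analyzes constrains the critic by dividing each weight matrix by an estimate of its largest singular value, typically obtained by power iteration. A minimal numpy sketch of that baseline scheme (the paper's SAN alternative is not reproduced here, since the abstract does not detail it; the function name and iteration count are choices for this sketch):

```python
import numpy as np

def spectral_normalize(W, n_iter=50, eps=1e-12):
    """Return W scaled so its largest singular value is ~1,
    using power iteration on W W^T (as in spectral normalization)."""
    rng = np.random.default_rng(0)
    u = rng.standard_normal(W.shape[0])
    for _ in range(n_iter):
        v = W.T @ u
        v /= np.linalg.norm(v) + eps   # right singular vector estimate
        u = W @ v
        u /= np.linalg.norm(u) + eps   # left singular vector estimate
    sigma = u @ W @ v                  # top singular value estimate
    return W / sigma
```

In practice (e.g. in GAN training loops) only one power-iteration step is run per update, with `u` persisted between steps; many iterations are used here so a single call converges.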