Non-linear Convolution Filters for CNN-based Learning
In recent years, Convolutional Neural Networks (CNNs) have achieved
state-of-the-art performance in image classification. Their architectures have
largely drawn inspiration from models of the primate visual system. However,
while recent research results in neuroscience prove the existence of non-linear
operations in the response of complex visual cells, little effort has been
devoted to extending the convolution technique to non-linear forms. Typical
convolutional layers are linear systems, hence their expressiveness is limited.
To overcome this, various non-linearities have been used as activation
functions inside CNNs, while also many pooling strategies have been applied. We
address the issue of developing a convolution method in the context of a
computational model of the visual cortex, exploring quadratic forms through the
Volterra kernels. Such forms, constituting a richer function space, are used
as approximations of the response profile of visual cells. Our proposed
second-order convolution is tested on CIFAR-10 and CIFAR-100. We show that a
network which combines linear and non-linear filters in its convolutional
layers can outperform networks that use standard linear filters with the same
architecture, yielding results competitive with the state-of-the-art on these
datasets.
Comment: 9 pages, 5 figures, code link, ICCV 201
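The second-order (quadratic) form mentioned in this abstract can be illustrated on a single flattened patch. The following is a minimal sketch, not the authors' implementation: the function name `volterra_response`, the upper-triangular storage of the quadratic weights, and the toy values are all assumptions made for illustration.

```python
def volterra_response(patch, w1, w2, bias=0.0):
    """Second-order Volterra response of one flattened patch (illustrative).

    patch: list of n input values
    w1:    list of n first-order (linear) weights
    w2:    n x n nested list of second-order weights; only the upper
           triangle (i <= j) is read, so each pair x_i * x_j counts once
    """
    n = len(patch)
    linear = sum(w1[i] * patch[i] for i in range(n))
    quadratic = sum(
        w2[i][j] * patch[i] * patch[j]
        for i in range(n)
        for j in range(i, n)
    )
    return bias + linear + quadratic


# Toy values, chosen only to make the arithmetic easy to follow.
patch = [1.0, 2.0, 3.0]
w1 = [0.5, 0.0, -1.0]
w2 = [
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.5],
    [0.0, 0.0, 0.0],
]
print(volterra_response(patch, w1, w2))  # linear -2.5 plus quadratic 4.0 -> 1.5
```

Setting every entry of `w2` to zero reduces the response to that of a standard linear filter, which is the sense in which the quadratic form extends ordinary convolution.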
The Power of Linear Combinations: Learning with Random Convolutions
Following the traditional paradigm of convolutional neural networks (CNNs),
modern CNNs manage to keep pace with more recent, for example
transformer-based, models by not only increasing model depth and width but also
the kernel size. This results in large amounts of learnable model parameters
that need to be handled during training. While following the convolutional
paradigm with its corresponding spatial inductive bias, we question the
significance of learned convolution filters. In fact, our findings
demonstrate that many contemporary CNN architectures can achieve high test
accuracies without ever updating randomly initialized (spatial) convolution
filters. Instead, simple linear combinations (implemented through efficient
convolutions) suffice to effectively recombine even random filters
into expressive network operators. Furthermore, these combinations of random
filters can implicitly regularize the resulting operations, mitigating
overfitting and enhancing overall performance and robustness. Conversely,
retaining the ability to learn filter updates can impair network performance.
Lastly, although we only observe relatively small gains from learning
convolutions, the learning gains increase proportionally with kernel size,
owing to the non-idealities of the independent and identically distributed
(i.i.d.) nature of default initialization techniques.
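The idea of recombining frozen random filters rests on the linearity of convolution: a weighted sum of filter responses equals the response of the weighted sum of the filters. The sketch below checks this on one patch. It is an illustration under assumptions, not the paper's code: the helper names (`random_filters`, `respond`, `combine`), the Gaussian initialization, and the coefficient values are hypothetical.

```python
import random


def random_filters(count, size, seed=0):
    """Draw `count` frozen size x size filters from a standard normal."""
    rng = random.Random(seed)
    return [
        [[rng.gauss(0.0, 1.0) for _ in range(size)] for _ in range(size)]
        for _ in range(count)
    ]


def respond(filt, patch):
    """Inner product of one filter with one patch of the same shape."""
    return sum(
        f * p
        for filt_row, patch_row in zip(filt, patch)
        for f, p in zip(filt_row, patch_row)
    )


def combine(filters, coeffs):
    """Linear combination of filters; the coefficients are the only
    quantities that would be learned in this sketch."""
    size = len(filters[0])
    return [
        [sum(c * f[r][col] for c, f in zip(coeffs, filters)) for col in range(size)]
        for r in range(size)
    ]


filters = random_filters(count=4, size=3)   # never updated
coeffs = [0.2, -1.0, 0.5, 0.3]              # hypothetical learned weights
patch = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [3.0, 0.0, 1.0]]

combined_first = respond(combine(filters, coeffs), patch)
responses_first = sum(c * respond(f, patch) for c, f in zip(coeffs, filters))
print(abs(combined_first - responses_first) < 1e-9)
```

Because both orders of operation agree, a network can keep the spatial filters fixed and adjust only the combination coefficients, while still realizing a different effective filter for each set of coefficients.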
Kervolutional Neural Networks
Convolutional neural networks (CNNs) have enabled state-of-the-art
performance in many computer vision tasks. However, little effort has been
devoted to establishing convolution in non-linear space. Existing works mainly
leverage activation layers, which can only provide point-wise
non-linearity. To solve this problem, a new operation, kervolution (kernel
convolution), is introduced to approximate complex behaviors of human
perception systems by leveraging the kernel trick. It generalizes convolution,
enhances the model capacity, and captures higher order interactions of
features, via patch-wise kernel functions, but without introducing additional
parameters. Extensive experiments show that kervolutional neural networks (KNN)
achieve higher accuracy and faster convergence than baseline CNNs.
Comment: oral paper in CVPR 201
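A patch-wise kernel function can be pictured as replacing the inner product between a patch and a filter with a kernel evaluation. The sketch below uses a polynomial kernel as one possible choice; the abstract does not say which kernels are used, so the kernel, the function names, and the fixed `offset` and `degree` values are assumptions for illustration only.

```python
def linear_response(patch, weights):
    """Standard convolution response at one location: an inner product."""
    return sum(p * w for p, w in zip(patch, weights))


def polynomial_kervolution(patch, weights, offset=1.0, degree=2):
    """Polynomial-kernel response (<patch, weights> + offset) ** degree.

    offset and degree are fixed hyperparameters in this sketch, so the
    learnable weights are the same ones a linear filter would have.
    """
    return (linear_response(patch, weights) + offset) ** degree


patch = [1.0, 2.0]
weights = [3.0, -1.0]
print(linear_response(patch, weights))          # 1.0
print(polynomial_kervolution(patch, weights))   # (1.0 + 1.0) ** 2 = 4.0
```

With `offset=0.0` and `degree=1` the function returns the plain inner product, which shows how such an operation generalizes ordinary convolution.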
Learning a Dilated Residual Network for SAR Image Despeckling
In this paper, to overcome the limits of traditional linear models for
synthetic aperture radar (SAR) image despeckling, we propose a novel deep
learning approach by learning a non-linear end-to-end mapping between the noisy
and clean SAR images with a dilated residual network (SAR-DRN). SAR-DRN is
based on dilated convolutions, which can both enlarge the receptive field and
maintain the filter size and layer depth with a lightweight structure. In
addition, skip connections and a residual learning strategy are added to the
despeckling model to maintain the image details and reduce the vanishing
gradient problem. The proposed method outperforms traditional and
state-of-the-art despeckling methods on
both quantitative and visual assessments, especially for strong speckle noise.
Comment: 18 pages, 13 figures, 7 tables
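The claim that dilated convolutions enlarge the receptive field without changing the filter size can be seen in one dimension. This is a minimal sketch rather than SAR-DRN itself: the function names, the toy signal, and the use of the cross-correlation convention (no kernel flip, as is common in deep learning) are assumptions.

```python
def receptive_field(kernel_size, dilation):
    """Number of input samples spanned by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1-D dilated convolution (cross-correlation convention)."""
    span = receptive_field(len(kernel), dilation)
    return [
        sum(kernel[k] * signal[start + k * dilation] for k in range(len(kernel)))
        for start in range(len(signal) - span + 1)
    ]


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]  # three weights in both cases

print(receptive_field(3, 1), dilated_conv1d(signal, kernel, dilation=1))
# 3 [-2, -2, -2, -2, -2]
print(receptive_field(3, 2), dilated_conv1d(signal, kernel, dilation=2))
# 5 [-4, -4, -4]
```

The kernel keeps three weights in both calls, yet with `dilation=2` each output depends on samples five positions apart, which is the receptive-field growth the abstract refers to.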