13 research outputs found
Image Super-Resolution with Deep Dictionary
Since the first success of Dong et al., the deep-learning-based approach has
become dominant in the field of single-image super-resolution. This replaces
all the handcrafted image processing steps of traditional sparse-coding-based
methods with a deep neural network. In contrast to sparse-coding-based methods,
which explicitly create high/low-resolution dictionaries, the dictionaries in
deep-learning-based methods are implicitly acquired as a nonlinear combination
of multiple convolutions. One disadvantage of deep-learning-based methods is
that their performance is degraded for images created differently from the
training dataset (out-of-domain images). We propose an end-to-end
super-resolution network with a deep dictionary (SRDD), where a high-resolution
dictionary is explicitly learned without sacrificing the advantages of deep
learning. Extensive experiments show that explicit learning of high-resolution
dictionary makes the network more robust for out-of-domain test images while
maintaining the performance of the in-domain test images.Comment: ECCV 202
Learning Linear Groups in Neural Networks
Employing equivariance in neural networks leads to greater parameter
efficiency and improved generalization performance through the encoding of
domain knowledge in the architecture; however, the majority of existing
approaches require an a priori specification of the desired symmetries. We
present a neural network architecture, Linear Group Networks (LGNs), for
learning linear groups acting on the weight space of neural networks. Linear
groups are desirable due to their inherent interpretability, as they can be
represented as finite matrices. LGNs learn groups without any supervision or
knowledge of the hidden symmetries in the data and the groups can be mapped to
well known operations in machine learning. We use LGNs to learn groups on
multiple datasets while considering different downstream tasks; we demonstrate
that the linear group structure depends on both the data distribution and the
considered task
SC-VAE: Sparse Coding-based Variational Autoencoder
Learning rich data representations from unlabeled data is a key challenge
towards applying deep learning algorithms in downstream supervised tasks.
Several variants of variational autoencoders have been proposed to learn
compact data representaitons by encoding high-dimensional data in a lower
dimensional space. Two main classes of VAEs methods may be distinguished
depending on the characteristics of the meta-priors that are enforced in the
representation learning step. The first class of methods derives a continuous
encoding by assuming a static prior distribution in the latent space. The
second class of methods learns instead a discrete latent representation using
vector quantization (VQ) along with a codebook. However, both classes of
methods suffer from certain challenges, which may lead to suboptimal image
reconstruction results. The first class of methods suffers from posterior
collapse, whereas the second class of methods suffers from codebook collapse.
To address these challenges, we introduce a new VAE variant, termed SC-VAE
(sparse coding-based VAE), which integrates sparse coding within variational
autoencoder framework. Instead of learning a continuous or discrete latent
representation, the proposed method learns a sparse data representation that
consists of a linear combination of a small number of learned atoms. The sparse
coding problem is solved using a learnable version of the iterative shrinkage
thresholding algorithm (ISTA). Experiments on two image datasets demonstrate
that our model can achieve improved image reconstruction results compared to
state-of-the-art methods. Moreover, the use of learned sparse code vectors
allows us to perform downstream task like coarse image segmentation through
clustering image patches.Comment: 15 pages, 11 figures, and 3 table
Fast and Interpretable Nonlocal Neural Networks for Image Denoising via Group-Sparse Convolutional Dictionary Learning
Nonlocal self-similarity within natural images has become an increasingly
popular prior in deep-learning models. Despite their successful image
restoration performance, such models remain largely uninterpretable due to
their black-box construction. Our previous studies have shown that
interpretable construction of a fully convolutional denoiser (CDLNet), with
performance on par with state-of-the-art black-box counterparts, is achievable
by unrolling a dictionary learning algorithm. In this manuscript, we seek an
interpretable construction of a convolutional network with a nonlocal
self-similarity prior that performs on par with black-box nonlocal models. We
show that such an architecture can be effectively achieved by upgrading the
sparsity prior of CDLNet to a weighted group-sparsity prior. From this
formulation, we propose a novel sliding-window nonlocal operation, enabled by
sparse array arithmetic. In addition to competitive performance with black-box
nonlocal DNNs, we demonstrate the proposed sliding-window sparse attention
enables inference speeds greater than an order of magnitude faster than its
competitors.Comment: 11 pages, 8 figures, 6 table
Fully Trainable and Interpretable Non-Local Sparse Models for Image Restoration
Non-local self-similarity and sparsity principles have proven to be powerful
priors for natural image modeling. We propose a novel differentiable relaxation
of joint sparsity that exploits both principles and leads to a general
framework for image restoration which is (1) trainable end to end, (2) fully
interpretable, and (3) much more compact than competing deep learning
architectures. We apply this approach to denoising, jpeg deblocking, and
demosaicking, and show that, with as few as 100K parameters, its performance on
several standard benchmarks is on par or better than state-of-the-art methods
that may have an order of magnitude or more parameters.Comment: ECCV 202
A Flexible Framework for Designing Trainable Priors with Adaptive Smoothing and Game Encoding
We introduce a general framework for designing and training neural network
layers whose forward passes can be interpreted as solving non-smooth convex
optimization problems, and whose architectures are derived from an optimization
algorithm. We focus on convex games, solved by local agents represented by the
nodes of a graph and interacting through regularization functions. This
approach is appealing for solving imaging problems, as it allows the use of
classical image priors within deep models that are trainable end to end. The
priors used in this presentation include variants of total variation, Laplacian
regularization, bilateral filtering, sparse coding on learned dictionaries, and
non-local self similarities. Our models are fully interpretable as well as
parameter and data efficient. Our experiments demonstrate their effectiveness
on a large diversity of tasks ranging from image denoising and compressed
sensing for fMRI to dense stereo matching.Comment: NeurIPS 202