1,807 research outputs found
Learning Sparse High Dimensional Filters: Image Filtering, Dense CRFs and Bilateral Neural Networks
Bilateral filters have wide spread use due to their edge-preserving
properties. The common use case is to manually choose a parametric filter type,
usually a Gaussian filter. In this paper, we will generalize the
parametrization and in particular derive a gradient descent algorithm so the
filter parameters can be learned from data. This derivation allows to learn
high dimensional linear filters that operate in sparsely populated feature
spaces. We build on the permutohedral lattice construction for efficient
filtering. The ability to learn more general forms of high-dimensional filters
can be used in several diverse applications. First, we demonstrate the use in
applications where single filter applications are desired for runtime reasons.
Further, we show how this algorithm can be used to learn the pairwise
potentials in densely connected conditional random fields and apply these to
different image segmentation tasks. Finally, we introduce layers of bilateral
filters in CNNs and propose bilateral neural networks for the use of
high-dimensional sparse data. This view provides new ways to encode model
structure into network architectures. A diverse set of experiments empirically
validates the usage of general forms of filters
Segmentation-Aware Convolutional Networks Using Local Attention Masks
We introduce an approach to integrate segmentation information within a
convolutional neural network (CNN). This counter-acts the tendency of CNNs to
smooth information across regions and increases their spatial precision. To
obtain segmentation information, we set up a CNN to provide an embedding space
where region co-membership can be estimated based on Euclidean distance. We use
these embeddings to compute a local attention mask relative to every neuron
position. We incorporate such masks in CNNs and replace the convolution
operation with a "segmentation-aware" variant that allows a neuron to
selectively attend to inputs coming from its own region. We call the resulting
network a segmentation-aware CNN because it adapts its filters at each image
point according to local segmentation cues. We demonstrate the merit of our
method on two widely different dense prediction tasks, that involve
classification (semantic segmentation) and regression (optical flow). Our
results show that in semantic segmentation we can match the performance of
DenseCRFs while being faster and simpler, and in optical flow we obtain clearly
sharper responses than networks that do not use local attention masks. In both
cases, segmentation-aware convolution yields systematic improvements over
strong baselines. Source code for this work is available online at
http://cs.cmu.edu/~aharley/segaware
Learning Enriched Features for Real Image Restoration and Enhancement
With the goal of recovering high-quality image content from its degraded
version, image restoration enjoys numerous applications, such as in
surveillance, computational photography, medical imaging, and remote sensing.
Recently, convolutional neural networks (CNNs) have achieved dramatic
improvements over conventional approaches for image restoration task. Existing
CNN-based methods typically operate either on full-resolution or on
progressively low-resolution representations. In the former case, spatially
precise but contextually less robust results are achieved, while in the latter
case, semantically reliable but spatially less accurate outputs are generated.
In this paper, we present a novel architecture with the collective goals of
maintaining spatially-precise high-resolution representations through the
entire network and receiving strong contextual information from the
low-resolution representations. The core of our approach is a multi-scale
residual block containing several key elements: (a) parallel multi-resolution
convolution streams for extracting multi-scale features, (b) information
exchange across the multi-resolution streams, (c) spatial and channel attention
mechanisms for capturing contextual information, and (d) attention based
multi-scale feature aggregation. In a nutshell, our approach learns an enriched
set of features that combines contextual information from multiple scales,
while simultaneously preserving the high-resolution spatial details. Extensive
experiments on five real image benchmark datasets demonstrate that our method,
named as MIRNet, achieves state-of-the-art results for a variety of image
processing tasks, including image denoising, super-resolution, and image
enhancement. The source code and pre-trained models are available at
https://github.com/swz30/MIRNet.Comment: Accepted for publication at ECCV 202
Knowledge-driven deep learning for fast MR imaging: undersampled MR image reconstruction from supervised to un-supervised learning
Deep learning (DL) has emerged as a leading approach in accelerating MR
imaging. It employs deep neural networks to extract knowledge from available
datasets and then applies the trained networks to reconstruct accurate images
from limited measurements. Unlike natural image restoration problems, MR
imaging involves physics-based imaging processes, unique data properties, and
diverse imaging tasks. This domain knowledge needs to be integrated with
data-driven approaches. Our review will introduce the significant challenges
faced by such knowledge-driven DL approaches in the context of fast MR imaging
along with several notable solutions, which include learning neural networks
and addressing different imaging application scenarios. The traits and trends
of these techniques have also been given which have shifted from supervised
learning to semi-supervised learning, and finally, to unsupervised learning
methods. In addition, MR vendors' choices of DL reconstruction have been
provided along with some discussions on open questions and future directions,
which are critical for the reliable imaging systems.Comment: 46 pages, 5figures, 1 tabl
- …