1,562 research outputs found
BLADE: Filter Learning for General Purpose Computational Photography
The Rapid and Accurate Image Super Resolution (RAISR) method of Romano,
Isidoro, and Milanfar is a computationally efficient image upscaling method
using a trained set of filters. We describe a generalization of RAISR, which we
name Best Linear Adaptive Enhancement (BLADE). This approach is a trainable
edge-adaptive filtering framework that is general, simple, computationally
efficient, and useful for a wide range of problems in computational
photography. We show applications to operations which may appear in a camera
pipeline including denoising, demosaicing, and stylization
PYRO-NN: Python Reconstruction Operators in Neural Networks
Purpose: Recently, several attempts were conducted to transfer deep learning
to medical image reconstruction. An increasingly number of publications follow
the concept of embedding the CT reconstruction as a known operator into a
neural network. However, most of the approaches presented lack an efficient CT
reconstruction framework fully integrated into deep learning environments. As a
result, many approaches are forced to use workarounds for mathematically
unambiguously solvable problems. Methods: PYRO-NN is a generalized framework to
embed known operators into the prevalent deep learning framework Tensorflow.
The current status includes state-of-the-art parallel-, fan- and cone-beam
projectors and back-projectors accelerated with CUDA provided as Tensorflow
layers. On top, the framework provides a high level Python API to conduct FBP
and iterative reconstruction experiments with data from real CT systems.
Results: The framework provides all necessary algorithms and tools to design
end-to-end neural network pipelines with integrated CT reconstruction
algorithms. The high level Python API allows a simple use of the layers as
known from Tensorflow. To demonstrate the capabilities of the layers, the
framework comes with three baseline experiments showing a cone-beam short scan
FDK reconstruction, a CT reconstruction filter learning setup, and a TV
regularized iterative reconstruction. All algorithms and tools are referenced
to a scientific publication and are compared to existing non deep learning
reconstruction frameworks. The framework is available as open-source software
at \url{https://github.com/csyben/PYRO-NN}. Conclusions: PYRO-NN comes with the
prevalent deep learning framework Tensorflow and allows to setup end-to-end
trainable neural networks in the medical image reconstruction context. We
believe that the framework will be a step towards reproducible researchComment: V1: Submitted to Medical Physics, 11 pages, 7 figure
Learning Spectral-Spatial-Temporal Features via a Recurrent Convolutional Neural Network for Change Detection in Multispectral Imagery
Change detection is one of the central problems in earth observation and was
extensively investigated over recent decades. In this paper, we propose a novel
recurrent convolutional neural network (ReCNN) architecture, which is trained
to learn a joint spectral-spatial-temporal feature representation in a unified
framework for change detection in multispectral images. To this end, we bring
together a convolutional neural network (CNN) and a recurrent neural network
(RNN) into one end-to-end network. The former is able to generate rich
spectral-spatial feature representations, while the latter effectively analyzes
temporal dependency in bi-temporal images. In comparison with previous
approaches to change detection, the proposed network architecture possesses
three distinctive properties: 1) It is end-to-end trainable, in contrast to
most existing methods whose components are separately trained or computed; 2)
it naturally harnesses spatial information that has been proven to be
beneficial to change detection task; 3) it is capable of adaptively learning
the temporal dependency between multitemporal images, unlike most of algorithms
that use fairly simple operation like image differencing or stacking. As far as
we know, this is the first time that a recurrent convolutional network
architecture has been proposed for multitemporal remote sensing image analysis.
The proposed network is validated on real multispectral data sets. Both visual
and quantitative analysis of experimental results demonstrates competitive
performance in the proposed mode
Surface Networks
We study data-driven representations for three-dimensional triangle meshes,
which are one of the prevalent objects used to represent 3D geometry. Recent
works have developed models that exploit the intrinsic geometry of manifolds
and graphs, namely the Graph Neural Networks (GNNs) and its spectral variants,
which learn from the local metric tensor via the Laplacian operator. Despite
offering excellent sample complexity and built-in invariances, intrinsic
geometry alone is invariant to isometric deformations, making it unsuitable for
many applications. To overcome this limitation, we propose several upgrades to
GNNs to leverage extrinsic differential geometry properties of
three-dimensional surfaces, increasing its modeling power.
In particular, we propose to exploit the Dirac operator, whose spectrum
detects principal curvature directions --- this is in stark contrast with the
classical Laplace operator, which directly measures mean curvature. We coin the
resulting models \emph{Surface Networks (SN)}. We prove that these models
define shape representations that are stable to deformation and to
discretization, and we demonstrate the efficiency and versatility of SNs on two
challenging tasks: temporal prediction of mesh deformations under non-linear
dynamics and generative models using a variational autoencoder framework with
encoders/decoders given by SNs
When Kernel Methods meet Feature Learning: Log-Covariance Network for Action Recognition from Skeletal Data
Human action recognition from skeletal data is a hot research topic and
important in many open domain applications of computer vision, thanks to
recently introduced 3D sensors. In the literature, naive methods simply
transfer off-the-shelf techniques from video to the skeletal representation.
However, the current state-of-the-art is contended between to different
paradigms: kernel-based methods and feature learning with (recurrent) neural
networks. Both approaches show strong performances, yet they exhibit heavy, but
complementary, drawbacks. Motivated by this fact, our work aims at combining
together the best of the two paradigms, by proposing an approach where a
shallow network is fed with a covariance representation. Our intuition is that,
as long as the dynamics is effectively modeled, there is no need for the
classification network to be deep nor recurrent in order to score favorably. We
validate this hypothesis in a broad experimental analysis over 6 publicly
available datasets.Comment: 2017 IEEE Computer Vision and Pattern Recognition (CVPR) Workshop
Dynamic Variational Autoencoders for Visual Process Modeling
This work studies the problem of modeling visual processes by leveraging deep
generative architectures for learning linear, Gaussian representations from
observed sequences. We propose a joint learning framework, combining a vector
autoregressive model and Variational Autoencoders. This results in an
architecture that allows Variational Autoencoders to simultaneously learn a
non-linear observation as well as a linear state model from sequences of
frames. We validate our approach on artificial sequences and dynamic textures
- …