Fast space-variant elliptical filtering using box splines
The efficient realization of linear space-variant (non-convolution) filters
is a challenging computational problem in image processing. In this paper, we
demonstrate that it is possible to filter an image with a Gaussian-like
elliptic window of varying size, elongation and orientation using a fixed
number of computations per pixel. The associated algorithm, which is based on a
family of smooth compactly supported piecewise polynomials, the
radially-uniform box splines, is realized using pre-integration and local
finite-differences. The radially-uniform box splines are constructed through
the repeated convolution of a fixed number of box distributions, which have
been suitably scaled and distributed radially in a uniform fashion. The
attractive features of these box splines are their asymptotic behavior, their
simple covariance structure, and their quasi-separability. They converge to
Gaussians with the increase of their order, and are used to approximate
anisotropic Gaussians of varying covariance simply by controlling the scales of
the constituent box distributions. Based on the second feature, we develop a
technique for continuously controlling the size, elongation and orientation of
these Gaussian-like functions. Finally, the quasi-separable structure, along
with a certain scaling property of box distributions, is used to efficiently
realize the associated space-variant elliptical filtering, which requires O(1)
computations per pixel irrespective of the shape and size of the filter.
Comment: 12 figures; IEEE Transactions on Image Processing, vol. 19, 201
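The pre-integration and finite-difference idea in this abstract can be illustrated in one dimension. The sketch below is not the authors' box-spline algorithm: it is a plain space-variant box (moving-average) filter, and the function name and border handling are assumptions made for illustration. It only shows why the cost per sample does not depend on the window size.

```python
import numpy as np

def space_variant_box_filter(signal, radii):
    """Average each sample over a window whose radius varies per sample.

    Hypothetical helper for illustration; radii must be non-negative integers.
    """
    signal = np.asarray(signal, dtype=float)
    radii = np.asarray(radii, dtype=int)
    n = signal.size
    # Pre-integration: prefix[k] holds the sum of signal[0:k].
    prefix = np.concatenate(([0.0], np.cumsum(signal)))
    idx = np.arange(n)
    lo = np.clip(idx - radii, 0, n)      # inclusive window start, clamped at the border
    hi = np.clip(idx + radii + 1, 0, n)  # exclusive window end, clamped at the border
    # Local finite difference: one subtraction per sample, whatever the radius.
    return (prefix[hi] - prefix[lo]) / (hi - lo)
```

Each output sample costs one subtraction and one division after the single cumulative sum, so a radius of 1 and a radius of 100 take the same work per sample.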
Distance Measures for Reduced Ordering Based Vector Filters
Reduced ordering based vector filters have proved successful in removing
long-tailed noise from color images while preserving edges and fine image
details. These filters commonly utilize variants of the Minkowski distance to
order the color vectors with the aim of distinguishing between noisy and
noise-free vectors. In this paper, we review various alternative distance
measures and evaluate their performance on a large and diverse set of images
using several effectiveness and efficiency criteria. The results demonstrate
that there are in fact strong alternatives to the popular Minkowski metrics.
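As a rough illustration of reduced ordering with a Minkowski distance, the sketch below implements a basic vector median over one filter window. The function name and the choice of the plain vector median are assumptions; the paper's own filters and alternative distance measures are not reproduced here.

```python
import numpy as np

def vector_median(window, p=2.0):
    """Return the vector in `window` with the smallest aggregate Minkowski distance.

    Hypothetical helper: `window` is an array whose last axis holds the
    color channels; `p` is the Minkowski order (1 = city-block, 2 = Euclidean).
    """
    window = np.asarray(window, dtype=float)
    vectors = window.reshape(-1, window.shape[-1])
    # Pairwise absolute channel differences, shape (n, n, channels).
    diffs = np.abs(vectors[:, None, :] - vectors[None, :, :])
    # Minkowski distance between every pair of vectors.
    dists = (diffs ** p).sum(axis=-1) ** (1.0 / p)
    # Reduced ordering: each vector is ranked by one scalar, its summed distance.
    scores = dists.sum(axis=1)
    return vectors[np.argmin(scores)]
```

Because the output is always one of the input vectors, an isolated outlier in the window is rejected rather than blended into its neighbors.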
Multi-directional Geodesic Neural Networks via Equivariant Convolution
We propose a novel approach for performing convolution of signals on curved
surfaces and show its utility in a variety of geometric deep learning
applications. Key to our construction is the notion of directional functions
defined on the surface, which extend the classic real-valued signals and which
can be naturally convolved with real-valued template functions. As a
result, rather than trying to fix a canonical orientation or only keeping the
maximal response across all alignments of a 2D template at every point of the
surface, as done in previous works, we show how information across all
rotations can be kept across different layers of the neural network. Our
construction, which we call multi-directional geodesic convolution, or
directional convolution for short, makes it possible, in particular, to propagate and
relate directional information across layers and thus different regions on the
shape. We first define directional convolution in the continuous setting, prove
its key properties and then show how it can be implemented in practice, for
shapes represented as triangle meshes. We evaluate directional convolution in a
wide variety of learning scenarios ranging from classification of signals on
surfaces, to shape segmentation and shape matching, where we show a significant
improvement over several baselines.
Perceptually-Driven Video Coding with the Daala Video Codec
The Daala project is a royalty-free video codec that attempts to compete with
the best patent-encumbered codecs. Part of our strategy is to replace core
tools of traditional video codecs with alternative approaches, many of them
designed to take perceptual aspects into account, rather than optimizing for
simple metrics like PSNR. This paper documents some of our experiences with
these tools, which ones worked and which did not. We evaluate which tools are
easy to integrate into a more traditional codec design, and show results in the
context of the codec being developed by the Alliance for Open Media.
Comment: 19 pages, Proceedings of SPIE Workshop on Applications of Digital
Image Processing (ADIP), 201
Fast adaptive elliptical filtering using box splines
We demonstrate that it is possible to filter an image with an elliptic window
of varying size, elongation and orientation with a fixed computational cost per
pixel. Our method involves the application of a suitable global pre-integrator
followed by a pointwise-adaptive localization mesh. We present the basic theory
for the 1D case using a B-spline formalism and then appropriately extend it to
2D using radially-uniform box splines. The size and ellipticity of these
radially-uniform box splines are adaptively controlled. Moreover, they converge
to Gaussians as the order increases. Finally, we present a fast and practical
directional filtering algorithm that has the capability of adapting to the
local image features.
Comment: 9 pages, 1 figure
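The claim that the splines converge to Gaussians as the order increases can be checked numerically in the 1D case. The sketch below is a simplified stand-in, assuming a discrete box of odd width convolved with itself (a discrete B-spline), not the 2D radially-uniform box splines of the paper; all function names are hypothetical.

```python
import numpy as np

def repeated_box_kernel(width, order):
    """Convolve a normalized box of `width` samples with itself `order` times in total."""
    box = np.full(width, 1.0 / width)
    kernel = box.copy()
    for _ in range(order - 1):
        kernel = np.convolve(kernel, box)
    return kernel

def sampled_gaussian(length, variance):
    """Zero-mean Gaussian sampled on a centered grid and normalized to unit sum."""
    x = np.arange(length) - (length - 1) / 2.0
    g = np.exp(-x ** 2 / (2.0 * variance))
    return g / g.sum()

def max_gap_to_gaussian(width, order):
    """Largest pointwise gap between the kernel and a variance-matched Gaussian."""
    kernel = repeated_box_kernel(width, order)
    # A discrete box of `width` samples has variance (width^2 - 1) / 12;
    # variances add under convolution.
    variance = order * (width ** 2 - 1) / 12.0
    return np.max(np.abs(kernel - sampled_gaussian(kernel.size, variance)))
```

Comparing `max_gap_to_gaussian(9, 1)` with `max_gap_to_gaussian(9, 4)` shows the gap shrinking as the order grows, in line with the central limit theorem.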
Complex data processing: fast wavelet analysis on the sphere
In the general context of complex data processing, this paper reviews a
recent practical approach to the continuous wavelet formalism on the sphere.
This formalism notably yields a correspondence principle which relates wavelets
on the plane and on the sphere. Two fast algorithms are also presented for the
analysis of signals on the sphere with steerable wavelets.
Comment: 20 pages, 5 figures, JFAA style, paper invited to J. Fourier Anal.
and Appli
ShearLab 3D: Faithful Digital Shearlet Transforms based on Compactly Supported Shearlets
Wavelets and their associated transforms are highly efficient when
approximating and analyzing one-dimensional signals. However, multivariate
signals such as images or videos typically exhibit curvilinear singularities,
which wavelets provably fail to approximate sparsely and also fail to
analyze in the sense of, for instance, detecting their direction. Shearlets
are a directional representation system extending the wavelet framework, which
overcomes those deficiencies. Similar to wavelets, shearlets allow a faithful
implementation and fast associated transforms. In this paper, we will introduce
a comprehensive, carefully documented software package called ShearLab 3D
(www.ShearLab.org) and discuss its algorithmic details. This package provides
MATLAB code for a novel faithful algorithmic realization of the 2D and 3D
shearlet transform (and their inverses) associated with compactly supported
universal shearlet systems incorporating the option of using CUDA. We will
present extensive numerical experiments in 2D and 3D concerning denoising,
inpainting, and feature extraction, comparing the performance of ShearLab 3D
with similar transform-based algorithms such as curvelets, contourlets, or
surfacelets. In the spirit of reproducible research, all scripts are
accessible on www.ShearLab.org.
Comment: There is another shearlet software package
(http://www.mathematik.uni-kl.de/imagepro/members/haeuser/ffst/) by S.
Häuser and G. Steidl. We will include this in a revision
Two-Stream Convolutional Networks for Action Recognition in Videos
We investigate architectures of discriminatively trained deep Convolutional
Networks (ConvNets) for action recognition in video. The challenge is to
capture the complementary information on appearance from still frames and
motion between frames. We also aim to generalise the best performing
hand-crafted features within a data-driven learning framework.
Our contribution is three-fold. First, we propose a two-stream ConvNet
architecture which incorporates spatial and temporal networks. Second, we
demonstrate that a ConvNet trained on multi-frame dense optical flow is able to
achieve very good performance in spite of limited training data. Finally, we
show that multi-task learning, applied to two different action classification
datasets, can be used to increase the amount of training data and improve the
performance on both.
Our architecture is trained and evaluated on the standard video actions
benchmarks of UCF-101 and HMDB-51, where it is competitive with the state of
the art. It also exceeds by a large margin previous attempts to use deep nets
for video classification.