Sparse Modeling for Image and Vision Processing
In recent years, a large amount of multi-disciplinary research has been
conducted on sparse models and their applications. In statistics and machine
learning, the sparsity principle is used to perform model selection---that is,
automatically selecting a simple model among a large collection of them. In
signal processing, sparse coding consists of representing data with linear
combinations of a few dictionary elements. Subsequently, the corresponding
tools have been widely adopted by several scientific communities such as
neuroscience, bioinformatics, or computer vision. The goal of this monograph is
to offer a self-contained view of sparse modeling for visual recognition and
image processing. More specifically, we focus on applications where the
dictionary is learned and adapted to data, yielding a compact representation
that has been successful in various contexts.
Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics
and Vision
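To make "linear combinations of a few dictionary elements" concrete, here is a minimal sketch of greedy sparse coding by matching pursuit. It assumes a small fixed dictionary of unit-norm atoms (`DICTIONARY` is a made-up example); the monograph's focus is on dictionaries learned from data, which this sketch does not attempt.

```python
import math

def matching_pursuit(signal, dictionary, n_nonzero):
    """Greedily approximate `signal` using at most `n_nonzero` selection steps.

    `dictionary` is a list of unit-norm atoms (lists of floats).
    Returns the coefficient vector and the final residual.
    """
    residual = list(signal)
    coeffs = [0.0] * len(dictionary)
    for _ in range(n_nonzero):
        # Correlation of every atom with the current residual.
        corrs = [sum(a * r for a, r in zip(atom, residual)) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(corrs[k]))
        if abs(corrs[best]) < 1e-12:
            break  # nothing left to explain
        coeffs[best] += corrs[best]
        # Remove the selected atom's contribution from the residual.
        residual = [r - corrs[best] * a for r, a in zip(residual, dictionary[best])]
    return coeffs, residual

# Hypothetical overcomplete dictionary in R^3: three axis atoms plus a diagonal one.
S = 1.0 / math.sqrt(2.0)
DICTIONARY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [S, S, 0.0]]

if __name__ == "__main__":
    coeffs, residual = matching_pursuit([2.0, 0.0, -3.0], DICTIONARY, n_nonzero=2)
    print(coeffs, residual)
```

Matching pursuit is only one of several ways to obtain a sparse code; penalized formulations such as l1-regularized least squares are a common alternative.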
Principled Design and Implementation of Steerable Detectors
We provide a complete pipeline for the detection of patterns of interest in
an image. In our approach, the patterns are assumed to be adequately modeled by
a known template, and are located at unknown position and orientation. We
propose a continuous-domain additive image model, where the analyzed image is
the sum of the template and an isotropic background signal with self-similar
isotropic power-spectrum. The method is able to learn an optimal steerable
filter fulfilling the SNR criterion based on one single template and background
pair; the filter therefore responds strongly to the template while optimally
decoupling from the background model. The proposed filter then allows for a
fast detection process, with the unknown orientation estimated through
steerability properties. In practice, the implementation requires
discretizing the continuous-domain formulation on polar grids, which is performed
using radial B-splines. We demonstrate the practical usefulness of our method
on a variety of template approximation and pattern detection experiments.
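The steerability property used for orientation estimation can be illustrated with the classical first-order Gaussian-derivative pair. This is a textbook sketch, not the optimal filter design of the abstract: the function names are hypothetical, the Gaussian is left unnormalized, and no polar-grid or B-spline discretization is involved.

```python
import math

def gaussian_dx(x, y, sigma=1.0):
    """x-derivative of an isotropic 2-D Gaussian (unnormalized)."""
    return -x / sigma**2 * math.exp(-(x * x + y * y) / (2.0 * sigma**2))

def gaussian_dy(x, y, sigma=1.0):
    """y-derivative of an isotropic 2-D Gaussian (unnormalized)."""
    return -y / sigma**2 * math.exp(-(x * x + y * y) / (2.0 * sigma**2))

def steered_kernel(x, y, theta, sigma=1.0):
    """Kernel oriented at angle theta, synthesized from the two basis kernels."""
    return (math.cos(theta) * gaussian_dx(x, y, sigma)
            + math.sin(theta) * gaussian_dy(x, y, sigma))

def rotated_kernel(x, y, theta, sigma=1.0):
    """The same kernel obtained by explicitly rotating the coordinate frame."""
    u = x * math.cos(theta) + y * math.sin(theta)
    return -u / sigma**2 * math.exp(-(x * x + y * y) / (2.0 * sigma**2))

def best_orientation(resp_x, resp_y):
    """Angle maximizing resp_x*cos(t) + resp_y*sin(t), and that maximal response.

    resp_x and resp_y are the image responses to the two basis filters at one pixel.
    """
    return math.atan2(resp_y, resp_x), math.hypot(resp_x, resp_y)
```

Because the response at any angle is a fixed linear combination of two basis responses, the image only needs to be filtered twice, and the best orientation at each pixel follows in closed form instead of by exhaustive rotation.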
Time-causal and time-recursive spatio-temporal receptive fields
We present an improved model and theory for time-causal and time-recursive
spatio-temporal receptive fields, based on a combination of Gaussian receptive
fields over the spatial domain and first-order integrators or equivalently
truncated exponential filters coupled in cascade over the temporal domain.
Compared to previous spatio-temporal scale-space formulations in terms of
non-enhancement of local extrema or scale invariance, these receptive fields
are based on different scale-space axiomatics over time by ensuring
non-creation of new local extrema or zero-crossings with increasing temporal
scale. Specifically, extensions are presented for (i) parameterizing the
intermediate temporal scale levels, (ii) analysing the resulting temporal
dynamics, (iii) transferring the theory to a discrete implementation, (iv)
computing scale-normalized spatio-temporal derivative expressions for
spatio-temporal feature detection and (v) computational modelling of receptive
fields in the lateral geniculate nucleus (LGN) and the primary visual cortex
(V1) in biological vision.
We show that by distributing the intermediate temporal scale levels according
to a logarithmic distribution, we obtain much faster temporal response
properties (shorter temporal delays) compared to a uniform distribution.
Specifically, these kernels converge very rapidly to a limit kernel possessing
true self-similar scale-invariant properties over temporal scales, thereby
allowing for true scale invariance over variations in the temporal scale,
although the underlying temporal scale-space representation is based on a
discretized temporal scale parameter.
We show how scale-normalized temporal derivatives can be defined for these
time-causal scale-space kernels and how the composed theory can be used for
computing basic types of scale-normalized spatio-temporal derivative
expressions in a computationally efficient manner.
Comment: 39 pages, 12 figures, 5 tables in Journal of Mathematical Imaging and
Vision, published online Dec 201
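The claim about logarithmic versus uniform temporal scale levels can be checked numerically with a small sketch. Everything below is an assumption for illustration and is not taken from the abstract: the discrete update rule, the relation between a variance increment and the time constant `mu` (a first-order recursive filter with time constant `mu` has mean delay `mu` and variance `mu**2 + mu`), the ratio `c = 2`, and all function names.

```python
import math

def log_levels(tau_max, n_levels, c):
    """Temporal scale levels (variances) spaced logarithmically up to tau_max."""
    return [tau_max * c ** (2 * (k - n_levels)) for k in range(1, n_levels + 1)]

def uniform_levels(tau_max, n_levels):
    """Temporal scale levels (variances) spaced uniformly up to tau_max."""
    return [tau_max * k / n_levels for k in range(1, n_levels + 1)]

def time_constants(tau_levels):
    """Time constant of each cascade stage from successive variance increments."""
    mus, prev = [], 0.0
    for tau in tau_levels:
        # Solve mu**2 + mu = tau - prev for the non-negative root.
        mus.append((math.sqrt(1.0 + 4.0 * (tau - prev)) - 1.0) / 2.0)
        prev = tau
    return mus

def recursive_cascade(signal, mus):
    """Apply first-order recursive filters in cascade (causal, one state per stage)."""
    out = list(signal)
    for mu in mus:
        state, smoothed = 0.0, []
        for x in out:
            state += (x - state) / (1.0 + mu)  # uses only past and present samples
            smoothed.append(state)
        out = smoothed
    return out

if __name__ == "__main__":
    mus_log = time_constants(log_levels(16.0, 4, 2.0))
    mus_uni = time_constants(uniform_levels(16.0, 4))
    # Both cascades reach the same total temporal variance; compare mean delays.
    print(sum(mus_log), sum(mus_uni))
```

Under these assumptions both cascades end at the same temporal variance, while the sum of the time constants (the mean delay of the composed kernel) is smaller for the logarithmic spacing.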
Left-invariant evolutions of wavelet transforms on the Similitude Group
Enhancement of multiple-scale elongated structures in noisy image data is
relevant for many biomedical applications but commonly used PDE-based
enhancement techniques often fail at crossings in an image. To get an overview
of how an image is composed of local multiple-scale elongated structures we
construct a multiple scale orientation score, which is a continuous wavelet
transform on the similitude group, SIM(2). Our unitary transform maps the space
of images onto a reproducing kernel space defined on SIM(2), allowing us to
robustly relate Euclidean (and scaling) invariant operators on images to
left-invariant operators on the corresponding continuous wavelet transform.
Rather than the often-used wavelet (soft-)thresholding techniques, we employ the
group structure in the wavelet domain to arrive at left-invariant evolutions
and flows (diffusion), for contextual crossing-preserving enhancement of
multiple-scale elongated structures in noisy images. We present experiments
that display benefits of our work compared to recent PDE techniques acting
directly on the images and to our previous work on left-invariant diffusions on
orientation scores defined on the Euclidean motion group.
Comment: 40 pages
Sparse Coding on Symmetric Positive Definite Manifolds using Bregman Divergences
This paper introduces sparse coding and dictionary learning for Symmetric
Positive Definite (SPD) matrices, which are often used in machine learning,
computer vision and related areas. Unlike traditional sparse coding schemes
that work in vector spaces, in this paper we discuss how SPD matrices can be
described by sparse combinations of dictionary atoms, where the atoms are also
SPD matrices. We propose to seek sparse coding by embedding the space of SPD
matrices into Hilbert spaces through two types of Bregman matrix divergences.
This not only leads to an efficient way of performing sparse coding, but also
an online and iterative scheme for dictionary learning. We apply the proposed
methods to several computer vision tasks where images are represented by region
covariance matrices. Our proposed algorithms outperform state-of-the-art
methods on a wide range of classification tasks, including face recognition,
action recognition, material classification and texture categorization.
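As a rough illustration of comparing SPD matrices with a divergence instead of a vector-space distance, the sketch below computes the Stein (Jensen-Bregman LogDet) divergence for 2x2 matrices. Whether this is one of the two divergences meant in the abstract is an assumption, as are the function names, the restriction to 2x2 inputs, and the `beta` parameter of the exponential kernel.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def stein_divergence(x, y):
    """S(X, Y) = log det((X + Y) / 2) - 0.5 * log det(X Y) for 2x2 SPD matrices."""
    mid = [[(x[i][j] + y[i][j]) / 2.0 for j in range(2)] for i in range(2)]
    return math.log(det2(mid)) - 0.5 * (math.log(det2(x)) + math.log(det2(y)))

def stein_kernel(x, y, beta=1.0):
    """Similarity exp(-beta * S(X, Y)); validity as a kernel depends on beta."""
    return math.exp(-beta * stein_divergence(x, y))

if __name__ == "__main__":
    a = [[2.0, 0.0], [0.0, 1.0]]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(stein_divergence(a, identity), stein_kernel(a, identity))
```

The divergence is symmetric and vanishes only when the two matrices coincide, which is what makes an exponential of it usable as a similarity between region covariance descriptors.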