Time-causal and time-recursive spatio-temporal receptive fields
We present an improved model and theory for time-causal and time-recursive
spatio-temporal receptive fields, based on a combination of Gaussian receptive
fields over the spatial domain and first-order integrators or equivalently
truncated exponential filters coupled in cascade over the temporal domain.
Compared to previous spatio-temporal scale-space formulations in terms of
non-enhancement of local extrema or scale invariance, these receptive fields
are based on different scale-space axiomatics over time by ensuring
non-creation of new local extrema or zero-crossings with increasing temporal
scale. Specifically, extensions are presented regarding (i) parameterizing the
intermediate temporal scale levels, (ii) analysing the resulting temporal
dynamics, (iii) transferring the theory to a discrete implementation, (iv)
computing scale-normalized spatio-temporal derivative expressions for
spatio-temporal feature detection and (v) computational modelling of receptive
fields in the lateral geniculate nucleus (LGN) and the primary visual cortex
(V1) in biological vision.
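As a rough illustration of the time-recursive temporal cascade described above, the following sketch couples discrete first-order integrators in series. The update rule, the function names and the time constants are assumptions chosen for illustration and are not taken from the paper; the rule used here is a common first-order recursive filter with unit DC gain whose impulse response has mean `mu` samples.

```python
def first_order_integrator(signal, mu):
    """Causal first-order recursive smoothing with time constant mu (in samples).

    Assumed update rule: out[t] = out[t-1] + (in[t] - out[t-1]) / (1 + mu).
    Only the previous output is stored, so the filter is time-recursive.
    """
    out = []
    prev = 0.0
    for x in signal:
        prev = prev + (x - prev) / (1.0 + mu)
        out.append(prev)
    return out


def temporal_cascade(signal, time_constants):
    """Apply one first-order integrator per time constant, coupled in cascade."""
    for mu in time_constants:
        signal = first_order_integrator(signal, mu)
    return signal


# Impulse response of a three-stage cascade (hypothetical time constants).
impulse = [1.0] + [0.0] * 399
response = temporal_cascade(impulse, [1.0, 2.0, 4.0])

total_mass = sum(response)
# Means add under cascade coupling, so this is approximately 1 + 2 + 4 = 7.
mean_delay = sum(t * r for t, r in enumerate(response)) / total_mass
```

Because each stage depends only on past and present input, the cascade never accesses future samples, which is the time-causal property the abstract refers to.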
We show that by distributing the intermediate temporal scale levels according
to a logarithmic distribution, we obtain much faster temporal response
properties (shorter temporal delays) compared to a uniform distribution.
Specifically, these kernels converge very rapidly to a limit kernel possessing
true self-similar scale-invariant properties over temporal scales, thereby
allowing for true scale invariance over variations in the temporal scale,
although the underlying temporal scale-space representation is based on a
discretized temporal scale parameter.
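The shorter delay of a logarithmic distribution can be checked with a small calculation. The sketch below assumes continuous truncated exponential kernels, for which a stage with time constant `mu` has mean `mu` and variance `mu**2`, so that variances (temporal scale) and means (delay) both add under cascade coupling. The geometric spacing `tau_k = c**(2*(k - K)) * tau_max` and the values of `c`, `K` and `tau_max` are illustrative assumptions, not parameters quoted from the paper.

```python
import math


def time_constants(scale_levels):
    """Time constants whose squared values add up to the given variance levels."""
    mus, prev = [], 0.0
    for tau in scale_levels:
        mus.append(math.sqrt(tau - prev))
        prev = tau
    return mus


def uniform_levels(tau_max, K):
    """K temporal scale levels (variances) spaced uniformly up to tau_max."""
    return [tau_max * k / K for k in range(1, K + 1)]


def logarithmic_levels(tau_max, K, c):
    """K temporal scale levels spaced geometrically with ratio c**2 (assumed form)."""
    return [tau_max * c ** (2 * (k - K)) for k in range(1, K + 1)]


tau_max, K, c = 1.0, 4, 2.0  # hypothetical values

mus_uniform = time_constants(uniform_levels(tau_max, K))
mus_log = time_constants(logarithmic_levels(tau_max, K, c))

# The temporal delay of the composed kernel is the sum of the stage means.
delay_uniform = sum(mus_uniform)
delay_log = sum(mus_log)
```

Both cascades reach the same final temporal scale `tau_max`, but under these assumptions the logarithmic spacing concentrates most of the smoothing in the last stages and yields a smaller total delay.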
We show how scale-normalized temporal derivatives can be defined for these
time-causal scale-space kernels and how the composed theory can be used for
computing basic types of scale-normalized spatio-temporal derivative
expressions in a computationally efficient manner.
Comment: 39 pages, 12 figures, 5 tables in Journal of Mathematical Imaging and
Vision, published online Dec 201
Interpretable Transformations with Encoder-Decoder Networks
Deep feature spaces have the capacity to encode complex transformations of
their input data. However, understanding the relative feature-space
relationship between two transformed encoded images is difficult. For instance,
what is the relative feature space relationship between two rotated images?
What is decoded when we interpolate in feature space? Ideally, we want to
disentangle confounding factors, such as pose, appearance, and illumination,
from object identity. Disentangling these is difficult because they interact in
very nonlinear ways. We propose a simple method to construct a deep feature
space, with explicitly disentangled representations of several known
transformations. A person or algorithm can then manipulate the disentangled
representation, for example, to re-render an image with explicit control over
parameterized degrees of freedom. The feature space is constructed using a
transforming encoder-decoder network with a custom feature transform layer,
acting on the hidden representations. We demonstrate the advantages of explicit
disentangling on a variety of datasets and transformations, and as an aid for
traditional tasks, such as classification.
Comment: Accepted at ICCV 201
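One way to picture a feature transform layer acting on a hidden representation is sketched below. The abstract does not specify the form of the layer, so this example assumes that the feature vector is split into consecutive 2D pairs and that each pair is rotated by the known transformation angle; the function name and the example vector are hypothetical.

```python
import math


def feature_transform(features, theta):
    """Rotate consecutive feature pairs by angle theta (radians).

    Assumption for illustration: the encoded image transformation is a
    rotation, applied to the hidden vector as a block-diagonal 2x2 rotation.
    """
    if len(features) % 2 != 0:
        raise ValueError("feature vector length must be even")
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = []
    for i in range(0, len(features), 2):
        a, b = features[i], features[i + 1]
        out.extend([cos_t * a - sin_t * b, sin_t * a + cos_t * b])
    return out


z = [1.0, 0.0, 0.5, -2.0]                       # hypothetical encoder output
z_rot = feature_transform(z, math.pi / 2)       # "rotate" in feature space
z_back = feature_transform(z_rot, -math.pi / 2)  # undo the transformation
```

With a layer of this kind, applying two transformations in sequence is equivalent to applying their sum, which makes the relationship between the encodings of two rotated inputs explicit.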
Idealized computational models for auditory receptive fields
This paper presents a theory by which idealized models of auditory receptive
fields can be derived in a principled axiomatic manner, from a set of
structural properties to enable invariance of receptive field responses under
natural sound transformations and ensure internal consistency between
spectro-temporal receptive fields at different temporal and spectral scales.
For defining a time-frequency transformation of a purely temporal sound
signal, it is shown that the framework allows for a new way of deriving the
Gabor and Gammatone filters as well as a novel family of generalized Gammatone
filters, with additional degrees of freedom to obtain different trade-offs
between the spectral selectivity and the temporal delay of time-causal temporal
window functions.
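For orientation, the two classical filters named above can be written down in their standard textbook forms. The sketch below is not the paper's derivation or its generalized Gammatone family; the parameter names (`freq`, `bandwidth`, `order`, `sigma`) are illustrative and the kernels are left unnormalized.

```python
import math


def gammatone(t, freq, bandwidth, order=4, phase=0.0):
    """Textbook Gammatone impulse response at time t (seconds).

    Time-causal: the kernel is zero for t < 0.
    """
    if t < 0.0:
        return 0.0
    envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * bandwidth * t)
    return envelope * math.cos(2.0 * math.pi * freq * t + phase)


def gabor(t, freq, sigma):
    """Textbook Gabor kernel at time t, returned as (real, imaginary) parts.

    Non-causal: the Gaussian window extends symmetrically over past and future.
    """
    envelope = math.exp(-t * t / (2.0 * sigma * sigma))
    angle = 2.0 * math.pi * freq * t
    return (envelope * math.cos(angle), envelope * math.sin(angle))


# Sample both kernels on a short grid (hypothetical parameter values).
times = [i / 16000.0 for i in range(-80, 81)]
gt = [gammatone(t, freq=1000.0, bandwidth=125.0) for t in times]
gb = [gabor(t, freq=1000.0, sigma=0.001) for t in times]
```

The contrast between the two kernels mirrors the trade-off mentioned in the abstract: the Gabor window is symmetric in time and needs future samples, while the Gammatone window is one-sided and can run in real time at the cost of a temporal delay.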
When applied to the definition of a second layer of receptive fields from a
spectrogram, it is shown that the framework leads to two canonical families of
spectro-temporal receptive fields, in terms of spectro-temporal derivatives of
either spectro-temporal Gaussian kernels for non-causal time or the combination
of a time-causal generalized Gammatone filter over the temporal domain and a
Gaussian filter over the logspectral domain. For each filter family, the
spectro-temporal receptive fields can be either separable over the
time-frequency domain or be adapted to local glissando transformations that
represent variations in logarithmic frequencies over time. Within each domain
of either non-causal or time-causal time, these receptive field families are
derived by uniqueness from the assumptions.
It is demonstrated how the presented framework allows for computation of
basic auditory features for audio processing and that it leads to predictions
about auditory receptive fields with good qualitative similarity to biological
receptive fields measured in the inferior colliculus (ICC) and primary auditory
cortex (A1) of mammals.
Comment: 55 pages, 22 figures, 3 table