261 research outputs found
Time-causal and time-recursive spatio-temporal receptive fields
We present an improved model and theory for time-causal and time-recursive
spatio-temporal receptive fields, based on a combination of Gaussian receptive
fields over the spatial domain and first-order integrators or equivalently
truncated exponential filters coupled in cascade over the temporal domain.
Compared to previous spatio-temporal scale-space formulations in terms of
non-enhancement of local extrema or scale invariance, these receptive fields
are based on different scale-space axiomatics over time by ensuring
non-creation of new local extrema or zero-crossings with increasing temporal
scale. Specifically, extensions are presented regarding (i) parameterizing the
intermediate temporal scale levels, (ii) analysing the resulting temporal
dynamics, (iii) transferring the theory to a discrete implementation, (iv)
computing scale-normalized spatio-temporal derivative expressions for
spatio-temporal feature detection and (v) computational modelling of receptive
fields in the lateral geniculate nucleus (LGN) and the primary visual cortex
(V1) in biological vision.
We show that by distributing the intermediate temporal scale levels according
to a logarithmic distribution, we obtain much faster temporal response
properties (shorter temporal delays) compared to a uniform distribution.
Specifically, these kernels converge very rapidly to a limit kernel possessing
true self-similar scale-invariant properties over temporal scales, thereby
allowing for true scale invariance over variations in the temporal scale,
although the underlying temporal scale-space representation is based on a
discretized temporal scale parameter.
We show how scale-normalized temporal derivatives can be defined for these
time-causal scale-space kernels and how the composed theory can be used for
computing basic types of scale-normalized spatio-temporal derivative
expressions in a computationally efficient manner.
Comment: 39 pages, 12 figures, 5 tables in Journal of Mathematical Imaging and
Vision, published online Dec 201
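The claim about shorter temporal delays can be illustrated with a small sketch. It relies on the fact that a first-order integrator with time constant mu has mean (delay) mu and variance mu^2, and that both add under cascade coupling. The specific parameterization tau_k = c^(2(k-K)) * tau_max, the values c = 2, K = 4 and tau_max = 16, and all function names are assumptions chosen for illustration; they are not stated in the abstract.

```python
import math

def time_constants(tau_levels):
    """Time constants mu_k such that the variance accumulated after
    stage k of the cascade equals tau_levels[k]."""
    mus = []
    prev = 0.0
    for tau in tau_levels:
        mus.append(math.sqrt(tau - prev))
        prev = tau
    return mus

def logarithmic_levels(tau_max, K, c):
    # Assumed parameterization: tau_k = c^(2(k-K)) * tau_max, k = 1..K
    return [c ** (2 * (k - K)) * tau_max for k in range(1, K + 1)]

def uniform_levels(tau_max, K):
    # Equally spaced variance levels up to tau_max
    return [k * tau_max / K for k in range(1, K + 1)]

tau_max, K, c = 16.0, 4, 2.0  # illustrative values only
mu_log = time_constants(logarithmic_levels(tau_max, K, c))
mu_uni = time_constants(uniform_levels(tau_max, K))

# The mean temporal delay of the cascade is the sum of its time constants.
delay_log = sum(mu_log)
delay_uni = sum(mu_uni)
print(f"logarithmic: delay = {delay_log:.3f}")
print(f"uniform:     delay = {delay_uni:.3f}")
```

Both cascades reach the same total temporal variance tau_max, but in this sketch the logarithmic distribution yields the smaller summed delay.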
Invariance of visual operations at the level of receptive fields
Receptive field profiles registered by cell recordings have shown that
mammalian vision has developed receptive fields tuned to different sizes and
orientations in the image domain as well as to different image velocities in
space-time. This article presents a theoretical model by which families of
idealized receptive field profiles can be derived mathematically from a small
set of basic assumptions that correspond to structural properties of the
environment. The article also presents a theory for how basic invariance
properties to variations in scale, viewing direction and relative motion can be
obtained from the output of such receptive fields, using complementary
selection mechanisms that operate over the output of families of receptive
fields tuned to different parameters. Thereby, the theory shows how basic
invariance properties of a visual system can be obtained already at the level
of receptive fields, and we can explain the different shapes of receptive field
profiles found in biological vision from a requirement that the visual system
should be invariant to the natural types of image transformations that occur in
its environment.
Comment: 40 pages, 17 figures
Idealized computational models for auditory receptive fields
This paper presents a theory by which idealized models of auditory receptive
fields can be derived in a principled axiomatic manner, from a set of
structural properties to enable invariance of receptive field responses under
natural sound transformations and ensure internal consistency between
spectro-temporal receptive fields at different temporal and spectral scales.
For defining a time-frequency transformation of a purely temporal sound
signal, it is shown that the framework allows for a new way of deriving the
Gabor and Gammatone filters as well as a novel family of generalized Gammatone
filters, with additional degrees of freedom to obtain different trade-offs
between the spectral selectivity and the temporal delay of time-causal temporal
window functions.
When applied to the definition of a second layer of receptive fields from a
spectrogram, it is shown that the framework leads to two canonical families of
spectro-temporal receptive fields, in terms of spectro-temporal derivatives of
either spectro-temporal Gaussian kernels for non-causal time or the combination
of a time-causal generalized Gammatone filter over the temporal domain and a
Gaussian filter over the logspectral domain. For each filter family, the
spectro-temporal receptive fields can be either separable over the
time-frequency domain or be adapted to local glissando transformations that
represent variations in logarithmic frequencies over time. Within each domain
of either non-causal or time-causal time, these receptive field families are
derived by uniqueness from the assumptions.
It is demonstrated how the presented framework allows for computation of
basic auditory features for audio processing and that it leads to predictions
about auditory receptive fields with good qualitative similarity to biological
receptive fields measured in the inferior colliculus (ICC) and primary auditory
cortex (A1) of mammals.
Comment: 55 pages, 22 figures, 3 tables
Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields
This work presents a first evaluation of using spatio-temporal receptive
fields from a recently proposed time-causal spatio-temporal scale-space
framework as primitives for video analysis. We propose a new family of video
descriptors based on regional statistics of spatio-temporal receptive field
responses and evaluate this approach on the problem of dynamic texture
recognition. Our approach generalises a previously used method, based on joint
histograms of receptive field responses, from the spatial to the
spatio-temporal domain and from object recognition to dynamic texture
recognition. The time-recursive formulation enables computationally efficient
time-causal recognition. The experimental evaluation demonstrates competitive
performance compared to the state of the art. In particular, it is shown that binary
versions of our dynamic texture descriptors achieve improved performance
compared to a large range of similar methods using different primitives either
handcrafted or learned from data. Further, our qualitative and quantitative
investigation into parameter choices and the use of different sets of receptive
fields highlights the robustness and flexibility of our approach. Together,
these results support the descriptive power of this family of time-causal
spatio-temporal receptive fields, validate our approach for dynamic texture
recognition and point towards the possibility of designing a range of video
analysis methods based on these new time-causal spatio-temporal primitives.
Comment: 29 pages, 16 figures
Edge detection and ridge detection with automatic scale selection
When extracting features from image data, the type of information that can be extracted may be strongly dependent on the scales at which the feature detectors are applied. This article presents a systematic methodology for addressing this problem. A mechanism is presented for automatic selection of scale levels when detecting one-dimensional features, such as edges and ridges. A novel concept of a scale-space edge is introduced, defined as a connected set of points in scale-space at which: (i) the gradient magnitude assumes a local maximum in the gradient direction, and (ii) a normalized measure of the strength of the edge response is locally maximal over scales. An important property of this definition is that it allows the scale levels to vary along the edge. Two specific measures of edge strength are analysed in detail. It is shown that by expressing these in terms of γ-normalized derivatives, an immediate consequence of this definition is that fine scales are selected for sharp edges (so as to reduce the shape distortions due to scale-space smoothing), whereas coarse scales are selected for diffuse edges, such that an edge model constitutes a valid abstraction of the intensity profile across the edge. With slight modifications, this idea can be used for formulating a ridge detector with automatic scale selection, having the characteristic property that the selected scales on a scale-space ridge instead reflect the width of the ridge.
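The scale selection behaviour for sharp and diffuse edges can be sketched numerically. The example assumes a one-dimensional, unit-height diffuse step edge (a step blurred by a Gaussian of variance t0), for which the gradient at the edge centre after Gaussian smoothing to scale t (variance) is 1/sqrt(2*pi*(t0 + t)). It also assumes gamma = 1/2 and uses the gamma-normalized gradient magnitude t^(gamma/2) * |L_x| as the strength measure. These choices and the function names are illustrative assumptions, not a reproduction of the article's two measures.

```python
import math

def normalized_edge_strength(t, t0, gamma=0.5):
    """gamma-normalized gradient magnitude at the centre of a unit-height
    diffuse step edge of variance t0, smoothed to scale t (assumed model)."""
    gradient = 1.0 / math.sqrt(2.0 * math.pi * (t0 + t))
    return t ** (gamma / 2.0) * gradient

def select_scale(t0, gamma=0.5, num=2001):
    """Return the scale that maximises the normalized strength over a
    logarithmically sampled range of scales from 2^-10 to 2^10."""
    scales = [2.0 ** (-10 + 20 * i / (num - 1)) for i in range(num)]
    return max(scales, key=lambda t: normalized_edge_strength(t, t0, gamma))

sharp = select_scale(t0=1.0)     # sharp edge: small diffuseness
diffuse = select_scale(t0=16.0)  # diffuse edge: large diffuseness
print(f"selected scale for sharp edge:   {sharp:.3f}")
print(f"selected scale for diffuse edge: {diffuse:.3f}")
```

Under these assumptions the maximum over scales lies at t = gamma * t0 / (1 - gamma), so with gamma = 1/2 the selected scale tracks the diffuseness of the edge.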
Covariance properties under natural image transformations for the generalized Gaussian derivative model for visual receptive fields
This paper presents a theory for how geometric image transformations can be
handled by a first layer of linear receptive fields, in terms of true
covariance properties, which, in turn, enable geometric invariance properties
at higher levels in the visual hierarchy. Specifically, we develop this theory
for a generalized Gaussian derivative model for visual receptive fields, which
is derived in an axiomatic manner from first principles, that reflect symmetry
properties of the environment, complemented by structural assumptions to
guarantee internally consistent treatment of image structures over multiple
spatio-temporal scales.
It is shown how the studied generalized Gaussian derivative model for visual
receptive fields obeys true covariance properties under spatial scaling
transformations, spatial affine transformations, Galilean transformations and
temporal scaling transformations, implying that a vision system, based on image
and video measurements in terms of the receptive fields according to this
model, can to first order of approximation handle the image and video
deformations between multiple views of objects delimited by smooth surfaces, as
well as between multiple views of spatio-temporal events, under varying
relative motions between the objects and events in the world and the observer.
We conclude by describing implications of the presented theory for biological
vision, regarding connections between the variabilities of the shapes of
biological visual receptive fields and the variabilities of spatial and
spatio-temporal image structures under natural image transformations.
Comment: 38 pages, 14 figures
Inability of spatial transformations of CNN feature maps to support invariant recognition
A large number of deep learning architectures use spatial transformations of
CNN feature maps or filters to better deal with variability in object
appearance caused by natural image transformations. In this paper, we prove
that spatial transformations of CNN feature maps cannot align the feature maps
of a transformed image to match those of its original, for general affine
transformations, unless the extracted features are themselves invariant. Our
proof is based on elementary analysis for both the single- and multi-layer
network case. The results imply that methods based on spatial transformations
of CNN feature maps or filters cannot replace image alignment of the input and
cannot enable invariant recognition for general affine transformations,
specifically not for scaling transformations or shear transformations. For
rotations and reflections, spatially transforming feature maps or filters can
enable invariance but only for networks with learnt or hardcoded rotation- or
reflection-invariant features.
Comment: 22 pages, 3 figures
The curriculum and society
Through curricula, society expresses and determines its identity. The curriculum is always a more or less successful picture of what society was in the past, what it is now and what it wants to be in the future. Stakeholders and individuals cannot be indifferent to questions such as why, what and how. The purpose (why), content (what) and methods (how) together form a pedagogical vision of the future of a society and determine what members of that society can be and what human potential can be developed. This largely depends on a successfully prepared curriculum. The question we address in this paper is the culture of excellence in university teaching. Has the quality of university teaching risen to the level of metaphysics and become its own essence? The Bologna process, as a modern European trend, is designed to cover effective teaching models in which the student clearly knows what he is being taught and why he is learning it. Do we use models of motivation for the academic achievement of students and teachers at universities in Macedonia? Is there a culture of quality? The results of a survey that included final-year students, graduates and the business community showed that more than 60% of graduates have deficiencies in key professional working skills. More than 51% of students said they had not gained any practical skills during their studies. In the region, (1) young people do not possess the appropriate skills for employment, mainly the skills listed by employers, and (2) there is no relationship between universities, students and society.
Covariant spatio-temporal receptive fields for neuromorphic computing
Biological nervous systems constitute important sources of inspiration
towards computers that are faster, cheaper, and more energy efficient.
Neuromorphic disciplines view the brain as a coevolved system, simultaneously
optimizing the hardware and the algorithms running on it. There are clear
efficiency gains when bringing the computations into a physical substrate, but
we presently lack theories to guide efficient implementations. Here, we present
a principled computational model for neuromorphic systems in terms of
spatio-temporal receptive fields, based on affine Gaussian kernels over space
and leaky-integrator and leaky integrate-and-fire models over time. Our theory
is provably covariant to spatial affine and temporal scaling transformations,
and with close similarities to the visual processing in mammalian brains. We
use these spatio-temporal receptive fields as a prior in an event-based vision
task, and show that this improves the training of spiking networks, which
otherwise is known as problematic for event-based vision. This work combines
efforts within scale-space theory and computational neuroscience to identify
theoretically well-founded ways to process spatio-temporal signals in
neuromorphic systems. Our contributions are immediately relevant for signal
processing and event-based vision, and can be extended to other processing
tasks over space and time, such as memory and control.
Comment: Code available at https://github.com/jegp/nr
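Temporal scale covariance can be checked numerically for a single leaky integrator, which is only one ingredient of the model described above. The sketch assumes the kernel h(u; mu) = (1/mu) exp(-u/mu) for u >= 0, a test signal that starts at time 0, and a midpoint-rule approximation of the convolution. The signal, the scaling factor S and the function names are hypothetical choices made for illustration.

```python
import math

def leaky_response(signal, t, mu, n=2000):
    """Midpoint-rule approximation of the convolution of `signal` with the
    leaky-integrator kernel (1/mu) * exp(-u/mu), u >= 0, evaluated at time t.
    The signal is assumed to start at time 0."""
    du = t / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        total += signal(t - u) * math.exp(-u / mu) / mu * du
    return total

def f(t):
    # Arbitrary test signal (illustrative)
    return math.sin(t) + 0.5 * math.sin(3.0 * t)

S = 2.5  # temporal scaling factor (illustrative)

def f_scaled(t):
    # The same signal played S times slower: f'(t') = f(t'/S)
    return f(t / S)

t, mu = 4.0, 0.7
original = leaky_response(f, t, mu)
# Covariance: rescale both the time point and the time constant by S.
rescaled = leaky_response(f_scaled, S * t, S * mu)
print(f"original response: {original:.6f}")
print(f"rescaled response: {rescaled:.6f}")
```

If the time constant is scaled together with the signal, the two responses agree up to floating-point rounding, which is the covariance property in its simplest form.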