Representation of Samba dance gestures, using a multi-modal analysis approach
In this paper we propose an approach for the
representation of dance gestures in Samba dance. This
representation is based on a video analysis of body
movements, carried out from the viewpoint of the
musical meter. Our method provides the periods, a
measure of energy and a visual representation of
periodic movement in dance. The method is applied to
a limited universe of Samba dances and music, which
is used to illustrate the usefulness of the approach.
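As a rough illustration of what extracting "the periods" of a periodic movement can look like, the sketch below estimates the dominant period of a one-dimensional movement signal by autocorrelation. This is a swapped-in, generic technique, not the method of the paper: the abstract does not say how periods are computed, and the function name `dominant_period`, the frame rate, and the synthetic sinusoidal signal are all assumptions made for the example.

```python
import numpy as np

def dominant_period(signal, fps):
    """Estimate the dominant period (in seconds) of a 1-D movement signal.

    Hypothetical helper: uses the first positive local maximum of the
    autocorrelation after lag 0. Not taken from the paper.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove the constant offset before correlating
    # Keep only non-negative lags of the full autocorrelation.
    ac = np.correlate(x, x, mode="full")[x.size - 1:]
    for lag in range(1, ac.size - 1):
        is_peak = ac[lag] > ac[lag - 1] and ac[lag] >= ac[lag + 1]
        if is_peak and ac[lag] > 0:
            return lag / fps
    return None  # no periodicity found

# Synthetic example (assumed data): a 2 Hz oscillation sampled at 30 frames/s.
fps = 30
t = np.arange(300) / fps
movement = np.sin(2 * np.pi * 2.0 * t)
period = dominant_period(movement, fps)
```

In a meter-oriented analysis, an estimated period of this kind could then be compared against the beat or bar duration of the music, which is an interpretation suggested here rather than a step the abstract describes.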
Sea Ice Extraction via Remote Sensed Imagery: Algorithms, Datasets, Applications and Challenges
Deep learning, a dominant technique in artificial
intelligence, has completely changed image understanding over the past
decade. As a consequence, the sea ice extraction (SIE) problem has reached a
new era. We present a comprehensive review of four important aspects of SIE,
including algorithms, datasets, applications, and the future trends. Our review
focuses on research published from 2016 to the present, with a specific focus
on deep learning-based approaches in the last five years. We divided all
related algorithms into three categories: classical image segmentation
approaches, machine learning-based approaches and deep learning-based approaches. We
reviewed the accessible ice datasets, including SAR-based datasets,
optical-based datasets and others. The applications are presented in four aspects
including climate research, navigation, geographic information systems (GIS)
production and others. The review also provides insightful observations and inspiring
future research directions. Comment: 24 pages, 6 figures
Spatiotemporal oriented energies for spacetime stereo
This paper presents a novel approach to recovering temporally coherent estimates of 3D structure of a dynamic scene from a sequence of binocular stereo images. The approach is based on matching spatiotemporal orientation distributions between left and right temporal image streams, which encapsulates both local spatial and temporal structure for disparity estimation. By capturing spatial and temporal structure in this unified fashion, both sources of information combine to yield disparity estimates that are naturally temporally coherent, while helping to resolve matches that might be ambiguous when either source is considered alone. Further, by allowing subsets of the orientation measurements to support different disparity estimates, an approach to recovering multilayer disparity from spacetime stereo is realized. The approach has been implemented with real-time performance on commodity GPUs. Empirical evaluation shows that the approach yields qualitatively and quantitatively superior disparity estimates in comparison to various alternative approaches, including the ability to provide accurate multilayer estimates in the presence of (semi)transparent and specular surfaces.
The Role of Early Recurrence in Improving Visual Representations
This dissertation proposes a computational model of early vision with recurrence, termed early recurrence. The idea is motivated by research on primate vision. Specifically, the proposed model relies on the following four observations. 1) The primate visual system includes two main visual pathways: the dorsal pathway and the ventral pathway; 2) The two pathways respond to different visual features; 3) The neurons of the dorsal pathway conduct visual information faster than the neurons of the ventral pathway; 4) There are lower-level feedback connections from the dorsal pathway to the ventral pathway. As such, the primate visual system may implement a recurrent mechanism to improve visual representations of the ventral pathway.
Our work starts from a comprehensive review of the literature, based on which a conceptualization of early recurrence is proposed. Early recurrence manifests itself as a form of surround suppression. We propose that early recurrence is capable of refining the ventral processing using results of the dorsal processing.
Our work further defines a set of computational components to formalize early recurrence. Although we do not intend to model the true nature of biology, to verify that the proposed computation is biologically consistent, we have applied the model to simulate a neurophysiological experiment of a bar-and-checkerboard and a psychological experiment involving a moving contour illusion. Simulation results indicated that the proposed computation behaviourally reproduces the original observations.
The ultimate goal of this work is to investigate whether the proposal is capable of improving computer vision applications. To do this, we have applied the model to a variety of applications, including visual saliency and contour detection. Based on comparisons against the state-of-the-art, we conclude that the proposed model of early recurrence sheds light on a generally applicable yet lightweight approach to boost real-life application performance.
Neural networks application to divergence-based passive ranging
The purpose of this report is to summarize the state of knowledge and outline the planned work in a divergence-based neural network approach to the problem of passive ranging derived from optical flow. Work in this and closely related areas is reviewed in order to provide the necessary background for further developments. New ideas about devising a monocular passive-ranging system are then introduced. It is shown that image-plane divergence is independent of image-plane location with respect to the focus of expansion and of camera maneuvers because it directly measures the object's expansion which, in turn, is related to the time-to-collision. Thus, a divergence-based method has the potential of providing a reliable range estimate, complementing other monocular passive-ranging methods which encounter difficulties in image areas close to the focus of expansion. Image-plane divergence can be thought of as a spatial/temporal pattern. A neural network realization was chosen for this task because neural networks have generally performed well in various other pattern recognition applications. The main goal of this work is to teach a neural network to derive the divergence from the imagery.
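The link between divergence and time-to-collision can be illustrated with a small numerical sketch. It assumes the simplest textbook case, which is not stated in the abstract: a camera approaching a fronto-parallel surface, so that the optical flow is a pure expansion u = (x - x0)/τ, v = (y - y0)/τ around the focus of expansion (x0, y0), and the divergence equals 2/τ everywhere. The function names and the synthetic flow field are hypothetical, and finite differences stand in for the neural network the report proposes.

```python
import numpy as np

def flow_divergence(u, v, spacing=1.0):
    """Finite-difference divergence du/dx + dv/dy of a dense flow field."""
    du_dx = np.gradient(u, spacing, axis=1)  # x varies along columns
    dv_dy = np.gradient(v, spacing, axis=0)  # y varies along rows
    return du_dx + dv_dy

def time_to_collision(u, v, spacing=1.0):
    """Time-to-collision under the assumed pure-expansion model: tau = 2 / div."""
    return 2.0 / np.mean(flow_divergence(u, v, spacing))

def expansion_flow(tau, foe_x, foe_y):
    """Synthetic expanding flow around a focus of expansion (assumed model)."""
    ys, xs = np.mgrid[-10:11, -10:11].astype(float)
    return (xs - foe_x) / tau, (ys - foe_y) / tau

tau_true = 4.0
# Two different focus-of-expansion locations give the same divergence.
tau_a = time_to_collision(*expansion_flow(tau_true, 0.0, 0.0))
tau_b = time_to_collision(*expansion_flow(tau_true, 3.0, -2.0))
```

Both estimates recover the same τ even though the focus of expansion differs, which is the location independence the abstract refers to. Real flow fields are noisy and surfaces are rarely fronto-parallel, so this is only a sanity check of the relationship, not a ranging method.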
Sensory coding in supragranular cells of the vibrissal cortex in anesthetized and awake mice
Sensory perception entails reliable representation of the
external stimuli as impulse activity of individual neurons (i.e.
spikes) and neuronal populations in the sensory area. An ongoing
challenge in neuroscience is to identify and characterize the
features of the stimuli which are relevant to a specific sensory
modality and neuronal strategies to effectively and efficiently
encode those features. It is widely hypothesized that the
neuronal populations employ “sparse coding” strategies to
optimize the stimulus representations with a low energetic cost
(i.e. low impulse activity). In the past two decades, a wealth of
experimental evidence has supported this hypothesis by showing
spatiotemporally sparse activity in sensory areas. Despite
numerous studies, the extent of sparse coding and its underlying
mechanisms are not fully understood, especially in primary
vibrissal somatosensory cortex (vS1), which is a key model system
in sensory neuroscience. Importantly, it is not clear yet whether
sparse activation of supragranular vS1 is due to insufficient
synaptic input to the majority of the cells or the absence of
effective stimulus features.
In this thesis, first we asked how the choice of stimulus could
affect the degree of sparseness and/or the overall fraction of
the responsive vS1 neurons. We presented whisker deflections
spanning a broad range of intensities, including “standard
stimuli” and a high-velocity, “sharp” stimulus, which
simulated the fast slip events that occur during whisker mediated
object palpation. We used whole-cell and cell-attached recording
and calcium imaging to characterize the neuronal responses to
these stimuli. Consistent with previous literature, whole-cell
recording revealed a sparse response to the standard range of
velocities: although all recorded cells showed tuning to velocity
in their postsynaptic potentials, only a small fraction produced
stimulus-evoked spikes. In contrast, the sharp stimulus evoked
reliable spiking in a large fraction of regular spiking neurons
in the supragranular vS1. Spiking responses to the sharp stimulus
were binary and precisely timed, with minimum trial-to-trial
variability. Interestingly, we also observed that the sharp
stimulus produced a consistent and significant reduction in
action potential threshold.
In the second step we asked whether the stimulus dependent sparse
and dense activations we found in the anesthetized condition would
generalize to the awake condition. We employed cell-attached
recordings in head-fixed awake mice to explore the degree of
sparseness in the awake cortex. Although stimuli delivered by a
piezo-electric actuator evoked a significant response in only a small
fraction of regular spiking supragranular neurons (16%-29%), we
observed that a majority of neurons (84%) were driven by manual
probing of whiskers. Our results demonstrate that despite sparse
activity, the majority of neurons in the superficial layers of
vS1 contribute to coding by representing a specific feature of
the tactile stimulus.
Thesis outline: Chapter 1 provides a review of the current
knowledge on sparse coding and an overview of the whisker-sensory
pathway. Chapter 2 presents our published results regarding
sparse and dense coding in vS1 of anesthetized mice
(Ranjbar-Slamloo and Arabzadeh 2017). Chapter 3 presents our
pending manuscript with results obtained with piezo and manual
stimulation in awake mice. Finally, in Chapter 4 we discuss and
conclude our findings in the context of the literature. The
appendix provides unpublished results related to Chapter 2. This
section is referenced in the final chapter for further
discussion.
Efficient spiking neural network model of pattern motion selectivity in visual cortex
Simulating large-scale models of biological motion perception is challenging, due to the required memory to store the network structure and the computational power needed to quickly solve the neuronal dynamics. A low-cost yet high-performance approach to simulating large-scale neural network models in real-time is to leverage the parallel processing capability of graphics processing units (GPUs). Based on this approach, we present a two-stage model of visual area MT that we believe to be the first large-scale spiking network to demonstrate pattern direction selectivity. In this model, component-direction-selective (CDS) cells in MT linearly combine inputs from V1 cells that have spatiotemporal receptive fields according to the motion energy model of Simoncelli and Heeger. Pattern-direction-selective (PDS) cells in MT are constructed by pooling over MT CDS cells with a wide range of preferred directions. Responses of our model neurons are comparable to electrophysiological results for grating and plaid stimuli as well as speed tuning. The behavioral response of the network in a motion discrimination task is in agreement with psychophysical data. Moreover, our implementation outperforms a previous implementation of the motion energy model by orders of magnitude in terms of computational speed and memory usage. The full network, which comprises 153,216 neurons and approximately 40 million synapses, processes 20 frames per second of a 40 × 40 input video in real-time using a single off-the-shelf GPU. To promote the use of this algorithm among neuroscientists and computer vision researchers, the source code for the simulator, the network, and analysis scripts are publicly available. © 2014 Springer Science+Business Media New York