693 research outputs found
An Event-Driven Multi-Kernel Convolution Processor Module for Event-Driven Vision Sensors
Event-Driven vision sensing is a new way of sensing
visual reality in a frame-free manner. This is, the vision sensor
(camera) is not capturing a sequence of still frames, as in conventional
video and computer vision systems. In Event-Driven sensors
each pixel autonomously and asynchronously decides when to
send its address out. This way, the sensor output is a continuous
stream of address events representing reality dynamically continuously
and without constraining to frames. In this paper we present
an Event-Driven Convolution Module for computing 2D convolutions
on such event streams. The Convolution Module has been
designed to assemble many of them for building modular and hierarchical
Convolutional Neural Networks for robust shape and
pose invariant object recognition. The Convolution Module has
multi-kernel capability. This is, it will select the convolution kernel
depending on the origin of the event. A proof-of-concept test prototype
has been fabricated in a 0.35 m CMOS process and extensive
experimental results are provided. The Convolution Processor has
also been combined with an Event-Driven Dynamic Vision Sensor
(DVS) for high-speed recognition examples. The chip can discriminate
propellers rotating at 2 k revolutions per second, detect symbols
on a 52 card deck when browsing all cards in 410 ms, or detect
and follow the center of a phosphor oscilloscope trace rotating at
5 KHz.Unión Europea 216777 (NABAB)Ministerio de Ciencia e Innovación TEC2009-10639-C04-0
Automatic Neuron Detection in Calcium Imaging Data Using Convolutional Networks
Calcium imaging is an important technique for monitoring the activity of
thousands of neurons simultaneously. As calcium imaging datasets grow in size,
automated detection of individual neurons is becoming important. Here we apply
a supervised learning approach to this problem and show that convolutional
networks can achieve near-human accuracy and superhuman speed. Accuracy is
superior to the popular PCA/ICA method based on precision and recall relative
to ground truth annotation by a human expert. These results suggest that
convolutional networks are an efficient and flexible tool for the analysis of
large-scale calcium imaging data.Comment: 9 pages, 5 figures, 2 ancillary files; minor changes for camera-ready
version. appears in Advances in Neural Information Processing Systems 29
(NIPS 2016
ModDrop: adaptive multi-modal gesture recognition
We present a method for gesture detection and localisation based on
multi-scale and multi-modal deep learning. Each visual modality captures
spatial information at a particular spatial scale (such as motion of the upper
body or a hand), and the whole system operates at three temporal scales. Key to
our technique is a training strategy which exploits: i) careful initialization
of individual modalities; and ii) gradual fusion involving random dropping of
separate channels (dubbed ModDrop) for learning cross-modality correlations
while preserving uniqueness of each modality-specific representation. We
present experiments on the ChaLearn 2014 Looking at People Challenge gesture
recognition track, in which we placed first out of 17 teams. Fusing multiple
modalities at several spatial and temporal scales leads to a significant
increase in recognition rates, allowing the model to compensate for errors of
the individual classifiers as well as noise in the separate channels.
Futhermore, the proposed ModDrop training technique ensures robustness of the
classifier to missing signals in one or several channels to produce meaningful
predictions from any number of available modalities. In addition, we
demonstrate the applicability of the proposed fusion scheme to modalities of
arbitrary nature by experiments on the same dataset augmented with audio.Comment: 14 pages, 7 figure
- …