Event-based Face Detection and Tracking in the Blink of an Eye
We present the first purely event-based method for face detection, exploiting
the high temporal resolution of an event-based camera. Our approach relies on a
feature that has never been used for this task: eye blinks. Eye blinks are a
unique natural dynamic signature of human faces that is captured well by
event-based sensors, which respond to relative changes in luminance. Although
an eye blink can be captured with conventional cameras, we show that the
dynamics of eye blinks, combined with the fact that two eyes blink
simultaneously, allow us to derive a robust methodology for face detection at
low computational cost and high temporal resolution. We show that eye blinks
have a unique temporal signature that can be easily detected by correlating the
acquired local activity with a generic temporal model of eye blinks generated
from a wide population of users. We furthermore show that once a face is
reliably detected, a probabilistic framework can track its spatial position for
each incoming event while updating the positions of the trackers. Results are
shown for several indoor and outdoor experiments. We will also release an
annotated dataset that can be used for future work on the topic.
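The correlation step described above can be sketched in a few lines. The activity trace, the blink template, and the detection threshold below are illustrative assumptions for a toy example, not the paper's actual model or data:

```python
import math

# Hypothetical sketch: detect blinks by normalized cross-correlation of a
# local event-activity trace against a generic temporal blink template.

def ncc(window, template):
    """Normalized cross-correlation between an activity window and the template."""
    mw = sum(window) / len(window)
    mt = sum(template) / len(template)
    num = sum((w - mw) * (t - mt) for w, t in zip(window, template))
    den = math.sqrt(sum((w - mw) ** 2 for w in window)
                    * sum((t - mt) ** 2 for t in template))
    return num / den if den else 0.0

def detect_blinks(activity, template, threshold=0.8):
    """Return the start indices where the local activity matches the template."""
    n = len(template)
    return [i for i in range(len(activity) - n + 1)
            if ncc(activity[i:i + n], template) >= threshold]

# Toy generic template: a burst of events rising and falling as a blink would
# produce, embedded in an activity trace with a constant-rate distractor.
template = [0, 2, 8, 3, 1, 6, 2, 0]
activity = [0] * 5 + template + [0] * 4 + [1] * 8 + [0] * 3
print(detect_blinks(activity, template))  # → [5]
```

The normalization makes the score invariant to the overall event rate, so a steady stream of background events (the run of ones above) does not trigger a detection.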
A Taxonomy of Deep Convolutional Neural Nets for Computer Vision
Traditional architectures for solving computer vision problems and the degree
of success they enjoyed have been heavily reliant on hand-crafted features.
However, of late, deep learning techniques have offered a compelling
alternative -- that of automatically learning problem-specific features. With
this new paradigm, every problem in computer vision is now being re-examined
from a deep learning perspective. Therefore, it has become important to
understand what kind of deep networks are suitable for a given problem.
Although general surveys of this fast-moving paradigm (i.e. deep-networks)
exist, a survey specific to computer vision is missing. We specifically
consider one form of deep networks widely used in computer vision -
convolutional neural networks (CNNs). We start with "AlexNet" as our base CNN
and then examine the broad variations proposed over time to suit different
applications. We hope that our recipe-style survey will serve as a guide,
particularly for novice practitioners intending to use deep-learning techniques
for computer vision.
Comment: Published in Frontiers in Robotics and AI (http://goo.gl/6691Bm)
Dynamical system analysis and forecasting of deformation produced by an earthquake fault
We present a method of constructing low-dimensional nonlinear models
describing the main dynamical features of a discrete 2D cellular fault zone,
with many degrees of freedom, embedded in a 3D elastic solid. A given fault
system is characterized by a set of parameters that describe the dynamics,
rheology, property disorder, and fault geometry. We show that, depending on the
location in the system parameter space, the coarse dynamics of the fault can be
confined to an attractor whose dimension is significantly smaller than the
space in which the dynamics takes place. Our strategy of system reduction is to
search for a few coherent structures that dominate the dynamics and to capture
the interaction between these coherent structures. The identification of the
basic interacting structures is obtained by applying the Proper Orthogonal
Decomposition (POD) to the surface deformation fields that accompany
strike-slip faulting accumulated over equal time intervals. We use a
feed-forward artificial neural network (ANN) architecture for the
identification of the system dynamics projected onto the subspace (model space)
spanned by the most energetic coherent structures. The ANN is trained using a
standard back-propagation algorithm to predict (map) the values of the observed
model state at a future time given the observed model state at the present
time. This ANN provides an approximate, large scale, dynamical model for the
fault.
Comment: 30 pages, 12 figures
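The POD step can be made concrete with a small sketch. The toy deformation fields, the function names, and the retained-energy check below are assumptions for illustration, not the authors' fault-zone data or code:

```python
import numpy as np

# Illustrative POD sketch: each column of the snapshot matrix is a surface
# deformation field accumulated over one time interval, flattened to a vector.

def pod_modes(snapshots, k):
    """Return the k most energetic POD modes, the projected model-space
    trajectory, and the fraction of energy the modes retain.

    snapshots: (n_points, n_snapshots) array of deformation fields.
    """
    mean = snapshots.mean(axis=1, keepdims=True)
    fluctuations = snapshots - mean                # remove the mean field
    # Thin SVD: columns of U are the coherent structures, ordered by the
    # energy (squared singular value) each one captures.
    U, s, Vt = np.linalg.svd(fluctuations, full_matrices=False)
    modes = U[:, :k]                               # dominant coherent structures
    coords = modes.T @ fluctuations                # trajectory in model space
    energy = (s[:k] ** 2).sum() / (s ** 2).sum()   # fraction of energy retained
    return modes, coords, energy

# Toy data: 200-point fields built from two coherent structures plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 200)
t = np.linspace(0, 10, 50)
fields = (np.outer(np.sin(2 * np.pi * x), np.sin(t))
          + 0.3 * np.outer(np.cos(2 * np.pi * x), np.cos(3 * t))
          + 0.01 * rng.standard_normal((200, 50)))
modes, coords, energy = pod_modes(fields, k=2)
print(modes.shape, coords.shape)  # (200, 2) (2, 50)
```

The low-dimensional trajectory `coords` is what an ANN of the kind described above would then be trained to map from one time step to the next.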
Data-driven modeling of the olfactory neural codes and their dynamics in the insect antennal lobe
Recordings from neurons in the insect's olfactory primary processing center,
the antennal lobe (AL), reveal that the AL is able to process the input from
chemical receptors into distinct neural activity patterns, called olfactory
neural codes. These exciting results show the importance of neural codes and
their relation to perception. The next challenge is to \emph{model the
dynamics} of neural codes. In our study, we perform multichannel recordings
from the projection neurons in the AL driven by different odorants. We then
derive a neural network from the electrophysiological data. The network
consists of lateral-inhibitory neurons and excitatory neurons, and is capable
of producing unique olfactory neural codes for the tested odorants.
Specifically, we (i) design a projection, an odor space, for the neural
recordings from the AL that discriminates between distinct odorant
trajectories; (ii) characterize scent recognition, i.e., decision-making based
on olfactory signals; and (iii) infer the wiring of the neural circuit, the
connectome of the AL. We show that the constructed model is consistent with
biological observations, such as contrast enhancement and robustness to noise.
The study answers a key biological question by identifying how lateral
inhibitory neurons can be wired to excitatory neurons to permit robust activity
patterns.
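A minimal rate-model sketch can make the lateral-inhibition mechanism concrete. The uniform inhibition weight, the input pattern, and the simple dynamics below are illustrative assumptions, not the network inferred from the recordings:

```python
# Toy sketch: excitatory units receive receptor input and are suppressed by
# uniform lateral inhibition from the total activity of the other units.

def al_response(receptor_input, inhibition=0.2, steps=20):
    """Iterate the rates to a steady state (rates clipped at zero)."""
    rates = list(receptor_input)
    for _ in range(steps):
        total = sum(rates)
        rates = [max(0.0, inp - inhibition * (total - r))
                 for inp, r in zip(receptor_input, rates)]
    return [round(r, 3) for r in rates]

# Contrast enhancement: the strong channels suppress the weak ones, so the
# steady-state pattern (the "neural code") is sharper than the raw input.
print(al_response([1.0, 0.8, 0.3, 0.1]))  # → [0.875, 0.625, 0.0, 0.0]
```

Weakly driven channels are silenced entirely while the strongest channels remain close to their inputs, which is the contrast-enhancement property the abstract refers to.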
Vehicle-Rear: A New Dataset to Explore Feature Fusion for Vehicle Identification Using Convolutional Neural Networks
This work addresses the problem of vehicle identification through
non-overlapping cameras. As our main contribution, we introduce a novel dataset
for vehicle identification, called Vehicle-Rear, that contains more than three
hours of high-resolution videos, with accurate information about the make,
model, color and year of nearly 3,000 vehicles, in addition to the position and
identification of their license plates. To explore our dataset we design a
two-stream CNN that simultaneously uses two of the most distinctive and
persistent features available: the vehicle's appearance and its license plate.
This is an attempt to tackle a major problem: false alarms caused by vehicles
with similar designs or by very close license plate identifiers. In the first
network stream, shape similarities are identified by a Siamese CNN that uses a
pair of low-resolution vehicle patches recorded by two different cameras. In
the second stream, we use a CNN for OCR to extract textual information,
confidence scores, and string similarities from a pair of high-resolution
license plate patches. Then, features from both streams are merged by a
sequence of fully connected layers for the final decision. In our experiments, we
compared the two-stream network against several well-known CNN architectures
using single or multiple vehicle features. The architectures, trained models,
and dataset are publicly available at https://github.com/icarofua/vehicle-rear
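In the paper, the two streams are fused by learned fully connected layers. As a rough stand-in for that learned fusion, the sketch below combines a hypothetical appearance-stream score with an OCR string similarity using fixed weights; the weights, the threshold, and the plate strings are assumptions for illustration:

```python
from difflib import SequenceMatcher

# Hypothetical two-stream fusion sketch: an appearance similarity (as a Siamese
# CNN would output) plus a license-plate string similarity, combined by a fixed
# weighted sum instead of the paper's learned fully connected layers.

def plate_similarity(plate_a, plate_b):
    """String similarity between two OCR'd license plates, in [0, 1]."""
    return SequenceMatcher(None, plate_a, plate_b).ratio()

def fuse(appearance_score, plate_a, plate_b, w_shape=0.4, w_plate=0.6):
    """Weighted combination of the appearance and OCR streams."""
    return w_shape * appearance_score + w_plate * plate_similarity(plate_a, plate_b)

def same_vehicle(appearance_score, plate_a, plate_b, threshold=0.75):
    return fuse(appearance_score, plate_a, plate_b) >= threshold

# Two vehicles with very similar designs (appearance score 0.9 in both cases):
# the plate stream resolves the ambiguity that causes false alarms.
print(same_vehicle(0.9, "ABC1234", "ABC1234"))  # → True
print(same_vehicle(0.9, "ABC1234", "QWE7890"))  # → False
```

The point of the second stream is visible here: appearance alone scores both pairs identically, and only the plate evidence separates a true match from a look-alike.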
SVS-JOIN: Efficient Spatial Visual Similarity Join for Geo-Multimedia
In the big data era, massive amounts of multimedia data with geo-tags have been generated and collected by smart devices equipped with mobile communication and positioning modules. This trend has placed greater demands on large-scale geo-multimedia retrieval. Spatial similarity join is one of the significant problems in the area of spatial databases. Previous works focused on the spatial textual document search problem rather than geo-multimedia retrieval. In this paper, we investigate a novel geo-multimedia retrieval paradigm named spatial visual similarity join (SVS-JOIN for short), which aims to find similar geo-image pairs with respect to both geo-location and visual content. We first propose the definition of SVS-JOIN and then present the geographical and visual similarity measurements. Inspired by approaches for textual similarity join, we develop an algorithm named SVS-JOIN B by combining the PPJOIN algorithm with visual similarity. We also develop an extension named SVS-JOIN G, which utilizes a spatial grid strategy to improve search efficiency. To further speed up the search, a novel approach called SVS-JOIN Q is carefully designed, in which a quadtree and a global inverted index are employed. Comprehensive experiments conducted on two geo-image datasets demonstrate that our solution addresses the SVS-JOIN problem effectively and efficiently.
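The grid-pruning idea behind SVS-JOIN G can be sketched as follows. The combined similarity (a weighted sum of a spatial score and cosine visual similarity), the parameter names, and the toy data are assumptions for illustration; the paper's actual measures, index structures, and pruning rules are more elaborate:

```python
import math
from collections import defaultdict

# Illustrative grid-based spatial visual join: only images whose grid cells
# lie within the distance bound are compared, pruning far-apart pairs.

def spatial_sim(p, q, max_dist):
    """Spatial score in [0, 1], zero beyond the distance bound max_dist."""
    return max(0.0, 1.0 - math.dist(p[:2], q[:2]) / max_dist)

def visual_sim(p, q):
    """Cosine similarity between the two visual feature vectors."""
    a, b = p[2], q[2]
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def svs_join(images, tau, alpha=0.5, max_dist=1.0, cell=1.0):
    """Return index pairs (i, j) whose combined similarity reaches tau.

    images: list of (x, y, feature_vector) tuples.
    """
    grid = defaultdict(list)
    for idx, (x, y, _) in enumerate(images):
        grid[(int(x // cell), int(y // cell))].append(idx)
    reach = math.ceil(max_dist / cell)  # how many cells the bound can span
    pairs = set()
    for (cx, cy), members in grid.items():
        for dx in range(-reach, reach + 1):
            for dy in range(-reach, reach + 1):
                for i in members:
                    for j in grid.get((cx + dx, cy + dy), ()):
                        if i < j:
                            s = (alpha * spatial_sim(images[i], images[j], max_dist)
                                 + (1 - alpha) * visual_sim(images[i], images[j]))
                            if s >= tau:
                                pairs.add((i, j))
    return sorted(pairs)

# Two nearby, visually similar images join; the distant third is pruned
# without its similarity ever being computed.
images = [(0.1, 0.1, (1, 0)), (0.2, 0.1, (1, 0.1)), (5, 5, (1, 0))]
print(svs_join(images, 0.8))  # → [(0, 1)]
```

Replacing the uniform grid with a quadtree plus an inverted index over visual terms, as in SVS-JOIN Q, refines the same idea: prune by location first, then by shared visual content.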