3,262 research outputs found

    Event-based Face Detection and Tracking in the Blink of an Eye

    Full text link
    We present the first purely event-based method for face detection using the high temporal resolution of an event-based camera. We will rely on a new feature that has never been used for such a task that relies on detecting eye blinks. Eye blinks are a unique natural dynamic signature of human faces that is captured well by event-based sensors that rely on relative changes of luminance. Although an eye blink can be captured with conventional cameras, we will show that the dynamics of eye blinks combined with the fact that two eyes act simultaneously allows to derive a robust methodology for face detection at a low computational cost and high temporal resolution. We show that eye blinks have a unique temporal signature over time that can be easily detected by correlating the acquired local activity with a generic temporal model of eye blinks that has been generated from a wide population of users. We furthermore show that once the face is reliably detected it is possible to apply a probabilistic framework to track the spatial position of a face for each incoming event while updating the position of trackers. Results are shown for several indoor and outdoor experiments. We will also release an annotated data set that can be used for future work on the topic

    A Taxonomy of Deep Convolutional Neural Nets for Computer Vision

    Get PDF
    Traditional architectures for solving computer vision problems and the degree of success they enjoyed have been heavily reliant on hand-crafted features. However, of late, deep learning techniques have offered a compelling alternative -- that of automatically learning problem-specific features. With this new paradigm, every problem in computer vision is now being re-examined from a deep learning perspective. Therefore, it has become important to understand what kind of deep networks are suitable for a given problem. Although general surveys of this fast-moving paradigm (i.e. deep-networks) exist, a survey specific to computer vision is missing. We specifically consider one form of deep networks widely used in computer vision - convolutional neural networks (CNNs). We start with "AlexNet" as our base CNN and then examine the broad variations proposed over time to suit different applications. We hope that our recipe-style survey will serve as a guide, particularly for novice practitioners intending to use deep-learning techniques for computer vision.Comment: Published in Frontiers in Robotics and AI (http://goo.gl/6691Bm

    Dynamical system analysis and forecasting of deformation produced by an earthquake fault

    Full text link
    We present a method of constructing low-dimensional nonlinear models describing the main dynamical features of a discrete 2D cellular fault zone, with many degrees of freedom, embedded in a 3D elastic solid. A given fault system is characterized by a set of parameters that describe the dynamics, rheology, property disorder, and fault geometry. Depending on the location in the system parameter space we show that the coarse dynamics of the fault can be confined to an attractor whose dimension is significantly smaller than the space in which the dynamics takes place. Our strategy of system reduction is to search for a few coherent structures that dominate the dynamics and to capture the interaction between these coherent structures. The identification of the basic interacting structures is obtained by applying the Proper Orthogonal Decomposition (POD) to the surface deformations fields that accompany strike-slip faulting accumulated over equal time intervals. We use a feed-forward artificial neural network (ANN) architecture for the identification of the system dynamics projected onto the subspace (model space) spanned by the most energetic coherent structures. The ANN is trained using a standard back-propagation algorithm to predict (map) the values of the observed model state at a future time given the observed model state at the present time. This ANN provides an approximate, large scale, dynamical model for the fault.Comment: 30 pages, 12 figure

    Data-driven modeling of the olfactory neural codes and their dynamics in the insect antennal lobe

    Get PDF
    Recordings from neurons in the insects' olfactory primary processing center, the antennal lobe (AL), reveal that the AL is able to process the input from chemical receptors into distinct neural activity patterns, called olfactory neural codes. These exciting results show the importance of neural codes and their relation to perception. The next challenge is to \emph{model the dynamics} of neural codes. In our study, we perform multichannel recordings from the projection neurons in the AL driven by different odorants. We then derive a neural network from the electrophysiological data. The network consists of lateral-inhibitory neurons and excitatory neurons, and is capable of producing unique olfactory neural codes for the tested odorants. Specifically, we (i) design a projection, an odor space, for the neural recording from the AL, which discriminates between distinct odorants trajectories (ii) characterize scent recognition, i.e., decision-making based on olfactory signals and (iii) infer the wiring of the neural circuit, the connectome of the AL. We show that the constructed model is consistent with biological observations, such as contrast enhancement and robustness to noise. The study answers a key biological question in identifying how lateral inhibitory neurons can be wired to excitatory neurons to permit robust activity patterns

    Vehicle-Rear: A New Dataset to Explore Feature Fusion for Vehicle Identification Using Convolutional Neural Networks

    Full text link
    This work addresses the problem of vehicle identification through non-overlapping cameras. As our main contribution, we introduce a novel dataset for vehicle identification, called Vehicle-Rear, that contains more than three hours of high-resolution videos, with accurate information about the make, model, color and year of nearly 3,000 vehicles, in addition to the position and identification of their license plates. To explore our dataset we design a two-stream CNN that simultaneously uses two of the most distinctive and persistent features available: the vehicle's appearance and its license plate. This is an attempt to tackle a major problem: false alarms caused by vehicles with similar designs or by very close license plate identifiers. In the first network stream, shape similarities are identified by a Siamese CNN that uses a pair of low-resolution vehicle patches recorded by two different cameras. In the second stream, we use a CNN for OCR to extract textual information, confidence scores, and string similarities from a pair of high-resolution license plate patches. Then, features from both streams are merged by a sequence of fully connected layers for decision. In our experiments, we compared the two-stream network against several well-known CNN architectures using single or multiple vehicle features. The architectures, trained models, and dataset are publicly available at https://github.com/icarofua/vehicle-rear

    SVS-JOIN : efficient spatial visual similarity join for geo-multimedia

    Get PDF
    In the big data era, massive amount of multimedia data with geo-tags has been generated and collected by smart devices equipped with mobile communications module and position sensor module. This trend has put forward higher request on large-scale geo-multimedia retrieval. Spatial similarity join is one of the significant problems in the area of spatial database. Previous works focused on spatial textual document search problem, rather than geo-multimedia retrieval. In this paper, we investigate a novel geo-multimedia retrieval paradigm named spatial visual similarity join (SVS-JOIN for short), which aims to search similar geo-image pairs in both aspects of geo-location and visual content. Firstly, the definition of SVS-JOIN is proposed and then we present the geographical similarity and visual similarity measurement. Inspired by the approach for textual similarity join, we develop an algorithm named SVS-JOIN B by combining the PPJOIN algorithm and visual similarity. Besides, an extension of it named SVS-JOIN G is developed, which utilizes spatial grid strategy to improve the search efficiency. To further speed up the search, a novel approach called SVS-JOIN Q is carefully designed, in which a quadtree and a global inverted index are employed. Comprehensive experiments are conducted on two geo-image datasets and the results demonstrate that our solution can address the SVS-JOIN problem effectively and efficiently
    • …
    corecore