Event-based Face Detection and Tracking in the Blink of an Eye
We present the first purely event-based method for face detection, exploiting the high temporal resolution of an event-based camera. The method relies on a feature that has not previously been used for this task: eye blinks. Eye blinks are a natural dynamic signature of human faces that is captured well by event-based sensors, which respond to relative changes of luminance. Although an eye blink can also be captured with conventional cameras, we show that the dynamics of blinks, combined with the fact that both eyes act simultaneously, allow us to derive a robust face-detection methodology at low computational cost and high temporal resolution. Eye blinks have a distinctive temporal signature that can be detected by correlating the acquired local event activity with a generic temporal blink model generated from a wide population of users. We furthermore show that once the face is reliably detected, a probabilistic framework can track its spatial position for each incoming event while updating the trackers' positions. Results are shown for several indoor and outdoor experiments. We also release an annotated data set that can be used for future work on the topic.
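The abstract above hinges on correlating local event activity with a generic temporal blink model. The Python sketch below illustrates only that correlation step, under stated assumptions: the function names, the 1 ms binning, the synthetic two-bump template and the detection threshold are all illustrative and are not taken from the paper or its released data set.

```python
# Hypothetical sketch: correlate the event activity of one spatial region
# with a generic temporal blink template. All names and constants are
# illustrative, not the paper's implementation.
import numpy as np

def activity_signal(event_timestamps_us, t_start_us, t_end_us, bin_us=1000):
    """Bin event timestamps (microseconds) of one spatial region into an
    activity signal with 1 ms resolution."""
    bins = np.arange(t_start_us, t_end_us + bin_us, bin_us)
    counts, _ = np.histogram(event_timestamps_us, bins=bins)
    return counts.astype(float)

def detect_blink(activity, blink_template, threshold=0.8):
    """Normalized cross-correlation between the local activity and a generic
    blink template; a peak above `threshold` is taken as a blink candidate."""
    a = (activity - activity.mean()) / (activity.std() + 1e-9)
    t = (blink_template - blink_template.mean()) / (blink_template.std() + 1e-9)
    corr = np.correlate(a, t, mode="valid") / len(t)
    peak = int(np.argmax(corr))
    return corr[peak] > threshold, peak  # (is_blink, offset in bins)

# Toy usage: a template built from two bumps (lid closing, then opening),
# applied to fake, uniformly random event timestamps.
template = np.concatenate([np.hanning(30), np.zeros(20), np.hanning(30)])
timestamps = np.sort(np.random.randint(0, 200_000, size=5_000))
signal = activity_signal(timestamps, 0, 200_000)
print(detect_blink(signal, template))
```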
Neuromorphic Event-Based Generalized Time-Based Stereovision
3D reconstruction from multiple viewpoints is an important problem in machine vision that allows recovering three-dimensional structure from multiple two-dimensional views of a scene. Reconstruction from multiple views is conventionally achieved through pixel luminance-based matching between views. Unlike conventional machine vision methods, which resolve matching ambiguities using only spatial constraints and luminance, this paper introduces a fully time-based solution to stereovision that exploits the high temporal resolution of neuromorphic asynchronous event-based cameras. These cameras output dynamic visual information in the form of “change events” that encode the time, the location and the sign of luminance changes. A more advanced event-based camera, the Asynchronous Time-based Image Sensor (ATIS), additionally encodes absolute luminance as time differences. The stereovision problem can then be formulated solely in the time domain, as a problem of detecting event coincidences. This work improves on existing event-based stereovision techniques by adding luminance information, which increases matching reliability. It also introduces a formulation that does not require building local frames from the luminances (though this remains possible), which can be costly to implement. Finally, it introduces a methodology for time-based stereovision in binocular and trinocular configurations, using a time-based event-matching criterion that combines, for the first time, space, time, luminance and motion.
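The combined matching criterion described above (space, time, luminance, motion) can be illustrated with a minimal Python sketch. It assumes a simple weighted-cost form with a temporal coincidence window, an epipolar distance, a luminance difference and a local-flow difference; the event fields, weights and window size are assumptions, not the paper's actual formulation.

```python
# Minimal sketch of a time-based stereo event-matching criterion.
# Field names ('t', 'xy', 'lum', 'flow') and weights are assumptions.
import numpy as np

def epipolar_distance(x_right, F, x_left):
    """Distance of a right-camera point to the epipolar line F @ x_left."""
    l = F @ np.append(x_left, 1.0)
    p = np.append(x_right, 1.0)
    return abs(p @ l) / np.hypot(l[0], l[1])

def match_score(e_left, e_right, F, w=(1.0, 1.0, 1.0, 1.0)):
    """Lower is better. Combines time, space, luminance and motion terms."""
    dt   = abs(e_left["t"] - e_right["t"])
    dgeo = epipolar_distance(e_right["xy"], F, e_left["xy"])
    dlum = abs(e_left["lum"] - e_right["lum"])
    dmot = np.linalg.norm(np.asarray(e_left["flow"]) - np.asarray(e_right["flow"]))
    return w[0] * dt + w[1] * dgeo + w[2] * dlum + w[3] * dmot

def best_match(e_left, right_events, F, max_dt=1e-3):
    """Keep only temporally coincident candidates, then pick the lowest score."""
    candidates = [e for e in right_events if abs(e["t"] - e_left["t"]) < max_dt]
    if not candidates:
        return None
    return min(candidates, key=lambda e: match_score(e_left, e, F))
```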
Event-Driven Stereo Visual Tracking Algorithm to Solve Object Occlusion
Object tracking is a major problem for many computer vision applications, but it remains computationally expensive. The use of bio-inspired neuromorphic event-driven dynamic vision sensors (DVSs) has heralded new methods for vision processing, exploiting the reduced amount of data and very precise timing resolution. Previous studies have shown these neural spiking sensors to be well suited to implementing single-sensor object tracking systems, although they experience difficulties when solving ambiguities caused by object occlusion. DVSs have also performed well in 3-D reconstruction, in which event matching techniques are applied in stereo setups. In this paper, we propose a new event-driven stereo object tracking algorithm that simultaneously integrates 3-D reconstruction and cluster tracking, introducing feedback information in both tasks to improve their respective performances. This algorithm, inspired by human vision, identifies objects and learns their position and size in order to solve ambiguities. The strategy has been validated in four experiments in which the 3-D positions of two objects were tracked in a stereo setup even when occlusion occurred. The objects studied were: 1) two swinging pens, the distance between which during movement was measured with an error of less than 0.5%; 2) a pen and a box, to confirm the correctness of the results with a more complex object; 3) two straws attached to a fan and rotating at 6 revolutions per second, to demonstrate the high-speed capabilities of the approach; and 4) two people walking in a real-world environment.
Funding: Ministerio de Economía y Competitividad TEC2012-37868-C04-01; Ministerio de Economía y Competitividad TEC2015-63884-C2-1-P; Junta de Andalucía TIC-609
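As an illustration of the cluster-tracking side of such an algorithm, the sketch below updates a cluster's position and size event by event and routes each incoming event to the nearest accepting tracker. It is a simplified single-view sketch: the feedback between cluster tracking and 3-D reconstruction described in the paper is not reproduced, and all class names and constants are assumptions.

```python
# Simplified event-driven cluster tracking in one view (not the paper's code).
import numpy as np

class ClusterTracker:
    def __init__(self, xy, radius=10.0, alpha=0.05):
        self.center = np.asarray(xy, dtype=float)
        self.radius = float(radius)   # learned object size (pixels)
        self.alpha = alpha            # per-event update rate

    def accepts(self, xy):
        """An event belongs to this cluster if it is close to the center."""
        return np.linalg.norm(np.asarray(xy) - self.center) < 2.0 * self.radius

    def update(self, xy):
        """Move the center toward the event and adapt the size estimate."""
        d = np.asarray(xy, dtype=float) - self.center
        self.center += self.alpha * d
        self.radius += self.alpha * (np.linalg.norm(d) - self.radius)

def process_event(trackers, xy):
    """Route one DVS event to the closest accepting tracker, if any."""
    candidates = [t for t in trackers if t.accepts(xy)]
    if candidates:
        closest = min(candidates, key=lambda t: np.linalg.norm(t.center - xy))
        closest.update(xy)
```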
Plenoptic cameras in real-time robotics
Real-time vision-based navigation is a difficult task largely due to the limited optical properties of single cameras tha
Event-driven stereo vision with orientation filters
The recently developed Dynamic Vision Sensors (DVS) sense dynamic visual information asynchronously and code it into trains of events with sub-microsecond temporal resolution. This high temporal precision makes the output of these sensors especially suited for dynamic 3D visual reconstruction, by matching corresponding events generated by two different sensors in a stereo setup. This paper explores the use of Gabor filters to extract information about the orientation of the object edges that produce the events, applying the matching algorithm to the events generated by the Gabor filters rather than to those produced directly by the DVS. This strategy provides more reliably matched pairs of events, improving the final 3D reconstruction.
Funding: European Union PRI-PIMCHI-2011-0768; Ministerio de Economía y Competitividad TEC2009-10639-C04-01; Ministerio de Economía y Competitividad TEC2012-37868-C04-01; Junta de Andalucía TIC-609
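To make the orientation-filtering step concrete, the following Python sketch takes a short-time map of accumulated events, convolves it with a small Gabor bank, and re-emits orientation-labelled events; stereo matching would then pair only events that carry the same orientation label. The kernel parameters, number of orientations and threshold are illustrative assumptions, not values from the paper.

```python
# Sketch: orientation-labelled events from a Gabor bank (assumed parameters).
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(theta, sigma=2.0, lambd=6.0, size=11):
    """Real-valued Gabor kernel oriented at angle `theta`."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr**2 + yr**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lambd)

def orientation_events(event_map, n_orientations=4, threshold=2.0):
    """event_map: 2-D array of accumulated event counts over a short window.
    Returns (y, x, orientation_index) triplets where one filter responds strongly."""
    responses = np.stack([
        convolve2d(event_map, gabor_kernel(k * np.pi / n_orientations), mode="same")
        for k in range(n_orientations)
    ])
    best = responses.argmax(axis=0)
    strong = responses.max(axis=0) > threshold
    ys, xs = np.nonzero(strong)
    return list(zip(ys, xs, best[ys, xs]))
```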
Designing Non Constant Resolution Vision Sensors Via Photosites Rearrangement
Abstract: Non-conventional imaging sensors have been intensively investigated in recent works. Research is conducted to design devices capable of providing panoramic views without the need for a mosaicing process. These devices combine optical lenses and/or non-planar reflective surfaces with a standard pinhole camera. In this paper we present an analysis of sensors with rearranged photosites adapted to the distortions induced by mirrors. In particular, we aim to identify photosite distributions that compensate for the non-constant resolution of a given reflective surface. Our analysis is applied to a wide set of reflective surfaces commonly used in panoramic vision. Synthetic data are produced in order to substantiate the geometric properties, since such sensors do not exist yet.
I. Introduction: Standard vision sensors provide images with almost constant spatial resolution, so that a wide range of linear operators can be applied for signal processing. Scene objects and their projections on the image plane are linked only by the Thales relation, hence metrics of the scene can be computed. Straight lines are projected as straight lines on the image plane, and the projections are equiareal if distances to the sensor are constant. While the image formation of a perspective camera is familiar, it is less trivial to apprehend the sensor models that are intensively studied for their panoramic properties. Non-linear devices are used to enlarge the field of view: non-planar reflective surfaces and/or wide-angle lenses that do not comply with the Gauss condition. The trade-off for the broadened field of view is the non-linearity of the imaged signals; the resulting projections are aphylactic. Measures performed on the images become highly complex, and most traditional image processing operators are no longer applicable. Intensive work has been carried out on the design of omnidirectional vision imaging systems and on the interpretation of their signals. Distortions induced by lenses or by reflection from mirrors are described as "deviations" from the perspective model, and a metric to quantify them is introduced.
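As a worked numeric illustration of the compensation idea, the sketch below places photosite rings at the image radii corresponding to equally spaced viewing angles, so that each ring covers the same angular slice of the scene. The para-catadioptric projection r = 2p·tan(θ/2) is used only as an example mirror model; it is an assumption for the sketch, not the set of surfaces analysed in the paper.

```python
# Non-uniform radial photosite layout compensating a mirror's projection
# (example model only: r = 2p * tan(theta / 2)).
import numpy as np

def projection_radius(theta, p=1.0):
    """Image radius of a ray at elevation angle `theta` from the mirror axis."""
    return 2.0 * p * np.tan(theta / 2.0)

def compensating_ring_radii(n_rings, theta_max=np.deg2rad(75), p=1.0):
    """Uniform steps in viewing angle become non-uniform steps in image radius."""
    thetas = np.linspace(0.0, theta_max, n_rings + 1)[1:]
    return projection_radius(thetas, p)

radii = compensating_ring_radii(8)
print(np.round(radii, 3))           # ring radii grow faster than linearly
print(np.round(np.diff(radii), 3))  # spacing widens toward the periphery
```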
A Framework for Event-based Computer Vision on a Mobile Device
We present the first publicly available Android framework to stream data from an event camera directly to a mobile phone. Today's mobile devices handle a wider range of workloads than ever before, and they incorporate a growing gamut of sensors that make devices smarter, more user-friendly and secure. Conventional cameras in particular play a central role in such tasks, but they cannot record continuously, as the amount of redundant information recorded is costly to process. Bio-inspired event cameras, on the other hand, only record changes in a visual scene and have shown promising low-power applications that specifically suit mobile tasks such as face detection, gesture recognition or gaze tracking. Our prototype device is the first step towards embedding such an event camera into a battery-powered handheld device. The mobile framework allows us to stream events in real time and opens up possibilities for always-on and on-demand sensing on mobile phones. To interface the asynchronous event camera output with synchronous von Neumann hardware, we look at how buffering events and processing them in batches can benefit mobile applications. We evaluate our framework in terms of latency and throughput, and show examples of computer vision tasks that involve both event-by-event and pre-trained neural network methods for gesture recognition, aperture-robust optical flow and grey-level image reconstruction from events. The code is available at https://github.com/neuromorphic-paris/fro
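The batching strategy mentioned above can be illustrated with a small Python sketch: events are grouped and handed to a synchronous consumer either when a fixed count accumulates or when a latency budget expires, whichever comes first. This is a generic sketch rather than code from the framework's repository, and the parameter names and defaults are assumptions.

```python
# Generic event-batching loop bridging an asynchronous event stream and a
# synchronous (batch-based) consumer. Not taken from the paper's framework.
import time

def batch_events(event_source, max_events=1_000, max_latency_s=0.010):
    """event_source: iterable of events. Yields lists of events (batches)."""
    batch, deadline = [], time.monotonic() + max_latency_s
    for event in event_source:
        batch.append(event)
        if len(batch) >= max_events or time.monotonic() >= deadline:
            yield batch
            batch, deadline = [], time.monotonic() + max_latency_s
    if batch:
        yield batch  # flush the remainder

# Usage: feed each batch to a batch-based algorithm (e.g. a neural network).
for batch in batch_events(iter(range(5_000)), max_events=1_000):
    pass  # process `batch`
```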
Cone of view camera model using conformal geometric algebra for classic and panoramic image sensors
Camera calibration is a necessary step in 3D computer vision in order to extract metric information from 2D images. Calibration has been defined as the non-parametric association of a projection ray in 3D to every pixel in an image. It is normally neglected that pixels have a finite surface, which can be approximated by a cone of view whose directrix axis is the pixel's usual ray of view. While this physical pixel topology can easily be neglected for perspective cameras, it must be taken into account for variant-scale cameras such as foveolar or catadioptric omnidirectional sensors, which are nowadays widely used in robotics. This paper presents a general model to geometrically describe cameras, whether they have constant or variant-scale resolution, by introducing the new idea of using pixel-cones rather than the usual line-rays to model the field of view. The paper presents the general formulation using twists of conformal geometric algebra to express cones and their intersections in an easy and elegant manner, without which the use of cones would be too cumbersome. It also introduces an experimental method to determine pixel-cones for any geometric type of camera. Experimental results are shown for perspective and omnidirectional catadioptric cameras.
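To convey the pixel-cone idea in simple terms, the sketch below represents a pixel's field of view as a cone with an apex, an axis and a half-angle, and tests whether a 3-D point falls inside it; a point observed by two cameras then lies in the intersection of two such cones rather than on the intersection of two rays. This plain vector-algebra sketch deliberately omits the conformal geometric algebra formulation used in the paper, and all names and values are illustrative.

```python
# Simplified pixel-cone membership test (not the paper's CGA formulation).
import numpy as np

class PixelCone:
    def __init__(self, apex, axis, half_angle_rad):
        self.apex = np.asarray(apex, dtype=float)
        self.axis = np.asarray(axis, dtype=float)
        self.axis /= np.linalg.norm(self.axis)
        self.cos_half = np.cos(half_angle_rad)

    def contains(self, point):
        """True if the 3-D point lies inside the pixel's cone of view."""
        v = np.asarray(point, dtype=float) - self.apex
        n = np.linalg.norm(v)
        if n == 0.0:
            return True
        return (v / n) @ self.axis >= self.cos_half

# A point seen by two pixel-cones from different cameras lies in their
# intersection, which replaces the usual ray-ray triangulation.
cone_a = PixelCone(apex=[0, 0, 0], axis=[0, 0, 1], half_angle_rad=np.deg2rad(0.05))
cone_b = PixelCone(apex=[0.1, 0, 0], axis=[-0.02, 0, 1], half_angle_rad=np.deg2rad(0.05))
point = [0.0, 0.0, 5.0]
print(cone_a.contains(point), cone_b.contains(point))
```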