81,276 research outputs found
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world
Pushbroom Stereo for High-Speed Navigation in Cluttered Environments
We present a novel stereo vision algorithm that is capable of obstacle
detection on a mobile-CPU processor at 120 frames per second. Our system
performs a subset of standard block-matching stereo processing, searching only
for obstacles at a single depth. By using an onboard IMU and state-estimator,
we can recover the position of obstacles at all other depths, building and
updating a full depth-map at framerate.
Here, we describe both the algorithm and our implementation on a high-speed,
small UAV, flying at over 20 MPH (9 m/s) close to obstacles. The system
requires no external sensing or computation and is, to the best of our
knowledge, the first high-framerate stereo detection system running onboard a
small UAV
Multi-camera Realtime 3D Tracking of Multiple Flying Animals
Automated tracking of animal movement allows analyses that would not
otherwise be possible by providing great quantities of data. The additional
capability of tracking in realtime - with minimal latency - opens up the
experimental possibility of manipulating sensory feedback, thus allowing
detailed explorations of the neural basis for control of behavior. Here we
describe a new system capable of tracking the position and body orientation of
animals such as flies and birds. The system operates with less than 40 msec
latency and can track multiple animals simultaneously. To achieve these
results, a multi target tracking algorithm was developed based on the Extended
Kalman Filter and the Nearest Neighbor Standard Filter data association
algorithm. In one implementation, an eleven camera system is capable of
tracking three flies simultaneously at 60 frames per second using a gigabit
network of nine standard Intel Pentium 4 and Core 2 Duo computers. This
manuscript presents the rationale and details of the algorithms employed and
shows three implementations of the system. An experiment was performed using
the tracking system to measure the effect of visual contrast on the flight
speed of Drosophila melanogaster. At low contrasts, speed is more variable and
faster on average than at high contrasts. Thus, the system is already a useful
tool to study the neurobiology and behavior of freely flying animals. If
combined with other techniques, such as `virtual reality'-type computer
graphics or genetic manipulation, the tracking system would offer a powerful
new way to investigate the biology of flying animals.Comment: pdfTeX using libpoppler 3.141592-1.40.3-2.2 (Web2C 7.5.6), 18 pages
with 9 figure
Neural Models of Motion Integration, Segmentation, and Probablistic Decision-Making
When brain mechanism carry out motion integration and segmentation processes that compute unambiguous global motion percepts from ambiguous local motion signals? Consider, for example, a deer running at variable speeds behind forest cover. The forest cover is an occluder that creates apertures through which fragments of the deer's motion signals are intermittently experienced. The brain coherently groups these fragments into a trackable percept of the deer in its trajectory. Form and motion processes are needed to accomplish this using feedforward and feedback interactions both within and across cortical processing streams. All the cortical areas V1, V2, MT, and MST are involved in these interactions. Figure-ground processes in the form stream through V2, such as the seperation of occluding boundaries of the forest cover from the boundaries of the deer, select the motion signals which determine global object motion percepts in the motion stream through MT. Sparse, but unambiguous, feauture tracking signals are amplified before they propogate across position and are intergrated with far more numerous ambiguous motion signals. Figure-ground and integration processes together determine the global percept. A neural model predicts the processing stages that embody these form and motion interactions. Model concepts and data are summarized about motion grouping across apertures in response to a wide variety of displays, and probabilistic decision making in parietal cortex in response to random dot displays.National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624
Traffic Danger Recognition With Surveillance Cameras Without Training Data
We propose a traffic danger recognition model that works with arbitrary
traffic surveillance cameras to identify and predict car crashes. There are too
many cameras to monitor manually. Therefore, we developed a model to predict
and identify car crashes from surveillance cameras based on a 3D reconstruction
of the road plane and prediction of trajectories. For normal traffic, it
supports real-time proactive safety checks of speeds and distances between
vehicles to provide insights about possible high-risk areas. We achieve good
prediction and recognition of car crashes without using any labeled training
data of crashes. Experiments on the BrnoCompSpeed dataset show that our model
can accurately monitor the road, with mean errors of 1.80% for distance
measurement, 2.77 km/h for speed measurement, 0.24 m for car position
prediction, and 2.53 km/h for speed prediction.Comment: To be published in proceedings of Advanced Video and Signal-based
Surveillance (AVSS), 2018 15th IEEE International Conference on, pp. 378-383,
IEE
External localization system for mobile robotics
We present a fast and precise vision-based software intended for multiple robot localization. The core component of
the proposed localization system is an efficient method for black and white circular pattern detection. The method is robust to variable lighting conditions, achieves sub-pixel precision, and its computational complexity is independent of the processed image size. With off-the-shelf computational equipment and low-cost camera, its core algorithm is able to process hundreds of images per second while tracking hundreds of objects with millimeter precision. We propose a mathematical model of the method that allows to calculate its precision, area of coverage, and processing speed from the camera’s intrinsic parameters and hardware’s processing capacity. The correctness of the presented model and
performance of the algorithm in real-world conditions are verified in several experiments. Apart from the method description, we also publish its source code; so, it can be used as an enabling technology for various mobile robotics problems
Direct Monocular Odometry Using Points and Lines
Most visual odometry algorithm for a monocular camera focuses on points,
either by feature matching, or direct alignment of pixel intensity, while
ignoring a common but important geometry entity: edges. In this paper, we
propose an odometry algorithm that combines points and edges to benefit from
the advantages of both direct and feature based methods. It works better in
texture-less environments and is also more robust to lighting changes and fast
motion by increasing the convergence basin. We maintain a depth map for the
keyframe then in the tracking part, the camera pose is recovered by minimizing
both the photometric error and geometric error to the matched edge in a
probabilistic framework. In the mapping part, edge is used to speed up and
increase stereo matching accuracy. On various public datasets, our algorithm
achieves better or comparable performance than state-of-the-art monocular
odometry methods. In some challenging texture-less environments, our algorithm
reduces the state estimation error over 50%.Comment: ICRA 201
The Event-Camera Dataset and Simulator: Event-based Data for Pose Estimation, Visual Odometry, and SLAM
New vision sensors, such as the Dynamic and Active-pixel Vision sensor
(DAVIS), incorporate a conventional global-shutter camera and an event-based
sensor in the same pixel array. These sensors have great potential for
high-speed robotics and computer vision because they allow us to combine the
benefits of conventional cameras with those of event-based sensors: low
latency, high temporal resolution, and very high dynamic range. However, new
algorithms are required to exploit the sensor characteristics and cope with its
unconventional output, which consists of a stream of asynchronous brightness
changes (called "events") and synchronous grayscale frames. For this purpose,
we present and release a collection of datasets captured with a DAVIS in a
variety of synthetic and real environments, which we hope will motivate
research on new algorithms for high-speed and high-dynamic-range robotics and
computer-vision applications. In addition to global-shutter intensity images
and asynchronous events, we provide inertial measurements and ground-truth
camera poses from a motion-capture system. The latter allows comparing the pose
accuracy of ego-motion estimation algorithms quantitatively. All the data are
released both as standard text files and binary files (i.e., rosbag). This
paper provides an overview of the available data and describes a simulator that
we release open-source to create synthetic event-camera data.Comment: 7 pages, 4 figures, 3 table
- …