Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world.
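As a concrete illustration of the output format described above, the sketch below simulates the standard event-generation model, in which a pixel emits an event whenever its log-brightness has changed by a contrast threshold since the last event. The threshold value, function names, and example signal are illustrative assumptions, not details taken from the survey.

```python
import numpy as np

def generate_events(log_intensity, timestamps, x, y, contrast_threshold=0.2):
    """Simulate the events of a single pixel from a sampled log-brightness signal.

    Each event is a tuple (t, x, y, polarity): polarity is +1 when the
    log-brightness has increased by the contrast threshold since the last
    event, and -1 when it has decreased by the same amount.
    """
    events = []
    reference = log_intensity[0]  # brightness level at the time of the last event
    for t, L in zip(timestamps[1:], log_intensity[1:]):
        while L - reference >= contrast_threshold:    # brightness increased enough
            reference += contrast_threshold
            events.append((t, x, y, +1))
        while reference - L >= contrast_threshold:    # brightness decreased enough
            reference -= contrast_threshold
            events.append((t, x, y, -1))
    return events

# Example: a single pixel observing a 3 Hz brightness oscillation for one second.
t = np.linspace(0.0, 1.0, 1000)                      # timestamps in seconds
L = np.log(1.0 + 0.5 * np.sin(2 * np.pi * 3 * t))    # synthetic log-brightness
print(generate_events(L, t, x=10, y=20)[:5])
```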
Event-based tracking of human hands
This paper proposes a novel method for tracking human hands using data from
an event camera. The event camera detects changes in brightness, measuring
motion with low latency, no motion blur, low power consumption and high
dynamic range. Captured frames are analysed using lightweight algorithms that
report 3D hand position data. The chosen pick-and-place scenario serves as an
example input for collaborative human-robot interaction and for obstacle
avoidance in human-robot safety applications. Event data are pre-processed
into intensity frames. Regions of interest (ROI) are defined through
object-edge event activity, reducing noise. ROI features are extracted for
use in depth perception. Event-based tracking of human hands is shown to be
feasible in real time and at low computational cost. The proposed ROI-finding
method reduces noise in the intensity images, achieving up to 89% data
reduction relative to the original while preserving the features. The depth
estimation error relative to ground truth (measured with wearables),
evaluated using dynamic time warping with a single event camera, is between
15 and 30 millimetres, depending on the plane in which it is measured.
Tracking of human hands in 3D space is thus achieved using data from a single
event camera and lightweight algorithms that define ROI features (hand
tracking in space).
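As a rough sketch of the kind of pre-processing described above (not the paper's exact pipeline), events can be accumulated into a pseudo-intensity frame and a region of interest can be taken as the bounding box of sufficiently active pixels. The activity threshold and margin below are illustrative assumptions.

```python
import numpy as np

def events_to_frame(events, height, width):
    """Accumulate per-pixel event counts into a pseudo-intensity frame."""
    frame = np.zeros((height, width), dtype=np.float32)
    for t, x, y, polarity in events:
        frame[y, x] += 1.0
    return frame

def find_roi(frame, activity_threshold=3.0, margin=5):
    """Bounding box (x0, y0, x1, y1) around pixels with enough event activity."""
    ys, xs = np.nonzero(frame >= activity_threshold)
    if len(xs) == 0:
        return None                              # no sufficiently active region
    h, w = frame.shape
    return (max(0, xs.min() - margin), max(0, ys.min() - margin),
            min(w - 1, xs.max() + margin), min(h - 1, ys.max() + margin))
```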
High speed event-based visual processing in the presence of noise
Standard machine vision approaches are challenged in applications where large amounts of noisy temporal data must be processed in real time. This work aims to develop neuromorphic event-based processing systems for such challenging, high-noise environments. The novel, application-focused event-based algorithms developed are primarily designed for implementation in digital neuromorphic hardware, with a focus on noise robustness, ease of implementation, operationally useful ancillary signals, and processing speed in embedded systems.
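A common baseline for suppressing noise in event streams, and a plausible starting point for the high-noise setting described above (though not necessarily one of the algorithms developed in this work), is a background-activity filter that keeps an event only if a nearby pixel fired recently; the window size below is an illustrative assumption.

```python
import numpy as np

def filter_noise(events, height, width, support_window=5e-3):
    """Background-activity filter for an event stream sorted by timestamp.

    An event (t, x, y, polarity), with t in seconds, is kept only if some
    pixel in its 3x3 neighbourhood (including itself) produced an event
    within the last `support_window` seconds; isolated events are dropped.
    """
    last_time = np.full((height, width), -np.inf)
    kept = []
    for t, x, y, p in events:
        y0, y1 = max(0, y - 1), min(height, y + 2)
        x0, x1 = max(0, x - 1), min(width, x + 2)
        if (t - last_time[y0:y1, x0:x1] <= support_window).any():
            kept.append((t, x, y, p))
        last_time[y, x] = t
    return kept
```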
Event-based object detection and tracking for space situational awareness
In this work, we present an optical space imaging dataset using a range of event-based neuromorphic vision sensors. The unique method of operation of event-based sensors makes them ideal for space situational awareness (SSA) applications due to the sparseness inherent in space imaging data. These sensors offer significantly lower bandwidth and power requirements, making them particularly well suited for use in remote locations and on space-based platforms. We present the first publicly accessible event-based space imaging dataset, including recordings using sensors from multiple providers, greatly lowering the barrier to entry for other researchers given the scarcity of such sensors and the expertise required to operate them for SSA applications. The dataset contains both daytime and nighttime recordings, including simultaneous co-collections from different event-based sensors. Recorded at a remote site, and containing 572 labeled targets with a wide range of sizes, trajectories, and signal-to-noise ratios, this real-world event-based dataset represents a challenging detection and tracking task that is not readily solved using previously proposed methods. We propose a highly optimized and robust feature-based detection and tracking method, designed specifically for SSA applications and implemented via a cascade of increasingly selective event filters. These filters rapidly isolate events associated with space objects while maintaining the high temporal resolution of the sensors. The results of this simple yet highly optimized algorithm on the space imaging dataset demonstrate robust high-speed event-based detection and tracking which can readily be implemented on sensor platforms in space as well as in terrestrial environments.
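The cascade idea can be sketched as a chain of increasingly selective filter stages applied to the event stream, so that cheap filters run first and more expensive ones only see surviving events. The stage names in the usage comment are hypothetical placeholders, not the authors' filters.

```python
def cascade(events, stages):
    """Pass an event list through a sequence of increasingly selective stages.

    Each stage is a callable that takes a list of events and returns the
    (usually smaller) list it keeps, so the cheapest filters run on the full
    stream and later, costlier filters only process the survivors.
    """
    for stage in stages:
        events = stage(events)
    return events

# Hypothetical usage with placeholder stage names:
# filtered = cascade(raw_events, [refractory_filter, support_filter, track_gate])
```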
Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age
Simultaneous Localization and Mapping (SLAM) consists in the concurrent
construction of a model of the environment (the map), and the estimation of the
state of the robot moving within it. The SLAM community has made astonishing
progress over the last 30 years, enabling large-scale real-world applications,
and witnessing a steady transition of this technology to industry. We survey
the current state of SLAM. We start by presenting what is now the de-facto
standard formulation for SLAM. We then review related work, covering a broad
set of topics including robustness and scalability in long-term mapping, metric
and semantic representations for mapping, theoretical performance guarantees,
active SLAM and exploration, and other new frontiers. This paper simultaneously
serves as a position paper and tutorial to those who are users of SLAM. By
looking at the published research with a critical eye, we delineate open
challenges and new research issues that still deserve careful scientific
investigation. The paper also contains the authors' take on two questions that
often animate discussions during robotics conferences: Do robots need SLAM? and
Is SLAM solved?
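For reference, the de-facto standard formulation alluded to above is maximum-a-posteriori estimation over a factor graph, which under Gaussian noise reduces to nonlinear least squares; the notation below follows the common convention and is not quoted from the paper.

```latex
% MAP estimate of the robot trajectory and map X given measurements Z;
% h_k is the measurement model of factor k, z_k the measurement, and
% \Omega_k its information (inverse covariance) matrix.
\hat{X} = \operatorname*{arg\,max}_X \, p(X \mid Z)
        = \operatorname*{arg\,min}_X \sum_k \left\| h_k(X_k) - z_k \right\|_{\Omega_k}^2
```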
Exploring Motion Signatures for Vision-Based Tracking, Recognition and Navigation
As cameras become increasingly popular in intelligent systems, algorithms and systems for understanding video data become increasingly important. There is a broad range of applications, including object detection, tracking, scene understanding, and robot navigation. Besides static information, video data contain rich motion information about the environment. Biological visual systems, such as human and animal eyes, are very sensitive to motion information, which has inspired active research on vision-based motion analysis in recent years. The main focus of motion analysis has been on low-level motion representations of pixels and image regions. However, motion signatures can benefit a broader range of applications if further in-depth analysis techniques are developed.
In this dissertation, we mainly discuss how to exploit motion signatures to solve problems in two applications: object recognition and robot navigation.
First, we use bird species recognition as the application to explore motion signatures for object recognition. We begin with a study of the periodic wingbeat motion of flying birds. To analyze the wing motion of a flying bird, we establish kinematic models for bird wings and obtain the wingbeat periodicity in image frames after perspective projection. Time series of salient extremities on bird images are extracted, and the wingbeat frequency is acquired for species classification. Physical experiments show that the frequency-based recognition method is robust to segmentation errors and to measurement loss of up to 30%. In addition to the wing motion, the body motion of the bird is also analyzed to extract the flying velocity in 3D space. An interacting multiple-model approach is then designed to capture the combined object motion patterns under different environmental conditions. The proposed systems and algorithms are tested in physical experiments, and the results show a false positive rate of around 20% with a false negative rate close to zero.
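A minimal sketch of the frequency-based idea described above: estimate the dominant frequency of a wingtip-position time series with an FFT and compare it against per-species wingbeat-frequency ranges. The sampling rate and the species ranges below are illustrative assumptions, not values from the dissertation.

```python
import numpy as np

def wingbeat_frequency(wingtip_y, fps):
    """Dominant oscillation frequency (Hz) of a wingtip vertical-position series."""
    signal = np.asarray(wingtip_y, dtype=float)
    signal -= signal.mean()                        # remove the DC component
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    return freqs[np.argmax(spectrum[1:]) + 1]      # skip the zero-frequency bin

def classify_by_frequency(freq_hz, species_ranges):
    """Return the first species whose wingbeat-frequency range contains freq_hz."""
    for species, (low, high) in species_ranges.items():
        if low <= freq_hz <= high:
            return species
    return "unknown"

# Illustrative (made-up) wingbeat-frequency ranges in Hz.
ranges = {"gull": (2.0, 4.0), "crow": (4.0, 5.5), "pigeon": (5.5, 9.0)}
```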
Second, we explore motion signatures for vision-based vehicle navigation. We observe that motion vectors (MVs) encoded in Moving Picture Experts Group (MPEG) videos provide rich information about the motion in the environment, which can be used to reconstruct the vehicle ego-motion and the structure of the scene. However, MVs suffer from high noise levels. To handle this challenge, an error propagation model for MVs is first proposed. Several steps, including MV merging, plane-at-infinity elimination, and planar region extraction, are designed to further reduce noise. The extracted planes are used as landmarks in an extended Kalman filter (EKF) for simultaneous localization and mapping. Results show that the algorithm performs localization and plane mapping with a relative trajectory error below 5.1%.
Exploiting the fact that MVs encode both environment information and moving obstacles, we further propose to track moving objects simultaneously with localization and mapping. This enables the two critical navigation functionalities, localization and obstacle avoidance, to be performed in a single framework. MVs are labeled as stationary or moving according to their consistency with geometric constraints, so the extracted planes are separated into moving objects and the stationary scene. Multiple EKFs are used to track the static scene and the moving objects simultaneously. In physical experiments, we show a detection rate for moving objects of 96.6% and a mean absolute localization error below 3.5 meters.
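One simple way to realize the stationary/moving labelling described above (a sketch, not necessarily the dissertation's exact geometric test) is to compare each measured MV with the flow predicted at that block from the estimated ego-motion and scene geometry, and threshold the residual; the threshold value is an illustrative assumption.

```python
import numpy as np

def label_motion_vectors(mvs, predicted_flow, residual_threshold=2.0):
    """Label motion vectors as 'stationary' or 'moving'.

    mvs:            (N, 2) array of measured motion vectors (pixels).
    predicted_flow: (N, 2) array of flow predicted at the same block centres
                    from the estimated camera ego-motion and scene geometry.
    A vector whose residual exceeds the threshold is assumed to belong to an
    independently moving object.
    """
    residuals = np.linalg.norm(np.asarray(mvs) - np.asarray(predicted_flow), axis=1)
    return np.where(residuals > residual_threshold, "moving", "stationary")

# Hypothetical usage: labels = label_motion_vectors(mvs, predicted_flow)
# Stationary MVs feed the SLAM EKF; moving MVs are tracked by separate EKFs.
```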