Event-based tracking of human hands
This paper proposes a novel method for tracking human hands using data from
an event camera. The event camera detects changes in brightness, measuring
motion, with low latency, no motion blur, low power consumption and high
dynamic range. Captured frames are analysed using lightweight algorithms
reporting 3D hand position data. The chosen pick-and-place scenario serves as
an example input for collaborative human-robot interactions and in obstacle
avoidance for human-robot safety applications. Event data are pre-processed
into intensity frames. The regions of interest (ROI) are defined through object
edge event activity, reducing noise. ROI features are extracted for use
in depth perception. Event-based tracking of human hands was demonstrated to be
feasible, in real time and at a low computational cost. The proposed ROI-finding
method reduces noise from intensity images, achieving up to 89% data reduction
relative to the original, while preserving the features. The depth estimation
error relative to ground truth (measured with wearables), computed using
dynamic time warping with a single event camera, ranges from 15 to 30
millimetres, depending on the plane in which it is measured. The work tracks
human hands in 3D space using data from a single event camera and lightweight
algorithms to define ROI features (hand tracking in space).
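The abstract reports depth error measured with dynamic time warping (DTW) but does not say how the DTW cost is defined or normalised. As a rough illustration only, the textbook DTW recurrence for two 1D trajectories can be sketched as follows; the function name, the absolute-difference local cost, and the lack of normalisation are assumptions, not the authors' evaluation code.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping cost between two 1D sequences.

    Assumption: absolute difference as local cost, no path-length
    normalisation. The paper may use a different variant.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, match
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical depth samples in millimetres (not data from the paper)
estimated = [100.0, 120.0, 150.0]
ground_truth = [100.0, 120.0, 120.0, 150.0]
print(dtw_distance(estimated, ground_truth))
```

DTW tolerates the different sampling rates of an event camera and a wearable sensor because it aligns the two sequences before accumulating the error.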
An extended modular processing pipeline for event-based vision in automatic visual inspection
Dynamic Vision Sensors differ from conventional cameras in that only intensity changes of individual pixels are perceived and transmitted as an asynchronous stream instead of an entire frame. The technology promises, among other things, high temporal resolution, low latency, and low data rates. While such sensors currently enjoy much scientific attention, there are only a few publications on practical applications. One field of application that has hardly been considered so far, yet potentially fits well with the sensor principle due to its special properties, is automatic visual inspection. In this paper, we evaluate current state-of-the-art processing algorithms in this new application domain. We further propose an algorithmic approach for the identification of ideal time windows within an event stream for object classification. For the evaluation of our method, we acquire two novel datasets that contain typical visual inspection scenarios, i.e., the inspection of objects on a conveyor belt and during free fall. The success of our algorithmic extension for data processing is demonstrated on the basis of these new datasets by showing that the classification accuracy of current algorithms increases substantially. By making our new datasets publicly available, we intend to stimulate further research on the application of Dynamic Vision Sensors in machine vision.
Event Blob Tracking: An Asynchronous Real-Time Algorithm
Event-based cameras have become increasingly popular for tracking fast-moving
objects due to their high temporal resolution, low latency, and high dynamic
range. In this paper, we propose a novel algorithm for tracking event blobs
using raw events asynchronously in real time. We introduce the concept of an
event blob as a spatio-temporal likelihood of event occurrence where the
conditional spatial likelihood is blob-like. Many real-world objects generate
event blob data, for example, flickering LEDs such as car headlights or any
small foreground object moving against a static or slowly varying background.
The proposed algorithm uses a nearest neighbour classifier with a dynamic
threshold criterion for data association coupled with a Kalman filter to track
the event blob state. Our algorithm achieves highly accurate tracking and event
blob shape estimation even under challenging lighting conditions and high-speed
motions. The microsecond time resolution achieved means that the filter output
can be used to derive secondary information such as time-to-contact or range
estimation, which will enable applications to real-world problems such as
collision avoidance in autonomous driving.
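The combination described above (nearest-neighbour association with a dynamic threshold, followed by a Kalman update per event) can be illustrated with a deliberately simplified sketch. Everything here is an assumption made for illustration: the class name `BlobTracker`, the random-walk motion model, the single isotropic variance, and all parameter values. The published algorithm also estimates blob shape and velocity, which this sketch omits.

```python
class BlobTracker:
    """Toy per-event tracker: random-walk Kalman filter with gating.

    Hypothetical simplification; not the authors' implementation.
    """

    def __init__(self, x, y, p=25.0, q=50.0, r=4.0, gate=9.0):
        self.x, self.y = float(x), float(y)  # blob centre estimate (pixels)
        self.p = p        # isotropic position variance (pixels^2)
        self.q = q        # process noise rate (pixels^2 per second)
        self.r = r        # per-event measurement variance (pixels^2)
        self.gate = gate  # gate size in units of innovation variance
        self.t = None     # timestamp of the last event seen (seconds)

    def update(self, ex, ey, t):
        """Process one event; return True if it was associated."""
        # Predict: uncertainty grows with the time since the last event.
        if self.t is not None:
            self.p += self.q * (t - self.t)
        self.t = t
        # Dynamic threshold: the gate scales with the innovation variance.
        s = self.p + self.r
        d2 = (ex - self.x) ** 2 + (ey - self.y) ** 2
        if d2 > self.gate * s:
            return False  # treated as background or another object
        # Kalman correction with scalar gain.
        k = self.p / s
        self.x += k * (ex - self.x)
        self.y += k * (ey - self.y)
        self.p *= (1.0 - k)
        return True


def process_event(trackers, ex, ey, t):
    """Offer the event to the nearest tracker; return it if accepted."""
    nearest = min(trackers, key=lambda k: (ex - k.x) ** 2 + (ey - k.y) ** 2)
    return nearest if nearest.update(ex, ey, t) else None
```

Because each event triggers one constant-time update, the state estimate is refreshed at the timestamp resolution of the sensor rather than at a frame rate.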
Distractor-aware Event-based Tracking
Event cameras, or dynamic vision sensors, have recently achieved success in
tasks ranging from fundamental vision to high-level vision research. Due to its
ability to asynchronously capture light intensity changes, an event camera has
an inherent advantage in capturing moving objects in challenging scenarios,
including objects under low light, high dynamic range, or fast motion. Thus,
event cameras are a natural fit for visual object tracking. However, current
event-based trackers derived from RGB trackers simply replace the input images
with event frames and still follow a conventional tracking pipeline that mainly
focuses on object texture for target distinction. As a result, the trackers may
not be robust in challenging scenarios such as moving cameras and cluttered
foreground. In this paper, we propose a distractor-aware event-based tracker
that introduces transformer modules into a Siamese network architecture (named
DANet). Specifically, our model is mainly composed of a motion-aware network
and a target-aware network, which simultaneously exploit both motion cues and
object contours from event data, so as to discover moving objects and identify
the target object by removing dynamic distractors. Our DANet can be trained in
an end-to-end manner without any post-processing and can run at over 80 FPS on
a single V100. We conduct comprehensive experiments on two large event tracking
datasets to validate the proposed model. We demonstrate that our tracker has
superior performance over state-of-the-art trackers in terms of both
accuracy and efficiency.
Event-based clustering and looming detection
Based on the sequential K-means algorithm, we present a real-time, accurate and automatic clustering method for asynchronous events generated by the optical flow algorithm of Ridwan and Cheng. The complexity of our algorithm does not increase with increasing number of events. We also designed an implementation of the elbow method capable of detecting the number of clusters without any a priori assumptions about objects. In addition, we designed a merge algorithm capable of merging multiple touching clusters into one to enhance the results of our clustering algorithm. The output of our clustering algorithm is then used with a single-object looming detection algorithm to detect looming for multiple objects. We tested our algorithm on both simulated and captured data sets against two other well-known algorithms. Our algorithm is fast and accurate in both cluster detection quality and looming detection quality.
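For readers unfamiliar with the base technique, the standard sequential (online) K-means update can be sketched as below. This shows only the generic textbook update, with a fixed number of clusters; the class name and the seeding scheme are assumptions, and the elbow-method and cluster-merging steps described in the abstract are not reproduced.

```python
class SequentialKMeans:
    """Generic online K-means: one constant-time update per point."""

    def __init__(self, initial_centroids):
        # Seeds are placeholders: with a count of zero, the first point
        # assigned to a cluster replaces its seed entirely.
        self.centroids = [list(c) for c in initial_centroids]
        self.counts = [0] * len(self.centroids)

    def update(self, point):
        """Assign the point to its nearest centroid and move that centroid."""
        dists = [sum((p - c) ** 2 for p, c in zip(point, cen))
                 for cen in self.centroids]
        k = dists.index(min(dists))
        self.counts[k] += 1
        eta = 1.0 / self.counts[k]  # running-mean learning rate
        self.centroids[k] = [c + eta * (p - c)
                             for p, c in zip(point, self.centroids[k])]
        return k


# Hypothetical event coordinates (x, y)
km = SequentialKMeans([(0, 0), (10, 10)])
for event_xy in [(1, 1), (9, 9), (3, 3)]:
    km.update(event_xy)
print(km.centroids)
```

Each update touches only the K centroids, so the per-event cost is independent of how many events have already been processed, which is consistent with the complexity claim in the abstract.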
Low Latency Event-Based Filtering and Feature Extraction for Dynamic Vision Sensors in Real-Time FPGA Applications
Dynamic Vision Sensor (DVS) pixels produce an asynchronous variable-rate address-event
output that represents brightness changes at the pixel. Since these sensors produce frame-free output, they
are ideal for real-time dynamic vision applications with real-time latency and power system constraints.
Event-based filtering algorithms have been proposed to post-process the asynchronous event output to
reduce sensor noise, extract low level features, and track objects, among others. These postprocessing
algorithms help to increase the performance and accuracy of further processing for tasks such as classification
using spike-based learning (i.e., ConvNets), stereo vision, and visually-servoed robots, etc. This paper
presents an FPGA-based library of these postprocessing event-based algorithms with implementation details;
specifically background activity (noise) filtering, pixel masking, object motion detection and object tracking.
The latencies of these filters on the Field Programmable Gate Array (FPGA) platform are below 300ns with
an average latency reduction of 188% (maximum of 570%) over the software versions running on a desktop
PC CPU. This open-source event-based filter IP library for FPGA has been tested on two different platforms
and scenarios using different synthesis and implementation tools for Lattice and Xilinx vendors.
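To make the idea of background activity filtering concrete, here is a minimal software sketch of one common formulation: an event is kept only if one of its eight neighbouring pixels fired recently. The function name, the `(x, y, t)` event layout, and the 1000 µs default window are assumptions for illustration; this is not the FPGA IP described in the paper.

```python
def background_activity_filter(events, width, height, dt_us=1000):
    """Keep events supported by a recent event at a neighbouring pixel.

    events: iterable of (x, y, t) with t in microseconds, sorted by t.
    Hypothetical sketch; parameter names and defaults are assumptions.
    """
    # Last event timestamp per pixel; None means no event seen yet.
    last = [[None] * width for _ in range(height)]
    kept = []
    for x, y, t in events:
        supported = False
        # Scan the 3x3 neighbourhood, clipped at the sensor borders.
        for ny in range(max(0, y - 1), min(height, y + 2)):
            for nx in range(max(0, x - 1), min(width, x + 2)):
                if nx == x and ny == y:
                    continue  # a pixel does not support itself
                ts = last[ny][nx]
                if ts is not None and t - ts <= dt_us:
                    supported = True
        last[y][x] = t
        if supported:
            kept.append((x, y, t))
    return kept


# Hypothetical events: only the second has a recent neighbour
events = [(5, 5, 0), (6, 5, 500), (20, 20, 600), (5, 6, 5000)]
print(background_activity_filter(events, width=32, height=32))
```

Isolated events are treated as sensor noise because real moving edges tend to trigger spatially adjacent pixels within a short time window.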
Utilization and experimental evaluation of occlusion aware kernel correlation filter tracker using RGB-D
Unlike deep learning, which requires large training datasets, correlation filter-based trackers such as the Kernelized Correlation Filter (KCF) use implicit properties of tracked images (circulant matrices) for training in real time. Despite their practical application in tracking, there is a need for a better theoretical, mathematical, and experimental understanding of the fundamentals of KCF. This thesis first details a working prototype of the tracker and investigates its effectiveness in real-time applications, with supporting visualizations. We further address some of the drawbacks of the tracker in cases of occlusion, scale change, object rotation, out-of-view targets, and model drift with our novel RGB-D Kernel Correlation tracker. We also study the use of a particle filter to improve the tracker's accuracy. Our results are experimentally evaluated using a) a standard dataset and b) real-time data from a Microsoft Kinect V2 sensor. We believe this work will set the basis for a better understanding of the effectiveness of kernel-based correlation filter trackers and help to further define some of their possible advantages in tracking.
BIO-INSPIRED MOTION PERCEPTION: FROM GANGLION CELLS TO AUTONOMOUS VEHICLES
Animals are remarkable at navigation, even in extreme situations. Through motion perception, animals compute their own movements (egomotion) and find other objects (prey, predator, obstacles) and their motions in the environment. Analogous to animals, artificial systems such as robots also need to know where they are relative to structure and segment obstacles to avoid collisions. Even though substantial progress has been made in the development of artificial visual systems, they still struggle to achieve robust and generalizable solutions. To this end, I propose a bio-inspired framework that narrows the gap between natural and artificial systems.
The standard approaches in robot motion perception seek to reconstruct a three-dimensional model of the scene and then use this model to estimate egomotion and object segmentation. However, the scene reconstruction process is data-heavy and computationally expensive, and it fails to deal with high-speed and dynamic scenarios. In contrast, biological visual systems excel in these difficult situations by extracting only the minimal information sufficient for motion perception tasks. I derive minimalist/purposive ideas from biological processes throughout this thesis and develop mathematical solutions for robot motion perception problems.
In this thesis, I develop a full range of solutions that utilize bio-inspired motion representation and learning approaches for motion perception tasks. In particular, I focus on egomotion estimation and motion segmentation tasks. I make four main contributions: 1. I introduce NFlowNet, a neural network to estimate normal flow (bio-inspired motion filters). Normal flow estimation presents a new avenue for solving egomotion in a robust and qualitative framework. 2. Utilizing normal flow, I propose the DiffPoseNet framework to estimate egomotion by formulating the qualitative constraint in a differentiable optimization layer, which allows for end-to-end learning. 3. Further, utilizing a neuromorphic event camera, a retina-inspired vision sensor, I develop 0-MMS, a model-based optimization approach that employs event spikes to segment the scene into multiple moving parts in high-speed dynamic lighting scenarios. 4. To improve the precision of event-based motion perception across time, I develop SpikeMS, a novel bio-inspired learning approach that fully capitalizes on the rich temporal information in event spikes.