
    Event-based Vision: A Survey

    Event cameras are bio-inspired sensors that differ from conventional frame cameras: Instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes, and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as those requiring low latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
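To make the event stream described in this abstract concrete, here is a minimal sketch of one common way to handle such data: summing event polarities per pixel over a time window. The `Event` type, its field names, and the `accumulate` function are assumptions made for illustration; they are not taken from the survey or from any particular camera SDK.

```python
from typing import Iterable, List, NamedTuple


class Event(NamedTuple):
    """One event: when, where, and in which direction brightness changed."""
    t_us: int      # timestamp in microseconds (hypothetical field name)
    x: int         # pixel column
    y: int         # pixel row
    polarity: int  # +1 for a brightness increase, -1 for a decrease


def accumulate(events: Iterable[Event], width: int, height: int,
               t_start_us: int, t_end_us: int) -> List[List[int]]:
    """Sum the polarities of all events in [t_start_us, t_end_us) per pixel."""
    frame = [[0] * width for _ in range(height)]
    for ev in events:
        if t_start_us <= ev.t_us < t_end_us:
            frame[ev.y][ev.x] += ev.polarity
    return frame


if __name__ == "__main__":
    stream = [
        Event(10, 0, 0, +1),
        Event(20, 0, 0, +1),
        Event(30, 1, 0, -1),
        Event(2000, 1, 1, +1),  # falls outside the window used below
    ]
    for row in accumulate(stream, width=2, height=2, t_start_us=0, t_end_us=1000):
        print(row)
```

Accumulating into a frame discards the microsecond timing inside the window, so it is only one of several possible representations, chosen here because it is the simplest to read.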

    Tracking of the Articulated Upper Body on Multi-View Stereo Image Sequences


    Wing and body motion during flight initiation in Drosophila revealed by automated visual tracking

    The fruit fly Drosophila melanogaster is a widely used model organism in studies of genetics, developmental biology and biomechanics. One limitation for exploiting Drosophila as a model system for behavioral neurobiology is that measuring body kinematics during behavior is labor intensive and subjective. In order to quantify flight kinematics during different types of maneuvers, we have developed a visual tracking system that estimates the posture of the fly from multiple calibrated cameras. An accurate geometric fly model is designed using unit quaternions to capture complex body and wing rotations, which are automatically fitted to the images in each time frame. Our approach works across a range of flight behaviors, while also being robust to common environmental clutter. The tracking system is used in this paper to compare wing and body motion during both voluntary and escape take-offs. Using our automated algorithms, we are able to measure stroke amplitude, geometric angle of attack and other parameters important to a mechanistic understanding of flapping flight. When compared with manual tracking methods, the algorithm estimates body position within 4.4±1.3% of the body length, while body orientation is measured within 6.5±1.9 deg. (roll), 3.2±1.3 deg. (pitch) and 3.4±1.6 deg. (yaw) on average across six videos. Similarly, stroke amplitude and deviation are estimated within 3.3 deg. and 2.1 deg., while angle of attack is typically measured within 8.8 deg. comparing against a human digitizer. Using our automated tracker, we analyzed a total of eight voluntary and two escape take-offs. These sequences show that Drosophila melanogaster do not utilize clap and fling during take-off and are able to modify their wing kinematics from one wingstroke to the next. 
Our approach should enable biomechanists and ethologists to process much larger datasets than possible at present and, therefore, accelerate insight into the mechanisms of free-flight maneuvers of flying insects.
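The abstract above mentions unit quaternions as the representation for body and wing rotations. As a rough illustration of that representation only (not the authors' fly model or fitting procedure), the sketch below builds a unit quaternion from an axis and angle and uses it to rotate a 3D vector. All function names are hypothetical.

```python
import math
from typing import Tuple

Quat = Tuple[float, float, float, float]  # (w, x, y, z)
Vec3 = Tuple[float, float, float]


def quat_from_axis_angle(axis: Vec3, angle_rad: float) -> Quat:
    """Unit quaternion for a rotation of angle_rad about the given axis."""
    norm = math.sqrt(sum(c * c for c in axis))
    ax, ay, az = (c / norm for c in axis)
    half = angle_rad / 2.0
    s = math.sin(half)
    return (math.cos(half), ax * s, ay * s, az * s)


def quat_mul(a: Quat, b: Quat) -> Quat:
    """Hamilton product a * b."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return (
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    )


def rotate(q: Quat, v: Vec3) -> Vec3:
    """Rotate v by the unit quaternion q via q * (0, v) * conj(q)."""
    conj = (q[0], -q[1], -q[2], -q[3])
    _, x, y, z = quat_mul(quat_mul(q, (0.0, *v)), conj)
    return (x, y, z)


if __name__ == "__main__":
    # A quarter turn about the z axis applied to the x unit vector.
    q = quat_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
    print(rotate(q, (1.0, 0.0, 0.0)))
```

Successive rotations compose by quaternion multiplication, which is one reason the representation is convenient for chained joints such as a wing attached to a body.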

    Configurable Input Devices for 3D Interaction using Optical Tracking

    Three-dimensional interaction with virtual objects is one of the aspects that needs to be addressed in order to increase the usability and usefulness of virtual reality. Human beings have difficulties understanding 3D spatial relationships and manipulating 3D user interfaces, which require the control of multiple degrees of freedom simultaneously. Conventional interaction paradigms known from the desktop computer, such as the use of interaction devices like the mouse and keyboard, may be insufficient or even inappropriate for 3D spatial interaction tasks. The aim of the research in this thesis is to develop the technology required to improve 3D user interaction. This can be accomplished by allowing interaction devices to be constructed such that their use is apparent from their structure, and by enabling efficient development of new input devices for 3D interaction. The driving vision in this thesis is that for effective and natural direct 3D interaction the structure of an interaction device should be specifically tuned to the interaction task. Two aspects play an important role in this vision. First, interaction devices should be structured such that interaction techniques are as direct and transparent as possible. Interaction techniques define the mapping between interaction task parameters and the degrees of freedom of interaction devices. Second, the underlying technology should enable developers to rapidly construct and evaluate new interaction devices. The thesis is organized as follows. In Chapter 2, a review of the optical tracking field is given. The tracking pipeline is discussed, existing methods are reviewed, and improvement opportunities are identified. In Chapters 3 and 4 the focus is on the development of optical tracking techniques of rigid objects. The goal of the tracking method presented in Chapter 3 is to reduce the occlusion problem. 
The method exploits projection invariant properties of line pencil markers, and the fact that line features only need to be partially visible. In Chapter 4, the aim is to develop a tracking system that supports devices of arbitrary shapes, and allows for rapid development of new interaction devices. The method is based on subgraph isomorphism to identify point clouds. To support the development of new devices in the virtual environment an automatic model estimation method is used. Chapter 5 provides an analysis of three optical tracking systems based on different principles. The first system is based on an optimization procedure that matches the 3D device model points to the 2D data points that are detected in the camera images. The other systems are the tracking methods as discussed in Chapters 3 and 4. In Chapter 6 an analysis of various filtering and prediction methods is given. These techniques can be used to make the tracking system more robust against noise, and to reduce the latency problem. Chapter 7 focusses on optical tracking of composite input devices, i.e., input devices that consist of multiple rigid parts that can have combinations of rotational and translational degrees of freedom with respect to each other. Techniques are developed to automatically generate a 3D model of a segmented input device from motion data, and to use this model to track the device. In Chapter 8, the presented techniques are combined to create a configurable input device, which supports direct and natural co-located interaction. In this chapter, the goal of the thesis is realized. The device can be configured such that its structure reflects the parameters of the interaction task. In Chapter 9, the configurable interaction device is used to study the influence of spatial device structure with respect to the interaction task at hand. 
The driving vision of this thesis, that the spatial structure of an interaction device should match that of the task, is analyzed and evaluated by performing a user study. The concepts and techniques developed in this thesis allow researchers to rapidly construct and apply new interaction devices for 3D interaction in virtual environments. Devices can be constructed such that their spatial structure reflects the 3D parameters of the interaction task at hand. The interaction technique then becomes a transparent one-to-one mapping that directly mediates the functions of the device to the task. The developed configurable interaction devices can be used to construct intuitive spatial interfaces, and allow researchers to rapidly evaluate new device configurations and to efficiently perform studies on the relation between the spatial structure of devices and the interaction task.

    Iterative Estimation of Rigid-Body Transformations: Application to Robust Object Tracking and Iterative Closest Point

    Closed-form solutions are traditionally used in computer vision for estimating rigid body transformations. Here we suggest an iterative solution for estimating rigid body transformations and prove its global convergence. We show that for a number of applications involving repeated estimations of rigid body transformations, an iterative scheme is preferable to a closed-form solution. We illustrate this experimentally on two applications, 3D object tracking and image registration with Iterative Closest Point. Our results show that, for those problems, using an iterative and continuous estimation process is more robust than using many independent closed-form estimations.
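For readers unfamiliar with the closed-form baseline this abstract argues against, the sketch below shows a least-squares rigid fit in the simplified 2D case, where the optimal rotation angle has a direct formula. It is an illustration only: the paper concerns 3D transformations and proposes an iterative scheme, neither of which is reproduced here, and the name `fit_rigid_2d` is hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def fit_rigid_2d(src: Sequence[Point], dst: Sequence[Point]) -> Tuple[float, Point]:
    """Least-squares rotation angle and translation mapping src onto dst.

    Assumes src[i] corresponds to dst[i] and both lists have equal length.
    """
    n = len(src)
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(p[0] for p in dst) / n
    cdy = sum(p[1] for p in dst) / n

    # Accumulate dot and cross products of the centred point pairs.
    dot = 0.0
    cross = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        ax, ay = sx - csx, sy - csy
        bx, by = dx - cdx, dy - cdy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx

    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # The translation maps the rotated source centroid onto the target centroid.
    tx = cdx - (c * csx - s * csy)
    ty = cdy - (s * csx + c * csy)
    return theta, (tx, ty)


def apply_rigid_2d(theta: float, t: Point, pts: Sequence[Point]) -> List[Point]:
    """Apply the rotation theta followed by the translation t to each point."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + t[0], s * x + c * y + t[1]) for x, y in pts]


if __name__ == "__main__":
    source = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    target = apply_rigid_2d(math.radians(30), (2.0, -1.0), source)
    print(fit_rigid_2d(source, target))
```

In a tracking or ICP loop, a fit like this would be recomputed from scratch at every step, which is the "many independent closed-form estimations" setting the abstract contrasts with a continuous iterative estimate.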

    Exploitation of time-of-flight (ToF) cameras

    This technical report reviews the state of the art in the field of ToF cameras, their advantages, their limitations, and their present-day applications, sometimes in combination with other sensors. Even though ToF cameras provide neither higher resolution nor larger ambiguity-free range compared to other range map estimation systems, advantages such as registered depth and intensity data at a high frame rate, compact design, low weight and reduced power consumption have motivated their use in numerous areas of research. In robotics, these areas range from mobile robot navigation and map building to vision-based human motion capture and gesture recognition, showing particularly great potential in object modeling and recognition. Preprint.

    Video foreground extraction for mobile camera platforms

    Foreground object detection is a fundamental task in computer vision with many applications in areas such as object tracking, event identification, and behavior analysis. Most conventional foreground object detection methods work only in stable illumination environments using fixed cameras. In real-world applications, however, it is often the case that the algorithm needs to operate under the following challenging conditions: drastic lighting changes, object shape complexity, moving cameras, low frame capture rates, and low resolution images. This thesis presents four novel approaches for foreground object detection on real-world datasets using cameras deployed on moving vehicles. The first problem addresses passenger detection and tracking tasks for public transport buses, investigating the problem of changing illumination conditions and low frame capture rates. Our approach integrates a stable SIFT (Scale Invariant Feature Transform) background seat modelling method with a human shape model into a weighted Bayesian framework to detect passengers. To deal with the problem of tracking multiple targets, we employ the Reversible Jump Markov Chain Monte Carlo tracking algorithm. Using the SVM classifier, the appearance transformation models capture changes in the appearance of the foreground objects across two consecutive frames under low frame rate conditions. In the second problem, we present a system for pedestrian detection involving scenes captured by a mobile bus surveillance system. It integrates scene localization, foreground-background separation, and pedestrian detection modules into a unified detection framework. The scene localization module performs a two stage clustering of the video data. In the first stage, SIFT Homography is applied to cluster frames in terms of their structural similarity, and the second stage further clusters these aligned frames according to consistency in illumination. 
This produces clusters of images that are differential in viewpoint and lighting. A kernel density estimation (KDE) technique for colour and gradient is then used to construct background models for each image cluster, which is further used to detect candidate foreground pixels. Finally, using a hierarchical template matching approach, pedestrians can be detected. In addition to the second problem, we present three direct pedestrian detection methods that extend the HOG (Histogram of Oriented Gradient) techniques (Dalal and Triggs, 2005) and provide a comparative evaluation of these approaches. The three approaches include: a) a new histogram feature, that is formed by the weighted sum of both the gradient magnitude and the filter responses from a set of elongated Gaussian filters (Leung and Malik, 2001) corresponding to the quantised orientation, which we refer to as the Histogram of Oriented Gradient Banks (HOGB) approach; b) the codebook based HOG feature with branch-and-bound (efficient subwindow search) algorithm (Lampert et al., 2008) and; c) the codebook based HOGB approach. In the third problem, a unified framework that combines 3D and 2D background modelling is proposed to detect scene changes using a camera mounted on a moving vehicle. The 3D scene is first reconstructed from a set of videos taken at different times. The 3D background modelling identifies inconsistent scene structures as foreground objects. For the 2D approach, foreground objects are detected using the spatio-temporal MRF algorithm. Finally, the 3D and 2D results are combined using morphological operations. The significance of this research is that it provides basic frameworks for automatic large-scale mobile surveillance applications and facilitates many higher-level applications such as object tracking and behaviour analysis.
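The KDE background model mentioned in this abstract can be illustrated with a single-pixel, single-channel sketch: past intensity samples define a density, and a new value with low density is flagged as candidate foreground. This is a simplified assumption-laden example, not the thesis method, which models colour and gradient per image cluster. The Gaussian kernel, the function names, and the default `bandwidth` and `threshold` values are all hypothetical choices.

```python
import math
from typing import Sequence


def kde_density(value: float, samples: Sequence[float], bandwidth: float) -> float:
    """Gaussian kernel density estimate of `value` given background samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((value - s) / bandwidth) ** 2) for s in samples
    )


def is_foreground(value: float, samples: Sequence[float],
                  bandwidth: float = 10.0, threshold: float = 1e-3) -> bool:
    """Flag the pixel as candidate foreground if its density is below threshold."""
    return kde_density(value, samples, bandwidth) < threshold


if __name__ == "__main__":
    # Intensities previously observed at one pixel of the background.
    background = [100.0, 102.0, 98.0, 101.0, 99.0]
    print(is_foreground(100.0, background))  # value close to the history
    print(is_foreground(200.0, background))  # value far from the history
```

The bandwidth controls how much deviation from the history is tolerated, so in practice it would need tuning to the sensor noise and lighting variation within each cluster.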