691 research outputs found
Recommended from our members
An evaluation framework for stereo-based driver assistance
This is the post-print version of the Article - Copyright @ 2012 Springer VerlagThe accuracy of stereo algorithms or optical flow methods is commonly assessed by comparing the results against the Middlebury
database. However, equivalent data for automotive or robotics applications
rarely exist as they are difficult to obtain. As our main contribution, we introduce an evaluation framework tailored for stereo-based driver assistance able to deliver excellent performance measures while
circumventing manual label effort. Within this framework one can combine several ways of ground-truthing, different comparison metrics, and use large image databases.
Using our framework we show examples on several types of ground truthing techniques: implicit ground truthing (e.g. sequence recorded without a crash occurred), robotic vehicles with high precision sensors, and to a small extent, manual labeling. To show the effectiveness of our evaluation framework we compare three different stereo algorithms on
pixel and object level. In more detail we evaluate an intermediate representation
called the Stixel World. Besides evaluating the accuracy of the Stixels, we investigate the completeness (equivalent to the detection rate) of the StixelWorld vs. the number of phantom Stixels. Among many findings, using this framework enables us to reduce the number of phantom Stixels by a factor of three compared to the base parametrization. This base parametrization has already been optimized by test driving vehicles for distances exceeding 10000 km
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world
3D Visual Odometry for Road Vehicles
This paper describes a method for estimating the vehicle global position in a network of roads by means of visual odometry. To do so, the ego-motion of the vehicle relative to the road is computed using a stereo-vision system mounted next to the rear view mirror of the car. Feature points are matched between pairs of frames and linked into 3D trajectories. Vehicle motion is estimated using the non-linear, photogrametric approach based on RANSAC. This iterative technique enables the formulation of a robust method that can ignore large numbers of outliers as encountered in real traffic scenes. The resulting method is defined as visual odometry and can be used in conjunction with other sensors, such as GPS, to produce accurate estimates of the vehicle global position. The obvious application of the method is to provide on-board driver assistance in navigation tasks, or to provide a means for autonomously navigating a vehicle. The method has been tested in real traffic conditions without using prior knowledge about the scene nor the vehicle motion. We provide examples of estimated vehicle trajectories using the proposed method and discuss the key issues for further improvement
Long-Term On-Board Prediction of People in Traffic Scenes under Uncertainty
Progress towards advanced systems for assisted and autonomous driving is
leveraging recent advances in recognition and segmentation methods. Yet, we are
still facing challenges in bringing reliable driving to inner cities, as those
are composed of highly dynamic scenes observed from a moving platform at
considerable speeds. Anticipation becomes a key element in order to react
timely and prevent accidents. In this paper we argue that it is necessary to
predict at least 1 second and we thus propose a new model that jointly predicts
ego motion and people trajectories over such large time horizons. We pay
particular attention to modeling the uncertainty of our estimates arising from
the non-deterministic nature of natural traffic scenes. Our experimental
results show that it is indeed possible to predict people trajectories at the
desired time horizons and that our uncertainty estimates are informative of the
prediction error. We also show that both sequence modeling of trajectories as
well as our novel method of long term odometry prediction are essential for
best performance.Comment: CVPR 201
Featureless visual processing for SLAM in changing outdoor environments
Vision-based SLAM is mostly a solved problem providing clear, sharp images can be obtained. However, in outdoor environments a number of factors such as rough terrain, high speeds and hardware limitations can result in these conditions not being met. High speed transit on rough terrain can lead to image blur and under/over exposure, problems that cannot easily be dealt with using low cost hardware. Furthermore, recently there has been a growth in interest in lifelong autonomy for robots, which brings with it the challenge in outdoor environments of dealing with a moving sun and lack of constant artificial lighting. In this paper, we present a lightweight approach to visual localization and visual odometry that addresses the challenges posed by perceptual change and low cost cameras. The approach combines low resolution imagery with the SLAM algorithm, RatSLAM. We test the system using a cheap consumer camera mounted on a small vehicle in a mixed urban and vegetated environment, at times ranging from dawn to dusk and in conditions ranging from sunny weather to rain. We first show that the system is able to provide reliable mapping and recall over the course of the day and incrementally incorporate new visual scenes from different times into an existing map. We then restrict the system to only learning visual scenes at one time of day, and show that the system is still able to localize and map at other times of day. The results demonstrate the viability of the approach in situations where image quality is poor and environmental or hardware factors preclude the use of visual features
Track, then Decide: Category-Agnostic Vision-based Multi-Object Tracking
The most common paradigm for vision-based multi-object tracking is
tracking-by-detection, due to the availability of reliable detectors for
several important object categories such as cars and pedestrians. However,
future mobile systems will need a capability to cope with rich human-made
environments, in which obtaining detectors for every possible object category
would be infeasible. In this paper, we propose a model-free multi-object
tracking approach that uses a category-agnostic image segmentation method to
track objects. We present an efficient segmentation mask-based tracker which
associates pixel-precise masks reported by the segmentation. Our approach can
utilize semantic information whenever it is available for classifying objects
at the track level, while retaining the capability to track generic unknown
objects in the absence of such information. We demonstrate experimentally that
our approach achieves performance comparable to state-of-the-art
tracking-by-detection methods for popular object categories such as cars and
pedestrians. Additionally, we show that the proposed method can discover and
robustly track a large variety of other objects.Comment: ICRA'18 submissio
Data-Efficient Decentralized Visual SLAM
Decentralized visual simultaneous localization and mapping (SLAM) is a
powerful tool for multi-robot applications in environments where absolute
positioning systems are not available. Being visual, it relies on cameras,
cheap, lightweight and versatile sensors, and being decentralized, it does not
rely on communication to a central ground station. In this work, we integrate
state-of-the-art decentralized SLAM components into a new, complete
decentralized visual SLAM system. To allow for data association and
co-optimization, existing decentralized visual SLAM systems regularly exchange
the full map data between all robots, incurring large data transfers at a
complexity that scales quadratically with the robot count. In contrast, our
method performs efficient data association in two stages: in the first stage a
compact full-image descriptor is deterministically sent to only one robot. In
the second stage, which is only executed if the first stage succeeded, the data
required for relative pose estimation is sent, again to only one robot. Thus,
data association scales linearly with the robot count and uses highly compact
place representations. For optimization, a state-of-the-art decentralized
pose-graph optimization method is used. It exchanges a minimum amount of data
which is linear with trajectory overlap. We characterize the resulting system
and identify bottlenecks in its components. The system is evaluated on publicly
available data and we provide open access to the code.Comment: 8 pages, submitted to ICRA 201
- …