The Right (Angled) Perspective: Improving the Understanding of Road Scenes Using Boosted Inverse Perspective Mapping
Many tasks performed by autonomous vehicles such as road marking detection,
object tracking, and path planning are simpler in bird's-eye view. Hence,
Inverse Perspective Mapping (IPM) is often applied to remove the perspective
effect from a vehicle's front-facing camera and to remap its images into a 2D
domain, resulting in a top-down view. Unfortunately, this leads to unnatural
blurring and stretching of objects at greater distances, due to the limited
resolution of the camera, which limits applicability. In this paper, we present an
adversarial learning approach for generating a significantly improved IPM from
a single camera image in real time. The generated bird's-eye-view images
contain sharper features (e.g. road markings) and a more homogeneous
illumination, while (dynamic) objects are automatically removed from the scene,
thus revealing the underlying road layout in an improved fashion. We
demonstrate our framework using real-world data from the Oxford RobotCar
Dataset and show that scene understanding tasks directly benefit from our
boosted IPM approach.
Comment: equal contribution of first two authors, 8 full pages, 6 figures,
accepted at IV 201
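As a rough illustration of the classical IPM geometry mentioned in the abstract above (not the adversarial method the paper proposes), the following Python sketch maps pixels onto a flat ground plane for a pinhole camera. The camera model, the function names, and every parameter (focal length `f`, principal point `cx`, `cy`, camera height `h`, downward `pitch`) are assumptions chosen for illustration. It also shows why distant regions look stretched: one image row near the horizon covers far more ground than a row close to the vehicle.

```python
import math

def ground_to_pixel(X, Z, f, cx, cy, h, pitch):
    """Project a ground point (X right, Z forward, in metres) into the image.

    Assumes a pinhole camera mounted h metres above a flat road and
    pitched down by `pitch` radians, with no roll or yaw.
    """
    xc = X
    yc = h * math.cos(pitch) - Z * math.sin(pitch)
    zc = h * math.sin(pitch) + Z * math.cos(pitch)
    return cx + f * xc / zc, cy + f * yc / zc

def pixel_to_ground(u, v, f, cx, cy, h, pitch):
    """Inverse mapping: intersect the pixel's viewing ray with the road plane.

    Returns (X, Z) in metres, or None for pixels at or above the horizon.
    """
    dx = (u - cx) / f
    dy = (v - cy) / f
    denom = dy * math.cos(pitch) + math.sin(pitch)
    if denom <= 0:
        return None  # the ray never reaches the ground
    t = h / denom
    return t * dx, t * (math.cos(pitch) - dy * math.sin(pitch))

def row_footprint(v, f, cy, h, pitch):
    """Forward distance (metres) covered between image rows v and v + 1."""
    far = pixel_to_ground(0.0, v, f, 0.0, cy, h, pitch)
    near = pixel_to_ground(0.0, v + 1, f, 0.0, cy, h, pitch)
    if far is None or near is None:
        return None
    return far[1] - near[1]

if __name__ == "__main__":
    # Hypothetical camera: f = 500 px, 1.5 m above the road, no pitch.
    for row in (250, 300, 400):
        print(row, row_footprint(row, 500.0, 240.0, 1.5, 0.0))
```

Under these assumptions, rows just below the horizon each span several metres of road while rows near the bottom of the image span only centimetres, which is the resolution limit the abstract refers to.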
Semantic Instance Annotation of Street Scenes by 3D to 2D Label Transfer
Semantic annotations are vital for training models for object recognition,
semantic segmentation or scene understanding. Unfortunately, pixelwise
annotation of images at very large scale is labor-intensive, and little
labeled data is available, particularly at instance level and for street
scenes. In this paper, we propose to tackle this problem by lifting the
semantic instance labeling task from 2D into 3D. Given reconstructions from
stereo or laser data, we annotate static 3D scene elements with rough bounding
primitives and develop a model which transfers this information into the image
domain. We leverage our method to obtain 2D labels for a novel suburban video
dataset which we have collected, resulting in 400k semantic and instance image
annotations. A comparison of our method to state-of-the-art label transfer
baselines reveals that 3D information enables more efficient annotation while
at the same time resulting in improved accuracy and time-coherent labels.
Comment: 10 pages in Conference on Computer Vision and Pattern Recognition
(CVPR), 201
Urban accessibility diagnosis from mobile laser scanning data
In this paper we present an approach for automatic analysis of urban accessibility using 3D point clouds. Our approach is based on range images and consists of two main steps: urban object segmentation and curb detection. Both are required for accessibility diagnosis and itinerary planning. Our method automatically segments facades and urban objects using two hypotheses: facades are the highest vertical structures in the scene, and objects are bumps on the ground in the range image. The segmentation result is used to build an urban obstacle map. After that, the gradient is computed on the ground range image. Curb candidates are selected using height and geodesic features. Then, nearby curbs are reconnected using Bézier curves. Finally, accessibility is defined based on geometrical features and accessibility standards. Our methodology is tested on two MLS databases from Paris (France) and Enschede (The Netherlands). Our experiments show that our method has good detection rates, is fast and presents few false alarms. Our method outperforms other works reported in the literature on the same databases.
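The abstract above does not say how the Bézier curves are parameterised, so the following Python sketch is only one plausible way to reconnect two nearby curb segments. The function names, the use of a cubic curve, and the `tension` heuristic for placing the inner control points are all assumptions made for illustration.

```python
import math

def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a 2D cubic Bezier curve at t in [0, 1] (Bernstein form)."""
    s = 1.0 - t
    b0, b1, b2, b3 = s ** 3, 3 * s * s * t, 3 * s * t * t, t ** 3
    return (
        b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0],
        b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1],
    )

def bridge_curb_gap(end_a, dir_a, start_b, dir_b, n=10, tension=1.0 / 3.0):
    """Sample n points on a cubic Bezier joining curb A to curb B.

    end_a   -- last point of curb A
    dir_a   -- unit tangent of A at end_a, pointing into the gap
    start_b -- first point of curb B
    dir_b   -- unit tangent of B at start_b, pointing along B (away from the gap)
    tension -- hypothetical knob: control-point offset as a fraction of the gap
    """
    gap = math.hypot(start_b[0] - end_a[0], start_b[1] - end_a[1])
    offset = tension * gap
    p1 = (end_a[0] + offset * dir_a[0], end_a[1] + offset * dir_a[1])
    p2 = (start_b[0] - offset * dir_b[0], start_b[1] - offset * dir_b[1])
    return [cubic_bezier(end_a, p1, p2, start_b, i / (n - 1)) for i in range(n)]

if __name__ == "__main__":
    # Curb A ends heading east; curb B starts 3 m away heading north-east.
    d = math.sqrt(0.5)
    for point in bridge_curb_gap((0.0, 0.0), (1.0, 0.0), (3.0, 1.0), (d, d)):
        print(point)
```

Placing the inner control points along the segment tangents makes the bridge leave curb A and enter curb B without a kink, which is the usual reason for choosing a Bézier curve over a straight line.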
Online Inference and Detection of Curbs in Partially Occluded Scenes with Sparse LIDAR
Road boundaries, or curbs, provide autonomous vehicles with essential
information when interpreting road scenes and generating behaviour plans.
Although curbs convey important information, they are difficult to detect in
complex urban environments (in particular in comparison to other elements of
the road such as traffic signs and road markings). These difficulties arise
from occlusions by other traffic participants as well as changing lighting
and/or weather conditions. Moreover, road boundaries have various shapes,
colours and structures while motion planning algorithms require accurate and
precise metric information in real-time to generate their plans.
In this paper, we present a real-time LIDAR-based approach for accurate curb
detection around the vehicle (360 degrees). Our approach deals with both
occlusions from traffic and changing environmental conditions. To this end, we
project 3D LIDAR point cloud data into 2D bird's-eye view images (akin to
Inverse Perspective Mapping). These images are then processed by trained deep
networks to infer both visible and occluded road boundaries. Finally, a
post-processing step filters detected curb segments and tracks them over time.
Experimental results demonstrate the effectiveness of the proposed approach on
real-world driving data. Hence, we believe that our LIDAR-based approach
provides an efficient and effective way to detect visible and occluded curbs
around the vehicles in challenging driving scenarios.
Comment: Accepted at the 22nd IEEE Intelligent Transportation Systems
Conference (ITSC19), October, 2019, Auckland, New Zealan
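The projection step described in the abstract above can be sketched in Python as a simple rasterisation of LIDAR points into a top-down grid. This is an assumed minimal version, not the authors' pipeline: the function name, the choice of a max-height channel, the grid extent, the cell size, and the axis convention (x forward, y left) are all illustrative.

```python
import math

def lidar_to_bev(points, extent=20.0, cell=0.5):
    """Rasterise (x, y, z) points into a square bird's-eye-view height grid.

    Assumed vehicle frame: x forward, y left, z up, origin at the sensor.
    The grid covers [-extent, extent) metres on both axes. Each cell holds
    the maximum z of the points falling into it, or None if it is empty.
    Row 0 is the far front of the vehicle and column 0 is its far left.
    """
    size = int(round(2 * extent / cell))
    grid = [[None] * size for _ in range(size)]
    for x, y, z in points:
        if not (-extent <= x < extent and -extent <= y < extent):
            continue  # outside the 360-degree window around the vehicle
        ix = min(int(math.floor((x + extent) / cell)), size - 1)
        iy = min(int(math.floor((y + extent) / cell)), size - 1)
        row, col = size - 1 - ix, size - 1 - iy
        if grid[row][col] is None or z > grid[row][col]:
            grid[row][col] = z
    return grid

if __name__ == "__main__":
    # Three made-up returns: two close together, one beyond the grid.
    cloud = [(1.0, 0.0, 0.2), (1.1, 0.1, 0.5), (100.0, 0.0, 1.0)]
    bev = lidar_to_bev(cloud)
    filled = sum(1 for row in bev for value in row if value is not None)
    print(len(bev), filled)
```

A grid like this can be stored as an image and passed to a convolutional network; in practice additional channels (for example point density or intensity) are often stacked alongside the height channel.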
The Cityscapes Dataset for Semantic Urban Scene Understanding
Visual understanding of complex urban street scenes is an enabling factor for
a wide range of applications. Object detection has benefited enormously from
large-scale datasets, especially in the context of deep learning. For semantic
urban scene understanding, however, no current dataset adequately captures the
complexity of real-world urban scenes.
To address this, we introduce Cityscapes, a benchmark suite and large-scale
dataset to train and test approaches for pixel-level and instance-level
semantic labeling. Cityscapes comprises a large, diverse set of stereo
video sequences recorded in streets from 50 different cities. 5000 of these
images have high quality pixel-level annotations; 20000 additional images have
coarse annotations to enable methods that leverage large volumes of
weakly-labeled data. Crucially, our effort exceeds previous attempts in terms
of dataset size, annotation richness, scene variability, and complexity. Our
accompanying empirical study provides an in-depth analysis of the dataset
characteristics, as well as a performance evaluation of several
state-of-the-art approaches based on our benchmark.
Comment: Includes supplemental materia