Object Detection and Classification in Occupancy Grid Maps using Deep Convolutional Networks
Detailed environment perception is a crucial component of automated vehicles. However, to handle the amount of perceived information, we also
require segmentation strategies. Based on a grid map environment
representation, well-suited for sensor fusion, free-space estimation and
machine learning, we detect and classify objects using deep convolutional
neural networks. As input for our networks we use a multi-layer grid map
efficiently encoding 3D range sensor information. The inference output consists
of a list of rotated bounding boxes with associated semantic classes. We
conduct extensive ablation studies, highlight important design considerations
when using grid maps and evaluate our models on the KITTI Bird's Eye View
benchmark. Qualitative and quantitative benchmark results show that we achieve
robust detection and state-of-the-art accuracy solely using top-view grid maps from range sensor data.
Comment: 6 pages, 4 tables, 4 figures
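As an illustration of the input encoding, the following is a minimal sketch (not the authors' code) of turning a LiDAR point cloud into a multi-layer top-view grid map. The layer choices (point density, maximum height, mean intensity), the grid extent, and the cell size are illustrative assumptions.

```python
# Minimal sketch: encode a LiDAR point cloud as a multi-layer top-view grid
# map. Layers and parameters are illustrative, not the paper's configuration.
import numpy as np

def encode_grid_map(points, x_range=(0.0, 60.0), y_range=(-30.0, 30.0),
                    cell_size=0.1):
    """points: (N, 4) array of [x, y, z, intensity] in the sensor frame."""
    nx = int((x_range[1] - x_range[0]) / cell_size)
    ny = int((y_range[1] - y_range[0]) / cell_size)

    # Keep only points inside the mapped area.
    mask = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[mask]

    # Discretize metric coordinates to flat cell indices.
    ix = ((pts[:, 0] - x_range[0]) / cell_size).astype(np.int64)
    iy = ((pts[:, 1] - y_range[0]) / cell_size).astype(np.int64)
    flat = ix * ny + iy

    density = np.zeros(nx * ny)
    max_height = np.full(nx * ny, -np.inf)
    intensity_sum = np.zeros(nx * ny)

    np.add.at(density, flat, 1.0)                 # points per cell
    np.maximum.at(max_height, flat, pts[:, 2])    # tallest return per cell
    np.add.at(intensity_sum, flat, pts[:, 3])

    mean_intensity = np.where(density > 0,
                              intensity_sum / np.maximum(density, 1), 0.0)
    max_height[density == 0] = 0.0                # clean empty cells

    # Stack layers channel-wise into a (3, nx, ny) tensor, the CNN input.
    return np.stack([np.log1p(density),
                     max_height,
                     mean_intensity]).reshape(3, nx, ny)
```

Stacking the layers channel-wise yields an image-like tensor, which is what allows standard 2D convolutional detectors to operate directly on range sensor data.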
Learned Enrichment of Top-View Grid Maps Improves Object Detection
We propose an object detector for top-view grid maps that is additionally trained to generate an enriched version of its input. The goal of the joint model is to improve generalization by regularizing towards structural knowledge in the form of a map fused from multiple adjacent range sensor measurements. This training data can be generated automatically and thus requires no manual annotations. We present an evidential framework to generate the training data, investigate different model architectures, and show that predicting enriched inputs as an additional task can improve object detection performance.
Comment: 6 pages, 6 figures, 4 tables
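A minimal PyTorch sketch of the joint-training idea: a shared encoder feeds both a detection head and an auxiliary decoder that reconstructs the enriched (multi-scan) grid map. The architecture, output parameterization, and loss weighting are illustrative assumptions, not the paper's configuration.

```python
# Sketch of joint training: object detection plus auxiliary reconstruction
# of an enriched grid map fused from multiple scans. All sizes are assumed.
import torch
import torch.nn as nn

class JointGridDetector(nn.Module):
    def __init__(self, in_layers=3, n_anchors=2, n_classes=3):
        super().__init__()
        self.encoder = nn.Sequential(                       # shared backbone
            nn.Conv2d(in_layers, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        # Detection head: per-cell class scores and box regression
        # (x, y, w, l, sin yaw, cos yaw) per anchor.
        self.det_head = nn.Conv2d(64, n_anchors * (n_classes + 6), 1)
        # Auxiliary head: predict the enriched (multi-scan) grid map.
        self.enrich_head = nn.Conv2d(64, in_layers, 1)

    def forward(self, grid):
        feat = self.encoder(grid)
        return self.det_head(feat), self.enrich_head(feat)

def joint_loss(det_out, det_target, enriched_pred, enriched_target,
               det_loss_fn, aux_weight=0.5):
    # The reconstruction term regularizes the shared features towards the
    # structure visible in the fused multi-scan map.
    aux = nn.functional.l1_loss(enriched_pred, enriched_target)
    return det_loss_fn(det_out, det_target) + aux_weight * aux
```

Because the auxiliary decoder is only needed to compute the training loss, it can be dropped at inference time, so the regularization adds no runtime cost to the detector.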
SemanticVoxels: Sequential Fusion for 3D Pedestrian Detection using LiDAR Point Cloud and Semantic Segmentation
3D pedestrian detection is a challenging task in automated driving because
pedestrians are relatively small, frequently occluded and easily confused with
narrow vertical objects. LiDAR and camera are two commonly used sensor
modalities for this task, which should provide complementary information.
Unexpectedly, LiDAR-only detection methods tend to outperform multisensor
fusion methods in public benchmarks. Recently, PointPainting has been presented
to eliminate this performance drop by effectively fusing the output of a
semantic segmentation network instead of the raw image information. In this
paper, we propose a generalization of PointPainting that can apply fusion at different levels. After semantically augmenting the point cloud, we encode raw point data in pillars to obtain geometric features and semantic point data in voxels to obtain semantic features, and fuse the two effectively.
Experimental results on the KITTI test set show that SemanticVoxels achieves
state-of-the-art performance in both 3D and bird's eye view pedestrian
detection benchmarks. In particular, our approach demonstrates its strength in
detecting challenging pedestrian cases and outperforms current state-of-the-art
approaches.
Comment: Accepted for presentation at the 2020 IEEE International Conference on Multisensor Fusion and Integration (MFI 2020)
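The semantic augmentation step generalized here follows the PointPainting recipe: project each LiDAR point into the camera image and append the segmentation network's per-class scores at that pixel. The sketch below illustrates this under simplified calibration assumptions; all names are hypothetical.

```python
# Sketch of PointPainting-style semantic augmentation: append per-pixel
# segmentation scores to each projected LiDAR point. Calibration handling
# is simplified and all identifiers are illustrative.
import numpy as np

def paint_points(points, seg_scores, lidar_to_cam, cam_intrinsics):
    """points: (N, 4) [x, y, z, intensity]; seg_scores: (C, H, W) softmax map;
    lidar_to_cam: (4, 4) extrinsics; cam_intrinsics: (3, 3)."""
    C, H, W = seg_scores.shape

    # Transform points into the camera frame (homogeneous coordinates).
    xyz1 = np.concatenate([points[:, :3], np.ones((len(points), 1))], axis=1)
    cam = (lidar_to_cam @ xyz1.T)[:3]

    # Project onto the image plane, guarding against division by zero.
    uvw = cam_intrinsics @ cam
    z = np.where(uvw[2] > 1e-6, uvw[2], 1.0)
    u = (uvw[0] / z).astype(np.int64)
    v = (uvw[1] / z).astype(np.int64)
    valid = (uvw[2] > 1e-6) & (u >= 0) & (u < W) & (v >= 0) & (v < H)

    # Concatenate the C class scores to each visible point.
    painted = np.zeros((len(points), 4 + C), dtype=points.dtype)
    painted[:, :4] = points
    painted[valid, 4:] = seg_scores[:, v[valid], u[valid]].T
    return painted
```

In the fusion scheme described above, the geometric part of each painted point would feed the pillar encoder while the appended semantic scores feed the voxel encoder.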
Semantic evidential grid mapping using monocular and stereo cameras
Accurately estimating the current state of local traffic scenes is one of the key problems in the development of software components for automated vehicles. In addition to free space and drivability, the desired representation may also include static and dynamic traffic participants and semantic information. Multi-layer grid maps allow all of this information to be combined in a common representation. However, most existing grid mapping approaches only process range sensor measurements such as Lidar and Radar, and solely model occupancy without semantic states. To add sensor redundancy and diversity, it is desirable to integrate vision-based sensor setups into a common grid map representation. In this work, we present a semantic evidential grid mapping pipeline, including estimates for eight semantic classes, that is designed for straightforward fusion with range sensor data. Unlike other publications, our representation explicitly models uncertainties in the evidential model. We present results of our grid mapping pipeline based on a monocular vision setup and a stereo vision setup. Our maps are accurate and dense due to the incorporation of a disparity- or depth-based ground surface estimation in the inverse perspective mapping. We conclude with a detailed quantitative evaluation on real traffic scenarios from the KITTI odometry benchmark dataset, demonstrating the advantages over other semantic grid mapping approaches.
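As a worked illustration of the evidential model, the sketch below fuses two per-cell semantic mass vectors with Dempster's rule, assuming singleton class hypotheses plus an "unknown" mass on the full frame of discernment. The class set and the exact update rule are assumptions based on the abstract, not the paper's formulation.

```python
# Sketch of evidential fusion for one grid cell via Dempster's rule,
# assuming singleton semantic classes plus an "unknown" mass on Theta.
import numpy as np

def fuse_cell(m1, m2):
    """m1, m2: (C + 1,) mass vectors; entries 0..C-1 are singleton classes,
    entry C is the mass on the unknown set Theta. Each sums to 1."""
    s1, u1 = m1[:-1], m1[-1]
    s2, u2 = m2[:-1], m2[-1]
    # Conflict: total mass assigned to disjoint singleton pairs.
    conflict = s1.sum() * s2.sum() - (s1 * s2).sum()
    norm = 1.0 - conflict
    fused = np.empty_like(m1)
    # A singleton survives if both sources agree on it, or one source
    # asserts it while the other remains uncertain.
    fused[:-1] = (s1 * s2 + s1 * u2 + u1 * s2) / norm
    fused[-1] = (u1 * u2) / norm
    return fused

# Example: a confident camera measurement combined with a mostly uncertain
# second measurement sharpens the "road" belief and shrinks the unknown mass.
m_cam  = np.array([0.7, 0.1, 0.0, 0.2])   # [road, sidewalk, vehicle, unknown]
m_next = np.array([0.3, 0.0, 0.0, 0.7])
print(fuse_cell(m_cam, m_next))
```

Keeping an explicit unknown mass per cell is what lets the map distinguish "no evidence yet" from "conflicting evidence", which a plain per-class probability grid cannot express.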