Multi-Modal Obstacle Detection in Unstructured Environments with Conditional Random Fields

Author: Kragh Mikkel
Underwood James
Publication venue: 'Wiley'
Publication date: 13/03/2019

Reliable obstacle detection and classification in rough and unstructured terrain such as agricultural fields or orchards remains a challenging problem. These environments involve large variations in both geometry and appearance, challenging perception systems that rely on only a single sensor modality. Geometrically, tall grass, fallen leaves, or terrain roughness can mistakenly be perceived as nontraversable or might even obscure actual obstacles. Likewise, traversable grass or dirt roads and obstacles such as trees and bushes might be visually ambiguous. In this paper, we combine appearance- and geometry-based detection methods by probabilistically fusing lidar and camera sensing with semantic segmentation using a conditional random field. We apply a state-of-the-art multimodal fusion algorithm from the scene analysis domain and adjust it for obstacle detection in agriculture with moving ground vehicles. This involves explicitly handling sparse point cloud data and exploiting spatial, temporal, and multimodal links between corresponding 2D and 3D regions. The proposed method was evaluated on a diverse data set, comprising a dairy paddock and different orchards, gathered with a perception research robot in Australia. Results showed that for a two-class classification problem (ground and nonground), only the camera benefited from information provided by the other modality, with an increase in the mean classification score of 0.5%. However, as more classes were introduced (ground, sky, vegetation, and object), both modalities complemented each other, with improvements of 1.4% in 2D and 7.9% in 3D. Finally, introducing temporal links between successive frames resulted in improvements of 0.2% in 2D and 1.5% in 3D.

Comment: This is the accepted version of the following article: Kragh M, Underwood J. Multimodal obstacle detection in unstructured environments with conditional random fields. J Field Robotics. 2019, 1-20, which has been published in final form at https://doi.org/10.1002/rob.2186

arXiv.org e-Print Archive

Gaussian Processes Semantic Map Representation

Author: Eustice Ryan M.
Gan Lu
Jadidi Maani Ghaffari
Li Jie
Parkison Steven A.
Publication date: 05/07/2017

In this paper, we develop a high-dimensional map building technique that incorporates raw pixelated semantic measurements into the map representation. The proposed technique uses Gaussian process (GP) multi-class classification for map inference and is the natural extension of GP occupancy maps from binary to multi-class form. The technique exploits the continuous property of GPs and, as a result, the map can be inferred at any resolution. In addition, the proposed GP Semantic Map (GPSM) learns the structural and semantic correlation from measurements rather than resorting to assumptions, and can flexibly learn the spatial correlation as well as any additional non-spatial correlation between map points. We extend the OctoMap to a Semantic OctoMap representation and compare it with the GPSM mapping performance using the NYU Depth V2 dataset. Evaluations of the proposed technique on multiple partially labeled RGBD scans and labels from noisy image segmentation show that the GP semantic map can handle sparse measurements, missing labels in the point cloud, and noise-corrupted labels.

Comment: Accepted for RSS 2017 Workshop on Spatial-Semantic Representations in Robotic

arXiv.org e-Print Archive

Multisource and Multitemporal Data Fusion in Remote Sensing

Author: Anders Katharina
Atkinson Peter M.
Benediktsson Jon Atli
Bovolo Francesca
Bruzzone Lorenzo
Chi Mingmin
Ghamisi Pedram
Gloaguen Richard
Hofle Bernhard
Rasti Behnood
Wang Qunming
Yokoya Naoto
Publication date: 19/12/2018

The sharp and recent increase in the availability of data captured by different sensors, combined with their considerably heterogeneous natures, poses a serious challenge for the effective and efficient processing of remotely sensed data. Such an increase in remote sensing and ancillary datasets, however, opens up the possibility of utilizing multimodal datasets in a joint manner to further improve the performance of the processing approaches with respect to the application at hand. Multisource data fusion has, therefore, received enormous attention from researchers worldwide for a wide variety of applications. Moreover, thanks to the revisit capability of several spaceborne sensors, the integration of the temporal information with the spatial and/or spectral/backscattering information of the remotely sensed data is possible and helps to move from a representation of 2D/3D data to 4D data structures, where the time variable adds new information as well as challenges for the information extraction algorithms. A huge number of research works are dedicated to multisource and multitemporal data fusion, but the methods for the fusion of different modalities have developed along different paths in each research community. This paper brings together the advances of multisource and multitemporal data fusion approaches with respect to different research communities and provides a thorough and discipline-specific starting point for researchers at different levels (i.e., students, researchers, and senior researchers) willing to conduct novel investigations on this challenging topic by supplying sufficient detail and references.

arXiv.org e-Print Archive

Incorporating Human Domain Knowledge in 3D LiDAR-based Semantic Segmentation

Author: Mei Jilin
Zhao Huijing
Publication date: 23/05/2019

This work studies semantic segmentation using 3D LiDAR data. Popular deep learning methods applied to this task require a large number of manual annotations to train the parameters. We propose a new method that combines the advantages of traditional methods and deep learning methods by incorporating human domain knowledge into the neural network model, reducing the demand for large numbers of manual annotations and improving training efficiency. We first pretrain a model with autogenerated samples from a rule-based classifier so that human knowledge can be propagated into the network. Based on the pretrained model, only a small set of annotations is required for further fine-tuning. Quantitative experiments show that the pretrained model achieves better performance than random initialization in almost all cases; furthermore, our method can achieve similar performance with fewer manual annotations.

Comment: 8 Page

arXiv.org e-Print Archive

Real-time Dynamic Object Detection for Autonomous Driving using Prior 3D-Maps

Author: Irastorza Benat
Kiran B Ravi
Lepoutre Alexandre
Roldão Luis
Suss Sebastian
Talpaert Victor
Trehard Guillaume
Verastegui Renzo
Yogamani Senthil
Publication date: 05/07/2019

Lidar has become an essential sensor for autonomous driving as it provides reliable depth estimation. Lidar is also the primary sensor used in building 3D maps, which can be used even in low-cost systems that do not use Lidar. Computation on Lidar point clouds is intensive, as it requires processing millions of points per second. Additionally, there are many subsequent tasks such as clustering, detection, tracking, and classification, which make real-time execution challenging. In this paper, we discuss real-time dynamic object detection algorithms that leverage previously mapped Lidar point clouds to reduce processing. The prior 3D maps provide a static background model, and we formulate dynamic object detection as a background subtraction problem. Computation and modeling challenges in the mapping and online execution pipeline are described. We propose a rejection cascade architecture to subtract road regions and other 3D regions separately. We implemented an initial version of our proposed algorithm and evaluated its accuracy on the CARLA simulator.

Comment: Preprint Submission to ECCVW AutoNUE 2018 - v2 author name accent correctio
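
The background subtraction idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the voxel-grid representation, the function names, and the `voxel_size` parameter are all assumptions made for illustration, and the rejection cascade that treats road regions separately is not modeled. The prior map is reduced to a set of occupied voxels, and any point in a new scan that falls outside those voxels is flagged as potentially dynamic.

```python
import math
from typing import Iterable, List, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def voxel_key(p: Point, voxel_size: float) -> Voxel:
    """Return the integer index of the voxel that contains point p."""
    x, y, z = p
    return (
        math.floor(x / voxel_size),
        math.floor(y / voxel_size),
        math.floor(z / voxel_size),
    )


def build_static_map(prior_points: Iterable[Point], voxel_size: float) -> Set[Voxel]:
    """Hypothetical static background model: the set of voxels occupied in the prior map."""
    return {voxel_key(p, voxel_size) for p in prior_points}


def dynamic_points(scan: Iterable[Point], static_map: Set[Voxel], voxel_size: float) -> List[Point]:
    """Keep only the scan points whose voxel is not occupied in the prior map."""
    return [p for p in scan if voxel_key(p, voxel_size) not in static_map]


# Toy data: two mapped background points, one scan point on the background,
# and one scan point in previously empty space.
prior = [(0.1, 0.1, 0.0), (1.1, 0.1, 0.0)]
scan = [(0.2, 0.15, 0.05), (5.0, 5.0, 1.0)]
static_map = build_static_map(prior, voxel_size=0.5)
candidates = dynamic_points(scan, static_map, voxel_size=0.5)
```

Each scan point costs one set lookup, so the per-scan cost grows linearly with the number of points. A real system would also need to handle localization error and map noise, which this sketch ignores.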

arXiv.org e-Print Archive

SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences

Author: Behley Jens
Behnke Sven
Gall Juergen
Garbade Martin
Milioto Andres
Quenzel Jan
Stachniss Cyrill
Publication date: 16/08/2019

Semantic scene understanding is important for various applications. In particular, self-driving cars need a fine-grained understanding of the surfaces and objects in their vicinity. Light detection and ranging (LiDAR) provides precise geometric information about the environment and is thus a part of the sensor suites of almost all self-driving cars. Despite the relevance of semantic scene understanding for this application, there is a lack of a large dataset for this task which is based on an automotive LiDAR. In this paper, we introduce a large dataset to propel research on laser-based semantic segmentation. We annotated all sequences of the KITTI Vision Odometry Benchmark and provide dense point-wise annotations for the complete 360° field-of-view of the employed automotive LiDAR. We propose three benchmark tasks based on this dataset: (i) semantic segmentation of point clouds using a single scan, (ii) semantic segmentation using multiple past scans, and (iii) semantic scene completion, which requires anticipating the semantic scene in the future. We provide baseline experiments and show that there is a need for more sophisticated models to efficiently tackle these tasks. Our dataset opens the door for the development of more advanced methods, but also provides plentiful data to investigate new research directions.

Comment: ICCV2019. See teaser video at http://bit.ly/SemanticKITTI-tease

arXiv.org e-Print Archive

RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds

Author: Guo Yulan
Hu Qingyong
Markham Andrew
Rosa Stefano
Trigoni Niki
Wang Zhihua
Xie Linhai
Yang Bo
Publication date: 01/05/2020

We study the problem of efficient semantic segmentation for large-scale 3D point clouds. By relying on expensive sampling techniques or computationally heavy pre/post-processing steps, most existing approaches can only be trained and operated on small-scale point clouds. In this paper, we introduce RandLA-Net, an efficient and lightweight neural architecture to directly infer per-point semantics for large-scale point clouds. The key to our approach is to use random point sampling instead of more complex point selection approaches. Although remarkably computation and memory efficient, random sampling can discard key features by chance. To overcome this, we introduce a novel local feature aggregation module to progressively increase the receptive field for each 3D point, thereby effectively preserving geometric details. Extensive experiments show that our RandLA-Net can process 1 million points in a single pass up to 200× faster than existing approaches. Moreover, our RandLA-Net clearly surpasses state-of-the-art approaches for semantic segmentation on two large-scale benchmarks, Semantic3D and SemanticKITTI.

Comment: CVPR 2020 Oral. Code and data are available at: https://github.com/QingyongHu/RandLA-Ne
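
As a rough illustration of the trade-off described in this abstract, the sketch below pairs random point sampling with a simple neighborhood summary. It is not RandLA-Net: the function names are hypothetical, and plain mean pooling over the k nearest neighbors stands in for the paper's learned local feature aggregation module, purely to show how information from discarded points can survive in the points that are kept.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def random_downsample(points: Sequence[Point], ratio: int, seed: int = 0) -> List[int]:
    """Return sorted indices of a uniformly random subset of len(points) // ratio points."""
    rng = random.Random(seed)
    keep = max(1, len(points) // ratio)
    return sorted(rng.sample(range(len(points)), keep))


def knn_mean_feature(points: Sequence[Point], features: Sequence[float],
                     query_idx: int, k: int) -> float:
    """Average the features of the k points nearest to points[query_idx].

    Brute-force search, assumed here for clarity; mean pooling is a stand-in
    for a learned aggregation and is an assumption of this sketch.
    """
    q = points[query_idx]
    by_distance = sorted(
        range(len(points)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(points[i], q)),
    )
    nearest = by_distance[:k]
    return sum(features[i] for i in nearest) / len(nearest)


# Toy cloud: eight points on a line, each with a scalar feature equal to its index.
points = [(float(i), 0.0, 0.0) for i in range(8)]
features = [float(i) for i in range(8)]

# Aggregate neighborhood information first, then drop points at random.
kept = random_downsample(points, ratio=4)
pooled = {i: knn_mean_feature(points, features, i, k=3) for i in kept}
```

Random selection needs no distance computations, which is where the efficiency comes from. The aggregation step is what compensates for useful points being dropped by chance.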

arXiv.org e-Print Archive

Sparse Bayesian Inference for Dense Semantic Mapping

Author: Eustice Ryan M.
Gan Lu
Jadidi Maani Ghaffari
Parkison Steven A.
Publication date: 22/09/2017

Despite impressive advances in simultaneous localization and mapping, dense robotic mapping remains challenging because it is inherently a high-dimensional inference problem. In this paper, we propose a dense semantic robotic mapping technique that exploits sparse Bayesian models, in particular the relevance vector machine, for high-dimensional sequential inference. The technique is based on the principle of automatic relevance determination and produces sparse models that use a small subset of the original dense training set as the dominant basis. The resulting map posterior is continuous, and queries can be made efficiently at any resolution. Moreover, the technique has probabilistic outputs per semantic class through Bayesian inference. We evaluate the proposed relevance vector semantic map using the publicly available benchmark datasets NYU Depth V2 and KITTI, and the results show promising improvements over state-of-the-art techniques.

Comment: Submitted to ICRA 2018, 8 page

arXiv.org e-Print Archive

Machine Learning Techniques and Applications For Ground-based Image Analysis

Author: Dev Soumyabrata
Lee Yee Hui
Wen Bihan
Winkler Stefan
Publication date: 08/06/2016

Ground-based whole sky cameras have opened up new opportunities for monitoring the earth's atmosphere. These cameras are an important complement to satellite images, providing geoscientists with cheaper, faster, and more localized data. The images captured by whole sky imagers can have high spatial and temporal resolution, which is an important prerequisite for applications such as solar energy modeling, cloud attenuation analysis, and local weather prediction. Extracting valuable information from the huge amount of image data by detecting and analyzing the various entities in these images is challenging. However, powerful machine learning techniques have become available to aid with the image analysis. This article provides a detailed walk-through of recent developments in these techniques and their applications in ground-based imaging. We aim to bridge the gap between computer vision and remote sensing with the help of illustrative examples. We demonstrate the advantages of using machine learning techniques in ground-based image analysis via three primary applications: segmentation, classification, and denoising.

arXiv.org e-Print Archive

Deep Semantic Segmentation for Automated Driving: Taxonomy, Roadmap and Challenges

Author: Elkerdawy Sara
Jagersand Martin
Siam Mennatullah
Yogamani Senthil
Publication date: 03/08/2017

Semantic segmentation was seen as a challenging computer vision problem a few years ago. Due to recent advancements in deep learning, relatively accurate solutions are now possible for its use in automated driving. In this paper, the semantic segmentation problem is explored from the perspective of automated driving. Most of the current semantic segmentation algorithms are designed for generic images and do not incorporate the prior structure and end goal of automated driving. First, the paper begins with a generic taxonomic survey of semantic segmentation algorithms and then discusses how it fits in the context of automated driving. Second, the particular challenges of deploying it in a safety system, which needs a high level of accuracy and robustness, are listed. Third, alternatives to using an independent semantic segmentation module are explored. Finally, an empirical evaluation of various semantic segmentation architectures was performed on the CamVid dataset in terms of accuracy and speed. This paper is a preliminary, shorter version of a more detailed survey which is a work in progress.

Comment: To appear in IEEE ITSC 201

arXiv.org e-Print Archive