Multi-Modal Obstacle Detection in Unstructured Environments with Conditional Random Fields
Reliable obstacle detection and classification in rough and unstructured
terrain such as agricultural fields or orchards remains a challenging problem.
These environments involve large variations in both geometry and appearance,
challenging perception systems that rely on only a single sensor modality.
Geometrically, tall grass, fallen leaves, or terrain roughness can mistakenly
be perceived as nontraversable or might even obscure actual obstacles.
Likewise, traversable grass or dirt roads and obstacles such as trees and
bushes might be visually ambiguous. In this paper, we combine appearance- and
geometry-based detection methods by probabilistically fusing lidar and camera
sensing with semantic segmentation using a conditional random field. We apply a
state-of-the-art multimodal fusion algorithm from the scene analysis domain and
adjust it for obstacle detection in agriculture with moving ground vehicles.
This involves explicitly handling sparse point cloud data and exploiting
spatial, temporal, and multimodal links between corresponding 2D and 3D
regions. The proposed method was evaluated on a diverse data set, comprising a
dairy paddock and different orchards gathered with a perception research robot
in Australia. Results showed that for a two-class classification problem
(ground and nonground), only the camera benefited from information provided by
the other modality, with an increase in the mean classification score of 0.5%.
However, as more classes were introduced (ground, sky, vegetation, and object),
both modalities complemented each other with improvements of 1.4% in 2D and
7.9% in 3D. Finally, introducing temporal links between successive frames
resulted in improvements of 0.2% in 2D and 1.5% in 3D.
Comment: This is the accepted version of the following article: Kragh M, Underwood J. Multimodal obstacle detection in unstructured environments with conditional random fields. J Field Robotics. 2019, 1-20, which has been published in final form at https://doi.org/10.1002/rob.21866
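The fusion idea is easiest to see as energy minimization over region labels. Below is a minimal, hypothetical Python sketch (all names, weights, and the inference scheme are invented for illustration, not taken from the paper): unary costs from a camera classifier and a lidar classifier are summed per region, Potts pairwise terms penalize label disagreement across spatial/temporal/cross-modal links, and labels are refined with iterated conditional modes.

```python
import numpy as np

def icm_fuse(unary_cam, unary_lidar, links, w_pair=1.0, iters=10):
    """Toy CRF-style fusion.
    unary_*: (N, K) per-region, per-class costs from each modality.
    links: list of (i, j) pairs connecting regions (spatial/temporal/cross-modal)."""
    unary = unary_cam + unary_lidar      # fuse modalities additively
    labels = unary.argmin(axis=1)        # initialize from fused unaries
    N, K = unary.shape
    for _ in range(iters):               # iterated conditional modes
        for i in range(N):
            cost = unary[i].copy()
            for a, b in links:
                j = b if a == i else a if b == i else None
                if j is not None:        # Potts penalty for disagreeing with neighbor
                    cost += w_pair * (np.arange(K) != labels[j])
            labels[i] = cost.argmin()
    return labels

rng = np.random.default_rng(0)
u_cam, u_lidar = rng.random((5, 3)), rng.random((5, 3))
print(icm_fuse(u_cam, u_lidar, links=[(0, 1), (1, 2), (3, 4)]))
```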
Gaussian Processes Semantic Map Representation
In this paper, we develop a high-dimensional map building technique that
incorporates raw pixelated semantic measurements into the map representation.
The proposed technique uses Gaussian Processes (GPs) multi-class classification
for map inference and is the natural extension of GP occupancy maps from binary
to multi-class form. The technique exploits the continuous property of GPs and,
as a result, the map can be inferred with any resolution. In addition, the
proposed GP Semantic Map (GPSM) learns the structural and semantic correlation
from measurements rather than resorting to assumptions, and can flexibly learn
the spatial correlation as well as any additional non-spatial correlation
between map points. We extend the OctoMap to a Semantic OctoMap representation
and compare its mapping performance with the GPSM using the NYU Depth V2 dataset.
Evaluations of the proposed technique on multiple partially labeled RGBD scans
and labels from noisy image segmentation show that the GP semantic map can
handle sparse measurements, missing labels in the point cloud, as well as noise
corrupted labels.
Comment: Accepted for RSS 2017 Workshop on Spatial-Semantic Representations in Robotics
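To illustrate the continuous, resolution-free property the abstract emphasizes, here is a toy sketch (synthetic data, not the paper's GPSM pipeline) using scikit-learn's multi-class GP classifier: sparse labeled locations are fitted once, and class probabilities can then be queried on a grid of any density.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(60, 2))             # sparse measurement locations
y = (X[:, 0] > 5).astype(int) + (X[:, 1] > 5)    # 3 synthetic semantic classes

# Multi-class GP classification; the kernel hyperparameters are arbitrary.
gp = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=2.0))
gp.fit(X, y)

# Continuous map: query at any resolution, e.g. a 50x50 grid.
gx, gy = np.meshgrid(np.linspace(0, 10, 50), np.linspace(0, 10, 50))
grid = np.column_stack([gx.ravel(), gy.ravel()])
probs = gp.predict_proba(grid)                   # (2500, 3) per-class probabilities
print(probs.shape)
```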
Multisource and Multitemporal Data Fusion in Remote Sensing
The sharp and recent increase in the availability of data captured by
different sensors combined with their considerably heterogeneous natures poses
a serious challenge for the effective and efficient processing of remotely
sensed data. Such an increase in remote sensing and ancillary datasets,
however, opens up the possibility of utilizing multimodal datasets in a joint
manner to further improve the performance of the processing approaches with
respect to the application at hand. Multisource data fusion has, therefore,
received enormous attention from researchers worldwide for a wide variety of
applications. Moreover, thanks to the revisit capability of several spaceborne
sensors, the integration of the temporal information with the spatial and/or
spectral/backscattering information of the remotely sensed data is possible and
helps to move from a representation of 2D/3D data to 4D data structures, where
the time variable adds new information as well as challenges for the
information extraction algorithms. There are a huge number of research works
dedicated to multisource and multitemporal data fusion, but the methods for the
fusion of different modalities have developed along different paths within
each research community. This paper brings together the advances of multisource
and multitemporal data fusion approaches with respect to different research
communities and provides a thorough and discipline-specific starting point for
researchers at different levels (i.e., students, researchers, and senior
researchers) willing to conduct novel investigations on this challenging topic
by supplying sufficient detail and references.
Incorporating Human Domain Knowledge in 3D LiDAR-based Semantic Segmentation
This work studies semantic segmentation using 3D LiDAR data. Popular deep
learning methods applied for this task require a large number of manual
annotations to train the parameters. We propose a new method that makes full
use of the advantages of both traditional methods and deep learning methods by
incorporating human domain knowledge into the neural network model to reduce
the demand for large numbers of manual annotations and improve the training
efficiency. We first pretrain a model with autogenerated samples from a
rule-based classifier so that human knowledge can be propagated into the
network. Based on the pretrained model, only a small set of annotations is
required for further fine-tuning. Quantitative experiments show that the
pretrained model achieves better performance than random initialization in
almost all cases; furthermore, our method can achieve similar performance with
fewer manual annotations.
Comment: 8 Pages
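The two-stage scheme reads naturally as pretraining on cheap rule-based labels followed by fine-tuning on a few manual ones. A hypothetical PyTorch sketch follows; the tiny network, the geometric rule, and all tensors are placeholders, not the authors' model.

```python
import torch
import torch.nn as nn

def rule_based_labels(points):
    # Crude illustrative rule: points below z = 0.2 m are "ground" (0),
    # the rest "non-ground" (1); real rule-based classifiers would be richer.
    return (points[:, 2] > 0.2).long()

net = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 2))
loss_fn = nn.CrossEntropyLoss()

def train(points, labels, lr, steps):
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(net(points), labels)
        loss.backward()
        opt.step()

# Stage 1: pretrain on abundant auto-generated labels.
auto_pts = torch.randn(5000, 3)
train(auto_pts, rule_based_labels(auto_pts), lr=1e-3, steps=100)

# Stage 2: fine-tune on a small set of manual annotations.
manual_pts, manual_lbl = torch.randn(200, 3), torch.randint(0, 2, (200,))
train(manual_pts, manual_lbl, lr=1e-4, steps=50)
```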
Real-time Dynamic Object Detection for Autonomous Driving using Prior 3D-Maps
Lidar has become an essential sensor for autonomous driving as it provides
reliable depth estimation. Lidar is also the primary sensor used to build 3D
maps, which can then be used even by low-cost systems that do not carry
Lidar. Computation on Lidar point clouds is intensive, as it requires processing
of millions of points per second. Additionally, there are many subsequent tasks,
such as clustering, detection, tracking, and classification, which make
real-time execution challenging. In this paper, we discuss real-time dynamic
object detection algorithms that leverage previously mapped Lidar point
clouds to reduce processing. The prior 3D maps provide a static background
model and we formulate dynamic object detection as a background subtraction
problem. Computation and modeling challenges in the mapping and online
execution pipeline are described. We propose a rejection cascade architecture
to subtract road regions and other 3D regions separately. We implemented an
initial version of our proposed algorithm and evaluated the accuracy on CARLA
simulator.
Comment: Preprint Submission to ECCVW AutoNUE 2018 - v2 author name accent correction
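The background-subtraction formulation can be sketched very compactly: voxelize the prior map into a hash set and flag live-scan points that fall outside mapped voxels as dynamic. The voxel size and the point clouds below are placeholders, and the authors' rejection cascade is not reproduced.

```python
import numpy as np

VOXEL = 0.2  # voxel edge length in meters (arbitrary choice)

def voxel_keys(points):
    # Hash each point to its integer voxel coordinates.
    return set(map(tuple, np.floor(points / VOXEL).astype(int)))

def dynamic_mask(scan, background_keys):
    # A point is dynamic if its voxel is absent from the static map.
    keys = np.floor(scan / VOXEL).astype(int)
    return np.array([tuple(k) not in background_keys for k in keys])

rng = np.random.default_rng(2)
prior_map = rng.uniform(0, 20, size=(10000, 3))   # static background map
scan = np.vstack([prior_map[:50],                 # re-observed static points
                  rng.uniform(30, 32, size=(5, 3))])  # points off the map
mask = dynamic_mask(scan, voxel_keys(prior_map))
print(mask.sum(), "points flagged as dynamic")    # expect 5
```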
SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences
Semantic scene understanding is important for various applications. In
particular, self-driving cars need a fine-grained understanding of the surfaces
and objects in their vicinity. Light detection and ranging (LiDAR) provides
precise geometric information about the environment and is thus a part of the
sensor suites of almost all self-driving cars. Despite the relevance of
semantic scene understanding for this application, there is a lack of a large
dataset for this task based on an automotive LiDAR.
In this paper, we introduce a large dataset to propel research on laser-based
semantic segmentation. We annotated all sequences of the KITTI Vision Odometry
Benchmark and provide dense point-wise annotations for the complete
field-of-view of the employed automotive LiDAR. We propose three benchmark
tasks based on this dataset: (i) semantic segmentation of point clouds using a
single scan, (ii) semantic segmentation using multiple past scans, and (iii)
semantic scene completion, which requires anticipating the semantic scene in
the future. We provide baseline experiments and show that there is a need for
more sophisticated models to efficiently tackle these tasks. Our dataset opens
the door for the development of more advanced methods, but also provides
plentiful data to investigate new research directions.
Comment: ICCV 2019. See teaser video at http://bit.ly/SemanticKITTI-teaser
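For readers who want to try the benchmark, a minimal loader sketch follows. Each .bin stores float32 (x, y, z, remission) tuples, and each .label packs one uint32 per point, with the semantic class in the lower 16 bits and the instance id in the upper 16 bits. The file paths below are placeholders and assume the dataset has been downloaded.

```python
import numpy as np

def load_scan(bin_path, label_path):
    # Point cloud: N x (x, y, z, remission) as float32.
    points = np.fromfile(bin_path, dtype=np.float32).reshape(-1, 4)
    # Labels: one uint32 per point.
    raw = np.fromfile(label_path, dtype=np.uint32)
    semantic = raw & 0xFFFF   # semantic class id (lower 16 bits)
    instance = raw >> 16      # instance id (upper 16 bits)
    assert len(raw) == len(points)
    return points, semantic, instance

# Example usage with placeholder paths mirroring the KITTI odometry layout.
points, sem, inst = load_scan("sequences/08/velodyne/000000.bin",
                              "sequences/08/labels/000000.label")
```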
RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds
We study the problem of efficient semantic segmentation for large-scale 3D
point clouds. Because they rely on expensive sampling techniques or
computationally heavy pre/post-processing steps, most existing approaches can
only be trained on and operated over small-scale point clouds. In this paper, we introduce
RandLA-Net, an efficient and lightweight neural architecture to directly infer
per-point semantics for large-scale point clouds. The key to our approach is to
use random point sampling instead of more complex point selection approaches.
Although remarkably computation- and memory-efficient, random sampling can
discard key features by chance. To overcome this, we introduce a novel local
feature aggregation module to progressively increase the receptive field for
each 3D point, thereby effectively preserving geometric details. Extensive
experiments show that our RandLA-Net can process 1 million points in a single
pass, up to 200X faster than existing approaches. Moreover, our RandLA-Net
clearly surpasses state-of-the-art approaches for semantic segmentation on two
large-scale benchmarks, Semantic3D and SemanticKITTI.
Comment: CVPR 2020 Oral. Code and data are available at: https://github.com/QingyongHu/RandLA-Net
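The core efficiency argument is that uniform random sampling is O(N) and trivially parallel, in contrast to farthest-point sampling. A minimal numpy sketch of just that sampling step (the local feature aggregation module that compensates for randomly dropped points is not reproduced here):

```python
import numpy as np

def random_sample(points, n_out, rng=None):
    # Uniform random down-sampling: O(N), no pairwise distance computation.
    if rng is None:
        rng = np.random.default_rng()
    idx = rng.choice(len(points), size=n_out, replace=False)
    return points[idx], idx

cloud = np.random.rand(1_000_000, 3).astype(np.float32)
sub, idx = random_sample(cloud, n_out=len(cloud) // 4)
print(sub.shape)   # (250000, 3)
```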
Sparse Bayesian Inference for Dense Semantic Mapping
Despite impressive advances in simultaneous localization and mapping, dense
robotic mapping remains challenging due to its inherent nature of being a
high-dimensional inference problem. In this paper, we propose a dense semantic
robotic mapping technique that exploits sparse Bayesian models, in particular,
the relevance vector machine, for high-dimensional sequential inference. The
technique is based on the principle of automatic relevance determination and
produces sparse models that use a small subset of the original dense training
set as the dominant basis. The resulting map posterior is continuous, and
queries can be made efficiently at any resolution. Moreover, the technique has
probabilistic outputs per semantic class through Bayesian inference. We
evaluate the proposed relevance vector semantic map using publicly available
benchmark datasets, NYU Depth V2 and KITTI, and the results show promising
improvements over state-of-the-art techniques.
Comment: Submitted to ICRA 2018, 8 pages
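scikit-learn ships no relevance vector machine, but its ARDRegression illustrates the automatic relevance determination principle the abstract invokes: most basis weights are pruned toward zero, leaving a small "relevant" subset. A toy sketch on synthetic data (not the paper's mapping pipeline):

```python
import numpy as np
from sklearn.linear_model import ARDRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 50))        # 50 candidate basis functions
w_true = np.zeros(50)
w_true[:3] = [2.0, -1.0, 0.5]         # only 3 are actually relevant
y = X @ w_true + 0.05 * rng.normal(size=200)

# ARD places an individual precision on each weight and prunes most of them.
ard = ARDRegression()
ard.fit(X, y)
print(np.sum(np.abs(ard.coef_) > 1e-2), "of 50 weights kept")  # expect ~3
```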
Machine Learning Techniques and Applications For Ground-based Image Analysis
Ground-based whole sky cameras have opened up new opportunities for
monitoring the earth's atmosphere. These cameras are an important complement to
satellite images by providing geoscientists with cheaper, faster, and more
localized data. The images captured by whole sky imagers can have high spatial
and temporal resolution, which is an important pre-requisite for applications
such as solar energy modeling, cloud attenuation analysis, local weather
prediction, etc.
Extracting valuable information from the huge amount of image data by
detecting and analyzing the various entities in these images is challenging.
However, powerful machine learning techniques have become available to aid with
the image analysis. This article provides a detailed walk-through of recent
developments in these techniques and their applications in ground-based
imaging. We aim to bridge the gap between computer vision and remote sensing
with the help of illustrative examples. We demonstrate the advantages of using
machine learning techniques in ground-based image analysis via three primary
applications -- segmentation, classification, and denoising.
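As a concrete taste of the segmentation application, a classical baseline from the whole-sky imaging literature (not a method introduced in this article) thresholds the red-blue ratio: clear sky is strongly blue, while clouds scatter red and blue more evenly. The image and threshold below are placeholders.

```python
import numpy as np

def cloud_mask(rgb, threshold=0.6):
    # High red/blue ratio -> whitish pixel -> likely cloud.
    r = rgb[..., 0].astype(float)
    b = rgb[..., 2].astype(float)
    ratio = r / (b + 1e-6)   # avoid division by zero
    return ratio > threshold

img = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
print(cloud_mask(img).mean(), "fraction flagged as cloud")
```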
Deep Semantic Segmentation for Automated Driving: Taxonomy, Roadmap and Challenges
Semantic segmentation was seen as a challenging computer vision problem a few
years ago. Due to recent advancements in deep learning, relatively accurate
solutions are now possible for its use in automated driving. In this paper, the
semantic segmentation problem is explored from the perspective of automated
driving. Most of the current semantic segmentation algorithms are designed for
generic images and do not incorporate prior structure and end goal for
automated driving. First, the paper begins with a generic taxonomic survey of
semantic segmentation algorithms and then discusses how they fit in the context
of automated driving. Second, the particular challenges of deploying it in a
safety system that needs a high level of accuracy and robustness are listed.
Third, alternatives to using an independent semantic
segmentation module are explored. Finally, an empirical evaluation of various
semantic segmentation architectures was performed on the CamVid dataset in terms of
accuracy and speed. This paper is a preliminary shorter version of a more
detailed survey, which is a work in progress.
Comment: To appear in IEEE ITSC 2017