All Weather Perception: Joint Data Association, Tracking, and Classification for Autonomous Ground Vehicles
A novel probabilistic perception algorithm is presented as a real-time joint
solution to data association, object tracking, and object classification for an
autonomous ground vehicle in all-weather conditions. The presented algorithm
extends a Rao-Blackwellized Particle Filter originally built with a particle
filter for data association and a Kalman filter for multi-object tracking
(Miller et al. 2011a) to now also include multiple model tracking for
classification. Additionally, a state-of-the-art vision detection algorithm that
includes heading information for autonomous ground vehicle (AGV) applications
was implemented. Cornell's AGV from the DARPA Urban Challenge was upgraded and
used to experimentally examine if and how state-of-the-art vision algorithms
can complement or replace lidar and radar sensors. Sensor and algorithm
performance in adverse weather and lighting conditions is tested. Experimental
evaluation demonstrates robust all-weather data association, tracking, and
classification where camera, lidar, and radar sensors complement each other
inside the joint probabilistic perception algorithm.
Comment: 35 pages, 21 figures, 14 tables
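As a rough illustration of the filter structure described above, the following Python sketch shows a Rao-Blackwellized particle filter in which each particle samples a data association while the object state conditioned on it is tracked analytically by a Kalman filter. The single-object setup, the models F, Q, H, R, and the clutter likelihood are simplifying assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical Rao-Blackwellized particle filter sketch for one tracked
# object with clutter: the discrete association is sampled per particle,
# the continuous state is updated analytically (Kalman filter).

def kalman_update(mean, cov, z, H, R):
    """Measurement update; also returns the Gaussian likelihood of z."""
    innov = z - H @ mean
    S = H @ cov @ H.T + R
    K = cov @ H.T @ np.linalg.inv(S)
    lik = np.exp(-0.5 * innov @ np.linalg.solve(S, innov)) \
        / np.sqrt((2 * np.pi) ** len(z) * np.linalg.det(S))
    return mean + K @ innov, (np.eye(len(mean)) - K @ H) @ cov, lik

def rbpf_step(particles, measurements, F, Q, H, R, clutter_lik=1e-3):
    """One step: Kalman predict, sample association, weight, resample."""
    for p in particles:                               # p: dict(mean, cov, w)
        p['mean'] = F @ p['mean']                     # analytic prediction
        p['cov'] = F @ p['cov'] @ F.T + Q
        j = np.random.randint(len(measurements) + 1)  # sampled association
        if j < len(measurements):                     # measurement j is ours
            p['mean'], p['cov'], lik = kalman_update(
                p['mean'], p['cov'], measurements[j], H, R)
            p['w'] *= lik
        else:                                         # everything was clutter
            p['w'] *= clutter_lik
    w = np.array([p['w'] for p in particles])
    w /= w.sum()
    idx = np.random.choice(len(particles), size=len(particles), p=w)
    return [dict(particles[i], w=1.0) for i in idx]   # resampled set
```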
Fast image-based obstacle detection from unmanned surface vehicles
Obstacle detection plays an important role in unmanned surface vehicles
(USV). The USVs operate in highly diverse environments in which an obstacle may
be a floating piece of wood, a scuba diver, a pier, or a part of a shoreline,
which presents a significant challenge to continuous detection from images
taken onboard. This paper addresses the problem of online detection by
constrained unsupervised segmentation. To this end, a new graphical model is
proposed that affords a fast and continuous obstacle image-map estimation from
a single video stream captured onboard a USV. The model accounts for the
semantic structure of marine environment as observed from USV by imposing weak
structural constraints. A Markov random field framework is adopted and a highly
efficient algorithm for simultaneous optimization of model parameters and
segmentation mask estimation is derived. Our approach does not require
computationally intensive extraction of texture features and comfortably runs
in real-time. The algorithm is tested on a new, challenging, dataset for
segmentation and obstacle detection in marine environments, which is the
largest annotated dataset of its kind. Results on this dataset show that our
model outperforms the related approaches, while requiring a fraction of
the computational effort.
Comment: This is an extended version of the ACCV2014 paper [Kristan et al., 2014] submitted to a journal. [Kristan et al., 2014] M. Kristan, J. Pers, V. Sulic, S. Kovacic, "A graphical model for rapid obstacle image-map estimation from unmanned surface vehicles," in Proc. Asian Conf. Computer Vision, 2014.
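As a simplified stand-in for the constrained segmentation model described above (the paper's actual formulation is a graphical model with weak structural priors, optimized jointly with the segmentation mask), the sketch below fits a plain three-component Gaussian mixture over per-pixel color and vertical position to recover a sky/shore/water layout; everything here is illustrative.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Simplified, illustrative stand-in for the constrained model: a
# three-component Gaussian mixture over [r, g, b, y] per-pixel features,
# where the normalized row coordinate y supplies the weak vertical-structure
# cue separating sky, shore, and water.

def obstacle_map(img):
    """img: HxWx3 float array in [0, 1]; returns a boolean obstacle mask."""
    h, w, _ = img.shape
    ys = np.repeat(np.linspace(0.0, 1.0, h), w)[:, None]
    feats = np.concatenate([img.reshape(-1, 3), ys], axis=1)
    gmm = GaussianMixture(n_components=3, covariance_type='full',
                          max_iter=20).fit(feats)
    labels = gmm.predict(feats).reshape(h, w)
    water = np.argmax(gmm.means_[:, 3])   # component lowest in the image
    # Pixels in the lower half not explained by the water component are
    # obstacle candidates.
    mask = labels != water
    mask[: h // 2] = False
    return mask
```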
CARRADA Dataset: Camera and Automotive Radar with Range-Angle-Doppler Annotations
High quality perception is essential for autonomous driving (AD) systems. To
reach the accuracy and robustness that are required by such systems, several
types of sensors must be combined. Currently, mostly cameras and laser scanners
(lidar) are deployed to build a representation of the world around the vehicle.
While radar sensors have been used for a long time in the automotive industry,
they are still under-used for AD despite their appealing characteristics
(notably, their ability to measure the relative speed of obstacles and to
operate even in adverse weather conditions). To a large extent, this situation
is due to the relative lack of automotive datasets with real radar signals that
are both raw and annotated. In this work, we introduce CARRADA, a dataset of
synchronized camera and radar recordings with range-angle-Doppler annotations.
We also present a semi-automatic annotation approach, which was used to
annotate the dataset, and a radar semantic segmentation baseline, which we
evaluate on several metrics. Both our code and dataset are available online.
Comment: 8 pages, 5 figures. Accepted at ICPR 2020. Erratum: results in Table III have been updated since the ICPR proceedings; models are selected using the PP metric instead of the previously used PR metric.
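As a hypothetical illustration of one metric such a radar semantic-segmentation baseline might be evaluated with (the paper's exact metric set is not reproduced here), a minimal mean-IoU computation over predicted and annotated label maps:

```python
import numpy as np

# Illustrative evaluation helper (not the CARRADA reference code): mean
# intersection-over-union between predicted and annotated label maps,
# e.g. over range-Doppler views.

def mean_iou(pred, gt, num_classes):
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:                 # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))
```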
Machine Vision System for 3D Plant Phenotyping
Machine vision for plant phenotyping is an emerging research area for
producing high throughput in agriculture and crop science applications. Since 2D-based approaches have inherent limitations, 3D plant analysis is becoming the state of the art in current phenotyping technologies. We present an
automated system for analyzing plant growth in indoor conditions. A gantry
robot system is used to perform scanning tasks in an automated manner
throughout the lifetime of the plant. A 3D laser scanner mounted as the robot's
payload captures the surface point cloud data of the plant from multiple views.
The plant is monitored from the vegetative to reproductive stages in light/dark
cycles inside a controllable growth chamber. An efficient 3D reconstruction
algorithm is used, by which multiple scans are aligned together to obtain a 3D
mesh of the plant, followed by surface area and volume computations. The whole
system, including the programmable growth chamber, robot, scanner, data
transfer and analysis is fully automated in such a way that a naive user can,
in theory, start the system with a mouse click and get back the growth analysis
results at the end of the lifetime of the plant with no intermediate
intervention. As evidence of its functionality, we show and analyze
quantitative results of the rhythmic growth patterns of the dicot Arabidopsis thaliana (L.) and the monocot barley (Hordeum vulgare L.) under their diurnal light/dark cycles.
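The two mesh measurements the pipeline reports, surface area and volume, are standard computations on a triangle mesh. A minimal NumPy sketch (the volume formula assumes a closed, consistently oriented mesh):

```python
import numpy as np

# Sketch of the two reported mesh measurements. V: (n, 3) vertex array,
# F: (m, 3) triangle index array. Volume uses the signed-tetrahedron
# (divergence theorem) formula, valid for closed, oriented meshes.

def surface_area(V, F):
    a, b, c = V[F[:, 0]], V[F[:, 1]], V[F[:, 2]]
    return 0.5 * np.linalg.norm(np.cross(b - a, c - a), axis=1).sum()

def mesh_volume(V, F):
    a, b, c = V[F[:, 0]], V[F[:, 1]], V[F[:, 2]]
    return abs(np.einsum('ij,ij->i', a, np.cross(b, c)).sum()) / 6.0
```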
Dynamic Environment Prediction in Urban Scenes using Recurrent Representation Learning
A key challenge for autonomous driving is safe trajectory planning in
cluttered, urban environments with dynamic obstacles, such as pedestrians,
bicyclists, and other vehicles. A reliable prediction of the future
environment, including the behavior of dynamic agents, would allow planning
algorithms to proactively generate a trajectory in response to a rapidly
changing environment. We present a novel framework that predicts the future
occupancy state of the local environment surrounding an autonomous agent by
learning a motion model from occupancy grid data using a neural network. We
take advantage of the temporal structure of the grid data by utilizing a convolutional long short-term memory network in the form of the PredNet architecture. This method is validated on the KITTI dataset and demonstrates
higher accuracy and better predictive power than baseline methods.
Comment: 8 pages, updated final draft, accepted into Intelligent Transportation Systems Conference (ITSC) 2019.
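The core recurrence behind this kind of grid prediction is a convolutional LSTM. The following minimal PyTorch cell is an illustrative sketch only; the paper's actual network is the multi-layer PredNet architecture built on such a recurrence.

```python
import torch
import torch.nn as nn

# Minimal ConvLSTM cell (illustrative; PredNet stacks several layers of
# this kind of recurrence with prediction-error feedforward connections).

class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.hid_ch = hid_ch
        # One convolution produces all four gates at once.
        self.conv = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.conv(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)

    def init_state(self, b, h, w, device=None):
        z = torch.zeros(b, self.hid_ch, h, w, device=device)
        return z, z

# Usage: feed occupancy grids (B, 1, H, W) step by step, then decode the
# hidden state to a predicted next grid with a 1x1 convolution.
```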
4D Generic Video Object Proposals
Many high-level video understanding methods require input in the form of
object proposals. Currently, such proposals are predominantly generated with
the help of networks that were trained for detecting and segmenting a set of
known object classes, which limits their applicability to cases where all
objects of interest are represented in the training set. This is a restriction
for automotive scenarios, where unknown objects can frequently occur. We
propose an approach that can reliably extract spatio-temporal object proposals
for both known and unknown object categories from stereo video. Our 4D Generic
Video Tubes (4D-GVT) method leverages motion cues, stereo data, and object
instance segmentation to compute a compact set of video-object proposals that
precisely localizes object candidates and their contours in 3D space and time.
We show that given only a small amount of labeled data, our 4D-GVT proposal
generator generalizes well to real-world scenarios, in which unknown categories
appear. It outperforms other approaches that try to detect as many objects as
possible by increasing the number of classes in the training set to several
thousand.
Comment: ICRA 2020.
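As a loose illustration of how per-frame proposals can be linked into spatio-temporal tubes (4D-GVT itself additionally exploits motion, stereo, and instance-segmentation cues), a greedy IoU-based linking skeleton in Python:

```python
# Greedy IoU-based linking of per-frame box proposals into tubes
# (illustrative skeleton only, not the 4D-GVT scoring procedure).

def iou(a, b):  # boxes as (x1, y1, x2, y2)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def link_tubes(frames, thr=0.5):
    """frames: list of per-frame box lists; returns lists of linked boxes."""
    tubes = [[b] for b in frames[0]]
    for boxes in frames[1:]:
        taken = set()
        for tube in tubes:
            best = max(((iou(tube[-1], b), j) for j, b in enumerate(boxes)
                        if j not in taken), default=(0.0, None))
            if best[0] >= thr:          # extend tube with the best match
                tube.append(boxes[best[1]])
                taken.add(best[1])
        tubes += [[b] for j, b in enumerate(boxes) if j not in taken]
    return tubes
```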
Self-Driving Cars: A Survey
We survey research on self-driving cars published in the literature focusing
on autonomous cars developed since the DARPA challenges, which are equipped
with an autonomy system that can be categorized as SAE level 3 or higher. The
architecture of the autonomy system of self-driving cars is typically organized
into the perception system and the decision-making system. The perception
system is generally divided into many subsystems responsible for tasks such as
self-driving-car localization, static obstacles mapping, moving obstacles
detection and tracking, road mapping, traffic signalization detection and
recognition, among others. The decision-making system is commonly partitioned
as well into many subsystems responsible for tasks such as route planning, path
planning, behavior selection, motion planning, and control. In this survey, we
present the typical architecture of the autonomy system of self-driving cars.
We also review research on relevant methods for perception and decision making.
Furthermore, we present a detailed description of the architecture of the
autonomy system of the self-driving car developed at the Universidade Federal do Espírito Santo (UFES), named Intelligent Autonomous Robotics Automobile (IARA). Finally, we list prominent self-driving car research platforms developed by academia and technology companies, and reported in the media.
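A minimal skeleton of the typical modular architecture the survey describes, with a perception system producing a world model that the decision-making system consumes; every name and field below is a hypothetical placeholder, not code from IARA.

```python
from dataclasses import dataclass, field

# Hypothetical skeleton of the typical self-driving architecture.

@dataclass
class WorldModel:
    pose: tuple = (0.0, 0.0, 0.0)                         # localization
    static_obstacles: list = field(default_factory=list)  # static mapping
    moving_obstacles: list = field(default_factory=list)  # detection/tracking
    traffic_signals: list = field(default_factory=list)   # signalization

class SelfDrivingPipeline:
    def perceive(self, sensor_data) -> WorldModel:
        # localization, mapping, tracking, signalization recognition
        return WorldModel()

    def decide(self, world: WorldModel) -> dict:
        route = []                       # route planning
        path = route                     # path planning
        behavior = None                  # behavior selection
        trajectory = path                # motion planning
        return {"steering": 0.0, "throttle": 0.0, "brake": 0.0}  # control

    def step(self, sensor_data) -> dict:
        return self.decide(self.perceive(sensor_data))
```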
End-to-End Tracking and Semantic Segmentation Using Recurrent Neural Networks
In this work we present a novel end-to-end framework for tracking and
classifying a robot's surroundings in complex, dynamic and only partially
observable real-world environments. The approach deploys a recurrent neural
network to filter an input stream of raw laser measurements in order to
directly infer object locations, along with their identity in both visible and
occluded areas. To achieve this we first train the network using unsupervised
Deep Tracking, a recently proposed theoretical framework for end-to-end space
occupancy prediction. We show that by learning to track on a large amount of
unsupervised data, the network creates a rich internal representation of its
environment which we in turn exploit through the principle of inductive
transfer of knowledge to perform the task of its semantic classification. As a
result, we show that only a small amount of labelled data suffices to steer the
network towards mastering this additional task. Furthermore, we propose a novel
recurrent neural network architecture specifically tailored to tracking and
semantic classification in real-world robotics applications. We demonstrate the
tracking and classification performance of the method on real-world data
collected at a busy road junction. Our evaluation shows that the proposed
end-to-end framework compares favourably to a state-of-the-art, model-free
tracking solution and that it outperforms a conventional one-shot training
scheme for semantic classification.
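The two-stage recipe, unsupervised pretraining on occupancy prediction followed by fine-tuning a small classification head on few labels, can be sketched as follows. The GRU recurrence, the flattened-grid input, and all sizes are assumptions for illustration, not the authors' architecture.

```python
import torch
import torch.nn as nn

# Hedged sketch of the two-stage transfer recipe (sizes and recurrence are
# assumptions): pretrain on occupancy prediction without labels, then
# fine-tune a small classification head with few labels.

class DeepTrackerSketch(nn.Module):
    def __init__(self, grid=32 * 32, hid=256, n_classes=3):
        super().__init__()
        self.rnn = nn.GRUCell(grid, hid)
        self.occ_head = nn.Linear(hid, grid)       # stage 1: next occupancy
        self.cls_head = nn.Linear(hid, n_classes)  # stage 2: semantic labels

    def forward(self, frames, h=None):             # frames: (T, B, grid)
        for x in frames:
            h = self.rnn(x, h)
        return self.occ_head(h), self.cls_head(h)

# Stage 1 (unsupervised): minimize BCEWithLogitsLoss between the occupancy
# prediction and the next observed frame. Stage 2 (few labels): freeze
# self.rnn and self.occ_head, train cls_head with CrossEntropyLoss.
```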
Detection and Tracking of General Movable Objects in Large 3D Maps
This paper studies the problem of detection and tracking of general objects
with long-term dynamics, observed by a mobile robot moving in a large
environment. A key problem is that due to the environment scale, it can only
observe a subset of the objects at any given time. Since some time passes
between observations of objects in different places, the objects might be moved
when the robot is not there. We propose a model for this movement in which the
objects typically only move locally, but with some small probability they jump
longer distances, through what we call global motion. For filtering, we
decompose the posterior over local and global movements into two linked
processes. The posterior over the global movements and measurement associations
is sampled, while we track the local movement analytically using Kalman
filters. This novel filter is evaluated on point cloud data gathered
autonomously by a mobile robot over an extended period of time. We show that
tracking jumping objects is feasible, and that the proposed probabilistic
treatment outperforms previous methods when applied to real world data. The key
to efficient probabilistic tracking in this scenario is focused sampling of the
object posteriors.
Comment: Submitted for peer review.
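The movement model itself is easy to illustrate: with small probability an object jumps globally, after which it could be anywhere and the filter restarts with a wide prior; otherwise its local Gaussian walk is propagated analytically. A hypothetical NumPy sketch with invented parameters:

```python
import numpy as np

# Illustrative propagation step of a local/global movement model
# (all parameters invented): usually a local Gaussian walk handled by the
# Kalman prediction, but with probability p_jump a global "jump" that
# re-initializes the estimate with a wide prior.

def propagate(mean, cov, p_jump=0.02, q_local=0.05, extent=10.0):
    if np.random.rand() < p_jump:
        mean = np.random.uniform(-extent, extent, size=mean.shape)
        cov = np.eye(len(mean)) * extent ** 2     # wide post-jump prior
    else:
        cov = cov + np.eye(len(mean)) * q_local   # local random walk
    return mean, cov
```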
A Fully Convolutional Network for Semantic Labeling of 3D Point Clouds
When classifying point clouds, a large amount of time is devoted to the
process of engineering a reliable set of features which are then passed to a
classifier of choice. Generally, such features - usually derived from the
3D-covariance matrix - are computed using the surrounding neighborhood of
points. While these features capture local information, the process is usually
time-consuming, and requires the application at multiple scales combined with
contextual methods in order to adequately describe the diversity of objects
within a scene. In this paper we present a 1D-fully convolutional network that
consumes terrain-normalized points directly with the corresponding spectral data, if available, to generate point-wise labeling while implicitly learning
contextual features in an end-to-end fashion. Our method uses only the
3D-coordinates and three corresponding spectral features for each point.
Spectral features may either be extracted from 2D-georeferenced images, as
shown here for Light Detection and Ranging (LiDAR) point clouds, or extracted
directly for passive-derived point clouds, i.e., from multiple-view imagery. We
train our network by splitting the data into square regions, and use a pooling
layer that respects the permutation-invariance of the input points. Evaluated
using the ISPRS 3D Semantic Labeling Contest, our method scored second place
with an overall accuracy of 81.6%. We ranked third place with a mean F1-score
of 63.32%, surpassing the F1-score of the method with the highest accuracy by 1.69%. In addition to labeling 3D point clouds, we also show that our method can be easily extended to 2D semantic segmentation tasks, with promising initial results.
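In the spirit of the described network (exact layer sizes are not reproduced here), a minimal PyTorch sketch of per-point 1D convolutions with an order-invariant max-pooling that injects global context back into the per-point labels:

```python
import torch
import torch.nn as nn

# Minimal sketch in the spirit of the described network (sizes illustrative):
# per-point 1D convolutions over [x, y, z, s1, s2, s3], a max-pool invariant
# to point order, and per-point class scores from concatenated local and
# global features.

class PointFCNSketch(nn.Module):
    def __init__(self, n_classes=9):
        super().__init__()
        self.local = nn.Sequential(
            nn.Conv1d(6, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU())
        self.head = nn.Conv1d(128 + 128, n_classes, 1)

    def forward(self, pts):                          # pts: (B, 6, N)
        f = self.local(pts)                          # per-point features
        g = f.max(dim=2, keepdim=True)[0]            # permutation-invariant pool
        g = g.expand(-1, -1, f.size(2))              # broadcast global context
        return self.head(torch.cat([f, g], dim=1))   # (B, n_classes, N)
```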