Optimized Gated Deep Learning Architectures for Sensor Fusion
Sensor fusion is a key technology that integrates various sensory inputs to
allow for robust decision making in many applications such as autonomous
driving and robot control. Deep neural networks have been adopted for sensor
fusion in a body of recent studies. Among these, the so-called netgated
architecture was proposed and has demonstrated improved performance over
conventional convolutional neural networks (CNNs). In this paper, we address
several limitations of the baseline netgated architecture by proposing two
further optimized architectures: a coarser-grained gated architecture employing
(feature) group-level fusion weights, and a two-stage gated architecture
leveraging both group-level and feature-level fusion weights. Using driving
mode prediction and human activity recognition datasets, we demonstrate the
significant performance improvements brought by the proposed gated
architectures, as well as their robustness in the presence of sensor noise and
failures.
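The group-level gating idea lends itself to a compact implementation. Below is a minimal PyTorch sketch, not the authors' code: the module interface, the softmax normalization, and all dimensions are assumptions. It shows how a single learned fusion weight per sensor's feature group could scale and combine per-sensor features.

```python
# Hedged sketch of coarser-grained (group-level) gated fusion: one scalar
# fusion weight per sensor's feature group instead of per-feature weights.
# Module name, softmax normalization, and dimensions are assumptions.
import torch
import torch.nn as nn

class GroupGatedFusion(nn.Module):
    def __init__(self, num_sensors: int, feat_dim: int):
        super().__init__()
        # Produce one gating scalar per sensor (feature group) from the
        # concatenated per-sensor features, normalized across sensors.
        self.gate = nn.Sequential(
            nn.Linear(num_sensors * feat_dim, num_sensors),
            nn.Softmax(dim=-1),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, num_sensors, feat_dim)
        b, s, d = feats.shape
        weights = self.gate(feats.reshape(b, s * d))       # (batch, num_sensors)
        # Scale each sensor's entire feature group by its fusion weight,
        # then sum over sensors to obtain the fused representation.
        return (weights.unsqueeze(-1) * feats).sum(dim=1)  # (batch, feat_dim)

fused = GroupGatedFusion(num_sensors=3, feat_dim=64)(torch.randn(8, 3, 64))
```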
Collaborative signal and information processing for target detection with heterogeneous sensor networks
In this paper, an approach for target detection and acquisition with heterogeneous sensor networks through strategic resource allocation and coordination is presented. Based on sensor management and collaborative signal and information processing, low-capacity, low-cost sensors are strategically deployed to guide and cue the scarce high-performance sensors in the network, improving data quality so that the mission is ultimately completed more efficiently and at lower cost. We focus on the problem of designing such a network system, in which issues of resource selection and allocation, system behaviour and capacity, target behaviour and patterns, the environment, and multiple constraints such as cost must be addressed simultaneously. Simulation results offer significant insight into sensor selection and network operation, and demonstrate the substantial benefits introduced by guided search in an application of hunting down and capturing hostile vehicles on the battlefield.
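The cueing strategy can be illustrated with a toy simulation. Everything below (the sensor noise models, the threshold, the cell abstraction) is an assumption made for illustration, not the paper's setup: cheap sensors scan every cell, and the expensive sensor is tasked only where their evidence is strong.

```python
# Toy sketch of guided search: low-cost sensors cue a scarce
# high-performance sensor. All models and constants are assumptions.
import random

CUE_THRESHOLD = 0.6   # cheap-sensor confidence needed to task the good sensor

def cheap_reading(target_present: bool) -> float:
    """Noisy confidence score from a low-cost, low-capacity sensor."""
    base = 0.7 if target_present else 0.2
    return min(1.0, max(0.0, random.gauss(base, 0.15)))

def high_perf_confirm(target_present: bool) -> bool:
    """The scarce sensor is near-perfect but expensive to task."""
    return target_present if random.random() < 0.95 else not target_present

def guided_search(cells):
    """Cheap sensors scan every cell; the high-performance sensor is cued
    only on strong evidence, reducing the number of costly taskings."""
    confirmed, taskings = [], 0
    for i, present in enumerate(cells):
        if cheap_reading(present) > CUE_THRESHOLD:
            taskings += 1
            if high_perf_confirm(present):
                confirmed.append(i)
    return confirmed, taskings

print(guided_search([False, True, False, True, False]))
```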
Ego-motion and Surrounding Vehicle State Estimation Using a Monocular Camera
Understanding ego-motion and surrounding vehicle state is essential to enable
automated driving and advanced driver-assistance technologies. Typical
approaches to this problem fuse multiple sensors such as LiDAR, camera, and
radar to recognize the surrounding vehicle state, including position,
velocity, and orientation. Such sensor suites are overly complex and costly
for production personal-use vehicles. In this paper, we propose a
novel machine learning method to estimate ego-motion and surrounding vehicle
state using a single monocular camera. Our approach is based on a combination
of three deep neural networks to estimate the 3D vehicle bounding box, depth,
and optical flow from a sequence of images. The main contribution of this paper
is a new framework and algorithm that integrates these three networks in order
to estimate the ego-motion and surrounding vehicle state. To realize more
accurate 3D position estimation, we address ground plane correction in
real-time. The efficacy of the proposed method is demonstrated through
experimental evaluations comparing our results against ground-truth data
available from other sensors, including the CAN bus and LiDAR.
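A rough sense of the integration step can be given with a pinhole-camera sketch: back-project a detected box center through a depth map for 3D position, and difference the back-projections across an optical-flow displacement for velocity. This is not the authors' pipeline; the intrinsics, frame interval, and the simplification of reusing the current depth for the previous frame are all assumptions.

```python
# Hedged sketch of fusing a 2D detection, a depth map, and optical flow
# into a (position, velocity) estimate. Intrinsics and names are assumed.
import numpy as np

fx, fy, cx, cy = 720.0, 720.0, 640.0, 360.0   # assumed camera intrinsics
DT = 0.1                                      # assumed frame interval (s)

def backproject(u: float, v: float, z: float) -> np.ndarray:
    """Pinhole back-projection of pixel (u, v) at depth z into the camera frame."""
    return np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])

def vehicle_state(box_center, depth_map, flow_map):
    """Combine detection, depth, and flow into a crude 3D position and velocity."""
    u, v = box_center
    z = depth_map[int(v), int(u)]             # depth at the box center
    p_now = backproject(u, v, z)
    du, dv = flow_map[int(v), int(u)]         # flow into this pixel
    p_prev = backproject(u - du, v - dv, z)   # crude: reuse current depth
    return p_now, (p_now - p_prev) / DT

depth = np.full((720, 1280), 15.0)            # toy inputs
flow = np.zeros((720, 1280, 2)); flow[:, :, 0] = 3.0
pos, vel = vehicle_state((640.0, 400.0), depth, flow)
print(pos, vel)
```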
LIDAR-Camera Fusion for Road Detection Using Fully Convolutional Neural Networks
In this work, a deep learning approach has been developed to carry out road
detection by fusing LIDAR point clouds and camera images. An unstructured and
sparse point cloud is first projected onto the camera image plane and then
upsampled to obtain a set of dense 2D images encoding spatial information.
Several fully convolutional neural networks (FCNs) are then trained to carry
out road detection, either by using data from a single sensor, or by using
three fusion strategies: early, late, and the newly proposed cross fusion.
Whereas the former two approaches integrate multimodal information at a
predefined depth level, the cross-fusion FCN is designed to learn directly
from data where to integrate information; this is
accomplished by using trainable cross connections between the LIDAR and the
camera processing branches.
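A minimal sketch of how such trainable cross connections might be wired is shown below. It is not the paper's network: the layer sizes, number of stages, and scalar parameterization of the connections are assumptions; only the idea of learnable, per-depth exchange between the two branches follows the abstract.

```python
# Sketch of cross fusion: two FCN-style branches with trainable scalar
# cross connections, so the network learns at which depth to exchange
# LIDAR and camera information. Sizes and parameterization are assumed.
import torch
import torch.nn as nn

class CrossFusionFCN(nn.Module):
    def __init__(self, stages: int = 3, ch: int = 32):
        super().__init__()
        conv = lambda cin: nn.Sequential(nn.Conv2d(cin, ch, 3, padding=1), nn.ReLU())
        self.cam = nn.ModuleList([conv(3)] + [conv(ch) for _ in range(stages - 1)])
        self.lid = nn.ModuleList([conv(1)] + [conv(ch) for _ in range(stages - 1)])
        # One trainable cross-connection weight per direction per stage,
        # initialized to zero so the fusion depth is learned from data.
        self.a = nn.Parameter(torch.zeros(stages))   # lidar -> camera
        self.b = nn.Parameter(torch.zeros(stages))   # camera -> lidar
        self.head = nn.Conv2d(2 * ch, 1, 1)          # road / not-road logits

    def forward(self, cam_img, lid_img):
        c, l = cam_img, lid_img
        for i, (fc, fl) in enumerate(zip(self.cam, self.lid)):
            c, l = fc(c), fl(l)
            c, l = c + self.a[i] * l, l + self.b[i] * c
        return self.head(torch.cat([c, l], dim=1))

out = CrossFusionFCN()(torch.randn(1, 3, 64, 64), torch.randn(1, 1, 64, 64))
```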
To further highlight the benefits of using a multimodal system for road
detection, a data set consisting of visually challenging scenes was extracted
from driving sequences of the KITTI raw data set. It was then demonstrated
that, as expected, a purely camera-based FCN severely underperforms on this
data set. A multimodal system, on the other hand, is still able to provide high
accuracy. Finally, the proposed cross fusion FCN was evaluated on the KITTI
road benchmark where it achieved excellent performance, with a MaxF score of
96.03%, ranking it among the top-performing approaches.