968 research outputs found
Near-field Perception for Low-Speed Vehicle Automation using Surround-view Fisheye Cameras
Cameras are the primary sensor in automated driving systems. They provide
high information density and are optimal for detecting road infrastructure cues
laid out for human vision. Surround-view camera systems typically comprise of
four fisheye cameras with 190{\deg}+ field of view covering the entire
360{\deg} around the vehicle focused on near-field sensing. They are the
principal sensors for low-speed, high accuracy, and close-range sensing
applications, such as automated parking, traffic jam assistance, and low-speed
emergency braking. In this work, we provide a detailed survey of such vision
systems, setting up the survey in the context of an architecture that can be
decomposed into four modular components namely Recognition, Reconstruction,
Relocalization, and Reorganization. We jointly call this the 4R Architecture.
We discuss how each component accomplishes a specific aspect and provide a
positional argument that they can be synergized to form a complete perception
system for low-speed automation. We support this argument by presenting results
from previous works and by presenting architecture proposals for such a system.
Qualitative results are presented in the video at https://youtu.be/ae8bCOF77uY.Comment: Accepted for publication at IEEE Transactions on Intelligent
Transportation System
A New Wave in Robotics: Survey on Recent mmWave Radar Applications in Robotics
We survey the current state of millimeterwave (mmWave) radar applications in
robotics with a focus on unique capabilities, and discuss future opportunities
based on the state of the art. Frequency Modulated Continuous Wave (FMCW)
mmWave radars operating in the 76--81GHz range are an appealing alternative to
lidars, cameras and other sensors operating in the near visual spectrum. Radar
has been made more widely available in new packaging classes, more convenient
for robotics and its longer wavelengths have the ability to bypass visual
clutter such as fog, dust, and smoke. We begin by covering radar principles as
they relate to robotics. We then review the relevant new research across a
broad spectrum of robotics applications beginning with motion estimation,
localization, and mapping. We then cover object detection and classification,
and then close with an analysis of current datasets and calibration techniques
that provide entry points into radar research.Comment: 19 Pages, 11 Figures, 2 Tables, TRO Submission pendin
TractorEYE: Vision-based Real-time Detection for Autonomous Vehicles in Agriculture
Agricultural vehicles such as tractors and harvesters have for decades been able to navigate automatically and more efficiently using commercially available products such as auto-steering and tractor-guidance systems. However, a human operator is still required inside the vehicle to ensure the safety of vehicle and especially surroundings such as humans and animals. To get fully autonomous vehicles certified for farming, computer vision algorithms and sensor technologies must detect obstacles with equivalent or better than human-level performance. Furthermore, detections must run in real-time to allow vehicles to actuate and avoid collision.This thesis proposes a detection system (TractorEYE), a dataset (FieldSAFE), and procedures to fuse information from multiple sensor technologies to improve detection of obstacles and to generate a map. TractorEYE is a multi-sensor detection system for autonomous vehicles in agriculture. The multi-sensor system consists of three hardware synchronized and registered sensors (stereo camera, thermal camera and multi-beam lidar) mounted on/in a ruggedized and water-resistant casing. Algorithms have been developed to run a total of six detection algorithms (four for rgb camera, one for thermal camera and one for a Multi-beam lidar) and fuse detection information in a common format using either 3D positions or Inverse Sensor Models. A GPU powered computational platform is able to run detection algorithms online. For the rgb camera, a deep learning algorithm is proposed DeepAnomaly to perform real-time anomaly detection of distant, heavy occluded and unknown obstacles in agriculture. DeepAnomaly is -- compared to a state-of-the-art object detector Faster R-CNN -- for an agricultural use-case able to detect humans better and at longer ranges (45-90m) using a smaller memory footprint and 7.3-times faster processing. Low memory footprint and fast processing makes DeepAnomaly suitable for real-time applications running on an embedded GPU. FieldSAFE is a multi-modal dataset for detection of static and moving obstacles in agriculture. The dataset includes synchronized recordings from a rgb camera, stereo camera, thermal camera, 360-degree camera, lidar and radar. Precise localization and pose is provided using IMU and GPS. Ground truth of static and moving obstacles (humans, mannequin dolls, barrels, buildings, vehicles, and vegetation) are available as an annotated orthophoto and GPS coordinates for moving obstacles. Detection information from multiple detection algorithms and sensors are fused into a map using Inverse Sensor Models and occupancy grid maps. This thesis presented many scientific contribution and state-of-the-art within perception for autonomous tractors; this includes a dataset, sensor platform, detection algorithms and procedures to perform multi-sensor fusion. Furthermore, important engineering contributions to autonomous farming vehicles are presented such as easily applicable, open-source software packages and algorithms that have been demonstrated in an end-to-end real-time detection system. The contributions of this thesis have demonstrated, addressed and solved critical issues to utilize camera-based perception systems that are essential to make autonomous vehicles in agriculture a reality
Motorcycles that see: Multifocal stereo vision sensor for advanced safety systems in tilting vehicles
Advanced driver assistance systems, ADAS, have shown the possibility to anticipate crash accidents and effectively assist road users in critical traffic situations. This is not the case for motorcyclists, in fact ADAS for motorcycles are still barely developed. Our aim was to study a camera-based sensor for the application of preventive safety in tilting vehicles. We identified two road conflict situations for which automotive remote sensors installed in a tilting vehicle are likely to fail in the identification of critical obstacles. Accordingly, we set two experiments conducted in real traffic conditions to test our stereo vision sensor. Our promising results support the application of this type of sensors for advanced motorcycle safety applications
DSEC: A Stereo Event Camera Dataset for Driving Scenarios
Once an academic venture, autonomous driving has received unparalleled
corporate funding in the last decade. Still, the operating conditions of
current autonomous cars are mostly restricted to ideal scenarios. This means
that driving in challenging illumination conditions such as night, sunrise, and
sunset remains an open problem. In these cases, standard cameras are being
pushed to their limits in terms of low light and high dynamic range
performance. To address these challenges, we propose, DSEC, a new dataset that
contains such demanding illumination conditions and provides a rich set of
sensory data. DSEC offers data from a wide-baseline stereo setup of two color
frame cameras and two high-resolution monochrome event cameras. In addition, we
collect lidar data and RTK GPS measurements, both hardware synchronized with
all camera data. One of the distinctive features of this dataset is the
inclusion of high-resolution event cameras. Event cameras have received
increasing attention for their high temporal resolution and high dynamic range
performance. However, due to their novelty, event camera datasets in driving
scenarios are rare. This work presents the first high-resolution, large-scale
stereo dataset with event cameras. The dataset contains 53 sequences collected
by driving in a variety of illumination conditions and provides ground truth
disparity for the development and evaluation of event-based stereo algorithms.Comment: IEEE Robotics and Automation Letter
Automatic Dense 3D Scene Mapping from Non-overlapping Passive Visual Sensors for Future Autonomous Systems
The ever increasing demand for higher levels of autonomy for robots and vehicles means there is an ever greater need for such systems to be aware of their surroundings. Whilst solutions already exist for creating 3D scene maps, many are based on active scanning devices such as laser scanners and depth cameras that are either expensive, unwieldy, or do not function well under certain environmental conditions. As a result passive cameras are a favoured sensor due their low cost, small size, and ability to work in a range of lighting conditions.
In this work we address some of the remaining research challenges within the problem of 3D mapping around a moving platform. We utilise prior work in dense stereo imaging, Stereo Visual Odometry (SVO) and extend Structure from Motion (SfM) to create a pipeline optimised for on vehicle sensing.
Using forward facing stereo cameras, we use state of the art SVO and dense stereo techniques to map the scene in front of the vehicle. With significant amounts of prior research in dense stereo, we addressed the issue of selecting an appropriate method by creating a novel evaluation technique. Visual 3D mapping of dynamic scenes from a moving platform result in duplicated scene objects. We extend the prior work on mapping by introducing a generalized dynamic object removal process. Unlike other approaches that rely on computationally expensive segmentation or detection, our method utilises existing data from the mapping stage and the findings from our dense stereo evaluation. We introduce a new SfM approach that exploits our platform motion to create a novel dense mapping process that exceeds the 3D data generation rate of state of the art alternatives. Finally, we combine dense stereo, SVO, and our SfM approach to automatically align point clouds from non-overlapping views to create a rotational and scale consistent global 3D model
Semantic Mapping of Road Scenes
The problem of understanding road scenes has been on the fore-front in the computer vision community
for the last couple of years. This enables autonomous systems to navigate and understand
the surroundings in which it operates. It involves reconstructing the scene and estimating the objects
present in it, such as ‘vehicles’, ‘road’, ‘pavements’ and ‘buildings’. This thesis focusses on these
aspects and proposes solutions to address them.
First, we propose a solution to generate a dense semantic map from multiple street-level images.
This map can be imagined as the bird’s eye view of the region with associated semantic labels for
ten’s of kilometres of street level data. We generate the overhead semantic view from street level
images. This is in contrast to existing approaches using satellite/overhead imagery for classification
of urban region, allowing us to produce a detailed semantic map for a large scale urban area. Then
we describe a method to perform large scale dense 3D reconstruction of road scenes with associated
semantic labels. Our method fuses the depth-maps in an online fashion, generated from the
stereo pairs across time into a global 3D volume, in order to accommodate arbitrarily long image
sequences. The object class labels estimated from the street level stereo image sequence are used to
annotate the reconstructed volume. Then we exploit the scene structure in object class labelling by
performing inference over the meshed representation of the scene. By performing labelling over the
mesh we solve two issues: Firstly, images often have redundant information with multiple images
describing the same scene. Solving these images separately is slow, where our method is approximately
a magnitude faster in the inference stage compared to normal inference in the image domain.
Secondly, often multiple images, even though they describe the same scene result in inconsistent
labelling. By solving a single mesh, we remove the inconsistency of labelling across the images.
Also our mesh based labelling takes into account of the object layout in the scene, which is often
ambiguous in the image domain, thereby increasing the accuracy of object labelling. Finally, we perform
labelling and structure computation through a hierarchical robust PN Markov Random Field
defined on voxels and super-voxels given by an octree. This allows us to infer the 3D structure and
the object-class labels in a principled manner, through bounded approximate minimisation of a well
defined and studied energy functional. In this thesis, we also introduce two object labelled datasets
created from real world data. The 15 kilometre Yotta Labelled dataset consists of 8,000 images per
camera view of the roadways of the United Kingdom with a subset of them annotated with object
class labels and the second dataset is comprised of ground truth object labels for the publicly available
KITTI dataset. Both the datasets are available publicly and we hope will be helpful to the vision
research community
Vision-based localization methods under GPS-denied conditions
This paper reviews vision-based localization methods in GPS-denied
environments and classifies the mainstream methods into Relative Vision
Localization (RVL) and Absolute Vision Localization (AVL). For RVL, we discuss
the broad application of optical flow in feature extraction-based Visual
Odometry (VO) solutions and introduce advanced optical flow estimation methods.
For AVL, we review recent advances in Visual Simultaneous Localization and
Mapping (VSLAM) techniques, from optimization-based methods to Extended Kalman
Filter (EKF) based methods. We also introduce the application of offline map
registration and lane vision detection schemes to achieve Absolute Visual
Localization. This paper compares the performance and applications of
mainstream methods for visual localization and provides suggestions for future
studies.Comment: 32 pages, 15 figure
- …