    Benchmarking LiDAR Sensors for Development and Evaluation of Automotive Perception

    Environment perception and representation are among the most critical tasks in automated driving. To meet the stringent requirements of safety standards such as ISO 26262, the perceived information must be evaluated quantitatively and efficiently. However, typical evaluation methods, such as comparison against annotated data, do not scale because of the manual effort involved. There is thus a need to automate the process of data annotation. This paper focuses on the LiDAR sensor: it identifies the limitations of the sensor and provides a methodology to generate annotated data of a measurable quality. The limitations of the sensor are analysed in a Systematic Literature Review (SLR) of available academic texts and refined through unstructured interviews with experts. The main contributions are 1) the SLR with related interviews to identify LiDAR sensor limitations and 2) the associated methodology which allows us to generate world representations

    Multi-modal Experts Network for Autonomous Driving

    End-to-end learning from sensory data has shown promising results in autonomous driving. While employing many sensors enhances world perception and should lead to more robust and reliable behavior of autonomous vehicles, it is challenging to train and deploy such a network, and at least two problems are encountered in the considered setting. The first is the increase in computational complexity with the number of sensing devices. The other is the phenomenon of the network overfitting to the simplest and most informative input. We address both challenges with a novel, carefully tailored multi-modal experts network architecture and propose a multi-stage training procedure. The network contains a gating mechanism, which selects the most relevant input at each inference time step using a mixed discrete-continuous policy. We demonstrate the plausibility of the proposed approach on our 1/6 scale truck equipped with three cameras and one LiDAR. Comment: Published at the International Conference on Robotics and Automation (ICRA), 202
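
The gating idea described above can be illustrated with a minimal sketch. It is not the paper's architecture or its mixed discrete-continuous policy: the function names `softmax` and `gated_inference`, the gate scores, and the expert callables are all hypothetical. The sketch shows only the discrete part of the idea, namely that a gate picks one input branch per time step so that only that branch is evaluated, which keeps the compute cost from growing with the number of sensors.

```python
import math


def softmax(scores):
    """Convert raw gate scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_inference(gate_scores, experts, inputs):
    """Run only the expert whose gate weight is highest.

    gate_scores: one raw score per sensor branch (assumed to come from
                 a small gating network, which is not modelled here).
    experts:     one callable per sensor branch (hypothetical stand-ins).
    inputs:      one input per sensor branch.
    Returns (chosen index, gate weights, output of the chosen expert).
    """
    weights = softmax(gate_scores)
    chosen = max(range(len(weights)), key=weights.__getitem__)
    # Hard (discrete) selection: the other experts are never evaluated.
    output = experts[chosen](inputs[chosen])
    return chosen, weights, output


if __name__ == "__main__":
    # Four branches, mirroring three cameras and one LiDAR.
    branches = [lambda x: x * 2.0] * 4
    idx, w, out = gated_inference([0.1, 2.0, -1.0, 0.5], branches,
                                  [1.0, 2.0, 3.0, 4.0])
    print(idx, out)
```

In a trained system the gate scores would depend on the current sensor readings; here they are fixed numbers purely for illustration.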

    Object Detection Using LiDAR and Camera Fusion in Off-road Conditions

    Since the boom in the autonomous vehicle industry, the need for precise environment perception and robust object detection methods has grown. While convolutional neural networks have driven considerable progress in 2D object detection, efficiently achieving the same level of performance in 3D remains a challenge. The reasons for this include the limitations of fusing multi-modal data and the cost of labelling different modalities for training such networks.
Whether we use a stereo camera to perceive a scene's range information or a time-of-flight ranging sensor such as LiDAR, the existing pipelines for object detection in point clouds have bottlenecks and latency issues that tend to reduce detection accuracy at real-time speeds. Moreover, these existing methods are primarily implemented and tested in urban environments. This thesis presents a fusion-based approach for detecting objects in 3D: it projects the proposed 2D regions of interest (object bounding boxes) or masks (semantically segmented images) onto point clouds and applies outlier filtering techniques based on clustering to isolate the target object's points in the projected regions of interest. Additionally, we compare it with human detection using thermal image thresholding and filtering. Lastly, we performed rigorous benchmarks in off-road environments to identify potential bottlenecks and to find a combination of pipeline parameters that maximizes the accuracy and performance of real-time object detection in 3D point clouds
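
The projection-and-filtering step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis pipeline: points are assumed to be already expressed in the camera frame (x right, y down, z forward), the camera is an ideal pinhole with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, and a simple median-depth tolerance stands in for the clustering-based outlier filtering. All function names are made up for the example.

```python
from statistics import median


def project_point(point, fx, fy, cx, cy):
    """Project a camera-frame 3D point to pixel coordinates (pinhole model).

    Returns None for points at or behind the image plane.
    """
    x, y, z = point
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def points_in_box(points, box, fx, fy, cx, cy):
    """Keep the 3D points whose projection falls inside a 2D bounding box.

    box is (u_min, v_min, u_max, v_max) in pixels.
    """
    u_min, v_min, u_max, v_max = box
    selected = []
    for p in points:
        uv = project_point(p, fx, fy, cx, cy)
        if uv is None:
            continue
        u, v = uv
        if u_min <= u <= u_max and v_min <= v <= v_max:
            selected.append(p)
    return selected


def filter_by_depth(points, tolerance):
    """Drop points whose depth is far from the median depth.

    A crude stand-in for clustering: background points seen through the
    box usually lie much deeper than the object itself.
    """
    if not points:
        return []
    d = median(p[2] for p in points)
    return [p for p in points if abs(p[2] - d) <= tolerance]


if __name__ == "__main__":
    cloud = [(0.0, 0.0, 5.0), (0.1, 0.0, 5.2), (0.05, 0.1, 4.9),
             (0.2, 0.0, 20.0),   # background seen inside the box
             (3.0, 0.0, 5.0),    # projects outside the box
             (0.0, 0.0, -1.0)]   # behind the camera
    roi = points_in_box(cloud, (300, 220, 340, 260), 500, 500, 320, 240)
    print(filter_by_depth(roi, tolerance=1.0))
```

A real pipeline would first transform LiDAR points into the camera frame with the extrinsic calibration and would likely use a proper clustering method for the filtering step.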

    Traversability analysis in unstructured forested terrains for off-road autonomy using LIDAR data

    Scene perception and traversability analysis are real challenges for autonomous driving systems. In the context of off-road autonomy, there are additional challenges due to the unstructured environments and the existence of various vegetation types. Autonomous Ground Vehicles (AGVs) must be able to identify obstacles and load-bearing surfaces in the terrain to ensure safe navigation (McDaniel et al. 2012). The presence of vegetation in off-road autonomy applications presents unique challenges for scene understanding: 1) understory vegetation makes it difficult to detect obstacles or to identify load-bearing surfaces; and 2) trees are usually regarded as obstacles even though only the trunks of the trees pose a collision risk in navigation. The overarching goal of this dissertation was to study traversability analysis in unstructured forested terrains for off-road autonomy using LIDAR data. More specifically, to address the aforementioned challenges, this dissertation studied the impact of understory vegetation density on the solid obstacle detection performance of off-road autonomous systems. By leveraging a physics-based autonomous driving simulator, a classification-based machine learning framework was proposed for obstacle detection based on point cloud data captured by LIDAR. Features were extracted using a cumulative approach, meaning that the information related to each feature was updated at each time frame when new data was collected by LIDAR. It was concluded that an increase in the density of understory vegetation adversely affected the classification performance in correctly detecting solid obstacles. Additionally, a regression-based framework was proposed for estimating the understory vegetation density for safe path planning purposes, according to which the traversability risk level was regarded as a function of the estimated density.
Thus, the higher the predicted density of an area, the higher the risk of collision if the AGV traversed that area. Finally, for the trees in the terrain, the dissertation investigated statistical features that can be used in machine learning algorithms to differentiate trees from solid obstacles in the context of forested off-road scenes. Using the proposed extracted features, the classification algorithm was able to generate high-precision results for differentiating trees from solid obstacles. Such differentiation can result in more optimized path planning in off-road applications
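
The cumulative feature extraction mentioned above can be illustrated with a running-statistics sketch. The dissertation does not specify which features or update rule it uses, so this is an assumption: the hypothetical class `RunningFeature` keeps the mean and variance of a scalar measurement (for example, point heights in one terrain cell) and refines them with Welford's online algorithm each time a new LIDAR frame arrives, without storing earlier frames.

```python
class RunningFeature:
    """Mean and variance of a scalar, updated frame by frame (Welford)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, values):
        """Fold the measurements from one new frame into the statistics."""
        for v in values:
            self.count += 1
            delta = v - self.mean
            self.mean += delta / self.count
            self._m2 += delta * (v - self.mean)

    @property
    def variance(self):
        """Population variance of everything seen so far."""
        return self._m2 / self.count if self.count else 0.0


if __name__ == "__main__":
    height = RunningFeature()
    height.update([1.0, 2.0, 3.0])  # first frame
    height.update([4.0, 5.0])       # second frame
    print(height.count, height.mean, height.variance)
```

The result after several frames matches what a batch computation over all points would give, which is the property that makes a cumulative update attractive when frames arrive as a stream.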

    Fusion of LiDAR and camera sensor data for environment sensing in driverless vehicles

    Driverless vehicles operate by sensing and perceiving their surrounding environment to make accurate driving decisions. A combination of several different sensors, such as LiDAR, radar, ultrasound sensors and cameras, is used to sense the surrounding environment of driverless vehicles. These heterogeneous sensors simultaneously capture various physical attributes of the environment. Such multimodality and redundancy of sensing need to be exploited for reliable and consistent perception of the environment through sensor data fusion. However, these multimodal sensor data streams differ from each other in many ways, such as temporal and spatial resolution, data format, and geometric alignment. For the subsequent perception algorithms to utilize the diversity offered by multimodal sensing, the data streams need to be spatially, geometrically and temporally aligned with each other. In this paper, we address the problem of fusing the outputs of a Light Detection and Ranging (LiDAR) scanner and a wide-angle monocular image sensor. The outputs of the LiDAR scanner and the image sensor have different spatial resolutions and need to be aligned with each other. A geometrical model is used to spatially align the two sensor outputs, followed by a Gaussian Process (GP) regression based resolution matching algorithm that interpolates the missing data with quantifiable uncertainty. The results indicate that the proposed sensor data fusion framework significantly aids the subsequent perception steps, as illustrated by the performance improvement of a typical free space detection algorithm
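
The resolution-matching idea above, interpolating sparse range samples with a quantified uncertainty, can be sketched with a small Gaussian Process regressor. This is a generic textbook GP, not the paper's algorithm: it works on a single 1D scan line, uses a squared-exponential kernel, and the hyperparameters `length_scale`, `signal_var` and `noise_var` are illustrative assumptions rather than values from the paper. All function names are hypothetical.

```python
import math


def rbf(a, b, length_scale, signal_var):
    """Squared-exponential covariance between two 1D positions."""
    return signal_var * math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky(A):
    """Lower-triangular L with L * L^T = A (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def solve_lower(L, b):
    """Solve L y = b by forward substitution."""
    y = [0.0] * len(b)
    for i in range(len(b)):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    return y


def solve_upper(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_predict(xs, ys, queries, length_scale=1.0, signal_var=1.0,
               noise_var=1e-4):
    """Return a (mean, variance) pair for each query position.

    xs, ys: positions and range values of the sparse samples.
    The sample mean of ys is used as a constant prior mean.
    """
    n = len(xs)
    mean_y = sum(ys) / n
    K = [[rbf(xs[i], xs[j], length_scale, signal_var)
          + (noise_var if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)
    alpha = solve_upper(L, solve_lower(L, [y - mean_y for y in ys]))
    results = []
    for q in queries:
        k_star = [rbf(q, x, length_scale, signal_var) for x in xs]
        mean = mean_y + sum(k * a for k, a in zip(k_star, alpha))
        v = solve_lower(L, k_star)
        var = signal_var - sum(vi * vi for vi in v)
        results.append((mean, max(var, 0.0)))
    return results


if __name__ == "__main__":
    # Sparse range samples (metres) at four positions along a scan line.
    for m, v in gp_predict([0.0, 1.0, 2.0, 3.0], [10.0, 10.5, 11.0, 10.8],
                           [1.0, 1.5, 50.0]):
        print(round(m, 3), round(v, 4))
```

The predicted variance is small at the measured positions, larger between them, and returns to the prior variance far from any sample, which is what allows a downstream step to weight interpolated values by how trustworthy they are.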

    Multimodal machine learning for intelligent mobility

    Scientific problems are solved by finding the optimal solution for a specific task. Some problems can be solved analytically, while others are solved using data-driven methods. The use of digital technologies to improve the transportation of people and goods, which is referred to as intelligent mobility, is one of the principal beneficiaries of data-driven solutions. Autonomous vehicles are at the heart of the developments that propel intelligent mobility. Due to the high dimensionality and complexity of real-world environments, data-driven solutions need to become commonplace in intelligent mobility, as it is nearly impossible to manually program decision-making logic for every eventuality. While recent developments in data-driven solutions such as deep learning enable machines to learn effectively from large datasets, the application of these techniques within safety-critical systems such as driverless cars remains scarce. Autonomous vehicles need to be able to make context-driven decisions autonomously in the different environments in which they operate. The recent literature on driverless vehicle research focuses heavily on road and highway environments but has discounted pedestrianized areas and indoor environments. These unstructured environments tend to have more clutter and change rapidly over time. Therefore, for intelligent mobility to make a significant impact on human life, it is vital to extend the application beyond structured environments. To further advance intelligent mobility, researchers need to take cues from multiple sensor streams and multiple machine learning algorithms so that decisions can be robust and reliable. Only then will machines be able to operate safely in unstructured and dynamic environments. Towards addressing these limitations, this thesis investigates data-driven solutions for crucial building blocks in intelligent mobility.
Specifically, the thesis investigates multimodal sensor data fusion, machine learning, multimodal deep representation learning and their application to intelligent mobility. This work demonstrates that mobile robots can use multimodal machine learning to derive a driver policy and therefore make autonomous decisions. To facilitate the autonomous decisions necessary to derive safe driving algorithms, we present algorithms for free space detection and human activity recognition. Driving these decision-making algorithms are specific datasets collected throughout this study. They include the Loughborough London Autonomous Vehicle dataset and the Loughborough London Human Activity Recognition dataset. The datasets were collected using an autonomous platform designed and developed in-house as part of this research activity. The proposed framework for free space detection is based on an active learning paradigm that leverages the relative uncertainty of multimodal sensor data streams (ultrasound and camera). It utilizes an online learning methodology to continuously update the learnt model whenever the vehicle experiences new environments. The proposed free space detection algorithm enables an autonomous vehicle to self-learn, evolve and adapt to new environments never encountered before. The results illustrate that the online learning mechanism is superior to one-off training of deep neural networks that require large datasets to generalize to unfamiliar surroundings. The thesis takes the view that humans should be at the centre of any technological development related to artificial intelligence. This is imperative in intelligent mobility, where an autonomous vehicle should be aware of what humans are doing in its vicinity. Towards improving the robustness of human activity recognition, this thesis proposes a novel algorithm that classifies point cloud data originating from Light Detection and Ranging sensors.
The proposed algorithm leverages multimodality by using the camera data to identify humans and segment the region of interest in the point cloud data. The corresponding 3-dimensional data was converted to a Fisher Vector representation before being classified by a deep Convolutional Neural Network. The proposed algorithm classifies the indoor activities performed by a human subject with an average precision of 90.3%. When compared to an alternative point cloud classifier, PointNet [1], [2], the proposed framework outperformed it on all classes. The developed autonomous testbed for data collection and algorithm validation, as well as the multimodal data-driven solutions for driverless cars, are the major contributions of this thesis. It is anticipated that these results and the testbed will have significant implications for the future of intelligent mobility by amplifying the development of intelligent driverless vehicles.