261 research outputs found
Multi-Modal 3D Object Detection in Autonomous Driving: a Survey
In the past few years, we have witnessed rapid development of autonomous
driving. However, achieving full autonomy remains a daunting task due to the
complex and dynamic driving environment. As a result, self-driving cars are
equipped with a suite of sensors to conduct robust and accurate environment
perception. As the number and type of sensors keep increasing, combining them
for better perception is becoming a natural trend. So far, there has been no
in-depth review focusing on multi-sensor fusion-based perception. To bridge
this gap and motivate future research, this survey reviews recent
fusion-based 3D detection deep learning models that leverage multiple sensor
data sources, especially cameras and LiDARs. In this survey, we first introduce
the background of popular sensors for autonomous cars, including their common
data representations as well as object detection networks developed for each
type of sensor data. Next, we discuss some popular datasets for multi-modal 3D
object detection, with a special focus on the sensor data included in each
dataset. Then we present in-depth reviews of recent multi-modal 3D detection
networks by considering the following three aspects of the fusion: fusion
location, fusion data representation, and fusion granularity. After a detailed
review, we discuss open challenges and point out possible solutions. We hope
that our detailed review can help researchers embark on investigations in the
area of multi-modal 3D object detection.
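The "fusion location" axis of the taxonomy can be illustrated by its two extremes, early (feature-level) and late (decision-level) fusion. The sketch below is a generic illustration with invented shapes and names, not any specific surveyed model:

```python
# Generic sketch of early vs. late camera-LiDAR fusion. All shapes and
# field names are illustrative assumptions, not from any surveyed model.
import numpy as np

def early_fusion(camera_feat: np.ndarray, lidar_feat: np.ndarray) -> np.ndarray:
    """Feature-level fusion: concatenate spatially aligned feature maps
    along the channel axis before the detection head runs."""
    assert camera_feat.shape[:2] == lidar_feat.shape[:2]
    return np.concatenate([camera_feat, lidar_feat], axis=-1)

def late_fusion(camera_dets: list, lidar_dets: list) -> list:
    """Decision-level fusion: pool per-sensor detections and rank them by
    confidence (a stand-in for cross-modal non-maximum suppression)."""
    return sorted(camera_dets + lidar_dets, key=lambda d: d["score"], reverse=True)

fused = early_fusion(np.zeros((64, 64, 3)), np.zeros((64, 64, 1)))
print(fused.shape)  # (64, 64, 4)
```

The other two axes the survey names, fusion data representation and fusion granularity, vary what is concatenated (raw points, voxels, BEV maps) and at what level (scene, region, point).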
Radars for Autonomous Driving: A Review of Deep Learning Methods and Challenges
Radar is a key component of the suite of perception sensors used for safe and
reliable navigation of autonomous vehicles. Its unique capabilities include
high-resolution velocity imaging, detection of agents in occlusion and over
long ranges, and robust performance in adverse weather conditions. However, the
usage of radar data presents some challenges: it is characterized by low
resolution, sparsity, clutter, high uncertainty, and lack of good datasets.
These challenges have limited radar deep learning research. As a result,
current radar models are often influenced by lidar and vision models, which are
focused on optical features that are relatively weak in radar data, thus
resulting in under-utilization of radar's capabilities and diminishing its
contribution to autonomous perception. This review seeks to encourage further
deep learning research on autonomous radar data by 1) identifying key research
themes, and 2) offering a comprehensive overview of current opportunities and
challenges in the field. Topics covered include early and late fusion,
occupancy flow estimation, uncertainty modeling, and multipath detection. The
paper also discusses radar fundamentals and data representation, presents a
curated list of recent radar datasets, and reviews state-of-the-art lidar and
vision models relevant for radar research. For a summary of the paper and more
results, visit the website: autonomous-radars.github.io
“Deep sensor fusion architecture for point-cloud semantic segmentation”
This degree project develops a complete approach to data analysis and processing for improved decision-making, presenting a multimodal CNN-based neural architecture; it includes precise explanations of the systems the architecture integrates and an evaluation of its behavior in the target environment. Self-driving systems are composed of complex pipelines in which perceiving the vehicle's surroundings is a key source of information for making real-time maneuver decisions. Semantic segmentation of LiDAR sensor data has played a big role in consolidating a dense understanding of the surrounding objects and events. Although great advances have been made on this task, we believe sensor fusion strategies are under-exploited. We present a multimodal neural architecture, based on CNNs, that consumes 2D input signals from LiDAR and camera, computes a deep representation leveraging the strengths of both sensors, and predicts a label mapping for the 3D point-wise segmentation problem. We evaluated the proposed architecture on a dataset derived from the popular KITTI vision benchmark suite, which contemplates common semantic classes (i.e., car, pedestrian, and cyclist). Our model outperforms existing methods and shows improvement in the refinement of the segmentation masks.
Master's thesis (Magíster en Ingeniería de Sistemas y Computación)
Table of Contents
Abstract
List of Figures
1 Introduction
1.1 Problem statement
1.2 Goals
1.3 Contributions
1.4 Outline
2 Autonomous vehicle perception systems
2.1 Semantic segmentation
2.2 Autonomous vehicles sensing
2.2.1 Camera
2.2.2 LiDAR
2.2.3 Radar
2.2.4 Ultrasonic
2.3 Point clouds semantic segmentation
2.3.1 Raw pointcloud
2.3.2 Voxelization of pointclouds
2.3.3 Point cloud projections
2.3.4 Outlook
3 Deep multimodal learning for semantic segmentation
3.1 Method overview
3.2 Point cloud transformation
3.3 Multimodal fusion
3.3.1 RGB modality
3.3.2 LiDAR modality
3.3.3 Fusion step
3.3.4 Decoding part
3.3.5 Optimization statement
4 Evaluation
4.1 KITTI dataset
4.2 Evaluation metric
4.3 Experimental setup
4.4 Results
4.4.1 Discussion
5 Conclusions
Bibliography
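A common way to feed LiDAR data into a 2D CNN alongside a camera image, in the spirit of the thesis's "Point cloud transformation" step, is a spherical range-image projection. The sketch below is a generic illustration with assumed field-of-view and resolution values, not the thesis's actual implementation:

```python
# Rough sketch of projecting a 3D LiDAR point cloud onto a 2D spherical
# (range) image so it can be consumed by a 2D CNN together with the RGB
# image. FOV bounds and resolution are illustrative assumptions.
import numpy as np

def spherical_projection(points, h=64, w=512, fov_up=3.0, fov_down=-25.0):
    """Map an (N, 3) array of xyz points to an h x w range image."""
    fov_up, fov_down = np.radians(fov_up), np.radians(fov_down)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)          # range per point
    yaw = np.arctan2(y, x)                      # horizontal angle
    pitch = np.arcsin(np.clip(z / r, -1, 1))    # vertical angle
    u = ((1 - (yaw / np.pi + 1) / 2) * w).astype(int) % w
    v = (1 - (pitch - fov_down) / (fov_up - fov_down)) * h
    v = np.clip(v, 0, h - 1).astype(int)
    img = np.zeros((h, w))
    img[v, u] = r                               # store range per pixel
    return img

pts = np.array([[10.0, 0.0, 0.0], [5.0, 5.0, -1.0]])
img = spherical_projection(pts)
print(img.shape)  # (64, 512)
```

Once both modalities share this 2D layout, the fusion step can combine them per-pixel, and the predicted labels can be mapped back to the originating 3D points.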
LiDAR Object Detection Utilizing Existing CNNs for Smart Cities
As governments and private companies alike race to achieve the vision of a smart city — where artificial intelligence (AI) technology is used to enable self-driving cars, cashier-less shopping experiences, and connected home devices from thermostats to robot vacuum cleaners — advancements are being made in both software and hardware to enable increasingly real-time, accurate inference at the edge. One hardware solution adopted for this purpose is the LiDAR sensor, which utilizes infrared lasers to accurately detect and map its surroundings in 3D. On the software side, developers have turned to artificial neural networks to make predictions and recommendations with high accuracy. These neural networks have the potential, particularly when run on purpose-built hardware such as GPUs and TPUs, to make inferences in near real-time, allowing AI models to serve as a usable interface for real-world interactions with other AI-powered devices, or with human users. This paper aims to examine the joint use of LiDAR sensors and AI to understand its importance in smart city environments.
Human Detection and Distance Estimation Using a Camera and the YOLOv3 Neural Network
Making machines perceive the environment better than, or at least as well as, humans would be
beneficial in many domains. Various sensors aid in this task, the most widely used of which is
the monocular camera. Object detection is a major part of environment perception, and its
accuracy has greatly improved in the last few years thanks to advanced machine learning
methods called convolutional neural networks (CNNs) that are trained on many labelled
images. A monocular camera image contains two-dimensional information but no depth
information about the scene. On the other hand, depth information about objects is important
in many areas related to autonomous driving, e.g., working next to an automated machine, or a
pedestrian crossing a road in front of an autonomous vehicle.
This thesis presents an approach to detect humans and to predict their distance from an RGB
camera for off-road autonomous driving. This is done by extending YOLOv3 (You Only Look
Once) [1], a state-of-the-art object detection CNN. Outside of this thesis, an off-road scene
depicting a snowy forest with humans in different body poses was simulated using AirSim
and Unreal Engine. Data for training the YOLOv3 network was extracted from the simulation
using custom scripts. The network was also modified to predict not only humans and their
bounding boxes, but also their distance from the camera. An RMSE (Root Mean Square Error)
of 2.99 m for objects at distances up to 50 m was achieved, while maintaining detection
accuracy similar to the original network. Comparable methods, using two separate neural
networks and a LASSO model respectively, gave RMSEs of 4.26 m (on an alternative dataset)
and 4.79 m (on the dataset used in this work), showing a large improvement over the
baselines. Future work includes experiments with real-world data to see whether the proposed
approach generalizes to other environments.
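For reference, the RMSE figure quoted above is the standard root-mean-square error over paired predicted and ground-truth distances. A minimal sketch, where the sample values are invented for illustration:

```python
# Minimal sketch of the RMSE metric used to score distance predictions.
# The sample distances below are made up, not results from the thesis.
import math

def rmse(predicted, actual):
    """Root Mean Square Error over paired distance estimates (in metres)."""
    assert len(predicted) == len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted))

pred = [10.2, 24.5, 48.0]   # distances predicted by the network
true = [11.0, 25.0, 46.5]   # ground-truth distances
print(round(rmse(pred, true), 3))  # 1.023
```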
MS3D++: Ensemble of Experts for Multi-Source Unsupervised Domain Adaptation in 3D Object Detection
Deploying 3D detectors in unfamiliar domains has been demonstrated to result
in a significant 70-90% drop in detection rate due to variations in lidar,
geography, or weather from their training dataset. This domain gap leads to
missing detections for densely observed objects, misaligned confidence scores,
and increased high-confidence false positives, rendering the detector highly
unreliable. To address this, we introduce MS3D++, a self-training framework for
multi-source unsupervised domain adaptation in 3D object detection. MS3D++
generates high-quality pseudo-labels, allowing 3D detectors to achieve high
performance on a range of lidar types, regardless of their density. Our
approach effectively fuses predictions of an ensemble of multi-frame
pre-trained detectors from different source domains to improve domain
generalization. We subsequently refine predictions temporally to ensure
temporal consistency in box localization and object classification.
Furthermore, we present an in-depth study into the performance and
idiosyncrasies of various 3D detector components in a cross-domain context,
providing valuable insights for improved cross-domain detector ensembling.
Experimental results on Waymo, nuScenes and Lyft demonstrate that detectors
trained with MS3D++ pseudo-labels achieve state-of-the-art performance,
comparable to training with human-annotated labels in Bird's Eye View (BEV)
evaluation for both low and high density lidar. Code is available at
https://github.com/darrenjkt/MS3
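The core idea of fusing overlapping boxes from an ensemble of pre-trained detectors into a single pseudo-label can be approximated, in its simplest form, by confidence-weighted box averaging. This is a generic simplification for illustration, not MS3D++'s actual fusion algorithm; all thresholds are assumed:

```python
# Simplified sketch of fusing overlapping detections from several detectors
# into one pseudo-label via confidence-weighted averaging. This illustrates
# the general ensembling idea only, not MS3D++'s actual method.
import numpy as np

def iou(a, b):
    """Intersection-over-union of two axis-aligned BEV boxes (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def fuse_boxes(boxes, scores, iou_thr=0.5):
    """Greedily group boxes that overlap, then average each group by score."""
    order = np.argsort(scores)[::-1]          # highest confidence first
    fused, used = [], set()
    for i in order:
        if i in used:
            continue
        group = [j for j in order if j not in used and iou(boxes[i], boxes[j]) >= iou_thr]
        used.update(group)
        w = scores[group] / scores[group].sum()
        fused.append((w[:, None] * boxes[group]).sum(axis=0))
    return np.array(fused)

boxes = np.array([[0, 0, 2, 2], [0.1, 0, 2.1, 2], [5, 5, 6, 6]], dtype=float)
scores = np.array([0.9, 0.6, 0.8])
print(fuse_boxes(boxes, scores).shape)  # (2, 4): two boxes merged, one kept
```

The paper's pipeline additionally refines such fused boxes temporally across frames, which this sketch omits.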
Real-time spatial modeling to detect and track resources on construction sites
For more than 10 years the U.S. construction industry has experienced over 1,000
fatalities annually. Many fatalities may have been prevented had the individuals and
equipment involved been more aware of and alert to the physical state of the environment
around them. Awareness may be improved by automatic 3D (three-dimensional) sensing
and modeling of the job site environment in real-time. Existing 3D modeling approaches
based on range scanning techniques are capable of modeling static objects only, and thus
cannot model in real-time dynamic objects in an environment comprised of moving
humans, equipment, and materials. Emerging prototype 3D video range cameras offer
another alternative by facilitating affordable, wide field of view, automated static and
dynamic object detection and tracking at frame rates better than 1Hz (real-time).
This dissertation presents empirical work and a methodology to rapidly create a
spatial model of construction sites, and in particular to detect, model, and track the position, dimension, direction, and velocity of static and moving project resources in real-time, based on range data obtained from a three-dimensional video range camera in a
static or moving position. Existing construction site 3D modeling approaches based on
optical range sensing technologies (laser scanners, rangefinders, etc.) and 3D modeling
approaches (dense, sparse, etc.) that offered potential solutions for this research are
reviewed. The choice of an emerging sensing tool and preliminary experiments with this
prototype sensing technology are discussed. These findings led to the development of a
range data processing algorithm based on three-dimensional occupancy grids which is
demonstrated in detail. Testing and validation of the proposed algorithms have been
conducted to quantify the performance of sensor and algorithm through extensive
experimentation involving static and moving objects. Experiments in indoor laboratory
and outdoor construction environments have been conducted with construction resources
such as humans, equipment, materials, or structures to verify the accuracy of the
occupancy grid modeling approach. Results show that modeling objects and measuring
their position, dimension, direction, and speed had an accuracy level compatible with the
requirements of active safety features for construction. Results demonstrate that video
rate 3D data acquisition and analysis of construction environments can support effective
detection, tracking, and convex hull modeling of objects. Exploiting rapidly generated
three-dimensional models for improved visualization, communications, and process
control has inherent value, broad application, and potential impact, e.g. as-built vs. as-planned comparison, condition assessment, maintenance, operations, and construction
activities control. In combination with effective management practices, this sensing
approach has the potential to help equipment operators avoid incidents that result in
human injury, death, or collateral damage on construction sites.
Civil, Architectural, and Environmental Engineering
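The three-dimensional occupancy grid at the core of the range-data processing algorithm can be sketched minimally as follows; grid bounds and cell size are illustrative assumptions, not the dissertation's parameters:

```python
# Minimal 3D occupancy-grid sketch: bin range-camera points into voxels and
# flag cells containing at least one return. Grid origin, cell size, and
# shape are illustrative, not the dissertation's actual parameters.
import numpy as np

def occupancy_grid(points, origin=(0.0, 0.0, 0.0), cell=0.2, shape=(50, 50, 20)):
    """Return a boolean voxel grid: True where any point falls in the cell."""
    grid = np.zeros(shape, dtype=bool)
    idx = np.floor((points - np.asarray(origin)) / cell).astype(int)
    # keep only indices that land inside the grid bounds
    ok = np.all((idx >= 0) & (idx < np.asarray(shape)), axis=1)
    grid[tuple(idx[ok].T)] = True
    return grid

pts = np.array([[1.0, 1.0, 0.5], [1.05, 1.0, 0.5], [9.0, 9.0, 3.0]])
g = occupancy_grid(pts)
print(g.sum())  # 2 occupied cells: the first two points share a voxel
```

Detecting and tracking a moving object then reduces to comparing which cells are occupied from one frame to the next, which is tractable at the better-than-1 Hz rates the dissertation targets.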
Design and implementation of a sensor testing system with use of a cable drone
Abstract. This thesis aims to develop a testing method for various sensors by modifying a commercial cable cam system to drive at constant speed under an automated process. The first goal is to find a way to lift the cables into the air securely, without humans needing to climb ladders to place them afterwards. This is achieved with a hinged truss tower structure that keeps the cables stable while the tower is lifted. Another goal is to achieve automated movement of the cable drone. This is done by connecting a tracking camera to a computer that also controls the cable drone's motor controller, making the drone behave in a desired way depending on the tracking camera's position data. The third goal is to build a portable sensor system that collects and saves the data from the tested sensors. This goal is achieved with an aluminium profile frame equipped with all the necessary equipment, such as a powerful computer.
The research included studying different sensors' performance-evaluation criteria and the effect of wind on the magnitude of the force in this application. It was carried out by studying written sources and by consulting a cable camera company called Motion Compound GbR. The results of this master's thesis are used to evaluate whether the idea of using a cable cam is applicable to this kind of sensor testing system. In conclusion, the cable drone with automated driving is evaluated to be a practical method, which can be developed further to meet the requirements even better.