110 research outputs found

    Inimeste tuvastamine ning kauguse hindamine kasutades kaamerat ning YOLOv3 tehisnärvivõrku (Human Detection and Distance Estimation Using a Camera and the YOLOv3 Neural Network)

    Making machines perceive the environment as well as or better than humans would be beneficial in many domains. Various sensors aid in this, the most widely used of which is the monocular camera. Object detection is a major part of environment perception, and its accuracy has greatly improved in the last few years thanks to advanced machine learning methods called convolutional neural networks (CNNs), which are trained on many labelled images. A monocular camera image contains two-dimensional information but no depth information about the scene. Depth information about objects is nevertheless important in many areas related to autonomous driving, e.g. when a person works next to an automated machine or a pedestrian crosses the road in front of an autonomous vehicle. This thesis presents an approach for detecting humans and predicting their distance using an RGB camera for off-road autonomous driving. This is done by improving YOLO (You Only Look Once) v3 [1], a state-of-the-art object detection CNN. Outside of this thesis, an off-road scene depicting a snowy forest with humans in different body poses was simulated using AirSim and Unreal Engine. Data for training the YOLOv3 neural network was extracted from the simulation using custom scripts. The network was also modified to predict not only humans and their bounding boxes but also their distance from the camera. An RMSE (root mean square error) of 2.99 m for objects at distances up to 50 m was achieved, while maintaining detection accuracy similar to that of the original network. Comparable methods using two separate neural networks and a LASSO model gave RMSEs of 4.26 m (on an alternative dataset) and 4.79 m (on the dataset used in this work), respectively, a substantial improvement over the baselines. Future work includes training and testing the method with real-world data to see whether the proposed approach generalizes to such environments.
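The RMSE figures quoted in this abstract are restricted to objects within 50 m. As a rough illustration of how such a range-limited RMSE could be computed, here is a minimal sketch. The function name `rmse_within_range`, its parameters, and the sample values are hypothetical and are not taken from the thesis.

```python
import math


def rmse_within_range(predicted_m, actual_m, max_range_m=50.0):
    """Root mean square error over samples whose true distance is <= max_range_m.

    All names here are illustrative assumptions, not the thesis's own code.
    """
    # Keep only the pairs whose ground-truth distance lies within the cutoff.
    pairs = [(p, a) for p, a in zip(predicted_m, actual_m) if a <= max_range_m]
    if not pairs:
        raise ValueError("no samples within the requested range")
    # Mean of squared errors, then the square root.
    mean_sq = sum((p - a) ** 2 for p, a in pairs) / len(pairs)
    return math.sqrt(mean_sq)


# Made-up distances in metres; the last sample lies beyond 50 m and is ignored.
predicted = [10.0, 22.0, 47.0, 80.0]
actual = [12.0, 20.0, 50.0, 60.0]
error = rmse_within_range(predicted, actual)
```

Filtering on the ground-truth distance rather than the predicted one keeps the evaluated set independent of the model being scored, which is one plausible reading of "objects with distances up to 50m".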

    Resilient Perception for Outdoor Unmanned Ground Vehicles

    This thesis promotes the development of resilience for perception systems with a focus on Unmanned Ground Vehicles (UGVs) in adverse environmental conditions. Perception is the interpretation of sensor data to produce a representation of the environment that is necessary for subsequent decision making. Long-term autonomy requires perception systems that function correctly in unusual but realistic conditions that will eventually occur during extended missions. State-of-the-art UGV systems can fail when the sensor data are beyond the operational capacity of the perception models. The key to a resilient perception system lies in the use of multiple sensor modalities and the pre-selection of appropriate sensor data to minimise the chance of failure. This thesis proposes a framework based on diagnostic principles to evaluate and pre-select sensor data prior to interpretation by the perception system. Image-based quality metrics are explored and evaluated experimentally using infrared (IR) and visual cameras onboard a UGV in the presence of smoke and airborne dust. A novel quality metric, Spatial Entropy (SE), is introduced and evaluated. The proposed framework is applied to a state-of-the-art Visual-SLAM algorithm combining visual and IR imaging as a real-world example. An extensive experimental evaluation demonstrates that the framework allows for camera-based localisation that is resilient to a range of low-visibility conditions when compared to other methods that use a single sensor or combine sensor data without selection. The proposed framework allows for resilient localisation in adverse conditions using image data, and it also has significant potential to benefit many other perception applications. Employing multiple sensing modalities along with pre-selection of appropriate data is a powerful method to create resilient perception systems by anticipating and mitigating errors. The development of such resilient perception systems is a requirement for next-generation outdoor UGVs.

    Assessment of simulated and real-world autonomy performance with small-scale unmanned ground vehicles

    Off-road autonomy is a challenging topic that requires robust systems to both understand and navigate complex environments. While on-road autonomy has seen a major expansion in recent years in the consumer space, off-road systems are mostly relegated to niche applications. However, these applications can provide safety and navigation to dangerous areas that are the most suited for autonomy tasks. Traversability analysis is at the core of many of the algorithms employed in these topics. In this thesis, a Clearpath Robotics Jackal vehicle is equipped with a 3D Ouster laser scanner to define and traverse off-road environments. The Mississippi State University Autonomous Vehicle Simulator (MAVS) and the Navigating All Terrains Using Robotic Exploration (NATURE) autonomy stack are used in conjunction with the small-scale vehicle platform to traverse uneven terrain and collect data. Additionally, the NATURE stack is used as a point of comparison between a MAVS-simulated and a physical Clearpath Robotics Jackal vehicle in testing.

    Stereo vision for obstacle detection in autonomous vehicle navigation

    Master's, MASTER OF ENGINEERING

    Road detection using intrinsic colors in a stereo vision system

    Master's, MASTER OF ENGINEERING

    Online self-supervised learning for road detection

    We present a computer vision system for intelligent vehicles that distinguishes obstacles from roads by exploring online and self-supervised learning. It uses geometric information, derived from stereo-based obstacle detection, to obtain weak training labels for an SVM classifier. Subsequently, the SVM improves the road detection result by classifying image regions on the basis of appearance information. In this work, we experimentally evaluate different image features to model road and obstacle appearances. It is shown that using both geometric information and hue-saturation appearance information improves the road detection task.

    Spectral LADAR: Active Range-Resolved Imaging Spectroscopy

    Imaging spectroscopy using ambient or thermally generated optical sources is a well-developed technique for capturing two-dimensional images with high per-pixel spectral resolution. The per-pixel spectral data is often a sufficient sampling of a material's backscatter spectrum to infer chemical properties of the constituent material to aid in substance identification. Separately, conventional LADAR sensors use quasi-monochromatic laser radiation to create three-dimensional images of objects at high angular resolution, compared to RADAR. Advances in dispersion-engineered photonic crystal fibers in recent years have made high spectral radiance optical supercontinuum sources practical, enabling this study of Spectral LADAR, a continuous polychromatic spectrum augmentation of conventional LADAR. This imaging concept, which combines multi-spectral and 3D sensing at a physical level, is demonstrated with 25 independent and parallel LADAR channels and generates point cloud images with three spatial dimensions and one spectral dimension. The independence of spectral bands is a key characteristic of Spectral LADAR. Each spectral band maintains a separate time waveform record, from which target parameters are estimated. Accordingly, the spectrum computed for each backscatter reflection is independently and unambiguously range unmixed from multiple target reflections that may arise from transmission of a single panchromatic pulse. This dissertation presents the theoretical background of Spectral LADAR, a shortwave infrared laboratory demonstrator system constructed as a proof-of-concept prototype, and the experimental results obtained by the prototype when imaging scenes at standoff ranges of 45 meters. The resultant point cloud voxels are spectrally classified into a number of material categories, which enhances object and feature recognition. Experimental results demonstrate the physical-level combination of active backscatter spectroscopy and range-resolved sensing to produce images with a level of complexity, detail, and accuracy that is not obtainable with data-level registration and fusion of conventional imaging spectroscopy and LADAR. The capabilities of Spectral LADAR are expected to be useful in a range of applications, such as biomedical imaging and agriculture, but particularly when applied as a sensor in unmanned ground vehicle navigation. Applications to autonomous mobile robotics are the principal motivators of this study, and are specifically addressed.

    HybridFusion: LiDAR and Vision Cross-Source Point Cloud Fusion

    Recently, cross-source point cloud registration from different sensors has become a significant research focus. However, traditional methods confront challenges due to the varying density and structure of cross-source point clouds. In order to solve these problems, we propose a cross-source point cloud fusion algorithm called HybridFusion. It can register cross-source dense point clouds from different viewing angles in large outdoor scenes. The entire registration process is a coarse-to-fine procedure. First, the point cloud is divided into small patches, and a matching patch set is selected based on global descriptors and spatial distribution, which constitutes the coarse matching process. To achieve fine matching, 2D registration is performed by extracting 2D boundary points from patches, followed by 3D adjustment. Finally, the results of multiple patch pose estimates are clustered and fused to determine the final pose. The proposed approach is evaluated comprehensively through qualitative and quantitative experiments. To assess the robustness of cross-source point cloud registration, the proposed method is compared with the generalized iterative closest point method. Furthermore, a metric for describing the degree of point cloud filling is proposed. The experimental results demonstrate that our approach achieves state-of-the-art performance in cross-source point cloud registration.