14 research outputs found

    Point cloud voxel classification of aerial urban LiDAR using voxel attributes and random forest approach

    The opportunities now afforded by increasingly available, dense, aerial urban LiDAR point clouds (greater than 100 pts/m2) are arguably stymied by their sheer size, which precludes the effective use of many tools designed for point cloud data mining and classification. This paper introduces the point cloud voxel classification (PCVC) method, an automated, two-step solution for classifying terabytes of data without overwhelming the computational infrastructure. First, the point cloud is voxelized to reduce the number of points that need to be processed sequentially. Next, descriptive voxel attributes are assigned to aid in further classification. These attributes describe the point distribution within each voxel and the voxel's geo-location. They include five point descriptors (density, standard deviation, clustered points, fitted plane, and plane angle) and two voxel position attributes (elevation and neighbors). A random forest algorithm is then used for final classification of the object within each voxel into four categories: ground, roof, wall, and vegetation. The proposed approach was evaluated using a 297,126,417-point dataset from a 1 km2 area in Dublin, Ireland, and a 50% denser, 13,912,692-point dataset covering 150 m2 of New York City. PCVC's main advantage is scalability, achieved through a 99% reduction in the number of points that need to be sequentially categorized. Additionally, PCVC demonstrated strong classification results (precision of 0.92, recall of 0.91, and F1-score of 0.92) compared to previous work on the same dataset (precision of 0.82-0.91, recall of 0.86-0.89, and F1-score of 0.85-0.90). This work was funded by National Science Foundation award 1940145.
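
    The abstract describes voxelization followed by per-voxel attributes and a random forest. The sketch below is a minimal, illustrative rendering of that general pipeline, not the paper's implementation: the voxel size, the reduced attribute set (density, vertical spread, elevation), and the synthetic labels are assumptions for demonstration only.

```python
# Illustrative sketch (not the paper's code): voxelize a point cloud, compute
# simple per-voxel attributes, and classify voxels with a random forest.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def voxelize(points, voxel_size=1.0):
    """Group points by integer voxel index and return per-voxel attributes."""
    keys = np.floor(points[:, :3] / voxel_size).astype(np.int64)
    voxels = {}
    for key, pt in zip(map(tuple, keys), points):
        voxels.setdefault(key, []).append(pt)

    features, indices = [], []
    for key, pts in voxels.items():
        pts = np.asarray(pts)
        density = len(pts)              # point count per voxel
        std_z = pts[:, 2].std()         # vertical spread of points
        elevation = key[2]              # voxel height index
        features.append([density, std_z, elevation])
        indices.append(key)
    return np.asarray(features), indices

# Synthetic example: random points with made-up per-voxel labels
# (0=ground, 1=roof, 2=wall, 3=vegetation).
rng = np.random.default_rng(0)
points = rng.uniform(0, 50, size=(10_000, 3))
X, voxel_keys = voxelize(points, voxel_size=2.0)
y = rng.integers(0, 4, size=len(X))

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict(X[:5]))
```

    Classifying one feature vector per voxel rather than every raw point is what gives the large reduction in the number of items that must be processed sequentially.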

    Smart fusion of mobile laser scanner data with large scale topographic maps


    Lidar-based Obstacle Detection and Recognition for Autonomous Agricultural Vehicles

    Today, agricultural vehicles are available that can drive autonomously and follow exact route plans more precisely than human operators. Combined with advancements in precision agriculture, autonomous agricultural robots can reduce manual labor, improve workflow, and optimize yield. However, as of today, human operators are still required for monitoring the environment and acting upon potential obstacles in front of the vehicle. To eliminate this need, safety must be ensured by accurate and reliable obstacle detection and avoidance systems. In this thesis, lidar-based obstacle detection and recognition in agricultural environments have been investigated. A rotating multi-beam lidar generating 3D point clouds was used for point-wise classification of agricultural scenes, while multi-modal fusion with cameras and radar was used to increase performance and robustness. Two research perception platforms were presented and used for data acquisition. The proposed methods were all evaluated on recorded datasets that represented a wide range of realistic agricultural environments and included both static and dynamic obstacles. For 3D point cloud classification, two methods were proposed for handling density variations during feature extraction. One method outperformed a frequently used generic 3D feature descriptor, whereas the other method showed promising preliminary results using deep learning on 2D range images. For multi-modal fusion, four methods were proposed for combining lidar with color camera, thermal camera, and radar. Gradual improvements in classification accuracy were seen as spatial, temporal, and multi-modal relationships were introduced in the models. Finally, occupancy grid mapping was used to fuse and map detections globally, and runtime obstacle detection was applied on mapped detections along the vehicle path, thus simulating an actual traversal. The proposed methods serve as a first step towards full autonomy for agricultural vehicles. The study has thus shown that recent advancements in autonomous driving can be transferred to the agricultural domain when accurate distinctions are made between obstacles and processable vegetation. Future research in the domain has further been facilitated with the release of the multi-modal obstacle dataset, FieldSAFE.
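
    As a small illustration of the 2D range-image representation mentioned above, the following sketch projects a rotating-lidar point cloud onto an azimuth/elevation grid. The sensor field of view, grid resolution, and random test data are assumptions, not the thesis configuration.

```python
# Minimal sketch: project a 3D point cloud from a rotating multi-beam lidar
# onto a 2D range image indexed by azimuth (columns) and elevation (rows).
import numpy as np

def to_range_image(points, n_rows=16, n_cols=360, fov_up=15.0, fov_down=-15.0):
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points[:, :3], axis=1) + 1e-9   # range per point

    azimuth = np.arctan2(y, x)                          # horizontal angle, [-pi, pi]
    elevation = np.arcsin(z / r)                        # vertical angle

    fov_up, fov_down = np.radians(fov_up), np.radians(fov_down)
    col = ((azimuth + np.pi) / (2 * np.pi) * n_cols).astype(int) % n_cols
    row = (fov_up - elevation) / (fov_up - fov_down) * n_rows
    row = np.clip(row.astype(int), 0, n_rows - 1)

    image = np.full((n_rows, n_cols), np.inf)
    np.minimum.at(image, (row, col), r)                 # keep nearest return per pixel
    image[np.isinf(image)] = 0.0                        # empty pixels -> 0
    return image

points = np.random.default_rng(1).normal(size=(5000, 3)) * 10
print(to_range_image(points).shape)                     # (16, 360)
```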

    Road Information Extraction from Mobile LiDAR Point Clouds using Deep Neural Networks

    Urban roads, as one of the essential transportation infrastructures, provide considerable motivation for rapid urban sprawl and bring notable economic and social benefits. Accurate and efficient extraction of road information plays a significant role in the development of autonomous vehicles (AVs) and high-definition (HD) maps. Mobile laser scanning (MLS) systems have been widely used for many transportation-related studies and applications in road inventory, including road object detection, pavement inspection, road marking segmentation and classification, and road boundary extraction, benefiting from their large-scale data coverage, high surveying flexibility, high measurement accuracy, and reduced weather sensitivity. Road information from MLS point clouds is significant for road infrastructure planning and maintenance, and has an important impact on transportation-related policymaking, driving behaviour regulation, and traffic efficiency enhancement. Compared to existing threshold-based and rule-based road information extraction methods, deep learning methods have demonstrated superior performance in 3D road object segmentation and classification tasks. However, three main challenges remain that impede deep learning methods from precisely and robustly extracting road information from MLS point clouds. (1) Point clouds obtained from MLS systems are large in volume and irregular in format, which presents significant challenges for managing and processing such massive unstructured points. (2) Variations in point density and intensity are inevitable because of the profiling scanning mechanism of MLS systems. (3) Due to occlusions and the limited scanning range of onboard sensors, some road objects are incomplete, which considerably degrades the performance of threshold-based methods in extracting road information. To deal with these challenges, this doctoral thesis proposes several deep neural networks that encode inherent point cloud features and extract road information. These novel deep learning models have been tested on several datasets and deliver robust and accurate road information extraction results compared to state-of-the-art deep learning methods in complex urban environments. First, an end-to-end feature extraction framework for 3D point cloud segmentation is proposed using dynamic point-wise convolutional operations at multiple scales. This framework is less sensitive to data distribution and computational power. Second, a capsule-based deep learning framework to extract and classify road markings is developed to update road information and support HD maps. It demonstrates the practical application of combining capsule networks with hierarchical feature encodings of georeferenced feature images. Third, a novel deep learning framework for road boundary completion is developed using MLS point clouds and satellite imagery, based on the U-shaped network and the conditional deep convolutional generative adversarial network (c-DCGAN). Empirical evidence obtained from experiments compared with state-of-the-art methods demonstrates the superior performance of the proposed models in road object semantic segmentation, road marking extraction and classification, and road boundary completion tasks.
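
    As a rough illustration of per-point deep learning on unstructured point clouds, the sketch below shows a tiny PointNet-style segmentation head built from shared 1x1 convolutions. It is a generic stand-in, not the thesis's dynamic point-wise convolution framework; the layer widths and the four road-related classes are assumptions.

```python
# Illustrative per-point semantic segmentation: shared MLPs over points plus a
# global max-pooled context feature, producing one class score per point.
import torch
import torch.nn as nn

class PerPointSegmenter(nn.Module):
    def __init__(self, in_channels=3, num_classes=4):
        super().__init__()
        self.point_mlp = nn.Sequential(           # shared per-point feature MLP
            nn.Conv1d(in_channels, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
        )
        self.head = nn.Conv1d(128 + 128, num_classes, 1)  # per-point logits

    def forward(self, xyz):                        # xyz: (B, 3, N)
        local = self.point_mlp(xyz)                # (B, 128, N) local features
        global_feat = local.max(dim=2, keepdim=True).values   # (B, 128, 1)
        global_feat = global_feat.expand(-1, -1, xyz.shape[2])
        return self.head(torch.cat([local, global_feat], dim=1))  # (B, C, N)

points = torch.rand(2, 3, 1024)                    # batch of 2 clouds, 1024 points
logits = PerPointSegmenter()(points)
print(logits.shape)                                # torch.Size([2, 4, 1024])
```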

    Machine learning algorithms for structured decision making


    Deep Learning Based Methods for Outdoor Robot Localization and Navigation

    The number of elderly people is increasing around the globe. To support the growing ageing society, mobile robots are one viable choice for assisting the elderly in their daily activities. These activities take place in many settings, both indoor and outdoor. Although outdoor activities benefit the elderly in many ways, outdoor environments pose difficulties due to their unpredictable nature. Mobile robots supporting humans in outdoor environments must automatically traverse these difficulties using suitable navigation systems. Navigation is a core component of any mobile robot: the navigation system guides the robot to its destination, where it can perform its designated tasks. There are various tools to choose from for navigation systems, and outdoor environments are mostly open to conventional navigation tools such as Global Positioning System (GPS) devices. In this thesis, three systems for localization and navigation of mobile robots based on visual data and deep learning algorithms are proposed. The first localization system is based on landmark detection. The Faster Region-based Convolutional Neural Network (Faster R-CNN) detects landmarks and signs in the captured image, and a Feed-Forward Neural Network (FFNN) is trained to determine robot location coordinates and compass orientation from the detected landmarks. The dataset, consisting of images, geolocation data, and labeled bounding boxes, is used to train and test the two proposed localization methods. Results are reported as absolute errors from comparisons between the localization results and the reference geolocation data in the dataset. The second system is a navigation system based on visual data and a deep reinforcement learning algorithm called Deep Q-Network (DQN). The DQN automatically guides the mobile robot using visual data in the form of images received from a single Universal Serial Bus (USB) camera attached to the robot. The DQN consists of a deep neural network, namely a convolutional neural network (CNN), and a reinforcement learning algorithm named Q-learning. It can make decisions with visual data as input, using experience gained from the consequences of trial-and-error attempts. Our DQN agents are trained in simulation environments provided by a platform based on the First-Person Shooter (FPS) game ViZDoom. Training is performed in simulation to avoid any possible damage to the real robot during the trial-and-error process; the simulated perspective is the same as if a camera were attached to the front of the mobile robot. Because there are many differences between the simulation and the real world, we applied a marker-based Augmented Reality (AR) algorithm to reduce these differences by altering the visual data from the camera with resources from the simulation. The second system is assigned a simple navigation task, in which the starting location is fixed but the goal location is random within a designated zone. The robot must be able to detect and track the goal object using a USB camera as its only sensor. Once started, the robot must move from its starting location to the designated goal object. Our DQN navigation method is tested in the simulation and on the real robot. The performance of our DQN is measured quantitatively via average total scores and the number of successful navigation attempts. The results show that our DQN can effectively guide a mobile robot to the goal object in the simple navigation task, both in the simulation and in the real world. The third system employs a Transfer Learning (TL) strategy to reduce the training time and resources required when training DQN agents on newly added tasks. The new task is to reach the goal while also avoiding obstacles, and the starting and goal locations are both random within specified areas. The employed transfer learning strategy uses the whole network of the DQN agent trained for the first, simple navigation task as the base for training the DQN agent for the second task. Training in our TL strategy decreases the exploration factor, which causes the agent to rely on existing knowledge from the base network more than on randomly selected actions during training. This results in decreased training time, as optimal solutions can be found faster than when training from scratch. We evaluate the performance of our TL strategy by comparing DQN agents trained with TL at different exploration factor values against a DQN agent trained from scratch. Additionally, the TL agents are trained with a reduced number of episodes to further demonstrate their performance. All DQN agents for the second navigation task are tested in the simulation to avoid any possible and uncontrollable damage from the obstacles. Performance is measured through successful attempts and average total scores, as in the first navigation task. Results show that DQN agents trained via the TL strategy can greatly outperform the agent trained from scratch, despite the lower number of training episodes. Doctor of Engineering, Hosei University.
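
    A minimal sketch of the DQN components described above: a small convolutional Q-network over camera frames, epsilon-greedy action selection, and a one-step Q-learning target. The image size, action count, network shape, and the commented-out transfer-learning step are illustrative assumptions, not the thesis setup.

```python
# Illustrative DQN building blocks, not the thesis implementation.
import random
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    def __init__(self, num_actions=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * 9 * 9, 256), nn.ReLU(),
            nn.Linear(256, num_actions),
        )

    def forward(self, x):                  # x: (B, 3, 84, 84) camera frames
        return self.net(x)

q_net = QNetwork()
# Transfer learning as described above: start from a pretrained base network
# and use a lower exploration factor so the agent reuses existing knowledge.
# q_net.load_state_dict(torch.load("base_task_weights.pt"))  # hypothetical file
epsilon = 0.1

state = torch.rand(1, 3, 84, 84)
q_values = q_net(state)
if random.random() < epsilon:
    action = random.randrange(q_values.shape[1])   # explore: random action
else:
    action = int(q_values.argmax(dim=1))           # exploit: greedy action

# One-step Q-learning target for a sampled transition (s, a, r, s').
reward, gamma, next_state = 1.0, 0.99, torch.rand(1, 3, 84, 84)
with torch.no_grad():
    target = reward + gamma * q_net(next_state).max(dim=1).values
loss = nn.functional.mse_loss(q_values[0, action], target.squeeze())
```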

    Image-based recognition, 3D localization, and retro-reflectivity evaluation of high-quantity low-cost roadway assets for enhanced condition assessment

    Systematic condition assessment of high-quantity, low-cost roadway assets such as traffic signs, guardrails, and pavement markings requires frequent reporting on the location and up-to-date status of these assets. Today, most Departments of Transportation (DOTs) in the US collect data using camera-mounted vehicles to filter, annotate, organize, and present the data necessary for these assessments. However, the cost and complexity of collecting, analyzing, and reporting as-is conditions result in sparse and infrequent monitoring. Thus, some of the gains in efficiency are consumed by monitoring costs. This dissertation proposes to improve the frequency, detail, and applicability of image-based condition assessment by automating the detection, classification, and 3D localization of multiple types of high-quantity, low-cost roadway assets using both images collected by the DOTs and online databases such as Google Street View images. To address the new requirements of the US Federal Highway Administration (FHWA), a new method is also developed that simulates nighttime visibility of traffic signs from images taken during daytime and measures their retro-reflectivity condition. To initiate detection and classification of high-quantity, low-cost roadway assets from street-level images, a number of algorithms are proposed that automatically segment and localize high-level asset categories in 3D. The first set of algorithms focuses on detecting and segmenting assets at high-level categories. More specifically, a method based on Semantic Texton Forest classifiers segments each geo-registered 2D video frame at the pixel level based on shape, texture, and color. A Structure from Motion (SfM) procedure reconstructs the road and its assets in 3D. Next, a voting scheme assigns the most observed asset category to each point in 3D. The experimental results from this method are promising; nevertheless, because it relies on supervised ground-truth pixel labels for training, scaling it to various types of assets is challenging. To address this issue, a non-parametric image parsing method is proposed that leverages a lazy learning scheme for segmentation and recognition of roadway assets. The semi-supervised technique used in the proposed method does not need training, provides ground-truth data in a more efficient manner, and is easily scalable to thousands of video frames captured during data collection. Once the high-level asset categories are detected, specific techniques need to be exploited to detect and classify the assets at a higher level of granularity. To this end, the performance of three computer vision algorithms is evaluated for classification of traffic signs in the presence of cluttered backgrounds and static and dynamic occlusions. Without making any prior assumptions about the location of traffic signs in 2D, the best-performing method uses histograms of oriented gradients and color together with multiple one-vs-all Support Vector Machines, and classifies these assets into warning, regulatory, stop, and yield sign categories. To minimize the reliance on visual data collected by the DOTs and improve the frequency and applicability of condition assessment, a new end-to-end procedure is presented that applies the above algorithms and creates a comprehensive inventory of traffic signs using Google Street View images. By processing images extracted using the Google Street View API and discriminative classification scores from all images that see a sign, the most probable 3D location of each traffic sign is derived and shown on Google Earth using a dynamic heat map. A data card containing information about the location, type, and condition of each detected traffic sign is also created. Finally, a computer vision-based algorithm is proposed that measures the retro-reflectivity of traffic signs during daytime using a vehicle-mounted device. The algorithm simulates nighttime visibility of traffic signs from images taken during daytime and measures their retro-reflectivity. The technique is faster, cheaper, and safer than the state of the art, as it requires neither nighttime operation nor manual sign inspection. It also satisfies the measurement guidelines set forth by FHWA in terms of both granularity and accuracy. To validate the techniques, new detailed video datasets and their ground truth were generated from a 2.2-mile smart-road research facility and two interstate highways in the US. The comprehensive dataset contains over 11,000 annotated US traffic sign images and exhibits large variations in sign pose, scale, background, illumination, and occlusion conditions. The performance of all algorithms was examined using these datasets. For retro-reflectivity measurement of traffic signs, experiments were conducted at different times of day and at different distances, and results were compared with a method recommended by ASTM standards. The experimental results show promise in the scalability of these methods for reducing the time and effort required to develop road inventories, especially for assets such as guardrails and traffic lights that are not typically considered in 2D asset recognition methods, as well as for multiple categories of traffic signs. The applicability of Google Street View images for inventory management purposes, together with the technique for retro-reflectivity measurement during daytime, demonstrates strong potential for lowering inspection costs and improving safety in practical applications.
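
    The following sketch illustrates, under stated assumptions, the kind of feature-plus-classifier pipeline named above: HOG and color-histogram features fed to one-vs-all linear SVMs over the four sign categories. It is not the dissertation's implementation; the patch size, histogram bins, and synthetic training data are placeholders.

```python
# Illustrative traffic-sign classification: HOG + color histogram features
# with one-vs-all linear SVMs (warning, regulatory, stop, yield).
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC
from sklearn.multiclass import OneVsRestClassifier

def sign_features(rgb_image):
    """Concatenate HOG (on grayscale) with a coarse RGB color histogram."""
    gray = rgb_image.mean(axis=2)
    hog_vec = hog(gray, orientations=9, pixels_per_cell=(8, 8),
                  cells_per_block=(2, 2))
    color_hist, _ = np.histogramdd(rgb_image.reshape(-1, 3),
                                   bins=(8, 8, 8), range=[(0, 1)] * 3)
    return np.concatenate([hog_vec,
                           color_hist.ravel() / rgb_image[..., 0].size])

# Synthetic 64x64 patches with made-up labels
# (0=warning, 1=regulatory, 2=stop, 3=yield).
rng = np.random.default_rng(0)
patches = rng.uniform(0, 1, size=(40, 64, 64, 3))
labels = rng.integers(0, 4, size=40)

X = np.stack([sign_features(p) for p in patches])
clf = OneVsRestClassifier(LinearSVC(max_iter=5000)).fit(X, labels)
print(clf.predict(X[:5]))
```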