7 research outputs found

    A machine learning approach for the detection of supporting rock bolts from laser scan data in an underground mine

    This is the author accepted manuscript. The final version is available from Elsevier via the DOI in this record. Rock bolts are a crucial part of underground infrastructure support; however, current methods to locate and record their positions are manual, time-consuming and generally incomplete. This paper describes an effective method to automatically locate supporting rock bolts from a 3D laser scanned point cloud. The proposed method utilises a machine learning classifier combined with point descriptors based on neighbourhood properties to classify all data points as either ‘bolt’ or ‘not-bolt’ before using the Density Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm to divide the results into candidate bolt objects. The centroids of these objects are then computed and output as simple georeferenced 3D coordinates to be used by surveyors, mine managers and automated machines. Two classifiers were tested, a random forest and a shallow neural network, with the neural network providing the more accurate results. Alongside the different classifiers, different input feature types were also examined, including the eigenvalue based geometric features popular in the remote sensing community and the point histogram based features more common in the mobile robotics community. It was found that a combination of both feature sets provided the strongest results. The obtained precision and recall scores were 0.59 and 0.70 for the individual laser points and 0.93 and 0.86 for the bolt objects. This demonstrates that the model is robust to noise and misclassifications, as the bolt is still detected even if edge points are misclassified, provided that there are enough correct points to form a cluster. In some cases, the model can detect bolts which are not visible to the human interpreter. University of Exete
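
The clustering and centroid step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the coordinates, the `eps` and `min_pts` values, and the function names `dbscan` and `cluster_centroids` are all assumptions chosen for illustration, and the brute-force neighbour search stands in for the spatial index a real scan would need.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # Brute-force radius query (includes the point itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # core point: keep growing the cluster
                queue.extend(nbrs)
    return labels


def cluster_centroids(points, labels):
    """Return {cluster id: mean coordinate} for all non-noise clusters."""
    groups = {}
    for p, lab in zip(points, labels):
        if lab != -1:
            groups.setdefault(lab, []).append(p)
    return {
        lab: tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for lab, pts in groups.items()
    }


# Hypothetical coordinates of points a classifier has already labelled 'bolt'.
bolt_points = [
    (0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (0.1, 0.1, 0.0),
    (5.0, 5.0, 1.0), (5.1, 5.0, 1.0), (5.0, 5.1, 1.0), (5.1, 5.1, 1.0),
    (10.0, 10.0, 10.0),  # isolated false positive
]
labels = dbscan(bolt_points, eps=0.2, min_pts=3)
centroids = cluster_centroids(bolt_points, labels)
```

In this toy input the isolated point is labelled as noise and each of the two dense groups yields one centroid, which mirrors the abstract's observation that a few misclassified points do not prevent an object from being detected.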

    Three-dimensional Laser-based Classification in Outdoor Environments

    Robotics research strives to deploy autonomous systems in populated environments, such as inner city traffic. Autonomous cars need reliable collision avoidance, but also object recognition to distinguish different classes of traffic participants. For both tasks, fast three-dimensional laser range sensors generating multiple accurate laser range scans per second, each consisting of a vast number of laser points, are often employed. In this thesis, we investigate and develop classification algorithms that allow us to automatically assign semantic labels to laser scans. We mainly face two challenges: (1) we have to ensure consistent and correct classification results and (2) we must efficiently process a vast number of laser points per scan. In consideration of these challenges, we cover both stages of classification -- the feature extraction from laser range scans and the classification model that maps from the features to semantic labels. As for the feature extraction, we contribute by thoroughly evaluating important state-of-the-art histogram descriptors. We investigate critical parameters of the descriptors and experimentally show for the first time that the classification performance can be significantly improved using a large support radius and a global reference frame. As for learning the classification model, we contribute new algorithms that improve the classification efficiency and accuracy. Our first approach aims at deriving a consistent point-wise interpretation of the whole laser range scan. By combining efficient similarity-preserving hashing and multiple linear classifiers, we considerably improve the consistency of label assignments, requiring only minimal computational overhead compared to a single linear classifier. In the last part of the thesis, we aim at classifying objects represented by segments.
We propose a novel hierarchical segmentation approach comprising multiple stages and a novel mixture classification model of multiple bag-of-words vocabularies. We demonstrate superior performance of both approaches compared to their single component counterparts using challenging real world datasets.

    Semantic Segmentation and Completion of 2D and 3D Scenes

    Semantic segmentation is one of the fundamental problems in computer vision. This thesis addresses various tasks, all related to the fine-grained, i.e. pixel-wise or voxel-wise, semantic understanding of a scene. In recent years, semantic segmentation by 2D convolutional neural networks has become something of a default pre-processing step for many other computer vision tasks, since it outputs very rich spatially resolved feature maps and semantic labels that are useful for many higher level recognition tasks. In this thesis, we make several contributions to the field of semantic scene understanding using an image or a depth measurement, recorded by different types of laser sensors, as input. Firstly, we propose a new approach to 2D semantic segmentation of images. It consists of an adaptation of an existing approach for real time capability under constrained hardware demands that are required by a real life drone. The approach is based on a highly optimized implementation of random forests combined with a label propagation strategy. Next, we shift our focus to what we believe is one of the important next forefronts in computer vision: giving machines the ability to anticipate and extrapolate beyond what is captured in a single frame by a camera or depth sensor. This anticipation capability is what allows humans to efficiently interact with their environment. The need for this ability is most prominently displayed in the behaviour of today's autonomous cars. One of their shortcomings is that they only interpret the current sensor state, which prevents them from anticipating events which would require an adaptation of their driving policy. The result is a lot of sudden braking and non-human-like driving behaviour, which can provoke accidents or negatively impact the traffic flow. Therefore, we first propose a task to spatially anticipate semantic labels outside the field of view of an image.
The task is based on the Cityscapes dataset, where each image has been center cropped. The goal is to train an algorithm that predicts the semantic segmentation map in the area outside the cropped input region. Along with the task itself, we propose an efficient iterative approach based on 2D convolutional neural networks by designing a task adapted loss function. Afterwards, we switch to the 3D domain. In three dimensions the goal shifts from assigning pixel-wise labels towards the reconstruction of the full 3D scene using a grid of labeled voxels. Thereby one has to anticipate the semantics and geometry in the space that is occluded by the objects themselves from the viewpoint of an image or laser sensor. The task is known as 3D semantic scene completion and has recently caught a lot of attention. Here we propose two new approaches that advance the performance of existing 3D semantic scene completion baselines. The first one is a two stream approach where we leverage a multi-modal input consisting of images and Kinect depth measurements in an early fusion scheme. Moreover we propose a more memory efficient input embedding. The second approach to semantic scene completion leverages the power of the recently introduced generative adversarial networks (GANs). Here we construct a network architecture that follows the GAN principles and uses a discriminator network as an additional regularizer in the 3D-CNN training. With our proposed approaches in semantic scene completion we achieve a new state-of-the-art performance on two benchmark datasets. Finally we observe that one of the shortcomings in semantic scene completion is the lack of a realistic, large scale dataset. We therefore introduce the first real world dataset for semantic scene completion based on the KITTI odometry benchmark. 
By semantically annotating all scans of a 10 Hz Velodyne laser scanner, driving through urban and countryside areas, we obtain data that is valuable for many tasks including semantic scene completion. Along with the data we explore the performance of current semantic scene completion models as well as models for semantic point cloud segmentation and motion segmentation. The results show that there is still a lot of room for improvement on these tasks, so our dataset is a valuable contribution for future research in these directions
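
The spatial anticipation task in this abstract (predicting labels outside a center-cropped input) can be pictured with a small sketch. This is only an illustration of the data split, not the thesis's pipeline: the grid representation, the crop size, and the function name `center_crop_task` are assumptions.

```python
def center_crop_task(grid, crop_h, crop_w):
    """Split a 2D grid into a visible center crop and a mask of the outside region.

    Returns (visible, target_mask), where target_mask[r][c] is True for every
    cell that lies outside the crop and therefore has to be anticipated.
    """
    h, w = len(grid), len(grid[0])
    top, left = (h - crop_h) // 2, (w - crop_w) // 2
    visible = [row[left:left + crop_w] for row in grid[top:top + crop_h]]
    target_mask = [
        [not (top <= r < top + crop_h and left <= c < left + crop_w)
         for c in range(w)]
        for r in range(h)
    ]
    return visible, target_mask


# Hypothetical 4x4 input with a 2x2 visible center region.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
visible, target_mask = center_crop_task(image, crop_h=2, crop_w=2)
outside_cells = sum(cell for row in target_mask for cell in row)
```

A model for this task would receive only `visible` and be scored on its label predictions for the cells where `target_mask` is true.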

    Development of Mining Sector Applications for Emerging Remote Sensing and Deep Learning Technologies

    This thesis uses neural networks and deep learning to address practical, real-world problems in the mining sector. The main focus is on developing novel applications in the area of object detection from remotely sensed data. This area has many potential mining applications and is an important part of moving towards data driven strategic decision making across the mining sector. The scientific contributions of this research are twofold. Firstly, each of the three case studies demonstrates new applications which couple remote sensing and neural network based technologies for improved data driven decision making. Secondly, the thesis presents a framework to guide implementation of these technologies in the mining sector, providing a guide for researchers and professionals undertaking further studies of this type. The first case study builds a fully connected neural network method to locate supporting rock bolts from 3D laser scan data. This method combines input features from the remote sensing and mobile robotics research communities, generating accuracy scores up to 22% higher than those found using either feature set in isolation. The neural network approach is also compared to the widely used random forest classifier and is shown to outperform this classifier on the test datasets. Additionally, the algorithms’ performance is enhanced by adding a confusion class to the training data and by grouping the output predictions using density based spatial clustering. The method is tested on two datasets, gathered using different laser scanners, in different types of underground mines which have different rock bolting patterns. In both cases the method is found to be highly capable of detecting the rock bolts with recall scores of 0.87-0.96. The second case study investigates modern deep learning for LiDAR data. Here, multiple transfer learning strategies and LiDAR data representations are examined for the task of identifying historic mining remains.
A transfer learning approach based on a lunar crater detection model is used, due to the task similarities between both the underlying data structures and the geometries of the objects to be detected. The relationship between dataset resolution and detection accuracy is also examined, with the results showing that the approach is capable of detecting pits and shafts to a high degree of accuracy with precision and recall scores between 0.80 and 0.92, provided the input data is of sufficient quality and resolution. Alongside resolution, different LiDAR data representations are explored, showing that the precision-recall balance varies depending on the input LiDAR data representation. The third case study creates a deep convolutional neural network model to detect artisanal scale mining from multispectral satellite data. This model is trained from initialisation without transfer learning and demonstrates that accurate multispectral models can be built from a smaller training dataset when appropriate design and data augmentation strategies are adopted. Alongside the deep learning model, novel mosaicing algorithms are developed both to improve cloud cover penetration and to decrease noise in the final prediction maps. When applied to the study area, the results from this model provide valuable information about the expansion, migration and forest encroachment of artisanal scale mining in southwestern Ghana over the last four years. Finally, this thesis presents an implementation framework for these neural network based object detection models, to generalise the findings from this research to new mining sector deep learning tasks. This framework can be used to identify applications which would benefit from neural network approaches; to build the models; and to apply these algorithms in a real world environment.
The case study chapters confirm that the neural network models are capable of interpreting remotely sensed data to a high degree of accuracy on real world mining problems, while the framework guides the development of new models to solve a wide range of related challenges
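
The object-level precision and recall scores quoted in this abstract depend on how detections are matched to reference objects, which the abstract does not specify. The sketch below assumes one plausible scheme, greedy nearest-neighbour matching within a distance tolerance; the tolerance `tol`, the coordinates, and the function names are hypothetical.

```python
from math import dist


def match_detections(predicted, truth, tol):
    """Greedily match each prediction to the nearest unmatched true object.

    A prediction counts as a true positive only if that nearest object lies
    within `tol`. Returns (true positives, false positives, false negatives).
    """
    unmatched = list(range(len(truth)))
    tp = 0
    for p in predicted:
        best = min(unmatched, key=lambda i: dist(p, truth[i]), default=None)
        if best is not None and dist(p, truth[best]) <= tol:
            unmatched.remove(best)
            tp += 1
    return tp, len(predicted) - tp, len(truth) - tp


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical reference objects and detected centroids.
truth = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (15.0, 0.0, 0.0)]
predicted = [(0.1, 0.0, 0.0), (5.2, 0.0, 0.0), (30.0, 0.0, 0.0)]

tp, fp, fn = match_detections(predicted, truth, tol=0.5)
precision, recall = precision_recall(tp, fp, fn)
```

Here two of the three detections fall within the tolerance of a reference object, so precision is 2/3, and two of the four reference objects are found, so recall is 0.5.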

    Simultaneous localisation and mapping with prior information

    This thesis is concerned with Simultaneous Localisation and Mapping (SLAM), a technique by which a platform can estimate its trajectory with greater accuracy than odometry alone, especially when the trajectory incorporates loops. We discuss some of the shortcomings of the "classical" SLAM approach (in particular EKF-SLAM), which assumes that no information is known about the environment a priori. We argue that in general this assumption is needlessly stringent; for most environments, such as cities, some prior information is known. We introduce an initial Bayesian probabilistic framework which considers the world as a hierarchy of structures, and maps (such as those produced by SLAM systems) as consisting of features derived from them. Common underlying structure between features in maps allows one to express and thus exploit geometric relations between them to improve their estimates. We apply the framework to EKF-SLAM for the case of a vehicle equipped with a range-bearing sensor operating in an urban environment, building up a metric map of point features, and using a prior map consisting of line segments representing building footprints. We develop a novel method called the Dual Representation, which allows us to use information from the prior map to not only improve the SLAM estimate, but also reduce the severity of errors associated with the EKF. Using the Dual Representation, we investigate the effect of varying the accuracy of the prior map for the case where the underlying structures and thus relations between the SLAM map and prior map are known. We then generalise to the more realistic case, where there is "clutter" - features in the environment that do not relate to the prior map. This involves forming a hypothesis for whether a pair of features in the SLAM state and prior map were derived from the same structure, and evaluating this based on a geometric likelihood model.
Initially we try an incremental Multiple Hypothesis SLAM (MHSLAM) approach to resolve hypotheses, developing a novel method called the Common State Filter (CSF) to reduce the exponential growth in computational complexity inherent in this approach. This allows us to use information from the prior map immediately, thus reducing linearisation and EKF errors. However, we find that MHSLAM is still too inefficient, even with the CSF, so we use a strategy that delays applying relations until we can infer whether they apply; we defer applying information from structure hypotheses until their probability of holding exceeds a threshold. Using this method we investigate the effect of varying degrees of "clutter" on the performance of SLAM
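
The deferred strategy at the end of this abstract, where a structure hypothesis is applied only once its probability exceeds a threshold, can be sketched as a sequential Bayesian update. The thesis's geometric likelihood model is not described here, so the likelihood values, the prior, the threshold, and the function names below are all made-up placeholders.

```python
def update_probability(prior, lik_related, lik_clutter):
    """One Bayes update of P(hypothesis holds) from a single observation.

    lik_related: likelihood of the observation if the two features share a structure.
    lik_clutter: likelihood of the same observation if the feature is clutter.
    """
    numerator = prior * lik_related
    return numerator / (numerator + (1.0 - prior) * lik_clutter)


def first_accepted_step(prior, likelihood_pairs, threshold):
    """Return (step, probability) at which the hypothesis first exceeds the
    threshold, or (None, final probability) if it never does."""
    p = prior
    for step, (lik_related, lik_clutter) in enumerate(likelihood_pairs, start=1):
        p = update_probability(p, lik_related, lik_clutter)
        if p > threshold:
            return step, p
    return None, p


# Hypothetical likelihoods from three successive observations of one feature pair.
observations = [(0.6, 0.4), (0.8, 0.2), (0.9, 0.1)]
step, probability = first_accepted_step(0.5, observations, threshold=0.95)
```

With these placeholder values the relation would be withheld for the first two observations and applied at the third, when the accumulated evidence pushes the probability past the threshold.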