
    TractorEYE: Vision-based Real-time Detection for Autonomous Vehicles in Agriculture

    Agricultural vehicles such as tractors and harvesters have for decades been able to navigate automatically and more efficiently using commercially available products such as auto-steering and tractor-guidance systems. However, a human operator is still required inside the vehicle to ensure the safety of the vehicle and especially of its surroundings, such as humans and animals. For fully autonomous vehicles to be certified for farming, computer vision algorithms and sensor technologies must detect obstacles with performance equivalent to or better than human-level performance. Furthermore, detections must run in real time to allow vehicles to actuate and avoid collisions. This thesis proposes a detection system (TractorEYE), a dataset (FieldSAFE), and procedures to fuse information from multiple sensor technologies to improve obstacle detection and to generate a map. TractorEYE is a multi-sensor detection system for autonomous vehicles in agriculture. It consists of three hardware-synchronized and registered sensors (stereo camera, thermal camera, and multi-beam lidar) mounted in a ruggedized, water-resistant casing. Algorithms have been developed to run a total of six detection algorithms (four for the RGB camera, one for the thermal camera, and one for the multi-beam lidar) and to fuse detection information in a common format using either 3D positions or inverse sensor models. A GPU-powered computational platform runs the detection algorithms online. For the RGB camera, a deep learning algorithm, DeepAnomaly, is proposed to perform real-time anomaly detection of distant, heavily occluded, and unknown obstacles in agriculture. Compared to a state-of-the-art object detector, Faster R-CNN, DeepAnomaly detects humans better and at longer ranges (45-90 m) for an agricultural use case, using a smaller memory footprint and 7.3 times faster processing. The low memory footprint and fast processing make DeepAnomaly suitable for real-time applications running on an embedded GPU. FieldSAFE is a multi-modal dataset for the detection of static and moving obstacles in agriculture. The dataset includes synchronized recordings from an RGB camera, stereo camera, thermal camera, 360-degree camera, lidar, and radar. Precise localization and pose are provided using IMU and GPS. Ground truth for static obstacles (humans, mannequin dolls, barrels, buildings, vehicles, and vegetation) is available as an annotated orthophoto, and GPS coordinates are provided for moving obstacles. Detection information from multiple detection algorithms and sensors is fused into a map using inverse sensor models and occupancy grid maps. This thesis presents several scientific and state-of-the-art contributions to perception for autonomous tractors, including a dataset, a sensor platform, detection algorithms, and procedures for multi-sensor fusion. Furthermore, important engineering contributions to autonomous farming vehicles are presented, such as easily applicable, open-source software packages and algorithms that have been demonstrated in an end-to-end real-time detection system. The contributions of this thesis have demonstrated, addressed, and solved critical issues in camera-based perception systems that are essential to making autonomous vehicles in agriculture a reality.
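
    As a rough illustration of the fusion step described above, the following Python sketch fuses independent detector outputs into a log-odds occupancy grid. The grid size, cell resolution, and per-detector probabilities are illustrative assumptions, not values from the thesis, and the inverse sensor model is reduced to a single per-cell probability.

        import numpy as np

        class OccupancyGrid:
            """Log-odds occupancy grid fused from multiple detectors."""
            def __init__(self, size_m=100.0, res_m=0.5):
                n = int(size_m / res_m)
                self.res = res_m
                self.log_odds = np.zeros((n, n))  # 0 = unknown (p = 0.5)

            def update(self, cell, p_occupied):
                """Apply one inverse-sensor-model reading to a cell."""
                i, j = cell
                self.log_odds[i, j] += np.log(p_occupied / (1.0 - p_occupied))

            def probability(self, cell):
                """Recover the occupancy belief from the log-odds value."""
                i, j = cell
                return 1.0 - 1.0 / (1.0 + np.exp(self.log_odds[i, j]))

        grid = OccupancyGrid()
        # Fuse independent detections of the same cell from two sensors:
        grid.update((40, 42), p_occupied=0.8)   # e.g. thermal detector
        grid.update((40, 42), p_occupied=0.7)   # e.g. lidar detector
        print(grid.probability((40, 42)))       # belief rises above either alone

    The log-odds form makes fusion a simple addition per reading, which is what allows several detectors and many frames to be accumulated into one map in real time.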

    Real-time object detection using monocular vision for low-cost automotive sensing systems

    This work addresses the problem of real-time object detection in automotive environments using monocular vision. The focus is on real-time feature detection, tracking, depth estimation using monocular vision and, finally, object detection by fusing visual saliency and depth information. Firstly, a novel feature detection approach is proposed for extracting stable and dense features even in images with very low signal-to-noise ratio. This methodology is based on image gradients, which are redefined to take account of noise as part of their mathematical model. Each gradient is based on a vector connecting a negative to a positive intensity centroid, where both centroids are symmetric about the centre of the area for which the gradient is calculated. Multiple gradient vectors define a feature, with its strength proportional to the underlying gradient vector magnitude. The evaluation of the Dense Gradient Features (DeGraF) shows superior performance over other contemporary detectors in terms of keypoint density, tracking accuracy, illumination invariance, rotation invariance, noise resistance and detection time. The DeGraF features form the basis for two new approaches that perform dense 3D reconstruction from a single vehicle-mounted camera. The first approach tracks DeGraF features in real time while performing image stabilisation with minimal computational cost. This means that, despite camera vibration, the algorithm can accurately predict the real-world coordinates of each image pixel in real time by comparing each motion vector to the ego-motion vector of the vehicle. The performance of this approach has been compared to different 3D reconstruction methods in order to determine their accuracy, depth-map density, noise resistance and computational complexity. The second approach proposes the use of local frequency analysis of gradient features for estimating relative depth. This novel method is based on the fact that DeGraF gradients can accurately measure local image variance with sub-pixel accuracy. It is shown that the local frequency with which the centroid oscillates around the gradient window centre is proportional to the depth of each gradient centroid in the real world. The lower computational complexity of this methodology comes at the expense of depth-map accuracy as the camera velocity increases, but it is at least five times faster than the other evaluated approaches. This work also proposes a novel technique for deriving visual saliency maps using Division of Gaussians (DIVoG). In this context, saliency maps express how different each image pixel is from its surrounding pixels across multiple pyramid levels. This approach is shown to be both fast and accurate when evaluated against other state-of-the-art approaches. Subsequently, the saliency information is combined with depth information to identify salient regions close to the host vehicle. The fused map allows faster detection of high-risk areas where obstacles are likely to exist. As a result, existing object detection algorithms, such as the Histogram of Oriented Gradients (HOG), can execute at least five times faster. In conclusion, through a step-wise approach, computationally expensive algorithms have been optimised or replaced by novel methodologies to produce a fast object detection system that is aligned to the requirements of the automotive domain.
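
    The following Python sketch illustrates one plausible reading of the centroid-based gradient described above: the gradient of a window is taken as the vector from its darkness-weighted to its brightness-weighted intensity centroid. The weighting scheme and window handling here are assumptions for illustration; the actual DeGraF formulation may differ.

        import numpy as np

        def centroid_gradient(window):
            """Gradient of an image window as the vector from its 'negative'
            (darkness-weighted) to its 'positive' (brightness-weighted)
            intensity centroid, in the spirit of DeGraF."""
            h, w = window.shape
            ys, xs = np.mgrid[0:h, 0:w].astype(float)
            pos_w = window.astype(float)          # weight by brightness
            neg_w = window.max() - pos_w          # weight by darkness
            pos_c = np.array([(xs * pos_w).sum(), (ys * pos_w).sum()]) / max(pos_w.sum(), 1e-9)
            neg_c = np.array([(xs * neg_w).sum(), (ys * neg_w).sum()]) / max(neg_w.sum(), 1e-9)
            g = pos_c - neg_c                     # points towards increasing intensity
            return g, np.linalg.norm(g)           # vector and its strength

        # A window with a left-to-right intensity ramp yields a gradient
        # pointing right, with the two centroids symmetric about the centre:
        ramp = np.tile(np.arange(9, dtype=float), (9, 1))
        vec, strength = centroid_gradient(ramp)
        print(vec, strength)

    Because the centroids are weighted averages over the whole window, the resulting vector is naturally robust to per-pixel noise and has sub-pixel precision, which matches the noise-resistance claim made for DeGraF.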

    Path Planning for incline terrain using Embodied Artificial Intelligence

    Embodied Artificial Intelligence aims to cover the need for representing a search problem, as well as representing what constitutes a "good" solution to that problem, in a smart machine. In this thesis, the smart machine is a robot. By combining Artificial Intelligence and Robotics we can define experiments in which the search space is the physical world and the outcome of each action constitutes the evaluation of each candidate solution. In the context of my thesis, I had the opportunity to experiment with the development of artificial intelligence algorithms that guide an unmanned ground vehicle in discovering a solution to a difficult outdoor navigation problem: traversing a terrain region of steep incline. I attempted to tackle the problem with three different approaches: a Hill Climbing algorithm, an N-best search, and an Evolutionary Algorithm, each with its own strengths and weaknesses. Finally, I created and evaluated demonstrations, both in simulated scenarios and in a real-world scenario. The results of these demonstrations show clear progress by the robotic platform in addressing the aforementioned problem.
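
    To make the first of the three approaches concrete, here is a minimal, generic hill-climbing loop in Python. The score and neighbors functions are toy stand-ins introduced for illustration; in an embodied setting, the evaluation would come from executing or simulating an actual traversal of the incline.

        def hill_climb(initial, score, neighbors, iters=200):
            """Generic hill climbing: repeatedly move to the best-scoring
            neighbor of the current candidate, stopping at a local optimum."""
            current, best = initial, score(initial)
            for _ in range(iters):
                improved = False
                for cand in neighbors(current):
                    s = score(cand)
                    if s > best:
                        current, best, improved = cand, s, True
                if not improved:
                    break   # local optimum reached
            return current, best

        # Toy stand-in for a traversal evaluation: prefer a heading of
        # ~30 degrees across the slope.
        score = lambda heading: -abs(heading - 30.0)
        neighbors = lambda h: [h - 5.0, h + 5.0]
        print(hill_climb(0.0, score, neighbors))   # -> heading of 30.0

    The characteristic weakness visible here, getting stuck at local optima, is precisely what the N-best and evolutionary alternatives trade additional evaluations for.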

    High-level environment representations for mobile robots

    In most robotic applications we are faced with the problem of building a digital representation of the environment that allows the robot to autonomously complete its tasks. This internal representation can be used by the robot to plan a motion trajectory for its mobile base and/or end-effector. For most man-made environments we either have no digital representation or an inaccurate one. Thus, the robot must be able to build one autonomously, by integrating incoming sensor measurements into an internal data structure. For this purpose, a common solution consists in solving the Simultaneous Localization and Mapping (SLAM) problem. The map obtained by solving a SLAM problem is called "metric" and describes the geometric structure of the environment. A metric map is typically made up of low-level primitives (such as points or voxels). This means that even though it represents the shape of the objects in the robot's workspace, it lacks the information of which object a surface belongs to. Having an object-level representation of the environment has the advantage of augmenting the set of possible tasks that a robot may accomplish. To this end, in this thesis we focus on two aspects. We propose a formalism to represent, in a uniform manner, 3D scenes consisting of different geometric primitives, including points, lines and planes. Consequently, we derive a local registration and a global optimization algorithm that can exploit this representation for robust estimation. Furthermore, we present a Semantic Mapping system capable of building an object-based map that can be used for complex task planning and execution. Our system exploits effective reconstruction and recognition techniques that require no a priori information about the environment and can be used under general conditions.
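
    The following Python sketch suggests how points, lines, and planes can share one interface so that a single registration objective can mix them. This interface and its residual definitions are hypothetical, standing in for the formalism proposed in the thesis.

        import numpy as np

        class Point:
            def __init__(self, p): self.p = np.asarray(p, float)
            def residual(self, x): return np.linalg.norm(x - self.p)

        class Line:
            def __init__(self, origin, direction):
                self.o = np.asarray(origin, float)
                self.d = np.asarray(direction, float) / np.linalg.norm(direction)
            def residual(self, x):       # distance from x to the line
                v = x - self.o
                return np.linalg.norm(v - np.dot(v, self.d) * self.d)

        class Plane:
            def __init__(self, normal, offset):
                self.n = np.asarray(normal, float) / np.linalg.norm(normal)
                self.c = float(offset)
            def residual(self, x):       # distance from x to the plane
                return abs(np.dot(self.n, x) - self.c)

        # One mixed-primitive registration cost for a candidate point x:
        scene = [Point([1, 0, 0]), Line([0, 0, 0], [0, 0, 1]), Plane([0, 1, 0], 2.0)]
        x = np.array([1.0, 1.0, 1.0])
        print(sum(prim.residual(x) ** 2 for prim in scene))

    Because every primitive reduces to a residual, a local registration step or a global optimizer can minimize one sum of squared residuals regardless of which primitive types the scene contains.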

    Metric and appearance based visual SLAM for mobile robots

    Simultaneous Localization and Mapping (SLAM) underpins autonomy for mobile robots and has been studied extensively during the last two decades. It is the process of building the map of an unknown environment while concurrently determining the location of the robot using this map. Different kinds of sensors, such as the Global Positioning System (GPS), Inertial Measurement Units (IMU), laser range finders and sonar, are used for data acquisition in SLAM. In recent years, passive visual sensors have been utilized for the visual SLAM (vSLAM) problem because of their increasing ubiquity. This thesis is concerned with the metric and appearance-based vSLAM problems for mobile robots. From the point of view of metric-based vSLAM, a performance improvement technique is developed: template-matching-based video stabilization is integrated with the Harris corner detector. Extracting Harris corner features from stabilized video consistently increases the accuracy of the localization. Data coming from a video camera and odometry are fused in an Extended Kalman Filter (EKF) to determine the pose of the robot and build the map of the environment. Simulation results validate the performance improvement obtained by the proposed technique. Moreover, a visual perception system is proposed for appearance-based vSLAM and used for under-vehicle classification. The proposed system consists of three main parts: monitoring, detection and classification. In the first part, a new catadioptric camera system is designed, in which a perspective camera points downwards to a convex mirror mounted on the body of a mobile robot. Thanks to the catadioptric mirror, scenes against the camera's optical axis direction can be viewed. In the second part, speeded-up robust features (SURF) are used to detect hidden objects under vehicles. In the third part, the fast appearance-based mapping algorithm (FAB-MAP) is exploited for the classification of the means of transportation. Experimental results show the feasibility of the proposed system. The proposed solution is implemented using a non-holonomic mobile robot. In the implementation, the undersides of the tables in the laboratory stand in for vehicle underbodies, and a database that includes different under-vehicle images is used. All the algorithms are implemented in Microsoft Visual C++ and OpenCV 2.4.4.
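
    As a minimal illustration of the EKF fusion step, the sketch below fuses an odometry displacement with a visual position measurement for a planar robot. It is deliberately linear and two-dimensional, with made-up noise values; the thesis' filter additionally estimates heading and map features, linearizing nonlinear motion and measurement models at each step.

        import numpy as np

        x = np.zeros(2)                 # state estimate [x, y]
        P = np.eye(2) * 1.0             # state covariance
        Q = np.eye(2) * 0.1             # odometry (process) noise
        R = np.eye(2) * 0.5             # camera (measurement) noise
        H = np.eye(2)                   # camera observes position directly

        def predict(x, P, u):
            """Dead-reckoning step: apply odometry displacement u."""
            return x + u, P + Q

        def update(x, P, z):
            """Correct the prediction with a visual measurement z."""
            S = H @ P @ H.T + R                  # innovation covariance
            K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
            x = x + K @ (z - H @ x)
            P = (np.eye(2) - K @ H) @ P
            return x, P

        x, P = predict(x, P, u=np.array([1.0, 0.0]))
        x, P = update(x, P, z=np.array([1.2, -0.1]))
        print(x, np.diag(P))   # fused pose; uncertainty shrinks after the update

    Stabilizing the video before feature extraction, as proposed above, effectively lowers the measurement noise R, which is why it translates directly into more accurate localization.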

    Mapping, planning and exploration with Pose SLAM

    This thesis reports research on mapping, path planning, and autonomous exploration. These are classical problems in robotics, typically studied independently; here we link them by framing them within a common SLAM approach, adopting Pose SLAM as the basic state estimation machinery. The main contribution of this thesis is an approach that allows a mobile robot to plan a path using the map it builds with Pose SLAM and to select the appropriate actions to autonomously construct this map. Pose SLAM is the variant of SLAM in which only the robot trajectory is estimated and landmarks are used only to produce relative constraints between robot poses. In Pose SLAM, observations come in the form of relative-motion measurements between robot poses. With regard to extending the original Pose SLAM formulation, this thesis studies the computation of such measurements when they are obtained with stereo cameras and develops the appropriate noise propagation models for this case. Furthermore, the initial formulation of Pose SLAM assumes poses in SE(2); in this thesis we extend the formulation to SE(3), parameterizing rotations either with Euler angles or with quaternions. We also introduce a loop closure test that exploits the information from the filter using an independent measure of information content between poses. In the application domain, we present a technique to process the 3D volumetric maps obtained with this SLAM methodology, but with laser range scanning as the sensor modality, to derive traversability maps. Aside from these extensions to Pose SLAM, the core contribution of the thesis is an approach for path planning that exploits the modeled uncertainties in Pose SLAM to search for the path in the pose graph with the lowest accumulated robot pose uncertainty, i.e., the path that allows the robot to navigate to a given goal with the least probability of becoming lost. An added advantage of the proposed path planning approach is that, since Pose SLAM is agnostic with respect to the sensor modalities used, it can be used in different environments and with different robots, and since the original pose graph may come from a previous mapping session, the paths stored in the map already satisfy constraints that are not easily modeled in the robot controller, such as the existence of restricted regions or the right of way along paths. The proposed path planning methodology has been extensively tested both in simulation and with a real outdoor robot. Our path planning approach is adequate for scenarios where a robot is initially guided during map construction but autonomous during execution. For other scenarios in which more autonomy is required, the robot should be able to explore the environment without any supervision. The second core contribution of this thesis is an autonomous exploration method that complements the aforementioned path planning strategy. The method selects the appropriate actions to drive the robot so as to maximize coverage while minimizing localization and map uncertainties. An occupancy grid is maintained for the sole purpose of guaranteeing coverage. A significant advantage of the method is that, since the grid is computed only to hypothesize the entropy reduction of candidate map posteriors, it can be computed at a very coarse resolution, as it is used to maintain neither the robot localization estimate nor the structure of the environment. Our technique evaluates two types of actions: exploratory actions and place-revisiting actions. Action decisions are made based on entropy reduction estimates. By maintaining a Pose SLAM estimate at run time, the technique allows trajectories to be replanned online should a significant change in the Pose SLAM estimate be detected. The proposed exploration strategy was tested on a publicly available dataset, comparing favorably against frontier-based exploration.
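
    A minimal sketch of the path search described above: Dijkstra's algorithm over a pose graph whose edge weights stand in for the uncertainty accumulated when traversing between two poses. The graph and costs below are toy values; the thesis derives them from the Pose SLAM covariance estimates rather than fixed edge weights.

        import heapq

        def min_uncertainty_path(graph, start, goal):
            """Dijkstra over a pose graph whose edge weights encode the
            uncertainty accumulated between two poses; the returned path
            is the one along which the robot is least likely to get lost."""
            queue = [(0.0, start, [start])]
            seen = set()
            while queue:
                cost, node, path = heapq.heappop(queue)
                if node == goal:
                    return path, cost
                if node in seen:
                    continue
                seen.add(node)
                for nbr, edge_cost in graph.get(node, []):
                    if nbr not in seen:
                        heapq.heappush(queue, (cost + edge_cost, nbr, path + [nbr]))
            return None, float("inf")

        # Toy pose graph: node -> [(neighbor, uncertainty cost)]. The direct
        # edge 0-3 is shorter but crosses a poorly localized region.
        graph = {0: [(1, 0.2), (3, 1.5)], 1: [(2, 0.2)], 2: [(3, 0.2)], 3: []}
        print(min_uncertainty_path(graph, 0, 3))   # -> ([0, 1, 2, 3], 0.6)

    The example makes the key behavior visible: the planner prefers a longer route through well-localized poses over a direct but uncertain shortcut.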


    Influence of complex environments on LiDAR-Based robot navigation

    To ensure safe and efficient navigation, mobile robots rely heavily on their on-board sensors. One such sensor, increasingly used for robot navigation, is Light Detection And Ranging (LiDAR). Although recent research has shown improvements in LiDAR-based navigation, dealing with complex unstructured environments or difficult weather conditions remains problematic. In this thesis, we present an analysis of the influence of such challenging conditions on LiDAR-based navigation. Our first contribution is to evaluate how LiDARs are affected by snowflakes during snowstorms. To this end, we create a novel dataset by acquiring data during six snowfalls using four sensors simultaneously. Based on a statistical analysis of this dataset, we characterize the sensitivity of each device and show that sensor measurements can be modelled in a probabilistic manner. We also show that falling snow has little impact beyond a range of 10 m. Our second contribution is to evaluate the impact of the complex three-dimensional structures present in forests on the performance of a LiDAR-based place recognition algorithm. We acquired data in a structured outdoor environment and in a forest, which allowed us to evaluate the impact of the environment on place recognition performance. Our hypothesis was that the closer together two scans are acquired, the higher the belief that they originate from the same place, modulated by the level of complexity of the environment. Our experiments confirmed that forests, with their intricate networks of branches and foliage, produce more outliers and cause recognition performance to decrease more quickly with distance than in the structured outdoor environment. Our conclusion is that falling snow and forest environments negatively impact LiDAR-based navigation performance, which should be taken into account when developing robust navigation algorithms.
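
    The range-binned analysis below gives a flavor of the statistical characterization described above, using synthetic ranges in place of real scans; the exponential snow model, the mixture with a uniform scene, and the 5 m bin width are all illustrative assumptions rather than results from the thesis.

        import numpy as np

        # Synthetic scan ranges: snow-induced returns cluster near the
        # sensor, scene returns are spread over the full range.
        rng = np.random.default_rng(0)
        snow_hits = rng.exponential(scale=3.0, size=5000)
        scene_hits = rng.uniform(0.0, 40.0, size=5000)
        ranges = np.concatenate([snow_hits, scene_hits])

        # Bin the returns by range and report the fraction per bin.
        bins = np.arange(0, 41, 5)                    # 5 m range bins
        counts, _ = np.histogram(ranges, bins=bins)
        fractions = counts / counts.sum()
        for lo, frac in zip(bins[:-1], fractions):
            print(f"{lo:2d}-{lo + 5:2d} m: {frac:.2%} of returns")

        # Most snow-induced returns land in the first bins, consistent
        # with the finding that falling snow matters little beyond ~10 m.

    Fitting a parametric distribution per range bin is one simple way such measurements can be "modelled in a probabilistic manner", and it suggests a practical mitigation: treat near-range returns in snowfall with much lower confidence.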