40 research outputs found

    Lidar-based Obstacle Detection and Recognition for Autonomous Agricultural Vehicles

    Get PDF
    Today, agricultural vehicles are available that can drive autonomously and follow exact route plans more precisely than human operators. Combined with advancements in precision agriculture, autonomous agricultural robots can reduce manual labor, improve workflow, and optimize yield. However, as of today, human operators are still required for monitoring the environment and acting upon potential obstacles in front of the vehicle. To eliminate this need, safety must be ensured by accurate and reliable obstacle detection and avoidance systems.In this thesis, lidar-based obstacle detection and recognition in agricultural environments has been investigated. A rotating multi-beam lidar generating 3D point clouds was used for point-wise classification of agricultural scenes, while multi-modal fusion with cameras and radar was used to increase performance and robustness. Two research perception platforms were presented and used for data acquisition. The proposed methods were all evaluated on recorded datasets that represented a wide range of realistic agricultural environments and included both static and dynamic obstacles.For 3D point cloud classification, two methods were proposed for handling density variations during feature extraction. One method outperformed a frequently used generic 3D feature descriptor, whereas the other method showed promising preliminary results using deep learning on 2D range images. For multi-modal fusion, four methods were proposed for combining lidar with color camera, thermal camera, and radar. Gradual improvements in classification accuracy were seen, as spatial, temporal, and multi-modal relationships were introduced in the models. Finally, occupancy grid mapping was used to fuse and map detections globally, and runtime obstacle detection was applied on mapped detections along the vehicle path, thus simulating an actual traversal.The proposed methods serve as a first step towards full autonomy for agricultural vehicles. The study has thus shown that recent advancements in autonomous driving can be transferred to the agricultural domain, when accurate distinctions are made between obstacles and processable vegetation. Future research in the domain has further been facilitated with the release of the multi-modal obstacle dataset, FieldSAFE

    A Routine and Post-disaster Road Corridor Monitoring Framework for the Increased Resilience of Road Infrastructures

    Get PDF

    TractorEYE: Vision-based Real-time Detection for Autonomous Vehicles in Agriculture

    Get PDF
    Agricultural vehicles such as tractors and harvesters have for decades been able to navigate automatically and more efficiently using commercially available products such as auto-steering and tractor-guidance systems. However, a human operator is still required inside the vehicle to ensure the safety of vehicle and especially surroundings such as humans and animals. To get fully autonomous vehicles certified for farming, computer vision algorithms and sensor technologies must detect obstacles with equivalent or better than human-level performance. Furthermore, detections must run in real-time to allow vehicles to actuate and avoid collision.This thesis proposes a detection system (TractorEYE), a dataset (FieldSAFE), and procedures to fuse information from multiple sensor technologies to improve detection of obstacles and to generate a map. TractorEYE is a multi-sensor detection system for autonomous vehicles in agriculture. The multi-sensor system consists of three hardware synchronized and registered sensors (stereo camera, thermal camera and multi-beam lidar) mounted on/in a ruggedized and water-resistant casing. Algorithms have been developed to run a total of six detection algorithms (four for rgb camera, one for thermal camera and one for a Multi-beam lidar) and fuse detection information in a common format using either 3D positions or Inverse Sensor Models. A GPU powered computational platform is able to run detection algorithms online. For the rgb camera, a deep learning algorithm is proposed DeepAnomaly to perform real-time anomaly detection of distant, heavy occluded and unknown obstacles in agriculture. DeepAnomaly is -- compared to a state-of-the-art object detector Faster R-CNN -- for an agricultural use-case able to detect humans better and at longer ranges (45-90m) using a smaller memory footprint and 7.3-times faster processing. Low memory footprint and fast processing makes DeepAnomaly suitable for real-time applications running on an embedded GPU. FieldSAFE is a multi-modal dataset for detection of static and moving obstacles in agriculture. The dataset includes synchronized recordings from a rgb camera, stereo camera, thermal camera, 360-degree camera, lidar and radar. Precise localization and pose is provided using IMU and GPS. Ground truth of static and moving obstacles (humans, mannequin dolls, barrels, buildings, vehicles, and vegetation) are available as an annotated orthophoto and GPS coordinates for moving obstacles. Detection information from multiple detection algorithms and sensors are fused into a map using Inverse Sensor Models and occupancy grid maps. This thesis presented many scientific contribution and state-of-the-art within perception for autonomous tractors; this includes a dataset, sensor platform, detection algorithms and procedures to perform multi-sensor fusion. Furthermore, important engineering contributions to autonomous farming vehicles are presented such as easily applicable, open-source software packages and algorithms that have been demonstrated in an end-to-end real-time detection system. The contributions of this thesis have demonstrated, addressed and solved critical issues to utilize camera-based perception systems that are essential to make autonomous vehicles in agriculture a reality

    A review of technical factors to consider when designing neural networks for semantic segmentation of Earth Observation imagery

    Full text link
    Semantic segmentation (classification) of Earth Observation imagery is a crucial task in remote sensing. This paper presents a comprehensive review of technical factors to consider when designing neural networks for this purpose. The review focuses on Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Generative Adversarial Networks (GANs), and transformer models, discussing prominent design patterns for these ANN families and their implications for semantic segmentation. Common pre-processing techniques for ensuring optimal data preparation are also covered. These include methods for image normalization and chipping, as well as strategies for addressing data imbalance in training samples, and techniques for overcoming limited data, including augmentation techniques, transfer learning, and domain adaptation. By encompassing both the technical aspects of neural network design and the data-related considerations, this review provides researchers and practitioners with a comprehensive and up-to-date understanding of the factors involved in designing effective neural networks for semantic segmentation of Earth Observation imagery.Comment: 145 pages with 32 figure

    Um estudo comparativo das abordagens de detecção e reconhecimento de texto para cenários de computação restrita

    Get PDF
    Orientadores: Ricardo da Silva Torres, Allan da Silva PintoDissertação (mestrado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: Textos são elementos fundamentais para uma efetiva comunicação em nosso cotidiano. A mobilidade de pessoas e veículos em ambientes urbanos e a busca por um produto de interesse em uma prateleira de supermercado são exemplos de atividades em que o entendimento dos elementos textuais presentes no ambiente são essenciais para a execução da tarefa. Recentemente, diversos avanços na área de visão computacional têm sido reportados na literatura, com o desenvolvimento de algoritmos e métodos que objetivam reconhecer objetos e textos em cenas. Entretanto, a detecção e reconhecimento de textos são problemas considerados em aberto devido a diversos fatores que atuam como fontes de variabilidades durante a geração e captura de textos em cenas, o que podem impactar as taxas de detecção e reconhecimento de maneira significativa. Exemplo destes fatores incluem diferentes formas dos elementos textuais (e.g., circular ou em linha curva), estilos e tamanhos da fonte, textura, cor, variação de brilho e contraste, entre outros. Além disso, os recentes métodos considerados estado-da-arte, baseados em aprendizagem profunda, demandam altos custos de processamento computacional, o que dificulta a utilização de tais métodos em cenários de computação restritiva. Esta dissertação apresenta um estudo comparativo de técnicas de detecção e reconhecimento de texto, considerando tanto os métodos baseados em aprendizado profundo quanto os métodos que utilizam algoritmos clássicos de aprendizado de máquina. Esta dissertação também apresenta um método de fusão de caixas delimitadoras, baseado em programação genética (GP), desenvolvido para atuar tanto como uma etapa de pós-processamento, posterior a etapa de detecção, quanto para explorar a complementariedade dos algoritmos de detecção de texto investigados nesta dissertação. De acordo com o estudo comparativo apresentado neste trabalho, os métodos baseados em aprendizagem profunda são mais eficazes e menos eficientes, em comparação com os métodos clássicos da literatura e considerando as métricas adotadas. Além disso, o algoritmo de fusão proposto foi capaz de aprender informações complementares entre os métodos investigados nesta dissertação, o que resultou em uma melhora das taxas de precisão e revocação. Os experimentos foram conduzidos considerando os problemas de detecção de textos horizontais, verticais e de orientação arbitráriaAbstract: Texts are fundamental elements for effective communication in our daily lives. The mobility of people and vehicles in urban environments and the search for a product of interest on a supermarket shelf are examples of activities in which the understanding of the textual elements present in the environment is essential to succeed in such tasks. Recently, several advances in computer vision have been reported in the literature, with the development of algorithms and methods that aim to recognize objects and texts in scenes. However, text detection and recognition are still open problems due to several factors that act as sources of variability during scene text generation and capture, which can significantly impact detection and recognition rates of current algorithms. Examples of these factors include different shapes of textual elements (e.g., circular or curved), font styles and sizes, texture, color, brightness and contrast variation, among others. Besides, recent state-of-the-art methods based on deep learning demand high computational processing costs, which difficult their use in restricted computing scenarios. This dissertation presents a comparative study of text detection and recognition techniques, considering methods based on deep learning and methods that use classical machine learning algorithms. This dissertation also presents an algorithm for fusing bounding boxes, based on genetic programming (GP), developed to act as a post-processing step for a single text detector and to explore the complementarity of text detection algorithms investigated in this dissertation. According to the comparative study presented in this work, the methods based on deep learning are more effective and less efficient, in comparison to classic methods for text detection investigated in this work, considering the adopted metrics. Furthermore, the proposed GP-based fusion algorithm was able to learn complementary information from the methods investigated in this dissertation, which resulted in an improvement of precision and recall rates. The experiments were conducted considering text detection problems involving horizontal, vertical and arbitrary orientationsMestradoCiência da ComputaçãoMestre em Ciência da ComputaçãoCAPE

    Very High Resolution (VHR) Satellite Imagery: Processing and Applications

    Get PDF
    Recently, growing interest in the use of remote sensing imagery has appeared to provide synoptic maps of water quality parameters in coastal and inner water ecosystems;, monitoring of complex land ecosystems for biodiversity conservation; precision agriculture for the management of soils, crops, and pests; urban planning; disaster monitoring, etc. However, for these maps to achieve their full potential, it is important to engage in periodic monitoring and analysis of multi-temporal changes. In this context, very high resolution (VHR) satellite-based optical, infrared, and radar imaging instruments provide reliable information to implement spatially-based conservation actions. Moreover, they enable observations of parameters of our environment at greater broader spatial and finer temporal scales than those allowed through field observation alone. In this sense, recent very high resolution satellite technologies and image processing algorithms present the opportunity to develop quantitative techniques that have the potential to improve upon traditional techniques in terms of cost, mapping fidelity, and objectivity. Typical applications include multi-temporal classification, recognition and tracking of specific patterns, multisensor data fusion, analysis of land/marine ecosystem processes and environment monitoring, etc. This book aims to collect new developments, methodologies, and applications of very high resolution satellite data for remote sensing. The works selected provide to the research community the most recent advances on all aspects of VHR satellite remote sensing

    LIPIcs, Volume 277, GIScience 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 277, GIScience 2023, Complete Volum

    Deep Learning based Vehicle Detection in Aerial Imagery

    Get PDF
    This book proposes a novel deep learning based detection method, focusing on vehicle detection in aerial imagery recorded in top view. The base detection framework is extended by two novel components to improve the detection accuracy by enhancing the contextual and semantical content of the employed feature representation. To reduce the inference time, a lightweight CNN architecture is proposed as base architecture and a novel module that restricts the search area is introduced
    corecore