    Lidar-based Obstacle Detection and Recognition for Autonomous Agricultural Vehicles

    Today, agricultural vehicles are available that can drive autonomously and follow exact route plans more precisely than human operators. Combined with advancements in precision agriculture, autonomous agricultural robots can reduce manual labor, improve workflow, and optimize yield. However, as of today, human operators are still required for monitoring the environment and acting upon potential obstacles in front of the vehicle. To eliminate this need, safety must be ensured by accurate and reliable obstacle detection and avoidance systems. In this thesis, lidar-based obstacle detection and recognition in agricultural environments has been investigated. A rotating multi-beam lidar generating 3D point clouds was used for point-wise classification of agricultural scenes, while multi-modal fusion with cameras and radar was used to increase performance and robustness. Two research perception platforms were presented and used for data acquisition. The proposed methods were all evaluated on recorded datasets that represented a wide range of realistic agricultural environments and included both static and dynamic obstacles. For 3D point cloud classification, two methods were proposed for handling density variations during feature extraction. One method outperformed a frequently used generic 3D feature descriptor, whereas the other method showed promising preliminary results using deep learning on 2D range images. For multi-modal fusion, four methods were proposed for combining lidar with color camera, thermal camera, and radar. Gradual improvements in classification accuracy were seen as spatial, temporal, and multi-modal relationships were introduced in the models. Finally, occupancy grid mapping was used to fuse and map detections globally, and runtime obstacle detection was applied on mapped detections along the vehicle path, thus simulating an actual traversal. The proposed methods serve as a first step towards full autonomy for agricultural vehicles. The study has thus shown that recent advancements in autonomous driving can be transferred to the agricultural domain when accurate distinctions are made between obstacles and processable vegetation. Future research in the domain has further been facilitated with the release of the multi-modal obstacle dataset, FieldSAFE.
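
The abstract above mentions occupancy grid mapping for fusing detections globally and then checking mapped detections along the vehicle path. The sketch below illustrates one common way to do this, a per-cell log-odds Bayesian update. It is not the thesis's implementation: the function names, the dictionary-based grid, the prior of 0.5, and the 0.7 threshold are all assumptions chosen for illustration.

```python
import math


def logit(p):
    """Convert a probability in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))


def probability(log_odds):
    """Convert log-odds back to a probability."""
    return 1.0 - 1.0 / (1.0 + math.exp(log_odds))


def fuse_detections(grid, detections, prior=0.5):
    """Fuse detections into a sparse grid using the log-odds update.

    grid: dict mapping (row, col) -> log-odds of "obstacle" (hypothetical layout).
    detections: iterable of ((row, col), p_obstacle) pairs from any sensor.
    """
    for cell, p in detections:
        current = grid.get(cell, logit(prior))
        # Standard update: add the evidence, subtract the prior once.
        grid[cell] = current + logit(p) - logit(prior)
    return grid


def obstacles_on_path(grid, path_cells, threshold=0.7):
    """Return the path cells whose fused obstacle probability reaches the threshold."""
    unknown = logit(0.5)  # cells never observed are treated as 50/50
    return [c for c in path_cells if probability(grid.get(c, unknown)) >= threshold]


# Two agreeing detections on one cell, one weak detection on another.
grid = fuse_detections({}, [((2, 3), 0.8), ((2, 3), 0.8), ((5, 1), 0.3)])
blocked = obstacles_on_path(grid, [(2, 3), (5, 1), (0, 0)])
```

Working in log-odds turns repeated Bayesian updates into simple additions, so detections from lidar, camera, or radar can be accumulated in any order.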

    Generic object classification for autonomous robots

    One of the main problems in autonomous robot interaction is scene knowledge. Recognition is fundamental to solving this problem and to allowing robots to interact in uncontrolled environments. In this paper, we present a practical application for object fitting, normalization, and classification of triangular and circular signs. The system is deployed on Sony's Aibo robot to improve the robot's interaction behaviour. The presented methodology has been tested in simulations and on real categorization problems, such as traffic sign classification, with very promising results. Note: This document originally includes other material and/or software that can only be consulted at the Biblioteca de Ciència i Tecnologia.

    Detection and Tracking of Pedestrians Using Doppler LiDAR

    Pedestrian detection and tracking is necessary for autonomous vehicles and traffic management. This paper presents a novel solution to pedestrian detection and tracking for urban scenarios based on Doppler LiDAR that records both the position and velocity of the targets. The workflow consists of two stages. In the detection stage, the input point cloud is first segmented to form clusters, frame by frame. A subsequent multiple pedestrian separation process is introduced to further segment pedestrians close to each other. While a simple speed classifier is capable of extracting most of the moving pedestrians, a supervised machine learning-based classifier is adopted to detect pedestrians with insignificant radial velocity. In the tracking stage, the pedestrian’s state is estimated by a Kalman filter, which uses the speed information to estimate the pedestrian’s dynamics. Based on the similarity between the predicted and detected states of pedestrians, a greedy algorithm is adopted to associate the trajectories with the detection results. The presented detection and tracking methods are tested on two data sets collected in San Francisco, California by a mobile Doppler LiDAR system. The results of the pedestrian detection demonstrate that the proposed two-step classifier can improve the detection performance, particularly for detecting pedestrians far from the sensor. For both data sets, the use of Doppler speed information improves the F1-score and the recall by 15% to 20%. The subsequent tracking from the Kalman filter can achieve 83.9–55.3% for the multiple object tracking accuracy (MOTA), where the contribution of the speed measurements is secondary and insignificant.
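
The tracking stage described above associates predicted track states with detections using a greedy algorithm. The sketch below shows one plausible form of such a step. It is an assumption-laden illustration, not the paper's method: the state layout (x, y, radial speed), the use of Euclidean distance as the similarity measure, the `max_distance` gate, and the function name are all hypothetical.

```python
import math


def greedy_associate(predicted, detected, max_distance=1.5):
    """Greedily match predicted track states to detections, closest pair first.

    predicted: dict mapping track id -> state tuple, e.g. (x, y, radial_speed).
    detected: list of state tuples in the same layout.
    Returns (matches, unmatched_tracks, unmatched_detections), where matches
    maps track id -> index into `detected`.
    """
    # Collect every candidate pair that passes the distance gate.
    candidates = []
    for track_id, p_state in predicted.items():
        for j, d_state in enumerate(detected):
            dist = math.dist(p_state, d_state)
            if dist <= max_distance:
                candidates.append((dist, track_id, j))

    # Accept pairs in order of increasing distance, skipping anything already used.
    candidates.sort(key=lambda item: item[0])
    matches = {}
    used_detections = set()
    for _, track_id, j in candidates:
        if track_id in matches or j in used_detections:
            continue
        matches[track_id] = j
        used_detections.add(j)

    unmatched_tracks = [t for t in predicted if t not in matches]
    unmatched_detections = [j for j in range(len(detected)) if j not in used_detections]
    return matches, unmatched_tracks, unmatched_detections


predicted = {"a": (0.0, 0.0, 1.0), "b": (5.0, 5.0, 1.2)}
detected = [(5.1, 5.0, 1.2), (0.2, 0.1, 1.0), (20.0, 20.0, 0.0)]
matches, lost_tracks, new_detections = greedy_associate(predicted, detected)
```

Greedy matching is not guaranteed to find the globally optimal assignment, but it is simple and fast. Unmatched detections would typically seed new tracks, and unmatched tracks would be coasted or dropped.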

    Fusion of aerial images and sensor data from a ground vehicle for improved semantic mapping

    This work investigates the use of semantic information to link ground level occupancy maps and aerial images. A ground level semantic map, which shows open ground and indicates the probability of cells being occupied by walls of buildings, is obtained by a mobile robot equipped with an omnidirectional camera, GPS and a laser range finder. This semantic information is used for local and global segmentation of an aerial image. The result is a map where the semantic information has been extended beyond the range of the robot sensors and predicts where the mobile robot can find buildings and potentially driveable ground.

    Scene understanding for autonomous robots operating in indoor environments

    International Mention in the doctoral degree. The idea of having robots among us is not new. Great efforts are continually made to replicate human intelligence, with the vision of having robots performing different activities, including hazardous, repetitive, and tedious tasks. Research has demonstrated that robots are good at many tasks that are hard for us, mainly in terms of precision, efficiency, and speed. However, there are some tasks that humans do without much effort that are challenging for robots. Robots in domestic environments in particular are far from satisfactorily fulfilling some tasks, mainly because these environments are unstructured, cluttered, and subject to a variety of environmental conditions. This thesis addresses the problem of scene understanding in the context of autonomous robots operating in everyday human environments. Furthermore, this thesis is developed under the HEROITEA research project, which aims to develop a robot system that assists elderly people in domestic environments. Our main objective is to develop different methods that allow robots to acquire more information from the environment to progressively build knowledge that allows them to improve their performance on high-level robotic tasks. Scene understanding is a broad research topic, and it is considered a complex task due to the multiple sub-tasks that are involved. In this thesis, we focus on three sub-tasks: object detection, scene recognition, and semantic segmentation of the environment. Firstly, we implement methods to recognize objects in real indoor environments. We apply machine learning techniques incorporating uncertainties and more modern techniques based on deep learning. Apart from detecting objects, it is essential to comprehend the scene where they can occur. 
For this reason, we propose an approach for scene recognition that considers the influence of the detected objects in the prediction process. We demonstrate that the existing objects and their relationships can improve the inference about the scene class. We also consider that a scene recognition model can benefit from the advantages of other models. We propose a multi-classifier model for scene recognition based on weighted voting schemes. The experiments carried out in real-world indoor environments demonstrate that an adequate combination of independent classifiers yields a more robust and precise model for scene recognition. Moreover, to increase a robot's understanding of its surroundings, we propose a new division of the environment based on regions to build a useful representation of the environment. Object and scene information is integrated in a probabilistic fashion, generating a semantic map of the environment containing meaningful regions within each room. The proposed system has been assessed on simulated and real-world domestic scenarios, demonstrating its ability to generate consistent environment representations. Lastly, full knowledge of the environment can enhance more complex robotic tasks; that is why in this thesis we study how a complete knowledge of the environment influences the robot’s performance in high-level tasks. To do so, we select an essential task, which is searching for objects. This mundane task can be considered a precondition to perform many complex robotic tasks such as fetching and carrying, manipulation, and attending to user requirements, among others. The execution of these activities by service robots needs full knowledge of the environment to perform each task efficiently. In this thesis, we propose two searching strategies that consider prior information, semantic representation of the environment, and the relationships between known objects and the type of scene. 
All our developments are evaluated in simulated and real-world environments, integrated with other systems, and operating on real platforms, demonstrating their feasibility for implementation in real scenarios and, in some cases, outperforming other approaches. We also demonstrate how our representation of the environment can boost the performance of more complex robotic tasks compared to more standard environmental representations. Doctoral Programme in Electrical, Electronic and Automation Engineering, Universidad Carlos III de Madrid. Committee: President: Carlos Balaguer Bernaldo de Quirós; Secretary: Fernando Matía Espada; Member: Klaus Strob
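
The abstract above proposes a multi-classifier model for scene recognition based on weighted voting schemes. The sketch below shows the general idea of weighted soft voting over class probabilities. It is illustrative only: the thesis does not specify its exact scheme here, so the function name, the probability-dictionary inputs, the scene labels, and the example weights are all assumptions.

```python
def weighted_vote(predictions, weights):
    """Combine per-classifier class probabilities with a weighted average.

    predictions: list of dicts mapping scene label -> probability, one per classifier.
    weights: list of non-negative floats, one per classifier (hypothetical values,
             e.g. derived from each classifier's validation accuracy).
    Returns (best_label, fused_scores).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    # Accumulate each classifier's weighted probability for every label.
    scores = {}
    for probs, w in zip(predictions, weights):
        for label, p in probs.items():
            scores[label] = scores.get(label, 0.0) + w * p

    # Normalise so the fused scores are comparable across weight choices.
    fused = {label: s / total for label, s in scores.items()}
    best = max(fused, key=fused.get)
    return best, fused


predictions = [
    {"kitchen": 0.6, "office": 0.4},
    {"kitchen": 0.3, "office": 0.7},
    {"kitchen": 0.8, "office": 0.2},
]
label, fused = weighted_vote(predictions, weights=[0.5, 0.2, 0.3])
```

In this example the second classifier disagrees with the other two, but its lower weight keeps it from overturning the combined decision, which is the robustness benefit the abstract attributes to combining independent classifiers.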

    Artificial Vision Algorithms for Socially Assistive Robot Applications: A Review of the Literature

    Today, computer vision algorithms are very important for different fields and applications, such as closed-circuit television security, health status monitoring, recognition of a specific person or object, and robotics. This paper presents a review of the recent literature on computer vision algorithms (recognition and tracking of faces, bodies, and objects) oriented towards socially assistive robot applications. The performance, processing speed in frames per second (FPS), and hardware used to run the algorithms are highlighted by comparing the available solutions. Moreover, this paper provides general information for researchers interested in knowing which vision algorithms are available, enabling them to select the one that is most suitable to include in their robotic system applications. Conacyt doctoral scholarship, CVU No. 64683.