2,291 research outputs found

    A Review of Deep Learning Methods and Applications for Unmanned Aerial Vehicles

    Get PDF
    Deep learning is recently showing outstanding results for solving a wide variety of robotic tasks in the areas of perception, planning, localization, and control. Its excellent capabilities for learning representations from the complex data acquired in real environments make it extremely suitable for many kinds of autonomous robotic applications. In parallel, Unmanned Aerial Vehicles (UAVs) are currently being extensively applied for several types of civilian tasks in applications going from security, surveillance, and disaster rescue to parcel delivery or warehouse management. In this paper, a thorough review has been performed on recent reported uses and applications of deep learning for UAVs, including the most relevant developments as well as their performances and limitations. In addition, a detailed explanation of the main deep learning techniques is provided. We conclude with a description of the main challenges for the application of deep learning for UAV-based solutions

    Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age

    Get PDF
    Simultaneous Localization and Mapping (SLAM)consists in the concurrent construction of a model of the environment (the map), and the estimation of the state of the robot moving within it. The SLAM community has made astonishing progress over the last 30 years, enabling large-scale real-world applications, and witnessing a steady transition of this technology to industry. We survey the current state of SLAM. We start by presenting what is now the de-facto standard formulation for SLAM. We then review related work, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers. This paper simultaneously serves as a position paper and tutorial to those who are users of SLAM. By looking at the published research with a critical eye, we delineate open challenges and new research issues, that still deserve careful scientific investigation. The paper also contains the authors' take on two questions that often animate discussions during robotics conferences: Do robots need SLAM? and Is SLAM solved

    3D Robotic Sensing of People: Human Perception, Representation and Activity Recognition

    Get PDF
    The robots are coming. Their presence will eventually bridge the digital-physical divide and dramatically impact human life by taking over tasks where our current society has shortcomings (e.g., search and rescue, elderly care, and child education). Human-centered robotics (HCR) is a vision to address how robots can coexist with humans and help people live safer, simpler and more independent lives. As humans, we have a remarkable ability to perceive the world around us, perceive people, and interpret their behaviors. Endowing robots with these critical capabilities in highly dynamic human social environments is a significant but very challenging problem in practical human-centered robotics applications. This research focuses on robotic sensing of people, that is, how robots can perceive and represent humans and understand their behaviors, primarily through 3D robotic vision. In this dissertation, I begin with a broad perspective on human-centered robotics by discussing its real-world applications and significant challenges. Then, I will introduce a real-time perception system, based on the concept of Depth of Interest, to detect and track multiple individuals using a color-depth camera that is installed on moving robotic platforms. In addition, I will discuss human representation approaches, based on local spatio-temporal features, including new “CoDe4D” features that incorporate both color and depth information, a new “SOD” descriptor to efficiently quantize 3D visual features, and the novel AdHuC features, which are capable of representing the activities of multiple individuals. Several new algorithms to recognize human activities are also discussed, including the RG-PLSA model, which allows us to discover activity patterns without supervision, the MC-HCRF model, which can explicitly investigate certainty in latent temporal patterns, and the FuzzySR model, which is used to segment continuous data into events and probabilistically recognize human activities. Cognition models based on recognition results are also implemented for decision making that allow robotic systems to react to human activities. Finally, I will conclude with a discussion of future directions that will accelerate the upcoming technological revolution of human-centered robotics

    Vegetation Detection and Classification for Power Line Monitoring

    Get PDF
    Electrical network maintenance inspections must be regularly executed, to provide a continuous distribution of electricity. In forested countries, the electrical network is mostly located within the forest. For this reason, during these inspections, it is also necessary to assure that vegetation growing close to the power line does not potentially endanger it, provoking forest fires or power outages. Several remote sensing techniques have been studied in the last years to replace the labor-intensive and costly traditional approaches, be it field based or airborne surveillance. Besides the previously mentioned disadvantages, these approaches are also prone to error, since they are dependent of a human operator’s interpretation. In recent years, Unmanned Aerial Vehicle (UAV) platform applicability for this purpose has been under debate, due to its flexibility and potential for customisation, as well as the fact it can fly close to the power lines. The present study proposes a vegetation management and power line monitoring method, using a UAV platform. This method starts with the collection of point cloud data in a forest environment composed of power line structures and vegetation growing close to it. Following this process, multiple steps are taken, including: detection of objects in the working environment; classification of said objects into their respective class labels using a feature-based classifier, either vegetation or power line structures; optimisation of the classification results using point cloud filtering or segmentation algorithms. The method is tested using both synthetic and real data of forested areas containing power line structures. The Overall Accuracy of the classification process is about 87% and 97-99% for synthetic and real data, respectively. After the optimisation process, these values were refined to 92% for synthetic data and nearly 100% for real data. A detailed comparison and discussion of results is presented, providing the most important evaluation metrics and a visual representations of the attained results.Manutenções regulares da rede elétrica devem ser realizadas de forma a assegurar uma distribuição contínua de eletricidade. Em países com elevada densidade florestal, a rede elétrica encontra-se localizada maioritariamente no interior das florestas. Por isso, durante estas inspeções, é necessário assegurar também que a vegetação próxima da rede elétrica não a coloca em risco, provocando incêndios ou falhas elétricas. Diversas técnicas de deteção remota foram estudadas nos últimos anos para substituir as tradicionais abordagens dispendiosas com mão-de-obra intensiva, sejam elas através de vigilância terrestre ou aérea. Além das desvantagens mencionadas anteriormente, estas abordagens estão também sujeitas a erros, pois estão dependentes da interpretação de um operador humano. Recentemente, a aplicabilidade de plataformas com Unmanned Aerial Vehicles (UAV) tem sido debatida, devido à sua flexibilidade e potencial personalização, assim como o facto de conseguirem voar mais próximas das linhas elétricas. O presente estudo propõe um método para a gestão da vegetação e monitorização da rede elétrica, utilizando uma plataforma UAV. Este método começa pela recolha de dados point cloud num ambiente florestal composto por estruturas da rede elétrica e vegetação em crescimento próximo da mesma. Em seguida,múltiplos passos são seguidos, incluindo: deteção de objetos no ambiente; classificação destes objetos com as respetivas etiquetas de classe através de um classificador baseado em features, vegetação ou estruturas da rede elétrica; otimização dos resultados da classificação utilizando algoritmos de filtragem ou segmentação de point cloud. Este método é testado usando dados sintéticos e reais de áreas florestais com estruturas elétricas. A exatidão do processo de classificação é cerca de 87% e 97-99% para os dados sintéticos e reais, respetivamente. Após o processo de otimização, estes valores aumentam para 92% para os dados sintéticos e cerca de 100% para os dados reais. Uma comparação e discussão de resultados é apresentada, fornecendo as métricas de avaliação mais importantes e uma representação visual dos resultados obtidos

    Locomotion Optimization of Photoresponsive Small-scale Robot: A Deep Reinforcement Learning Approach

    Get PDF
    Soft robots comprise of elastic and flexible structures, and actuatable soft materials are often used to provide stimuli-responses, remotely controlled with different kinds of external stimuli, which is beneficial for designing small-scale devices. Among different stimuli-responsive materials, liquid crystal networks (LCNs) have gained a significant amount of attention for soft small-scale robots in the past decade being stimulated and actuated by light, which is clean energy, able to transduce energy remotely, easily available and accessible to sophisticated control. One of the persistent challenges in photoresponsive robotics is to produce controllable autonomous locomotion behavior. In this Thesis, different types of photoresponsive soft robots were used to realize light-powered locomotion, and an artificial intelligence-based approach was developed for controlling the movement. A robot tracking system, including an automatic laser steering function, was built for efficient robotic feature detection and steering the laser beam automatically to desired locations. Another robot prototype, a swimmer robot, driven by the automatically steered laser beam, showed directional movements including some degree of uncertainty and randomness in their locomotion behavior. A novel approach is developed to deal with the challenges related to the locomotion of photoresponsive swimmer robots. Machine learning, particularly deep reinforcement learning method, was applied to develop a control policy for autonomous locomotion behavior. This method can learn from its experiences by interacting with the robot and its environment without explicit knowledge of the robot structure, constituent material, and robotic mechanics. Due to the requirement of a large number of experiences to correlate the goodness of behavior control, a simulator was developed, which mimicked the uncertain and random movement behavior of the swimmer robots. This approach effectively adapted the random movement behaviors and developed an optimal control policy to reach different destination points autonomously within a simulated environment. This work has successfully taken a step towards the autonomous locomotion control of soft photoresponsive robots

    Learning to see across domains and modalities

    Get PDF
    Deep learning has recently raised hopes and expectations as a general solution for many applications (computer vision, natural language processing, speech recognition, etc.); indeed it has proven effective, but it also showed a strong dependence on large quantities of data. Generally speaking, deep learning models are especially susceptible to overfitting, due to their large number of internal parameters. Luckily, it has also been shown that, even when data is scarce, a successful model can be trained by reusing prior knowledge. Thus, developing techniques for \textit{transfer learning} (as this process is known), in its broadest definition, is a crucial element towards the deployment of effective and accurate intelligent systems into the real world. This thesis will focus on a family of transfer learning methods applied to the task of visual object recognition, specifically image classification. The visual recognition problem is central to computer vision research: many desired applications, from robotics to information retrieval, demand the ability to correctly identify categories, places, and objects. Transfer learning is a general term, and specific settings have been given specific names: when the learner has access to only unlabeled data from the target domain (where the model should perform) and labeled data from a different domain (the source), the problem is called unsupervised domain adaptation (DA). The first part of this thesis will focus on three methods for this setting. The three presented techniques for domain adaptation are fully distinct: the first one proposes the use of Domain Alignment layers to structurally align the distributions of the source and target domains in feature space. While the general idea of aligning feature distribution is not novel, we distinguish our method by being one of the very few that do so without adding losses. The second method is based on GANs: we propose a bidirectional architecture that jointly learns how to map the source images into the target visual style and vice-versa, thus alleviating the domain shift at the pixel level. The third method features an adversarial learning process that transforms both the images and the features of both domains in order to map them to a common, agnostic, space. While the first part of the thesis presented general purpose DA methods, the second part will focus on the real life issues of robotic perception, specifically RGB-D recognition. Robotic platforms are usually not limited to color perception; very often they also carry a Depth camera. Unfortunately, the depth modality is rarely used for visual recognition due to the lack of pretrained models from which to transfer and little data to train one on from scratch. We will first explore the use of synthetic data as proxy for real images by training a Convolutional Neural Network (CNN) on virtual depth maps, rendered from 3D CAD models, and then testing it on real robotic datasets. The second approach leverages the existence of RGB pretrained models, by learning how to map the depth data into the most discriminative RGB representation and then using existing models for recognition. This second technique is actually a pretty generic Transfer Learning method which can be applied to share knowledge across modalities
    corecore