275 research outputs found
Survey of computer vision algorithms and applications for unmanned aerial vehicles
This paper presents a comprehensive review of the computer vision algorithms and vision-based intelligent applications developed for Unmanned Aerial Vehicles (UAVs) over the last decade. During this time, advances in relevant technologies, such as component miniaturization, increased computational capability, and improved computer vision techniques, have enabled important progress in UAV technologies and applications. In particular, computer vision integrated into UAVs makes it possible to address aerial perception problems through techniques such as visual navigation, obstacle detection and avoidance, and aerial decision-making. These technologies have opened a wide spectrum of UAV applications beyond the classic military and defense purposes. UAVs and computer vision are common topics in expert systems, and recent advances in perception technologies have led to intelligent applications that enhance autonomous UAV positioning or automatically avoid aerial collisions, among others. The survey therefore focuses on artificial perception applications that represent important recent advances in the expert-systems field related to UAVs. It presents the most significant advances addressing fundamental technical limitations, such as visual odometry, obstacle detection, mapping, and localization. These advances are analyzed in terms of their capabilities and potential utility, and the applications and UAVs are categorized according to different criteria. This research is supported by the Spanish Government through the CICYT projects (TRA2015-63708-R and TRA2013-48314-C3-1-R)
Vision-Based navigation system for unmanned aerial vehicles
International Mention in the doctoral degree.
The main objective of this dissertation is to provide Unmanned Aerial Vehicles
(UAVs) with a robust navigation system that allows them to perform
complex tasks autonomously and in real time. The proposed algorithms address
the navigation problem in both outdoor and indoor environments, mainly
using visual information captured by monocular cameras. In addition,
this dissertation presents the advantages of using visual sensors as the main
source of data, or as a complement to other sensors, in order to improve
the accuracy and robustness of sensing.
The dissertation covers four research topics based on computer vision
techniques. (I) Pose Estimation, which provides a solution for estimating the 6D pose of
the UAV. The algorithm combines the SIFT detector with the FREAK
descriptor, which maintains feature-point matching performance while reducing
computation time. The pose estimation problem is then solved
by decomposing the world-to-frame and frame-to-frame homographies.
(II) Obstacle Detection and Collision Avoidance, in which the UAV senses
and detects frontal obstacles situated in its path. The detection
algorithm mimics human behavior in detecting approaching obstacles by
analyzing the size changes of the detected feature points, combined with the expansion
ratios of the convex hull constructed around the detected feature points
in consecutive frames. Then, by comparing the area ratio of the obstacle and the
position of the UAV, the method decides whether the detected obstacle may cause a collision.
Finally, the algorithm extracts the collision-free zones around the obstacle
and, combining them with the tracked waypoints, the UAV performs the avoidance maneuver.
(III) Navigation Guidance, which generates the waypoints that determine
the flight path based on the environment and the obstacles in it, and then provides
a strategy to follow the path segments efficiently and perform the
flight maneuver smoothly. (IV) Visual Servoing, which offers different control solutions
(Fuzzy Logic Control (FLC) and PID) based on the obtained visual information, in
order to achieve flight stability and perform the correct maneuver to
avoid possible collisions and track the waypoints.
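The convex-hull expansion cue described under topic (II) can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the use of Andrew's monotone chain for the hull, and the `threshold` value of 1.2 are all assumptions made for illustration. It only shows how an area ratio between hulls of feature points in two consecutive frames could flag an approaching obstacle.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull of 2D points (tuples) via Andrew's monotone chain,
    returned in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon using the shoelace formula."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def hull_expansion_ratio(prev_points, curr_points):
    """Ratio of hull areas between the current and the previous frame."""
    prev_area = polygon_area(convex_hull(prev_points))
    curr_area = polygon_area(convex_hull(curr_points))
    if prev_area == 0.0:
        raise ValueError("previous hull is degenerate (zero area)")
    return curr_area / prev_area


def is_approaching(prev_points, curr_points, threshold=1.2):
    """Hypothetical decision rule: the hull grew by more than `threshold`."""
    return hull_expansion_ratio(prev_points, curr_points) > threshold


# Matched feature points in two consecutive frames (pixel coordinates).
prev_frame = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
curr_frame = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2)]
ratio = hull_expansion_ratio(prev_frame, curr_frame)
```

In practice the point sets would come from matched feature points, and the ratio would be combined with the other cues the abstract mentions (feature size changes and UAV position) before a collision decision is made.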
All the proposed algorithms have been verified with real flights in both indoor
and outdoor environments, taking into consideration visual conditions such as
illumination and texture. The obtained results have been validated against other
systems, such as the VICON motion capture system and DGPS in the case of the pose estimation
algorithm. In addition, the proposed algorithms have been compared with several
previous works in the state of the art, and the results show an improvement in
the accuracy and robustness of the proposed algorithms.
Finally, this dissertation concludes that visual sensors have the advantages
of light weight and low power consumption and provide reliable information, which makes them
a powerful tool in navigation systems for increasing the autonomy
of UAVs in real-world applications.
Official Doctoral Programme in Electrical, Electronic and Automatic Engineering. President: Carlo Regazzoni. Secretary: Fernando García Fernández. Member: Pascual Campoy Cerver
Vision-based localization methods under GPS-denied conditions
This paper reviews vision-based localization methods in GPS-denied
environments and classifies the mainstream methods into Relative Vision
Localization (RVL) and Absolute Vision Localization (AVL). For RVL, we discuss
the broad application of optical flow in feature extraction-based Visual
Odometry (VO) solutions and introduce advanced optical flow estimation methods.
For AVL, we review recent advances in Visual Simultaneous Localization and
Mapping (VSLAM) techniques, from optimization-based methods to Extended Kalman
Filter (EKF) based methods. We also introduce the application of offline map
registration and lane vision detection schemes to achieve Absolute Visual
Localization. This paper compares the performance and applications of
mainstream methods for visual localization and provides suggestions for future
studies. Comment: 32 pages, 15 figure
Fast, Accurate Thin-Structure Obstacle Detection for Autonomous Mobile Robots
Safety is paramount for mobile robotic platforms such as self-driving cars
and unmanned aerial vehicles. This work is devoted to a task that is
indispensable for safety yet was largely overlooked in the past -- detecting
obstacles that are of very thin structures, such as wires, cables and tree
branches. This is a challenging problem, as thin objects can be problematic for
active sensors such as lidar and sonar and even for stereo cameras. In this
work, we propose to use video sequences for thin obstacle detection. We
represent obstacles with edges in the video frames, and reconstruct them in 3D
using efficient edge-based visual odometry techniques. We provide both a
monocular camera solution and a stereo camera solution. The former incorporates
Inertial Measurement Unit (IMU) data to solve scale ambiguity, while the latter
enjoys a novel, purely vision-based solution. Experiments demonstrated that the
proposed methods are fast and able to detect thin obstacles robustly and
accurately under various conditions. Comment: Appeared at IEEE CVPR 2017 Workshop on Embedded Visio
Effective image enhancement and fast object detection for improved UAV applications
As an emerging field, unmanned aerial vehicles (UAVs) draw on interdisciplinary techniques from science, engineering and industry. Their applications span remote sensing, precision agriculture, marine inspection, coast guarding, environmental monitoring, natural resources monitoring (e.g. forest, land and river), disaster assessment, smart cities, intelligent transportation, and logistics and delivery.
With fast-growing demand from a wide range of application sectors, improving the efficiency and efficacy of UAVs in operation remains a bottleneck. Smart decision making is often needed from the captured footage in real time, yet it is severely affected by poor image quality, ineffective object detection and recognition models, and a lack of robust, lightweight models that support edge computing and real deployment.
This thesis develops several pieces of work to tackle some of these issues. First, to meet the quality requirements of UAV images, various approaches and models have been proposed, yet they focus on different aspects and produce inconsistent results. The work in this thesis is therefore organised around denoising and dehazing, followed by a comprehensive qualitative and quantitative evaluation. This provides insights and guidance for end users and the research community.
For fast and effective object detection and recognition, deep learning based models, especially the YOLO series, are widely used. However, taking YOLOv7 as the baseline, performance is strongly affected by factors such as the low quality of UAV images and high resource demands, leading to unsatisfactory accuracy and processing speed. Three major improvements, namely a transformer, CIoULoss and the GhostBottleneck module, are therefore introduced in this work to improve feature extraction, decision making in detection and recognition, and running efficiency. Comprehensive experiments on both publicly available and self-collected datasets have validated the efficiency and efficacy of the proposed algorithm.
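The abstract names CIoULoss without giving a formula. As a hedged illustration, the sketch below follows the commonly published Complete-IoU definition, loss = 1 - IoU + ρ²/c² + αv, where ρ is the distance between box centres, c is the diagonal of the smallest enclosing box, and v measures aspect-ratio mismatch. The function name, the `(x1, y1, x2, y2)` box format and the `eps` stabiliser are assumptions. Whether the thesis uses exactly this variant is not stated in the source.

```python
import math


def ciou_loss(box_pred, box_gt, eps=1e-9):
    """Complete-IoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Sketch of the commonly published CIoU definition; `eps` is an
    assumed stabiliser that avoids division by zero.
    """
    px1, py1, px2, py2 = box_pred
    gx1, gy1, gx2, gy2 = box_gt

    # Intersection over union.
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    area_p = (px2 - px1) * (py2 - py1)
    area_g = (gx2 - gx1) * (gy2 - gy1)
    union = area_p + area_g - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centres.
    rho2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    enc_w = max(px2, gx2) - min(px1, gx1)
    enc_h = max(py2, gy2) - min(py1, gy1)
    c2 = enc_w ** 2 + enc_h ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    angle_gt = math.atan((gx2 - gx1) / (gy2 - gy1 + eps))
    angle_pred = math.atan((px2 - px1) / (py2 - py1 + eps))
    v = (4.0 / math.pi ** 2) * (angle_gt - angle_pred) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return 1.0 - iou + rho2 / c2 + alpha * v


same = ciou_loss((0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 2.0, 2.0))
apart = ciou_loss((0.0, 0.0, 1.0, 1.0), (2.0, 2.0, 3.0, 3.0))
```

Unlike a plain IoU loss, the centre-distance term still gives a useful training signal when the boxes do not overlap at all, as in the second call.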
In addition, to facilitate real deployment in scenarios such as edge computing, an embedded implementation of the key algorithm modules is introduced. This includes an implementation on the Xavier NX platform, compared against standard workstation settings with NVIDIA GPUs. The results are promising, with reduced CPU/GPU resource consumption and a higher frame rate, which benefits real-time deployment without compromising edge computing.
Through this investigation and development, a better understanding has been established of the key challenges associated with UAV and Simultaneous Localisation and Mapping (SLAM) based applications, and possible solutions are presented.
Keywords: Unmanned aerial vehicles (UAV); Simultaneous Localisation and Mapping (SLAM); denoising; dehazing; object detection; object recognition; deep learning; YOLOv7; transformer; GhostBottleneck; scene matching; embedded implementation; Xavier NX; edge computing
Detection and estimation of moving obstacles for a UAV
In recent years, research interest in Unmanned Aerial Vehicles (UAVs) has grown rapidly because of their potential use in a wide range of applications. In this paper, we propose a vision-based method for detecting a moving obstacle and estimating its position and velocity for a UAV. Knowledge of a moving obstacle's state, i.e., its position and velocity, is essential for an intelligent UAV system to achieve better performance, especially in autonomous navigation and landing tasks. The novelties are: (1) the design and implementation of a localization method that uses sensor fusion to combine Inertial Measurement Unit (IMU) signals and Pozyx signals; (2) the development of a method for detecting moving obstacles and estimating their state based on an on-board vision system. Experimental results validate the effectiveness of the proposed approach. (C) 2019, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved
Visual Guidance for Unmanned Aerial Vehicles with Deep Learning
Unmanned Aerial Vehicles (UAVs) have been widely applied in the military and civilian domains. In recent years, the operation mode of UAVs has been evolving from teleoperation to autonomous flight. To achieve autonomous flight, a reliable guidance system is essential. Since the combination of the Global Positioning System (GPS) and an Inertial Navigation System (INS) cannot sustain autonomous flight in situations where GPS is degraded or unavailable, computer vision has been widely explored as a primary method for UAV guidance. Moreover, GPS does not provide the robot with any information on the presence of obstacles.
Stereo cameras have a complex architecture and need a minimum baseline to generate a disparity map. By contrast, monocular cameras are simple and require fewer hardware resources. Thanks to state-of-the-art Deep Learning (DL) techniques, especially Convolutional Neural Networks (CNNs), a monocular camera is sufficient to extract mid-level visual representations of the environment, such as depth maps and optical flow (OF) maps. Therefore, the objective of this thesis is to develop a real-time visual guidance method for UAVs in cluttered environments using a monocular camera and DL.
The three major tasks performed in this thesis are investigating the development of DL techniques and monocular depth estimation (MDE), developing real-time CNNs for MDE, and developing visual guidance methods on the basis of the developed MDE system. A comprehensive survey is conducted, which covers Structure from Motion (SfM)-based methods, traditional handcrafted feature-based methods, and state-of-the-art DL-based methods. More importantly, it also investigates the application of MDE in robotics. Based on the survey, two CNNs for MDE are developed. In addition to promising accuracy, these two CNNs run at high frame rates (126 fps and 90 fps respectively) on a single modest-power Graphics Processing Unit (GPU).
As regards the third task, the visual guidance for UAVs is first developed on top of the designed MDE networks. To improve the robustness of UAV guidance, OF maps are integrated into the developed visual guidance method. A cross-attention module is applied to fuse the features learned from the depth maps and OF maps. The fused features are then passed through a deep reinforcement learning (DRL) network to generate the policy for guiding the flight of the UAV. Additionally, a simulation framework is developed which integrates AirSim, Unreal Engine and PyTorch. The effectiveness of the developed visual guidance method is validated through extensive experiments in the simulation framework
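To make the cross-attention fusion step more concrete, here is a minimal sketch of scaled dot-product cross-attention in plain Python. It is not the thesis's module. It assumes that depth features act as queries and optical-flow features act as keys and values, and it omits the learned projection matrices a real module would include. All names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention.

    queries: feature vectors from one modality (assumed: depth).
    keys, values: feature vectors from the other modality (assumed:
    optical flow). Learned projections are omitted in this sketch.
    """
    dim = len(keys[0])
    fused = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
                  for k in keys]
        weights = softmax(scores)
        # Attention-weighted sum of the value vectors.
        fused.append([sum(w * v[j] for w, v in zip(weights, values))
                      for j in range(len(values[0]))])
    return fused


depth_feats = [[1.0, 0.0]]              # one query token
flow_keys = [[0.0, 0.0], [0.0, 0.0]]    # two keys with equal similarity
flow_vals = [[2.0, 4.0], [4.0, 8.0]]
fused = cross_attention(depth_feats, flow_keys, flow_vals)
```

Because both keys score equally here, the fused vector is the plain average of the two value vectors. In a trained module the attention weights would favour the flow features most relevant to each depth feature before the result is passed to the policy network.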