Vision-Based navigation system for unmanned aerial vehicles
International Mention in the doctoral degree.
The main objective of this dissertation is to provide Unmanned Aerial Vehicles
(UAVs) with a robust navigation system, in order to allow the UAVs to perform
complex tasks autonomously and in real-time. The proposed algorithms deal with
solving the navigation problem for outdoor as well as indoor environments, mainly
based on visual information captured by monocular cameras. In addition,
this dissertation presents the advantages of using visual sensors, either as the
main source of data or as a complement to other sensors, in order to improve
the accuracy and robustness of sensing.
The dissertation mainly covers several research topics based on computer vision
techniques: (I) Pose Estimation, to provide a solution for estimating the 6D pose of
the UAV. This algorithm is based on the combination of the SIFT detector and the
FREAK descriptor, which maintains the performance of feature-point matching while
decreasing the computational time. Thereafter, the pose estimation problem is solved
based on the decomposition of the world-to-frame and frame-to-frame homographies.
(II) Obstacle Detection and Collision Avoidance, in which the UAV is able to
sense and detect the frontal obstacles situated in its path. The detection
algorithm mimics human behavior for detecting approaching obstacles: it
analyzes the size changes of the detected feature points, combined with the
expansion ratios of the convex hulls constructed around the detected feature
points in consecutive frames. Then, by comparing the area ratio of the obstacle
with the position of the UAV, the method decides whether the detected obstacle
may cause a collision. Finally, the algorithm extracts the collision-free zones
around the obstacle and, combining them with the tracked waypoints, the UAV
performs the avoidance maneuver.
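The convex-hull expansion-ratio test can be sketched in plain Python. The function names and the 1.2 threshold are illustrative assumptions; the thesis's own detector, tracker, and decision thresholds are not reproduced here.

```python
def cross(o, a, b):
    """2D cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def hull_area(points):
    """Area of the convex hull of 2D points (monotone chain + shoelace)."""
    pts = sorted(set(map(tuple, points)))
    if len(pts) < 3:
        return 0.0
    def build(seq):
        out = []
        for p in seq:
            while len(out) >= 2 and cross(out[-2], out[-1], p) <= 0:
                out.pop()
            out.append(p)
        return out[:-1]
    hull = build(pts) + build(pts[::-1])
    n = len(hull)
    return 0.5 * abs(sum(hull[i][0] * hull[(i + 1) % n][1]
                         - hull[(i + 1) % n][0] * hull[i][1]
                         for i in range(n)))

def approaching(prev_pts, cur_pts, ratio_thresh=1.2):
    """Flag a frontal obstacle when the hull around the tracked feature
    points expands beyond ratio_thresh between consecutive frames
    (threshold is an assumed illustrative value)."""
    a_prev, a_cur = hull_area(prev_pts), hull_area(cur_pts)
    return a_prev > 0 and (a_cur / a_prev) > ratio_thresh
```

An obstacle drifting sideways keeps a roughly constant hull area, while one on a collision course grows it frame over frame, which is the cue the test exploits.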
(III) Navigation Guidance, which generates the waypoints that determine the
flight path based on the environment and the detected obstacles, and then
provides a strategy to follow the path segments efficiently and perform the
flight maneuvers smoothly. (IV) Visual Servoing, to offer different control
solutions (Fuzzy Logic Control (FLC) and PID) based on the obtained visual
information, in order to achieve flight stability as well as to perform the
correct maneuvers to avoid possible collisions and track the waypoints.
All the proposed algorithms have been verified with real flights in both indoor
and outdoor environments, taking into consideration visual conditions such as
illumination and texture. The obtained results have been validated against other
systems, such as the VICON motion capture system and DGPS in the case of the
pose estimation algorithm. In addition, the proposed algorithms have been compared
with several previous works in the state of the art, and the results prove the
improvement in the accuracy and the robustness of the proposed algorithms.
Finally, this dissertation concludes that visual sensors have the advantages of
light weight and low power consumption while providing reliable information,
which makes them a powerful tool in navigation systems to increase the autonomy
of the UAVs for real-world applications.
Programa Oficial de Doctorado en Ingeniería Eléctrica, Electrónica y Automática. Thesis committee: President: Carlo Regazzoni; Secretary: Fernando García Fernández; Member: Pascual Campoy Cerver
Event-Based Visual-Inertial Odometry Using Smart Features
Event-based cameras are a novel type of visual sensor that operate under a unique paradigm, providing asynchronous data on the log-level changes in light intensity for individual pixels. This hardware-level approach to change detection allows these cameras to achieve ultra-wide dynamic range and high temporal resolution. Furthermore, the advent of convolutional neural networks (CNNs) has led to state-of-the-art navigation solutions that now rival or even surpass human-engineered algorithms. The advantages offered by event cameras and CNNs make them excellent tools for visual odometry (VO). This document presents the implementation of a CNN trained to detect and describe features within an image, as well as the implementation of an event-based visual-inertial odometry (EVIO) pipeline, which estimates a vehicle's 6-degree-of-freedom (DOF) pose using an affixed event-based camera with an integrated inertial measurement unit (IMU). The front-end of this pipeline utilizes a neural network for generating image frames from asynchronous event camera data. These frames are fed into a multi-state constraint Kalman filter (MSCKF) back-end that uses the output of the developed CNN to perform measurement updates. The EVIO pipeline was tested on a selection from the Event-Camera Dataset [1], and on a dataset collected from a fixed-wing unmanned aerial vehicle (UAV) flight test conducted by the Autonomy and Navigation Technology (ANT) Center.
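As a rough illustration of the front-end's input, asynchronous events can be naively accumulated into a signed count image. This is a stand-in sketch only, not the learned CNN frame generator described above:

```python
import numpy as np

def events_to_frame(events, height, width):
    """Accumulate asynchronous events (x, y, polarity in {-1, +1}) into a
    signed per-pixel count image. A naive stand-in for the learned frame
    generator used by the EVIO front-end."""
    frame = np.zeros((height, width), dtype=np.int32)
    for x, y, p in events:
        frame[y, x] += p  # positive events brighten, negative darken
    return frame
```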
Topological place recognition for life-long visual localization
Extraordinary Doctorate Award of the UAH in the 2016-2017 academic year.
The navigation of intelligent vehicles or mobile robots over long periods of time has attracted great interest from the research community in recent years. Camera-based systems have become widespread in the recent past thanks to improvements in their features, price and size, together with progress in computer vision techniques. For this reason, vision-based localization is a key aspect for developing robust autonomous navigation in long-term situations. With this in mind, the identification of locations by means of topological place recognition techniques can be complementary to other approaches, such as Global Positioning System (GPS) based solutions, or even a substitute when the GPS signal is not available.
The state of the art in topological place recognition has shown satisfactory performance in the short term. However, long-term visual localization is problematic due to the large appearance changes that a place undergoes as a consequence of dynamic elements, illumination or weather, among others. The goal of this thesis is to face the difficulties of carrying out topological localization that is efficient and robust over time. Accordingly, two new approaches based on visual place recognition are contributed to solve the different problems associated with long-term visual localization. On the one hand, a visual place recognition method based on binary descriptors is proposed. The innovation of this approach lies in the global description of image sequences as binary codes, which are extracted by means of a descriptor based on the technique known as Local Difference Binary (LDB).
The descriptors are efficiently matched using the Hamming distance and a search method known as Approximate Nearest Neighbors (ANN). In addition, an illumination-invariant technique is applied to improve performance under changing lighting conditions. The use of the previously introduced binary description provides a reduction of computational and memory costs. On the other hand, a visual place recognition method based on deep learning is also presented, in which the applied descriptors are processed by a Convolutional Neural Network (CNN). This is a recently popularized concept in computer vision that has obtained impressive results in image classification problems. The novelty of our approach lies in the fusion of image information from multiple convolutional layers at several levels and granularities. Moreover, the redundant data of the CNN-based descriptors are compressed into a reduced number of bits for more efficient localization. The final descriptor is condensed by applying compression and binarization techniques so that matching can again use the Hamming distance. In general terms, the CNN-centered methods improve precision by generating more detailed visual representations of the locations, but they are more costly in terms of computation. Both visual place recognition approaches are extensively evaluated on several public datasets. These tests yield satisfactory precision in long-term situations, as corroborated by the presented results, which compare our methods against the main state-of-the-art algorithms and show better results in all cases. In addition, the applicability of our topological place recognition has been analyzed in different localization problems.
These applications include loop closure detection based on the recognized places, and the correction of the accumulated drift in visual odometry using the information provided by the loop closures. Likewise, the detection of geometric changes across the seasons of the year is also considered, which is essential for map updates in autonomous driving systems focused on long-term operation. All these contributions are discussed at the end of the thesis, including several conclusions about the presented work and future research lines.
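The Hamming-distance matching of packed binary descriptors described in this abstract can be sketched with NumPy. This brute-force version stands in for the ANN search the thesis actually uses, and the function name is an assumption:

```python
import numpy as np

# Popcount lookup table for byte values 0..255.
_POPCOUNT = np.unpackbits(np.arange(256, dtype=np.uint8)[:, None], axis=1).sum(1)

def hamming_match(query, database):
    """Brute-force nearest neighbour over packed binary descriptors
    (uint8 rows). The thesis uses Approximate Nearest Neighbors search;
    this exhaustive scan is an illustrative stand-in."""
    dists = _POPCOUNT[np.bitwise_xor(database, query)].sum(axis=1)
    best = int(np.argmin(dists))
    return best, int(dists[best])
```

Packed-byte XOR plus popcount is why binary descriptors such as LDB cut both memory and matching cost relative to floating-point descriptors.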
Visual control of multi-rotor UAVs
Recent miniaturization of computer hardware, MEMS sensors, and high-energy-density
batteries have enabled highly capable mobile robots to become available at low cost.
This has driven the rapid expansion of interest in multi-rotor unmanned aerial vehicles.
Another area which has expanded simultaneously is small powerful computers, in the
form of smartphones, which nearly always have a camera attached, and many of which
now contain an OpenCL-compatible graphics processing unit. By combining the results
of these two developments, a low-cost multi-rotor UAV can be produced with a low-power
onboard computer capable of real-time computer vision. The system should also use
general purpose computer vision software to facilitate a variety of experiments.
To demonstrate this I have built a quadrotor UAV based on control hardware from
the Pixhawk project, and paired it with an ARM-based single board computer, similar
to those in high-end smartphones. The quadrotor weighs 980 g and has a flight time of
10 minutes. The onboard computer is capable of running a pose estimation algorithm
above the 10 Hz requirement for stable visual control of a quadrotor.
A feature tracking algorithm was developed for efficient pose estimation, which relaxed
the requirement for outlier rejection during matching. Compared with a RANSAC-only
algorithm the pose estimates were less variable, with a Z-axis standard deviation of
0.2 cm compared with 2.4 cm for RANSAC. Processing time per frame was also faster
with tracking, with 95 % confidence that tracking would process the frame within 50 ms,
while for RANSAC the 95 % confidence time was 73 ms. The onboard computer ran the
algorithm with a total system load of less than 25 %. All computer vision software uses
the OpenCV library for common computer vision algorithms, fulfilling the requirement
for running general purpose software.
The tracking algorithm was used to demonstrate the capability of the system by
performing visual servoing of the quadrotor (after manual takeoff). However, the
response to external perturbations was poor, requiring manual intervention to avoid
crashing. This was due to poor visual controller tuning, and to variations in image
acquisition and attitude estimate timing caused by using free-running image acquisition.
The system, and the tracking algorithm, serve as proof of concept that visual control of
a quadrotor is possible using small low-power computers and general purpose computer
vision software.
UAV vision system: Application in electric line following and 3D reconstruction of associated terrain
Abstract. In this work, a set of vision techniques applied to UAV (Unmanned Aerial Vehicle) images is presented. The techniques are used to detect electrical lines and towers, which are used in vision-based navigation and for 3D reconstruction of the associated terrain. The developed work aims to be a preliminary stage for autonomous electrical infrastructure inspection. This work is divided into four stages: power line detection, transmission tower detection, UAV navigation and 3D reconstruction of the associated terrain. In the first stage, a study of algorithms for line detection was performed. After that, an algorithm for line detection called CBS (Circle Based Search), which presented good results with azimuthal images, was developed. This method offers a shorter response time than the Hough transform and the LSD (Line Segment Detector) algorithm, and a response similar to EDLines, which is one of the fastest and most reliable algorithms for line detection. Given that most of the works related to line detection focus on straight lines, an algorithm for catenary detection based on a concatenation process was developed. This algorithm was validated using real power line inspection images with catenaries. Additionally, in this work a tower detection method based on a feature descriptor, capable of detecting towers in times close to 100 ms, was developed. Navigation over power lines using UAVs requires many tests because of the risk of failures and accidents. For this reason, a virtual environment for real-time UAV simulation of visual navigation was developed using ROS (Robot Operating System), which is open source. An onboard visual navigation system for the UAV was also developed. This system allows the UAV to navigate following a power line in real scenarios by using the developed techniques.
In the last part, a 3D tower reconstruction that uses images obtained with UAVs is presented.
Keywords: line detection, inspection, navigation, tower detection, onboard vision system, UAV.
Vision-based Deformation Measurement for Pile-soil Testing
Image measurement technology has been widely used in monitoring the deformation of the soil field around a pile, with the advantages of being non-destructive, non-contact and full-field, adding no mass, and offering high sensitivity. However, there is little research on image-based bearing-deformation measurement of the pile itself. Through an indoor pile-soil semi-model test, the rigid-body displacement and load-bearing deformation of a new type of prefabricated steel tube pile foundation under horizontal load were measured based on image features. In this study, the concept of the optical extensometer is first applied to the measurement of the local average strain of a non-uniformly deformed structure. Based on an improved feature-point tracking algorithm, SURF-BRISK, non-contact measurement of tiny strains of the pile body is realized. In addition, based on DIC (digital image correlation) technology, this study also obtained the progressive development of the displacement field of the soil around the pile. The above work fully reflects the non-contact convenience and full-field richness of the optical measurement method compared with traditional measurement methods.
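The DIC displacement-field step can be illustrated as normalized cross-correlation subset matching. This pure-NumPy sketch tracks a single subset at integer-pixel resolution; real DIC adds subpixel refinement and subset shape functions, and all parameter names here are assumptions:

```python
import numpy as np

def ncc_displacement(ref, cur, top, left, size, search=10):
    """Track one DIC subset: find the integer-pixel shift of the subset
    ref[top:top+size, left:left+size] inside cur by maximising zero-mean
    normalized cross-correlation over a +/- search window."""
    tpl = ref[top:top + size, left:left + size].astype(float)
    tpl = tpl - tpl.mean()
    best, best_dv = -2.0, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > cur.shape[0] or x + size > cur.shape[1]:
                continue  # candidate window falls outside the image
            win = cur[y:y + size, x:x + size].astype(float)
            win = win - win.mean()
            denom = np.sqrt((tpl ** 2).sum() * (win ** 2).sum())
            if denom == 0:
                continue  # flat window, correlation undefined
            score = (tpl * win).sum() / denom
            if score > best:
                best, best_dv = score, (dy, dx)
    return best_dv  # (dy, dx) in pixels
```

Repeating this over a grid of subsets yields the full displacement field of the soil around the pile.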
Stereo visual simultaneous localisation and mapping for an outdoor wheeled robot: a front-end study
For many mobile robotic systems, navigating an environment is a crucial step towards autonomy, and Visual Simultaneous Localisation and Mapping (vSLAM) has seen increasingly effective usage in this capacity. However, vSLAM is strongly dependent on the context in which it is applied, often using heuristics and special cases to provide efficiency and robustness. It is thus crucial to identify the important parameters and factors of a particular context, as these heavily influence the algorithms, processes, and hardware required for the best results. In this body of work, a generic front-end stereo vSLAM pipeline is tested in the context of a small-scale outdoor wheeled robot that occupies less than 1 m³ of volume. The scale of the vehicle constrained the available processing power, Field Of View (FOV), actuation systems, and image distortions present. A dataset was collected with a custom platform that consisted of a Point Grey Bumblebee (discontinued) stereo camera and an Nvidia Jetson TK1 processor. A stereo front-end feature tracking framework was described and evaluated both in simulation and experimentally where appropriate. It was found that scale adversely affected lighting conditions, FOV, baseline, and available processing power, all crucial factors to improve upon. The stereo constraint was effective for robustness, but ineffective in terms of processing power and metric reconstruction. An overall absolute odometry error of 0.25-3 m was produced on the dataset, but the system was unable to run in real-time.
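The metric-reconstruction difficulty mentioned above follows directly from the pinhole stereo relation Z = f·B/d: a short baseline B makes depth Z very sensitive to disparity error. A tiny sketch with illustrative numbers (not taken from the dataset):

```python
def stereo_depth(focal_px, baseline_m, disparity_px):
    """Pinhole stereo depth: Z = f * B / d. With a small baseline, a one
    pixel disparity error produces a large depth error at range."""
    return focal_px * baseline_m / disparity_px

# Illustrative numbers only: f = 500 px, B = 0.12 m.
# At d = 10 px the depth is 6.0 m; a 1 px error (d = 11 px) drops it to ~5.45 m.
```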
The Revisiting Problem in Simultaneous Localization and Mapping: A Survey on Visual Loop Closure Detection
Where am I? This is one of the most critical questions that any intelligent
system should answer to decide whether it navigates to a previously visited
area. This problem has long been acknowledged for its challenging nature in
simultaneous localization and mapping (SLAM), wherein the robot needs to
correctly associate the incoming sensory data to the database allowing
consistent map generation. The significant advances in computer vision achieved
over the last 20 years, the increased computational power, and the growing
demand for long-term exploration contributed to efficiently performing such a
complex task with inexpensive perception sensors. In this article, visual loop
closure detection, which formulates a solution based solely on appearance input
data, is surveyed. We start by briefly introducing place recognition and SLAM
concepts in robotics. Then, we describe a loop closure detection system's
structure, covering an extensive collection of topics, including the feature
extraction, the environment representation, the decision-making step, and the
evaluation process. We conclude by discussing open and new research challenges,
particularly concerning the robustness in dynamic environments, the
computational complexity, and scalability in long-term operations. The article
aims to serve as a tutorial and a position paper for newcomers to visual loop
closure detection.
Comment: 25 pages, 15 figures
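A common minimal form of the appearance-only scoring this survey covers is a tf-idf weighted bag-of-visual-words comparison. The sketch below is a generic illustration, not any specific system from the survey:

```python
import numpy as np

def tfidf_similarity(db_hists, query_hist):
    """Score a query bag-of-visual-words histogram against a database of
    prior frames using tf-idf weighting and cosine similarity; the frame
    with the highest score is the loop-closure candidate."""
    db = np.asarray(db_hists, dtype=float)
    q = np.asarray(query_hist, dtype=float)
    n = len(db)
    df = (db > 0).sum(axis=0)                # document frequency per visual word
    idf = np.log((n + 1) / (df + 1)) + 1.0   # smoothed inverse document frequency
    dbw = db * idf
    qw = q * idf
    dbw /= np.linalg.norm(dbw, axis=1, keepdims=True) + 1e-12
    qw /= np.linalg.norm(qw) + 1e-12
    return dbw.dot(qw)                       # cosine similarity to every frame
```

In a real pipeline the top-scoring frame would still pass through geometric verification before being accepted as a loop closure, which is the decision-making step the survey describes.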