
26 research outputs found

Text detection and recognition in natural images using computer vision techniques

Author: González Arroyo Álvaro
Publication date: 01/01/2013

Text recognition in real-world images has attracted the attention of many researchers worldwide in recent years. The reason is the growing number of low-cost products, such as mobile phones and Tablet PCs, that incorporate image-capture devices and high processing capabilities. Against this background, this thesis presents a robust method to detect, locate, and recognize horizontal text in daytime images taken in real-world scenarios. The challenge is complex given the enormous variability of existing text and of capture conditions in real environments. First, a review of the main works of recent years in the field of text recognition in natural images is presented. Next, a study is carried out of the features best suited to distinguishing text from non-text objects. Typically, a system for text recognition in images consists of two main stages. The first consists of detecting whether text is present in the image and locating it as precisely as possible, minimizing both the amount of undetected text and the number of false positives. The second stage consists of recognizing the extracted text. The detection method proposed here is based on connected component analysis after applying a segmentation that combines a global method such as MSER with a local method, which improves on state-of-the-art proposals by segmenting text even in complex situations such as blurred or very low-resolution images. The analysis of the extracted connected components is optimized using genetic algorithms. Unlike other systems, we propose a recursive method that restores text objects that were initially discarded by mistake. In this way, the reliability of the detection is greatly improved.
Although the proposed method is based on connected component analysis, this thesis also uses the idea behind texture-based methods to validate the detected text areas. In addition, our text recognition method is based on identifying each character and then applying a language model to correct misrecognized words by restricting the solution to a dictionary containing the set of possible terms. A new feature for character recognition is proposed, which we have named Direction Histogram (DH). It is based on computing the histogram of gradient directions at edge pixels. This feature is compared with other state-of-the-art features, and the experimental results obtained on a challenging dataset show that our proposal is suitable, since it outperforms other state-of-the-art works. We also present a KNN-based fuzzy classification method for letters, which makes it possible to separate characters that were erroneously connected during the segmentation stage. The proposed text recognition method is able to recognize not only words but also numbers and punctuation marks. Word recognition is carried out by means of a language model based on probabilistic inference and the British National Corpus, a comprehensive dictionary of modern British English, although the algorithm can easily be adapted for use with any other dictionary. The language model uses a modification of the forward algorithm used in Hidden Markov Models. To test the performance of the proposed system, experimental results have been obtained on several datasets, which include images in different scenarios and situations. These datasets have been used as benchmarks over the last decade by most researchers in the area of text recognition in natural images.
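As a rough illustration of the Direction Histogram (DH) idea described in this abstract (a histogram of gradient directions at edge pixels), the following minimal Python sketch may help. It is not the thesis's implementation: the function name `direction_histogram`, the central-difference gradient, the choice of 8 bins, and the rule that treats a pixel as an edge pixel when its gradient magnitude is at least `edge_fraction` of the maximum are all assumptions made for the example.

```python
import math


def direction_histogram(image, n_bins=8, edge_fraction=0.5):
    """Normalized histogram of gradient directions at (assumed) edge pixels.

    `image` is a list of rows of grey levels. Gradients are estimated with
    central differences on interior pixels only.
    """
    rows, cols = len(image), len(image[0])
    grads = []  # (magnitude, angle) for every interior pixel
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = (image[r][c + 1] - image[r][c - 1]) / 2.0
            gy = (image[r + 1][c] - image[r - 1][c]) / 2.0
            grads.append((math.hypot(gx, gy), math.atan2(gy, gx)))

    hist = [0.0] * n_bins
    max_mag = max((mag for mag, _ in grads), default=0.0)
    if max_mag == 0.0:
        return hist  # flat image: no edges, empty histogram

    for mag, angle in grads:
        # Assumption: "edge pixel" means a strong gradient relative to the maximum.
        if mag >= edge_fraction * max_mag:
            # Map the angle from [-pi, pi] onto a bin index in [0, n_bins).
            b = int((angle + math.pi) / (2 * math.pi) * n_bins) % n_bins
            hist[b] += 1.0

    total = sum(hist)
    return [h / total for h in hist]
```

For an image containing a single vertical step edge, all edge pixels share one gradient direction, so the whole histogram mass falls into one bin. A character classifier such as KNN could then compare these vectors, but that step is outside this sketch.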
The results show that the proposed system achieves performance similar to the state of the art in terms of localization, while surpassing it in terms of recognition. To demonstrate the applicability of the method proposed in this thesis, a system for detecting and recognizing the information contained in traffic panels, based on the developed algorithm, is also presented. The aim of this application is the automatic creation of inventories of traffic panels for countries or regions, to facilitate the maintenance of vertical road signage, using images available from Google's Street View service. A dataset has been created for this application. We propose modeling traffic panels using visual appearance instead of the classical solutions that rely on edges or geometric characteristics, in order to detect those images in which traffic panels are present. The experimental results show the feasibility of the proposed system.

e_Buah - Biblioteca Digital de la Universidad de Alcalá

Biblioteca Digital de la Universidad de Alcalá

An Appearance-Based Tracking Algorithm for Aerial Search and Rescue Purposes

Author: Al-Kaff Abdulla Hussein
Armingol Moreno José María
Escalera Hueso Arturo de la
Gómez Silva María José
Moreno Olivo Francisco Miguel
Publication venue: 'MDPI AG'
Publication date: 01/01/2019

The automation of the Wilderness Search and Rescue (WiSAR) task aims for high levels of understanding of various scenery. In addition, working in unfriendly and complex environments may cause a time delay in the operation and consequently put human lives at stake. In order to address this problem, Unmanned Aerial Vehicles (UAVs), which provide potential support to the conventional methods, are used. These vehicles are provided with reliable human detection and tracking algorithms, in order to be able to find and track the bodies of the victims in complex environments, and a robust control system to maintain safe distances from the detected bodies. In this paper, a human detection method based on the color and depth data captured from onboard sensors is proposed. Moreover, the proposal of computing data association from the skeleton pose and a visual appearance measurement allows the tracking of multiple people with invariance to the scale, translation and rotation of the point of view with respect to the target objects. The system has been validated with real and simulation experiments, and the obtained results show the ability to track multiple individuals even after long-term disappearances. Furthermore, the simulations demonstrate the robustness of the implemented reactive control system as a promising tool for assisting the pilot to perform approaching maneuvers in a safe and smooth manner. This research is supported by Madrid Community project SEGVAUTO 4.0 (P2018/EMT-4362) and by the Spanish Government CICYT projects (TRA2015-63708-R and TRA2016-78886-C3-1-R), and Ministerio de Educación, Cultura y Deporte para la Formación de Profesorado Universitario (FPU14/02143). Also, we gratefully acknowledge the support of the NVIDIA Corporation with the donation of the GPUs used for this research.

Directory of Open Access Journals

Universidad Carlos III de Madrid e-Archivo

Development of Self-Learning Type-2 Fuzzy Systems for System Identification and Control of Autonomous Systems

Author: Al-Mahturi Ayad
Publication venue: UNSW, Sydney
Publication date: 01/01/2021

Modelling and control of dynamic systems face multiple technical challenges, mainly due to the uncertain, complex, nonlinear, and time-varying nature of such systems. Traditional modelling techniques require a complete understanding of system dynamics, and obtaining comprehensive mathematical models is not always achievable due to limited knowledge of the systems as well as the presence of multiple uncertainties in the environment. As universal approximators, fuzzy logic systems (FLSs), neural networks (NNs) and neuro-fuzzy systems have proved to be successful computational tools for representing the behaviour of complex dynamical systems. Moreover, FLSs, NNs and learning-based techniques have been gaining popularity for controlling complex, ill-defined, nonlinear, and time-varying systems in the face of uncertainties. However, fuzzy rules derived by experts can be too ad hoc, and the performance is less than optimum. In other words, generating fuzzy rules and membership functions in fuzzy systems is a potential challenge, especially for systems with many variables. Moreover, under the umbrella of FLSs, although type-1 fuzzy logic control systems (T1-FLCs) have been applied to control various complex nonlinear systems, they have limited capability to handle uncertainties. Aiming to accommodate uncertainties, type-2 fuzzy logic control systems (T2-FLCs) were established. This thesis aims to address the shortcomings of existing fuzzy techniques by utilising type-2 FLCs with novel adaptive capabilities. The first contribution of this thesis is a novel online system identification technique by means of a recursive interval type-2 Takagi-Sugeno fuzzy C-means clustering technique (IT2-TS-FC) to accommodate the footprint-of-uncertainties (FoUs). This development is meant to specifically address the shortcomings of type-1 fuzzy systems in capturing the footprint-of-uncertainties such as mechanical wear, rotor damage, battery drain and sensor and actuator faults.
Unlike previous type-2 TS fuzzy models, the proposed method constructs two fuzzifiers (upper and lower) and two regression coefficients in the consequent part to handle uncertainties. The weighted least squares method is employed to compute the regression coefficients. The proposed method is validated using two benchmarks, namely, real flight test data of a quadcopter drone and Mackey-Glass time series data. The algorithm has the capability to model uncertainties (e.g., noisy datasets). The second contribution of this thesis is the development of a novel self-adaptive interval type-2 fuzzy controller named the SAF2C for controlling multi-input multi-output (MIMO) nonlinear systems. The adaptation law is derived using sliding mode control (SMC) theory to reduce the computation time so that the learning process can be expedited by 80% compared to separate single-input single-output (SISO) controllers. The system employs the 'Enhanced Iterative Algorithm with Stop Condition' (EIASC) type-reduction method, which is more computationally efficient than the 'Karnik-Mendel' type-reduction algorithm. The stability of the SAF2C is proven using the Lyapunov technique. To ensure the applicability of the proposed control scheme, SAF2C is implemented to control several dynamical systems, including a simulated MIMO hexacopter unmanned aerial vehicle (UAV) in the face of external disturbance and parameter variations. The ability of SAF2C to filter the measurement noise is demonstrated, where significant improvement is obtained using the proposed controller in the face of measurement noise. Also, the proposed closed-loop control system is applied to control other benchmark dynamic systems (e.g., a simulated autonomous underwater vehicle and inverted pendulum on a cart system), demonstrating high accuracy and robustness to variations in system parameters and external disturbance.
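The abstract above states that regression coefficients are computed with the weighted least squares method. As a generic illustration only, the sketch below fits a single line y = intercept + slope * x with per-sample weights, using the closed-form normal equations. In a Takagi-Sugeno setting the weights would plausibly come from membership degrees, but that link, the function name `weighted_least_squares`, and the one-input form are assumptions for the example and not the thesis's IT2-TS-FC procedure.

```python
def weighted_least_squares(xs, ys, weights):
    """Fit y = intercept + slope * x, minimizing sum(w * (y - fit)^2).

    Returns (intercept, slope). Samples with weight 0 have no influence.
    """
    sw = sum(weights)
    swx = sum(w * x for w, x in zip(weights, xs))
    swy = sum(w * y for w, y in zip(weights, ys))
    swxx = sum(w * x * x for w, x in zip(weights, xs))
    swxy = sum(w * x * y for w, x, y in zip(weights, xs, ys))

    denom = sw * swxx - swx * swx
    if denom == 0:
        raise ValueError("need at least two distinct x values with non-zero weight")

    slope = (sw * swxy - swx * swy) / denom
    intercept = (swy - slope * swx) / sw
    return intercept, slope
```

Giving an outlier a weight of zero removes it from the fit entirely, which is the same mechanism by which a sample with low membership in a fuzzy rule would contribute little to that rule's local model.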
Another contribution of this thesis is a novel stand-alone enhanced self-adaptive interval type-2 fuzzy controller named the ESAF2C algorithm, whose type-2 fuzzy parameters are tuned online using the SMC theory. This way, we expect to design a computationally efficient adaptive type-2 fuzzy system, suitable for real-time applications, by introducing the EIASC type-reducer. The proposed technique is applied to a quadcopter UAV (QUAV), where extensive simulations and real-time flight tests for a hovering QUAV under wind disturbances are also conducted to validate the efficacy of the ESAF2C. Specifically, the control performance is investigated in the face of external wind gust disturbances, generated using an industrial fan. Stability analysis of the ESAF2C control system is investigated using the Lyapunov theory. Yet another contribution of this thesis is the development of a type-2 evolving fuzzy control system (T2-EFCS) to facilitate self-learning (either from scratch or from a certain predefined rule). T2-EFCS has two phases, namely, structure learning and parameter learning. The structure of T2-EFCS does not require previous information about the fuzzy structure, and it can start the construction of its rules from scratch with only one rule. The rules are then added and pruned in an online fashion to achieve the desired set-point. The proposed technique is applied to control an unmanned ground vehicle (UGV) in the presence of multiple external disturbances, demonstrating the robustness of the proposed control systems. The proposed approach turns out to be computationally efficient as the system employs fewer fuzzy parameters while maintaining superior control performance.

Detection of bodies in maritime rescue operations using Unmanned Aerial Vehicles with multispectral cameras

Author: Fisher Robert
Gallego Antonio-Javier
Gil Pablo
Pertusa Antonio
Publication venue: 'Wiley'
Publication date: 05/12/2018

In this study, we use unmanned aerial vehicles equipped with multispectral cameras to search for bodies in maritime rescue operations. A series of flights were performed in open‐water scenarios in the northwest of Spain, using a certified aquatic rescue dummy in dangerous areas and real people when the weather conditions allowed it. The multispectral images were aligned and used to train a convolutional neural network for body detection. An exhaustive evaluation was performed to assess the best combination of spectral channels for this task. Three approaches based on a MobileNet topology were evaluated, using (a) the full image, (b) a sliding window, and (c) a precise localization method. The first method classifies an input image as containing a body or not, the second uses a sliding window to yield a class for each subimage, and the third uses transposed convolutions returning a binary output in which the body pixels are marked. In all cases, the MobileNet architecture was modified by adding custom layers and preprocessing the input to align the multispectral camera channels. Evaluation shows that the proposed methods yield reliable results, obtaining the best classification performance when combining green, red‐edge, and near‐infrared channels. We conclude that the precise localization approach is the most suitable method, obtaining accuracy similar to that of the sliding window but achieving a spatial localization close to 1 m. The presented system is about to be implemented for real maritime rescue operations carried out by Babcock Mission Critical Services Spain. This study was performed in collaboration with BabcockMCS Spain and funded by the Galicia Region Government through the Civil UAVs Initiative program, the Spanish Government’s Ministry of Economy, Industry, and Competitiveness through the RTC‐2014‐1863‐8 and INAER4‐14Y (IDI‐20141234) projects, and the grant number 730897 under the HPC‐EUROPA3 project supported by Horizon 2020.
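The sliding-window approach (b) in the abstract above, in which each subimage receives its own class, can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the window size, the stride, and the `has_bright_pixel` stand-in classifier are hypothetical, and the study itself uses a modified MobileNet on aligned multispectral channels, which is not reproduced here.

```python
def sliding_windows(width, height, window, stride):
    """Yield (left, top, right, bottom) boxes that tile the image."""
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            yield (left, top, left + window, top + window)


def detect(image, classify, window=2, stride=2):
    """Return the boxes whose subimage the classifier flags as positive.

    `image` is a list of rows; `classify` is any callable taking a subimage
    (list of rows) and returning True or False.
    """
    height, width = len(image), len(image[0])
    hits = []
    for left, top, right, bottom in sliding_windows(width, height, window, stride):
        patch = [row[left:right] for row in image[top:bottom]]
        if classify(patch):
            hits.append((left, top, right, bottom))
    return hits


def has_bright_pixel(patch):
    """Hypothetical stand-in for a trained CNN classifier."""
    return any(value > 0.5 for row in patch for value in row)
```

The localization precision of this scheme is bounded by the window size and stride, which is consistent with the abstract's observation that a pixel-wise output localizes bodies more precisely at similar accuracy.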