
    Contributions to improve the technologies supporting unmanned aircraft operations

    International Mention in the doctoral degree. Unmanned Aerial Vehicles (UAVs), in their smaller versions known as drones, are becoming increasingly important in today's societies. The systems that compose them present a multitude of challenges, of which error can be considered the common denominator. The environment is perceived through sensors that have errors, and the models that interpret this information and/or define behaviors are approximations of the world and therefore also contain errors. Explaining error makes it possible to extend the limits of deterministic models to address real-world problems. The performance of the technologies embedded in drones depends on our ability to understand, model, and control the error of the systems that integrate them, as well as of any new technologies that may emerge. Flight controllers integrate various subsystems that generally depend on other systems. One example is the guidance system, which provides the motor controller with the information necessary to accomplish a desired mission. For this purpose, the flight controller comprises a guidance control law that reacts to the information perceived by the perception and navigation systems. The error of any of these subsystems propagates through the controller's ecosystem, so studying each of them is essential. Among the strategies for error control are state-space estimators, where the Kalman filter has been a great ally of engineers since its appearance in the 1960s. Kalman filters are at the heart of information fusion systems: they minimize the error covariance of the system and make it possible to filter the measured states and to estimate them in the absence of observations. State Space Models (SSMs) are developed from a set of hypotheses for modeling the world: that the models be linear and Markovian, and that their error be Gaussian. In general, systems are not linear, so linearizations are performed on models that are already approximations of the world. In other cases, the noise to be controlled is not Gaussian, but it is approximated by that distribution in order to deal with it. Moreover, many systems are not Markovian, i.e., their states do not depend only on the previous state; there are other dependencies that state-space models cannot handle. This thesis presents a collection of studies in which error is formulated and reduced. First, the error in a computer-vision-based precision landing system is studied; then, estimation and filtering problems are addressed with deep learning; finally, trajectory classification with deep learning is studied. The first case of the collection examines the consequences of error propagation in a machine-vision-based precision landing system and proposes a set of strategies to reduce the impact on the guidance system and, ultimately, the error. The next two studies approach the estimation and filtering problem with deep learning, where error is a function to be minimized by learning. The last case of the collection deals with a trajectory classification problem on real data. This work thus covers the two main fields in deep learning, regression and classification, where error is posed as a probability function of class membership.
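    As a concrete illustration of the state-space estimation this thesis builds on, the following is a minimal sketch of the predict/update cycle of a linear Kalman filter for a 1-D constant-velocity model; the matrices and measurements are invented for illustration and are not taken from the thesis.

        import numpy as np

        # Minimal 1-D constant-velocity Kalman filter (illustrative sketch).
        # State x = [position, velocity]; only position is observed.
        F = np.array([[1.0, 1.0],   # state transition (time step dt = 1)
                      [0.0, 1.0]])
        H = np.array([[1.0, 0.0]])  # observation model: measure position only
        Q = 0.01 * np.eye(2)        # process-noise covariance (assumed Gaussian)
        R = np.array([[0.5]])       # measurement-noise covariance (assumed)

        x = np.zeros((2, 1))        # initial state estimate
        P = np.eye(2)               # initial error covariance

        def step(x, P, z):
            # Predict: propagate state and error covariance through the model.
            x = F @ x
            P = F @ P @ F.T + Q
            # Update: fuse the measurement z, weighted by the Kalman gain.
            y = z - H @ x                    # innovation
            S = H @ P @ H.T + R              # innovation covariance
            K = P @ H.T @ np.linalg.inv(S)   # Kalman gain
            x = x + K @ y
            P = (np.eye(2) - K @ H) @ P      # minimized error covariance
            return x, P

        for z in [1.1, 2.0, 2.9, 4.2]:       # noisy position measurements
            x, P = step(x, P, np.array([[z]]))
        print(x.ravel())  # filtered position and velocity estimates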
    I would like to thank the Ministry of Science and Innovation for granting me the funding with reference PRE2018-086793, associated with the project TEC2017-88048-C2-2-R, which provided me with the opportunity to carry out all my PhD activities, including an international research internship. Programa de Doctorado en Ciencia y Tecnología Informática por la Universidad Carlos III de Madrid. President: Antonio Berlanga de Jesús. Secretary: Daniel Arias Medina. Committee member: Alejandro Martínez Cav

    Study of stochastic and machine learning techniques for anomaly-based Web attack detection

    International Mention in the doctoral degree. Web applications are exposed to different threats, and it is necessary to protect them. Intrusion Detection Systems (IDSs) are a solution external to the web application that does not require modifying the application's code in order to protect it. These systems are located in the network, monitoring events and searching for signs of anomalies or threats that can compromise the security of information systems. IDSs have been applied to traffic analysis of different protocols, such as TCP, FTP, or HTTP. Web Application Firewalls (WAFs) are special cases of IDSs that specialize in analyzing HTTP traffic with the aim of safeguarding web applications. The increase in the amount of data traveling through the Internet and the growing sophistication of attacks call for protection mechanisms that are both effective and efficient. This thesis proposes three anomaly-based WAFs that are fast, achieve high detection results, and have a simple design. The anomaly-based approach defines the normal behavior of the web application; actions that deviate from it are considered anomalous. The proposed WAFs work at the application layer, analyzing the payload of HTTP requests. These systems are designed with different detection algorithms in order to compare their results and performance. Two of the proposed systems are based on stochastic techniques: one on statistical techniques and the other on Markov chains. The third WAF presented in this thesis is ML-based. Machine Learning (ML) deals with constructing computer programs that automatically learn from experience and can be very helpful in dealing with large amounts of data. Concretely, this third WAF is based on decision trees, given their proven effectiveness in intrusion detection. In particular, four algorithms are employed: C4.5, CART, Random Tree, and Random Forest. Typically, two phases are distinguished in IDSs: preprocessing and processing. In the case of the stochastic systems, preprocessing includes feature extraction. The processing phase consists of training the system to learn the normal behavior and later testing how well it classifies incoming requests as either normal or anomalous. The detection models are implemented either with statistical techniques or with Markov chains, depending on the system considered. For the system based on decision trees, the preprocessing phase comprises both feature extraction and feature selection. These two phases are optimized. On the one hand, new feature extraction methods are proposed; they combine features extracted by means of expert knowledge and n-grams, and are able to improve the detection results of either technique used separately. For feature selection, the Generic Feature Selection (GeFS) measure has been used, which has proven very effective in reducing the number of redundant and irrelevant features. Additionally, for the three systems, a study has been performed to establish the minimum number of requests required to train them in order to achieve a given detection result. Reducing the number of training requests can greatly help in optimizing the resource consumption of WAFs as well as the data gathering process. Besides designing and implementing the systems, evaluating them is an essential step. For that purpose, a dataset is necessary. Unfortunately, finding labeled and adequate datasets is not an easy task.
    In fact, a study of the most popular datasets in the intrusion detection field reveals that most of them do not satisfy the requirements for evaluating WAFs. To tackle this situation, this thesis proposes the new CSIC dataset, which satisfies the necessary conditions to satisfactorily evaluate WAFs. The proposed systems have been evaluated experimentally using the proposed CSIC dataset and the existing ECML/PKDD dataset. The three systems have been compared in terms of their detection results, processing time, and number of training requests used; for this comparison, the CSIC dataset has been used. In summary, this thesis proposes three WAFs based on stochastic and ML techniques and compares them, which makes it possible to determine which system is the most appropriate for each scenario. This work was carried out within the framework of the predoctoral grants of the Junta de Ampliación de Estudios (JAE) of the Agencia Estatal Consejo Superior de Investigaciones Científicas (CSIC). Programa Oficial de Doctorado en Ciencia y Tecnología Informática. President: Luis Hernández Encinas. Secretary: Juan Manuel Estévez Tapiador. Committee member: Georg Carl
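    To make the combination of n-gram features and decision trees concrete, here is a minimal sketch using scikit-learn; the requests, labels, and parameters are invented placeholders, not the thesis's feature extraction methods or datasets.

        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.tree import DecisionTreeClassifier

        # Toy HTTP request payloads; labels are invented for illustration
        # (0 = normal, 1 = anomalous).
        requests = [
            "GET /index.php?id=42",
            "GET /login.php?user=alice",
            "GET /index.php?id=1' OR '1'='1",                # SQL-injection-like
            "GET /search.php?q=<script>alert(1)</script>",   # XSS-like
        ]
        labels = [0, 0, 1, 1]

        # Character n-grams (here 2-grams) as features, one common choice
        # for payload-based attack detection.
        vectorizer = CountVectorizer(analyzer="char", ngram_range=(2, 2))
        X = vectorizer.fit_transform(requests)

        # CART-style decision tree, in the spirit of one of the four
        # algorithms the thesis compares.
        clf = DecisionTreeClassifier(random_state=0).fit(X, labels)

        test = vectorizer.transform(["GET /index.php?id=7; DROP TABLE users"])
        print(clf.predict(test))  # expected: [1] on this toy data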

    Statistical Fusion of Multi-aspect Synthetic Aperture Radar Data for Automatic Road Extraction

    In this dissertation, a new statistical fusion approach for automatic road extraction from SAR images taken from different looking angles (i.e., multi-aspect SAR data) is presented. The main inputs to the fusion are extracted line features. The fusion is carried out at the decision level and is based on Bayesian network theory.
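    As a hedged illustration of decision-level Bayesian fusion, the sketch below combines per-aspect line evidence through likelihood ratios; the prior and ratios are invented numbers, and this is a simplification, not the dissertation's Bayesian network.

        # Naive Bayesian decision-level fusion of line evidence from several
        # SAR aspects: illustrative sketch only, numbers invented.
        def fuse(prior, likelihood_ratios):
            # Convert prior P(road) to odds, multiply by each aspect's
            # likelihood ratio P(evidence|road) / P(evidence|no road),
            # then convert back to a posterior probability.
            odds = prior / (1.0 - prior)
            for lr in likelihood_ratios:
                odds *= lr
            return odds / (1.0 + odds)

        # Three aspects: two support the road hypothesis, one is ambiguous.
        print(fuse(prior=0.3, likelihood_ratios=[4.0, 2.5, 1.0]))  # ~0.81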

    Mathematics in Software Reliability and Quality Assurance

    This monograph concerns the mathematical aspects of software reliability and quality assurance and consists of 11 technical papers in this emerging area. Included are the latest research results on formal methods and design, automatic software testing, software verification and validation, coalgebra theory, automata theory, hybrid systems, and software reliability modeling and assessment.
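    As one concrete example from the software reliability modeling area the monograph covers, the following sketches the classic Goel-Okumoto growth model with mean value function m(t) = a(1 - e^(-bt)); the parameters are invented and the model is not necessarily one used in the included papers.

        import math

        # Goel-Okumoto NHPP software-reliability growth model: m(t) is the
        # expected cumulative number of faults detected by test time t.
        a, b = 120.0, 0.05   # a: total expected faults, b: detection rate (assumed)

        def expected_faults(t):
            return a * (1.0 - math.exp(-b * t))

        for t in (10, 50, 100):
            print(t, round(expected_faults(t), 1))
        # Estimated faults remaining after time t: a - m(t).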

    High-Performance Modelling and Simulation for Big Data Applications

    This open access book was prepared as the Final Publication of the COST Action IC1406 “High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)” project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predicting and analysing natural and complex systems in science and engineering. As their level of abstraction rises to give a better discernment of the domain at hand, their representation becomes increasingly demanding of computational and data resources. High Performance Computing, in turn, typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication, and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. A seamless interaction of High Performance Computing with Modelling and Simulation is therefore arguably required in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for its members and distinguished guests to openly discuss novel perspectives and topics of interest for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications.
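    As a toy illustration of coupling simulation with parallel processing units, the sketch below splits a Monte Carlo estimate across worker processes; it is illustrative only and unrelated to the cHiPSet case studies.

        import random
        from multiprocessing import Pool

        # Monte Carlo estimate of pi, with trials split across workers.
        def count_hits(n):
            hits = 0
            for _ in range(n):
                x, y = random.random(), random.random()
                if x * x + y * y <= 1.0:
                    hits += 1
            return hits

        if __name__ == "__main__":
            trials, workers = 1_000_000, 4
            with Pool(workers) as pool:
                hits = sum(pool.map(count_hits, [trials // workers] * workers))
            print(4.0 * hits / trials)  # approaches pi as trials grow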

    Forest point processes for the automatic extraction of networks in raster data

    In this paper, we propose a new stochastic approach for the automatic detection of network structures in raster data. We represent a network as a set of trees with acyclic planar graphs. We embed this model in the probabilistic framework of spatial point processes and determine the most probable configuration of trees by stochastic sampling. That is, different configurations are constructed randomly by modifying the graph parameters and by adding nodes and edges to, or removing them from, the current trees. Each configuration is evaluated based on the probabilities of these changes and an energy function describing its conformity with a predefined model. Using a Reversible Jump Markov Chain Monte Carlo sampler, an approximation of the global optimum of the energy function is reached iteratively. Although our main target application is the extraction of rivers and tidal channels in digital terrain models, experiments with other types of networks in images show the transferability of the approach to further applications. Qualitative and quantitative evaluations demonstrate the competitiveness of our approach with respect to existing algorithms.
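    The sampling loop can be sketched in a greatly simplified form: propose a birth or death move and accept it with a Metropolis-style probability derived from an energy function. The energy and proposals below are toy stand-ins, not the paper's model, and the full RJMCMC dimension-matching terms are omitted.

        import math, random

        def energy(config):
            # Toy prior: prefer configurations of about 10 nodes.
            return 0.5 * (len(config) - 10) ** 2

        random.seed(0)
        config = []
        for _ in range(5000):
            proposal = list(config)
            if config and random.random() < 0.5:
                proposal.pop(random.randrange(len(proposal)))  # "death" move
            else:
                proposal.append(random.random())               # "birth" move
            # Accept with probability min(1, exp(-(E_new - E_old))).
            if random.random() < math.exp(-(energy(proposal) - energy(config))):
                config = proposal
        print(len(config))  # hovers near the energy minimum (~10 nodes)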

    A review of technical factors to consider when designing neural networks for semantic segmentation of Earth Observation imagery

    Semantic segmentation (classification) of Earth Observation imagery is a crucial task in remote sensing. This paper presents a comprehensive review of technical factors to consider when designing neural networks for this purpose. The review focuses on Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Generative Adversarial Networks (GANs), and transformer models, discussing prominent design patterns for these ANN families and their implications for semantic segmentation. Common pre-processing techniques for ensuring optimal data preparation are also covered, including methods for image normalization and chipping, strategies for addressing data imbalance in training samples, and techniques for overcoming limited data, such as augmentation, transfer learning, and domain adaptation. By encompassing both the technical aspects of neural network design and the data-related considerations, this review provides researchers and practitioners with a comprehensive and up-to-date understanding of the factors involved in designing effective neural networks for semantic segmentation of Earth Observation imagery. Comment: 145 pages with 32 figures.
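    Two of the pre-processing steps the review discusses, normalization and chipping, can be sketched as follows; the chip size, conventions, and synthetic scene are illustrative assumptions, not prescriptions from the paper.

        import numpy as np

        def normalize(image):
            # Scale each band to zero mean, unit variance (one common convention).
            mean = image.mean(axis=(0, 1), keepdims=True)
            std = image.std(axis=(0, 1), keepdims=True)
            return (image - mean) / (std + 1e-8)

        def chip(image, size=256):
            # Split an H x W x bands scene into non-overlapping size x size
            # chips, discarding edge remainders for simplicity.
            h = image.shape[0] // size * size
            w = image.shape[1] // size * size
            return [image[i:i + size, j:j + size]
                    for i in range(0, h, size)
                    for j in range(0, w, size)]

        scene = np.random.rand(1000, 1500, 4).astype(np.float32)  # synthetic
        chips = chip(normalize(scene))
        print(len(chips), chips[0].shape)  # 15 chips of shape (256, 256, 4)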
