
    Fault detection in operating helicopter drive train components based on support vector data description

    The objective of this paper is to develop a vibration-based automated procedure for early detection of mechanical degradation of helicopter drive train components using Health and Usage Monitoring Systems (HUMS) data. An anomaly-detection method is developed to quantify the degree of deviation of a component's mechanical state from its nominal condition. The method is based on an Anomaly Score (AS) formed by combining a set of statistical features correlated with specific damages, also known as Condition Indicators (CIs); operational variability is thus implicitly included in the model through the CI correlation. The problem of fault detection is then recast as a one-class classification problem in the space spanned by a set of CIs, with the aim of globally differentiating between normal and anomalous observations, respectively related to healthy and supposedly faulty components. The paper uses a procedure based on an efficient one-class classification method that requires no assumption on the data distribution. The core of the approach is the Support Vector Data Description (SVDD), which allows an efficient data description without requiring a large amount of statistical data. Several analyses have been carried out to validate the proposed procedure, using flight vibration data collected from an in-service H135 (formerly known as EC135) helicopter, for which micro-pitting damage on a gear was detected by HUMS and assessed through visual inspection. The capability of the proposed approach to provide a better trade-off between false-alarm and missed-detection rates, compared with individual CIs and with the AS obtained by assuming jointly Gaussian-distributed CIs, is also analysed.
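
    As a rough illustration of the one-class formulation, the sketch below trains a data description on healthy-flight condition indicators and scores new acquisitions. It uses scikit-learn's OneClassSVM with an RBF kernel, which solves a problem equivalent to SVDD for that kernel; the feature matrices are synthetic placeholders, not HUMS data.

        # Sketch of one-class anomaly detection over condition indicators (CIs),
        # assuming one row of CI features per acquisition. scikit-learn has no
        # SVDD class, but OneClassSVM with an RBF kernel solves an equivalent
        # problem; nu bounds the fraction of training points treated as outliers.
        import numpy as np
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import OneClassSVM

        rng = np.random.default_rng(0)
        X_healthy = rng.normal(size=(500, 6))        # CIs from healthy flights (placeholder)
        X_test = rng.normal(loc=0.5, size=(50, 6))   # CIs from flights under test (placeholder)

        scaler = StandardScaler().fit(X_healthy)
        model = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale")
        model.fit(scaler.transform(X_healthy))

        # decision_function is positive inside the learned description (healthy)
        # and negative outside; its negated value acts as an anomaly score.
        scores = -model.decision_function(scaler.transform(X_test))
        print("flagged:", int((scores > 0).sum()), "of", len(scores))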

    Assessing Wildfire Damage from High Resolution Satellite Imagery Using Classification Algorithms

    Wildfire damage assessments provide important information for first responders, government agencies, and insurance companies to estimate the cost of damages and to help provide relief to those affected by a wildfire. With the help of Earth observation satellite technology, the burn area extent of a fire can be determined with traditional remote sensing methods such as the Normalized Burn Ratio. Very High Resolution satellites can give even more accurate damage assessments, but with a trade-off: they provide higher spatial and temporal resolution at the expense of spectral resolution. Since a wildfire burn area cannot be determined by traditional remote sensing methods with these higher-spatial-resolution satellites, machine learning can help predict the extent of the wildfire. This research project proposes an object-based classification method to train and compare several machine learning algorithms to detect the burn scars remaining after a wildfire. A building damage assessment approach is then provided. The results of this research project show that random forests can predict the burn scars with an accuracy of 86% using high resolution image data.
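
    For orientation, the sketch below shows the standard Normalized Burn Ratio computed from NIR/SWIR bands, followed by a random-forest burn-scar classifier. The band arrays, per-segment features, and labels are synthetic placeholders, not the project's actual object-based pipeline.

        # Normalized Burn Ratio (NBR) plus a random-forest classifier, as a
        # generic baseline. All inputs below are synthetic stand-ins.
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import train_test_split

        def nbr(nir, swir):
            # NBR = (NIR - SWIR) / (NIR + SWIR); burned areas give low values.
            return (nir - swir) / (nir + swir + 1e-9)

        rng = np.random.default_rng(1)
        nir, swir = rng.random((100, 100)), rng.random((100, 100))
        print("mean NBR:", nbr(nir, swir).mean())    # classic burn index, for reference

        # Object-based stand-in: per-segment feature vectors with burned/unburned labels.
        features = rng.normal(size=(1000, 8))
        labels = rng.integers(0, 2, size=1000)
        X_tr, X_te, y_tr, y_te = train_test_split(features, labels, test_size=0.3,
                                                  random_state=0)
        clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
        print("held-out accuracy:", clf.score(X_te, y_te))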

    The power spectrum from the angular distribution of galaxies in the CFHTLS-Wide fields at redshift ~0.7

    We measure the real-space galaxy power spectrum on large scales at redshifts 0.5 to 1.2 using optical colour-selected samples from the CFHT Legacy Survey. With the redshift distributions measured from a preliminary sample of ~14,000 spectroscopic redshifts from the VIMOS Public Extragalactic Redshift Survey (VIPERS), we deproject the angular distribution and directly estimate the three-dimensional power spectrum. We use a maximum likelihood estimator that is optimal for a Gaussian random field, giving well-defined window functions and error estimates. This measurement presents an initial look at the large-scale structure field probed by the VIPERS survey. We measure the galaxy bias of the VIPERS-like sample to be b_g = 1.38 ± 0.05 (for sigma_8 = 0.8) on scales k < 0.2 h/Mpc, averaged over 0.5 < z < 1.2. We further investigate three photometric redshift slices and, marginalising over the bias factors while keeping other LCDM parameters fixed, we find the matter density Omega_m = 0.30 ± 0.06.
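
    For reference, the deprojection rests on two standard relations, quoted here in their textbook form rather than as the paper's exact estimator: the linear galaxy bias linking the galaxy and matter power spectra, and the Limber projection linking the angular power spectrum to the 3D galaxy power spectrum through the normalized redshift distribution n(z) and comoving distance chi(z).

        % Linear bias and Limber projection (standard forms, assumed here).
        P_g(k) = b_g^2 \, P_m(k), \qquad
        C_\ell \simeq \int \mathrm{d}z \,
            \frac{n^2(z)}{\chi^2(z)}
            \left( \frac{\mathrm{d}\chi}{\mathrm{d}z} \right)^{-1}
            P_g\!\left( k = \frac{\ell + 1/2}{\chi(z)}, \, z \right).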

    Enhanced clustering analysis pipeline for performance analysis of parallel applications

    Clustering analysis is widely used to group data into the same cluster when they are similar according to specified metrics. Cluster analysis can be used to group the CPU bursts of a parallel application, that is, the regions on each process between communication calls or calls to the parallel runtime. The resulting clusters are the different computational trends or phases that appear in the application. These clusters are useful for understanding the behaviour of the computation part of the application and for focusing the analyses on those that present performance issues. Although density-based clustering algorithms are a powerful and efficient tool for summarizing this type of information, the traditional user-guided clustering methodology has many shortcomings in dealing with the complexity of data, the diversity of data structures, the high dimensionality of data, and the dramatic increase in the amount of data. Consequently, the majority of DBSCAN-like algorithms struggle to handle high-dimensional and/or multi-density data, and they are sensitive to their hyper-parameter configuration. Furthermore, extracting insight from the obtained clusters remains an intuitive, manual task. To mitigate these weaknesses, we propose a new unified approach that replaces user-guided clustering with an automated clustering analysis pipeline, the Enhanced Cluster Identification and Interpretation (ECII) pipeline. To build the pipeline, we propose novel techniques, namely Robust Independent Feature Selection, Feature Space Curvature Map, Organization Component Analysis, and hyper-parameter tuning, addressing feature selection, density homogenization, cluster interpretation, and model selection, respectively, which are the main components of our machine learning pipeline. This thesis thus contributes four new techniques to the machine learning field, with a particular use case in the performance analytics field. The first contribution is a novel unsupervised approach for feature selection on noisy data, called Robust Independent Feature Selection (RIFS). Specifically, we choose a feature subset that contains most of the underlying information, using the same criteria as Independent Component Analysis; simultaneously, the noise is separated out as an independent component. The second contribution is a parametric multilinear transformation method, called Feature Space Curvature Map (FSCM), that homogenizes cluster densities while preserving the topological structure of the dataset. We present a new Gravitational Self-Organizing Map to model the feature space curvature, plugging the concepts of gravity and the fabric of space into the Self-Organizing Map algorithm to mathematically describe the density structure of the data. To homogenize the cluster density, we introduce a novel mapping mechanism that projects the data from the non-Euclidean curved space to a new Euclidean flat space. The third contribution is a novel topology-based method for studying potentially complex, high-dimensional categorized data by quantifying their shapes and extracting fine-grained insights from them to interpret the clustering result. We introduce our Organization Component Analysis (OCA) method for the automatic study of arbitrary cluster shapes without any assumption about the data distribution. Finally, to tune the DBSCAN hyper-parameters, we propose a new tuning mechanism that combines techniques from the machine learning and optimization domains, and we embed it in the ECII pipeline.
    Using this clustering analysis pipeline with the CPU-burst data of a parallel application, we provide the developer/analyst with a high-quality detection of the SPMD computation structure, with the added value of reflecting the fine grain of the computation regions.
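
    As a baseline illustration of automated DBSCAN tuning (a generic heuristic, not the ECII mechanism itself), the sketch below picks eps candidates from the k-nearest-neighbour distance curve and keeps the configuration with the best silhouette score; the input is a synthetic stand-in for CPU-burst features.

        # Generic DBSCAN hyper-parameter search: eps candidates come from the
        # classic k-distance curve; model selection uses the silhouette score
        # computed over clustered (non-noise) points only.
        import numpy as np
        from sklearn.cluster import DBSCAN
        from sklearn.metrics import silhouette_score
        from sklearn.neighbors import NearestNeighbors

        def tune_dbscan(X, min_samples=5):
            d, _ = NearestNeighbors(n_neighbors=min_samples).fit(X).kneighbors(X)
            k_dist = np.sort(d[:, -1])               # k-distance curve
            best = (None, -1.0)
            for eps in np.quantile(k_dist, [0.5, 0.75, 0.9, 0.95]):
                labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
                mask = labels != -1                  # drop noise before scoring
                if mask.sum() > min_samples and len(set(labels[mask])) > 1:
                    s = silhouette_score(X[mask], labels[mask])
                    if s > best[1]:
                        best = (eps, s)
            return best                              # (chosen eps, its silhouette)

        X = np.random.default_rng(2).normal(size=(300, 4))  # stand-in features
        print(tune_dbscan(X))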

    Object Detection in High Resolution Aerial Images and Hyperspectral Remote Sensing Images

    With rapid developments in satellite and sensor technologies, there has been a dramatic increase in the availability of remotely sensed images. However, the exploration of these images still involves a tremendous amount of human intervention, which is tedious, time-consuming, and inefficient. To help imaging experts gain a complete understanding of the images and locate the objects of interest more accurately and efficiently, there is an urgent need for automatic detection algorithms. In this work, we delve into object detection problems in remote sensing applications, exploring detection algorithms for both hyperspectral images (HSIs) and high resolution aerial images. In the first part, we focus on the subpixel target detection problem in HSIs with low spatial resolution, where the objects of interest are much smaller than the pixel spatial resolution. To this end, we explore detection frameworks that integrate image segmentation techniques into the design of the matched filters (MFs). In particular, we propose a novel image segmentation algorithm to identify spatially and spectrally coherent image regions, from which the background statistics are estimated for deriving the MFs. Extensive experimental studies were carried out to demonstrate the advantages of the proposed subpixel target detection framework, showing the superiority of the approach compared with state-of-the-art methods. The second part of the thesis explores the object-based image analysis (OBIA) framework for geospatial object detection in high resolution aerial images. Specifically, we generate a tree representation of the aerial images from the output of hierarchical image segmentation algorithms and reformulate the object detection problem as a tree-matching task. We then propose two tree-matching algorithms for the object detection framework and demonstrate their efficiency and effectiveness. In the third part, we study object detection in high resolution aerial images from a machine learning perspective, investigating both a traditional machine learning based framework and an end-to-end convolutional neural network (CNN) based approach for various object detection tasks. In the traditional detection framework, we propose to apply the Gaussian process classifier (GPC) to train an object detector and demonstrate the advantages of this probabilistic classification algorithm. In the CNN based approach, we propose a novel scale transfer module that generates enhanced feature maps for object detection. Our results show the efficiency and competitiveness of the proposed algorithms compared with state-of-the-art counterparts.
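
    A minimal sketch of the classical spectral matched filter follows, assuming a simple global background model; the thesis instead estimates background statistics per segmented region. The hyperspectral cube and target signature below are synthetic placeholders.

        # Classical spectral matched filter (MF) for subpixel target detection,
        # with global (not segment-wise) background mean and covariance.
        import numpy as np

        def matched_filter(cube, target):
            # cube: (rows, cols, bands) HSI; target: (bands,) reference spectrum.
            X = cube.reshape(-1, cube.shape[-1])
            mu = X.mean(axis=0)
            cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])  # regularized
            w = np.linalg.solve(cov, target - mu)                      # MF weights
            scores = (X - mu) @ w / ((target - mu) @ w)                # normalized score
            return scores.reshape(cube.shape[:2])

        rng = np.random.default_rng(3)
        cube = rng.normal(size=(64, 64, 30))   # placeholder hyperspectral cube
        target = rng.normal(size=30)           # placeholder target signature
        print(matched_filter(cube, target).shape)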

    Principal Component Analysis

    This book aims to raise awareness among researchers, scientists, and engineers of the benefits of Principal Component Analysis (PCA) in data analysis. In this book, the reader will find applications of PCA in fields such as image processing, biometrics, face recognition, and speech processing. It also covers the core concepts and state-of-the-art methods in data analysis and feature extraction.
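
    For a concrete anchor, here is a minimal PCA implemented via the singular value decomposition, the core construction underlying all of the applications above: center the data, take the top-k right singular vectors as principal axes, and project onto them.

        # Minimal PCA via the SVD of the centered data matrix.
        import numpy as np

        def pca(X, k):
            Xc = X - X.mean(axis=0)
            U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
            components = Vt[:k]                 # principal axes, shape (k, n_features)
            scores = Xc @ components.T          # coordinates in the PCA subspace
            explained = (S[:k] ** 2) / (S ** 2).sum()  # explained-variance ratios
            return scores, components, explained

        X = np.random.default_rng(4).normal(size=(200, 10))
        scores, comps, evr = pca(X, k=2)
        print(scores.shape, evr.round(3))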

    Sparse Modeling for Image and Vision Processing

    In recent years, a large amount of multi-disciplinary research has been conducted on sparse models and their applications. In statistics and machine learning, the sparsity principle is used to perform model selection, that is, automatically selecting a simple model among a large collection of them. In signal processing, sparse coding consists of representing data with linear combinations of a few dictionary elements. Subsequently, the corresponding tools have been widely adopted by several scientific communities such as neuroscience, bioinformatics, and computer vision. The goal of this monograph is to offer a self-contained view of sparse modeling for visual recognition and image processing. More specifically, we focus on applications where the dictionary is learned and adapted to data, yielding a compact representation that has been successful in various contexts.
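
    A small sketch of the learned-dictionary setting the monograph surveys: fit a dictionary to image patches, then represent each patch as a sparse combination of a few atoms. The patch data here is synthetic; real use would extract patches from images (e.g. with sklearn.feature_extraction.image.extract_patches_2d).

        # Dictionary learning + sparse coding with scikit-learn, on placeholder
        # "patches". transform() returns sparse coefficients (OMP by default).
        import numpy as np
        from sklearn.decomposition import MiniBatchDictionaryLearning

        rng = np.random.default_rng(5)
        patches = rng.normal(size=(1000, 64))    # stand-in for 8x8 image patches

        dico = MiniBatchDictionaryLearning(n_components=100, alpha=1.0,
                                           random_state=0).fit(patches)
        codes = dico.transform(patches)          # sparse codes over learned atoms
        print("mean nonzeros per patch:", (codes != 0).sum(axis=1).mean())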

    Vacuum ultraviolet laser induced breakdown spectroscopy (VUV-LIBS) for pharmaceutical analysis

    Laser induced breakdown spectroscopy (LIBS) allows quick analysis to determine the elemental composition of the target material. Samples need little or no preparation, removing the risk of contamination or loss of analyte. The technique is minimally ablative, so a negligible amount of the sample is destroyed, while allowing quantitative and qualitative results. Vacuum ultraviolet (VUV) LIBS, thanks to the abundance of transitions at shorter wavelengths, offers improvements over LIBS in the visible region, such as lower limits of detection for trace elements, and extends LIBS to elements and samples not suited to visible LIBS. These qualities also make VUV-LIBS attractive for pharmaceutical analysis. Owing to success in the pharmaceutical sector, the molecules representing the active pharmaceutical ingredients (APIs) have become increasingly complex. These organic compounds yield spectra densely populated with carbon and oxygen lines in the visible and infrared regions, making it increasingly difficult to identify an inorganic analyte. The VUV region offers a solution, as there is much better spacing between spectral lines. VUV-LIBS experiments were carried out on pharmaceutical samples. This work is a proof of principle that VUV-LIBS in conjunction with machine learning can tell pharmaceuticals apart via classification, tested in two ways. First, by classifying pharmaceuticals that are very different from one another, i.e., having different APIs: this gauges the efficacy of using VUV emission spectra to separate analytes that are essentially carbohydrates with distinctly different APIs into different classes. Second, by classifying two different brands of the same pharmaceutical, i.e., paracetamol: this investigates the ability of machine learning to abstract and identify the differences in the spectra of two pharmaceuticals with the same API and separate them. The second test presents VUV-LIBS combined with machine learning as a solution for at-line analysis of similar analytes, e.g., in quality control. The machine learning techniques explored in this thesis were convolutional neural networks (CNNs), support vector machines, self-organizing maps, and competitive learning. The motivation for applying principal component analysis (PCA) and machine learning is the classification of analytes, allowing us to distinguish pharmaceuticals from one another based on their spectra. PCA and the machine learning techniques are compared against one another in this thesis. Several innovations were made: this work is the first in LIBS to use a short-time Fourier transform (STFT) to generate input images for a CNN from VUV-LIBS spectra, and it is also believed to be the first work in LIBS to develop and apply an ellipsoidal classifier based on PCA. The results of this work show that by lowering the pulse energy it is possible to gather more useful spectra over the surface of a sample. Although this yields spectra with a poorer signal-to-noise ratio, the samples can still be classified using the machine learning analytics. The results in this thesis indicate that, of all the machine learning techniques evaluated, CNNs have the best classification accuracy combined with the fastest run time. Prudent data augmentation can significantly reduce experimental workloads without reducing classification rates.
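
    The abstract does not spell out the ellipsoidal classifier, so the sketch below is one plausible reading under stated assumptions: each pharmaceutical class is modelled as a Mahalanobis ellipsoid in a shared PCA subspace, and a spectrum is assigned to the nearest ellipsoid. The spectra and labels are synthetic placeholders.

        # Hypothetical PCA-based ellipsoidal classifier: per-class mean and
        # inverse covariance in PCA space define a Mahalanobis ellipsoid.
        import numpy as np
        from sklearn.decomposition import PCA

        def fit_ellipsoids(X, y, n_components=5):
            pca = PCA(n_components=n_components).fit(X)
            Z = pca.transform(X)
            stats = {}
            for c in np.unique(y):
                Zc = Z[y == c]
                cov = np.cov(Zc, rowvar=False) + 1e-6 * np.eye(n_components)
                stats[c] = (Zc.mean(axis=0), np.linalg.inv(cov))
            return pca, stats

        def predict(pca, stats, X):
            Z = pca.transform(X)
            def d2(z, m, icov):
                return (z - m) @ icov @ (z - m)   # squared Mahalanobis distance
            return np.array([min(stats, key=lambda c: d2(z, *stats[c])) for z in Z])

        rng = np.random.default_rng(6)
        X = rng.normal(size=(200, 512))           # placeholder VUV-LIBS spectra
        y = rng.integers(0, 2, size=200)          # placeholder class labels
        pca, stats = fit_ellipsoids(X, y)
        print(predict(pca, stats, X[:5]))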

    Face Centered Image Analysis Using Saliency and Deep Learning Based Techniques

    Image analysis starts from the goal of building vision machines that can perceive like humans, intelligently inferring general principles and sensing the surrounding situation from imagery. This dissertation studies face-centered image analysis as a core problem in high-level computer vision research and addresses it by tackling three challenging subjects: Is there anything interesting in the image, and if so, what is it? If a person is present, who is he/she, what expression is he/she performing, and can we know his/her age? Answering these questions leads to saliency-based object detection, deep-learning-based object categorization and recognition, human facial landmark detection, and multi-task biometrics. For object detection, a three-level saliency detection method based on the self-similarity technique (SMAP) is first proposed. The first level of SMAP uses statistical methods to generate proto-background patches, followed by a second level that computes local contrast based on image self-similarity characteristics. Finally, a spatial color distribution constraint is applied to realize the saliency detection. The outcome of the algorithm is a full-resolution image with highlighted salient objects and well-defined edges. For object recognition, the Adaptive Deconvolution Network (ADN) is implemented to categorize the objects extracted by saliency detection. To improve system performance, an L1/2-norm-regularized ADN is proposed and tested in different applications; the results demonstrate the efficiency and significance of the new structure. To fully understand the facial-biometrics-related activity contained in the image, low-rank matrix decomposition is introduced to help locate landmark points on face images; a natural extension of this work benefits human facial expression recognition and facial feature parsing research. To facilitate understanding of the detected facial image, automatic facial image analysis becomes essential. We present a novel deeply learnt tree-structured face representation to uniformly model the human face with different semantic meanings, and we show that the proposed feature yields a unified representation in multi-task facial biometrics and that the multi-task learning framework is applicable to many other computer vision tasks.
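
    SMAP's three levels are only summarized above, so the sketch below shows just the generic centre-surround local-contrast idea behind the second level, not the SMAP algorithm itself; the window sizes and the random test image are arbitrary assumptions.

        # Centre-surround local-contrast saliency baseline: score each pixel by
        # the colour distance between a small centre window and a larger
        # surround window, then normalize the map to [0, 1].
        import numpy as np
        from scipy.ndimage import uniform_filter

        def contrast_saliency(img, center=3, surround=15):
            # img: (H, W, 3) float array; returns an (H, W) saliency map.
            c = np.stack([uniform_filter(img[..., k], center) for k in range(3)], -1)
            s = np.stack([uniform_filter(img[..., k], surround) for k in range(3)], -1)
            sal = np.linalg.norm(c - s, axis=-1)
            return sal / (sal.max() + 1e-9)

        img = np.random.default_rng(7).random((128, 128, 3))
        print(contrast_saliency(img).shape)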