
    Data Driven Approach to Non-stationary EMA Fault Detection and Investigation Into Remaining Useful Life

    Growing interest in using Electromechanical Actuators (EMAs) to replace current hydraulic actuation methods on aircraft control surfaces has driven significant research in the area of prognostics and health management. Non-stationary speeds and loads in the course of controlling an aircraft surface make fault identification in EMAs difficult. This work presents a time-frequency analysis of EMA thrust bearing vibration signals using wavelet transforms. A relatively small EMA system is designed and built to allow for simple, quick, and repeatable component replacement. A simulated signal is developed to test four potential faults in the system. Classification is performed using an artificial neural network (ANN), which yields over 99% accuracy. Indentation faults from moderate and heavy loads are seeded in thrust bearings, which are then tested to generate data. The ANN achieves 95% classification accuracy in a two-class scenario using healthy and moderately indented bearings. A three-class test is executed using thrust bearings at each level of damage to perform preliminary remaining useful life (RUL) testing, where an ANN is able to identify the fault severity with an accuracy of 88%.
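    As an illustration of the kind of pipeline this abstract describes (time-frequency features from vibration signals feeding an artificial neural network), the sketch below uses PyWavelets and scikit-learn on synthetic signals; the sampling rate, fault frequencies, and feature choice are assumptions for illustration, not values from the study.

```python
# Illustrative sketch: wavelet-based features from vibration signals feeding an ANN.
# Assumes PyWavelets and scikit-learn; synthetic signals stand in for real EMA data.
import numpy as np
import pywt
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
fs = 10_000                      # sampling rate (Hz), assumed
t = np.arange(0, 0.5, 1 / fs)

def make_signal(fault_freq):
    """Toy vibration signal: shaft tone plus a fault-dependent impulse train and noise."""
    base = np.sin(2 * np.pi * 60 * t)
    impulses = 0.5 * (np.sin(2 * np.pi * fault_freq * t) > 0.99)
    return base + impulses + 0.2 * rng.standard_normal(t.size)

def wavelet_features(x, scales=np.arange(1, 33)):
    """Mean absolute CWT coefficient per scale as a compact time-frequency feature vector."""
    coefs, _ = pywt.cwt(x, scales, "morl", sampling_period=1 / fs)
    return np.abs(coefs).mean(axis=1)

# Two toy classes: "healthy" vs "indented bearing" (fault frequencies are made up).
X = np.array([wavelet_features(make_signal(f)) for f in [0] * 50 + [120] * 50])
y = np.array([0] * 50 + [1] * 50)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0).fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```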

    Data-driven remote fault detection and diagnosis of HVAC terminal units using machine learning techniques

    The modernising and retrofitting of older buildings has driven the installation of building management systems (BMS), which aim to help building managers pave the way towards smarter energy use, improved maintenance, and increased occupant comfort inside a building. A BMS is a computerised control system that controls and monitors a building’s equipment and services, such as lighting, ventilation, power systems, fire and security systems, etc. Buildings are becoming more and more complex environments, and their share of global energy consumption has grown to around 40% over the past decades. Still, there is no generalised solution or standardisation method available to maintain and manage a building’s energy consumption. This research therefore aims to develop an intelligent solution for the building’s electrical and mechanical units that consume the most power. Indeed, remote control and monitoring of Heating, Ventilation and Air-Conditioning (HVAC) units, based on the information received from thousands of sensors and actuators, is a crucial task in BMS, and automatically identifying faulty units is essential to optimise running and energy usage. Therefore, a comprehensive analysis of HVAC data and the development of computational intelligence methods for automatic fault detection and diagnosis are presented here, covering the period July 2015 to October 2015 for a real commercial building in London. This study mainly investigated one of the HVAC sub-units, namely the fan-coil unit’s terminal unit (TU). The work comprises three stages: data collection, pre-processing, and machine learning. Machine learning algorithms for TU behaviour identification, employing unsupervised, supervised, and semi-supervised learning and their combinations, were used to build an automatic intelligent solution for building services. The accuracy of these algorithms has been measured in both training and testing phases, the results compared with suitable alternative algorithms, and validated through statistical measures. This research provides an intelligent solution for real-time prediction through the development of an effective automatic fault detection and diagnosis system, creating a smarter way to handle BMS data for energy optimisation.
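    A minimal sketch of the combined unsupervised/supervised idea described above (cluster unlabelled terminal-unit observations, label the clusters, then train a classifier for routine screening); the feature names, fault pattern, and the labelling shortcut are illustrative assumptions, not details from the thesis.

```python
# Illustrative sketch of combining unsupervised and supervised learning for TU monitoring.
# Feature names and the labelling step are assumptions, not taken from the thesis.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Pretend each row is one TU observation: [room temp error (K), valve position (%), airflow (l/s)]
X = np.vstack([
    rng.normal([0.5, 40, 120], [0.3, 10, 15], size=(500, 3)),   # nominal behaviour
    rng.normal([4.0, 95, 30], [0.5, 3, 10], size=(60, 3)),      # e.g. stuck valve / low flow
])
X_std = StandardScaler().fit_transform(X)

# 1) Unsupervised stage: group similar behaviour patterns.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_std)

# 2) An engineer would inspect the cluster prototypes and assign labels; here this is
#    simulated by treating the smaller cluster as "faulty".
counts = np.bincount(kmeans.labels_)
faulty_cluster = int(np.argmin(counts))
y = (kmeans.labels_ == faulty_cluster).astype(int)

# 3) Supervised stage: train a classifier on the pseudo-labelled data for routine screening.
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_std, y)
print("flagged as faulty:", int(clf.predict(X_std).sum()), "of", len(X_std), "observations")
```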

    Semi-Supervised Learning Techniques for Automated Fault Detection and Diagnosis of HVAC System

    This work demonstrates and evaluates semi-supervised learning (SSL) techniques on heating, ventilation and air-conditioning (HVAC) data from a real building in order to discover and identify faults automatically. Real HVAC sensor data is usually unstructured and unlabelled, so machine-learning techniques only perform well when the raw data is preprocessed, which increases the overall operational cost of the system and makes real-time application difficult. Because of this data complexity and the limited availability of labelled information, a robust semi-supervised automatic fault detection and diagnosis (AFDD) tool is proposed here. The method has been tested and compared on more than 50 thousand TUs. Established statistical performance metrics and a paired t-test have been applied to validate the proposed work.
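    One representative semi-supervised technique, scikit-learn's LabelSpreading, applied to synthetic terminal-unit-style data gives a flavour of the approach; the abstract does not specify the exact SSL algorithms used, so this is an assumption-laden sketch rather than the proposed AFDD tool.

```python
# Minimal semi-supervised sketch: propagate a handful of engineer-provided labels
# to a large pool of unlabelled TU observations. Data and features are synthetic.
import numpy as np
from sklearn.semi_supervised import LabelSpreading
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
healthy = rng.normal([0.5, 40], [0.3, 10], size=(1000, 2))
faulty = rng.normal([4.0, 95], [0.5, 3], size=(100, 2))
X = StandardScaler().fit_transform(np.vstack([healthy, faulty]))
y_true = np.array([0] * 1000 + [1] * 100)

# Only a few observations carry labels (some from each class); the rest are marked -1.
y_partial = np.full(y_true.shape, -1)
labelled_idx = np.concatenate([
    rng.choice(1000, size=10, replace=False),        # labelled healthy examples
    1000 + rng.choice(100, size=5, replace=False),   # labelled faulty examples
])
y_partial[labelled_idx] = y_true[labelled_idx]

model = LabelSpreading(kernel="knn", n_neighbors=7).fit(X, y_partial)
unlabelled = y_partial == -1
accuracy = (model.transduction_[unlabelled] == y_true[unlabelled]).mean()
print(f"transductive accuracy on the unlabelled pool: {accuracy:.3f}")
```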

    MAPPING AND DECOMPOSING SCALE-DEPENDENT SOIL MOISTURE VARIABILITY WITHIN AN INNER BLUEGRASS LANDSCAPE

    There is a shared desire among public and private sectors to make more reliable predictions, accurate mapping, and appropriate scaling of soil moisture and associated parameters across landscapes. A discrepancy often exists between the scale at which soil hydrologic properties are measured and the scale at which they are modeled for management purposes. Moreover, little is known about the relative importance of hydrologic modeling parameters as soil moisture fluctuates with time. More research is needed to establish which observation scales in space and time are optimal for managing soil moisture variation over large spatial extents and how these scales are affected by fluctuations in soil moisture content with time. This research fuses high-resolution geoelectric and light detection and ranging (LiDAR) data as auxiliary measures to support sparse direct soil sampling over a 40 hectare inner Bluegrass, Kentucky (USA) landscape. A Veris 3100 was used to measure shallow and deep apparent electrical conductivity (aEC) in tandem with soil moisture sampling on three separate dates with ascending soil moisture contents ranging from plant wilting point to near field capacity. Terrain attributes were produced from 2010 LiDAR ground returns collected at ≤1 m nominal pulse spacing. Exploratory statistics revealed the variables that best associate with soil moisture, including terrain features (slope, profile curvature, and elevation), soil physical and chemical properties (calcium, cation exchange capacity, organic matter, clay and sand), and aEC for each date. Multivariate geostatistics, time stability analyses, and spatial regression were performed to characterize scale-dependent soil moisture patterns in space and time and to determine which soil-terrain parameters influence soil moisture distribution. Results showed that soil moisture variation was time stable across the landscape and primarily associated with long-range (~250 m) soil physicochemical properties. When the soils approached field capacity, however, there was a shift in relative importance from long-range soil physicochemical properties to short-range (~70 m) terrain attributes, although this shift did not cause time instability. The results suggest that soil moisture’s interaction with soil-terrain parameters is time dependent and that this dependence influences which observation scale is optimal for sampling and managing soil moisture variation.
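    Time stability analyses of this kind are commonly carried out with mean relative differences across survey dates; the short sketch below shows that calculation on synthetic moisture readings and is not the study's own code, so the array shapes and thresholds are assumptions.

```python
# Sketch of a time-stability analysis: mean relative difference (MRD) per sampling
# location across survey dates. The array shapes are illustrative.
import numpy as np

rng = np.random.default_rng(3)
# soil moisture (vol. %) at 60 sampling locations on 3 survey dates (rows = dates)
theta = rng.normal(loc=[[18], [25], [32]], scale=2.0, size=(3, 60))

# Relative difference of each location from the field mean on each date.
field_mean = theta.mean(axis=1, keepdims=True)
rel_diff = (theta - field_mean) / field_mean

mrd = rel_diff.mean(axis=0)          # mean relative difference per location
sdrd = rel_diff.std(axis=0, ddof=1)  # its standard deviation (low = time stable)

# Locations with MRD near zero and small SDRD track the field-average moisture over time.
rank = np.argsort(np.abs(mrd) + sdrd)
print("most time-stable locations:", rank[:5])
```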

    Automatic fault detection and diagnosis in refrigeration systems: A data-driven approach


    On the application of domain adaptation in structural health monitoring

    The application of machine learning within Structural Health Monitoring (SHM) has been widely successful in a variety of applications. However, most techniques are built upon the assumption that both training and test data were drawn from the same underlying distribution. This means that unless test data were obtained from the same system under the same operating conditions, machine learning inferences from the training data will not provide accurate predictions when applied to the test data. Therefore, to train a robust predictor conventionally, new training data and labels must be collected for every new structure considered, which is significantly expensive and often impossible in an SHM context. Transfer learning, in the form of domain adaptation, offers a novel solution to these problems by providing a method for mapping the feature and label distributions of different structures (a labelled source and an unlabelled target structure) onto the same space. As a result, classifiers trained on a labelled structure in the source domain will generalise to a different, unlabelled target structure. Furthermore, the contexts in which domain adaptation is applicable are discussed holistically, specifically for population-based SHM. Three domain adaptation techniques are demonstrated on four case studies, providing new frameworks for approaching the problem of SHM.
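    A minimal sketch of one simple domain adaptation method, CORAL-style correlation alignment, which maps labelled source-domain features onto the target domain's statistics before training; it is offered only as an illustration and is not claimed to be one of the three techniques demonstrated in the work above.

```python
# Sketch of a simple domain adaptation step (CORAL-style correlation alignment):
# re-colour source-domain features so their statistics match the target domain,
# then train on the adapted source and predict on the unlabelled target.
# Illustrative only; not claimed to be one of the paper's three techniques.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def coral(Xs, Xt, eps=1e-3):
    """Align source features Xs to target features Xt by whitening and re-colouring."""
    Cs = np.cov(Xs, rowvar=False) + eps * np.eye(Xs.shape[1])
    Ct = np.cov(Xt, rowvar=False) + eps * np.eye(Xt.shape[1])
    s_vals, s_vecs = np.linalg.eigh(Cs)
    t_vals, t_vecs = np.linalg.eigh(Ct)
    whiten = s_vecs @ np.diag(s_vals ** -0.5) @ s_vecs.T    # Cs^(-1/2)
    recolor = t_vecs @ np.diag(t_vals ** 0.5) @ t_vecs.T    # Ct^(1/2)
    return (Xs - Xs.mean(0)) @ whiten @ recolor + Xt.mean(0)

rng = np.random.default_rng(4)
# Labelled "source structure" and unlabelled "target structure" with shifted feature statistics.
Xs = rng.normal(0, 1, (300, 4))
ys = (Xs[:, 0] + Xs[:, 1] > 0).astype(int)
Xt = 1.5 * rng.normal(0, 1, (300, 4)) + 0.8
yt = (Xt[:, 0] + Xt[:, 1] > 1.6).astype(int)   # held out, used only to score the example

clf = KNeighborsClassifier().fit(coral(Xs, Xt), ys)
print("target accuracy after alignment:", (clf.predict(Xt) == yt).mean())
```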

    Computational solutions for addressing heterogeneity in DNA methylation data

    DNA methylation, a reversible epigenetic modification, has been implicated in various biological processes including gene regulation. Due to the multitude of datasets available, it is a premier candidate for computational tool development, especially for investigating heterogeneity within and across samples. We differentiate between three levels of heterogeneity in DNA methylation data: between-group, between-sample, and within-sample heterogeneity. Here, we separately address these three levels and present new computational approaches to quantify and systematically investigate heterogeneity. Epigenome-wide association studies relate a DNA methylation aberration to a phenotype and therefore address between-group heterogeneity. To facilitate such studies, which necessarily include data processing, exploratory data analysis, and differential analysis of DNA methylation, we extended the R-package RnBeads. We implemented novel methods for calculating the epigenetic age of individuals, novel imputation methods, and differential variability analysis. A use case of the new features is presented using samples from Ewing sarcoma patients. As an important driver of epigenetic differences between phenotypes, we systematically investigated associations between donor genotypes and DNA methylation states in methylation quantitative trait loci (methQTL). To that end, we developed a novel computational framework, MAGAR, for determining statistically significant associations between genetic and epigenetic variations. We applied the new pipeline to samples obtained from sorted blood cells and complex bowel tissues of healthy individuals and found that tissue-specific and common methQTLs have distinct genomic locations and biological properties. To investigate cell-type-specific DNA methylation profiles, which are the main drivers of within-group heterogeneity, computational deconvolution methods can be used to dissect DNA methylation patterns into latent methylation components. Deconvolution methods require profiles of high technical quality, and the identified components need to be biologically interpreted. We developed a computational pipeline to perform deconvolution of complex DNA methylation data, which implements crucial data processing steps and facilitates result interpretation. We applied the protocol to lung adenocarcinoma samples and found indications of tumor infiltration by immune cells and associations of the detected components with patient survival. Within-sample heterogeneity (WSH), i.e., heterogeneous DNA methylation patterns at a genomic locus within a biological sample, is often neglected in epigenomic studies. We present the first systematic benchmark of scores quantifying WSH genome-wide using simulated and experimental data. Additionally, we created two novel scores that quantify DNA methylation heterogeneity at single CpG resolution with improved robustness toward technical biases. WSH scores describe different types of WSH in simulated data, quantify differential heterogeneity, and serve as a reliable estimator of tumor purity. Due to the broad availability of DNA methylation data, the levels of heterogeneity in DNA methylation data can be comprehensively investigated. We contribute novel computational frameworks for analyzing DNA methylation data with respect to different levels of heterogeneity.
We envision that this toolbox will be indispensable for understanding the functional implications of DNA methylation patterns in health and disease.
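    To make the notion of a within-sample heterogeneity score concrete, the sketch below computes one classic measure, methylation entropy over read-level epiallele patterns in a small CpG window; it is illustrative only and is not one of the two novel scores introduced in the thesis.

```python
# Sketch of a classic within-sample heterogeneity measure: methylation entropy
# over the epiallele patterns of reads spanning a small window of CpGs.
# Illustrates the idea of a WSH score; not one of the thesis's novel scores.
import numpy as np
from collections import Counter

def methylation_entropy(read_patterns):
    """Normalised Shannon entropy of epiallele patterns, e.g. ('1011', '1011', '0000', ...).
    0 = fully homogeneous locus, 1 = maximally heterogeneous."""
    counts = np.array(list(Counter(read_patterns).values()), dtype=float)
    p = counts / counts.sum()
    n_cpgs = len(read_patterns[0])
    entropy = -(p * np.log2(p)).sum()
    return entropy / n_cpgs  # max entropy is log2(2**n_cpgs) = n_cpgs bits

rng = np.random.default_rng(5)
homogeneous = ["1111"] * 20                      # all reads fully methylated
mixed = ["1111"] * 10 + ["0000"] * 10            # two clonal epialleles
noisy = ["".join(rng.choice(["0", "1"], 4)) for _ in range(20)]   # disordered patterns

for name, reads in [("homogeneous", homogeneous), ("mixed", mixed), ("noisy", noisy)]:
    print(name, round(methylation_entropy(reads), 3))
```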

    Monitoring Pneumatic Actuators’ Behavior Using Real-World Data Set

    A big data signal processing method is developed to monitor the behavior of a common component: a pneumatic actuator. The method is aimed at supporting condition-based maintenance activities: monitoring signals over an extended period, identifying and classifying different machine states that may indicate abnormal behavior, and preparing a balanced data set for training supervised machine learning models that represents all of the component’s identified conditions. Peak detection, garbage removal, and down-sampling by interpolation were applied for signal preprocessing. Undersampling of the over-represented signals, Ward’s hierarchical clustering with multivariate Euclidean distance calculation, and Kohonen self-organizing map (KSOM) methods were used for identifying and grouping similar signal patterns. The study demonstrated that the behavior of equipment displaying complex signals can be monitored with the method described. Both hierarchical clustering and KSOM are suitable methods for identifying and clustering signals of different machine states that may be overlooked if screened by humans. Using the proposed methods, signals can be screened thoroughly and over a long period of time, which is critical when failures or abnormal behavior are rare. A visual display of the identified clusters over time can help in analyzing the deterioration of machine conditions. The clustered signals can be used to create a balanced set of training data for developing supervised machine learning models that automatically identify previously recognized machine conditions indicating abnormal behavior.
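    A brief sketch of two of the steps named above, down-sampling by interpolation and Ward's hierarchical clustering, applied to synthetic actuator stroke signals; the signal model and cluster count are assumptions for illustration, not taken from the real data set.

```python
# Sketch of the preprocessing + clustering idea: resample each actuator stroke signal
# to a common length by interpolation, then group similar strokes with Ward's method.
# Signals here are synthetic stand-ins for the real pneumatic-actuator data.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(6)

def resample(signal, n_points=100):
    """Down-sample (or up-sample) a stroke signal to a fixed length by linear interpolation."""
    old_x = np.linspace(0, 1, len(signal))
    new_x = np.linspace(0, 1, n_points)
    return np.interp(new_x, old_x, signal)

def stroke(duration, sluggish=False):
    """Toy pressure curve for one actuator stroke; 'sluggish' strokes rise more slowly."""
    t = np.linspace(0, 1, duration)
    tau = 0.35 if sluggish else 0.1
    return 1 - np.exp(-t / tau) + 0.02 * rng.standard_normal(duration)

signals = [stroke(rng.integers(80, 300)) for _ in range(40)] + \
          [stroke(rng.integers(80, 300), sluggish=True) for _ in range(10)]
X = np.array([resample(s) for s in signals])

# Ward's hierarchical clustering on Euclidean distances between resampled strokes.
Z = linkage(X, method="ward")
labels = fcluster(Z, t=2, criterion="maxclust")
print("cluster sizes:", np.bincount(labels)[1:])
```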

    Doctor of Philosophy

    Correlation is a powerful relationship measure used in many fields to estimate trends and make forecasts. When the data are complex, large, and high dimensional, correlation identification is challenging. Several visualization methods have been proposed to solve these problems, but they all have limitations in accuracy, speed, or scalability. In this dissertation, we propose a methodology that provides new visual designs that show details when possible and aggregates when necessary, along with robust interactive mechanisms that together enable quick identification and investigation of meaningful relationships in large and high-dimensional data. We propose four techniques using this methodology. Depending on data size and dimensionality, the most appropriate visualization technique can be provided to optimize the analysis performance. First, to improve correlation identification tasks between two dimensions, we propose a new correlation task-specific visualization method called the correlation coordinate plot (CCP). CCP transforms data into a powerful coordinate system for estimating the direction and strength of correlations among dimensions. Next, we propose three visualization designs to optimize correlation identification tasks in large and multidimensional data. The first is snowflake visualization (Snowflake), a focus+context layout for exploring all pairwise correlations. The next proposed design is a new interactive design for representing and exploring data relationships in parallel coordinate plots (PCPs) for large data, called data scalable parallel coordinate plots (DSPCP). Finally, we propose a novel technique for storing and accessing multiway dependencies through visualization (MultiDepViz). We evaluate these approaches on various use cases, compare them to prior work, and conduct user studies to demonstrate how our proposed approaches help users explore correlation in large data efficiently. Our results confirmed that the CCP/Snowflake, DSPCP, and MultiDepViz methods outperform some current visualization techniques such as scatterplots (SCPs), PCPs, SCP matrix, Corrgram, Angular Histogram, and UntangleMap in both accuracy and timing. Finally, these approaches are applied to real-world applications such as a debugging tool, large-scale code performance data, and large-scale climate data.
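    The designs above all present pairwise correlation structure; as a minimal illustration of the underlying quantity, the sketch below computes a Pearson correlation matrix and ranks the strongest dimension pairs. It does not reproduce CCP, Snowflake, DSPCP, or MultiDepViz themselves, and the injected correlations are made up.

```python
# Compute the pairwise correlation structure that correlation-oriented visual designs display,
# and list the strongest dimension pairs. Data and injected correlations are synthetic.
import numpy as np

rng = np.random.default_rng(7)
n, d = 1000, 6
X = rng.standard_normal((n, d))
X[:, 1] = 0.8 * X[:, 0] + 0.2 * rng.standard_normal(n)   # inject one strong positive correlation
X[:, 4] = -0.6 * X[:, 2] + 0.4 * rng.standard_normal(n)  # and one moderate negative correlation

corr = np.corrcoef(X, rowvar=False)          # d x d Pearson correlation matrix

# Rank dimension pairs by |r| so the strongest relationships surface first.
pairs = [(i, j, corr[i, j]) for i in range(d) for j in range(i + 1, d)]
for i, j, r in sorted(pairs, key=lambda p: -abs(p[2]))[:3]:
    print(f"dims {i} and {j}: r = {r:+.2f}")
```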

    A New Approach to Arabic Sign Language Recognition System
