72 research outputs found

    Neuron Clustering for Mitigating Catastrophic Forgetting in Supervised and Reinforcement Learning

    Neural networks have had many great successes in recent years, particularly with the advent of deep learning and many novel training techniques. One issue that has affected neural networks and prevented them from performing well in more realistic online environments is that of catastrophic forgetting. Catastrophic forgetting affects supervised learning systems when input samples are temporally correlated or are non-stationary. However, most real-world problems are non-stationary in nature, resulting in prolonged periods of time separating inputs drawn from different regions of the input space. Reinforcement learning represents a worst-case scenario when it comes to precipitating catastrophic forgetting in neural networks. Meaningful training examples are acquired as the agent explores different regions of its state/action space. When the agent is in one such region, only highly correlated samples from that region are typically acquired. Moreover, the regions that the agent is likely to visit will depend on its current policy, suggesting that an agent that has a good policy may avoid exploring particular regions. The confluence of these factors means that without some mitigation techniques, supervised neural networks as function approximation in temporal-difference learning will be restricted to the simplest test cases. This work explores catastrophic forgetting in neural networks in terms of supervised and reinforcement learning. A simple mathematical model is introduced to argue that catastrophic forgetting is a result of overlapping representations in the hidden layers in which updates to the weights can affect multiple unrelated regions of the input space. A novel neural network architecture, dubbed cluster-select, is introduced which utilizes online clustering for the selection of a subset of hidden neurons to be activated in the feedforward and backpropagation stages. 
Cluster-select is demonstrated to outperform leading techniques in both classification and regression. In the context of reinforcement learning, cluster-select is studied for both fully and partially observable Markov decision processes and is demonstrated to converge faster and behave in a more stable manner when compared to other state-of-the-art algorithms.
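The abstract above gives no implementation details, so the following is only a minimal sketch of the general idea it describes: an online clusterer chooses which hidden neurons take part in the forward and backward passes, so that an update for one region of the input space leaves the weights serving other regions untouched. The class name `ClusterSelectSketch`, the fixed one-block-per-cluster assignment, the online k-means centroid update, and the `tanh` units are all assumptions made for illustration, not the thesis's method.

```python
import math


class ClusterSelectSketch:
    """Toy hidden layer whose active units are chosen by an online clusterer."""

    def __init__(self, centroids, units_per_cluster):
        self.centroids = [list(c) for c in centroids]
        self.counts = [1] * len(centroids)  # samples seen per cluster
        n_inputs = len(centroids[0])
        n_hidden = len(centroids) * units_per_cluster
        # Deterministic toy weights so the example is reproducible.
        self.weights = [[0.1 * (i + 1)] * n_inputs for i in range(n_hidden)]
        # Assumption: each cluster owns a fixed, disjoint block of hidden units.
        self.groups = [
            range(k * units_per_cluster, (k + 1) * units_per_cluster)
            for k in range(len(centroids))
        ]

    def select(self, x, adapt):
        """Return the index of the nearest centroid; optionally move it toward x."""
        dists = [sum((c - a) ** 2 for c, a in zip(cen, x)) for cen in self.centroids]
        k = dists.index(min(dists))
        if adapt:
            self.counts[k] += 1
            rate = 1.0 / self.counts[k]  # online k-means running-mean step
            self.centroids[k] = [c + rate * (a - c) for c, a in zip(self.centroids[k], x)]
        return k

    def forward(self, x, adapt=False):
        """Compute hidden activations; units outside the selected block output 0."""
        active = set(self.groups[self.select(x, adapt)])
        hidden = [
            math.tanh(sum(w * a for w, a in zip(self.weights[i], x))) if i in active else 0.0
            for i in range(len(self.weights))
        ]
        return hidden, active

    def backward(self, x, hidden_grad, lr=0.1):
        """Apply a gradient step to the active units only."""
        hidden, active = self.forward(x, adapt=True)
        for i in active:
            local = hidden_grad[i] * (1.0 - hidden[i] ** 2)  # derivative of tanh
            self.weights[i] = [w - lr * local * a for w, a in zip(self.weights[i], x)]
        return active


layer = ClusterSelectSketch(centroids=[[0.0, 0.0], [1.0, 1.0]], units_per_cluster=2)
before = [list(w) for w in layer.weights]
active = layer.backward([0.9, 1.1], hidden_grad=[1.0, 1.0, 1.0, 1.0])
```

In this toy run the input lies near the second centroid, so only hidden units 2 and 3 are updated; the weights of units 0 and 1, which serve the other cluster, are left unchanged.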

    Fault Diagnosis Of Sensor And Actuator Faults In Multi-Zone Hvac Systems

    Globally, the buildings sector accounts for 30% of energy consumption and more than 55% of electricity demand. Specifically, the Heating, Ventilation, and Air Conditioning (HVAC) system is the most extensively operated component and alone is responsible for 40% of final building energy usage. HVAC systems are used to provide healthy and comfortable indoor conditions, and their main objective is to maintain the thermal comfort of occupants with minimum energy usage. HVAC systems include a considerable number of sensors, controlled actuators, and other components. They are at risk of malfunction or failure, resulting in reduced efficiency, potential interference with the execution of supervision schemes, and equipment deterioration. Hence, Fault Diagnosis (FD) of HVAC systems is essential to improve their reliability, efficiency, and performance, and to provide preventive maintenance. In this thesis work, two neural network-based methods are proposed for sensor and actuator faults in a 3-zone HVAC system. For sensor faults, an online semi-supervised sensor data validation and fault diagnosis method using an Auto-Associative Neural Network (AANN) is developed. The method is based on the implementation of Nonlinear Principal Component Analysis (NPCA) using a Back-Propagation Neural Network (BPNN) and it demonstrates notable capability in sensor fault and inaccuracy correction, measurement noise reduction, missing sensor data replacement, and in both single and multiple sensor fault diagnosis. In addition, a novel online supervised multi-model approach for actuator fault diagnosis using Convolutional Neural Networks (CNNs) is developed for single actuator faults. It is based on a data transformation in which the 1-dimensional data are configured into a 2-dimensional representation without the use of advanced signal processing techniques. 
The CNN-based actuator fault diagnosis approach demonstrates improved performance capability compared with the commonly used Machine Learning-based algorithms (i.e., Support Vector Machine and standard Neural Networks). The presented schemes are compared with other commonly used HVAC fault diagnosis methods for benchmarking and they are proven to be superior, effective, accurate, and reliable. The proposed approaches can be applied to large-scale buildings with additional zones

    "Ann" artifical neural networks and fuzzy logic models for cooling load prediction

    Thesis (Master)--Izmir Institute of Technology, Mechanical Engineering, Izmir, 2005. Includes bibliographical references (leaves: 44-45). Text in English; abstract in Turkish and English. x, 45 leaves. In this thesis, Artificial Neural Network (ANN) and fuzzy logic models for predicting building energy use were created. Chiller plant power consumption data were collected from a 42-storey commercial building in Hawaii, and independent hourly climate data were obtained from the National Climate Data Center of the USA. These data were used to set up and test both the ANN and the fuzzy model. The tropical climate data consisted of dry bulb temperature, wet bulb temperature, dew point temperature, relative humidity percentage, wind speed and wind direction. Both the input variables and the output variable, the central chiller plant power consumption, were fuzzified, and fuzzy membership functions were employed. Mamdani fuzzy rules (32 rules) in If-Then format with centre of gravity (COG; centroid) defuzzification were employed. The average percentage errors of the fuzzy model and the ANN model were 11.6% (R2 = 0.88) and 10.3% (R2 = 0.87), respectively. The fuzzy model is successfully presented for predicting chiller plant energy use in tropical climates with small seasonal and daily variations that makes this fuzzy model
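As a rough illustration of the Mamdani inference with centre-of-gravity defuzzification mentioned above, the sketch below uses two of the listed climate inputs and four rules. The membership functions, set names, numeric ranges, and rules are hypothetical placeholders; the thesis's actual 32 rules and membership functions are not given in the abstract.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets (not taken from the thesis).
TEMP = {"mild": (18.0, 24.0, 30.0), "hot": (24.0, 30.0, 36.0)}          # dry bulb, deg C
HUMIDITY = {"dry": (30.0, 50.0, 70.0), "humid": (50.0, 70.0, 90.0)}     # relative humidity, %
POWER = {
    "low": (0.0, 200.0, 400.0),
    "medium": (150.0, 350.0, 550.0),
    "high": (300.0, 500.0, 700.0),
}                                                                       # chiller power, kW

# Hypothetical If-Then rules: (temperature set, humidity set) -> power set.
RULES = [
    ("mild", "dry", "low"),
    ("mild", "humid", "medium"),
    ("hot", "dry", "medium"),
    ("hot", "humid", "high"),
]


def predict_power(temp_c, humidity_pct, y_min=0.0, y_max=700.0, steps=700):
    """Mamdani inference: min for AND, clip consequents, max-aggregate, centroid."""
    fired = []
    for t_set, h_set, p_set in RULES:
        strength = min(tri(temp_c, *TEMP[t_set]), tri(humidity_pct, *HUMIDITY[h_set]))
        if strength > 0.0:
            fired.append((strength, p_set))
    if not fired:
        return None  # no rule covers this input
    num = den = 0.0
    for i in range(steps + 1):
        y = y_min + (y_max - y_min) * i / steps
        # Aggregated output membership at y: max over the clipped consequents.
        mu = max(min(s, tri(y, *POWER[p])) for s, p in fired)
        num += y * mu
        den += mu
    return num / den  # discrete centre of gravity


mild_dry = predict_power(24.0, 50.0)
hot_humid = predict_power(30.0, 70.0)
```

With these placeholder sets, an input that fully matches only the "mild and dry" rule defuzzifies to the centre of the "low" triangle, and one that fully matches "hot and humid" to the centre of the "high" triangle.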

    Illumination Invariant Deep Learning for Hyperspectral Data

    Motivated by the variability in hyperspectral images due to illumination and the difficulty in acquiring labelled data, this thesis proposes different approaches for learning illumination invariant feature representations and classification models for hyperspectral data captured outdoors, under natural sunlight. The approaches integrate domain knowledge into learning algorithms and hence do not rely on a priori knowledge of atmospheric parameters, additional sensors or large amounts of labelled training data. Hyperspectral sensors record rich semantic information from a scene, making them useful for robotics or remote sensing applications where perception systems are used to gain an understanding of the scene. Images recorded by hyperspectral sensors can, however, be affected to varying degrees by intrinsic factors relating to the sensor itself (keystone, smile, noise, particularly at the limits of the sensed spectral range) but also by extrinsic factors such as the way the scene is illuminated. The appearance of the scene in the image is tied to the incident illumination, which is dependent on variables such as the position of the sun, geometry of the surface and the prevailing atmospheric conditions. Effects like shadows can make the appearance and spectral characteristics of identical materials significantly different. This degrades the performance of high-level algorithms that use hyperspectral data, such as those that do classification and clustering. If sufficient training data is available, learning algorithms such as neural networks can capture variability in the scene appearance and be trained to compensate for it. Learning algorithms are advantageous for this task because they do not require a priori knowledge of the prevailing atmospheric conditions or data from additional sensors. 
Labelling of hyperspectral data is, however, difficult and time-consuming, so acquiring enough labelled samples for the learning algorithm to adequately capture the scene appearance is challenging. Hence, there is a need for the development of techniques that are invariant to the effects of illumination that do not require large amounts of labelled data. In this thesis, an approach to learning a representation of hyperspectral data that is invariant to the effects of illumination is proposed. This approach combines a physics-based model of the illumination process with an unsupervised deep learning algorithm, and thus requires no labelled data. Datasets that vary both temporally and spatially are used to compare the proposed approach to other similar state-of-the-art techniques. The results show that the learnt representation is more invariant to shadows in the image and to variations in brightness due to changes in the scene topography or position of the sun in the sky. The results also show that a supervised classifier can predict class labels more accurately and more consistently across time when images are represented using the proposed method. Additionally, this thesis proposes methods to train supervised classification models to be more robust to variations in illumination where only limited amounts of labelled data are available. The transfer of knowledge from well-labelled datasets to poorly labelled datasets for classification is investigated. A method is also proposed for enabling small amounts of labelled samples to capture the variability in spectra across the scene. These samples are then used to train a classifier to be robust to the variability in the data caused by variations in illumination. The results show that these approaches make convolutional neural network classifiers more robust and achieve better performance when there is limited labelled training data. 
A case study is presented where a pipeline is proposed that incorporates the methods proposed in this thesis for learning robust feature representations and classification models. A scene is clustered using no labelled data. The results show that the pipeline groups the data into clusters that are consistent with the spatial distribution of the classes in the scene as determined from ground truth

    Cyber Data Anomaly Detection Using Autoencoder Neural Networks

    The Department of Defense requires a secure presence in the cyber domain to successfully execute its stated mission of deterring war and protecting the security of the United States. With potentially millions of logged network events occurring on defended networks daily, a limited staff of cyber analysts require the capability to identify novel network actions for security adjudication. The detection methodology proposed uses an autoencoder neural network optimized via design of experiments for the identification of anomalous network events. Once trained, each logged network event is analyzed by the neural network and assigned an outlier score. The network events with the largest outlier scores are anomalous and worthy of further review by cyber analysts. This neural network approach can operate in conjunction with alternate tools for outlier detection, enhancing the overall anomaly detection capability of cyber analysts
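The scoring-and-ranking step described above can be sketched as follows. The abstract does not say how the outlier score is computed; this sketch assumes it is the autoencoder's mean squared reconstruction error, a common choice. The `reconstruct` callable stands in for a trained autoencoder, and the function names and the clamping stand-in are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def outlier_score(event: Vector, reconstruct: Callable[[Vector], Vector]) -> float:
    """Mean squared reconstruction error of one event (assumed scoring rule)."""
    recon = reconstruct(event)
    return sum((a - b) ** 2 for a, b in zip(event, recon)) / len(event)


def top_anomalies(
    events: List[Vector], reconstruct: Callable[[Vector], Vector], k: int
) -> List[Tuple[int, float]]:
    """Return (event index, score) for the k highest-scoring events."""
    scored = [(i, outlier_score(e, reconstruct)) for i, e in enumerate(events)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def clamp_stand_in(event: Vector) -> List[float]:
    """Hypothetical stand-in for a trained model: reproduces values in [0, 1] only."""
    return [min(1.0, max(0.0, v)) for v in event]


events = [[0.2, 0.4], [0.9, 0.1], [5.0, 0.3]]
flagged = top_anomalies(events, clamp_stand_in, k=1)
```

Here the third event falls outside the range the stand-in can reproduce, so it receives the largest score and is the one handed to an analyst for review.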

    A Deep Learning Approach for Condition-Based Fault Prediction in Industrial Equipment

    SUMMARY: Every kind of system degrades with use and time. Adopting a good maintenance strategy is therefore necessary for industrial equipment to operate properly throughout its service life. As the importance given to health and safety grows in industry, maintenance becomes increasingly important. This leads to the development of more sophisticated maintenance strategies. Among these strategies, Condition-Based Maintenance is the most advanced. Moreover, with the advent of Industry 4.0, the number of sensors in industry is increasing rapidly, opening the door to continuous and automatic monitoring of industrial equipment. More specifically, we present a technique in which the indicator of future faults is based on the difference between the value of some measurable variable and the expected value of that variable during normal operation. Having to estimate the value of this parameter correctly brings forecasting into play in addition to maintenance. We therefore propose using machine learning, more precisely a deep neural network, to predict the value of the parameter. The theory behind these techniques is presented in this work, with particular attention to LSTM (Long Short-Term Memory) networks. This type of neural architecture is suited to forecasting with multivariate time series, which is the type of data recorded by industrial sensors. In this work, the fault prediction technique is applied to electrolyzers. An electrolyzer is a system composed of several electrochemical cells in which electrolysis takes place. 
Unexpected faults can lead to catastrophic incidents that can cause great harm to employees and the environment. Electrochemical cells behave like an electrical resistor: when a current is applied across their terminals, a voltage drop results. The magnitude of this voltage drop depends on the operating conditions and on the degradation of the cell, which is not known. The voltage is therefore the monitored variable used to predict faults, since its value rises suddenly when a fault is about to occur. We present the limitations related to the available data. To overcome these limitations, we want to rely only on sensor data and not on human intervention. We therefore propose a new approach based on an encoder-decoder neural network. The network receives the operating conditions as input. The role of the encoder is to find a faithful representation of the degradation and pass it to the decoder, whose role is to predict the cell voltage. As no degradation data is provided to the network, we consider our approach to be a self-supervised encoder. In addition, the output of the encoder can be plotted, making it possible to interpret the neural network developed. Once developed, the model is tested and the results are compared with those of the expert-defined parametric model currently used for this purpose. The results show that the neural network is able to predict the voltage of several cells with different levels of degradation. Moreover, the prediction error is reduced by 53% relative to the parametric model. 
This improvement allows our model to predict a fault 31 hours before it occurs. This corresponds to a 64% increase in reaction time compared with the results obtained with the parametric model.
    ABSTRACT: Systems always degrade with use and time. A good maintenance strategy is necessary for keeping industrial equipment working as designed throughout its lifetime. As safety and quality concerns have risen in industry, the importance of maintenance has grown, leading to the development of sophisticated maintenance strategies. Among them, Condition-Based Maintenance is the most advanced. It is based on monitoring the system’s properties, detecting impending faults, and triggering maintenance actions only when the system is going to fail. Moreover, with the advent of Industry 4.0, the number of sensors in industry is rapidly increasing, opening the door to the continuous and automatic monitoring of industrial equipment. In particular, we present a technique where the indicator of an incoming fault is based on the divergence between the value of a measurable feature and its expected value in a healthy system. The need to correctly estimate the variable’s value involves both maintenance and forecasting. We propose using Machine Learning algorithms for predicting the variable and, more concretely, Deep Neural Networks. We introduce the theory behind them, paying particular attention to Long Short-Term Memory (LSTM) networks. This neural architecture is suited for forecasting multivariate time series data, which is the kind of data registered by industrial sensors. We apply this fault prediction technique to an electrolyzer. Electrolyzers are systems composed of multiple electrochemical cells, where electrolysis takes place. Unexpected cell faults may lead to catastrophic incidents, such as explosions and fire, with the consequent harm to the plant’s operators and the environment. 
Electrochemical cells act similarly to resistors, causing a voltage drop when a current is applied to them. The magnitude of this voltage drop depends on the operating conditions and the cell’s degradation, which is not known. The voltage is the monitored variable for predicting the faults, as it increases suddenly when a fault is about to happen. We present the limitations related to the available data. In order to overcome them, relying only on sensors’ data and without human interaction, we propose a new approach based on an encoder-decoder neural architecture. The network receives the operating conditions as input. The encoder’s task is to find a faithful representation of the degradation and to pass it to the decoder, which in turn predicts the cell’s voltage. As no labeled degradation data is given to the network, we consider our approach to be a self-supervised encoder. Moreover, the output of the encoder can be plotted in a graph, adding interpretability to the neural network model. We compare the results obtained by our network to the expert-defined parametric model that is currently used for this task. Results show that our neural network is able to predict the voltage of multiple cells with different levels of degradation. Moreover, it reduces the prediction error that was obtained by the parametric model by 53%. This improvement enabled our network to predict a fault 31 hours before it happened, a 64% increase in reaction time compared to the parametric model
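The fault indicator described above, the divergence between the measured voltage and the value a model expects for a healthy cell, can be sketched as a simple residual check. The threshold, the sample values, and the function names are hypothetical; the `predicted` list stands in for the output of the encoder-decoder network. The last lines work through the reported figures under the assumption that "reaction time" means the lead time before a fault: a 31-hour lead that is 64% longer than the parametric model's implies a parametric lead of about 18.9 hours.

```python
from typing import Optional, Sequence


def first_alarm(
    measured: Sequence[float], predicted: Sequence[float], threshold: float
) -> Optional[int]:
    """Index of the first sample whose voltage exceeds the prediction by more than threshold."""
    for t, (m, p) in enumerate(zip(measured, predicted)):
        if m - p > threshold:
            return t
    return None  # no divergence detected


def lead_time_hours(alarm_index: int, fault_index: int, hours_per_sample: float) -> float:
    """Warning time between the alarm and the fault."""
    return (fault_index - alarm_index) * hours_per_sample


# Hypothetical hourly cell voltages (V); the measured voltage rises before the fault.
measured = [3.00, 3.01, 3.02, 3.10, 3.30]
predicted = [3.00, 3.00, 3.01, 3.01, 3.02]  # stand-in for the network's output
alarm = first_alarm(measured, predicted, threshold=0.05)

# Worked arithmetic on the reported results (assumes reaction time == lead time).
network_lead_h = 31.0
implied_parametric_lead_h = network_lead_h / 1.64  # roughly 18.9 hours
```

A tighter prediction error permits a lower alarm threshold without raising false alarms, which is one plausible reading of why the 53% error reduction translates into earlier warnings.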