Synaptic plasticity and memory addressing in biological and artificial neural networks
Biological brains are composed of neurons, interconnected by synapses to create large complex networks. Learning and memory occur, in large part, due to synaptic plasticity -- modifications in the efficacy of information transmission through these synaptic connections. Artificial neural networks model these with neural "units" which communicate through synaptic weights. Models of learning and memory propose synaptic plasticity rules that describe and predict the weight modifications. An equally important but under-evaluated question is the selection of which synapses should be updated in response to a memory event. In this work, we attempt to separate the question of synaptic plasticity from that of memory addressing.
Chapter 1 provides an overview of the problem of memory addressing and a summary of the solutions that have been considered in computational neuroscience and artificial intelligence, as well as those that may exist in biology. Chapter 2 presents in detail a solution to memory addressing and synaptic plasticity in the context of familiarity detection, suggesting strong feedforward weights and anti-Hebbian plasticity as the respective mechanisms. Chapter 3 proposes a model of recall, with storage performed by addressing through local third factors and neo-Hebbian plasticity, and retrieval by content-based addressing. In Chapter 4, we consider the problem of concurrent memory consolidation and memorization. Both storage and retrieval are performed by content-based addressing, but the plasticity rule itself is implemented by gradient descent, modulated according to whether an item should be stored in a distributed manner or memorized verbatim. However, the classical method for computing gradients in recurrent neural networks, backpropagation through time, is generally considered unbiological. In Chapter 5 we suggest a more realistic implementation through an approximation of recurrent backpropagation.
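The familiarity-detection mechanism of Chapter 2 can be illustrated with a toy model (a generic sketch with illustrative sizes and learning rate, not the thesis's actual network): strong, uniform feedforward weights make every input evoke a large response, and an anti-Hebbian update depresses the synapses active during storage, so a previously stored pattern evokes a weaker response when seen again.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100         # number of input synapses (illustrative)
w = np.ones(n)  # strong, uniform feedforward weights onto a familiarity neuron
eta = 0.01      # learning rate (illustrative)

def familiarity(x):
    """Readout of the familiarity neuron: weighted sum of its inputs."""
    return w @ x

def store(x):
    """Anti-Hebbian update: synapses active during a strong response are
    depressed (opposite sign to Hebb's rule), weakening future responses."""
    global w
    w = w - eta * familiarity(x) * x

pattern = (rng.random(n) < 0.5).astype(float)  # binary input pattern
before = familiarity(pattern)
store(pattern)
after = familiarity(pattern)
assert after < before  # the stored pattern now reads as "familiar"
```

Here the strong feedforward weights implement addressing (every active synapse is a candidate for update) while the anti-Hebbian rule implements the plasticity itself.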
Taken together, these results propose a number of potential mechanisms for memory storage and retrieval, each of which separates the mechanism of synaptic updating -- plasticity -- from that of synapse selection -- addressing. Explicit studies of memory addressing may find applications not only in artificial intelligence but also in biology. In artificial networks, for example, selectively updating memories in large language models can help improve user privacy and security. In biological ones, understanding memory addressing can help improve health outcomes and treat memory-based illnesses such as Alzheimer's disease or PTSD.
Neuron Clustering for Mitigating Catastrophic Forgetting in Supervised and Reinforcement Learning
Neural networks have had many great successes in recent years, particularly with the advent of deep learning and many novel training techniques. One issue that has affected neural networks and prevented them from performing well in more realistic online environments is that of catastrophic forgetting. Catastrophic forgetting affects supervised learning systems when input samples are temporally correlated or are non-stationary. However, most real-world problems are non-stationary in nature, resulting in prolonged periods of time separating inputs drawn from different regions of the input space.
Reinforcement learning represents a worst-case scenario when it comes to precipitating catastrophic forgetting in neural networks. Meaningful training examples are acquired as the agent explores different regions of its state/action space. When the agent is in one such region, only highly correlated samples from that region are typically acquired. Moreover, the regions that the agent is likely to visit will depend on its current policy, suggesting that an agent that has a good policy may avoid exploring particular regions. The confluence of these factors means that without some mitigation techniques, supervised neural networks used as function approximators in temporal-difference learning will be restricted to the simplest test cases.
This work explores catastrophic forgetting in neural networks in terms of supervised and reinforcement learning. A simple mathematical model is introduced to argue that catastrophic forgetting is a result of overlapping representations in the hidden layers, in which updates to the weights can affect multiple unrelated regions of the input space. A novel neural network architecture, dubbed cluster-select, is introduced, which utilizes online clustering to select a subset of hidden neurons to be activated in the feedforward and backpropagation stages. Cluster-select is demonstrated to outperform leading techniques in both classification and regression. In the context of reinforcement learning, cluster-select is studied for both fully and partially observable Markov decision processes and is demonstrated to converge faster and behave in a more stable manner when compared to other state-of-the-art algorithms.
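The core cluster-select idea can be sketched as follows (a minimal illustration with assumed dimensions and learning rates, not the thesis's implementation): an online clustering of the inputs selects which block of hidden units participates in the forward and backward passes, so a weight update leaves the units serving other input regions untouched.

```python
import numpy as np

rng = np.random.default_rng(1)
d, h, k = 4, 12, 3            # input dim, hidden units, clusters (illustrative)
units_per_cluster = h // k
W1 = rng.normal(size=(h, d)) * 0.5
w2 = rng.normal(size=h) * 0.5
centroids = rng.normal(size=(k, d))  # online cluster centres over inputs
eta, alpha = 0.05, 0.1               # learning rates (illustrative)

def forward(x):
    c = int(np.argmin(((centroids - x) ** 2).sum(axis=1)))  # nearest cluster
    mask = np.zeros(h)
    mask[c * units_per_cluster:(c + 1) * units_per_cluster] = 1.0
    hid = np.tanh(W1 @ x) * mask     # only the selected subset is active
    return hid, mask, c

def train_step(x, y):
    global W1, w2, centroids
    hid, mask, c = forward(x)
    err = (w2 @ hid) - y
    # Backprop flows only through the selected units, so weights serving
    # other input regions are untouched by this update.
    grad_hid = err * w2 * mask * (1 - hid ** 2)
    w2 -= eta * err * hid
    W1 -= eta * np.outer(grad_hid, x)
    centroids[c] += alpha * (x - centroids[c])  # online centroid update

x, y = rng.normal(size=d), 1.0
snapshot = W1.copy()
_, mask, _ = forward(x)
train_step(x, y)
# Weights of unselected hidden units are unchanged by the update.
assert np.allclose(W1[mask == 0], snapshot[mask == 0])
```

Because updates are confined to one cluster's units, learning in one region of the input space cannot overwrite representations built for another, which is the claimed mitigation of catastrophic forgetting.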
Fault Diagnosis of Sensor and Actuator Faults in Multi-Zone HVAC Systems
Globally, the buildings sector accounts for 30% of energy consumption and more than 55% of electricity demand. The Heating, Ventilation, and Air Conditioning (HVAC) system is the most extensively operated component and is alone responsible for 40% of final building energy usage. HVAC systems are used to provide healthy and comfortable indoor conditions, and their main objective is to maintain the thermal comfort of occupants with minimum energy usage.
HVAC systems include a considerable number of sensors, controlled actuators, and other components. These are at risk of malfunction or failure, resulting in reduced efficiency, potential interference with the execution of supervision schemes, and equipment deterioration. Hence, Fault Diagnosis (FD) of HVAC systems is essential to improve their reliability, efficiency, and performance, and to enable preventive maintenance.
In this thesis, two neural-network-based methods are proposed for diagnosing sensor and actuator faults in a 3-zone HVAC system. For sensor faults, an online semi-supervised sensor data validation and fault diagnosis method using an Auto-Associative Neural Network (AANN) is developed. The method is based on the implementation of Nonlinear Principal Component Analysis (NPCA) using a Back-Propagation Neural Network (BPNN), and it demonstrates notable capability in sensor fault and inaccuracy correction, measurement noise reduction, missing sensor data replacement, and the diagnosis of both single and multiple sensor faults. In addition, a novel online supervised multi-model approach for actuator fault diagnosis using Convolutional Neural Networks (CNNs) is developed for single actuator faults. It is based on a data transformation in which the 1-dimensional data are configured into a 2-dimensional representation without the use of advanced signal processing techniques. The CNN-based actuator fault diagnosis approach demonstrates improved performance compared with commonly used machine-learning algorithms (i.e., Support Vector Machines and standard Neural Networks).
The presented schemes are compared with other commonly used HVAC fault diagnosis methods for benchmarking and are shown to be superior in effectiveness, accuracy, and reliability. The proposed approaches can be applied to large-scale buildings with additional zones.
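The 1-D-to-2-D transformation feeding the CNN can be sketched generically (the thesis does not publish its exact mapping, only that no advanced signal processing is used; the folding and scaling below are illustrative assumptions):

```python
import numpy as np

def to_image(signal, rows, cols):
    """Fold a 1-D sensor window into a 2-D array a CNN can consume.

    Generic reshaping illustration: take rows*cols consecutive samples,
    arrange them row-by-row, and min-max scale to a [0, 1] intensity range.
    """
    window = np.asarray(signal[: rows * cols], dtype=float)
    img = window.reshape(rows, cols)
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo) if hi > lo else np.zeros_like(img)

samples = np.sin(np.linspace(0, 8 * np.pi, 64))  # stand-in actuator signal
image = to_image(samples, 8, 8)
assert image.shape == (8, 8)
assert image.min() == 0.0 and image.max() == 1.0
```

Stacking successive windows this way turns a stream of actuator measurements into a sequence of "images" on which standard 2-D convolutional filters can detect fault signatures.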
Artificial neural network (ANN) and fuzzy logic models for cooling load prediction
Thesis (Master)--Izmir Institute of Technology, Mechanical Engineering, Izmir, 2005. Includes bibliographical references (leaves 44-45). Text in English; abstract in Turkish and English. x, 45 leaves.
In this thesis, Artificial Neural Network (ANN) and fuzzy logic models for predicting building energy use were created. Chiller plant power consumption data were collected from a 42-storey commercial building in Hawaii, and independent hourly climate data were obtained from the National Climate Data Center of the USA. These data were used to set up and test both the ANN and the fuzzy model. The tropical climate data consisted of dry bulb temperature, wet bulb temperature, dew point temperature, relative humidity percentage, wind speed, and wind direction. Both the input variables and the output variable, central chiller plant power consumption, were fuzzified, and fuzzy membership functions were employed. Mamdani fuzzy rules (32 rules) in If-Then format with centre-of-gravity (COG; centroid) defuzzification were used. The average percentage error levels of the fuzzy model and the ANN model came out to 11.6% (R2 = 0.88) and 10.3% (R2 = 0.87), respectively. The fuzzy model is successfully presented for predicting chiller plant energy use in tropical climates with small seasonal and daily variations that makes this fuzzy model
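The centre-of-gravity defuzzification step named above computes the crisp output as the membership-weighted average over the output universe. A minimal sketch with an illustrative (hypothetical) aggregated output set for chiller power:

```python
import numpy as np

def cog_defuzzify(universe, membership):
    """Centre-of-gravity (centroid) defuzzification: the crisp output is the
    membership-weighted average over the discretized output universe."""
    return float(np.sum(universe * membership) / np.sum(membership))

# Hypothetical output universe for chiller plant power (kW), 1 kW steps.
power = np.linspace(0.0, 100.0, 101)
# Illustrative aggregated output set: triangle peaking at 50 kW.
mu = np.maximum(0.0, 1.0 - np.abs(power - 50.0) / 25.0)
crisp = cog_defuzzify(power, mu)
assert abs(crisp - 50.0) < 1e-9  # symmetric set, so the centroid sits at the peak
```

In a Mamdani system, `mu` would be the union of the clipped consequents of all fired If-Then rules rather than a hand-drawn triangle; the defuzzification step itself is unchanged.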
Illumination Invariant Deep Learning for Hyperspectral Data
Motivated by the variability in hyperspectral images due to illumination and the difficulty of acquiring labelled data, this thesis proposes approaches for learning illumination-invariant feature representations and classification models for hyperspectral data captured outdoors, under natural sunlight. The approaches integrate domain knowledge into learning algorithms and hence do not rely on a priori knowledge of atmospheric parameters, additional sensors, or large amounts of labelled training data. Hyperspectral sensors record rich semantic information from a scene, making them useful for robotics or remote sensing applications where perception systems are used to gain an understanding of the scene. Images recorded by hyperspectral sensors can, however, be affected to varying degrees by intrinsic factors relating to the sensor itself (keystone, smile, noise, particularly at the limits of the sensed spectral range) but also by extrinsic factors such as the way the scene is illuminated. The appearance of the scene in the image is tied to the incident illumination, which depends on variables such as the position of the sun, the geometry of the surface, and the prevailing atmospheric conditions. Effects like shadows can cause the appearance and spectral characteristics of identical materials to differ significantly. This degrades the performance of high-level algorithms that use hyperspectral data, such as those that perform classification and clustering. If sufficient training data is available, learning algorithms such as neural networks can capture variability in scene appearance and be trained to compensate for it. Learning algorithms are advantageous for this task because they do not require a priori knowledge of the prevailing atmospheric conditions or data from additional sensors.
Labelling of hyperspectral data is, however, difficult and time-consuming, so acquiring enough labelled samples for the learning algorithm to adequately capture the scene appearance is challenging. Hence, there is a need for techniques that are invariant to the effects of illumination and do not require large amounts of labelled data. In this thesis, an approach to learning a representation of hyperspectral data that is invariant to the effects of illumination is proposed. This approach combines a physics-based model of the illumination process with an unsupervised deep learning algorithm, and thus requires no labelled data. Datasets that vary both temporally and spatially are used to compare the proposed approach to other similar state-of-the-art techniques. The results show that the learnt representation is more invariant to shadows in the image and to variations in brightness due to changes in the scene topography or position of the sun in the sky. The results also show that a supervised classifier can predict class labels more accurately and more consistently across time when images are represented using the proposed method. Additionally, this thesis proposes methods to train supervised classification models to be more robust to variations in illumination where only limited amounts of labelled data are available. The transfer of knowledge from well-labelled datasets to poorly labelled datasets for classification is investigated. A method is also proposed for enabling small amounts of labelled samples to capture the variability in spectra across the scene. These samples are then used to train a classifier to be robust to the variability in the data caused by variations in illumination. The results show that these approaches make convolutional neural network classifiers more robust and achieve better performance when there is limited labelled training data.
A case study is presented where a pipeline is proposed that incorporates the methods proposed in this thesis for learning robust feature representations and classification models. A scene is clustered using no labelled data. The results show that the pipeline groups the data into clusters that are consistent with the spatial distribution of the classes in the scene as determined from ground truth.
Cyber Data Anomaly Detection Using Autoencoder Neural Networks
The Department of Defense requires a secure presence in the cyber domain to successfully execute its stated mission of deterring war and protecting the security of the United States. With potentially millions of logged network events occurring on defended networks daily, a limited staff of cyber analysts require the capability to identify novel network actions for security adjudication. The detection methodology proposed uses an autoencoder neural network, optimized via design of experiments, for the identification of anomalous network events. Once trained, each logged network event is analyzed by the neural network and assigned an outlier score. The network events with the largest outlier scores are anomalous and worthy of further review by cyber analysts. This neural network approach can operate in conjunction with alternate tools for outlier detection, enhancing the overall anomaly detection capability of cyber analysts.
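The scoring scheme described above can be sketched generically (illustrative dimensions and training loop; the thesis's design-of-experiments optimization is not reproduced): an autoencoder is fit to normal events, and each event's outlier score is its reconstruction error, which is large for events unlike the training data.

```python
import numpy as np

rng = np.random.default_rng(2)
d, h = 10, 3                       # feature dim, bottleneck size (illustrative)
W = rng.normal(size=(h, d)) * 0.1  # tied weights: encode with W, decode with W.T

def reconstruct(X):
    return np.tanh(X @ W.T) @ W

def outlier_scores(X):
    """Per-event anomaly score: squared reconstruction error."""
    return ((X - reconstruct(X)) ** 2).sum(axis=1)

# Train on "normal" events that lie near a 2-D subspace of feature space.
basis = rng.normal(size=(2, d))
normal = rng.normal(size=(500, 2)) @ basis * 0.1
lr = 0.01
for _ in range(300):
    hcode = np.tanh(normal @ W.T)
    err = hcode @ W - normal
    gW = hcode.T @ err / len(normal)                                # decoder path
    gW += ((err @ W.T) * (1 - hcode ** 2)).T @ normal / len(normal)  # encoder path
    W -= lr * gW

anomaly = rng.normal(size=(1, d)) * 2.0  # event far from the learned manifold
score_normal = outlier_scores(normal).mean()
score_anom = outlier_scores(anomaly)[0]
assert score_anom > score_normal  # anomalous event gets a larger outlier score
```

Ranking events by this score and surfacing the top few to analysts is exactly the triage workflow the abstract describes.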
A Deep Learning Approach for Condition-Based Fault Prediction in Industrial Equipment
ABSTRACT: Systems always degrade with use and time. A good maintenance strategy is necessary for keeping industrial equipment working as designed throughout its lifetime. As safety and quality concerns have risen in industry, the importance of maintenance has grown, leading to the development of sophisticated maintenance strategies. Among them, Condition-Based Maintenance is the most advanced. It is based on monitoring the system's properties, detecting impending faults, and triggering maintenance actions only when the system is going to fail. Moreover, with the advent of Industry 4.0, the number of sensors in industry is rapidly increasing, opening the door to continuous and automatic monitoring of industrial equipment. In particular, we present a technique where the indicator of an incoming fault is based on the divergence between the value of a measurable feature and its expected value in a healthy system. The need to correctly estimate the variable's value involves both maintenance and forecasting. We propose using Machine Learning algorithms for predicting the variable and, more concretely, Deep Neural Networks. We introduce the theory behind them, paying particular attention to Long Short-Term Memory (LSTM) networks. This neural architecture is suited for forecasting multivariate time series data, which is the type of data recorded by industrial sensors. We apply this fault prediction technique to an electrolyzer.
Electrolyzers are systems composed of multiple electrochemical cells, where electrolysis takes place. Unexpected cell faults may lead to catastrophic incidents, such as explosions and fire, with consequent harm to the plant's operators and the environment. Electrochemical cells act similarly to resistors, causing a voltage drop when a current is applied to them. The magnitude of this voltage drop depends on the operating conditions and on the cell's degradation, which is not known. The voltage is therefore the monitored variable for predicting faults, as it increases suddenly when a fault is about to happen. We present the limitations related to the available data. To overcome them, relying only on sensor data and without human intervention, we propose a new approach based on an encoder-decoder neural architecture. The network receives the operating conditions as input. The encoder's task is to find a faithful representation of the degradation and to pass it to the decoder, which in turn predicts the cell's voltage. As no labeled degradation data is given to the network, we consider our approach to be a self-supervised encoder. Moreover, the output of the encoder can be plotted, adding interpretability to the neural network model.
We compare the results obtained by our network to the expert-defined parametric model that is currently used for this task. Results show that our neural network is able to predict the voltage of multiple cells with different levels of degradation. Moreover, it reduces the prediction error obtained by the parametric model by 53%. This improvement enabled our network to predict a fault 31 hours before it happened, a 64% increase in reaction time compared to the parametric model.
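The fault-indicator logic described above (divergence between the measured voltage and the predicted healthy-cell voltage) can be sketched with stand-in numbers; the predictor here is a placeholder for the encoder-decoder network, and all values are illustrative.

```python
import numpy as np

def fault_indicator(measured, predicted, threshold):
    """Flag time steps where the measured value exceeds the healthy-model
    prediction by more than a threshold (illustrative residual test)."""
    residual = measured - predicted
    return residual > threshold

t = np.arange(100)
predicted = 2.0 + 0.001 * t  # stand-in for the network's healthy-cell voltage
measured = predicted.copy()
measured[80:] += 0.3         # sudden voltage rise preceding a fault
alarms = fault_indicator(measured, predicted, threshold=0.1)
first_alarm = int(np.argmax(alarms))
assert first_alarm == 80     # alarm raised at the onset of the divergence
```

The reported 31-hour lead time corresponds to how far this first alarm precedes the actual fault; a better voltage predictor shrinks the residual under normal operation and so allows a tighter threshold and an earlier alarm.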
Applications of machine learning to water resources management: A review of present status and future opportunities
Data availability: No data was used for the research described in the article.
Copyright © 2024 The Authors. Water is the most valuable natural resource on Earth and plays a critical role in the socio-economic development of humans worldwide. Water is used for various purposes, including, but not limited to, drinking, recreation, irrigation, and hydropower production. The expected population growth at a global scale, coupled with predicted climate change-induced impacts, warrants proactive and effective management of water resources. Over recent decades, machine learning tools have been widely applied to various fields related to water resources management and have often shown promising results. Despite the publication of several review articles on machine learning applications in water-related fields, this paper presents, for the first time, a comprehensive review of machine learning techniques applied to water resources management, focusing on the most recent achievements. The study examines the potential for advanced machine learning techniques to improve decision support systems in the various sectors within the realm of water resources management: groundwater management, streamflow forecasting, water distribution systems, water quality and wastewater treatment, water demand and consumption, hydropower and marine energy, water drainage systems, and flood management and defence. This study provides an overview of state-of-the-art machine learning approaches in the water industry and how they can be used to ensure water supply sustainability and quality and to mitigate floods and droughts. This review covers the most recent related studies to provide an up-to-date snapshot of machine learning applications in the water industry.
Overall, LSTM networks have been shown to exhibit reliable performance, often outperforming ANN models, traditional machine learning models, and established physics-based models. Hybrid ML techniques have exhibited great forecasting accuracy across all water-related fields, often showing superior computational power over traditional ANN architectures. In addition to purely data-driven models, physics-based hybrid models have also been developed to improve prediction performance. These efforts further demonstrate that machine learning can be a powerful practical tool for water resources management: it provides insights, predictions, and optimisation capabilities that help enhance sustainable water use and management and improve socio-economic development, healthy ecosystems, and human existence.
EPSRC project reference 2339403 to S. Sayed and A. Ahmed.