2,409 research outputs found

    A Bayesian Approach to Control Loop Performance Diagnosis Incorporating Background Knowledge of Response Information

    To isolate the problem source degrading control loop performance, this work focuses on how to incorporate background knowledge into Bayesian inference. In an effort to reduce dependence on the amount of historical data available, we consider a general kind of background knowledge that appears in many applications. This knowledge, known as response information, describes which faults can possibly affect each of the monitors. We show how this knowledge can be translated into constraints on the underlying probability distributions and introduced into the Bayesian diagnosis. In this way, the dimensionality of the observation space is reduced and the diagnosis can be more reliable. Furthermore, to keep the judgments consistent, the posterior probabilities of each possible abnormality computed from different observation subspaces are synthesized to obtain partially ordered posteriors; the eigenvalue formulation is applied to the pairwise comparison matrix. The proposed approach is applied to a diagnosis problem on an oil sand solids handling system, where it is shown how the combination of background knowledge and data enhances control performance diagnosis even when abnormality data are sparse in the historical database.
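
    As an illustration of the synthesis step, the short sketch below (not taken from the paper) ranks candidate abnormalities by the principal eigenvector of a pairwise comparison matrix built from posteriors computed in different observation subspaces; the posterior values and the number of candidate faults are hypothetical.

```python
import numpy as np

# Hypothetical posteriors P(abnormality | observations) for three candidate
# abnormalities, each row computed from a different observation subspace.
posteriors = np.array([
    [0.70, 0.20, 0.10],   # subspace 1
    [0.55, 0.30, 0.15],   # subspace 2
    [0.60, 0.25, 0.15],   # subspace 3
])

# Average evidence per abnormality across the subspaces.
mean_post = posteriors.mean(axis=0)

# Pairwise comparison matrix: entry (i, j) compares abnormality i with j.
C = mean_post[:, None] / mean_post[None, :]

# The principal eigenvector of C gives a consistent ranking
# (the eigenvalue formulation used for pairwise comparisons).
eigvals, eigvecs = np.linalg.eig(C)
w = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
ranking = np.abs(w) / np.abs(w).sum()

print("synthesized ranking:", np.round(ranking, 3))  # largest = most likely fault
```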

    End-to-end anomaly detection in stream data

    Nowadays, huge volumes of data are generated with increasing velocity by various systems, applications, and activities. This increases the demand for stream and time series analysis that can react to changing conditions in real time, for enhanced efficiency and quality of service delivery as well as improved safety and security in the private and public sectors. Despite its very rich history, time series anomaly detection is still one of the vital topics in machine learning research and is receiving increasing attention. Identifying hidden patterns and selecting a model that fits the observed data well and also carries over to unobserved data is not a trivial task. Due to the increasing diversity of data sources and associated stochastic processes, this pivotal data analysis topic is loaded with challenges such as complex latent patterns, concept drift, and overfitting that may mislead the model and cause a high false alarm rate. Handling these challenges leads advanced anomaly detection methods to develop sophisticated decision logic, which turns them into opaque and inexplicable black boxes. Contrary to this trend, end-users expect transparency and verifiability in order to trust a model and the outcomes it produces. Also, pointing users to the most anomalous or malicious regions of a time series and to the causal features could save them time, energy, and money. For these reasons, this thesis addresses the crucial challenges in an end-to-end pipeline of stream-based anomaly detection through three essential phases: behavior prediction, inference, and interpretation. The first step is focused on devising a time series model that achieves high average accuracy as well as small error deviation. On this basis, we propose higher-quality anomaly detection and scoring techniques that use the related context to reclassify observations and post-prune unjustified events. Last but not least, we make the predictive process transparent and verifiable by providing meaningful reasoning behind its results, based on concepts understandable to a human. The provided insight can pinpoint the anomalous regions of a time series and explain why the current status of a system has been flagged as anomalous. Stream-based anomaly detection research is a principal area of innovation to support our economy, security, and even the safety and health of societies worldwide. We believe the proposed analysis techniques can contribute to building a situational awareness platform and open new perspectives in a variety of domains such as cybersecurity and health.
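
    The prediction-and-scoring phases described above can be illustrated with a minimal, self-contained sketch (not the thesis code): an exponentially weighted moving average serves as the behaviour predictor, and each observation is scored by its normalized one-step prediction error; the smoothing factor, threshold and warm-up length are illustrative assumptions.

```python
import numpy as np

def stream_anomaly_scores(x, alpha=0.3, k=4.0, warmup=30):
    """Score a stream by its normalized one-step prediction error.

    An EWMA acts as the behaviour predictor; points whose error exceeds
    k running standard deviations (after a short warm-up) are flagged.
    """
    level = x[0]                       # EWMA prediction of the next value
    err_var = 1.0                      # running estimate of squared error
    scores, flags = [], []
    for t, obs in enumerate(x):
        err = obs - level
        z = abs(err) / np.sqrt(err_var)
        scores.append(z)
        flags.append(t >= warmup and z > k)
        err_var = 0.95 * err_var + 0.05 * err ** 2   # update error statistics
        level = alpha * obs + (1 - alpha) * level    # update the predictor
    return np.array(scores), np.array(flags)

# Synthetic stream with an injected anomaly at t = 150.
rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0, 20, 300)) + 0.1 * rng.standard_normal(300)
signal[150] += 3.0
scores, flags = stream_anomaly_scores(signal)
print("flagged indices:", np.flatnonzero(flags))
```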

    Oil and Gas flow Anomaly Detection on offshore naturally flowing wells using Deep Neural Networks

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science.

    The Oil and Gas industry, as never before, faces multiple challenges. It is being impugned for being dirty and polluting, hence the growing demand for green alternatives. Nevertheless, the world still has to rely heavily on hydrocarbons, since they are the most traditional and stable source of energy, as opposed to the extensively promoted hydro, solar or wind power. Major operators are challenged to produce oil more efficiently, to counteract the newly arising energy sources, with a smaller climate footprint and more scrutinized expenditure, while facing high skepticism regarding the industry's future. It has to become greener, and hence to act in a manner not required previously. While most of the tools used by the Hydrocarbon E&P industry are expensive and have been used for many years, it is paramount for the industry's survival and prosperity to apply predictive maintenance technologies that can foresee potential failures, making production safer, lowering downtime, increasing productivity and diminishing maintenance costs. Many efforts have been made to define the most accurate and effective predictive methods; however, data scarcity affects the speed and capacity for further experimentation. While it would be highly beneficial for the industry to invest in Artificial Intelligence, this research aims at exploring, in depth, the subject of Anomaly Detection, using the open public data from Petrobras that was developed by experts. For this research, Deep Learning Neural Networks, namely Recurrent Neural Networks with LSTM and GRU backbones, were implemented for multi-class classification of undesirable events on naturally flowing wells. Further, several hyperparameter optimization tools were explored, mainly focusing on Genetic Algorithms as being the most advanced methods for such tasks. The research concluded with the best performing algorithm using 2 stacked GRU layers and the following vector of hyperparameters: [1, 47, 40, 14], which stand for a timestep of 1, 47 hidden units, 40 epochs and a batch size of 14, producing an F1 score of 0.97. As the world faces many issues, one of which is the detrimental effect of heavy industries on the environment and, as a result, adverse global climate change, this project is an attempt to contribute to the field of applying Artificial Intelligence in the Oil and Gas industry, with the intention of making it more efficient, transparent and sustainable.
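
    For orientation, the sketch below shows what the reported configuration of 2 stacked GRU layers with hyperparameters [1, 47, 40, 14] could look like in Keras; it is not the dissertation's code, and the feature count, class count and data are placeholder assumptions.

```python
import numpy as np
import tensorflow as tf

# Placeholder dimensions: timestep 1 as reported; the feature and class
# counts are hypothetical stand-ins for the labelled well-event data.
TIMESTEPS, N_FEATURES, N_CLASSES = 1, 8, 9

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(TIMESTEPS, N_FEATURES)),
    tf.keras.layers.GRU(47, return_sequences=True),  # first GRU, 47 hidden units
    tf.keras.layers.GRU(47),                         # second (stacked) GRU
    tf.keras.layers.Dense(N_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Dummy data in place of the real sensor windows and event labels.
X = np.random.rand(200, TIMESTEPS, N_FEATURES).astype("float32")
y = np.random.randint(0, N_CLASSES, size=200)
model.fit(X, y, epochs=40, batch_size=14, verbose=0)  # reported epochs/batch size
```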

    ANOMALY INFERENCE BASED ON HETEROGENEOUS DATA SOURCES IN AN ELECTRICAL DISTRIBUTION SYSTEM

    Harnessing heterogeneous data sets would improve system observability. While the current metering infrastructure in the distribution network has been utilized operationally to tackle abnormal events, such as weather-related disturbances, the new normal we face today can be of a greater magnitude. Strengthening the inter-dependencies as well as incorporating new crowd-sourced information can enhance operational aspects such as system reconfigurability under extreme conditions. Such resilience is crucial to recovery from any catastrophic event. This dissertation focuses on anomalies arising from potential foul play within an electrical distribution system, covering both the primary and secondary networks as well as their potential relation to feeders from other utilities. Distributed generation has been part of the smart grid mission, but its addition can be prone to electronic manipulation. This dissertation provides a comprehensive foundation for the emerging platform in which computing resources have become ubiquitous in the electrical distribution network. The topics covered in this thesis are wide-ranging: the anomaly inference includes load modeling and profile enhancement from other sources to infer topological changes in the primary distribution network. While metering infrastructure has been deployed to enable remote-controlled capability on the disconnectors, this scholarly contribution represents critical knowledge of a new paradigm for addressing security-related issues, such as irregularity (tampering by individuals) as well as potential malware (a large-scale form) that can massively manipulate existing network control variables, resulting in a large impact on the power grid.
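
    As a toy illustration of anomaly inference from heterogeneous sources (not from the dissertation), the sketch below cross-checks metered feeder load against an independently derived profile and flags hours of large deviation, which could hint at tampering or an unreported topology change; the profiles and the tolerance are hypothetical.

```python
import numpy as np

def flag_feeder_anomalies(metered_kw, expected_kw, tol=0.25):
    """Flag hours where metered feeder load departs from the expected
    profile by more than `tol` (fractional deviation).

    A minimal cross-check of metering data against an independently
    derived load profile (e.g. crowd-sourced or modelled).
    """
    deviation = np.abs(metered_kw - expected_kw) / np.maximum(expected_kw, 1e-6)
    return np.flatnonzero(deviation > tol)

# Hypothetical 24-hour profiles for one feeder section.
expected = 100 + 40 * np.sin(np.linspace(0, 2 * np.pi, 24))
metered = expected.copy()
metered[18:21] *= 0.5   # e.g. load shifted away by an unreported switching event
print("suspect hours:", flag_feeder_anomalies(metered, expected))
```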

    Monitoring, diagnostics and improvement of process performance

    The data generated in a chemical industry is a reflection of the process. With modern computer control systems and data logging facilities, there is an increasing ability to collect large amounts of data. As many underlying aspects of the process are captured in that data, its proper utilization makes it possible to obtain useful information for process monitoring and fault diagnosis, in addition to many other decision-making activities. The purpose of this research is to utilize the data-driven multivariate techniques of Principal Component Analysis (PCA) and Independent Component Analysis (ICA) for the estimation of process parameters. This research also includes analysis and comparison of these techniques for fault detection and diagnosis, along with the introduction, explanation and results of a new methodology developed in this work, namely Hybrid Independent Component Analysis (HICA).

    The first part of this research is the utilization of PCA and ICA models for the estimation of process parameters. The individual techniques of PCA and ICA are applied separately to the original data set of a waste water treatment plant (WWTP), and the process parameters for unknown conditions of the process are calculated. For each technique (PCA and ICA), the calculated parameters are validated by constructing Decision Trees on the WWTP data set using inductive data mining and See5. Both individual techniques were able to estimate all parameters successfully. The minor limitation in the validation of all results may be due to the strict applicability of these techniques to Gaussian and non-Gaussian data sets respectively; statistical analysis showed that the data set used in this work exhibits both Gaussian and non-Gaussian behaviour.

    In the second part of this work, the multivariate techniques of PCA and ICA are used for fault detection and diagnosis of a process, along with the introduction of the new technique, HICA. The techniques are applied to two case studies, the WWTP and an air pollution data set. As reported in the literature, PCA and ICA proved to be useful tools for process monitoring on both data sets, but a comparison of PCA and ICA with the newly developed technique (HICA) illustrated the superiority of HICA over PCA and ICA. This is evident from the fact that PCA detected 74% and 67% of the faults in the WWTP and air pollution data sets respectively, ICA detected 61.3% and 62% of the faults, and HICA improved on both by detecting 90% and 81% of the faults in the two case studies. This shows that the newly developed algorithm is more effective than the individual techniques of PCA and ICA. For fault diagnosis using PCA, ICA and HICA, contribution plots are constructed, leading to the identification of the variable(s) responsible for a particular fault. This part also includes the estimation of process parameters using the HICA technique, as was done with PCA and ICA in the first part of the research. As expected, HICA was more successful in the estimation of parameters than PCA and ICA, in line with its performance for process monitoring.
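
    The PCA-based monitoring and contribution-plot diagnosis that the comparison builds on can be sketched as follows; this is a generic illustration of the standard squared-prediction-error (SPE/Q) scheme, not the HICA algorithm or the thesis code, and the data, component count and control limit are synthetic assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic "normal operation" data and one faulty sample (variable 3 drifts).
rng = np.random.default_rng(1)
X_normal = rng.standard_normal((500, 6))
x_fault = rng.standard_normal(6)
x_fault[3] += 5.0

scaler = StandardScaler().fit(X_normal)
pca = PCA(n_components=3).fit(scaler.transform(X_normal))

def spe_and_contributions(x):
    """Squared prediction error (Q statistic) and per-variable contributions."""
    xs = scaler.transform(x.reshape(1, -1))
    residual = xs - pca.inverse_transform(pca.transform(xs))
    return float((residual ** 2).sum()), (residual ** 2).ravel()

# Empirical control limit from the normal data (e.g. 99th percentile of SPE).
spe_normal = np.array([spe_and_contributions(x)[0] for x in X_normal])
limit = np.percentile(spe_normal, 99)

spe, contrib = spe_and_contributions(x_fault)
print(f"SPE = {spe:.2f}, limit = {limit:.2f}, fault detected: {spe > limit}")
print("most responsible variable:", int(np.argmax(contrib)))  # contribution "plot"
```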

    AI Solutions for MDS: Artificial Intelligence Techniques for Misuse Detection and Localisation in Telecommunication Environments

    This report considers the application of Artificial Intelligence (AI) techniques to the problem of misuse detection and misuse localisation within telecommunications environments. A broad survey of techniques is provided, covering inter alia rule-based systems, model-based systems, case-based reasoning, pattern matching, clustering and feature extraction, artificial neural networks, genetic algorithms, artificial immune systems, agent-based systems, data mining and a variety of hybrid approaches. The report then considers the central issue of event correlation, which is at the heart of many misuse detection and localisation systems. The notion of inferring misuse by correlating individual temporally distributed events within a multiple data stream environment is explored, and a range of techniques is examined, covering model-based approaches, 'programmed' AI and machine learning paradigms. It is found that, in general, correlation is best achieved via rule-based approaches, but that these suffer from a number of drawbacks, such as the difficulty of developing and maintaining an appropriate knowledge base, and the lack of ability to generalise from known misuses to new, unseen misuses. Two distinct approaches are evident. One attempts to encode knowledge of known misuses, typically within rules, and uses this to screen events. This approach cannot generally detect misuses for which it has not been programmed, i.e. it is prone to issuing false negatives. The other attempts to learn the features of event patterns that constitute normal behaviour and, by observing patterns that do not match expected behaviour, detect when a misuse has occurred. This approach is prone to issuing false positives, i.e. inferring misuse from innocent patterns of behaviour that the system was not trained to recognise. Contemporary approaches are seen to favour hybridisation, often combining detection or localisation mechanisms for both abnormal and normal behaviour, the former to capture known cases of misuse, the latter to capture unknown cases. In some systems, these mechanisms even update each other to increase detection rates and lower false positive rates. It is concluded that hybridisation offers the most promising future direction, but that a rule or state based component is likely to remain, being the most natural approach to the correlation of complex events. The challenge, then, is to mitigate the weaknesses of canonical programmed systems so that learning, generalisation and adaptation are more readily facilitated.
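
    The hybridisation favoured in the conclusion can be shown schematically (this sketch is not from the report): a rule-based component screens events against known misuse signatures, while a statistical baseline of normal behaviour flags unknown misuses; the signatures, features and threshold are invented for illustration.

```python
import numpy as np

# Hypothetical signatures of known misuse (the "programmed" component).
KNOWN_MISUSE_RULES = [
    lambda e: e["failed_logins"] > 20,                 # brute-force pattern
    lambda e: e["call_minutes"] > 600 and e["night"],  # night-time call flooding
]

class HybridDetector:
    """Toy hybrid: rules catch known misuses, while a statistical baseline of
    normal behaviour catches unknown ones (at the cost of false positives)."""

    def fit_normal(self, volumes):
        self.mu, self.sigma = np.mean(volumes), np.std(volumes) + 1e-9

    def check(self, event):
        if any(rule(event) for rule in KNOWN_MISUSE_RULES):
            return "known misuse"
        if abs(event["call_minutes"] - self.mu) / self.sigma > 3:
            return "anomalous (possible new misuse)"
        return "normal"

det = HybridDetector()
det.fit_normal(np.random.default_rng(2).normal(120, 30, size=1000))
print(det.check({"failed_logins": 35, "call_minutes": 90, "night": False}))
print(det.check({"failed_logins": 0, "call_minutes": 400, "night": False}))
```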

    Data-driven fault detection using trending analysis

    The objective of this research is to develop data-driven fault detection methods which do not rely on mathematical models yet are capable of detecting process malfunctions. Instead of using mathematical models for comparing performance, the methods developed rely on an extensive collection of data to establish classification schemes that detect faults in new data. The research develops two different trending approaches. One uses the normal data to define a one-class classifier. The second uses a data mining technique, e.g. a support vector machine (SVM), to define multi-class classifiers. Each classifier is trained on a set of example objects. The one-class classification assumes that only information about one of the classes, namely the normal class, is available; the boundary between the two classes, normal and faulty, is estimated from data of the normal class only. The research assumes that the convex hull of the normal data can be used to define a boundary separating normal and faulty data. The multi-class classifier is implemented through several binary classifiers; it is assumed that data from two classes are available and the decision boundary is supported from both sides by example objects. In order to detect significant trends in the data, the research implements a non-uniform quantization technique based on Lloyd's algorithm and defines a special subsequence-based kernel. The effect of the subsequence length is examined through computer simulations and theoretical analysis. The test bed used to collect data and implement the fault detection is a six-degrees-of-freedom, rigid-body model of a B747 100/200, and only faults in the actuators are considered. In order to thoroughly test the efficiency of the approach, the tests use only sensor data that do not include manipulated variables. Even with this handicap, the approach is effective, with an average of 79.5% correct detections, 16.7% missed alarms and 3.9% false alarms across six different faults.
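
    The two classifier families described above can be sketched with off-the-shelf tools (not the thesis implementation, which uses a convex-hull boundary and a subsequence-based kernel): a one-class model trained on normal data only, and a multi-class SVM trained on labelled normal and fault data; the synthetic data and kernel choices are assumptions.

```python
import numpy as np
from sklearn.svm import OneClassSVM, SVC

rng = np.random.default_rng(3)
X_normal = rng.standard_normal((300, 4))          # nominal sensor windows
X_fault = rng.standard_normal((60, 4)) + 3.0      # one actuator-fault class

# One-class approach: the boundary is estimated from normal data only.
occ = OneClassSVM(kernel="rbf", nu=0.05).fit(X_normal)
print("one-class flags on faulty data:",
      int((occ.predict(X_fault) == -1).sum()), "/", len(X_fault))

# Multi-class approach: binary SVMs need example objects from every class.
X = np.vstack([X_normal, X_fault])
y = np.hstack([np.zeros(len(X_normal)), np.ones(len(X_fault))]).astype(int)
clf = SVC(kernel="rbf").fit(X, y)
print("multi-class accuracy on training data:", clf.score(X, y))
```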