10 research outputs found

    Applications of Support Vector Machines as a Robust tool in High Throughput Virtual Screening

    Chemical space is enormous, but not all of it is pertinent to drug design. Virtual screening methods act as knowledge-based filters to discover novel lead molecules possessing the desired pharmacological properties. The Support Vector Machine (SVM) is a reliable virtual screening tool for prioritizing molecules with the required biological activity and minimum toxicity. Its inherent advantages include tolerance of noisy data, which mainly come from varied high-throughput biological assays, as well as high sensitivity, specificity and prediction accuracy and a reduction in false positives. SVM-based classification methods can efficiently discriminate inhibitors from non-inhibitors, actives from inactives, toxic from non-toxic and promiscuous from non-promiscuous molecules. As the principles of drug design also apply to agrochemicals, SVM methods are being applied to virtual screening for pesticides too. The current review discusses the basic kernels and models used for binary discrimination, as well as the features used for developing SVM-based scoring functions, which will enhance our understanding of molecular interactions. Many researchers have compared SVM modeling with other statistical methods such as Artificial Neural Networks, k-nearest neighbour (kNN), decision trees and partial least squares; such studies are also discussed in this review. Moreover, a case study using the SVM method to screen molecules for cancer therapy has been carried out, and the preliminary results presented here indicate that the SVM is an excellent classifier for screening the molecules.
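
    The abstract mentions "basic kernels" without naming them. As a minimal sketch, assuming the three kernels most commonly paired with SVMs (linear, polynomial and RBF), they can be written as plain functions over descriptor vectors. The parameter names `degree`, `coef0` and `gamma` and the example vectors are illustrative assumptions, not values taken from the review.

```python
import math

def linear_kernel(x, y):
    """Plain dot product between two descriptor vectors."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """(x . y + coef0) ** degree; degree and coef0 are illustrative defaults."""
    return (linear_kernel(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=0.5):
    """exp(-gamma * ||x - y||^2); gamma is an illustrative default."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

# Hypothetical two-feature descriptors for two molecules.
molecule_a = [1.0, 2.0]
molecule_b = [3.0, 4.0]
similarities = {
    "linear": linear_kernel(molecule_a, molecule_b),
    "polynomial": polynomial_kernel(molecule_a, molecule_b),
    "rbf": rbf_kernel(molecule_a, molecule_b),
}
```

    An SVM only ever sees the data through such pairwise kernel values, which is why the choice of kernel and of molecular descriptors matters so much for the resulting classifier.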

    Knowledge incorporated support vector machines to detect faults in Tennessee Eastman process

    A support vector machine with knowledge incorporation is applied to detect faults in the Tennessee Eastman Process, a benchmark problem in chemical engineering. The knowledge-incorporated algorithm takes advantage of information on horizontal translation invariance in the tangent direction of the instances in the dataset. This essentially changes the representation of the input data while training the algorithm. These local translations do not alter the class membership of the instances in the dataset. The results on binary as well as multiple fault detection justify the use of knowledge incorporation.

    Explainability: Relevance based Dynamic Deep Learning Algorithm for Fault Detection and Diagnosis in Chemical Processes

    The focus of this work is on Statistical Process Control (SPC) of a manufacturing process based on available measurements. Two important applications of SPC in industrial settings are fault detection and diagnosis (FDD). In this work, a deep learning (DL) based methodology is proposed for FDD. We investigate the application of an explainability concept to enhance the FDD accuracy of a deep neural network model trained with a data set containing a relatively small number of samples. The explainability is quantified by a novel relevance measure of input variables that is calculated with a Layerwise Relevance Propagation (LRP) algorithm. It is shown that the relevances can be used to iteratively discard redundant input feature vectors/variables, resulting in reduced over-fitting of noisy data, increased distinguishability between output classes and superior FDD test accuracy. The efficacy of the proposed method is demonstrated on the benchmark Tennessee Eastman Process. Comment: Under review. arXiv admin note: text overlap with arXiv:2012.0386

    Optimized Kernel-Based Conformal Predictor for Online Fault Detection

    In order to improve the computational efficiency of the conformal predictor, a procedure of adaptive kernel-based distance metric learning was incorporated into the algorithm. The learning process was divided into two stages. First, an optimized kernel was obtained by increasing the class separability of 75% of the training samples. Second, the k-nearest-neighbor classifier was used to design a nonconformity measure function in the optimized kernel space, and the standard conformal predictor algorithm was then conducted on the remaining 25% of the training samples. The new method was applied to the multiple fault diagnosis of the Tennessee Eastman process. The results show that the new algorithm provides substantial reductions in computational time while ensuring high predictive efficiency. Supported by the Xiamen University 985 Project (Phase II) Information Innovation Platform (0000-x07204) and the Xiamen Science and Technology Program (3502Z20083028
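
    The second stage described above can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: a fixed RBF kernel stands in for the optimized kernel (no kernel learning is performed), the nonconformity score is the common ratio of same-class to other-class k-nearest-neighbor distances, and all function names and the toy data are hypothetical.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Fixed RBF kernel; a stand-in (assumption) for the optimized kernel."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def kernel_distance(x, y, kernel=rbf_kernel):
    """Distance induced by a kernel: sqrt(k(x,x) + k(y,y) - 2 k(x,y))."""
    squared = kernel(x, x) + kernel(y, y) - 2.0 * kernel(x, y)
    return math.sqrt(max(squared, 0.0))

def knn_nonconformity(x, label, train, k=1):
    """Summed distance to the k nearest same-class points divided by the
    summed distance to the k nearest other-class points (larger = stranger)."""
    same = sorted(kernel_distance(x, xi) for xi, yi in train if yi == label)
    other = sorted(kernel_distance(x, xi) for xi, yi in train if yi != label)
    return sum(same[:k]) / (sum(other[:k]) + 1e-12)

def p_values(x, train, calibration, labels, k=1):
    """Conformal p-value for each candidate label, using a held-out
    calibration set (the '25%' portion in the abstract)."""
    cal_scores = [knn_nonconformity(xi, yi, train, k) for xi, yi in calibration]
    result = {}
    for label in labels:
        score = knn_nonconformity(x, label, train, k)
        at_least_as_strange = sum(1 for s in cal_scores if s >= score)
        result[label] = (at_least_as_strange + 1) / (len(cal_scores) + 1)
    return result

# Toy data: class 0 clusters near the origin, class 1 near (3, 3).
train = [((0.0, 0.0), 0), ((0.2, 0.1), 0), ((3.0, 3.0), 1), ((3.1, 2.9), 1)]
calibration = [((0.1, 0.2), 0), ((2.9, 3.1), 1)]
p = p_values((0.1, 0.1), train, calibration, labels=[0, 1])
```

    Because the nonconformity scores of the calibration samples are computed once against the training portion, each new sample needs only one score per candidate label, which is the source of the computational saving that a held-out calibration set brings.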

    Fault detection and root cause diagnosis using dynamic Bayesian network

    This thesis presents two real-time process fault detection and diagnosis (FDD) techniques that incorporate process data and prior knowledge. Unlike supervised monitoring techniques, both methods can operate without any prior information about a fault. In the first part of this research, a hybrid methodology is developed that combines principal component analysis (PCA), a Bayesian network (BN) and multiple uncertain (likelihood) evidence to improve the diagnostic capacity of PCA and of existing PCA-BN schemes that update on hard evidence. In the latter part of this work, a dynamic BN (DBN) based FDD methodology is proposed that provides detection and accurate diagnosis with a single tool. Furthermore, the fault propagation pathway is analyzed using the predictive feature of a BN and the cause-effect relationships among the process variables. The proposed frameworks are successfully validated by applying them to several process models.

    Deep Recurrent Neural Networks for Fault Detection and Classification

    Deep learning is one of the fastest growing research topics in process systems engineering because of the ability of deep learning models to represent and predict non-linear behavior in many applications. However, the application of these models in chemical engineering is still in its infancy. Thus, a key goal of this work is to assess the capabilities of deep-learning based models in a chemical engineering application. The specific focus of the current work is the detection and classification of faults in a large industrial plant involving several chemical unit operations. Towards this goal, we compare the efficacy of a deep learning based algorithm with that of other state-of-the-art multivariate statistical techniques for fault detection and classification. The comparison is conducted using simulated data from a chemical benchmark case study that has often been used to test fault detection algorithms, the Tennessee Eastman Process (TEP). A real-time online scheme is proposed that enhances the detection and classification of all the faults occurring in the simulation. This is accomplished by formulating a fault-detection model capable of describing the dynamic nonlinear relationships among the output variables and manipulated variables that can be measured in the Tennessee Eastman Process, both during the occurrence of faults and in their absence. In particular, we focus on specific faults that cannot be correctly detected and classified by traditional statistical methods or by simpler Artificial Neural Networks (ANN). To increase the detectability of these faults, a deep Recurrent Neural Network (RNN) is programmed that uses dynamic information of the process along a pre-specified time horizon. In this research, we first studied the effect of the number of samples fed into the RNN in order to capture more dynamic information about the faults, and showed that accuracy increases with this number: for example, average classification rates were 79.8%, 80.3%, 81% and 84% for the RNN with 5, 15, 25 and 100 samples, respectively. In addition, to increase the classification accuracy of difficult-to-observe faults, we developed a hierarchical structure in which faults are grouped into subsets and classified with a separate model for each subset. Also, to improve the classification of faults whose responses have a low signal-to-noise ratio, excitation was added to the process through a pseudo-random signal (PRS). Applying the hierarchical structure increases the signal-to-noise ratio of faults 3 and 9, which translates into an improvement in their classification accuracy of 43.0% and 17.2%, respectively, for 100 samples and of 8.7% and 23.4% for 25 samples. Applying a PRS to excite the system, in turn, produced a dramatic increase in the classification rates of the normal state, to 88.7%, and of fault 15, up to 76.4%. Therefore, the proposed method considerably improves both the detection and the classification accuracy of several observable faults, as well as of faults considered unobservable by other detection algorithms. Overall, the comparison of the deep learning algorithms with Dynamic PCA (Principal Component Analysis) techniques showed a clear superiority of the deep learning techniques in classifying faults in nonlinear dynamic processes. Finally, we apply these same techniques to different operational modes of the TEP simulation, achieving comparable improvements in classification accuracy.
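
    The "number of samples fed into the RNN" can be pictured as the length of a sliding window over the measurement time series. A minimal sketch, assuming each window is labelled with the state of its most recent sample (the abstract does not say how windows are labelled) and using a hypothetical helper name `make_windows`:

```python
def make_windows(series, labels, window):
    """Slice a multivariate time series into overlapping windows.

    series: list of measurement vectors, one per time step.
    labels: fault label per time step (0 could denote normal operation).
    window: number of consecutive samples per RNN input, e.g. 5, 15, 25 or 100.
    Each window takes the label of its last sample (an assumption).
    """
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    pairs = []
    for end in range(window, len(series) + 1):
        pairs.append((series[end - window:end], labels[end - 1]))
    return pairs

# Toy series with two measured variables and a fault starting at t = 3.
series = [[float(t), 2.0 * t] for t in range(6)]
labels = [0, 0, 0, 1, 1, 1]
windows = make_windows(series, labels, window=3)
```

    Longer windows give the network more process dynamics to work with, at the cost of a longer delay before the first window after start-up is available.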

    Plantwide simulation and monitoring of offshore oil and gas production facility

    Monitoring is one of the major concerns on offshore oil and gas production platforms, since access to offshore facilities is difficult. It is also quite challenging to extract oil and gas safely in such a harsh environment, and any abnormality may lead to a catastrophic event. Process data covering all possible faulty scenarios are required to build an appropriate monitoring system. Since plant-wide process data are not available in the literature, a dynamic model and simulation of an offshore oil and gas production platform is developed using Aspen HYSYS. Modeling and simulation are handy tools for designing a production plant and predicting its behavior accurately. The model was built based on the gas processing plant of the North Sea platform reported in Voldsund et al. (2013). Several common faults from different fault categories were simulated in the dynamic system, and their impacts on overall hydrocarbon production were analyzed. The simulated data are then used to build a monitoring system for each of the faulty states. A new monitoring method is proposed that combines Principal Component Analysis (PCA) and Dynamic PCA (DPCA) with an Artificial Neural Network (ANN). Applying an ANN directly to process systems is quite difficult, as it requires a very large number of input neurons to model the system. Training such a large-scale network is time-consuming and yields poor accuracy with a high error rate. In the PCA-ANN and DPCA-ANN monitoring systems, PCA and DPCA are used to reduce the dimension of the training data set and extract the main features of the measured variables. Subsequently, the ANN uses these lower-dimensional score vectors to build a training model and classify the abnormalities. It is found that the proposed approach reduces the time needed to train the ANN and successfully detects, diagnoses and classifies the faults with a high accuracy rate.
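
    The dimension-reduction step that precedes the ANN can be sketched in plain Python. This is an illustrative approximation, not the thesis implementation: principal components are found by power iteration with deflation rather than a full eigendecomposition, the classifier itself is omitted, and the function names and toy measurements are assumptions.

```python
import math

def center(rows):
    """Subtract the per-variable mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_component(cov, iters=500):
    """Dominant eigenvector and eigenvalue of cov via power iteration."""
    d = len(cov)
    v = [1.0 + i for i in range(d)]  # arbitrary non-uniform starting vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return v, eigenvalue

def pca_scores(rows, n_components):
    """Project rows onto the leading principal components."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(n_components):
        v, lam = top_component(cov)
        components.append(v)
        # Deflate: remove the variance explained by the component just found.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    scores = [[sum(r[j] * c[j] for j in range(d)) for c in components]
              for r in centered]
    return scores, components

# Toy measurements: two strongly correlated variables and one constant one.
rows = [[1.0, 1.0, 0.0], [2.0, 2.1, 0.0], [3.0, 2.9, 0.0], [4.0, 4.0, 0.0]]
scores, components = pca_scores(rows, n_components=1)
```

    The resulting `scores` rows, one short vector per time step, are what a classifier would receive in place of the full measurement vector. For the dynamic variant, the same projection would be applied to rows augmented with time-lagged copies of the measurements.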

    An investigation on automatic systems for fault diagnosis in chemical processes

    Plant safety is the most important concern of the chemical industries. Process faults can cause economic losses as well as human and environmental damage. Most operational faults are normally considered in the process design phase by applying methodologies such as Hazard and Operability Analysis (HAZOP). However, failures should still be expected to occur in an operating plant. For this reason, it is of paramount importance that plant operators can promptly detect and diagnose such faults in order to take the appropriate corrective actions. In addition, preventive maintenance needs to be considered in order to increase plant safety. Fault diagnosis has been addressed with both analytic and data-based models, using several techniques and algorithms. However, there is not yet a general fault diagnosis framework that joins detection and diagnosis of faults, whether or not they have previously been recorded. Moreover, little effort has been devoted to automating and implementing the reported approaches in real practice. Against this background, this thesis proposes a general framework for data-driven Fault Detection and Diagnosis (FDD) that can be applied and automated in any industrial scenario in order to maintain plant safety. The main requirement for constructing this system is the existence of historical process data, with no prior filtering. In this sense, promising methods imported from the Machine Learning field are introduced as fault diagnosis methods. The learning algorithms used as diagnosis methods have proved capable of diagnosing not only the modeled faults but also novel faults. Furthermore, Risk-Based Maintenance (RBM) techniques, widely used in the petrochemical industry, are proposed for application as part of preventive maintenance in all industry sectors. The proposed FDD system, together with an appropriate preventive maintenance program, would represent a potential plant safety program to be implemented.
    Chapter one presents a general introduction to the thesis topic, as well as its motivation and scope. Chapter two reviews the state of the art in the related fields. Fault detection and diagnosis methods found in the literature are reviewed, and a taxonomy that joins the Artificial Intelligence (AI) and Process Systems Engineering (PSE) classifications is proposed. The assessment of fault diagnosis with performance indices is also reviewed. Moreover, the state of the art in Risk Analysis (RA), as a tool for taking corrective actions against faults, and in Maintenance Management, for preventive actions, is presented. Finally, the benchmark case studies against which FDD research is commonly validated are examined in this chapter. The second part of the thesis, comprising chapters three to six, addresses the methods applied during the research work. Chapter three deals with data pre-processing, chapter four with the feature processing stage and chapter five with the diagnosis algorithms, while chapter six introduces the Risk-Based Maintenance techniques for addressing preventive plant maintenance. The third part consists of chapter seven, which constitutes the core of the thesis. In this chapter the proposed general fault diagnosis system is outlined, divided into three steps: diagnosis model construction, model validation and on-line application. This scheme includes a fault detection module and an Anomaly Detection (AD) methodology for the detection of novel faults. Furthermore, several approaches for continuous and batch processes are derived from this general scheme. The fourth part of the thesis presents the validation of the approaches. Specifically, chapter eight presents the validation of the proposed approaches in continuous processes and chapter nine the validation of the batch process approaches. Chapter ten validates the AD methodology in real batch processes: the methodology is first applied to a laboratory-scale heat exchanger and then to a Photo-Fenton pilot plant, which corroborates its potential and success in real practice. Finally, the fifth part, comprising chapter eleven, presents the final conclusions and the main contributions of the thesis. The scientific output produced during the research period is also listed, and prospects for further work are outlined.