31 research outputs found

    Incipient Fault Detection, Diagnosis, and Prognosis using Canonical Variate Dissimilarity Analysis

    Industrial process monitoring involves three main activities, namely, fault detection, fault diagnosis, and fault prognosis. Respectively, these activities seek to answer three questions: ‘Has a fault occurred?’, ‘Where did it occur, and how large is it?’, and ‘How will it progress in the future?’ As opposed to abrupt faults, incipient faults develop slowly over time, ultimately leading to process failure or an emergency situation. A recently developed multivariate statistical tool for early detection of incipient faults under varying operating conditions is Canonical Variate Dissimilarity Analysis (CVDA). In CVDA, a dissimilarity-based statistical index was derived to improve detection sensitivity over the traditional canonical variate analysis (CVA) indices. This study extends the CVDA detection framework towards diagnosis and prognosis of process conditions. For diagnosis, contribution maps are used to convey the magnitude and location of the incipient fault effects, as well as their evolution in time. For prognosis, CVA state-space prediction and Kalman filtering during faulty conditions are proposed in this work. By covering the three main process monitoring activities in one framework, our work can serve as a baseline strategy for future application to large process industries.
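The prognosis step described here rests on state-space prediction with Kalman filtering. As a rough illustration of that idea only — not the CVDA-specific formulation, and with hypothetical scalar model and noise parameters — one predict/update cycle looks like:

```python
# Minimal Kalman filter sketch: scalar state, hypothetical A, Q, H, R values.
def kalman_step(x, P, z, A=1.0, Q=0.01, H=1.0, R=0.1):
    """One predict/update cycle for a scalar state-space model."""
    # Predict: propagate the state estimate and its variance through the model.
    x_pred = A * x
    P_pred = A * P * A + Q
    # Update: blend the prediction with measurement z via the Kalman gain.
    K = P_pred * H / (H * P_pred * H + R)
    x_new = x_pred + K * (z - H * x_pred)
    P_new = (1.0 - K * H) * P_pred
    return x_new, P_new

# Run a few cycles on illustrative measurements near 1.0.
x, P = 0.0, 1.0
for z in [0.9, 1.1, 1.0]:
    x, P = kalman_step(x, P, z)
```

In the CVA setting the state would be the canonical variate vector and `A` the identified state-transition matrix; during faulty conditions the filter would be iterated to extrapolate the fault's progression.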

    A Review of Kernel Methods for Feature Extraction in Nonlinear Process Monitoring

    Kernel methods are a class of learning machines for the fast recognition of nonlinear patterns in any data set. In this paper, applications of kernel methods for feature extraction in industrial process monitoring are systematically reviewed. First, we describe the reasons for using kernel methods and contextualize them among other machine learning tools. Second, by reviewing a total of 230 papers, this work identifies 12 major issues surrounding the use of kernel methods for nonlinear feature extraction. Each issue is discussed in terms of why it is important and how it has been addressed over the years by many researchers. We also present a breakdown of the commonly used kernel functions, parameter selection routes, and case studies. Lastly, this review provides an outlook into the future of kernel-based process monitoring, which can hopefully instigate more advanced yet practical solutions in the process industries.
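At the heart of every method surveyed here is the kernel trick: inner products in a nonlinear feature space are evaluated through a kernel function on the raw data. A minimal sketch of the widely used Gaussian (RBF) kernel Gram matrix, with an illustrative `gamma` value:

```python
import math

def rbf_gram(X, gamma=0.5):
    # Gram matrix K[i][j] = exp(-gamma * ||x_i - x_j||^2) over rows of X.
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            K[i][j] = math.exp(-gamma * d2)
    return K

K = rbf_gram([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
```

Kernel feature extractors such as KPCA then work entirely on (a centered version of) this matrix, never forming the feature space explicitly.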

    Monitoring a reverse osmosis process with kernel principal component analysis: A preliminary approach

    The water purification process is becoming increasingly important to ensure the continuity and quality of subsequent production processes, and it is particularly relevant in pharmaceutical contexts. However, in this context, the difficulties arising during monitoring are manifold. On the one hand, the process exhibits various discontinuities due to differing characteristics of the input water. On the other hand, the monitoring process itself is discontinuous and irregular, which does not guarantee continuity of the parameters and hinders straightforward analysis. Consequently, further research on water purification processes is paramount to identify the most suitable techniques able to guarantee good performance. Against this background, this paper proposes an application of kernel principal component analysis for fault detection in a process with the above-mentioned characteristics. Based on the temporal variability of the process, the paper suggests using past and future matrices as input for fault detection, as an alternative to the original dataset. In this manner, the temporal correlation between process parameters and machine health is accounted for. The proposed approach confirms the possibility of obtaining very good monitoring results in the analyzed context.
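The past/future construction mentioned in this abstract replaces each observation with lagged windows of the series. A minimal single-variable sketch (a hypothetical helper; the paper's exact lag structure may differ):

```python
def lagged_matrices(series, lag):
    # For each admissible time t, past[t] holds [y_{t-1}, ..., y_{t-lag}]
    # (most recent first) and future[t] holds [y_t, ..., y_{t+lag-1}].
    past, future = [], []
    for t in range(lag, len(series) - lag + 1):
        past.append(series[t - lag:t][::-1])
        future.append(series[t:t + lag])
    return past, future

past, future = lagged_matrices([1, 2, 3, 4, 5], lag=2)
```

Feeding such lagged matrices (rather than raw samples) to KPCA lets the detection statistics respond to temporal correlation, not just instantaneous values.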

    A Kernel Design Approach to Improve Kernel Subspace Identification

    Subspace identification methods, such as canonical variate analysis (CVA), are noniterative tools suitable for the state-space modeling of multi-input, multi-output processes (e.g., industrial processes) using input–output data. To learn nonlinear system behavior, kernel subspace techniques are commonly used. However, the issue of kernel design deserves more attention, because the type of kernel influences the kinds of nonlinearities the model can capture. In this article, a new kernel design is proposed for CVA-based identification: a mixture of a global and a local kernel to enhance generalization ability, with a mechanism to vary the influence of each process variable on the model response. During validation, model hyperparameters were tuned using random search. The overall method is called feature-relevant mixed-kernel CVA (FR-MKCVA). In an evaporator case study, the trained FR-MKCVA models fit the observed data better than single-kernel CVA, linear CVA, and neural network models under both interpolation and extrapolation scenarios. This work provides a basis for future exploration of deep and diverse kernel designs for system identification.
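A global/local kernel mixture with per-variable relevance weights can be sketched as below. This is one plausible convex combination of a polynomial (global) and RBF (local) kernel with hypothetical parameters, not the exact form used in the article:

```python
import math

def mixed_kernel(x, y, w, alpha=0.5, gamma=1.0, degree=2):
    # Feature-relevance weighting (assumed form): scale variable k by w[k].
    xw = [wi * xi for wi, xi in zip(w, x)]
    yw = [wi * yi for wi, yi in zip(w, y)]
    # Local term: RBF kernel, sensitive to nearby samples.
    d2 = sum((a - b) ** 2 for a, b in zip(xw, yw))
    local = math.exp(-gamma * d2)
    # Global term: polynomial kernel, captures broad trends.
    dot = sum(a * b for a, b in zip(xw, yw))
    poly = (1.0 + dot) ** degree
    # Convex mixture controlled by alpha.
    return alpha * local + (1.0 - alpha) * poly

k = mixed_kernel([1.0, 0.0], [1.0, 0.0], w=[1.0, 1.0])
```

Shrinking `w[k]` toward zero removes variable `k` from the kernel response, which is the intuition behind making the kernel "feature-relevant".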

    Fault diagnosis in multivariate statistical process monitoring

    The application of multivariate statistical process monitoring (MSPM) methods has gained considerable momentum over the last couple of decades, especially in the processing industry, for achieving higher throughput at sustainable rates, reducing safety-related events, and minimizing potential environmental impacts. Multivariate process deviations occur when the relationships amongst many process characteristics differ from what is expected. The fault detection ability of methods such as principal component analysis (PCA)-based process monitoring has been reported in the literature and demonstrated in selected practical applications. However, the methodologies employed to diagnose the cause of identified multivariate process faults have not gained the anticipated traction in practice. One explanation is that current diagnostic approaches attempt to rank process variables according to their individual contributions to process faults, yet their inability to correctly identify the variables responsible for a process deviation is well researched and communicated in the literature. Specifically, these approaches suffer from a phenomenon known as fault smearing. In this research it is argued, using several illustrations, that the objective of assigning individual importance rankings to process variables is not appropriate in a multivariate setting. A new methodology is introduced for performing fault diagnosis in multivariate process monitoring; more specifically, a multivariate diagnostic method is proposed that ranks variable pairs as opposed to individual variables. For PCA-based MSPM, a novel fault diagnosis method is developed that decomposes the fault identification statistics into a sum of parts, with each part representing the contribution of a specific variable pair. An approach is also developed to quantify the statistical significance of each pairwise contribution. In addition, it is illustrated how the pairwise contributions can be analysed further to obtain an individual importance ranking of the process variables. Two methodologies are developed for calculating the individual ranking following the pairwise contribution analysis; however, it is advised that the individual rankings be interpreted together with the pairwise contributions. The application of this new approach to PCA-based MSPM and fault diagnosis is illustrated using a simulated data set.
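A pairwise decomposition of this kind can be illustrated for any quadratic detection statistic Q = x'Cx (for the SPE, C would be I - PP' from a PCA model). This is a minimal sketch of the arithmetic only, not the thesis's significance-quantified formulation:

```python
def pairwise_contributions(x, C):
    # Decompose Q = x' C x into per-pair parts: diagonal terms C[i][i]*x_i^2
    # plus off-diagonal cross terms 2*C[i][j]*x_i*x_j (C assumed symmetric).
    n = len(x)
    contrib = {}
    for i in range(n):
        for j in range(i, n):
            if i == j:
                contrib[(i, j)] = C[i][i] * x[i] * x[i]
            else:
                contrib[(i, j)] = 2.0 * C[i][j] * x[i] * x[j]
    return contrib

# Tiny example with an illustrative symmetric C and sample x.
C = [[2.0, 1.0], [1.0, 3.0]]
x = [1.0, 2.0]
contrib = pairwise_contributions(x, C)
```

The parts sum exactly to Q, so ranking pairs by their contribution partitions the alarm statistic without the averaging that causes fault smearing in variable-wise contributions.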

    A Novel Data-Driven Fault Tree Methodology for Fault Diagnosis and Prognosis

    This thesis develops a new methodology for the diagnosis and prognosis of faults in complex systems, called Interpretable Logic Tree Analysis (ILTA), which combines knowledge discovery in databases (KDD) with fault tree analysis (FTA). The methodology exploits the advantages of both techniques to address the problem of fault diagnosis and prognosis. Although fault trees provide interpretable models for determining the possible causes of a fault, their use for fault diagnosis in industrial systems is limited by the need for expert knowledge to describe the cause-and-effect relationships between internal system processes. It is therefore attractive to retain the analytical power of fault trees while building them from explicit, unbiased knowledge of fault causality extracted directly from databases. The ILTA methodology thus works analogously to the logic of FTA but with minimal expert involvement, while still matching the way experts represent the hierarchical structure of faults in a complex system. The methodology is applied to failure risk management through two interpretable advanced tree models: a multi-level tree (MILTA) and a tree interpretable over time (ITCA). The MILTA model performs fault diagnosis in complex systems: it decomposes a complex fault and graphically models its causal structure as a multi-level tree, so that an expert can visualize the hierarchical cause-and-effect relationships leading to the main failure; quantifying these causes by assigning probabilities helps to understand their contribution to the occurrence of the system failure. The ITCA model performs fault prognosis in complex systems: based on a partition of the data over time, it captures the effect of system aging through the evolution of the fault-causality structure, describing the causal changes that result from deterioration and aging over the life of the system.
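The quantification step — assigning probabilities to causes in the tree — follows standard fault-tree gate algebra. A minimal sketch, assuming independent basic events (the ILTA trees themselves are mined from data, which this sketch does not show):

```python
def gate_prob(kind, probs):
    # AND gate: all basic events must occur -> product of probabilities.
    # OR gate: at least one occurs -> 1 - product of complements.
    # Assumes statistically independent basic events.
    p = 1.0
    if kind == "AND":
        for q in probs:
            p *= q
        return p
    for q in probs:
        p *= (1.0 - q)
    return 1.0 - p

# Top event = OR of an AND-subtree and a single basic event.
top = gate_prob("OR", [gate_prob("AND", [0.5, 0.5]), 0.1])
```

Evaluating gates bottom-up in this way propagates basic-event probabilities to the top failure, which is how a tree makes each cause's contribution explicit.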

    FAULT DETECTION AND PREDICTION IN ELECTROMECHANICAL SYSTEMS VIA THE DISCRETIZED STATE VECTOR-BASED PATTERN ANALYSIS OF MULTI-SENSOR SIGNALS

    Department of System Design and Control Engineering
    In recent decades, operation and maintenance strategies for industrial applications have evolved from corrective and preventive maintenance to condition-based monitoring and, eventually, predictive maintenance. High-performance sensors and data-logging technologies have enabled us to monitor the operational states of systems and predict fault occurrences. Several time series analysis methods have been proposed in the literature to classify system states via multi-sensor signals. Since time series of sensor signals are often very short, intermittent, transient, highly nonlinear, and non-stationary random signals, time series analysis becomes more complex. Therefore, time series discretization has been widely applied to extract meaningful features from the original complex signals. Several important issues must be addressed in discretization for fault detection and prediction: (i) what is the fault pattern that represents a system's faulty states; (ii) how can we effectively search for fault patterns; (iii) what is a symptom pattern for predicting fault occurrences; and (iv) what is a systematic procedure for online fault detection and prediction? In this regard, this study proposes a fault detection and prediction framework that consists of (i) definition of a system's operational states, (ii) definitions of fault and symptom patterns, (iii) multivariate discretization, (iv) severity and criticality analyses, and (v) online detection and prediction procedures. Given the time markers of fault occurrences, we can divide a system's operational states into fault and no-fault states. We postulate that a symptom state precedes the occurrence of a fault within a certain time period, and hence a no-fault state consists of normal and symptom states. Fault patterns are therefore found only in fault states, whereas symptom patterns are either found only in the system's symptom states (being absent in the normal states) or not found in the given time series but similar to fault patterns. To determine the length of a symptom state, we present a symptom-pattern-based iterative search method. To identify the distinctive behaviors of multi-sensor signals, we propose a multivariate discretization approach that consists mainly of label definition, label specification, and event codification. Discretization parameters are carefully controlled by considering the key characteristics of multi-sensor signals. We discuss how to measure the severity degrees of fault and symptom patterns, and how to assess the criticality of fault states. We apply the fault and symptom pattern extraction and severity assessment methods to online fault detection and prediction. Finally, we demonstrate the performance of the proposed framework through six case studies: abnormal cylinder temperature in a marine diesel engine, automotive gasoline engine knocking, laser weld defects, buzz, squeak, and rattle (BSR) noises from a car door trim (using a typical acoustic sensor array and acoustic emission sensors, respectively), and visual stimulus cognition tests in the P300 experiment.
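Time-series discretization of the kind used here maps raw samples to a small symbol alphabet before pattern search. A minimal single-variable sketch with hypothetical low/medium/high thresholds (the proposed framework is multivariate and considerably more elaborate):

```python
def discretize(series, thresholds, labels="LMH"):
    # Map each sample to a symbol by counting how many ascending
    # thresholds it exceeds: 0 -> "L", 1 -> "M", 2 -> "H".
    out = []
    for v in series:
        k = sum(v > t for t in thresholds)
        out.append(labels[k])
    return "".join(out)

symbols = discretize([0.1, 0.5, 0.9], thresholds=[0.3, 0.7])
```

Once signals are symbol strings, fault and symptom patterns become substrings that can be searched for and scored for severity.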

    ADVANCES ON BILINEAR MODELING OF BIOCHEMICAL BATCH PROCESSES

    This thesis studies the implications of the statistical modeling approaches proposed for the bilinear modeling of batch processes, develops new techniques to overcome some of the problems that have not yet been solved, and applies them to data from biochemical processes. The study, discussion, and development of the new methods revolve around the four steps of the modeling cycle, from the alignment, preprocessing, and calibration of batch data to the monitoring of batch trajectories. Special attention is given to the problem of batch synchronization and its effect on modeling, examined from different angles. The manuscript is divided into four blocks. First, a state-of-the-art review of latent-structure-based models for continuous and batch processes and of traditional univariate and multivariate statistical process control systems is carried out. The second block is devoted to the preprocessing of batch data, in particular the equalization and synchronization of batch trajectories. The first section addresses the problem of the lack of equalization in the variable trajectories: the different types of unequalization scenarios that practitioners might find in batch processes are discussed, and solutions to equalize batch data are introduced. In the second section, a theoretical study of the nature of batch processes and of the synchronization of batch trajectories as a prior step to bilinear modeling is carried out. The topics under discussion are (i) whether the same synchronization approach must be applied to batch data in the presence of different types of asynchronisms, and (ii) whether synchronization is always required even when the length of the variable trajectories is constant across batches. To answer these questions, a thorough study of the most common types of asynchronisms that may be found in batch data is performed. Furthermore, two new synchronization techniques are proposed to solve the current problems in post-batch and real-time synchronization. To improve fault detection and classification, new unsupervised control charts and supervised fault classifiers based on the information generated by batch synchronization are also proposed. In the third block, the parameter stability associated with the most widely used synchronization methods and with principal component analysis (PCA)-based batch multivariate statistical process control methods is investigated. The results reveal that accuracy in batch synchronization has a profound impact on the stability of the PCA model parameters. Parameter stability is also closely related to the type of preprocessing performed on the batch data, and to the type of model and unfolding used to transform the three-way data structure into a two-way one. Parameter stability, the source of variability remaining after preprocessing, and the process dynamics should be balanced so that the multivariate statistical models are accurate in fault detection and diagnosis and/or in online prediction. Finally, the fourth block introduces a user-friendly graphical interface, developed in Matlab, for batch process understanding and monitoring; the latest developments in process chemometrics, including the methods proposed in this thesis, are implemented.
    González Martínez, J.M. (2015). Advances on Bilinear Modeling of Biochemical Batch Processes [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/55684
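Batch synchronization means warping variable trajectories onto a common time axis. Dynamic time warping (DTW) is a standard starting point in this literature; the sketch below is the classic dynamic-programming cost, not the thesis's new post-batch or real-time techniques:

```python
def dtw_distance(a, b):
    # Classic DTW: minimal cumulative |a_i - b_j| cost over all
    # monotone alignments of the two 1-D trajectories.
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]

d = dtw_distance([1.0, 2.0, 3.0], [1.0, 1.0, 2.0, 3.0])
```

A zero distance between trajectories of different lengths (as in the example) is exactly the asynchronism-absorbing behavior that makes warping attractive before bilinear modeling.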

    A precise bare simulation approach to the minimization of some distances. Foundations

    In information theory -- as well as in the adjacent fields of statistics, machine learning, artificial intelligence, signal processing, and pattern recognition -- many flexibilizations of the omnipresent Kullback-Leibler information distance (relative entropy) and of the closely related Shannon entropy have become frequently used tools. The main goal of this paper is to tackle the corresponding constrained minimization (respectively, maximization) problems with a newly developed dimension-free bare (pure) simulation method. Almost no assumptions (such as convexity) on the set of constraints are needed within our discrete setup of arbitrary dimension, and our method is precise (i.e., it converges in the limit). As a side effect, we also derive an innovative way of constructing new useful distances/divergences. To illustrate the core of our approach, we present numerous examples. The potential for widespread applicability is indicated, too; in particular, we deliver many recent references for uses of the involved distances/divergences and entropies in various research fields (which may also serve as an interdisciplinary interface).
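As a concrete anchor for the distances being flexibilized, the discrete Kullback-Leibler divergence between probability vectors P and Q, with the usual convention 0·log 0 = 0, is:

```python
import math

def kl_divergence(p, q):
    # D(P || Q) = sum_i p_i * log(p_i / q_i), skipping terms with p_i = 0
    # (the 0 * log 0 = 0 convention). Assumes q_i > 0 wherever p_i > 0.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

d = kl_divergence([0.9, 0.1], [0.5, 0.5])
```

The constrained problems treated in the paper minimize such divergences over a constraint set; the simulation approach avoids forming that set's geometry explicitly.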