
    An Event-Based Approach to Distributed Diagnosis of Continuous Systems

    Distributed fault diagnosis solutions are becoming necessary due to the complexity of modern engineering systems and the advent of smart sensors and computing elements. This paper presents a novel event-based approach for distributed diagnosis of abrupt parametric faults in continuous systems, based on a qualitative abstraction of measurement deviations from nominal behavior. We systematically derive dynamic fault signatures expressed as event-based fault models, and develop a distributed diagnoser design algorithm that uses these models to design local event-based diagnosers based on global diagnosability analysis. Each local diagnoser generates globally correct diagnosis results locally, without a centralized coordinator, communicating a minimal number of measurements with the other diagnosers. The proposed approach is applied to a multi-tank system, and the results demonstrate a marked improvement in scalability compared to a centralized approach.
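    A minimal Python sketch of the event-based isolation idea, assuming hypothetical measurements m1 and m2, a simple symmetric threshold, and two made-up fault signatures; this is not the paper's diagnoser design algorithm.

```python
def qualitative_symbol(residual, threshold=0.05):
    """Abstract a numeric measurement deviation into a qualitative symbol."""
    if residual > threshold:
        return "+"   # measurement above nominal behavior
    if residual < -threshold:
        return "-"   # measurement below nominal behavior
    return "0"       # consistent with nominal behavior

# Hypothetical event-based fault signatures: for each fault, the expected
# sequence of qualitative deviation events on measurements m1 and m2.
FAULT_SIGNATURES = {
    "tank1_leak":  [("m1", "-"), ("m2", "-")],
    "valve_block": [("m1", "+"), ("m2", "-")],
}

def isolate(observed_events):
    """Return the faults whose signature starts with the observed events."""
    return [fault for fault, sig in FAULT_SIGNATURES.items()
            if sig[:len(observed_events)] == list(observed_events)]

# A deviation of -0.2 on m1 abstracts to the event ("m1", "-"), which is
# consistent only with the tank1_leak signature.
print(isolate([("m1", qualitative_symbol(-0.2))]))   # ['tank1_leak']
```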

    Improving Distributed Diagnosis Through Structural Model Decomposition

    Complex engineering systems require efficient fault diagnosis methodologies, but centralized approaches do not scale well, which motivates the development of distributed solutions. This work presents an event-based approach for distributed diagnosis of abrupt parametric faults in continuous systems that uses the structural model decomposition capabilities provided by Possible Conflicts. We develop a distributed diagnosis algorithm that uses residuals, computed by extending Possible Conflicts, to build local event-based diagnosers based on global diagnosability analysis. The proposed approach is applied to a multi-tank system, and the results demonstrate an improvement in the design of the local diagnosers: since each local diagnoser uses only a subset of the residuals, and residuals are computed from subsystem models rather than the global system model, the local diagnosers are more efficient than those of previously developed distributed approaches.
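    A minimal Python sketch of a Possible-Conflict-style residual, under an assumed single-tank submodel h' = (u - k*h)/A with illustrative parameters; the actual residual generators are derived from the system model by the Possible Conflicts decomposition.

```python
def pc_residual(u_inflow, y_level, dt=0.1, A=1.0, k=0.5):
    """Residuals between the measured and locally estimated tank level.

    The estimate comes from a subsystem model only, so a local diagnoser
    needs just this subset of measurements, not the global system model.
    """
    h_est = y_level[0]                     # initialize from the first measurement
    residuals = []
    for u, y in zip(u_inflow, y_level):
        h_est += dt * (u - k * h_est) / A  # one Euler step of the submodel
        residuals.append(y - h_est)        # deviation from nominal behavior
    return residuals

# A residual drifting away from zero implicates a fault in this subsystem only.
print(pc_residual([1.0] * 5, [0.0, 0.1, 0.18, 0.5, 0.9]))
```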

    Distributed fault diagnosis using minimal structurally over-determined sets: Application to a water distribution network

    Distributed fault diagnosis is becoming increasingly common in industry for diagnosing faults in large-scale systems. Centralized fault diagnosis has significant disadvantages in large-scale systems: all information must be collected at a single location, which is generally difficult or impossible, and a high-performance centralized processing unit is required, which in most cases is not available. Due to these difficulties, distributed fault diagnosis techniques have been investigated in recent years [10]. In distributed fault diagnosis [1][2], local diagnoses are computed from the results of a single agent, and the global diagnosis for the complete system is computed from the results of all agents. In distributed fault diagnosis [3][8], no global coordination process is necessary: each subsystem relies on a local diagnoser for local diagnosis tasks and communicates with the remaining local diagnosers until a global diagnosis is produced.
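    A minimal Python sketch of MSO-based fault isolation, with a made-up incidence of faults in three MSO residuals and a single-fault assumption; the paper's actual MSO sets come from structural analysis of the water distribution network model.

```python
# Hypothetical fault signature matrix: which faults enter the equations of
# each minimal structurally over-determined (MSO) set, i.e. each residual.
MSO_FAULTS = {
    "r1": {"pipe_leak", "sensor_p1_bias"},
    "r2": {"pipe_leak", "pump_degradation"},
    "r3": {"sensor_p1_bias"},
}

def candidates(triggered):
    """Single faults consistent with the triggered residuals: a candidate
    must appear in every MSO set whose residual fired."""
    cands = set.union(*MSO_FAULTS.values())   # start from all known faults
    for residual in triggered:
        cands &= MSO_FAULTS[residual]
    return cands

# If r1 and r2 fire, only pipe_leak can explain both of them.
print(candidates({"r1", "r2"}))   # {'pipe_leak'}
```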

    A Qualitative Event-Based Approach to Continuous Systems Diagnosis


    Distributed Methods for Estimation and Fault Diagnosis: the case of Large-scale Networked Systems

    This thesis deals with the problem of the monitoring of modern complex systems.
The motivation is the renewed emphasis on monitoring and fault-tolerant systems: reliability is now a key requirement in the design of technical systems. While fault diagnosis architectures and estimation methods have been extensively studied for centralized systems, interest in distributed, networked, large-scale and complex systems, such as Cyber-Physical Systems and Systems-of-Systems, has grown in recent years. The increased complexity of modern systems calls for novel tools able to consider all the different aspects and levels constituting these systems. The monitored system is modeled as the interconnection of several subsystems, using a divide et impera approach that allows overlapping decomposition. The local diagnostic decision is made on the basis of the local subsystem dynamic model and an adaptive approximation of the uncertain interconnection with neighboring subsystems. The goal is to integrate all aspects of the monitoring process in a comprehensive architecture, taking into account the physical environment, the sensor layer, the diagnoser level, and the communication networks. In particular, specifically designed methods are developed to address the issues that emerge when dealing with communication networks and distributed systems. The introduction of the sensor layer, composed of a set of sensor networks, decouples the physical topology from the sensing/computation topology, improving the scalability and reliability of the diagnosis architecture. For the measurement acquisition task, we propose a distributed estimation method for sensor networks that filters measurements so that both the mean and the variance of the estimation error are minimized through a Pareto optimization problem. Moreover, with realistic applications in mind, we consider multi-rate systems and non-synchronized measurements: a re-synchronization method is proposed to manage multi-rate systems and to compensate for delays in the communication network between sensors and diagnosers. Since stochastic delays and packet dropouts are inevitable when dealing with distributed, large-scale, or networked systems, we propose a distributed delay compensation strategy for the communication network between diagnosers, based on the use of time stamps and buffers and the definition of a time-varying consensus matrix. The novel time-varying matrix serves two purposes: it manages communication delays, packet dropouts, and interrupted links, and it improves detectability and isolability by allowing less conservative thresholds. The distributed fault detection and isolation schemes are studied, and analytical results on fault detectability, isolability, and estimator convergence are derived. Simulation results show the effectiveness of the proposed architecture. For completeness, the monitoring architecture is adapted to different frameworks: the fault detection and isolation methodology is extended to continuous-time systems, and the case where the state is only partially measurable is considered for both discrete-time and continuous-time systems.
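A minimal Python sketch of the time-stamp-and-buffer delay compensation idea; the freshness window, weights, and values are assumptions, and the thesis's consensus matrix is matrix-valued and time-varying rather than this scalar average.

```python
def consensus_fuse(local_est, neighbor_buffer, now, max_age=0.5):
    """Fuse a local estimate with buffered, time-stamped neighbor estimates.

    Only sufficiently fresh messages are used; stale links effectively get
    zero weight and the remaining consensus weights are re-normalized,
    mimicking a time-varying consensus matrix.
    """
    fresh = [est for stamp, est in neighbor_buffer if now - stamp <= max_age]
    values = [local_est] + fresh
    weight = 1.0 / len(values)        # weights depend on the surviving links
    return sum(weight * v for v in values)

# Diagnoser 1 drops the stale message (stamped 0.2) and averages the rest.
buffer = [(0.9, 2.1), (0.2, 5.0)]     # (time stamp, neighbor estimate)
print(consensus_fuse(2.0, buffer, now=1.0))   # 2.05
```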

    A Structural Model Decomposition Framework for Systems Health Management

    Systems health management (SHM) is an important set of technologies aimed at increasing system safety and reliability by detecting, isolating, and identifying faults, and by predicting when the system reaches end of life (EOL), so that appropriate fault mitigation and recovery actions can be taken. Model-based SHM approaches typically make use of global, monolithic system models for online analysis, which results in a loss of scalability and efficiency for large-scale systems. Improvement in scalability and efficiency can be achieved by decomposing the system model into smaller local submodels and operating on these submodels instead. In this paper, the global system model is analyzed offline and structurally decomposed into local submodels. We define a common model decomposition framework for extracting submodels from the global model. This framework is then used to develop algorithms for solving model decomposition problems for the design of three separate SHM technologies, namely, estimation (which is useful for fault detection and identification), fault isolation, and EOL prediction. We solve these model decomposition problems using a three-tank system as a case study.
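    A minimal Python sketch of the submodel-extraction idea, assuming a causal three-tank structure in which each equation computes one variable and the shared level h2 is measured; the paper's framework operates on the full structural model rather than this toy assignment.

```python
# Hypothetical causal structure: equation -> (computed variable, used variables).
# 'u' is a known input; h2 is a shared state with a sensor reading h2_meas.
EQUATIONS = {
    "e1": ("h1", {"u", "h2"}),
    "e2": ("h2", {"h1", "h3"}),
    "e3": ("h3", {"h2"}),
}
MEASURED = {"u", "h2_meas"}

def submodel(target, cut=("h2", "h2_meas")):
    """Collect the equations needed to compute `target`, cutting the global
    model at the measured variable: h2 is read from its sensor, not from e2."""
    needed, stack = set(), [target]
    while stack:
        var = stack.pop()
        if var == cut[0]:
            continue                  # replaced by the measurement cut[1]
        for eq, (out, uses) in EQUATIONS.items():
            if out == var and eq not in needed:
                needed.add(eq)
                stack.extend(v for v in uses if v not in MEASURED)
    return needed

# Computing h1 needs only e1: the local submodel replaces the global model.
print(submodel("h1"))   # {'e1'}
```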

    An Integrated Framework for Model-Based Distributed Diagnosis and Prognosis

    Diagnosis and prognosis are necessary tasks for system reconfiguration and fault-adaptive control in complex systems. Diagnosis consists of the detection, isolation, and identification of faults, while prognosis consists of the prediction of the remaining useful life of the system. This paper presents a novel integrated framework for model-based distributed diagnosis and prognosis, where system decomposition is used to enable the diagnosis and prognosis tasks to be performed in a distributed way. We show how different submodels can be automatically constructed to solve the local diagnosis and prognosis problems. We illustrate our approach using a simulated four-wheeled rover for different fault scenarios. Our experiments show that our approach correctly performs distributed fault diagnosis and prognosis in an efficient and robust manner.
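    A minimal Python sketch of the local prognosis step, assuming a linear wear model and an arbitrary end-of-life threshold; the paper's predictions propagate full local submodels with estimated fault parameters.

```python
def predict_rul(wear, wear_rate, eol_threshold=1.0, dt=1.0, horizon=1e6):
    """Predict remaining useful life by propagating a local degradation
    submodel forward until it crosses the end-of-life (EOL) threshold."""
    t = 0.0
    while wear < eol_threshold and t < horizon:
        wear += wear_rate * dt   # one step of the (assumed linear) wear model
        t += dt
    return t if wear >= eol_threshold else float("inf")

# A local prognoser needs only the submodel of its own component: with 30%
# wear and a rate of 1% per time unit, EOL is reached in 70 time units.
print(predict_rul(wear=0.3, wear_rate=0.01))   # 70.0
```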

    FAST: a fault detection and identification software tool

    The aim of this work is to improve the reliability and safety of complex critical control systems by contributing to the systematic application of fault diagnosis. To ease the adoption of fault detection and isolation (FDI) tools in industry, a systematic approach is required that allows process engineers to analyze a system from this perspective: to determine whether the system provides the fault diagnosis and redundancy required by the process criticality, and to evaluate what-if scenarios by slightly modifying the process (e.g., adding sensors or changing their placement) and assessing the impact on the fault diagnosis and redundancy possibilities. Hence, this work proposes an approach for analyzing a process from the FDI perspective and provides the tool FAST, which covers everything from the analysis and design phase to the final implementation of an FDI supervisor on a real process. To synthesize the process information, a very simple XML-based format has been defined. This format provides the information needed to systematically perform the structural analysis of the process. Any process can be analyzed; the only restriction is that models of the process components must be available in the FAST tool. Processes are described in FAST in terms of process variables, components, and relations, and the tool performs the structural analysis of the process, obtaining: (i) the structural matrix, (ii) the perfect matching, (iii) the analytical redundancy relations (if any), and (iv) the fault signature matrix. To aid the analysis, FAST can operate standalone in simulation mode, allowing the process engineer to evaluate faults and their detectability and to implement changes in the process components and topology that improve the diagnosis and redundancy capabilities. Alternatively, FAST can operate online, connected to the process plant through an OPC interface. The OPC interface makes it possible to connect to almost any process that features a SCADA system for supervisory control. When running in online mode, the process is monitored by a software agent known as the Supervisor Agent. FAST also supports distributed FDI through its multi-agent architecture: the tool can partition complex industrial processes into subsystems, identify which process variables need to be shared by each subsystem, and instantiate a Supervision Agent for each partitioned subsystem. Once instantiated, the Supervision Agents diagnose their local components and handle requests for the variable values that FAST has identified as shared with other agents, supporting the distributed FDI process.
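    A minimal Python sketch of an XML process description and the first output of the structural analysis, the structural matrix; the tag names and the two-tank example are assumptions, not FAST's actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical process description: each relation links process variables.
PROCESS_XML = """
<process>
  <relation id="e1" variables="u h1"/>
  <relation id="e2" variables="h1 q12"/>
  <relation id="e3" variables="q12 h2"/>
</process>
"""

def structural_matrix(xml_text):
    """Return {relation id: incident variables}: a sparse structural matrix
    whose rows are relations and whose columns are process variables."""
    root = ET.fromstring(xml_text)
    return {rel.get("id"): set(rel.get("variables").split())
            for rel in root.iter("relation")}

# The perfect matching, the analytical redundancy relations, and the fault
# signature matrix would all be derived from this incidence structure.
print(structural_matrix(PROCESS_XML))
```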

    Real-time performance diagnosis and evaluation of big data systems in cloud datacenters

    Modern big data processing systems are becoming very complex in terms of large scale, high concurrency, and multi-tenancy. Many failures and performance degradations therefore only happen at run time and are very difficult to capture, and some issues may only be triggered when certain components are executed. To analyze the root cause of these types of issues, we have to capture the dependencies of each component in real time. Big data processing systems, such as Hadoop and Spark, usually work in large-scale, highly concurrent, and multi-tenant environments that can easily cause hardware and software malfunctions or failures, thereby leading to performance degradation. Several systems and methods exist to detect performance degradation in big data processing systems, perform root-cause analysis, and even overcome the issues causing such degradation. However, these solutions focus on specific problems such as stragglers and inefficient resource utilization; a generic and extensible framework to support the real-time diagnosis of big data systems is lacking. Performance diagnosis and prediction of big data systems are highly complex because these frameworks are typically deployed in cloud data centers that are large-scale, highly concurrent, and multi-tenant. Several factors, including hardware heterogeneity, stochastic networks, and application workloads, may impact the performance of big data systems, and the current state of the art does not sufficiently address the challenge of determining the complex, usually stochastic and hidden, relationships between these factors. To handle performance diagnosis and evaluation of big data systems in cloud environments, this thesis proposes multilateral research towards monitoring, performance diagnosis, and prediction in cloud-based large-scale distributed systems, combined into an effective and efficient deployment pipeline. The key contributions of this dissertation are:
    • Designing a real-time big data monitoring system called SmartMonit that efficiently collects runtime system information, including computing resource utilization and job execution information, and integrates the collected information with the execution graph modeled as a directed acyclic graph (DAG).
    • Developing AutoDiagn, an automated real-time diagnosis framework for big data systems that automatically detects performance degradation and inefficient resource utilization problems, providing online detection and semi-online root-cause analysis.
    • Designing a novel root-cause analysis technique/system called BigPerf for big data systems that analyzes and characterizes the performance of big data applications, incorporating Bayesian networks to determine uncertain and complex relationships between performance-related factors.
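    A minimal Python sketch of the straggler-detection flavor of diagnosis described above, with made-up stages and runtimes; SmartMonit and AutoDiagn operate on live execution graphs rather than this static table.

```python
from statistics import median

# Hypothetical job: stage -> list of (task id, runtime in seconds); the
# execution graph (a DAG) orders the stages.
STAGES = {
    "map":    [("m1", 10.2), ("m2", 9.8), ("m3", 31.5)],
    "reduce": [("r1", 5.1), ("r2", 5.3)],
}

def stragglers(stages, factor=1.5):
    """Flag tasks running `factor` times slower than their stage median,
    a common heuristic for detecting performance degradation."""
    flagged = []
    for stage, tasks in stages.items():
        med = median(runtime for _, runtime in tasks)
        flagged += [(stage, tid) for tid, runtime in tasks
                    if runtime > factor * med]
    return flagged

print(stragglers(STAGES))   # [('map', 'm3')]
```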