44 research outputs found

    Discrete and hybrid methods for the diagnosis of distributed systems

    Get PDF
    Many important activities of modern society rely on the proper functioning of complex systems such as electricity networks, telecommunication networks, manufacturing plants and aircrafts. The supervision of such systems must include strong diagnosis capability to be able to effectively detect the occurrence of faults and ensure appropriate corrective measures can be taken in order to recover from the faults or prevent total failure. This thesis addresses issues in the diagnosis of large complex systems. Such systems are usually distributed in nature, i.e. they consist of many interconnected components each having their own local behaviour. These components interact together to produce an emergent global behaviour that is complex. As those systems increase in complexity and size, their diagnosis becomes increasingly challenging. In the first part of this thesis, a method is proposed for diagnosis on distributed systems that avoids a monolithic global computation. The method, based on converting the graph of the system into a junction tree, takes into account the topology of the system in choosing how to merge local diagnoses on the components while still obtaining a globally consistent result. The method is shown to work well for systems with tree or near-tree structures. This method is further extended to handle systems with high clustering by selectively ignoring some connections that would still allow an accurate diagnosis to be obtained. A hybrid system approach is explored in the second part of the thesis, where continuous dynamics information on the system is also retained to help better isolate or identify faults. A hybrid system framework is presented that models both continuous dynamics and discrete evolution in dynamical systems, based on detecting changes in the fundamental governing dynamics of the system rather than on residual estimation. This makes it possible to handle systems that might not be well characterised and where parameter drift is present. The discrete aspect of the hybrid system model is used to derive diagnosability conditions using indicator functions for the detection and isolation of multiple, arbitrary sequential or simultaneous events in hybrid dynamical networks. Issues with diagnosis in the presence of uncertainty in measurements due sensor or actuator noise are addressed. Faults may generate symptoms that are in the same order of magnitude as the latter. The use of statistical techniques,within a hybrid system framework, is proposed to detect these elusive fault symptoms and translate this information into probabilities for the actual operational mode and possibility of transition between modes which makes it possible to apply probabilistic analysis on the system to handle the underlying uncertainty present

    Investigation of the robustness of star graph networks

    Full text link
    The star interconnection network has been known as an attractive alternative to n-cube for interconnecting a large number of processors. It possesses many nice properties, such as vertex/edge symmetry, recursiveness, sublogarithmic degree and diameter, and maximal fault tolerance, which are all desirable when building an interconnection topology for a parallel and distributed system. Investigation of the robustness of the star network architecture is essential since the star network has the potential of use in critical applications. In this study, three different reliability measures are proposed to investigate the robustness of the star network. First, a constrained two-terminal reliability measure referred to as Distance Reliability (DR) between the source node u and the destination node I with the shortest distance, in an n-dimensional star network, Sn, is introduced to assess the robustness of the star network. A combinatorial analysis on DR especially for u having a single cycle is performed under different failure models (node, link, combined node/link failure). Lower bounds on the special case of the DR: antipode reliability, are derived, compared with n-cube, and shown to be more fault-tolerant than n-cube. The degradation of a container in a Sn having at least one operational optimal path between u and I is also examined to measure the system effectiveness in the presence of failures under different failure models. The values of MTTF to each transition state are calculated and compared with similar size containers in n-cube. Meanwhile, an upper bound under the probability fault model and an approximation under the fixed partitioning approach on the ( n-1)-star reliability are derived, and proved to be similarly accurate and close to the simulations results. Conservative comparisons between similar size star networks and n-cubes show that the star network is more robust than n-cube in terms of ( n-1)-network reliability

    Replay Attack Detection Based on Parity Space Method for Cyber-Physical Systems

    Full text link
    The replay attack detection problem is studied from a new perspective based on parity space method in this paper. The proposed detection methods have the ability to distinguish system fault and replay attack, handle both input and output data replay, maintain certain control performance, and can be implemented conveniently and efficiently. First, the replay attack effect on the residual is derived and analyzed. The residual change induced by replay attack is characterized explicitly and the detection performance analysis based on two different test statistics are given. Second, based on the replay attack effect characterization, targeted passive and active design for detection performance enhancement are proposed. Regarding the passive design, four optimization schemes regarding different cost functions are proposed with optimal parity matrix solutions, and the unified solution to the passive optimization schemes is obtained; the active design is enabled by a marginally stable filter so as to enlarge the replay attack effect on the residual for detection. Simulations and comparison studies are given to show the effectiveness of the proposed methods

    Autonomous Recovery Of Reconfigurable Logic Devices Using Priority Escalation Of Slack

    Get PDF
    Field Programmable Gate Array (FPGA) devices offer a suitable platform for survivable hardware architectures in mission-critical systems. In this dissertation, active dynamic redundancy-based fault-handling techniques are proposed which exploit the dynamic partial reconfiguration capability of SRAM-based FPGAs. Self-adaptation is realized by employing reconfiguration in detection, diagnosis, and recovery phases. To extend these concepts to semiconductor aging and process variation in the deep submicron era, resilient adaptable processing systems are sought to maintain quality and throughput requirements despite the vulnerabilities of the underlying computational devices. A new approach to autonomous fault-handling which addresses these goals is developed using only a uniplex hardware arrangement. It operates by observing a health metric to achieve Fault Demotion using Recon- figurable Slack (FaDReS). Here an autonomous fault isolation scheme is employed which neither requires test vectors nor suspends the computational throughput, but instead observes the value of a health metric based on runtime input. The deterministic flow of the fault isolation scheme guarantees success in a bounded number of reconfigurations of the FPGA fabric. FaDReS is then extended to the Priority Using Resource Escalation (PURE) online redundancy scheme which considers fault-isolation latency and throughput trade-offs under a dynamic spare arrangement. While deep-submicron designs introduce new challenges, use of adaptive techniques are seen to provide several promising avenues for improving resilience. The scheme developed is demonstrated by hardware design of various signal processing circuits and their implementation on a Xilinx Virtex-4 FPGA device. These include a Discrete Cosine Transform (DCT) core, Motion Estimation (ME) engine, Finite Impulse Response (FIR) Filter, Support Vector Machine (SVM), and Advanced Encryption Standard (AES) blocks in addition to MCNC benchmark circuits. A iii significant reduction in power consumption is achieved ranging from 83% for low motion-activity scenes to 12.5% for high motion activity video scenes in a novel ME engine configuration. For a typical benchmark video sequence, PURE is shown to maintain a PSNR baseline near 32dB. The diagnosability, reconfiguration latency, and resource overhead of each approach is analyzed. Compared to previous alternatives, PURE maintains a PSNR within a difference of 4.02dB to 6.67dB from the fault-free baseline by escalating healthy resources to higher-priority signal processing functions. The results indicate the benefits of priority-aware resiliency over conventional redundancy approaches in terms of fault-recovery, power consumption, and resource-area requirements. Together, these provide a broad range of strategies to achieve autonomous recovery of reconfigurable logic devices under a variety of constraints, operating conditions, and optimization criteria

    Analytical models of a fault-tolerant multiple module microprocessor system

    Get PDF
    Imperial Users onl

    Real-Time Fault Diagnosis of Permanent Magnet Synchronous Motor and Drive System

    Get PDF
    Permanent Magnet Synchronous Motors (PMSMs) have gained massive popularity in industrial applications such as electric vehicles, robotic systems, and offshore industries due to their merits of efficiency, power density, and controllability. PMSMs working in such applications are constantly exposed to electrical, thermal, and mechanical stresses, resulting in different faults such as electrical, mechanical, and magnetic faults. These faults may lead to efficiency reduction, excessive heat, and even catastrophic system breakdown if not diagnosed in time. Therefore, developing methods for real-time condition monitoring and detection of faults at early stages can substantially lower maintenance costs, downtime of the system, and productivity loss. In this dissertation, condition monitoring and detection of the three most common faults in PMSMs and drive systems, namely inter-turn short circuit, demagnetization, and sensor faults are studied. First, modeling and detection of inter-turn short circuit fault is investigated by proposing one FEM-based model, and one analytical model. In these two models, efforts are made to extract either fault indicators or adjustments for being used in combination with more complex detection methods. Subsequently, a systematic fault diagnosis of PMSM and drive system containing multiple faults based on structural analysis is presented. After implementing structural analysis and obtaining the redundant part of the PMSM and drive system, several sequential residuals are designed and implemented based on the fault terms that appear in each of the redundant sets to detect and isolate the studied faults which are applied at different time intervals. Finally, real-time detection of faults in PMSMs and drive systems by using a powerful statistical signal-processing detector such as generalized likelihood ratio test is investigated. By using generalized likelihood ratio test, a threshold was obtained based on choosing the probability of a false alarm and the probability of detection for each detector based on which decision was made to indicate the presence of the studied faults. To improve the detection and recovery delay time, a recursive cumulative GLRT with an adaptive threshold algorithm is implemented. As a result, a more processed fault indicator is achieved by this recursive algorithm that is compared to an arbitrary threshold, and a decision is made in real-time performance. The experimental results show that the statistical detector is able to efficiently detect all the unexpected faults in the presence of unknown noise and without experiencing any false alarm, proving the effectiveness of this diagnostic approach.publishedVersio

    Fault-tolerant computer study

    Get PDF
    A set of building block circuits is described which can be used with commercially available microprocessors and memories to implement fault tolerant distributed computer systems. Each building block circuit is intended for VLSI implementation as a single chip. Several building blocks and associated processor and memory chips form a self checking computer module with self contained input output and interfaces to redundant communications buses. Fault tolerance is achieved by connecting self checking computer modules into a redundant network in which backup buses and computer modules are provided to circumvent failures. The requirements and design methodology which led to the definition of the building block circuits are discussed

    Big Data Analytics for Complex Systems

    Get PDF
    The evolution of technology in all fields led to the generation of vast amounts of data by modern systems. Using data to extract information, make predictions, and make decisions is the current trend in artificial intelligence. The advancement of big data analytics tools made accessing and storing data easier and faster than ever, and machine learning algorithms help to identify patterns in and extract information from data. The current tools and machines in health, computer technologies, and manufacturing can generate massive raw data about their products or samples. The author of this work proposes a modern integrative system that can utilize big data analytics, machine learning, super-computer resources, and industrial health machines’ measurements to build a smart system that can mimic the human intelligence skills of observations, detection, prediction, and decision-making. The applications of the proposed smart systems are included as case studies to highlight the contributions of each system. The first contribution is the ability to utilize big data revolutionary and deep learning technologies on production lines to diagnose incidents and take proper action. In the current digital transformational industrial era, Industry 4.0 has been receiving researcher attention because it can be used to automate production-line decisions. Reconfigurable manufacturing systems (RMS) have been widely used to reduce the setup cost of restructuring production lines. However, the current RMS modules are not linked to the cloud for online decision-making to take the proper decision; these modules must connect to an online server (super-computer) that has big data analytics and machine learning capabilities. The online means that data is centralized on cloud (supercomputer) and accessible in real-time. In this study, deep neural networks are utilized to detect the decisive features of a product and build a prediction model in which the iFactory will make the necessary decision for the defective products. The Spark ecosystem is used to manage the access, processing, and storing of the big data streaming. This contribution is implemented as a closed cycle, which for the best of our knowledge, no one in the literature has introduced big data analysis using deep learning on real-time applications in the manufacturing system. The code shows a high accuracy of 97% for classifying the normal versus defective items. The second contribution, which is in Bioinformatics, is the ability to build supervised machine learning approaches based on the gene expression of patients to predict proper treatment for breast cancer. In the trial, to personalize treatment, the machine learns the genes that are active in the patient cohort with a five-year survival period. The initial condition here is that each group must only undergo one specific treatment. After learning about each group (or class), the machine can personalize the treatment of a new patient by diagnosing the patients’ gene expression. The proposed model will help in the diagnosis and treatment of the patient. The future work in this area involves building a protein-protein interaction network with the selected genes for each treatment to first analyze the motives of the genes and target them with the proper drug molecules. In the learning phase, a couple of feature-selection techniques and supervised standard classifiers are used to build the prediction model. Most of the nodes show a high-performance measurement where accuracy, sensitivity, specificity, and F-measure ranges around 100%. The third contribution is the ability to build semi-supervised learning for the breast cancer survival treatment that advances the second contribution. By understanding the relations between the classes, we can design the machine learning phase based on the similarities between classes. In the proposed research, the researcher used the Euclidean matrix distance among each survival treatment class to build the hierarchical learning model. The distance information that is learned through a non-supervised approach can help the prediction model to select the classes that are away from each other to maximize the distance between classes and gain wider class groups. The performance measurement of this approach shows a slight improvement from the second model. However, this model reduced the number of discriminative genes from 47 to 37. The model in the second contribution studies each class individually while this model focuses on the relationships between the classes and uses this information in the learning phase. Hierarchical clustering is completed to draw the borders between groups of classes before building the classification models. Several distance measurements are tested to identify the best linkages between classes. Most of the nodes show a high-performance measurement where accuracy, sensitivity, specificity, and F-measure ranges from 90% to 100%. All the case study models showed high-performance measurements in the prediction phase. These modern models can be replicated for different problems within different domains. The comprehensive models of the newer technologies are reconfigurable and modular; any newer learning phase can be plugged-in at both ends of the learning phase. Therefore, the output of the system can be an input for another learning system, and a newer feature can be added to the input to be considered for the learning phase

    Fifth Conference on Artificial Intelligence for Space Applications

    Get PDF
    The Fifth Conference on Artificial Intelligence for Space Applications brings together diverse technical and scientific work in order to help those who employ AI methods in space applications to identify common goals and to address issues of general interest in the AI community. Topics include the following: automation for Space Station; intelligent control, testing, and fault diagnosis; robotics and vision; planning and scheduling; simulation, modeling, and tutoring; development tools and automatic programming; knowledge representation and acquisition; and knowledge base/data base integration
    corecore