23 research outputs found

    Fault-tolerant Hamiltonian cycle strategy for fast node fault diagnosis based on PMC in data center networks

    Get PDF
    System-level fault diagnosis model, namely, the PMC model, detects fault nodes only through the mutual testing of nodes in the system without physical equipment. In order to achieve server nodes fault diagnosis in large-scale data center networks (DCNs), the traditional algorithm based on the PMC model cannot meet the characteristics of high diagnosability, high accuracy and high efficiency due to its inability to ensure that the test nodes are fault-free. This paper first proposed a fault-tolerant Hamiltonian cycle fault diagnosis (FHFD) algorithm, which tests nodes in the order of the Hamiltonian cycle to ensure that the test nodes are faultless. In order to improve testing efficiency, a hierarchical diagnosis mechanism was further proposed, which recursively divides high scale structures into a large number of low scale structures based on the recursive structure characteristics of DCNs. Additionally, we proved that 2(n−2)nk−1 2(n-2){n^{k-1}} and (n−2)tn,k/tn,1 (n-2){t_{n, k}}/{t_{n, 1}} faulty nodes could be detected for BCuben,k BCub{e_{n, k}} and DCelln,k DCel{l_{n, k}} within a limited time for the proposed diagnosis strategy. Simulation experiments have also shown that our proposed strategy has improved the diagnosability and test efficiency dramatically

    Sequence-Oriented Diagnosis of Discrete-Event Systems

    Get PDF
    Model-based diagnosis has always been conceived as set-oriented, meaning that a candidate is a set of faults, or faulty components, that explains a collection of observations. This perspective applies equally to both static and dynamical systems. Diagnosis of discrete-event systems (DESs) is no exception: a candidate is traditionally a set of faults, or faulty events, occurring in a trajectory of the DES that conforms with a given sequence of observations. As such, a candidate does not embed any temporal relationship among faults, nor does it account for multiple occurrences of the same fault. To improve diagnostic explanation and support decision making, a sequence-oriented perspective to diagnosis of DESs is presented, where a candidate is a sequence of faults occurring in a trajectory of the DES, called a fault sequence. Since a fault sequence is possibly unbounded, as the same fault may occur an unlimited number of times in the trajectory, the set of (output) candidates may be unbounded also, which contrasts with set-oriented diagnosis, where the set of candidates is bounded by the powerset of the domain of faults. Still, a possibly unbounded set of fault sequences is shown to be a regular language, which can be defined by a regular expression over the domain of faults, a property that makes sequence-oriented diagnosis feasible in practice. The task of monitoring-based diagnosis is considered, where a new candidate set is generated at the occurrence of each observation. The approach is based on three different techniques: (1) blind diagnosis, with no compiled knowledge, (2) greedy diagnosis, with total knowledge compilation, and (3) lazy diagnosis, with partial knowledge compilation. By knowledge we mean a data structure slightly similar to a classical DES diagnoser, which can be generated (compiled) either entirely offline (greedy diagnosis) or incrementally online (lazy diagnosis). Experimental evidence suggests that, among these techniques, only lazy diagnosis may be viable in non-trivial application domains

    Local majorities, coalitions and monopolies in graphs: a review

    Get PDF
    AbstractThis paper provides an overview of recent developments concerning the process of local majority voting in graphs, and its basic properties, from graph theoretic and algorithmic standpoints

    Robust Sensor Fusion Algorithms: Calibration and Cost Minimization.

    Get PDF
    A system reacting to its environment requires sensor input to model the environment. Unfortunately, sensors are electromechanical devices subject to physical limitations. It is challenging for a system to robustly evaluate sensor data which is of questionable accuracy and dependability. Sensor fusion addresses this problem by taking inputs from several sensors and merging the individual sensor readings into a single logical reading. The use of heterogeneous physical sensors allows a logical sensor to be less sensitive to the limitations of any single sensor technology, and the use of multiple identical sensors allows the system to tolerate failures of some of its component physical sensors. These are examples of fault masking, or N-modular redundancy. This research resolves two problems of fault masking systems: the automatic calibration of systems which return partially redundant image data is problematic, and the cost incurred by installing redundant system components can be prohibitive. Both are presented in mathematical terms as optimization problems. To combine inputs from multiple independent sensors, readings must be registered to a common coordinate system. This problem is complex when functions equating the readings are not known a priori. It is even more difficult in the case of sensor readings, where data contains noise and may have a sizable periodic component. A practical method must find a near optimal answer in the presence of large amounts of noise. The first part of this research derives a computational scheme capable of registering partially overlapping noisy sensor readings. Another problem with redundant systems is the cost incurred by redundancy. The trade-off between reliability and system cost is most evident in fault-tolerant systems. Given several component types with known dependability statistics, it is possible to determine the combinations of components which fulfill dependability constraints by modeling the system using Markov chains. When unit costs are known, it is desirable to use low cost combinations of components to fulfill the reliability constraints. The second part of this dissertation develops a methodology for designing sensor systems, with redundant components, which satisfy dependability constraints at near minimal cost. Open problems are also listed

    Bitwise Simulation of the Controller Area Network

    Get PDF
    Computer Scienc

    Fault-tolerant software: dependability/performance trade-offs, concurrency and system support

    Get PDF
    PhD ThesisAs the use of computer systems becomes more and more widespread in applications that demand high levels of dependability, these applications themselves are growing in complexity in a rapid rate, especially in the areas that require concurrent and distributed computing. Such complex systems are very prone to faults and errors. No matter how rigorously fault avoidance and fault removal techniques are applied, software design faults often remain in systems when they are delivered to the customers. In fact, residual software faults are becoming the significant underlying cause of system failures and the lack of dependability. There is tremendous need for systematic techniques for building dependable software, including the fault tolerance techniques that ensure software-based systems to operate dependably even when potential faults are present. However, although there has been a large amount of research in the area of fault-tolerant software, existing techniques are not yet sufficiently mature as a practical engineering discipline for realistic applications. In particular, they are often inadequate when applied to highly concurrent and distributed software. This thesis develops new techniques for building fault-tolerant software, addresses the problem of achieving high levels of dependability in concurrent and distributed object systems, and studies system-level support for implementing dependable software. Two schemes are developed - the t/(n-l)-VP approach is aimed at increasing software reliability and controlling additional complexity, while the SCOP approach presents an adaptive way of dynamically adjusting software reliability and efficiency aspects. As a more general framework for constructing dependable concurrent and distributed software, the Coordinated Atomic (CA) Action scheme is examined thoroughly. Key properties of CA actions are formalized, conceptual model and mechanisms for handling application level exceptions are devised, and object-based diversity techniques are introduced to cope with potential software faults. These three schemes are evaluated analytically and validated by controlled experiments. System-level support is also addressed with a multi-level system architecture. An architectural pattern for implementing fault-tolerant objects is documented in detail to capture existing solutions and our previous experience. An industrial safety-critical application, the Fault-Tolerant Production Cell, is used as a case study to examine most of the concepts and techniques developed in this research.ESPRIT

    Ernst Denert Award for Software Engineering 2020

    Get PDF
    This open access book provides an overview of the dissertations of the eleven nominees for the Ernst Denert Award for Software Engineering in 2020. The prize, kindly sponsored by the Gerlind & Ernst Denert Stiftung, is awarded for excellent work within the discipline of Software Engineering, which includes methods, tools and procedures for better and efficient development of high quality software. An essential requirement for the nominated work is its applicability and usability in industrial practice. The book contains eleven papers that describe the works by Jonathan Brachthäuser (EPFL Lausanne) entitled What You See Is What You Get: Practical Effect Handlers in Capability-Passing Style, Mojdeh Golagha’s (Fortiss, Munich) thesis How to Effectively Reduce Failure Analysis Time?, Nikolay Harutyunyan’s (FAU Erlangen-Nürnberg) work on Open Source Software Governance, Dominic Henze’s (TU Munich) research about Dynamically Scalable Fog Architectures, Anne Hess’s (Fraunhofer IESE, Kaiserslautern) work on Crossing Disciplinary Borders to Improve Requirements Communication, Istvan Koren’s (RWTH Aachen U) thesis DevOpsUse: A Community-Oriented Methodology for Societal Software Engineering, Yannic Noller’s (NU Singapore) work on Hybrid Differential Software Testing, Dominic Steinhofel’s (TU Darmstadt) thesis entitled Ever Change a Running System: Structured Software Reengineering Using Automatically Proven-Correct Transformation Rules, Peter Wägemann’s (FAU Erlangen-Nürnberg) work Static Worst-Case Analyses and Their Validation Techniques for Safety-Critical Systems, Michael von Wenckstern’s (RWTH Aachen U) research on Improving the Model-Based Systems Engineering Process, and Franz Zieris’s (FU Berlin) thesis on Understanding How Pair Programming Actually Works in Industry: Mechanisms, Patterns, and Dynamics – which actually won the award. The chapters describe key findings of the respective works, show their relevance and applicability to practice and industrial software engineering projects, and provide additional information and findings that have only been discovered afterwards, e.g. when applying the results in industry. This way, the book is not only interesting to other researchers, but also to industrial software professionals who would like to learn about the application of state-of-the-art methods in their daily work
    corecore