27,616 research outputs found

    DeSyRe: on-Demand System Reliability

    No full text
    The DeSyRe project builds on-demand adaptive and reliable Systems-on-Chips (SoCs). As fabrication technology scales down, chips are becoming less reliable, thereby incurring increased power and performance costs for fault tolerance. To make matters worse, power density is becoming a significant limiting factor in SoC design, in general. In the face of such changes in the technological landscape, current solutions for fault tolerance are expected to introduce excessive overheads in future systems. Moreover, attempting to design and manufacture a totally defect and fault-free system, would impact heavily, even prohibitively, the design, manufacturing, and testing costs, as well as the system performance and power consumption. In this context, DeSyRe delivers a new generation of systems that are reliable by design at well-balanced power, performance, and design costs. In our attempt to reduce the overheads of fault-tolerance, only a small fraction of the chip is built to be fault-free. This fault-free part is then employed to manage the remaining fault-prone resources of the SoC. The DeSyRe framework is applied to two medical systems with high safety requirements (measured using the IEC 61508 functional safety standard) and tight power and performance constraints

    Review of recent research towards power cable life cycle management

    Get PDF
    Power cables are integral to modern urban power transmission and distribution systems. For power cable asset managers worldwide, a major challenge is how to manage effectively the expensive and vast network of cables, many of which are approaching, or have past, their design life. This study provides an in-depth review of recent research and development in cable failure analysis, condition monitoring and diagnosis, life assessment methods, fault location, and optimisation of maintenance and replacement strategies. These topics are essential to cable life cycle management (LCM), which aims to maximise the operational value of cable assets and is now being implemented in many power utility companies. The review expands on material presented at the 2015 JiCable conference and incorporates other recent publications. The review concludes that the full potential of cable condition monitoring, condition and life assessment has not fully realised. It is proposed that a combination of physics-based life modelling and statistical approaches, giving consideration to practical condition monitoring results and insulation response to in-service stress factors and short term stresses, such as water ingress, mechanical damage and imperfections left from manufacturing and installation processes, will be key to success in improved LCM of the vast amount of cable assets around the world

    A synthesis of logic and bio-inspired techniques in the design of dependable systems

    Get PDF
    Much of the development of model-based design and dependability analysis in the design of dependable systems, including software intensive systems, can be attributed to the application of advances in formal logic and its application to fault forecasting and verification of systems. In parallel, work on bio-inspired technologies has shown potential for the evolutionary design of engineering systems via automated exploration of potentially large design spaces. We have not yet seen the emergence of a design paradigm that effectively combines these two techniques, schematically founded on the two pillars of formal logic and biology, from the early stages of, and throughout, the design lifecycle. Such a design paradigm would apply these techniques synergistically and systematically to enable optimal refinement of new designs which can be driven effectively by dependability requirements. The paper sketches such a model-centric paradigm for the design of dependable systems, presented in the scope of the HiP-HOPS tool and technique, that brings these technologies together to realise their combined potential benefits. The paper begins by identifying current challenges in model-based safety assessment and then overviews the use of meta-heuristics at various stages of the design lifecycle covering topics that span from allocation of dependability requirements, through dependability analysis, to multi-objective optimisation of system architectures and maintenance schedules

    Postselection threshold against biased noise

    Get PDF
    The highest current estimates for the amount of noise a quantum computer can tolerate are based on fault-tolerance schemes relying heavily on postselecting on no detected errors. However, there has been no proof that these schemes give even a positive tolerable noise threshold. A technique to prove a positive threshold, for probabilistic noise models, is presented. The main idea is to maintain strong control over the distribution of errors in the quantum state at all times. This distribution has correlations which conceivably could grow out of control with postselection. But in fact, the error distribution can be written as a mixture of nearby distributions each satisfying strong independence properties, so there are no correlations for postselection to amplify.Comment: 13 pages, FOCS 2006; conference versio

    Unattended network operations technology assessment study. Technical support for defining advanced satellite systems concepts

    Get PDF
    The results are summarized of an unattended network operations technology assessment study for the Space Exploration Initiative (SEI). The scope of the work included: (1) identified possible enhancements due to the proposed Mars communications network; (2) identified network operations on Mars; (3) performed a technology assessment of possible supporting technologies based on current and future approaches to network operations; and (4) developed a plan for the testing and development of these technologies. The most important results obtained are as follows: (1) addition of a third Mars Relay Satellite (MRS) and MRS cross link capabilities will enhance the network's fault tolerance capabilities through improved connectivity; (2) network functions can be divided into the six basic ISO network functional groups; (3) distributed artificial intelligence technologies will augment more traditional network management technologies to form the technological infrastructure of a virtually unattended network; and (4) a great effort is required to bring the current network technology levels for manned space communications up to the level needed for an automated fault tolerance Mars communications network

    Combined Time and Information Redundancy for SEU-Tolerance in Energy-Efficient Real-Time Systems

    No full text
    Recently the trade-off between energy consumption and fault-tolerance in real-time systems has been highlighted. These works have focused on dynamic voltage scaling (DVS) to reduce dynamic energy dissipation and on time redundancy to achieve transient-fault tolerance. While the time redundancy technique exploits the available slack time to increase the fault-tolerance by performing recovery executions, DVS exploits slack time to save energy. Therefore we believe there is a resource conflict between the time-redundancy technique and DVS. The first aim of this paper is to propose the usage of information redundancy to solve this problem. We demonstrate through analytical and experimental studies that it is possible to achieve both higher transient fault-tolerance (tolerance to single event upsets (SEU)) and less energy using a combination of information and time redundancy when compared with using time redundancy alone. The second aim of this paper is to analyze the interplay of transient-fault tolerance (SEU-tolerance) and adaptive body biasing (ABB) used to reduce static leakage energy, which has not been addressed in previous studies. We show that the same technique (i.e. the combination of time and information redundancy) is applicable to ABB-enabled systems and provides more advantages than time redundancy alone

    Study of fault-tolerant software technology

    Get PDF
    Presented is an overview of the current state of the art of fault-tolerant software and an analysis of quantitative techniques and models developed to assess its impact. It examines research efforts as well as experience gained from commercial application of these techniques. The paper also addresses the computer architecture and design implications on hardware, operating systems and programming languages (including Ada) of using fault-tolerant software in real-time aerospace applications. It concludes that fault-tolerant software has progressed beyond the pure research state. The paper also finds that, although not perfectly matched, newer architectural and language capabilities provide many of the notations and functions needed to effectively and efficiently implement software fault-tolerance
    • …
    corecore