2,747 research outputs found

    X-Rel: Energy-Efficient and Low-Overhead Approximate Reliability Framework for Error-Tolerant Applications Deployed in Critical Systems

    Triple Modular Redundancy (TMR) is one of the most common techniques in fault-tolerant systems, in which the output is determined by a majority voter. However, the design diversity of replicated modules and/or soft errors, which are more likely in the nanoscale era, may affect the majority voting scheme. Besides, the significant overheads of the TMR scheme may limit its usage in energy- and area-constrained critical systems. For most inherently error-resilient applications, such as image processing and vision deployed in critical systems (like autonomous vehicles and robotics), achieving a given level of reliability takes priority over precise results. Therefore, these applications can benefit from the approximate computing paradigm to achieve higher energy efficiency and a lower area. This paper proposes an energy-efficient approximate reliability (X-Rel) framework to overcome the aforementioned challenges of TMR systems and realize the full potential of approximate computing without sacrificing the desired reliability constraint and output quality. The X-Rel framework relies on relaxing the precision of the voter based on a systematic error bounding method that leverages user-defined quality and reliability constraints. Afterward, the size of the achieved voter is used to approximate the TMR modules such that the overall area and energy consumption are minimized. The effectiveness of employing the proposed X-Rel technique in a TMR structure, for different quality constraints as well as various reliability bounds, is evaluated in a 15-nm FinFET technology. The results of the X-Rel voter show delay, area, and energy consumption reductions of up to 86%, 87%, and 98%, respectively, when compared to those of the state-of-the-art approximate TMR voters. Comment: This paper has been published in IEEE Transactions on Very Large Scale Integration (VLSI) Systems
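
    To make the voter idea concrete, here is a minimal sketch of an exact bitwise majority voter next to a reduced-precision one. The reduced-precision scheme (voting only on the most significant bits and passing the rest through from one module) and the names `approximate_vote`, `width`, and `voted_bits` are assumptions chosen for illustration; this is not the X-Rel voter or its error bounding method.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Exact bitwise majority of three module outputs."""
    return (a & b) | (b & c) | (a & c)


def approximate_vote(a: int, b: int, c: int, width: int = 8, voted_bits: int = 4) -> int:
    """Vote only on the `voted_bits` most significant bits (hypothetical scheme).

    The low-order bits are taken from module `a` without voting, so a fault
    confined to them changes the result by at most 2**(width - voted_bits) - 1.
    """
    low = width - voted_bits
    high_mask = ((1 << voted_bits) - 1) << low  # bits that are still voted on
    low_mask = (1 << low) - 1                   # bits passed through unvoted
    return (majority_vote(a, b, c) & high_mask) | (a & low_mask)
```

    With the default parameters, a fault in the upper four bits of any single module is still masked, while a fault in the lower four bits of module `a` can shift the output by at most 15. Trading that bounded error for a narrower voter is the general kind of precision relaxation the abstract refers to.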

    Mitigating Silent Data Corruptions In Integer Matrix Products: Toward Reliable Multimedia Computing On Unreliable Hardware

    The generic matrix multiply (GEMM) routine comprises the compute- and memory-intensive part of many information retrieval, machine learning and object recognition systems that process integer inputs. Therefore, it is of paramount importance to ensure that integer GEMM computations remain robust to silent data corruptions (SDCs), which stem from accidental voltage or frequency overscaling, or other hardware non-idealities. In this paper, we introduce a new method for SDC mitigation based on the concept of numerical packing. The key difference between our approach and all existing methods is the production of redundant results within the numerical representation of the outputs, rather than as a separate set of checksums. Importantly, unlike well-known algorithm-based fault tolerance (ABFT) approaches for GEMM, the proposed approach can reliably detect the locations of the vast majority of all possible SDCs in the results of GEMM computations. An experimental investigation of voltage-scaled integer GEMM computations for visual descriptor matching within state-of-the-art image and video retrieval algorithms running on an Intel i7-4578U 3GHz processor shows that SDC mitigation based on numerical packing leads to comparable or lower execution and energy-consumption overhead in comparison to all other alternatives.
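
    For context, the checksum-based ABFT baseline that the abstract contrasts against can be sketched as follows. This is the classical row/column checksum idea, not the numerical packing method proposed in the paper; the function names `matmul` and `abft_locate` are hypothetical, and the sketch assumes at most one corrupted entry in the product.

```python
def matmul(a, b):
    """Plain integer matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def abft_locate(a, b, c):
    """Return (row, col) of a single corrupted entry in c, or None if c checks out.

    Row sums of A @ B equal A @ (row sums of B), and column sums equal
    (column sums of A) @ B, so both checksums are computed without trusting c.
    """
    b_row_sums = [sum(row) for row in b]
    a_col_sums = [sum(col) for col in zip(*a)]
    expected_row = [sum(x * s for x, s in zip(row, b_row_sums)) for row in a]
    expected_col = [sum(s * x for s, x in zip(a_col_sums, col)) for col in zip(*b)]
    bad_rows = [i for i, row in enumerate(c) if sum(row) != expected_row[i]]
    bad_cols = [j for j, col in enumerate(zip(*c)) if sum(col) != expected_col[j]]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    raise ValueError("more than one corrupted entry; cannot localize")
```

    A single corrupted output shows up as exactly one mismatching row checksum and one mismatching column checksum, whose intersection gives its position. Several simultaneous corruptions can make localization ambiguous in this simple scheme, which the sketch reports as an error rather than guessing.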

    Preliminary Candidate Advanced Avionics System (PCAAS)

    Specifications which define the system functional requirements, the subsystem and interface needs, and other requirements such as maintainability, modularity, and reliability are summarized. A design definition of all required avionics functions and a system risk analysis are presented

    Health management design considerations for an all electric aircraft

    This paper explains the on-board IVHM system for a state-of-the-art “all electric aircraft” and explores practices for analysis-based design, illustration, and development of IVHM capabilities. Implemented as an on-board system, it will carry out fault detection and isolation, recommend maintenance actions, and provide prognostic capabilities to highlight possible problems before they become critical. The vehicle Condition Based Maintenance (CBM) and adaptive control algorithm development is based on an open architecture system that allows various systems to “plug in and plug off” in a more efficient and flexible way. The scope of the IVHM design included consideration of data collection and communication from the continuous monitoring of aircraft systems, observation of current system states, and processing of this data to support proper maintenance and repair actions. Legacy commercial platforms and HM applications for various subsystems of these aircraft were identified. The list of possible applications was down-selected to a reduced number that offers the highest value using a QFD matrix based on the cost-benefit analysis. Requirements, designs and system architectures were developed for these applications. The application areas considered included engine, tires and brakes, pneumatics and air conditioning, generator, and structures. The IVHM design program included identification of application sensors, functions and interfaces; the IVHM system architecture; descriptions of certification requirements and approaches; the results of a cost/benefit analysis; and recommended standards and technology gaps. The work concluded with observations on the nature of HM, the technologies, and the approaches and challenges to its integration into the current avionics, support system and business infrastructure. The IVHM design for the All Electric Hybrid Wing Body (HWB) aircraft has the challenging task of addressing and resolving the shortfalls in the legacy IVHM framework.
Challenges such as sensor battery maintenance, handling big data from SHM, on-ground data transfer by light, extraction of required features at sensor nodes/RDCUs, ECAM/EICAS interfaces, and certification of the wireless SHM network are addressed in this paper. Automatic deployable flight data recorders, in which critical flight parameters are recorded, are used in the design of the HWB aircraft. The component selection for the IVHM system, including software and hardware, is based on COTS technology. The design emphasizes high levels of reliability and maintainability. The above systems are employed using IMA and integrated on an AFDX data bus. The design activities have to pass through design reviews on a systematic basis, and the overall approach has been to make the system lighter, effective, “all weather” compatible, and modular. It is concluded from the study of advancement in IVHM capabilities and new service offerings that IVHM technology is emerging as well as challenging. With the inclusion of adaptive control, vehicle condition based maintenance and pilot fatigue monitoring, IVHM has evolved into a more proactively involved on-board system.

    Fault-tolerant computation using algebraic homomorphisms

    Also issued as Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1992. Includes bibliographical references (p. 193-196). Supported by the Defense Advanced Research Projects Agency, monitored by the U.S. Navy Office of Naval Research. N00014-89-J-1489. Supported by the Charles S. Draper Laboratories. DL-H-418472. Paul E. Beckmann

    On Fault Tolerance Methods for Networks-on-Chip

    Technology scaling has proceeded into dimensions in which the reliability of manufactured devices is becoming endangered. The reliability decrease is a consequence of physical limitations, relative increase of variations, and decreasing noise margins, among others. A promising solution for bringing the reliability of circuits back to a desired level is the use of design methods which introduce tolerance against possible faults in an integrated circuit. This thesis studies and presents fault tolerance methods for network-on-chip (NoC), which is a design paradigm targeted for very large systems-on-chip. In a NoC, resources such as processors and memories are connected to a communication network, comparable to the Internet. Fault tolerance in such a system can be achieved at many abstraction levels. The thesis studies the origin of faults in modern technologies and explains their classification into transient, intermittent and permanent faults. A survey of fault tolerance methods is presented to demonstrate the diversity of available methods. Networks-on-chip are approached by exploring their main design choices: the selection of a topology, routing protocol, and flow control method. Fault tolerance methods for NoCs are studied at different layers of the OSI reference model. The data link layer provides a reliable communication link over a physical channel. Error control coding is an efficient fault tolerance method, especially against transient faults, at this abstraction level. Error control coding methods suitable for on-chip communication are studied and their implementations presented. Error control coding loses its effectiveness in the presence of intermittent and permanent faults. Therefore, other solutions against them are presented. The introduction of spare wires and split transmissions is shown to provide good tolerance against intermittent and permanent errors, and their combination with error control coding is illustrated.
At the network layer, positioned above the data link layer, fault tolerance can be achieved with the design of fault-tolerant network topologies and routing algorithms. Both of these approaches are presented in the thesis together with realizations in both categories. The thesis concludes that an optimal fault tolerance solution contains carefully co-designed elements from different abstraction levels.
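
    As an illustration of error control coding at the data link layer, the sketch below implements a textbook Hamming(7,4) code, which corrects any single flipped bit in a 7-bit word. The choice of Hamming(7,4) and the function names are assumptions made for this example; the abstract does not say which codes the thesis studies.

```python
def hamming74_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(w):
    """Correct up to one flipped bit and return the 4 data bits."""
    w = list(w)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s3 = w[3] ^ w[4] ^ w[5] ^ w[6]
    pos = s1 + 2 * s2 + 4 * s3  # 1-based position of the flipped bit, 0 if none
    if pos:
        w[pos - 1] ^= 1
    return [w[2], w[4], w[5], w[6]]
```

    A transient fault that flips one wire for one transfer is corrected transparently. A permanently stuck wire produces an error on a large share of transfers and uses up the code's whole correction capability, leaving none for transient faults, which is consistent with the abstract's point that coding alone loses effectiveness against intermittent and permanent faults.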

    Custom Integrated Circuits

    Contains reports on nine research projects. Analog Devices, Inc.; International Business Machines Corporation; Joint Services Electronics Program Contract DAAL03-89-C-0001; U.S. Air Force - Office of Scientific Research Contract AFOSR 86-0164B; DuPont Corporation; National Science Foundation Grant MIP 88-14612; U.S. Navy - Office of Naval Research Contract N00014-87-K-0825; American Telephone and Telegraph; Digital Equipment Corporation; National Science Foundation Grant MIP 88-5876