    Catastrophic Faults in Reconfigurable Linear Arrays of Processors

    In regular architectures of identical processing elements, a widely used technique to improve the reconfigurability of the system is to provide redundant processing elements together with reconfiguration mechanisms. In this paper we consider linear arrays of processing elements with unidirectional bypass links of length g, and we count particular sets of faulty processing elements, namely the catastrophic ones. We show that the number of catastrophic sets of g faults is equal to the (g-1)-th Catalan number. We also provide algorithms to rank and unrank all catastrophic sets of g faults. Finally, we describe a linear-time algorithm that generates all such sets of faults.
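    As a quick illustration of the count stated above, the following minimal sketch (Python, my own illustration rather than code from the paper) evaluates the (g-1)-th Catalan number via the standard closed form C_k = binom(2k, k)/(k+1); the paper's rank/unrank and generation algorithms are not reproduced here.

```python
from math import comb

def catalan(k: int) -> int:
    """k-th Catalan number, C_k = binom(2k, k) // (k + 1)."""
    return comb(2 * k, k) // (k + 1)

# Per the abstract, a linear array with unidirectional bypass links of
# length g has exactly C_{g-1} catastrophic fault sets of cardinality g.
for g in range(1, 9):
    print(f"g = {g}: {catalan(g - 1)} catastrophic sets of {g} faults")
```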

    Efficient reconfigurable techniques for VLSI arrays with 6-port switches

    This paper proposes efficient techniques to reconfigure a two-dimensional degradable very large scale integration/wafer scale integration (VLSI/WSI) array under the row and column routing constraints, a problem that has been shown to be NP-complete. The proposed VLSI/WSI array consists of identical processing elements, such as processors or memory cells, embedded in a 6-port switch lattice in the form of a rectangular grid. It is shown that the proposed VLSI structure with 6-port switches eliminates the need to incorporate an internal bypass within processing elements and leads to a notable increase in harvest compared with the structure using 4-port switches. A new greedy rerouting algorithm and compensation approaches are also proposed to maximize harvest through reconfiguration. Experimental results show that the proposed VLSI array with 6-port switches consistently outperforms the most efficient alternative proposed in the literature at maximizing harvest in the presence of faulty processing elements.
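    To give intuition for greedy rerouting in degradable arrays, here is a hedged sketch of a generic greedy column-rerouting heuristic: logical columns are built left to right, and in each row the leftmost unused fault-free PE within a small column shift of the previous row's pick is chosen. This illustrates the general idea only, not the paper's 6-port-switch algorithm, and the shift bound `max_shift` is an assumed routing constraint.

```python
def greedy_column_reroute(fault_free, max_shift=1):
    """fault_free[r][c] is True when the PE at physical position (r, c) works.
    Returns a list of logical columns, each a list of (row, col) assignments."""
    rows, cols = len(fault_free), len(fault_free[0])
    used = [[False] * cols for _ in range(rows)]
    logical_columns = []
    while True:
        column, prev_c = [], None
        for r in range(rows):
            pick = None
            for c in range(cols):
                if (fault_free[r][c] and not used[r][c]
                        and (prev_c is None or abs(c - prev_c) <= max_shift)):
                    pick = c
                    break
            if pick is None:              # row r cannot be covered: stop here
                return logical_columns
            column.append((r, pick))
            prev_c = pick
        for r, c in column:               # commit the completed logical column
            used[r][c] = True
        logical_columns.append(column)

# False marks a faulty PE; the heuristic harvests three logical columns here.
grid = [[True, True, False, True],
        [True, False, True, True],
        [True, True, True, False]]
print(greedy_column_reroute(grid))
```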

    On Fault Tolerance Methods for Networks-on-Chip

    Technology scaling has proceeded into dimensions in which the reliability of manufactured devices is becoming endangered. The reliability decrease is a consequence of physical limitations, the relative increase of variations, and decreasing noise margins, among others. A promising solution for bringing the reliability of circuits back to a desired level is the use of design methods which introduce tolerance against possible faults in an integrated circuit. This thesis studies and presents fault tolerance methods for the network-on-chip (NoC), a design paradigm targeted at very large systems-on-chip. In a NoC, resources such as processors and memories are connected to a communication network, comparable to the Internet. Fault tolerance in such a system can be achieved at many abstraction levels. The thesis studies the origin of faults in modern technologies and explains their classification into transient, intermittent, and permanent faults. A survey of fault tolerance methods is presented to demonstrate the diversity of available methods. Networks-on-chip are approached by exploring their main design choices: the selection of a topology, routing protocol, and flow control method. Fault tolerance methods for NoCs are studied at different layers of the OSI reference model. The data link layer provides a reliable communication link over a physical channel; error control coding is an efficient fault tolerance method at this abstraction level, especially against transient faults. Error control coding methods suitable for on-chip communication are studied and their implementations presented. Error control coding loses its effectiveness in the presence of intermittent and permanent faults, so other solutions against them are presented: the introduction of spare wires and split transmissions is shown to provide good tolerance against intermittent and permanent errors, and their combination with error control coding is illustrated. At the network layer, positioned above the data link layer, fault tolerance can be achieved through the design of fault-tolerant network topologies and routing algorithms. Both of these approaches are presented in the thesis, together with realizations in both categories. The thesis concludes that an optimal fault tolerance solution contains carefully co-designed elements from different abstraction levels.
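    To make the data-link-layer point concrete, the sketch below applies a standard Hamming(7,4) single-error-correcting code to a 4-bit flit slice, the kind of error control coding that can mask a transient bit flip on a link. It is an illustrative example of the technique, not the specific coders implemented in the thesis.

```python
def hamming74_encode(d):
    """d: list of 4 data bits -> 7 code bits ordered [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(c):
    """c: 7 received bits -> (corrected 4 data bits, error position or 0)."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]   # parity over positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]   # parity over positions 2, 3, 6, 7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]   # parity over positions 4, 5, 6, 7
    syndrome = s1 + 2 * s2 + 4 * s3  # non-zero syndrome = 1-based error position
    if syndrome:
        c[syndrome - 1] ^= 1         # correct the single flipped bit
    return [c[2], c[4], c[5], c[6]], syndrome

# A transient fault flips one wire of the link; the decoder recovers the flit.
sent = hamming74_encode([1, 0, 1, 1])
received = sent.copy()
received[5] ^= 1
data, err = hamming74_decode(received)
assert data == [1, 0, 1, 1] and err == 6
```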

    Memory built-in self-repair and correction for improving yield: a review

    Nanometer memories are highly prone to defects due to their dense structure, making memory built-in self-repair a must-have feature for improving yield. Today's systems-on-chip contain memories occupying as much as 90% of the chip area. Shrinking technology uses stricter design rules for memories, making them more prone to manufacturing defects. Further, the use of 3D-stacked memories makes the system vulnerable to newer defects such as those originating in through-silicon vias (TSVs) and micro-bumps. The increased memory size is also resulting in an increase in soft errors during system operation. Multiple memory repair techniques based on redundancy and correction codes have been presented to recover from such defects and prevent system failures. This paper reviews recently published memory repair methodologies, including various built-in self-repair (BISR) architectures, repair analysis algorithms, in-system repair, and soft error handling using error correcting codes (ECC). It provides a classification of these techniques based on method and usage. Finally, it reviews the evaluation methods used to determine the effectiveness of the repair algorithms. The paper aims to present a survey of these methodologies and to prepare a platform for developing repair methods for upcoming generations of memories.
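    As a minimal illustration of redundancy-based repair analysis, the sketch below allocates spare rows and columns for a faulty memory block using a must-repair pass followed by a greedy pass. It is a textbook-style heuristic written for intuition, not one of the published BISR repair-analysis algorithms reviewed in the paper.

```python
def analyze_repair(faults, spare_rows, spare_cols):
    """faults: set of (row, col) faulty-cell addresses.
    Returns (rows_to_replace, cols_to_replace) or None if unrepairable."""
    faults = set(faults)
    repair_rows, repair_cols = set(), set()

    # Must-repair pass: a row with more faults than the spare columns still
    # available can only be saved by a spare row (and symmetrically).
    changed = True
    while changed:
        changed = False
        for r in {r for r, _ in faults}:
            if sum(rr == r for rr, _ in faults) > spare_cols - len(repair_cols):
                if len(repair_rows) == spare_rows:
                    return None
                repair_rows.add(r)
                faults = {f for f in faults if f[0] != r}
                changed = True
        for c in {c for _, c in faults}:
            if sum(cc == c for _, cc in faults) > spare_rows - len(repair_rows):
                if len(repair_cols) == spare_cols:
                    return None
                repair_cols.add(c)
                faults = {f for f in faults if f[1] != c}
                changed = True

    # Greedy pass: repair whichever remaining row or column covers the most
    # faulty cells, as long as the matching kind of spare is still available.
    while faults:
        rows_left = spare_rows - len(repair_rows)
        cols_left = spare_cols - len(repair_cols)
        if rows_left == 0 and cols_left == 0:
            return None
        best_row = max({r for r, _ in faults},
                       key=lambda r: sum(rr == r for rr, _ in faults))
        best_col = max({c for _, c in faults},
                       key=lambda c: sum(cc == c for _, cc in faults))
        row_hits = sum(rr == best_row for rr, _ in faults)
        col_hits = sum(cc == best_col for _, cc in faults)
        if rows_left and (row_hits >= col_hits or cols_left == 0):
            repair_rows.add(best_row)
            faults = {f for f in faults if f[0] != best_row}
        else:
            repair_cols.add(best_col)
            faults = {f for f in faults if f[1] != best_col}
    return repair_rows, repair_cols

# Example: a block with 2 spare rows, 2 spare columns, and five faulty cells.
print(analyze_repair({(0, 0), (0, 3), (0, 5), (2, 1), (4, 1)},
                     spare_rows=2, spare_cols=2))
```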

    Reconfigurable architecture for very large scale microelectronic systems


    Control and reliability of optical networks in multiprocessors

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1993. Includes bibliographical references (leaves 138-142). By James Jonathan Olsen.

    Fault Secure Encoder and Decoder for NanoMemory Applications

    Memory cells have been protected from soft errors for more than a decade; however, due to the increase in the soft error rate of logic circuits, the encoder and decoder circuitry around the memory blocks has become susceptible to soft errors as well and must also be protected. We introduce a new approach to design fault-secure encoder and decoder circuitry for memory designs. The key novel contribution of this paper is identifying and defining a new class of error-correcting codes whose redundancy makes the design of fault-secure detectors (FSD) particularly simple. We further quantify the importance of protecting encoder and decoder circuitry against transient errors, illustrating a scenario where the system failure rate (FIT) is dominated by the failure rate of the encoder and decoder. We prove that Euclidean geometry low-density parity-check (EG-LDPC) codes have the fault-secure detector capability. Using some of the smaller EG-LDPC codes, we can tolerate bit or nanowire defect rates of 10% and fault rates of 10^(-18) upsets/device/cycle, achieving a FIT rate at or below one for the entire memory system and a memory density of 10^(11) bit/cm^2 with a nanowire pitch of 10 nm for memory blocks of 10 Mb or larger. Larger EG-LDPC codes can achieve even higher reliability and lower area overhead.
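    The heart of a fault-secure detector is a syndrome check over GF(2): a retrieved word is accepted only when it satisfies every parity check of the code. The sketch below shows that check with a toy (7,4) parity-check matrix chosen purely for illustration; the EG-LDPC matrices analyzed in the paper are not reconstructed here.

```python
# Toy parity-check matrix H for a (7,4) code: rows are parity checks,
# columns index the stored codeword bits (not an EG-LDPC matrix).
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def syndrome(word, checks=H):
    """One GF(2) check bit per row of the parity-check matrix."""
    return [sum(h * w for h, w in zip(row, word)) % 2 for row in checks]

def detector_flags_error(word):
    """A fault-secure detector accepts the word only if every check is zero."""
    return any(syndrome(word))

codeword = [0, 1, 1, 0, 0, 1, 1]   # a valid codeword of the toy code
assert not detector_flags_error(codeword)

corrupted = codeword.copy()
corrupted[2] ^= 1                   # a single upset in a memory cell or nanowire
assert detector_flags_error(corrupted)
```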