3,519 research outputs found

    Flexible Spare Core Placement in Torus Topology based NoCs and its validation on an FPGA

    Get PDF
    In the nano-scale era, Network-on-Chip (NoC) interconnection paradigm has gained importance to abide by the communication challenges in Chip Multi-Processors (CMPs). With increased integration density on CMPs, NoC components namely cores, routers, and links are susceptible to failures. Therefore, to improve system reliability, there is a need for efficient fault-tolerant techniques that mitigate permanent faults in NoC based CMPs. There exists several fault-tolerant techniques that address the permanent faults in application cores while placing the spare cores onto NoC topologies. However, these techniques are limited to Mesh topology based NoCs. There are few approaches that have realized the fault-tolerant solutions on an FPGA, but the study on architectural aspects of NoC is limited. This paper presents the flexible placement of spare core onto Torus topology-based NoC design by considering core faults and validating it on an FPGA. In the first phase, a mathematical formulation based on Integer Linear Programming (ILP) and meta-heuristic based Particle Swarm Optimization (PSO) have been proposed for the placement of spare core. In the second phase, we have implemented NoC router addressing scheme, routing algorithm, run-time fault injection model, and fault-tolerant placement of spare core onto Torus topology using an FPGA. Experiments have been done by taking different multimedia and synthetic application benchmarks. This has been done in both static and dynamic simulation environments followed by hardware implementation. In the static simulation environment, the experimentations are carried out by scaling the network size and router faults in the network. The results obtained from our approach outperform the methods such as Fault-tolerant Spare Core Mapping (FSCM), Simulated Annealing (SA), and Genetic Algorithm (GA) proposed in the literature. For the experiments carried out by scaling the network size, our proposed methodology shows an average improvement of 18.83%, 4.55%, 12.12% in communication cost over the approaches FSCM, SA, and GA, respectively. For the experiments carried out by scaling the router faults in the network, our approach shows an improvement of 34.27%, 26.26%, and 30.41% over the approaches FSCM, SA, and GA, respectively. For the dynamic simulations, our approach shows an average improvement of 5.67%, 0.44%, and 3.69%, over the approaches FSCM, SA, and GA, respectively. In the hardware implementation, our approach shows an average improvement of 5.38%, 7.45%, 27.10% in terms of application runtime over the approaches SA, GA, and FSCM, respectively. This shows the superiority of the proposed approach over the approaches presented in the literature.publishedVersio

    Validation of a software dependability tool via fault injection experiments

    Get PDF
    Presents the validation of the strategies employed in the RECCO tool to analyze a C/C++ software; the RECCO compiler scans C/C++ source code to extract information about the significance of the variables that populate the program and the code structure itself. Experimental results gathered on an Open Source Router are used to compare and correlate two sets of critical variables, one obtained by fault injection experiments, and the other applying the RECCO tool, respectively. Then the two sets are analyzed, compared, and correlated to prove the effectiveness of RECCO's methodology

    The Chameleon Architecture for Streaming DSP Applications

    Get PDF
    We focus on architectures for streaming DSP applications such as wireless baseband processing and image processing. We aim at a single generic architecture that is capable of dealing with different DSP applications. This architecture has to be energy efficient and fault tolerant. We introduce a heterogeneous tiled architecture and present the details of a domain-specific reconfigurable tile processor called Montium. This reconfigurable processor has a small footprint (1.8 mm2^2 in a 130 nm process), is power efficient and exploits the locality of reference principle. Reconfiguring the device is very fast, for example, loading the coefficients for a 200 tap FIR filter is done within 80 clock cycles. The tiles on the tiled architecture are connected to a Network-on-Chip (NoC) via a network interface (NI). Two NoCs have been developed: a packet-switched and a circuit-switched version. Both provide two types of services: guaranteed throughput (GT) and best effort (BE). For both NoCs estimates of power consumption are presented. The NI synchronizes data transfers, configures and starts/stops the tile processor. For dynamically mapping applications onto the tiled architecture, we introduce a run-time mapping tool

    On Fault Resilient Network-on-Chip for Many Core Systems

    Get PDF
    Rapid scaling of transistor gate sizes has increased the density of on-chip integration and paved the way for heterogeneous many-core systems-on-chip, significantly improving the speed of on-chip processing. The design of the interconnection network of these complex systems is a challenging one and the network-on-chip (NoC) is now the accepted scalable and bandwidth efficient interconnect for multi-processor systems on-chip (MPSoCs). However, the performance enhancements of technology scaling come at the cost of reliability as on-chip components particularly the network-on-chip become increasingly prone to faults. In this thesis, we focus on approaches to deal with the errors caused by such faults. The results of these approaches are obtained not only via time-consuming cycle-accurate simulations but also by analytical approaches, allowing for faster and accurate evaluations, especially for larger networks. Redundancy is the general approach to deal with faults, the mode of which varies according to the type of fault. For the NoC, there exists a classification of faults into transient, intermittent and permanent faults. Transient faults appear randomly for a few cycles and may be caused by the radiation of particles. Intermittent faults are similar to transient faults, however, differing in the fact that they occur repeatedly at the same location, eventually leading to a permanent fault. Permanent faults by definition are caused by wires and transistors being permanently short or open. Generally, spatial redundancy or the use of redundant components is used for dealing with permanent faults. Temporal redundancy deals with failures by re-execution or by retransmission of data while information redundancy adds redundant information to the data packets allowing for error detection and correction. Temporal and information redundancy methods are useful when dealing with transient and intermittent faults. In this dissertation, we begin with permanent faults in NoC in the form of faulty links and routers. Our approach for spatial redundancy adds redundant links in the diagonal direction to the standard rectangular mesh topology resulting in the hexagonal and octagonal NoCs. In addition to redundant links, adaptive routing must be used to bypass faulty components. We develop novel fault-tolerant deadlock-free adaptive routing algorithms for these topologies based on the turn model without the use of virtual channels. Our results show that the hexagonal and octagonal NoCs can tolerate all 2-router and 3-router faults, respectively, while the mesh has been shown to tolerate all 1-router faults. To simplify the restricted-turn selection process for achieving deadlock freedom, we devised an approach based on the channel dependency matrix instead of the state-of-the-art Duato's method of observing the channel dependency graph for cycles. The approach is general and can be used for the turn selection process for any regular topology. We further use algebraic manipulations of the channel dependency matrix to analytically assess the fault resilience of the adaptive routing algorithms when affected by permanent faults. We present and validate this method for the 2D mesh and hexagonal NoC topologies achieving very high accuracy with a maximum error of 1%. The approach is very general and allows for faster evaluations as compared to the generally used cycle-accurate simulations. In comparison, existing works usually assume a limited number of faults to be able to analytically assess the network reliability. We apply the approach to evaluate the fault resilience of larger NoCs demonstrating the usefulness of the approach especially compared to cycle-accurate simulations. Finally, we concentrate on temporal and information redundancy techniques to deal with transient and intermittent faults in the router resulting in the dropping and hence loss of packets. Temporal redundancy is applied in the form of ARQ and retransmission of lost packets. Information redundancy is applied by the generation and transmission of redundant linear combinations of packets known as random linear network coding. We develop an analytic model for flexible evaluation of these approaches to determine the network performance parameters such as residual error rates and increased network load. The analytic model allows to evaluate larger NoCs and different topologies and to investigate the advantage of network coding compared to uncoded transmissions. We further extend the work with a small insight to the problem of secure communication over the NoC. Assuming large heterogeneous MPSoCs with components from third parties, the communication is subject to active attacks in the form of packet modification and drops in the NoC routers. Devising approaches to resolve these issues, we again formulate analytic models for their flexible and accurate evaluations, with a maximum estimation error of 7%

    Interconnection Networks for Scalable Quantum Computers

    Full text link
    We show that the problem of communication in a quantum computer reduces to constructing reliable quantum channels by distributing high-fidelity EPR pairs. We develop analytical models of the latency, bandwidth, error rate and resource utilization of such channels, and show that 100s of qubits must be distributed to accommodate a single data communication. Next, we show that a grid of teleportation nodes forms a good substrate on which to distribute EPR pairs. We also explore the control requirements for such a network. Finally, we propose a specific routing architecture and simulate the communication patterns of the Quantum Fourier Transform to demonstrate the impact of resource contention.Comment: To appear in International Symposium on Computer Architecture 2006 (ISCA 2006
    corecore