2,648 research outputs found

    Fault-tolerant irregular topology design method for network-on-chips

    Get PDF
    As the technology sizes of integrated circuits (ICs) scale down rapidly, current transistor densities on chips dramatically increase. While nanometer feature sizes allow denser chip designs in each technology generation, fabricated ICs become more susceptible to wear-outs, causing operation failure. Even a single link failure within an on-chip fabric can halt communication between application blocks, which makes the entire chip useless. In this study, we aim to make faulty chips designed with Network-on-Chip (NoC) communication usable. Specifically, we present a fault-tolerant irregular topology generation method for application specific NoC designs. Designed NoC topology allows a different routing path if there is a link failure on the default routing. We compare fault-tolerant topologies with regular fault-tolerant ring topologies, and non-fault-tolerant application specific irregular topologies on energy consumption, performance, and area using multimedia benchmarks and custom-generated graphs. © 2014 IEEE

    Fault-Tolerant Topology Generation Method for Application-Specific Network-on-Chips

    Get PDF
    As the technology sizes of integrated circuits (ICs) scale down rapidly, current transistor densities on chips dramatically increase. While nanometer feature sizes allow denser chip designs in each technology generation, fabricated ICs become more susceptible to wear-outs, causing operation failure. Even a single link failure within an on-chip fabric can halt communication between application blocks, which makes the entire chip useless. In this paper, we aim to make faulty chips designed with network-on-chip (NoC) communication usable. Specifically, we present fault-tolerant irregular topology-generation method for application-specific NoC designs. Designed NoC topology allows different routing path if there is a link failure on the default routing path. Additionally, we present a simulated annealing-based application mapping algorithm aiming to minimize total energy consumption of the NoC design. We compare fault-tolerant topologies with nonfault-tolerant application-specific irregular topologies on energy consumption, performance, and area using multimedia benchmarks and custom-generated graphs. Our results demonstrate that our method is able to determine fault-tolerant topologies with negligible area increase and better energy values. © 1982-2012 IEEE

    Fault-Tolerant Application-Specific Topology based NoC and its Prototype on an FPGA

    Get PDF
    Application-Specific Networks-on-Chips (ASNoCs) are suitable communication platforms for meeting current application requirements. Interconnection links are the primary components involved in communication between the cores of an ASNoC design. The integration density in ASNoC increases with continuous scaling down of the transistor size. Excessive integration density in ASNoC can result in the formation of thermal hotspots, which can cause a system to fail permanently. As a result, fault-tolerant techniques are required to address the permanent faults in interconnection links of an ASNoC design. By taking into account link faults in the topology, this paper introduces a fault-tolerant application-specific topology-based NoC design and its prototype on an FPGA. To place spare links in the ASNoC topology, a meta-heuristic algorithm based on Particle Swarm Optimization (PSO) is proposed. By taking link faults into account in ASNoC design, we also propose an application mapping heuristic and a table-based fault-tolerant routing algorithm. Experiments are carried out for a specific link and any link fault in fault-tolerant topologies generated by our approach and approaches reported in the literature. For the experimentation, we used the multi-media applications Picture-in-Picture (PiP), Moving Pictures Expert Group (MPEG) - 4, MP3Encoder, and Video Object Plane Decoder (VOPD). Experiments are run on software and hardware platforms. The static performance metric communication cost and the dynamic performance metrics network latency, throughput, and router power consumption are examined using software platform. In the hardware platform, the Field Programmable Gate Array (FPGA) is used to validate proposed fault-tolerant topologies and analyze performance metrics such as application runtime, resource utilization, and power consumption. The results are compared with the existing approaches, specifically Ring topology and its modified versions on both software and hardware platforms. The experimental results obtained from software and hardware platforms for a specific link and any link fault show significant improvements in performance metrics using our approach when compared with the related works in the literature.publishedVersio

    Network-on-Chip

    Get PDF
    Limitations of bus-based interconnections related to scalability, latency, bandwidth, and power consumption for supporting the related huge number of on-chip resources result in a communication bottleneck. These challenges can be efficiently addressed with the implementation of a network-on-chip (NoC) system. This book gives a detailed analysis of various on-chip communication architectures and covers different areas of NoCs such as potentials, architecture, technical challenges, optimization, design explorations, and research directions. In addition, it discusses current and future trends that could make an impactful and meaningful contribution to the research and design of on-chip communications and NoC systems

    On Fault Resilient Network-on-Chip for Many Core Systems

    Get PDF
    Rapid scaling of transistor gate sizes has increased the density of on-chip integration and paved the way for heterogeneous many-core systems-on-chip, significantly improving the speed of on-chip processing. The design of the interconnection network of these complex systems is a challenging one and the network-on-chip (NoC) is now the accepted scalable and bandwidth efficient interconnect for multi-processor systems on-chip (MPSoCs). However, the performance enhancements of technology scaling come at the cost of reliability as on-chip components particularly the network-on-chip become increasingly prone to faults. In this thesis, we focus on approaches to deal with the errors caused by such faults. The results of these approaches are obtained not only via time-consuming cycle-accurate simulations but also by analytical approaches, allowing for faster and accurate evaluations, especially for larger networks. Redundancy is the general approach to deal with faults, the mode of which varies according to the type of fault. For the NoC, there exists a classification of faults into transient, intermittent and permanent faults. Transient faults appear randomly for a few cycles and may be caused by the radiation of particles. Intermittent faults are similar to transient faults, however, differing in the fact that they occur repeatedly at the same location, eventually leading to a permanent fault. Permanent faults by definition are caused by wires and transistors being permanently short or open. Generally, spatial redundancy or the use of redundant components is used for dealing with permanent faults. Temporal redundancy deals with failures by re-execution or by retransmission of data while information redundancy adds redundant information to the data packets allowing for error detection and correction. Temporal and information redundancy methods are useful when dealing with transient and intermittent faults. In this dissertation, we begin with permanent faults in NoC in the form of faulty links and routers. Our approach for spatial redundancy adds redundant links in the diagonal direction to the standard rectangular mesh topology resulting in the hexagonal and octagonal NoCs. In addition to redundant links, adaptive routing must be used to bypass faulty components. We develop novel fault-tolerant deadlock-free adaptive routing algorithms for these topologies based on the turn model without the use of virtual channels. Our results show that the hexagonal and octagonal NoCs can tolerate all 2-router and 3-router faults, respectively, while the mesh has been shown to tolerate all 1-router faults. To simplify the restricted-turn selection process for achieving deadlock freedom, we devised an approach based on the channel dependency matrix instead of the state-of-the-art Duato's method of observing the channel dependency graph for cycles. The approach is general and can be used for the turn selection process for any regular topology. We further use algebraic manipulations of the channel dependency matrix to analytically assess the fault resilience of the adaptive routing algorithms when affected by permanent faults. We present and validate this method for the 2D mesh and hexagonal NoC topologies achieving very high accuracy with a maximum error of 1%. The approach is very general and allows for faster evaluations as compared to the generally used cycle-accurate simulations. In comparison, existing works usually assume a limited number of faults to be able to analytically assess the network reliability. We apply the approach to evaluate the fault resilience of larger NoCs demonstrating the usefulness of the approach especially compared to cycle-accurate simulations. Finally, we concentrate on temporal and information redundancy techniques to deal with transient and intermittent faults in the router resulting in the dropping and hence loss of packets. Temporal redundancy is applied in the form of ARQ and retransmission of lost packets. Information redundancy is applied by the generation and transmission of redundant linear combinations of packets known as random linear network coding. We develop an analytic model for flexible evaluation of these approaches to determine the network performance parameters such as residual error rates and increased network load. The analytic model allows to evaluate larger NoCs and different topologies and to investigate the advantage of network coding compared to uncoded transmissions. We further extend the work with a small insight to the problem of secure communication over the NoC. Assuming large heterogeneous MPSoCs with components from third parties, the communication is subject to active attacks in the form of packet modification and drops in the NoC routers. Devising approaches to resolve these issues, we again formulate analytic models for their flexible and accurate evaluations, with a maximum estimation error of 7%

    Classification of networks-on-chip in the context of analysis of promising self-organizing routing algorithms

    Full text link
    This paper contains a detailed analysis of the current state of the network-on-chip (NoC) research field, based on which the authors propose the new NoC classification that is more complete in comparison with previous ones. The state of the domain associated with wireless NoC is investigated, as the transition to these NoCs reduces latency. There is an assumption that routing algorithms from classical network theory may demonstrate high performance. So, in this article, the possibility of the usage of self-organizing algorithms in a wireless NoC is also provided. This approach has a lot of advantages described in the paper. The results of the research can be useful for developers and NoC manufacturers as specific recommendations, algorithms, programs, and models for the organization of the production and technological process.Comment: 10 p., 5 fig. Oral presentation on APSSE 2021 conferenc

    Health Condition Monitoring and Fault-Tolerant Operation of Adjustable Speed Drives

    Get PDF
    Adjustable speed drives (ASDs) have been extensively used in industrial applications over the past few decades because of their benefits of energy saving and control flexibilities. However, the wider penetration of ASD systems into industrial applications is hindered by the lack of health monitoring and fault-tolerant operation techniques, especially in safety-critical applications. In this dissertation, a comprehensive portfolio of health condition monitoring and fault-tolerant operation strategies is developed and implemented for multilevel neutral-point-clamped (NPC) power converters in ASDs. Simulations and experiments show that these techniques can improve power cycling lifetime of power transistors, on-line diagnosis of switch faults, and fault-tolerant capabilities.The first contribution of this dissertation is the development of a lifetime improvement Pulse Width Modulation (PWM) method which can significantly extend the power cycling lifetime of Insulated Gate Bipolar Transistors (IGBTs) in NPC inverters operating at low frequencies. This PWM method is achieved by injecting a zero-sequence signal with a frequency higher than that of the IGBT junction-to-case thermal time constants. This, in turn, lowers IGBT junction temperatures at low output frequencies. Thermal models, simulation and experimental verifications are carried out to confirm the effectiveness of this PWM method. As a second contribution of this dissertation, a novel on-line diagnostic method is developed for electronic switch faults in power converters. Targeted at three-level NPC converters, this diagnostic method can diagnose any IGBT faults by utilizing the information on the dc-bus neutral-point current and switching states. This diagnostic method only requires one additional current sensor for sensing the neutral-point current. Simulation and experimental results verified the efficacy of this diagnostic method.The third contribution consists of the development and implementation of a fault-tolerant topology for T-Type NPC power converters. In this fault-tolerant topology, one additional phase leg is added to the original T-Type NPC converter. In addition to providing a fault-tolerant solution to certain switch faults in the converter, this fault-tolerant topology can share the overload current with the original phase legs, thus increasing the overload capabilities of the power converters. A lab-scale 30-kVA ASD based on this proposed topology is implemented and the experimental results verified its benefits
    corecore