Search CORE

2,635 research outputs found

DeSyRe: on-Demand System Reliability

Author: Armato Antonino
Bouganis Christos-Savvas
Falsafi Babak
Gaydadjiev Georgi
Isaza Sebastian
Malek Alirad
Mariani Riccardo
Pnevmatikatos Dionisios N
Pradhan Dhiraj K
Rauwerda Gerard
Seepers Robert
Shafik Rishad Ahmed
Sourdis Ioannis
Strydis Christos
Sunesen Kim
Theodoropoulos Dimitris
Tzilis Stavros
Vavouras Michail
Publication venue: 'Elsevier BV'
Publication date: 01/01/2013
Field of study

The DeSyRe project builds on-demand adaptive and reliable Systems-on-Chips (SoCs). As fabrication technology scales down, chips are becoming less reliable, thereby incurring increased power and performance costs for fault tolerance. To make matters worse, power density is becoming a significant limiting factor in SoC design, in general. In the face of such changes in the technological landscape, current solutions for fault tolerance are expected to introduce excessive overheads in future systems. Moreover, attempting to design and manufacture a totally defect and fault-free system, would impact heavily, even prohibitively, the design, manufacturing, and testing costs, as well as the system performance and power consumption. In this context, DeSyRe delivers a new generation of systems that are reliable by design at well-balanced power, performance, and design costs. In our attempt to reduce the overheads of fault-tolerance, only a small fraction of the chip is built to be fault-free. This fault-free part is then employed to manage the remaining fault-prone resources of the SoC. The DeSyRe framework is applied to two medical systems with high safety requirements (measured using the IEC 61508 functional safety standard) and tight power and performance constraints

Southampton (e-Prints Soton)

EUR Research Repository

Chalmers Research

Chalmers Publication Library

Explore Bristol Research

Reliability-aware and energy-efficient system level design for networks-on-chip

Author: Zou Yong
Publication venue: Colorado State University. Libraries
Publication date: 01/01/2015
Field of study

2015 Spring.Includes bibliographical references.With CMOS technology aggressively scaling into the ultra-deep sub-micron (UDSM) regime and application complexity growing rapidly in recent years, processors today are being driven to integrate multiple cores on a chip. Such chip multiprocessor (CMP) architectures offer unprecedented levels of computing performance for highly parallel emerging applications in the era of digital convergence. However, a major challenge facing the designers of these emerging multicore architectures is the increased likelihood of failure due to the rise in transient, permanent, and intermittent faults caused by a variety of factors that are becoming more and more prevalent with technology scaling. On-chip interconnect architectures are particularly susceptible to faults that can corrupt transmitted data or prevent it from reaching its destination. Reliability concerns in UDSM nodes have in part contributed to the shift from traditional bus-based communication fabrics to network-on-chip (NoC) architectures that provide better scalability, performance, and utilization than buses. In this thesis, to overcome potential faults in NoCs, my research began by exploring fault-tolerant routing algorithms. Under the constraint of deadlock freedom, we make use of the inherent redundancy in NoCs due to multiple paths between packet sources and sinks and propose different fault-tolerant routing schemes to achieve much better fault tolerance capabilities than possible with traditional routing schemes. The proposed schemes also use replication opportunistically to optimize the balance between energy overhead and arrival rate. As 3D integrated circuit (3D-IC) technology with wafer-to-wafer bonding has been recently proposed as a promising candidate for future CMPs, we also propose a fault-tolerant routing scheme for 3D NoCs which outperforms the existing popular routing schemes in terms of energy consumption, performance and reliability. To quantify reliability and provide different levels of intelligent protection, for the first time, we propose the network vulnerability factor (NVF) metric to characterize the vulnerability of NoC components to faults. NVF determines the probabilities that faults in NoC components manifest as errors in the final program output of the CMP system. With NVF aware partial protection for NoC components, almost 50% energy cost can be saved compared to the traditional approach of comprehensively protecting all NoC components. Lastly, we focus on the problem of fault-tolerant NoC design, that involves many NP-hard sub-problems such as core mapping, fault-tolerant routing, and fault-tolerant router configuration. We propose a novel design-time (RESYN) and a hybrid design and runtime (HEFT) synthesis framework to trade-off energy consumption and reliability in the NoC fabric at the system level for CMPs. Together, our research in fault-tolerant NoC routing, reliability modeling, and reliability aware NoC synthesis substantially enhances NoC reliability and energy-efficiency beyond what is possible with traditional approaches and state-of-the-art strategies from prior work

Mountain Scholar (Digital Collections of Colorado and Wyoming)

Reinforcement Learning based Fault-Tolerant Routing Algorithm for Mesh based NoC and its FPGA Implementation

Author: Cenkermaddi Linga Reddy
Joshi Soumya
Samala Jagadheesh
Veda Bhanu
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2022
Field of study

Network-on-Chip (NoC) has emerged as the most promising on-chip interconnection framework in Multi-Processor System-on-Chips (MPSoCs) due to its efficiency and scalability. In the deep submicron level, NoCs are vulnerable to faults, which leads to the failure of network components such as links and routers. Failures in NoC components diminish system efficiency and reliability. This paper proposes a Reinforcement Learning based Fault-Tolerant Routing (RL-FTR) algorithm to tackle the routing issues caused by link and router faults in the mesh-based NoC architecture. The efficiency of the proposed RL-FTR algorithm is examined using System-C based cycle-accurate NoC simulator. Simulations are carried out by increasing the number of links and router faults in various sizes of mesh. Followed by simulations, real-time functioning of the proposed RL-FTR algorithm is observed using the FPGA implementation. Results of the simulation and hardware shows that the proposed RL-FTR algorithm provides an optimal routing path from the source router to the destination router.publishedVersio

Agder University Research Archive

On the Potential of NoC Virtualization for Multicore Chips

Author
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008
Field of study

Crossref

Fault-Tolerant Application-Specific Topology based NoC and its Prototype on an FPGA

Author: Bhanu P. Veda
Cenkeramaddi Linga Reddy
J Soumya
Kumar Rajat
Rahul Govindan
Singh Vishal
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2021
Field of study

Application-Specific Networks-on-Chips (ASNoCs) are suitable communication platforms for meeting current application requirements. Interconnection links are the primary components involved in communication between the cores of an ASNoC design. The integration density in ASNoC increases with continuous scaling down of the transistor size. Excessive integration density in ASNoC can result in the formation of thermal hotspots, which can cause a system to fail permanently. As a result, fault-tolerant techniques are required to address the permanent faults in interconnection links of an ASNoC design. By taking into account link faults in the topology, this paper introduces a fault-tolerant application-specific topology-based NoC design and its prototype on an FPGA. To place spare links in the ASNoC topology, a meta-heuristic algorithm based on Particle Swarm Optimization (PSO) is proposed. By taking link faults into account in ASNoC design, we also propose an application mapping heuristic and a table-based fault-tolerant routing algorithm. Experiments are carried out for a specific link and any link fault in fault-tolerant topologies generated by our approach and approaches reported in the literature. For the experimentation, we used the multi-media applications Picture-in-Picture (PiP), Moving Pictures Expert Group (MPEG) - 4, MP3Encoder, and Video Object Plane Decoder (VOPD). Experiments are run on software and hardware platforms. The static performance metric communication cost and the dynamic performance metrics network latency, throughput, and router power consumption are examined using software platform. In the hardware platform, the Field Programmable Gate Array (FPGA) is used to validate proposed fault-tolerant topologies and analyze performance metrics such as application runtime, resource utilization, and power consumption. The results are compared with the existing approaches, specifically Ring topology and its modified versions on both software and hardware platforms. The experimental results obtained from software and hardware platforms for a specific link and any link fault show significant improvements in performance metrics using our approach when compared with the related works in the literature.publishedVersio

NORA - Norwegian Open Research Archives

Agder University Research Archive

Scalable interconnect strategies for Neuro-glia networks using Networks-on-Chip

Author: Martin George
Publication venue
Publication date: 01/06/2019
Field of study

Ulster University's Research Portal