5 research outputs found

    Reliability-aware and energy-efficient system level design for networks-on-chip

    Get PDF
    2015 Spring.Includes bibliographical references.With CMOS technology aggressively scaling into the ultra-deep sub-micron (UDSM) regime and application complexity growing rapidly in recent years, processors today are being driven to integrate multiple cores on a chip. Such chip multiprocessor (CMP) architectures offer unprecedented levels of computing performance for highly parallel emerging applications in the era of digital convergence. However, a major challenge facing the designers of these emerging multicore architectures is the increased likelihood of failure due to the rise in transient, permanent, and intermittent faults caused by a variety of factors that are becoming more and more prevalent with technology scaling. On-chip interconnect architectures are particularly susceptible to faults that can corrupt transmitted data or prevent it from reaching its destination. Reliability concerns in UDSM nodes have in part contributed to the shift from traditional bus-based communication fabrics to network-on-chip (NoC) architectures that provide better scalability, performance, and utilization than buses. In this thesis, to overcome potential faults in NoCs, my research began by exploring fault-tolerant routing algorithms. Under the constraint of deadlock freedom, we make use of the inherent redundancy in NoCs due to multiple paths between packet sources and sinks and propose different fault-tolerant routing schemes to achieve much better fault tolerance capabilities than possible with traditional routing schemes. The proposed schemes also use replication opportunistically to optimize the balance between energy overhead and arrival rate. As 3D integrated circuit (3D-IC) technology with wafer-to-wafer bonding has been recently proposed as a promising candidate for future CMPs, we also propose a fault-tolerant routing scheme for 3D NoCs which outperforms the existing popular routing schemes in terms of energy consumption, performance and reliability. To quantify reliability and provide different levels of intelligent protection, for the first time, we propose the network vulnerability factor (NVF) metric to characterize the vulnerability of NoC components to faults. NVF determines the probabilities that faults in NoC components manifest as errors in the final program output of the CMP system. With NVF aware partial protection for NoC components, almost 50% energy cost can be saved compared to the traditional approach of comprehensively protecting all NoC components. Lastly, we focus on the problem of fault-tolerant NoC design, that involves many NP-hard sub-problems such as core mapping, fault-tolerant routing, and fault-tolerant router configuration. We propose a novel design-time (RESYN) and a hybrid design and runtime (HEFT) synthesis framework to trade-off energy consumption and reliability in the NoC fabric at the system level for CMPs. Together, our research in fault-tolerant NoC routing, reliability modeling, and reliability aware NoC synthesis substantially enhances NoC reliability and energy-efficiency beyond what is possible with traditional approaches and state-of-the-art strategies from prior work

    Network-on-Chip

    Get PDF
    Addresses the Challenges Associated with System-on-Chip Integration Network-on-Chip: The Next Generation of System-on-Chip Integration examines the current issues restricting chip-on-chip communication efficiency, and explores Network-on-chip (NoC), a promising alternative that equips designers with the capability to produce a scalable, reusable, and high-performance communication backbone by allowing for the integration of a large number of cores on a single system-on-chip (SoC). This book provides a basic overview of topics associated with NoC-based design: communication infrastructure design, communication methodology, evaluation framework, and mapping of applications onto NoC. It details the design and evaluation of different proposed NoC structures, low-power techniques, signal integrity and reliability issues, application mapping, testing, and future trends. Utilizing examples of chips that have been implemented in industry and academia, this text presents the full architectural design of components verified through implementation in industrial CAD tools. It describes NoC research and developments, incorporates theoretical proofs strengthening the analysis procedures, and includes algorithms used in NoC design and synthesis. In addition, it considers other upcoming NoC issues, such as low-power NoC design, signal integrity issues, NoC testing, reconfiguration, synthesis, and 3-D NoC design. This text comprises 12 chapters and covers: The evolution of NoC from SoC—its research and developmental challenges NoC protocols, elaborating flow control, available network topologies, routing mechanisms, fault tolerance, quality-of-service support, and the design of network interfaces The router design strategies followed in NoCs The evaluation mechanism of NoC architectures The application mapping strategies followed in NoCs Low-power design techniques specifically followed in NoCs The signal integrity and reliability issues of NoC The details of NoC testing strategies reported so far The problem of synthesizing application-specific NoCs Reconfigurable NoC design issues Direction of future research and development in the field of NoC Network-on-Chip: The Next Generation of System-on-Chip Integration covers the basic topics, technology, and future trends relevant to NoC-based design, and can be used by engineers, students, and researchers and other industry professionals interested in computer architecture, embedded systems, and parallel/distributed systems

    Design and Implementation of Software Defined Radios on a Homogeneous Multi-Processor Architecture

    Get PDF
    In the wireless communications domain, multi-mode and multi-standard platforms are becoming increasingly the central focus of system architects. In fact, mobile terminal users require more and more mobility and throughput, pushing towards a fully integrated radio system able to support different communication protocols running concurrently on the platform. A new concept of radio system was introduced to meet the users' expectations. Flexible radio platforms have became an indispensable requirement to meet the expectations of the users today and in the future. This thesis deals with issues related to the design of flexible radio platforms. In particular, the flexibility of the radio system is achieved through the concept of software defined radios (SDRs). The research work focuses on the utilization of homogeneous multi-processor (MP) architectures as a feasible way to efficiently implement SDR platforms. In fact, platforms based on MP architectures are able to deliver high performance together with a high degree of flexibility. Moreover, homogeneous MP platforms are able to reduce design and verification costs as well as provide a high scalability in terms of software and hardware. However, homogeneous MP architectures provide less computational efficiency when compared to heterogeneous solutions. This thesis can be divided into two parts: the first part is related to the implementation of a reference platform while the second part of the thesis introduces the design and implementation of flexible, high performance, power and energy efficient algorithms for wireless communications. The proposed reference platform, Ninesilica, is a homogeneous MP architecture composed of a 3x3 mesh of processing nodes (PNs), interconnected by a hierarchical Network-on-Chip (NoC). Each PN hosts as Processing Element (PE) a processor core. To improve the computational efficiency of the platform, different power and energy saving techniques have been investigated. In the design, implementation and mapping of the algorithms, the following constraints were considered: energy and power efficiency, high scalability of the platform, portability of the solutions across similar platforms, and parallelization efficiency. Ninesilica architecture together with the proposed algorithm implementations showed that homogeneous MP architectures are highly scalable platforms, both in terms of hardware and software. Furthermore, Ninesilica architecture demonstrated that homogeneous MPs are able to achieve high parallelization efficiency as well as high energy and power savings, meeting the requirements of SDRs as well as enabling cognitive radios. Ninesilica can be utilized as a stand-alone block or as an elementary building block to realize clustered many-core architectures. Moreover, the obtained results, in terms of parallelization efficiency as well as power and energy efficiency are independent of the type of PE utilized, ensuring the portability of the results to similar architectures based on a different type of processing element

    Design and Implementation of Software Defined Radios on a Homogeneous Multi-Processor Architecture

    Get PDF
    In the wireless communications domain, multi-mode and multi-standard platforms are becoming increasingly the central focus of system architects. In fact, mobile terminal users require more and more mobility and throughput, pushing towards a fully integrated radio system able to support different communication protocols running concurrently on the platform. A new concept of radio system was introduced to meet the users' expectations. Flexible radio platforms have became an indispensable requirement to meet the expectations of the users today and in the future. This thesis deals with issues related to the design of flexible radio platforms. In particular, the flexibility of the radio system is achieved through the concept of software defined radios (SDRs). The research work focuses on the utilization of homogeneous multi-processor (MP) architectures as a feasible way to efficiently implement SDR platforms. In fact, platforms based on MP architectures are able to deliver high performance together with a high degree of flexibility. Moreover, homogeneous MP platforms are able to reduce design and verification costs as well as provide a high scalability in terms of software and hardware. However, homogeneous MP architectures provide less computational efficiency when compared to heterogeneous solutions. This thesis can be divided into two parts: the first part is related to the implementation of a reference platform while the second part of the thesis introduces the design and implementation of flexible, high performance, power and energy efficient algorithms for wireless communications. The proposed reference platform, Ninesilica, is a homogeneous MP architecture composed of a 3x3 mesh of processing nodes (PNs), interconnected by a hierarchical Network-on-Chip (NoC). Each PN hosts as Processing Element (PE) a processor core. To improve the computational efficiency of the platform, different power and energy saving techniques have been investigated. In the design, implementation and mapping of the algorithms, the following constraints were considered: energy and power efficiency, high scalability of the platform, portability of the solutions across similar platforms, and parallelization efficiency. Ninesilica architecture together with the proposed algorithm implementations showed that homogeneous MP architectures are highly scalable platforms, both in terms of hardware and software. Furthermore, Ninesilica architecture demonstrated that homogeneous MPs are able to achieve high parallelization efficiency as well as high energy and power savings, meeting the requirements of SDRs as well as enabling cognitive radios. Ninesilica can be utilized as a stand-alone block or as an elementary building block to realize clustered many-core architectures. Moreover, the obtained results, in terms of parallelization efficiency as well as power and energy efficiency are independent of the type of PE utilized, ensuring the portability of the results to similar architectures based on a different type of processing element

    Interleaving in Systolic-Arrays: a Throughput Breakthrough

    Get PDF
    In past years the most common way to improve computers performance was to increase the clock frequency. In recent years this approach suffered the limits of technology scaling, therefore computers architectures are shifting toward the direction of parallel computing to further improve circuits performance. Not only GPU based architectures are spreading in consideration, but also Systolic Arrays are particularly suited for certain classes of algorithms. An important point in favor of Systolic Arrays is that, due to the regularity of their circuit layout, they are appealing when applied to many emerging and very promising technologies, like Quantum-dot Cellular Automata and nanoarrays based on Silicon NanoWire or on Carbon nanotube Field Effect Transistors. In this work we present a systematic method to improve Systolic Arrays performance exploiting Pipelining and Input Data Interleaving. We tackle the problem from a theoretical point of view first, and then we apply it to both CMOS technology and emerging technologies. On CMOS we demonstrate that it is possible to vastly improve the overall throughput of the circuit. By applying this technique to emerging technologies we show that it is possible to overcome some of their limitations greatly improving the throughput, making a considerable step forward toward the post-CMOS era
    corecore