213 research outputs found

    Cost Effective Routing Implementations for On-chip Networks

    Full text link
    Arquitecturas de múltiples núcleos como multiprocesadores (CMP) y soluciones multiprocesador para sistemas dentro del chip (MPSoCs) actuales se basan en la eficacia de las redes dentro del chip (NoC) para la comunicación entre los diversos núcleos. Un diseño eficiente de red dentro del chip debe ser escalable y al mismo tiempo obtener valores ajustados de área, latencia y consumo de energía. Para diseños de red dentro del chip de propósito general se suele usar topologías de malla 2D ya que se ajustan a la distribución del chip. Sin embargo, la aparición de nuevos retos debe ser abordada por los diseñadores. Una mayor probabilidad de defectos de fabricación, la necesidad de un uso optimizado de los recursos para aumentar el paralelismo a nivel de aplicación o la necesidad de técnicas eficaces de ahorro de energía, puede ocasionar patrones de irregularidad en las topologías. Además, el soporte para comunicación colectiva es una característica buscada para abordar con eficacia las necesidades de comunicación de los protocolos de coherencia de caché. En estas condiciones, un encaminamiento eficiente de los mensajes se convierte en un reto a superar. El objetivo de esta tesis es establecer las bases de una nueva arquitectura para encaminamiento distribuido basado en lógica que es capaz de adaptarse a cualquier topología irregular derivada de una estructura de malla 2D, proporcionando así una cobertura total para cualquier caso resultado de soportar los retos mencionados anteriormente. Para conseguirlo, en primer lugar, se parte desde una base, para luego analizar una evolución de varios mecanismos, y finalmente llegar a una implementación, que abarca varios módulos para alcanzar el objetivo mencionado anteriormente. De hecho, esta última implementación tiene por nombre eLBDR (effective Logic-Based Distributed Routing). Este trabajo cubre desde el primer mecanismo, LBDR, hasta el resto de mecanismos que han surgido progresivamente.Rodrigo Mocholí, S. (2010). Cost Effective Routing Implementations for On-chip Networks [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8962Palanci

    Performance Evaluation of XY and XTRANC Routing Algorithm for Network on Chip and Implementation using DART Simulator

    Get PDF
    In today’s world Network on Chip(NoC) is one of the most efficient on chip communication platform for System on Chip where a large amount of computational and storage blocks are integrated on a single chip. NoCs are scalable and have tackled the short commings of SoCs . In the first part of this project the basics of NoCs is explained which includes why we should use NoC , how to implement NoC ,various blocks of NoCs .The next part of the project deals with the implementation of XY routing algorithm in mesh (3*3) and mesh (4*4) network topologies. The throughput and latency curves for both the topologies were found and a through comparison was done by varying the no of virtual cannels. In the next part an improvised routing algorithm known as the extended torus(XTRANC) routing algorithm for NoCs implementation is explained. This algorithm is designed for inner torus mesh networks and provides better performance than usual routing algorithms. It has been implemented using the CONNECT simulator. Then the DART simulator was explored and two important components namely the flitqueue and the traffic generator was designed using this simulator

    Design and Implementation of High QoS 3D-NoC using Modified Double Particle Swarm Optimization on FPGA

    Get PDF
    One technique to overcome the exponential growth bottleneck is to increase the number of cores on a processor, although having too many cores might cause issues including chip overheating and communication blockage. The problem of the communication bottleneck on the chip is presently effectively resolved by networks-on-chip (NoC). A 3D stack of chips is now possible, thanks to recent developments in IC manufacturing techniques, enabling to reduce of chip area while increasing chip throughput and reducing power consumption. The automated process associated with mapping applications to form three-dimensional NoC architectures is a significant new path in 3D NoC research. This work proposes a 3D NoC partitioning approach that can identify the 3D NoC region that has to be mapped. A double particle swarm optimization (DPSO) inspired algorithmic technique, which may combine the characteristics having neighbourhood search and genetic architectures, also addresses the challenge of a particle swarm algorithm descending into local optimal solutions. Experimental evidence supports the claim that this hybrid optimization algorithm based on Double Particle Swarm Optimisation outperforms the conventional heuristic technique in terms of output rate and loss in energy. The findings demonstrate that in a network of the same size, the newly introduced router delivers the lowest loss on the longest path.  Three factors, namely energy, latency or delay, and throughput, are compared between the suggested 3D mesh ONoC and its 2D version. When comparing power consumption between 3D ONoC and its electronic and 2D equivalents, which both have 512 IP cores, it may save roughly 79.9% of the energy used by the electronic counterpart and 24.3% of the energy used by the latter. The network efficiency of the 3D mesh ONoC is simulated by DPSO in a variety of configurations. The outcomes also demonstrate an increase in performance over the 2D ONoC. As a flexible communication solution, Network-On-Chips (NoCs) have been frequently employed in the development of multiprocessor system-on-chips (MPSoCs). By outsourcing their communication activities, NoCs permit on-chip Intellectual Property (IP) cores to communicate with one another and function at a better level. The important components in assigning application duties, distributing the work to the IPs, and coordinating communication among them are mapping and scheduling methods. This study aims to present an entirely advanced form of research in the area of 3D NoC mapping and scheduling applications, grouping the results according to various parameters and offering several suggestions for further research

    Transient and Permanent Error Control for High-End Multiprocessor Systems-on-Chip

    Get PDF
    High-end MPSoC systems with built-in high-radix topologies achieve good performance because of the improved connectivity and the reduced network diameter. In high-end MPSoC systems, fault tolerance support is becoming a compulsory feature. In this work, we propose a combined method to address permanent and transient link and router failures in those systems. The LBDRhr mechanism is proposed to tolerate permanent link failures in some popular high-radix topologies. The increased router complexity may lead to more transient router errors than routers using simple XY routing algorithm. We exploit the inherent information redundancy (IIR) in LBDRhr logic to manage transient errors in the network routers. Thorough analyses are provided to discover the appropriate internal nodes and the forbidden signal patterns for transient error detection. Simulation results show that LBDRhr logic can tolerate all of the permanent failure combinations of long-range links and 80% of links failures at short-range links. Case studies show that the error detection method based on the new IIR extraction method reduces the power consumption and the residual error rate by 33% and up to two orders of magnitude, respectively, compared to triple modular redundancy. The impact of network topologies on the efficiency of the detection mechanism has been examined in this work, as well

    Modeling Networks-on-Chip at System Level with the MARTE UML profile

    Get PDF
    International audienceThe study of Networks on Chips (NoCs) is a research field that primarily addresses the global communication in Systems-on-Chip (SoCs). The selected topology and the routing algorithm play a prime role in the performance of NoC architectures. In order to handle the design complexity and meet the tight time-to-market constraints, it is important to automate most of these NoC design phases. The extension of the UML language called UML profile for MARTE (Modeling and Analysis of Real-Time and Embedded systems) specifies some concepts for model-based design and analysis of real time and embedded systems. This paper presents a MARTE based methodology for modeling concepts of NoC based architectures. It aims at improving the effectiveness of the MARTE standard by clarifying some notations and extending some definitions in the standard, in order to be able to model complex architectures like NoCs

    Energy Efficient Network Generation for Application Specific NoC

    Get PDF
    Networks-on-Chip is emerging as a communication platform for future complex SoC designs, composed of a large number of homogenous or heterogeneous processing resources. Most SoC platforms are customized to the domainspecific requirements of their applications, which communicate in a specific, mostly irregular way. The specific but often diverse communication requirements among cores of the SoC call for the design of application-specific network of SoC for improved performance in terms of communication energy, latency, and throughput. In this work, we propose a methodology for the design of customized irregular network architecture of SoC. The proposed method exploits priori knowledge of the application2019;s communication characteristic to generate an energy optimized network and corresponding routing tables

    Quarc: an architecture for efficient on-chip communication

    Get PDF
    The exponential downscaling of the feature size has enforced a paradigm shift from computation-based design to communication-based design in system on chip development. Buses, the traditional communication architecture in systems on chip, are incapable of addressing the increasing bandwidth requirements of future large systems. Networks on chip have emerged as an interconnection architecture offering unique solutions to the technological and design issues related to communication in future systems on chip. The transition from buses as a shared medium to networks on chip as a segmented medium has given rise to new challenges in system on chip realm. By leveraging the shared nature of the communication medium, buses have been highly efficient in delivering multicast communication. The segmented nature of networks, however, inhibits the multicast messages to be delivered as efficiently by networks on chip. Relying on extensive research on multicast communication in parallel computers, several network on chip architectures have offered mechanisms to perform the operation, while conforming to resource constraints of the network on chip paradigm. Multicast communication in majority of these networks on chip is implemented by establishing a connection between source and all multicast destinations before the message transmission commences. Establishing the connections incurs an overhead and, therefore, is not desirable; in particular in latency sensitive services such as cache coherence. To address high performance multicast communication, this research presents Quarc, a novel network on chip architecture. The Quarc architecture targets an area-efficient, low power, high performance implementation. The thesis covers a detailed representation of the building blocks of the architecture, including topology, router and network interface. The cost and performance comparison of the Quarc architecture against other network on chip architectures reveals that the Quarc architecture is a highly efficient architecture. Moreover, the thesis introduces novel performance models of complex traffic patterns, including multicast and quality of service-aware communication

    Network-on-Chip

    Get PDF
    Limitations of bus-based interconnections related to scalability, latency, bandwidth, and power consumption for supporting the related huge number of on-chip resources result in a communication bottleneck. These challenges can be efficiently addressed with the implementation of a network-on-chip (NoC) system. This book gives a detailed analysis of various on-chip communication architectures and covers different areas of NoCs such as potentials, architecture, technical challenges, optimization, design explorations, and research directions. In addition, it discusses current and future trends that could make an impactful and meaningful contribution to the research and design of on-chip communications and NoC systems

    Exploiting generalized de-Bruijn/Kautz topologies for flexible iterative channel code decoder architectures

    Get PDF
    Modern iterative channel code decoder architectures have tight constrains on the throughput but require flexibility to support different modes and standards. Unfortunately, flexibility often comes at the expense of increasing the number of clock cycles required to complete the decoding of a data-frame, thus reducing the sustained throughput. The Network- on-Chip (NoC) paradigm is an interesting option to achieve flexibility, but several design choices, including the topology and the routing algorithm, can affect the decoder throughput. In this work logarithmic diameter topologies, in particular generalized de-Bruijn and Kautz topologies, are addressed as possible solutions to achieve both flexible and high throughput architectures for iterative channel code decoding. In particular, this work shows that the optimal shortest-path routing algorithm for these topologies, that is still available in the open literature, can be efficiently implemented resorting to a very simple circuit. Experimental results show that the proposed architecture features a reduction of about 14% and 10% for area and power consumption respectively, with respect to a previous shortest-path routing-table-based desig

    Exploiting generalized de-Bruijn/Kautz topologies for flexible iterative channel code decoder architectures

    Get PDF
    Modern iterative channel code decoder architectures have tight constrains on the throughput but require flexibility to support different modes and standards. Unfortunately, flexibility often comes at the expense of increasing the number of clock cycles required to complete the decoding of a data-frame, thus reducing the sustained throughput. The Network- on-Chip (NoC) paradigm is an interesting option to achieve flexibility, but several design choices, including the topology and the routing algorithm, can affect the decoder throughput. In this work logarithmic diameter topologies, in particular generalized de-Bruijn and Kautz topologies, are addressed as possible solutions to achieve both flexible and high throughput architectures for iterative channel code decoding. In particular, this work shows that the optimal shortest-path routing algorithm for these topologies, that is still available in the open literature, can be efficiently implemented resorting to a very simple circuit. Experimental results show that the proposed architecture features a reduction of about 14% and 10% for area and power consumption respectively, with respect to a previous shortest-path routing-table-based design
    corecore