23 research outputs found

    Bandwidth optimization in asynchronous NoCs by customizing link wire length

    Journal article. The bandwidth requirement for each link on a network-on-chip (NoC) may differ based on the topology and the traffic properties of the IP cores. The available bandwidth on an asynchronous NoC link also varies with the wire length between sender and receiver. We explore the benefit to NoC performance when this property is used to increase bandwidth on the specific links that carry the most traffic in an SoC design. Two methods are used to accomplish this: specifying router locations on the floorplan, and adding pipeline latches on long links. Energy and latency characteristics of an asynchronous NoC are compared to those of a similarly designed synchronous NoC. The results indicate that the asynchronous network has lower energy, and that link-specific bandwidth optimization improves the average packet latency. Adding pipeline latches to congested links yields the most improvement. This link-specific optimization is applicable not only to the router and network we present here, but to any asynchronous NoC used in a heterogeneous SoC.
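
    The dependence of link bandwidth on wire length, and the effect of pipeline latches, can be illustrated with a toy model. This is only a sketch under stated assumptions, not the model used in the article: it assumes a four-phase handshake (four wire traversals per transfer), wire delay proportional to length, and a latch that adds no delay of its own. The function name and all default values are hypothetical.

```python
def link_bandwidth(length_mm, latches=0, t_logic_ns=0.5,
                   wire_ns_per_mm=0.1, traversals=4):
    """Estimate the throughput of one asynchronous link in flits per ns.

    Assumptions (illustrative only, not taken from the article):
    - each transfer needs `traversals` wire crossings per handshake cycle
      (4 for a four-phase protocol),
    - wire delay grows linearly with segment length,
    - evenly spaced pipeline latches split the wire into equal segments
      and add no delay of their own.
    """
    segment_mm = length_mm / (latches + 1)
    cycle_ns = t_logic_ns + traversals * wire_ns_per_mm * segment_mm
    return 1.0 / cycle_ns


if __name__ == "__main__":
    for n in range(3):
        print(n, "latches:", round(link_bandwidth(4.0, latches=n), 3), "flits/ns")
```

    Under these assumptions each added latch shortens the handshake loop, so throughput rises with diminishing returns as the fixed logic delay starts to dominate.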

    Comparing energy and latency of asynchronous and synchronous NoCs for embedded SoCs

    Journal article. Power consumption of on-chip interconnects is a primary concern for many embedded system-on-chip (SoC) applications. In this paper, we compare the energy and performance characteristics of asynchronous (clockless) and synchronous network-on-chip implementations, optimized for a number of SoC designs. We adapted the COSI-2.0 framework with ORION 2.0 router and wire models for synchronous network generation. Our own tool, ANetGen, specifies the asynchronous network by determining the topology with simulated annealing and the router locations with force-directed placement. It uses energy and delay models from our 65 nm bundled-data router design. SystemC simulations varied traffic burstiness using the self-similar b-model. Results show that the asynchronous network provided lower median and maximum message latency, especially under bursty traffic, and used far less router energy, with a slight overhead for the inter-router wires.
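
    For readers unfamiliar with the b-model mentioned above, the sketch below shows the usual recursive construction: a total traffic volume is split between the two halves of a time interval in proportions b and 1 - b, and the split is repeated on each half. It is written in Python for illustration and is not the authors' SystemC traffic generator; the function name, parameters, and fixed seed are assumptions.

```python
import random


def b_model(total, bias, levels, rng=None):
    """Generate a bursty traffic series of 2**levels time slots.

    total  -- overall traffic volume to distribute
    bias   -- the b parameter in [0.5, 1.0); 0.5 gives uniform traffic,
              values closer to 1.0 give burstier traffic
    levels -- number of recursive halvings of the time interval
    rng    -- optional random.Random instance (hypothetical convenience;
              defaults to a fixed seed for repeatability)
    """
    rng = rng or random.Random(0)
    series = [float(total)]
    for _ in range(levels):
        split = []
        for volume in series:
            heavy, light = volume * bias, volume * (1.0 - bias)
            # Randomly choose which half of the interval gets the heavy share.
            if rng.random() < 0.5:
                split.extend((heavy, light))
            else:
                split.extend((light, heavy))
        series = split
    return series


if __name__ == "__main__":
    print([round(v, 1) for v in b_model(1000.0, 0.7, 3)])
```

    The total volume is preserved at every level, so only the burstiness changes as the bias moves away from 0.5.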

    Physical-aware link allocation and route assignment for chip multiprocessing

    The architecture definition, design, and validation of interconnect networks is a key step in the design of modern on-chip systems. This paper proposes a mathematical formulation of the problem of simultaneously defining the topology of the network and the message routes for the traffic among the processing elements of the system. The solution meets the physical and performance constraints defined by the designer. The method guarantees that the generated solution is deadlock-free. It is also capable of automatically discovering topologies that have previously been used in industrial systems. The applicability of the method has been validated by solving realistic-size interconnect networks that model typical multiprocessor systems. Peer reviewed. Postprint (published version).

    Area efficient asynchronous SDM routers using 2-stage Clos switches

    Analysis of asynchronous routers for network-on-chip applications

    Asynchronous circuit design has conventionally been regarded as a valid alternative to synchronous logic because of its potential for low resource usage, power, and delay. This includes areas such as the communication infrastructure of modern multi-core processors, the so-called Network-on-Chip (NoC) paradigm on which this thesis focuses. In recent times, transistor downscaling and increasing clock frequencies have pushed synchronous designs toward high static power and delay. As a result, interest in asynchronous integrated routers and links has re-emerged, especially in fields with ultra-low power requirements such as embedded systems. In this thesis, we construct an asynchronous router in Verilog based on architectures found in the literature. We analyze the functionality of each building block and verify the operation of the implemented routing algorithm and arbitration mechanism. In the future, the results obtained here are expected to enable a complete implementation of the router in Verilog and a subsequent analysis of its scalability.

    Deadlock avoidance with virtual channels

    High-performance computing is a rapidly evolving area of computer science that aims to solve complicated computational problems by combining computational nodes connected through high-speed networks. This work concentrates on the problems that arise in such networks and focuses in particular on deadlock, which can reduce the efficiency of communication or even paralyze the network. The goal of this work is deadlock avoidance through the use of virtual channels in the network switches where the problem appears. Deadlock avoidance ensures that no data is lost inside the network, at the cost of increased packet latency caused by the extra computation the switches must perform to apply the policy.
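
    How virtual channels break deadlock can be shown with the classic textbook case of a unidirectional ring, which is an illustrative choice here and not necessarily the topology studied in this work. The sketch builds the channel dependency graph for the ring and checks it for cycles: with a single channel per link the graph is cyclic, while switching packets to a second virtual channel after they cross a "dateline" link makes it acyclic. All function names are hypothetical.

```python
from itertools import permutations


def ring_dependencies(n, use_dateline):
    """Channel dependencies for shortest-path routing on a one-way ring.

    A channel is (link_index, virtual_channel); link i goes from node i to
    node (i + 1) % n. With use_dateline, a packet moves from VC 0 to VC 1
    once it has crossed the link from node n - 1 to node 0.
    """
    deps = set()
    for src, dst in permutations(range(n), 2):
        node, wrapped, prev = src, False, None
        while node != dst:
            chan = (node, 1 if (use_dateline and wrapped) else 0)
            if prev is not None:
                deps.add((prev, chan))  # packet holds prev while requesting chan
            prev = chan
            if node == n - 1:
                wrapped = True  # the dateline link has now been crossed
            node = (node + 1) % n
    return deps


def has_cycle(deps):
    """Depth-first search for a cycle in the dependency graph."""
    graph = {}
    for a, b in deps:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, [])
    state = {v: 0 for v in graph}  # 0 = unvisited, 1 = on stack, 2 = done

    def visit(v):
        state[v] = 1
        for w in graph[v]:
            if state[w] == 1 or (state[w] == 0 and visit(w)):
                return True
        state[v] = 2
        return False

    return any(state[v] == 0 and visit(v) for v in graph)


if __name__ == "__main__":
    print("single channel cyclic:", has_cycle(ring_dependencies(4, False)))
    print("with dateline cyclic: ", has_cycle(ring_dependencies(4, True)))
```

    An acyclic channel dependency graph is the standard sufficient condition for deadlock freedom under deterministic routing; the price is the extra buffering and selection logic in each switch.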

    Doctor of Philosophy

    Dissertation. Portable electronic devices will be limited to the available energy of existing battery chemistries for the foreseeable future. However, the systems-on-chip (SoCs) used in these devices are under demand to offer more functionality and increased battery life. A difficult problem in SoC design is providing energy-efficient communication between its components while maintaining the required performance. This dissertation introduces a novel energy-efficient network-on-chip (NoC) communication architecture. A NoC is used within complex SoCs due to its superior performance, energy usage, modularity, and scalability over traditional bus and point-to-point methods of connecting SoC components. This is the first academic research that combines asynchronous NoC circuits, a focus on energy-efficient design, and a software framework to customize a NoC for a particular SoC. Its key contribution is demonstrating that a simple, asynchronous NoC concept is a good match for low-power devices, and is a fruitful area for additional investigation. The proposed NoC is energy-efficient in several ways: simple switch and arbitration logic, low port radix, latch-based router buffering, a topology with the minimum number of 3-port routers, and the asynchronous advantages of zero dynamic power consumption while idle and the lack of a clock tree. The tool framework developed for this work uses novel methods to optimize the topology and router floorplan based on simulated annealing and force-directed movement. It studies link pipelining techniques that yield improved throughput in an energy-efficient manner. A simulator is automatically generated for each customized NoC, and its traffic generators use a self-similar message distribution, as opposed to Poisson, to better match application behavior. Compared to a conventional synchronous NoC, this design achieves comparable message latency with half the energy.

    Doctor of Philosophy

    Dissertation. The bandwidth requirement for each link on a network-on-chip (NoC) may differ based on the topology and the traffic properties of the IP cores. The available bandwidth on an asynchronous NoC link also varies with the wire length between sender and receiver. This work explores the benefit to NoC performance, area, and energy when this property is used to optimize the bandwidth of specific links based on the bandwidth required by a target SoC design. Three asynchronous routers were designed for implementing asynchronous NoCs. A simple routing scheme and a single-flit packet format lead to performance- and area-efficient router designs. Their performance was evaluated taking link wire delay into account. A comprehensive analysis of pipeline latch insertion in asynchronous communication links is performed with regard to link bandwidth. The optimal placement of pipeline latches for maximizing the bandwidth gain is described. Specific methods are proposed for performance, area, and energy optimization, respectively. Performance is optimized by inserting pipeline latches in heavily trafficked, highly utilized links to increase their bandwidth. Wire area is reduced by halving the data-path width of links with low traffic and low utilization, which lowers their bandwidth. Energy is optimized by using wide spacing between wires in links with high energy consumption. An analytical model for asynchronous link bandwidth estimation is presented. It is used to identify suitable links for each optimization method. Energy and latency characteristics of an asynchronous NoC are compared to those of a similarly designed synchronous NoC. The results indicate that the asynchronous network has lower energy, and that link-specific bandwidth optimization improves NoC performance. Applying the proposed optimization methods to an asynchronous NoC yields performance enhancement, wire area reduction, and wire energy savings.