30 research outputs found

    Throttling Control for Bufferless Routing in On-Chip Networks

    Get PDF
    As the number of core integration on a single die grows, buffers consume significant energy, and occupy chip area. A bufferless deflection outing that eliminates router’s input port buffers can considerably help saving energy and chip area while providing similar performance of xisting buffered routing, especially for low-to-medium network loads. However when congestion increases, the bufferless frequently causes flits deflections, and misrouting leading to a degradation of network performance. In this paper, we propose IRT(Injection Rate Throttling), a ocal throttling mechanism that reduces deflection and misrouting for high-load bufferless networks. IRT provides injection rate control independently for each network node, allowing to reduce network congestion. Our simulation results based on a cycle-accurate simulator show that using IRT, IRT reduces average transmission latency by 8.65% compared to traditional bufferless routing

    On the Effectiveness of Source Throttling for Networks-on-Chip in Chip Multiprocessor Designs

    Get PDF
    In modern chip-multiprocessor (CMP) designs, with the increasing number of cores, traffic between different cores keeps increasing. Consequently, on-chip interconnection networks experience increasingly large communication bandwidth demand. This thesis focuses on Quality-of-Service (QoS) of Networks-on-Chip (NoC). NoC is considered as a scalable approach of interconnection network compared to conventional bus-based architecture. Like Ethernet, NoC faces common QoS issues such as bandwidth utilization and fairness. This thesis is a study on the effectiveness of source throttling for NoC, including fairness and overall performance such as program run time and packet latency. Source throttling is a well-known technique for traffic regulation. It is shown to be effective for bufferless NoC in previous studies. Due to different traffic behaviors and characteristics, however, it is not obvious if source throttling is effective for general buffered NoC. The first part of this research is a set of network simulations on various synthetic traffic cases. The results indicate that source throttling can reduce application runtime when (1) the network is congested, (2) there are dependencies among communication requests, and (3) the width of the dependence graph must be sufficiently large. The second part is full system simulations on public benchmark suites. Source throttling does not bring benefit for these relative realistic cases. Further experiment reveals that the aforementioned conditions are not satisfied. This explains why source throttling is of little use for general buffered NoC in CMP designs

    On the Effectiveness of Source Throttling for Networks-on-Chip in Chip Multiprocessor Designs

    Get PDF
    In modern chip-multiprocessor (CMP) designs, with the increasing number of cores, traffic between different cores keeps increasing. Consequently, on-chip interconnection networks experience increasingly large communication bandwidth demand. This thesis focuses on Quality-of-Service (QoS) of Networks-on-Chip (NoC). NoC is considered as a scalable approach of interconnection network compared to conventional bus-based architecture. Like Ethernet, NoC faces common QoS issues such as bandwidth utilization and fairness. This thesis is a study on the effectiveness of source throttling for NoC, including fairness and overall performance such as program run time and packet latency. Source throttling is a well-known technique for traffic regulation. It is shown to be effective for bufferless NoC in previous studies. Due to different traffic behaviors and characteristics, however, it is not obvious if source throttling is effective for general buffered NoC. The first part of this research is a set of network simulations on various synthetic traffic cases. The results indicate that source throttling can reduce application runtime when (1) the network is congested, (2) there are dependencies among communication requests, and (3) the width of the dependence graph must be sufficiently large. The second part is full system simulations on public benchmark suites. Source throttling does not bring benefit for these relative realistic cases. Further experiment reveals that the aforementioned conditions are not satisfied. This explains why source throttling is of little use for general buffered NoC in CMP designs

    Round Robin based Arbitration Mechanism for Signaling Approach based Router Architecture

    Get PDF
    In Network-on-Chip the effectiveness of the network resource allocation is demonstrated by the flow control mechanism. There are two types of flow control mechanisms: buffered and bufferless. Compared to buffered flow control methods, buffer less flow control mechanisms are easier to use, need less power, and take up less space. When there are congestion and resource conflicts, it experiences higher packet loss and packet misrouting inside the network. A good buffered control mechanism useful as it overcomes the limitations of buffer less mechanism. There are numerous buffered and bufferless flow control methods available. In this paper, signaling-based Virtual Output Queue Router Arbiter Mechanism is used to explore credit-based flow control. This mechanism worked on new concept that is “stress value”. This information is generated in the form of credit whenever any input buffer has free space. Then, using this credit data, the node's stress value is determined. Free buffer space takes precedence over stress value if it is bigger. The stress value will increase if there is less available buffer space. To handle the congestion problem, the signaling block then sends this stress value to a neighboring router. To help the arbitrator make a more accurate decision, the crediting system constantly operates in tandem with arbitration

    A Multifunctional Integrated Circuit Router for Body Area Network Wearable Systems

    Get PDF
    A multifunctional router IC to be included in the nodes of a wearable body sensor network is described and evaluated. The router targets different application scenarios, especially those including tens of sensors, embedded into textile materials and with high data-rate communication demands. The router IC supports two different functionality sets, one for sensor nodes and another for the base node, both based on the same circuit module. The nodes are connected to each other by means of woven thick conductive yarns forming a mesh topology with the base node at the center. From the standpoint of the network, each sensor node is a four port router capable of handling packets from destination nodes to the base node, with sufficient redundant paths. The adopted hybrid circuit and packet switching scheme significantly improve network performance in terms of end-to-end delay, throughput and power consumption. The IC also implements a highly precise, sub-microsecond one-way time synchronization protocol which is used for time stamping the acquired data. The communication module was implemented in a 4-metal, 0.35 μm CMOS technology. The maximum data rate of the system is 35 Mbps while supporting up to 250 sensors, which exceeds current BAN applications scenarios.This work was supported in part by the Fundação para a Ciéncia e a Tecnologia (FCT) (Portuguese Foundation for Science and Technology) under Project PROLIMB PTDC/EEAELC/103683/2008 and through the Ph.D. Grant SFRH/BD/75324/2010, and in part by the CREaTION, FCT/MEC through national funds and co-funded by the FEDER-PT2020 partnership agreement under Project UIDB/EEA/50008/2020, Project CONQUEST (CMU/ECE/030/2017), Project COST CA15104, and ORCIP. (Corresponding author: Fardin Derogarian Miyandoab.)info:eu-repo/semantics/publishedVersio

    Theoretical Analysis and Evaluation of NoCs with Weighted Round-Robin Arbitration

    Full text link
    Fast and accurate performance analysis techniques are essential in early design space exploration and pre-silicon evaluations, including software eco-system development. In particular, on-chip communication continues to play an increasingly important role as the many-core processors scale up. This paper presents the first performance analysis technique that targets networks-on-chip (NoCs) that employ weighted round-robin (WRR) arbitration. Besides fairness, WRR arbitration provides flexibility in allocating bandwidth proportionally to the importance of the traffic classes, unlike basic round-robin and priority-based arbitration. The proposed approach first estimates the effective service time of the packets in the queue due to WRR arbitration. Then, it uses the effective service time to compute the average waiting time of the packets. Next, we incorporate a decomposition technique to extend the analytical model to handle NoC of any size. The proposed approach achieves less than 5% error while executing real applications and 10% error under challenging synthetic traffic with different burstiness levels.Comment: This paper is accepted in International Conference on Computer Aided Design (ICCAD), 202
    corecore