85 research outputs found

    Performance Evaluation of XY and XTRANC Routing Algorithm for Network on Chip and Implementation using DART Simulator

    Get PDF
    In today’s world Network on Chip(NoC) is one of the most efficient on chip communication platform for System on Chip where a large amount of computational and storage blocks are integrated on a single chip. NoCs are scalable and have tackled the short commings of SoCs . In the first part of this project the basics of NoCs is explained which includes why we should use NoC , how to implement NoC ,various blocks of NoCs .The next part of the project deals with the implementation of XY routing algorithm in mesh (3*3) and mesh (4*4) network topologies. The throughput and latency curves for both the topologies were found and a through comparison was done by varying the no of virtual cannels. In the next part an improvised routing algorithm known as the extended torus(XTRANC) routing algorithm for NoCs implementation is explained. This algorithm is designed for inner torus mesh networks and provides better performance than usual routing algorithms. It has been implemented using the CONNECT simulator. Then the DART simulator was explored and two important components namely the flitqueue and the traffic generator was designed using this simulator

    Classification of networks-on-chip in the context of analysis of promising self-organizing routing algorithms

    Full text link
    This paper contains a detailed analysis of the current state of the network-on-chip (NoC) research field, based on which the authors propose the new NoC classification that is more complete in comparison with previous ones. The state of the domain associated with wireless NoC is investigated, as the transition to these NoCs reduces latency. There is an assumption that routing algorithms from classical network theory may demonstrate high performance. So, in this article, the possibility of the usage of self-organizing algorithms in a wireless NoC is also provided. This approach has a lot of advantages described in the paper. The results of the research can be useful for developers and NoC manufacturers as specific recommendations, algorithms, programs, and models for the organization of the production and technological process.Comment: 10 p., 5 fig. Oral presentation on APSSE 2021 conferenc

    Global Congestion and Fault Aware Wireless Interconnection Framework for Multicore Systems

    Get PDF
    Multicore processors are getting more common in the implementation of all type of computing demands, starting from personal computers to the large server farms for high computational demanding applications. The network-on-chip provides a better alternative to the traditional bus based communication infrastructure for this multicore system. Conventional wire-based NoC interconnect faces constraints due to their long multi-hop latency and high power consumption. Furthermore high traffic generating applications sometimes creates congestion in such system further degrading the systems performance. In this thesis work, a novel two-state congestion aware wireless interconnection framework for network chip is presented. This WiNoC system was designed to able to dynamically redirect traffic to avoid congestion based on network condition information shared among all the core tiles in the system. Hence a novel routing scheme and a two-state MAC protocol is proposed based on a proposed two layer hybrid mesh-based NoC architecture. The underlying mesh network is connected via wired-based interconnect and on top of that a shared wireless interconnect framework is added for single-hop communication. The routing scheme is non-deterministic in nature and utilizes the principles from existing dynamic routing algorithms. The MAC protocol for the wireless interface works in two modes. The first is data mode where a token-based protocol is utilized to transfer core data. And the second mode is the control mode where a broadcast-based communication protocol is used to share the network congestion information. The work details the switching methodology between these two modes and also explain, how the routing scheme utilizes the congestion information (gathered during the control mode) to route data packets during normal operation mode. The proposed work was modeled in a cycle accurate network simulator and its performance were evaluated against traditional NoC and WiNoC designs

    Performance evaluation of different routing algorithms in network on chip

    Get PDF
    Network on Chip (NoC) is one of the efficient on-chip communication architecture for System on Chip (SoC) where a large number of computational and storage blocks are integrated on a single chip. NoCs have tackled the disadvantages of SoCs as well as they are scalable. But an efficient routing algorithm can enhance the performance of NoC. In one chapter of the thesis three different types of routing algorithms are compared i.e. XY, OE, and DyAD. XY routing algorithm is a distributed deterministic algorithm. Odd-Even (OE) routing algorithm is distributed adaptive routing algorithm with deadlock-free ability. DyAD is a smart routing algorithm which combines the features of both deterministic and adaptive routing. In another chapter of thesis three different types of deadlock free routing algorithms are compared i.e. one deterministic routing (XY routing algorithm), three partially adaptive routing (West first, North last and Negative first) and two adaptive routing (DyXY, OE) are being compared with % of load for various traffic patterns. In another chapter of thesis, a fault tolerant algorithm is described and its performance is compared with all the deadlock free routing algorithms in a NoC having link faults and node faults. All these simulation is done in NIRGAM 2.1 simulator which is a cycle accurate systemC based simulator

    Physical parameter-aware Networks-on-Chip design

    Get PDF
    PhD ThesisNetworks-on-Chip (NoCs) have been proposed as a scalable, reliable and power-efficient communication fabric for chip multiprocessors (CMPs) and multiprocessor systems-on-chip (MPSoCs). NoCs determine both the performance and the reliability of such systems, with a significant power demand that is expected to increase due to developments in both technology and architecture. In terms of architecture, an important trend in many-core systems architecture is to increase the number of cores on a chip while reducing their individual complexity. This trend increases communication power relative to computation power. Moreover, technology-wise, power-hungry wires are dominating logic as power consumers as technology scales down. For these reasons, the design of future very large scale integration (VLSI) systems is moving from being computation-centric to communication-centric. On the other hand, chip’s physical parameters integrity, especially power and thermal integrity, is crucial for reliable VLSI systems. However, guaranteeing this integrity is becoming increasingly difficult with the higher scale of integration due to increased power density and operating frequencies that result in continuously increasing temperature and voltage drops in the chip. This is a challenge that may prevent further shrinking of devices. Thus, tackling the challenge of power and thermal integrity of future many-core systems at only one level of abstraction, the chip and package design for example, is no longer sufficient to ensure the integrity of physical parameters. New designtime and run-time strategies may need to work together at different levels of abstraction, such as package, application, network, to provide the required physical parameter integrity for these large systems. This necessitates strategies that work at the level of the on-chip network with its rising power budget. This thesis proposes models, techniques and architectures to improve power and thermal integrity of Network-on-Chip (NoC)-based many-core systems. The thesis is composed of two major parts: i) minimization and modelling of power supply variations to improve power integrity; and ii) dynamic thermal adaptation to improve thermal integrity. This thesis makes four major contributions. The first is a computational model of on-chip power supply variations in NoCs. The proposed model embeds a power delivery model, an NoC activity simulator and a power model. The model is verified with SPICE simulation and employed to analyse power supply variations in synthetic and real NoC workloads. Novel observations regarding power supply noise correlation with different traffic patterns and routing algorithms are found. The second is a new application mapping strategy aiming vii to minimize power supply noise in NoCs. This is achieved by defining a new metric, switching activity density, and employing a force-based objective function that results in minimizing switching density. Significant reductions in power supply noise (PSN) are achieved with a low energy penalty. This reduction in PSN also results in a better link timing accuracy. The third contribution is a new dynamic thermal-adaptive routing strategy to effectively diffuse heat from the NoC-based threedimensional (3D) CMPs, using a dynamic programming (DP)-based distributed control architecture. Moreover, a new approach for efficient extension of two-dimensional (2D) partially-adaptive routing algorithms to 3D is presented. This approach improves three-dimensional networkon- chip (3D NoC) routing adaptivity while ensuring deadlock-freeness. Finally, the proposed thermal-adaptive routing is implemented in field-programmable gate array (FPGA), and implementation challenges, for both thermal sensing and the dynamic control architecture are addressed. The proposed routing implementation is evaluated in terms of both functionality and performance. The methodologies and architectures proposed in this thesis open a new direction for improving the power and thermal integrity of future NoC-based 2D and 3D many-core architectures

    Interconnects architectures for many-core era using surface-wave communication

    Get PDF
    PhD ThesisNetworks-on-chip (NoCs) is a communication paradigm that has emerged aiming to address on-chip communication challenges and to satisfy interconnection demands for chip-multiprocessors (CMPs). Nonetheless, there is continuous demand for even higher computational power, which is leading to a relentless downscaling of CMOS technology to enable the integration of many-cores. However, technology downscaling is in favour of the gate nodes over wires in terms of latency and power consumption. Consequently, this has led to the era of many-core processors where power consumption and performance are governed by inter-core communications rather than core computation. Therefore, NoCs need to evolve from being merely metalbased implementations which threaten to be a performance and power bottleneck for many-core efficiency and scalability. To overcome such intensified inter-core communication challenges, this thesis proposes a novel interconnect technology: the surface-wave interconnect (SWI). This new RF-based on-chip interconnect has notable characteristics compared to cutting-edge on-chip interconnects in terms of CMOS compatibility, high speed signal propagation, low power dissipation, and massive signal fan-out. Nonetheless, the realization of the SWI requires investigations at different levels of abstraction, such as the device integration and RF engineering levels. The aim of this thesis is to address the networking and system level challenges and highlight the potential of this interconnect. This should encourage further research at other levels of abstraction. Two specific system-level challenges crucial in future many-core systems are tackled in this study, which are cross-the-chip global communication and one-to-many communication. This thesis makes four major contributions towards this aim. The first is reducing the NoC average-hop count, which would otherwise increase packet-latency exponentially, by proposing a novel hybrid interconnect architecture. This hybrid architecture can not only utilize both regular metal-wire and SWI, but also exploits merits of both bus and NoC architectures in terms of connectivity compared to other general-purpose on-chip interconnect architectures. The second contribution addresses global communication issues by developing a distance-based weighted-round-robin arbitration (DWA) algorithm. This technique prioritizes global communication to be send via SWI short-cuts, which offer more efficient power dissipation and faster across-the-chip signal propagation. Results obtained using a cycleaccurate simulator demonstrate the effectiveness of the proposed system architecture in terms of significant power reduction, considervii able average delay reduction and higher throughput compared to a regular NoC. The third contribution is in handling multicast communications, which are normally associated with traffic overload, hotspots and deadlocks and therefore increase, by an order of magnitude the power consumption and latency. This has been achieved by proposing a novel routing and centralized arbitration schemes that exploits the SWI0s remarkable fan-out features. The evaluation demonstrates drastic improvements in the effectiveness of the proposed architecture in terms of power consumption ( 2-10x) and performance ( 22x) but with negligible hardware overheads ( 2%). The fourth contribution is to further explore multicast contention handling in a flexible decentralized manner, where original techniques such as stretch-multicast and ID-tagging flow control have been developed. A comparison of these techniques shows that the decentralized approach is superior to the centralized approach with low traffic loads, while the latter outperforms the former near and after NoC saturation

    Embedded dynamic programming networks for networks-on-chip

    Get PDF
    PhD ThesisRelentless technology downscaling and recent technological advancements in three dimensional integrated circuit (3D-IC) provide a promising prospect to realize heterogeneous system-on-chip (SoC) and homogeneous chip multiprocessor (CMP) based on the networks-onchip (NoCs) paradigm with augmented scalability, modularity and performance. In many cases in such systems, scheduling and managing communication resources are the major design and implementation challenges instead of the computing resources. Past research efforts were mainly focused on complex design-time or simple heuristic run-time approaches to deal with the on-chip network resource management with only local or partial information about the network. This could yield poor communication resource utilizations and amortize the benefits of the emerging technologies and design methods. Thus, the provision for efficient run-time resource management in large-scale on-chip systems becomes critical. This thesis proposes a design methodology for a novel run-time resource management infrastructure that can be realized efficiently using a distributed architecture, which closely couples with the distributed NoC infrastructure. The proposed infrastructure exploits the global information and status of the network to optimize and manage the on-chip communication resources at run-time. There are four major contributions in this thesis. First, it presents a novel deadlock detection method that utilizes run-time transitive closure (TC) computation to discover the existence of deadlock-equivalence sets, which imply loops of requests in NoCs. This detection scheme, TC-network, guarantees the discovery of all true-deadlocks without false alarms in contrast to state-of-the-art approximation and heuristic approaches. Second, it investigates the advantages of implementing future on-chip systems using three dimensional (3D) integration and presents the design, fabrication and testing results of a TC-network implemented in a fully stacked three-layer 3D architecture using a through-silicon via (TSV) complementary metal-oxide semiconductor (CMOS) technology. Testing results demonstrate the effectiveness of such a TC-network for deadlock detection with minimal computational delay in a large-scale network. Third, it introduces an adaptive strategy to effectively diffuse heat throughout the three dimensional network-on-chip (3D-NoC) geometry. This strategy employs a dynamic programming technique to select and optimize the direction of data manoeuvre in NoC. It leads to a tool, which is based on the accurate HotSpot thermal model and SystemC cycle accurate model, to simulate the thermal system and evaluate the proposed approach. Fourth, it presents a new dynamic programming-based run-time thermal management (DPRTM) system, including reactive and proactive schemes, to effectively diffuse heat throughout NoC-based CMPs by routing packets through the coolest paths, when the temperature does not exceed chip’s thermal limit. When the thermal limit is exceeded, throttling is employed to mitigate heat in the chip and DPRTM changes its course to avoid throttled paths and to minimize the impact of throttling on chip performance. This thesis enables a new avenue to explore a novel run-time resource management infrastructure for NoCs, in which new methodologies and concepts are proposed to enhance the on-chip networks for future large-scale 3D integration.Iraqi Ministry of Higher Education and Scientific Research (MOHESR)

    Adaptive Routing Approaches for Networked Many-Core Systems

    Get PDF
    Through advances in technology, System-on-Chip design is moving towards integrating tens to hundreds of intellectual property blocks into a single chip. In such a many-core system, on-chip communication becomes a performance bottleneck for high performance designs. Network-on-Chip (NoC) has emerged as a viable solution for the communication challenges in highly complex chips. The NoC architecture paradigm, based on a modular packet-switched mechanism, can address many of the on-chip communication challenges such as wiring complexity, communication latency, and bandwidth. Furthermore, the combined benefits of 3D IC and NoC schemes provide the possibility of designing a high performance system in a limited chip area. The major advantages of 3D NoCs are the considerable reductions in average latency and power consumption. There are several factors degrading the performance of NoCs. In this thesis, we investigate three main performance-limiting factors: network congestion, faults, and the lack of efficient multicast support. We address these issues by the means of routing algorithms. Congestion of data packets may lead to increased network latency and power consumption. Thus, we propose three different approaches for alleviating such congestion in the network. The first approach is based on measuring the congestion information in different regions of the network, distributing the information over the network, and utilizing this information when making a routing decision. The second approach employs a learning method to dynamically find the less congested routes according to the underlying traffic. The third approach is based on a fuzzy-logic technique to perform better routing decisions when traffic information of different routes is available. Faults affect performance significantly, as then packets should take longer paths in order to be routed around the faults, which in turn increases congestion around the faulty regions. We propose four methods to tolerate faults at the link and switch level by using only the shortest paths as long as such path exists. The unique characteristic among these methods is the toleration of faults while also maintaining the performance of NoCs. To the best of our knowledge, these algorithms are the first approaches to bypassing faults prior to reaching them while avoiding unnecessary misrouting of packets. Current implementations of multicast communication result in a significant performance loss for unicast traffic. This is due to the fact that the routing rules of multicast packets limit the adaptivity of unicast packets. We present an approach in which both unicast and multicast packets can be efficiently routed within the network. While suggesting a more efficient multicast support, the proposed approach does not affect the performance of unicast routing at all. In addition, in order to reduce the overall path length of multicast packets, we present several partitioning methods along with their analytical models for latency measurement. This approach is discussed in the context of 3D mesh networks.Siirretty Doriast

    Spatial parallelism in the routers of asynchronous on-chip networks

    Get PDF
    State-of-the-art multi-processor systems-on-chip use on-chip networks as their communication fabric. Although most on-chip networks are implemented synchronously, asynchronous on-chip networks have several advantages over their synchronous counterparts. Timing division multiplexing (TDM) flow control methods have been utilized in asynchronous on-chip networks extensively. The synchronization required by TDM leads to significant speed penalties. Compared with using TDM methods, spatial parallelism methods, such as the spatial division multiplexing (SDM) flow control method, achieve better network throughput with less area overhead.This thesis proposes several techniques to increase spatial parallelism in the routers of asynchronous on-chip networks.Channel slicing is a new pipeline structure that alleviates the speed penalty by removing the synchronization among bit-level data pipelines. It is also found out that the lookahead pipeline using early evaluated acknowledgement can be used in routers to further improve speed.SDM is a new flow control method proposed for asynchronous on-chip networks. It improves network throughput without introducing synchronization among buffers of different frames, which is required by TDM methods. It is also found that the area overhead of SDM is smaller than the virtual channel (VC) flow control method -- the most used TDM method. The major design problem of SDM is the area consuming crossbars. A novel 2-stage Clos switch structure is proposed to replace the crossbar in SDM routers, which significantly reduces the area overhead. This Clos switch is dynamically reconfigured by a new asynchronous Clos scheduler.Several asynchronous SDM routers are implemented using these new techniques. An asynchronous VC router is also reproduced for comparison. Performance analyses show that the SDM routers outperform the VC router in throughput, area overhead and energy efficiency.EThOS - Electronic Theses Online ServiceGBUnited Kingdo
    corecore