
5 research outputs found

Recursive partitioning multicast: a bandwidth-efficient routing for networks-on-chip

Authors: Eun Jung Kim, Hyungjun Kim, Lei Wang, Yuho Jin
Publication date: 01/01/2009

Chip Multi-processor (CMP) architectures have become mainstream in processor design. With a large number of cores, Networks-on-Chip (NOCs) provide a scalable communication method for CMP architectures. NOCs must be carefully designed to meet power and area constraints while providing ultra-low latency. Existing NOCs mostly use Dimension Order Routing (DOR) to determine the route taken by a packet in unicast traffic. However, with the development of diverse applications in CMPs, one-to-many (multicast) and one-to-all (broadcast) traffic are becoming more common. Current unicast routing cannot support multicast and broadcast traffic efficiently. In this paper, we propose Recursive Partitioning Multicast (RPM) routing and a detailed multicast wormhole router design for NOCs. RPM allows routers to select intermediate replication nodes based on the global distribution of destination nodes. This provides more path diversity, which improves bandwidth efficiency and, in turn, the performance of the whole network. Simulation results from a detailed cycle-accurate simulator show that, compared with the most recent multicast scheme, RPM saves 25% of crossbar and link power and 33% of link utilization, with a 50% improvement in network performance. RPM is also more scalable to large networks than the recently proposed VCTM.
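The claim that unicast routing handles multicast inefficiently can be illustrated with a small sketch. It assumes a 2D mesh with XY dimension-order routing and compares the link traversals needed when a source sends one unicast packet per destination against a single packet that is replicated only where the XY paths diverge. The function names and the example coordinates are illustrative assumptions; this is not the RPM algorithm itself, only the baseline it improves on.

```python
def xy_route(src, dst):
    """Return the directed links (node, next_node) used by XY routing on a 2D mesh."""
    x, y = src
    links = []
    # Travel along the X dimension first.
    while x != dst[0]:
        nx = x + (1 if dst[0] > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    # Then travel along the Y dimension.
    while y != dst[1]:
        ny = y + (1 if dst[1] > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


def unicast_link_traversals(src, dsts):
    """Link traversals when every destination gets its own unicast packet."""
    return sum(len(xy_route(src, d)) for d in dsts)


def tree_link_traversals(src, dsts):
    """Link traversals when shared path prefixes carry a single copy.

    XY paths from one source form a tree, so the number of distinct
    links equals the number of traversals with in-network replication.
    """
    links = set()
    for d in dsts:
        links.update(xy_route(src, d))
    return len(links)


if __name__ == "__main__":
    src = (0, 0)
    dsts = [(3, 0), (3, 1), (3, 2)]
    print(unicast_link_traversals(src, dsts))  # 12
    print(tree_link_traversals(src, dsts))     # 5
```

In this example the three destinations share most of their path, so replicating inside the network uses 5 link traversals instead of 12.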

CiteSeerX

Crossref

A study of on-chip FPGA system with 2D mesh network

Author: Keung Ka-ming
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2010

Advances in fabrication technology have hugely increased the number of transistors available on a single chip. This allows the industry to build on a single chip an entire system that was previously realizable only on a board. An on-chip system not only reduces the physical size of the computer, but also increases computation performance because modules, cores, and intellectual properties (IPs) are packed closely together. As wire delay makes it harder to improve performance simply by increasing the clock frequency, putting more computation units on a single chip becomes a good alternative. More cores per chip are expected in the future. With many IPs on a chip, a traditional bus can no longer provide enough bandwidth to support communication between IPs. A high-performance on-chip network infrastructure for IP communication therefore becomes a key to high-performance on-chip computation. This thesis focuses on an on-chip network supporting an on-chip system and is composed of two main parts. In the first part, a high-performance, deadlock-free, dual-coded on-chip router using adaptive multicast routing is built. Compared with the traditional deterministic XY unicast router, this router reduces both packet latency and energy consumption. In the second part, a co-processor placement algorithm is proposed for an on-chip system built from FPGAs with an on-chip network. The algorithm aims to place communicating modules as close together as possible. In addition, an algorithm for sharing an FPGA among multiple co-processors and an algorithm for supporting polymorphic co-processors are proposed to increase on-chip FPGA system throughput.
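One way to make "place communicating modules as close together as possible" concrete is a traffic-weighted Manhattan distance over mesh coordinates. The sketch below is an assumption about how such a placement could be scored, not the thesis's algorithm; the module names, the `placement_cost` function, and the traffic volumes are all hypothetical.

```python
def placement_cost(placement, traffic):
    """Sum of traffic volume times Manhattan hop distance on a 2D mesh.

    placement: dict mapping module name -> (x, y) mesh coordinate
    traffic:   dict mapping (module_a, module_b) -> communication volume
    """
    total = 0
    for (a, b), volume in traffic.items():
        ax, ay = placement[a]
        bx, by = placement[b]
        total += volume * (abs(ax - bx) + abs(ay - by))
    return total


if __name__ == "__main__":
    # Hypothetical traffic pattern: "cpu" talks mostly to "fft".
    traffic = {("cpu", "fft"): 10, ("cpu", "dma"): 2}
    near = {"cpu": (0, 0), "fft": (0, 1), "dma": (1, 0)}
    far = {"cpu": (0, 0), "fft": (3, 3), "dma": (1, 0)}
    print(placement_cost(near, traffic))  # 12
    print(placement_cost(far, traffic))   # 62
```

A placement search would prefer `near` over `far` under this metric because the heavily communicating pair sits one hop apart.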

Digital Repository @ Iowa State University (ISU)

B-RPM: An Efficient One-to-Many Communication Framework for On-Chip Networks

Author: Shaukat Noman

The prevalence of multicore architectures has accentuated the need for scalable on-chip communication media. Various parallel applications and programming paradigms use a mix of unicast (one-to-one) and multicast (one-to-many) communication to maintain data coherence and consistency. Providing efficient support for these communication patterns is a critical design point for on-chip networks (OCNs). High-performance on-chip network design advocates balanced traffic across the whole network, which makes adaptive routing appealing. Adaptive routing exploits the path diversity of the network, increases throughput, and reduces network latency compared with oblivious routing. In this work, we propose an adaptive multicast routing algorithm, Balanced Recursive Partitioning Multicast (B-RPM), to achieve balanced one-to-many on-chip communication. The algorithm derives its functionality from the previously proposed Recursive Partitioning Multicast (RPM) algorithm. Unlike RPM, which uses a fixed set of directional priorities and the positions of destination nodes, B-RPM replicates packets based on local congestion information and the positions of destination nodes relative to the current node. B-RPM employs a new deadlock avoidance technique, Dynamically Sized Virtual Networks (DSVN). Built upon traditional virtual networks (VNs), DSVN dynamically allocates network resources to different VNs according to run-time traffic status, which delivers better resource utilization. We also propose a new scheme for representing multiple destinations in the packet header. The scheme works simply by differentiating multicast and unicast packets. Combined with DSVN, the algorithm improves network performance at high load by 20% on average (up to 50%) and saturation throughput by 10% on average (up to 18%) over the most recent multicast algorithm. The new header representation scheme also saves 24% of dynamic link power.

Texas A&M Repository

A Scalable and Adaptive Network on Chip for Many-Core Architectures

Author: Heißwolf Jan
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2014

In this work, a scalable network on chip (NoC) for future many-core architectures is proposed and investigated. It supports different QoS mechanisms to ensure predictable communication. Self-optimization is introduced to adapt the energy footprint and the performance of the network to the communication requirements. A fault tolerance concept allows the network to deal with permanent errors. Moreover, a template-based automated evaluation and design methodology and a synthesis flow for NoCs are introduced.

KITopen

Multi-Address Encoding for Multicast

Authors: Chi-Ming Chiang, Lionel M. Ni
Publication date: 01/01/1994

Efficient implementation of multicast communication is critical to the performance of message-based scalable parallel computers and switch-based high speed networks. This paper deals with address issues occurring in the message header for the transmission of multicast messages. Multi-address encoding is becoming critical to system performance as the scale of networks is getting larger and the demand of multicast communication is getting higher. Several multi-address encoding schemes are investigated and explored. Although the proposed multi-address encoding schemes can be applied to networks with different switching techniques, the emphasis of this paper is on the emerging wormhole routing technique.

1 Introduction

Multicast communication, which refers to the delivery of a message from a single source node to a number of destination nodes, is a frequently used communication pattern in distributed-memory parallel computers and computer networks. Efficient implementation of multicast ...
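To show why header encoding matters as networks grow, the sketch below compares two simple ways of carrying a destination set in a message header: an explicit list of node addresses and a bit string with one bit per node. These two schemes, the function names, and the size formulas (which ignore any terminator or length field) are assumptions chosen for illustration; the abstract does not say which schemes the paper evaluates.

```python
def encode_bitstring(dests, num_nodes):
    """Encode a destination set as an integer with one bit per node."""
    mask = 0
    for d in dests:
        if not 0 <= d < num_nodes:
            raise ValueError(f"node {d} is outside 0..{num_nodes - 1}")
        mask |= 1 << d
    return mask


def decode_bitstring(mask):
    """Recover the sorted list of destination node IDs from a bit string."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


def address_bits(num_nodes):
    """Bits needed to name one node out of num_nodes."""
    return max(1, (num_nodes - 1).bit_length())


def list_header_bits(num_dests, num_nodes):
    """Header size when every destination address is listed explicitly."""
    return num_dests * address_bits(num_nodes)


def bitstring_header_bits(num_nodes):
    """Header size for the bit-string scheme: fixed, one bit per node."""
    return num_nodes


if __name__ == "__main__":
    mask = encode_bitstring([0, 3, 5], num_nodes=8)
    print(bin(mask))                  # 0b101001
    print(decode_bitstring(mask))     # [0, 3, 5]
    print(list_header_bits(3, 64))    # 18
    print(bitstring_header_bits(64))  # 64
```

Under these assumptions the explicit list is smaller for sparse destination sets, while the bit string has a fixed cost that pays off only when many nodes are addressed.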

CiteSeerX