Search CORE

4,181 research outputs found

An Energy and Performance Exploration of Network-on-Chip Architectures

Author: Arnab Banerjee
Gerard J. M. Smit
Pascal T. Wolkotte
Robert D. Mullins
Senior Member
Simon W. Moore
Student Member
Publication venue: IEEE Circuits and Systems Society
Publication date: 01/01/2009
Field of study

In this paper, we explore the designs of a circuit-switched router, a wormhole router, a quality-of-service (QoS) supporting virtual channel router and a speculative virtual channel router and accurately evaluate the energy-performance tradeoffs they offer. Power results from the designs placed and routed in a 90-nm CMOS process show that all the architectures dissipate significant idle state power. The additional energy required to route a packet through the router is then shown to be dominated by the data path. This leads to the key result that, if this trend continues, the use of more elaborate control can be justified and will not be immediately limited by the energy budget. A performance analysis also shows that dynamic resource allocation leads to the lowest network latencies, while static allocation may be used to meet QoS goals. Combining the power and performance figures then allows an energy-latency product to be calculated to judge the efficiency of each of the networks. The speculative virtual channel router was shown to have a very similar efficiency to the wormhole router, while providing a better performance, supporting its use for general purpose designs. Finally, area metrics are also presented to allow a comparison of implementation costs

CiteSeerX

University of Twente Research Information

Cycle-accurate evaluation of reconfigurable photonic networks-on-chip

Author: Artundo Iñigo
Debaes Christof
Heirman Wim
Thienpont Hugo
Van Campenhout Jan
Publication venue: 'SPIE-Intl Soc Optical Eng'
Publication date: 01/01/2010
Field of study

There is little doubt that the most important limiting factors of the performance of next-generation Chip Multiprocessors (CMPs) will be the power efficiency and the available communication speed between cores. Photonic Networks-on-Chip (NoCs) have been suggested as a viable route to relieve the off- and on-chip interconnection bottleneck. Low-loss integrated optical waveguides can transport very high-speed data signals over longer distances as compared to on-chip electrical signaling. In addition, with the development of silicon microrings, photonic switches can be integrated to route signals in a data-transparent way. Although several photonic NoC proposals exist, their use is often limited to the communication of large data messages due to a relatively long set-up time of the photonic channels. In this work, we evaluate a reconfigurable photonic NoC in which the topology is adapted automatically (on a microsecond scale) to the evolving traffic situation by use of silicon microrings. To evaluate this system's performance, the proposed architecture has been implemented in a detailed full-system cycle-accurate simulator which is capable of generating realistic workloads and traffic patterns. In addition, a model was developed to estimate the power consumption of the full interconnection network which was compared with other photonic and electrical NoC solutions. We find that our proposed network architecture significantly lowers the average memory access latency (35% reduction) while only generating a modest increase in power consumption (20%), compared to a conventional concentrated mesh electrical signaling approach. When comparing our solution to high-speed circuit-switched photonic NoCs, long photonic channel set-up times can be tolerated which makes our approach directly applicable to current shared-memory CMPs

Crossref

Ghent University Academic Bibliography

Information Switching Processor (ISP) contention analysis and control

Author: Inukai T.
Shyy D.
Publication venue
Publication date
Field of study

Future satellite communications, as a viable means of communications and an alternative to terrestrial networks, demand flexibility and low end-user cost. On-board switching/processing satellites potentially provide these features, allowing flexible interconnection among multiple spot beams, direct to the user communications services using very small aperture terminals (VSAT's), independent uplink and downlink access/transmission system designs optimized to user's traffic requirements, efficient TDM downlink transmission, and better link performance. A flexible switching system on the satellite in conjunction with low-cost user terminals will likely benefit future satellite network users

NASA Technical Reports Server

Recommended from our members

Isochronets: a High-Speed Network Switching Architecture

Author: Florissi Danilo
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/1993
Field of study

Traditional switching techniques need hundred- or thousand-MIPS processing power within switches to support Gbit/s transmission rates available today. These techniques anchor their decision-making on control information within transmitted frames and thus must resolve routes at the speed in which frames are being pumped into switches. Isochronets can potentially switch at any transmission rate by making switching decisions independent of frame contents. Isochronets divide network bandwidth among routing trees, a technique called Route Division Multiple Access (RDMA). Frames access network resources through the appropriate routing tree to the destination. Frame structures are irrelevant for switching decisions. Consequently, Isochronets can support multiple framing protocols without adaptation layers and are strong candidates for all-optical implementations. All network-layer functions are reduced to an admission control mechanism designed to provide quality of service (QOS) guarantees for multiple classes of traffic. The main results of this work are: (1) A new network architecture suitable for high-speed transmissions; (2) An implementation of Isochronets using cheap off-theshelf components; (3) A comparison of RDMA with more traditional switching techniques, such as Packet Switching and Circuit Switching; (4) New protocols necessary for Isochronet operations; and (5) Use of Isochronet techniques at higher layers of the protocol stack (in particular, we show how Isochronet techniques may solve routing problems in ATM networks)

Columbia University Academic Commons

Simulation Of Multi-core Systems And Interconnections And Evaluation Of Fat-Mesh Networks

Author: Zhang Yu
Publication venue
Publication date: 28/01/2009
Field of study

Simulators are very important in computer architecture research as they enable the exploration of new architectures to obtain detailed performance evaluation without building costly physical hardware. Simulation is even more critical to study future many-core architectures as it provides the opportunity to assess currently non-existing computer systems. In this thesis, a multiprocessor simulator is presented based on a cycle accurate architecture simulator called SESC. The shared L2 cache system is extended into a distributed shared cache (DSC) with a directory-based cache coherency protocol. A mesh network module is extended and integrated into SESC to replace the bus for scalable inter-processor communication. While these efforts complete an extended multiprocessor simulation infrastructure, two interconnection enhancements are proposed and evaluated. A novel non-uniform fat-mesh network structure similar to the idea of fat-tree is proposed. This non-uniform mesh network takes advantage of the average traffic pattern, typically all-to-all in DSC, to dedicate additional links for connections with heavy traffic (e.g., near the center) and fewer links for lighter traffic (e.g., near the periphery). Two fat-mesh schemes are implemented based on different routing algorithms. Analytical fat-mesh models are constructed by presenting the expressions for the traffic requirements of personalized all-to-all traffic. Performance improvements over the uniform mesh are demonstrated in the results from the simulator. A hybrid network consisting of one packet switching plane and multiple circuit switching planes is constructed as the second enhancement. The circuit switching planes provide fast paths between neighbors with heavy communication traffic. A compiler technique that abstracts the symbolic expressions of benchmarks' communication patterns can be used to help facilitate the circuit establishment

D-Scholarship@Pitt

The Effect Of Hot Spots On The Performance Of Mesh--Based Networks

Author: Al-Issa Yazan M
Publication venue: LSU Digital Commons
Publication date: 01/08/1999
Field of study

Direct network performance is affected by different design parameters which include number of virtual channels, number of ports, routing algorithm, switching technique, deadlock handling technique, packet size, and buffer size. Another factor that affects network performance is the traffic pattern. In this thesis, we study the effect of hotspot traffic on system performance. Specifically, we study the effect of hotspot factor, hotspot number, and hot spot location on the performance of mesh-based networks. Simulations are run on two network topologies, both the mesh and torus. We pay more attention to meshes because they are widely used in commercial machines. Comparisons between oblivious wormhole switching and chaotic packet switching are reported. Overall packet switching proved to be more efficient in terms of throughput when compared to wormhole switching. In the case of uniform random traffic, it is shown that the differences between chaotic and oblivious routing are indistinguishable. Networks with low number of hotspots show better performance. As the number of hotspots increases network latency tends to increase. It is shown that when the hotspot factor increases, performance of packet switching is better than that of wormhole switching. It is also shown that the location of hotspots affects network performance particularly with the oblivious routers since their achieved latencies proved to be more vulnerable to changes in the hotspot location. It is also shown that the smaller the size of the network the earlier network saturation occurs. Further, it is shown that the chaos router’s adaptivity is useful in this case. Finally, for tori, performance is not greatly affected by hotspot presence. This is mostly due to the symmetric nature of tori

Louisiana State University

Scheduling and reconfiguration of interconnection network switches

Author: Roy Krishnendu
Publication venue: LSU Digital Commons
Publication date: 01/01/2009
Field of study

Interconnection networks are important parts of modern computing systems, facilitating communication between a system\u27s components. Switches connecting various nodes of an interconnection network serve to move data in the network. The switch\u27s delay and throughput impact the overall performance of the network and thus the system. Scheduling efficient movement of data through a switch and configuring the switch to realize a schedule are the main themes of this research. We consider various interconnection network switches including (i) crossbar-based switches, (ii) circuit-switched tree switches, and (iii) fat-tree switches. For crossbar-based input-queued switches, a recent result established that logarithmic packet delay is possible. However, this result assumes that packet transmission time through the switch is no less than schedule-generation time. We prove that without this assumption (as is the case in practice) packet delay becomes linear. We also report results of simulations that bear out our result for practical switch sizes and indicate that a fast scheduling algorithm reduces not only packet delay but also buffer size. We also propose a fast mesh-of-trees based distributed switch scheduling (maximal-matching based) algorithm that has polylog complexity. A circuit-switched tree (CST) can serve as an interconnect structure for various computing architectures and models such as the self-reconfigurable gate array and the reconfigurable mesh. A CST is a tree structure with source and destination processing elements as leaves and switches as internal nodes. We design several scheduling and configuration algorithms that distributedly partition a given set of communications into non-conflicting subsets and then establish switch settings and paths on the CST corresponding to the communications. A fat-tree is another widely used interconnection structure in many of today\u27s high-performance clusters. We embed a reconfigurable mesh inside a fat-tree switch to generate efficient connections. We present an R-Mesh-based algorithm for a fat-tree switch that creates buses connecting input and output ports corresponding to various communications using that switch

Louisiana State University