

GLOBAL JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY: C SOFTWARE & DATA ENGINEERING Volume 14 Issue 4 Version 1.0 Year 2014 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global Journals Inc. (USA) Online ISSN: 0975-4172 & Print ISSN: 0975-4350

# Energy Efficient Branch and Bound based On-Chip Irregular Network Design

By Kalpana Jain, Naveen Choudhary & Dharm Singh

University of Agriculture and Technology, India

*Abstract-* We present a technique that constructs the topology of an application-specific NoC for a heterogeneous SoC such that the total dynamic communication energy is optimized. The topology is guaranteed to satisfy the constraints on node degree as well as link length. We first lay out the topology by finding shortest paths for the traffic characteristics using the branch and bound optimization technique. Deadlock is handled by escape routing over a spanning tree. Experimental results show that the proposed design methodology is fast and achieves a significant dynamic energy gain.

*Keywords:* network on chip, shortest path, branch and bound, routing.

GJCST-C Classification: C.2.1






© 2014. Kalpana Jain, Naveen Choudhary & Dharm Singh. This is a research/review paper, distributed under the terms of the Creative Commons Attribution-Noncommercial 3.0 Unported License (http://creativecommons.org/licenses/by-nc/3.0/), permitting all non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.


### I. INTRODUCTION

Interconnection networks are used to meet the communication demands of the numerous processing elements in high-end parallel supercomputers and telecom switches, and more recently they are also widely used to meet the communication requirements of complex SoCs [1] with numerous processing elements. With the development of integration technology, the System on Chip (SoC), composed of numerous cores on a single chip, has entered the billion-transistor era. As the microprocessor industry moves from single-core to multi-core architectures, efficient communication among processors is required. A high-performance, flexible, scalable and design-friendly interconnection network is therefore highly desirable for new SoC and chip designs. These interconnection networks for complex SoCs are also referred to as on-chip networks. Networks on Chip (NoCs) have emerged as a viable option for designing scalable communication architectures for MPSoCs. In a NoC, on-chip micro-networks interconnect the various cores; they handle communication better than bus-based systems. Early work targeted standard topologies such as mesh and torus, where traffic cannot be statically predicted; the challenges are different, however, for heterogeneous SoCs whose cores differ in size, function and communication requirements. In an irregular NoC each node can be connected to one or more cores, as per the constraints and design requirements, and such networks are therefore best suited to application-specific custom NoC design [2, 3]. Here we propose a branch and bound (B&B) based heuristic for the design of customized energy-efficient irregular NoCs, assuming an area-optimized floor plan as a prerequisite.

*Author* α σ ρ: Deptt. of CSE, CTAE, MPUAT, Udaipur.

### II. Communication Energy and Irregular NoC

The design of an irregular Network on Chip raises two main issues for the proposed work: calculating the energy dissipated in the network during data transfer, and finding the routes between the cores of the network. We therefore discuss the energy model and the routing methods used.

### a) Energy Model

Ye et al. [4] proposed a model for the communication energy of on-chip networks. In regular networks the channels between cores are of uniform length. The energy dissipated in transferring 1 bit of data from a source core to a destination core comprises both router energy and channel energy, as follows.

Router energy ($E_{Rbit}$):

$$E_{Rbit} = E_{Sbit} + E_{Bbit} + E_{Wbit} \tag{1}$$

where $E_{Sbit}$, $E_{Bbit}$ and $E_{Wbit}$ correspond to the dynamic energy consumed by the switch, the buffering and the interconnection links within the switching fabric, respectively. The dynamic energy dissipated on the channels between cores ($E_{Lbit}$) should also be considered; thus the dynamic energy dissipated in transferring one bit of information from a tile to its adjacent tile is

$$E_{bit} = E_{Rbit} + E_{Lbit} \tag{2}$$

Thus the communication energy required to send 1 bit of information from source tile $t_j$ to destination tile $t_k$ is

$$E_{bit}^{j,k} = n_{hops} \times E_{Rbit} + (n_{hops} - 1) \times E_{Lbit} \tag{3}$$

where $n_{hops}$ is the number of tiles on the route from source tile $t_j$ to destination tile $t_k$, and $E_{Lbit}$ is the bit energy of the channel between adjacent tiles (the channel length is uniform for all adjacent tiles in regular networks).

For irregular networks the channel length is not uniform, as channels are laid subject only to the maximum link-length constraint. Thus the second term of Eq. (3) is replaced by the summation of the energy dissipated by each channel on the route from source tile $t_j$ to destination tile $t_k$.
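As a minimal sketch of how Eq. (3) and its irregular-network variant might be evaluated, the Python functions below compute the per-bit route energy. The function names and the numeric values are illustrative assumptions, not taken from the paper, and energies are in arbitrary units.

```python
def route_energy_regular(nhops, e_rbit, e_lbit):
    """Eq. (3): nhops routers and nhops - 1 channels of equal bit energy."""
    return nhops * e_rbit + (nhops - 1) * e_lbit


def route_energy_irregular(link_energies, e_rbit):
    """Irregular case: one bit-energy term per channel on the route.

    link_energies holds the E_Lbit value of every channel traversed,
    so the route visits len(link_energies) + 1 tiles.
    """
    nhops = len(link_energies) + 1
    return nhops * e_rbit + sum(link_energies)


# Three tiles, two channels (hypothetical values).
print(route_energy_regular(3, 2.0, 1.0))        # 8.0
print(route_energy_irregular([1.0, 3.0], 2.0))  # 10.0
```

When every channel has the same bit energy, the irregular form reduces to Eq. (3).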

e-mails: kalpana\_jain2@rediffmail.com, naveenc121@yahoo.com, dharm94@gmail.com

### b) Routing in Irregular NoC

Popular routing algorithms for irregular topologies, such as Left-Right routing [8] and up-down routing [6], use the turn model [7] to avoid deadlock. In the proposed work, shortest paths are laid as the preferred routing paths from the source tile to the destination tile in order to minimize communication energy; a methodology based on escape routes [5] is used to achieve deadlock freedom in communication.

### III. Proposed Methodology for Energy Efficient Branch and Bound based On-Chip Irregular Network Design

In the presented work, an energy-efficient topology design is proposed. To design an energy-efficient topology, the channels should be laid such that they yield shortest paths for communication. This is achieved by finding the shortest paths for the application's communication characteristics while respecting the constraints on node degree and channel length; the maximum channel length must not be exceeded because of physical signaling delay. Moreover, the connectivity of the network is ensured by creating a spanning tree for the network, and a constraint based on up/down routing is used to achieve deadlock freedom in communication.
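The up/down constraint mentioned above can be illustrated with a short Python sketch. It assumes the textbook form of up/down routing: a breadth-first spanning tree is built from a root, and a route may not take an "up" link after it has taken a "down" link. The paper does not spell out its exact variant, so the function names, the tie-breaking by node id and the four-node example are hypothetical.

```python
from collections import deque


def bfs_levels(adj, root):
    """Breadth-first spanning tree: return the depth of each node below root."""
    level = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in level:
                level[v] = level[u] + 1
                queue.append(v)
    return level


def is_up(u, v, level):
    """A hop u -> v is 'up' if v is closer to the root (ties broken by id)."""
    return (level[v], v) < (level[u], u)


def is_legal_updown(path, level):
    """Up/down rule: once a route has gone down it may not go up again."""
    gone_down = False
    for u, v in zip(path, path[1:]):
        if is_up(u, v, level):
            if gone_down:
                return False
        else:
            gone_down = True
    return True


# Hypothetical four-node ring: 0-1, 0-2, 1-3, 2-3.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
level = bfs_levels(adj, root=0)
print(is_legal_updown([3, 1, 0, 2], level))  # True: up, up, down
print(is_legal_updown([1, 3, 2], level))     # False: down, then up
```

Because every node can reach the root going up and be reached from it going down, a legal route always exists between any two nodes of a connected network, which is what makes such a rule usable as an escape path.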



*Figure 1:* Design flow of Energy Efficient Branch and Bound based On-Chip Irregular Network Design

The application traffic characteristics are first clustered by source. Using the minimum-energy (shortest) path as the optimization criterion, a branch and bound method then builds a customized energy-efficient irregular on-chip network: the sources with the highest data rates are routed first along shortest paths, and branch and bound is used to reach the optimized solution.

The design flow given in Figure 1 shows the inputs taken by the topology synthesizer: the traffic characteristics, the constraints, and the tile coordinates, which give the Manhattan distances used to lay the channels.
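The two physical constraints named above can be checked with a few lines of Python. This is only a sketch under assumed data structures: `coords` maps a tile id to its (x, y) position, `degree` maps a tile id to the number of channels already attached, and `max_degree` and `max_length` are the constraint values. None of these names come from the paper.

```python
def manhattan(a, b):
    """Manhattan distance between two (x, y) tile coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def can_add_link(u, v, coords, degree, max_degree, max_length):
    """True if a channel u-v respects both the degree and the length limit."""
    if u == v:
        return False
    if degree[u] >= max_degree or degree[v] >= max_degree:
        return False
    return manhattan(coords[u], coords[v]) <= max_length


# Hypothetical floor plan: tile 1 already has four channels.
coords = {0: (0, 0), 1: (2, 1), 2: (5, 5)}
degree = {0: 0, 1: 4, 2: 0}
print(can_add_link(0, 2, coords, degree, max_degree=4, max_length=10))  # True
print(can_add_link(0, 1, coords, degree, max_degree=4, max_length=10))  # False
```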

### a) Branch and Bound (B&B) Topology Generation

A branch and bound [19] based optimization technique is developed to design a topology that is efficient in dynamic communication energy and custom tailored to the predefined traffic requirements, under the necessary constraints on node degree, channel length and routing. Figure 2 shows a partial representation of the nodes generated in the search tree; a traffic requirement is routed at each stage to build up the communication energy efficient topology.

The nodes of the tree can be one of the following types:

*Root:* No traffic characteristics are routed; the root represents the problem (energy-efficient topology) to be optimized.

*Internal node:* Each number in the label represents the priority of a traffic characteristic that has been routed. For example, node 201XX represents the partial routing of the traffic characteristics with priority numbers 2, 0 and 1, while the traffic characteristics with priorities 3 and 4 are still unrouted.



*Figure 2:* Partial representation of the nodes of the tree.

*Leaf:* All the traffic characteristics are routed and the topology is complete; the leaf node with minimum cost is selected.

Every node is associated with a cost, an upper bound cost (UBC) and a lower bound cost (LBC).

*Cost:* The cost of a node is the energy consumed in communication by the traffic characteristics routed so far.

The UBC and LBC of a node help to determine whether the node can lead to the optimal solution, and thus keep the search from becoming exhaustive.

Finding the optimal solution to the energy-efficient topology problem amounts to searching for the leaf node with minimum cost. Branching expands a node (creates child nodes) by routing the next unrouted traffic characteristic, and bounding checks whether the child nodes can lead to a better solution. This check compares each child's UBC and LBC with the global UBC and with those of its parent node. If a child's cost or LBC is greater than the global minimum UBC, the child is discarded.

### b) The calculation of UBC and LBC

*UBC calculation:* The UBC of a node is calculated by routing all remaining unrouted traffic characteristics with a greedy method. At each step of the greedy method, the unrouted traffic characteristic with the highest communication demand is selected and its path is laid by the shortest path method. This step is repeated until all traffic characteristics are routed, which yields a complete routing and identifies a leaf node. If this node is illegal it is discarded; otherwise it is saved for future expansion.

*LBC calculation:* The LBC of a node is calculated by finding paths for all remaining unrouted traffic characteristics without considering the topology constraints in path setup. This step is repeated for all remaining unrouted traffic characteristics.

A priority queue is used to speed up the search for the optimal leaf node. Nodes are inserted in sorted order of their cost; once the queue is full, nodes are inserted only if they lead to a better solution.
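The interplay of branching, the two bounds and the priority queue can be sketched in Python on a deliberately simplified cost model. This is not the authors' implementation, and several things are assumptions made only for illustration: `base[f]` is a hypothetical unconstrained energy of flow `f`, `conflict[f]` lists flows that force `f` onto a costlier detour if they are routed first (a stand-in for the degree and length constraints), and `penalty` is the extra energy of that detour. The queue here is unbounded and ordered by LBC, whereas the text describes a bounded queue ordered by cost.

```python
import heapq
from itertools import count


def step_cost(routed, f, base, conflict, penalty):
    """Energy added by routing flow f after the flows in `routed` (toy model)."""
    return base[f] + penalty * len(conflict[f] & set(routed))


def greedy_ubc(routed, cost, base, conflict, penalty):
    """UBC: complete the routing greedily, largest remaining flow first."""
    order = list(routed)
    remaining = sorted((f for f in range(len(base)) if f not in routed),
                       key=lambda f: -base[f])
    for f in remaining:
        cost += step_cost(order, f, base, conflict, penalty)
        order.append(f)
    return cost, tuple(order)


def lbc(routed, cost, base):
    """LBC: price the remaining flows with every constraint ignored."""
    return cost + sum(base[f] for f in range(len(base)) if f not in routed)


def branch_and_bound(base, conflict, penalty):
    """Return (minimum total energy, routing order) for the toy model."""
    best_cost, best_order = greedy_ubc((), 0, base, conflict, penalty)
    tie = count()  # tie-breaker so the heap never compares tuples of flows
    heap = [(lbc((), 0, base), next(tie), 0, ())]
    while heap:
        bound, _, cost, routed = heapq.heappop(heap)
        if bound >= best_cost:
            break  # smallest LBC cannot beat the best leaf: search is done
        for f in range(len(base)):  # branch: route one more flow
            if f in routed:
                continue
            child = routed + (f,)
            c = cost + step_cost(routed, f, base, conflict, penalty)
            ub, order = greedy_ubc(child, c, base, conflict, penalty)
            if ub < best_cost:
                best_cost, best_order = ub, order
            lb = lbc(child, c, base)
            if lb < best_cost and len(child) < len(base):  # bound
                heapq.heappush(heap, (lb, next(tie), c, child))
    return best_cost, best_order


base = [5, 3, 4]
conflict = {0: {1}, 1: {2}, 2: set()}
print(branch_and_bound(base, conflict, penalty=10))  # (12, (0, 1, 2))
```

In this example the greedy completion from the root costs 22, while the search finds the order 0, 1, 2 with cost 12 and then stops because no queued node has an LBC below 12.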

### c) The Proposed Methodology: Algorithm



### IV. EXPERIMENTAL RESULTS

The random data sets required to evaluate the proposed methodology were generated using TGFF [18] with diverse communication data rates for the cores. An on-chip simulator is used for evaluation. The router energy dissipated is evaluated using the power simulator Orion [15, 20] for 0.18 μm technology. The dynamic bit energy dissipated on an inter-node link ($E_{Lbit}$) can be computed using the equation below.

$$E_{Lbit} = \frac{1}{2} \times \alpha \times C_{phy} \times V_{DD}^{2}$$

where

1. $\alpha$ is the average probability of a one-to-zero or zero-to-one transition between two successive samples in the stream for a specific bit; an average value of $\alpha = 0.5$ is assumed.
2. $C_{phy}$ is the physical capacitance of the inter-node wire.
3. $V_{DD}$ is the supply voltage.
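For a feel of the magnitudes involved, the link energy formula can be evaluated directly. The Python sketch below uses the value of α = 0.5 given above; the capacitance of 1 pF and the supply voltage of 1.8 V are purely illustrative assumptions, since the paper does not report the values it used.

```python
def link_bit_energy(c_phy, v_dd, alpha=0.5):
    """E_Lbit = (1/2) * alpha * C_phy * V_DD^2, in joules for farads and volts."""
    return 0.5 * alpha * c_phy * v_dd ** 2


# Hypothetical wire: 1 pF at 1.8 V gives roughly 0.81 pJ per bit.
energy = link_bit_energy(c_phy=1e-12, v_dd=1.8)
print(f"{energy * 1e12:.2f} pJ")
```

Since wire capacitance grows with wire length, longer channels cost more energy per bit, which is why the link-length constraint matters for the energy of an irregular topology.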
### a) B&B Comparison with Genetic Approach on Random and Realistic Benchmarks

The graphs below compare the proposed methodology with the genetic algorithm based methodology proposed by Naveen Choudhary et al. [14] on the same data sets (tile coordinates and traffic characteristics), with a maximum node degree of 4 and a maximum link length of twice the maximum core size.

The graphs show the performance of branch and bound and of [14] over 100 sets of diverse application data. The topologies generated by the proposed B&B method achieve an average flit latency gain in the range of 5% to 20% and an average communication energy gain in the range of 2% to 10% in comparison to [14].



*Figure 3:* Performance comparison of the proposed methodology with the genetic approach on random benchmarks: (a) average communication energy per flit and (b) average flit latency.





*Figure 4:* Performance comparison of the proposed methodology with the genetic approach on realistic benchmarks: (a) average communication energy per flit and (b) average flit latency.

### V. Conclusion

The paper presents a B&B based technique for designing an energy-efficient, custom-tailored topology for irregular on-chip networks. The customized topology is designed according to the predefined traffic requirements. Incorporating the necessary constraints of maximum node degree, maximum channel length, deadlock-free communication and an area-optimized floor plan makes the proposed methodology a realistic solution. The results show that the proposed method is able to generate more energy-efficient networks than the popular evolutionary approach [14].

The proposed work can be extended in quite a few ways, such as incorporating floor planning into the proposed methodology, which can be expected to yield more energy-efficient networks, possibly at the cost of increased area overhead. Another extension of the proposed work is the design of irregular 3D energy-efficient on-chip networks.

### References

1. Dally, W. J., and Towles, B. 2001. Route packets, not wires: On-chip interconnection networks. In Design Automation Conference, Proceedings. pp. 684-689. IEEE.
2. S. Murali, G. De Micheli. 2004. SUNMAP: A Tool for Automatic Topology Selection and Generation for NoCs. In Proceedings of DAC.
3. K. Srinivasan et al. 2005. An Automated Technique for Topology and Route Generation of Application Specific On-Chip Interconnection Networks. In Proc. ICCAD.
4. Ye, T. T., Benini, L., & De Micheli, G. 2002. Analysis of power consumption on switch fabrics in network routers. In Design Automation Conference, Proceedings. 39th. pp. 524-529. IEEE.
5. Silla, F., & Duato, J. 2000. High-performance routing in networks of workstations with irregular topology. IEEE Transactions on Parallel and Distributed Systems, 11(7), 699-719.
6. Silla, F., Malumbres, M. P., Robles, A., López, P., & Duato, J. 1997. Efficient adaptive routing in networks of workstations with irregular topology. In Communication and Architectural Support for Network-Based Parallel Computing. Springer Berlin Heidelberg. pp. 46-60.
7. Glass, C. J., & Ni, L. M. 1992. The turn model for adaptive routing. In ACM SIGARCH Computer Architecture News, 20(2). pp. 278-287.
8. Koibuchi, M., Funahashi, A., Jouraku, A., & Amano, H. 2001. L-turn routing: An adaptive routing in irregular networks. In IEEE International Conference on Parallel Processing. pp. 383-392.
9. Benini, L., & De Micheli, G. 2002. Networks on chips: a new SoC paradigm. Computer, 35(1). pp. 70-78.
10. Ogras, U. Y., Hu, J., & Marculescu, R. 2005. Key research problems in NoC design: a holistic perspective. In Proceedings of the 3rd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis. pp. 69-74.
11. Hu, J., & Marculescu, R. 2003. Energy-aware mapping for tile-based NoC architectures under performance constraints. In Proceedings of the Asia and South Pacific Design Automation Conference. pp. 233-239.
12. Schroeder, M. D., Birrell, A. D., Burrows, M., Murray, H., Needham, R. M., Rodeheffer, T. L., & Thacker, C. P. 1991. Autonet: A high-speed, self-configuring local area network using point-to-point links. In IEEE Journal on Selected Areas in Communications, 9(8). pp. 1318-1335.
13. Choudhary, N., Gaur, M. S., Laxmi, V., & Singh, V. 2011. GA based congestion aware topology generation for application specific NoC. In Sixth IEEE International Symposium on Electronic Design, Test and Application (DELTA). pp. 93-98.
14. Choudhary, N., Gaur, M. S., Laxmi, V., & Singh, V. 2010. Energy aware design methodologies for application specific NoC. In IEEE 28th Norchip Conference, Finland, ISBN 978-1-4244-8971-8. pp. 1-4.
15. Wang, H. S., Zhu, X., Peh, L. S., & Malik, S. 2002. Orion: a power-performance simulator for interconnection networks. In Proc. International Symposium on Microarchitecture.
16. Jain, L., Al-Hashimi, B. M., Gaur, M. S., Laxmi, V., & Narayanan, A. 2007. NIRGAM: a simulator for NoC interconnects routing and application modeling. In Design, Automation and Test in Europe Conference.
17. Hu, J., & Marculescu, R. 2005. Energy- and performance-aware mapping for regular NoC architectures. In IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 24(4). pp. 551-562.
18. Dick, R. P., Rhodes, D. L., & Wolf, W. 1998. TGFF: task graphs for free. In Proceedings of the 6th International Workshop on Hardware/Software Codesign. IEEE Computer Society. pp. 97-101.
19. Cormen, Thomas H., Leiserson, Charles E., Rivest, Ronald L., and Stein, Clifford. 2009. Introduction to Algorithms, third edition. The MIT Press.
20. A. B. Kahng, B. Li, L. S. Peh, K. Samadi, "Orion 2.0: a fast and accurate NoC power and area model for early-stage design space exploration," in DATE'09, pp. 423-428, 2009.
