Networks on Chip (NoCs) replace traditional buses in highly integrated Multiprocessor Systems on Chip (MPSoCs). As in other SoCs, communication is a critical concern in NoCs, which must provide a contention-free architecture with low latency. To meet this need, several methods such as handshaking mechanisms and arbiter designs have been developed and implemented. This paper presents scheduler designs based on the iSLIP scheduling algorithm and a comparative analysis of these schedulers with different arbiters. All the arbiters are described in Verilog HDL and synthesized using Xilinx tools. For performance analysis, Cadence RTL Compiler with UMC 0.13 µm technology is used to compute the power and area of each arbiter design.
The basic elements of a NoC are core processors, core interfaces, network interfaces, routers and physical links. NoC topologies include mesh, ring, tree, mesh of trees and butterfly; mesh is the most widely preferred. The router is the heart of the NoC. It consists of buffers, a crossbar, an arbiter and allocators, and it coordinates data transfer from input to output based on information received from the allocators [4]. Most NoCs use routers with five input and five output ports: four I/O ports connect to adjacent routers and one connects to the core processor. Buffers hold data temporarily and release it to the crossbar, which works as a switch. Among these basic modules, flow control of the channels, whether virtual-channel or wormhole, plays a vital role in achieving high throughput. NoC routers should provide high-speed data transmission when multiple data packets from different input ports are destined for the same output. A fast arbiter is the dominant factor in high-throughput NoCs. For this reason, this study analyzes the effect of different arbiters on the scheduler. Figure 2 describes the general architecture of a NoC router. Section 2 gives details of the RRA and HRRA, Section 3 deals with the scheduler, Section 4 analyzes the performance of the different arbiters in the scheduler, and Section 5 concludes our work [5].
ARBITER
Arbiters play an important role in scheduler design. The scheduler uses two arbiters, one for the grant process and one for the accept process. The two arbiters are identical except for the mechanism that determines the priority state. This section describes two types of arbiter logic: Round Robin and Hierarchical Round Robin.
Round Robin Arbiter (RRA)
The main goal of the RRA is to produce a grant signal for one request at a time among multiple requests from the input ports. A general RRA consists of two blocks: an input selector and a pointer updater. For N input request signals from the source ports, the grant signal (E) is encoded in log2(N) bits. The pointer updater holds a pointer (RPT) that identifies the request to be given highest priority in the next grant cycle. The input selector generates a grant signal based on priority, giving the highest priority to the input indicated by the pointer output. An optional signal can be included to indicate that no request is present. Figure 3 shows the block diagram of the RRA [6-8].
Hierarchical Round Robin Arbiter (HRRA)
To achieve flexible priority changes and a circular priority order as the number of request inputs increases, a hierarchical approach is applied to the RRA. In the HRRA, the inputs are first divided into k subsets, called sub-RRAs, each of which handles local requests. Local requests and grants are processed in multiple stages of sub-RRAs based on the priority settings. A simple pass signal provides smooth transitions between sub-RRAs.
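The arbiter behaviour described above can be illustrated with a small behavioural model. The sketch below is written in Python rather than Verilog and is not the synthesized design: the function names, the equal-sized subsets (k is assumed to divide N) and the way the pass between subsets is modelled are all assumptions made for illustration.

```python
def round_robin_grant(requests, pointer):
    """Flat round-robin arbiter (behavioural sketch).

    requests: list of bools, one per input port.
    pointer:  index of the input that currently has highest priority.
    Returns (granted_index, next_pointer); granted_index is None when
    no input is requesting, and the pointer is then left unchanged.
    """
    n = len(requests)
    for offset in range(n):
        idx = (pointer + offset) % n
        if requests[idx]:
            # Next search starts one position beyond the granted input.
            return idx, (idx + 1) % n
    return None, pointer


def hierarchical_grant(requests, pointer, k):
    """Hierarchical variant: the N inputs are split into k subsets.

    Each subset is searched locally; a subset with no eligible local
    request passes the turn to the next subset (hypothetical model of
    the pass signal). Assumes k divides len(requests).
    """
    n = len(requests)
    size = n // k
    start_group = pointer // size
    local_ptr = pointer % size
    for step in range(k + 1):
        group = (start_group + step) % k
        base = group * size
        if step == 0:
            candidates = range(local_ptr, size)   # at or after the pointer
        elif step == k:
            candidates = range(0, local_ptr)      # wrapped back to start group
        else:
            candidates = range(size)              # whole subset is eligible
        for j in candidates:
            if requests[base + j]:
                idx = base + j
                return idx, (idx + 1) % n
    return None, pointer
```

In this model both functions select the same winner for any request pattern; the hierarchical form only changes how the search is organised, which is the property a hardware HRRA exploits by using smaller sub-arbiters.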
SCHEDULER
The scheduler acts as a central switch arbiter and analyzes the requests in the queues. It configures the input ports and the interconnect to connect input and output ports, and allows serial data transfer. Based on the scheduling algorithm, the scheduler sets up a large number of connections simultaneously while avoiding conflicts in which multiple inputs contend for a single output. Figure 5 shows the scheduler implementation chosen for this design. The input to the scheduler is the set of occupancy vectors from each of the input_blocks with packets waiting to be scheduled. The main part of the scheduler is the arbiter. The scheduler consists of two sets of arbiters, one to perform the grant function and another to perform the accept function. The number of grant and accept arbiters depends on the number of modules/processors to be integrated in the NoC. Here we designed the scheduler using the RRA and the HRRA for the grant and accept processes. The scheduler also has an 8-bit busy input from each of the switch outputs. An output port is excluded from grant arbitration if its busy signal is asserted [10].
Fig 5: Architecture of Scheduler
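As a rough illustration of how a busy output is kept out of grant arbitration, the Python sketch below masks the occupancy vectors before they reach the grant arbiters. The function name and the modelling of the busy input as one flag per output are assumptions for illustration, not details taken from the design.

```python
def mask_busy_outputs(occupancy, busy):
    """Remove requests aimed at busy outputs.

    occupancy[i][j] is True when input i has a packet waiting for output j.
    busy[j] is True when output j asserts its busy signal (assumed to be
    one flag per output).
    Returns the request matrix seen by the grant arbiters.
    """
    return [
        [req and not busy[j] for j, req in enumerate(row)]
        for row in occupancy
    ]
```

With this masking, a busy output receives no requests and therefore issues no grant in that cycle.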
In this paper we use the iSLIP algorithm for scheduling. iSLIP is developed from SLIP and adds multiple iterations. The SLIP algorithm performs a single iteration, after which some inputs and outputs that could still be matched remain unused. The steps of the iSLIP algorithm are as follows:
International Journal of Computer Applications (0975 -8887)
Volume 62-No.14, January 2013
Step 1: Request. Each unmatched input sends a request to every output for which it has a queued cell.
Step 2: Grant. If an unmatched output receives any requests, it chooses the one that appears next in a fixed, round-robin schedule starting from the highest priority element. The output notifies each input whether or not its request was granted. The pointer to the highest priority element of the round-robin schedule is incremented to one location beyond the granted input only if the grant is accepted in Step 3 of the first iteration.
Step 3: Accept. If an unmatched input receives a grant, it accepts the one that appears next in a fixed, round-robin schedule starting from the highest priority element. The pointer to the highest priority element of the round-robin schedule is incremented to one location beyond the accepted output only if this input was matched in the first iteration.
This scheduler can be extended to any topology with N modules or processors in the NoC [11].
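The request, grant and accept steps can be sketched as a short behavioural model. The Python code below is a minimal illustration under stated assumptions, not the Verilog implementation: the names islip, pick_round_robin, grant_ptr and accept_ptr are hypothetical, the switch is assumed to be N x N, and flat round-robin selection is used for both arbiters.

```python
def pick_round_robin(candidates, pointer, n):
    """Return the first index at or after pointer (cyclically) in candidates."""
    for offset in range(n):
        idx = (pointer + offset) % n
        if idx in candidates:
            return idx
    return None


def islip(requests, grant_ptr, accept_ptr, iterations=1):
    """Run iSLIP for one cell time.

    requests[i][j]: True when input i has a queued cell for output j.
    grant_ptr[j]:   round-robin pointer of output j (updated in place).
    accept_ptr[i]:  round-robin pointer of input i (updated in place).
    Returns a dict mapping each matched input to its output.
    """
    n = len(requests)
    match = {}
    matched_outputs = set()
    for it in range(iterations):
        # Steps 1 and 2: each unmatched output grants one unmatched requester.
        grants = {}
        for j in range(n):
            if j in matched_outputs:
                continue
            requesters = {i for i in range(n)
                          if i not in match and requests[i][j]}
            winner = pick_round_robin(requesters, grant_ptr[j], n)
            if winner is not None:
                grants.setdefault(winner, set()).add(j)
        if not grants:
            break  # nothing more can be matched
        # Step 3: each input that received grants accepts exactly one.
        for i, offered in grants.items():
            j = pick_round_robin(offered, accept_ptr[i], n)
            match[i] = j
            matched_outputs.add(j)
            if it == 0:
                # Pointers move only for matches made in the first iteration.
                grant_ptr[j] = (i + 1) % n
                accept_ptr[i] = (j + 1) % n
    return match
```

For example, when inputs 0 and 1 both request outputs 0 and 1 and all pointers start at 0, a single iteration matches only input 0 to output 0, because both outputs grant input 0. A second iteration then matches input 1 to output 1, which is the unused capacity that the extra iterations recover.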
RESULTS AND DISCUSSION
As discussed above, scheduling plays a vital role in providing congestion-free and efficient data transfer from multiple inputs to a single output. The main part of the scheduler is the arbiter, which selects among the pending requests based on priority and issues the grant signal. In this paper we designed the scheduler with both the RRA and the HRRA. The performance metrics of the scheduler with RRA and with HRRA are listed in Tables 1, 2 and 3. We first designed the arbiters in round robin and hierarchical round robin fashion, and then the scheduler with RRA and with HRRA. The scheduler and all sub-modules are described in Verilog HDL and functionally verified using Xilinx ISim. Figure 5 shows the simulation output of the iSLIP scheduler with HRRA. The other performance parameters were calculated using Cadence RTL Compiler with 130 nm technology. The technology library file used for the calculation is fsc0h_d_sc_bc from Europractice. The arbiter is the major part of the scheduler and consumes more power than the other modules, which motivated us to analyze arbiter performance in terms of area, power and delay. Schedulers are generally designed with round robin arbiters. We therefore also designed the RRA and HRRA individually and measured their parameters: the total power is 157759.59 nW for the RRA and 60184.9 nW for the HRRA, and the total cell area is 1756.32 µm² and 841.61 µm² respectively. Applying the hierarchical scheme to the arbiter design therefore reduces power to 38.15% and area to 47.92% of the RRA values, that is, reductions of 61.85% and 52.08% respectively. Detailed power parameters are listed in Table 1.
To obtain a low-area, low-power and high-speed scheduler, we designed the iSLIP scheduler with HRRA. To assess the efficiency of the proposed design, it is compared with the iSLIP scheduler with RRA. The synthesis reports give total power values of 1682114.88 nW for iSLIP with RRA and 1443539.97 nW for iSLIP with HRRA. Area and timing parameters are listed in Tables 2 and 3.
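The relative savings can be checked directly from the reported figures. The short Python calculation below uses only the power and area values quoted in this section; the helper name percent_reduction is introduced here for illustration. It shows that the HRRA consumes 38.15% of the RRA power and occupies 47.92% of its area, which corresponds to reductions of about 61.85% and 52.08%, and that the iSLIP scheduler with HRRA uses about 14.18% less power than the one with RRA.

```python
def percent_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Stand-alone arbiters (RRA vs HRRA), values as reported in the text.
arbiter_power = percent_reduction(157759.59, 60184.9)    # nW
arbiter_area = percent_reduction(1756.32, 841.61)        # square micrometres

# iSLIP scheduler with RRA vs with HRRA, total power in nW.
scheduler_power = percent_reduction(1682114.88, 1443539.97)

print(f"arbiter power reduction:   {arbiter_power:.2f}%")
print(f"arbiter area reduction:    {arbiter_area:.2f}%")
print(f"scheduler power reduction: {scheduler_power:.2f}%")
```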
CONCLUSIONS
This paper proposed a scheduler with a hierarchical round robin scheme to provide low area, low power and high speed. To achieve this, the hierarchical approach was applied to the arbiter design, because the arbiter is the core of the scheduler. The scheduler provides contention-free data transfer from input ports to an output port when multiple input ports request a single output port at the same time.
