Developments in modern electronics technologies, such as communication, the Internet, pervasive and ubiquitous computing, and ambient intelligence, figure largely in our lives. Today, microelectronic products shape the ways we learn, communicate and entertain ourselves. These products, such as laptop computers, mobile phones and personal handheld sets, are becoming faster, lighter, smaller, larger in capacity, lower in power consumption, cheaper and functionally enhanced. This trend will continue. Following it, we can integrate more and more complex applications, and even whole systems, onto a single chip. System-on-Chip (SoC) technologies, in which complex applications are integrated onto single ULSI chips, have become a key driving force for these developments.
Another challenge is communication architecture [2, 3]. Most current SoCs have a bus-based architecture, using simple, crossbar-type or hierarchical buses. Unlike chip capacity, buses do not scale well with system size in terms of bandwidth, clock frequency and power, for the following reasons. First, as the number of clients grows, the intrinsic resistance and capacitance of the bus also increase, so bus speed is inherently difficult to scale up. Second, a bus has very limited capability for concurrent communication, since only one device can use a bus segment at a time. Third, the bus is energy-inefficient: because every data transfer is broadcast, the entire bus wire has to be switched on and off, and the data reaches every receiver at great energy cost. Although improvements such as split-transaction protocols and advanced arbitration schemes have been proposed for buses, these methods cannot address the fundamental problems. To exploit future chip capacity for high-throughput and low-power applications, hundreds of processor-sized resources must be integrated. A bus-based architecture would then become a critical performance and power bottleneck because of this scalability problem.
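The concurrency argument can be illustrated with a deliberately simple back-of-the-envelope model. The sketch below is not taken from the cited works: the function names, the choice of a square 2D mesh as the comparison network, and the bandwidth figure of 64.0 units are all assumptions made for illustration. It contrasts the bandwidth each core sees on a single shared bus with the bandwidth available across the bisection of a mesh.

```python
import math

def bus_per_core_bandwidth(total_bw, n_cores):
    """Shared bus (assumed model): one transfer at a time, so the
    cores share total_bw and each sees total_bw / n_cores on average."""
    return total_bw / n_cores

def mesh_bisection_bandwidth(link_bw, n_cores):
    """Square k x k mesh (assumed topology): k links cross the cut that
    splits the mesh in half, so bandwidth across it is k * link_bw."""
    k = math.isqrt(n_cores)
    if k * k != n_cores:
        raise ValueError("n_cores must be a perfect square")
    return k * link_bw

if __name__ == "__main__":
    for n in (16, 64, 256):
        print(n, bus_per_core_bandwidth(64.0, n), mesh_bisection_bandwidth(64.0, n))
```

Under these assumptions the per-core share of the bus shrinks in proportion to the number of cores, while the mesh bisection bandwidth grows with the square root of the number of cores. The model ignores wire delay, arbitration and energy, which the text above identifies as further limits on buses.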
Network-on-Chip (NoC) was proposed in the SoC community around 2001 in response to these challenges [4, 5]. On-chip networks are implemented on a single chip and designed for closed systems targeting homogeneous or heterogeneous applications.
With the steady advance of electronic downsizing, we have reached a point at which processor performance can no longer be increased in traditional ways: on the one hand, all significant microarchitectural improvements are already in place and, on the other hand, clock frequency has reached the limits imposed by the power and heat dissipation capabilities of current technology. For this reason, the microprocessor community has been forced to move towards multicore architectures with an increasingly high number of processing cores. As the number of cores increases, established on-chip communication infrastructures (buses) start to become a performance bottleneck, because a centralised, shared medium results in higher power consumption and transmission latency as well as lower available bandwidth per core. Decentralised infrastructures (networks on chip) are therefore gaining importance and will be essential once a critical number of cores is reached. Many projects have attempted to investigate such infrastructures in a holistic manner, covering several areas of opportunity (network topology, router microarchitecture, interfaces) and considering different figures of merit (power, performance, area, fault tolerance).
As networks-on-chip (NoCs) become the de facto communication fabric connecting cores and cache banks in chip multiprocessors (CMPs), routing algorithms, one of the key components influencing NoC latency, are the subject of extensive research. Static routing algorithms are inexpensive to implement but, unlike adaptive routing algorithms, do not perform well under non-uniform or bursty traffic. Adaptive routing algorithms estimate the congestion levels of output ports to avoid routing traffic over congested ports. Because global adaptive routing algorithms are not restricted to local information for congestion estimation, they are prime candidates for balancing traffic in NoCs.
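The difference between static and adaptive routing can be made concrete with a minimal sketch. The text does not name a topology or a specific algorithm, so the following are assumptions chosen for illustration: a 2D mesh with (x, y) router coordinates, dimension-ordered XY routing as the static scheme, and a minimal adaptive scheme that uses a hypothetical `congestion` dictionary of per-port estimates. How those estimates are gathered (locally or globally) is outside the sketch.

```python
def xy_route(cur, dst):
    """Static dimension-ordered routing: correct X first, then Y.
    The chosen port depends only on the current and destination routers."""
    (cx, cy), (dx, dy) = cur, dst
    if cx != dx:
        return "E" if dx > cx else "W"
    if cy != dy:
        return "N" if dy > cy else "S"
    return "LOCAL"

def adaptive_route(cur, dst, congestion):
    """Minimal adaptive routing: among the ports that move the packet
    closer to dst, pick the one with the lowest congestion estimate.
    `congestion` maps port name -> estimate (hypothetical; missing = 0)."""
    (cx, cy), (dx, dy) = cur, dst
    candidates = []
    if cx != dx:
        candidates.append("E" if dx > cx else "W")
    if cy != dy:
        candidates.append("N" if dy > cy else "S")
    if not candidates:
        return "LOCAL"
    # On a tie, min() keeps the first candidate, i.e. the X-dimension port.
    return min(candidates, key=lambda port: congestion.get(port, 0))

if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
    print(adaptive_route((0, 0), (2, 1), {"E": 5, "N": 1}))
```

With the east port congested, the static router still sends the packet east, whereas the adaptive router diverts it north. A real adaptive router would also need a deadlock-avoidance mechanism, such as turn restrictions or virtual channels, which this sketch omits.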
For computing at scales beyond a million processors, bio-inspired massively parallel architectures have been considered as a way to provide more adaptability in supporting heterogeneous applications [6]. This research area poses challenges that range from understanding brain function at the highest level, through automatic mapping of neural models onto a highly distributed computation platform, down to architecture optimization and run-time fault and energy management. To remove the routing delay from the critical path of a router, researchers proposed look-ahead routing, in which routing decisions are made one hop in advance. With the routing delay off the critical path, researchers then proposed dynamic routing methods, which are more complex, to distribute traffic more evenly across the network.
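Look-ahead routing can be sketched in a few lines. The example below is an illustration under stated assumptions rather than a description of any particular router: it assumes a 2D mesh, uses dimension-ordered XY routing as the routing function, and models the packet header as carrying a precomputed output port. The names `STEP`, `xy_route` and `lookahead_traverse` are hypothetical.

```python
# Coordinate change for each output port in the assumed 2D mesh.
STEP = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1)}

def xy_route(cur, dst):
    """Dimension-ordered routing function (assumed for illustration)."""
    (cx, cy), (dx, dy) = cur, dst
    if cx != dx:
        return "E" if dx > cx else "W"
    if cy != dy:
        return "N" if dy > cy else "S"
    return "LOCAL"

def lookahead_traverse(src, dst):
    """Return the routers visited when every routing decision is made
    one hop in advance and carried in the packet header."""
    cur = src
    port = xy_route(cur, dst)  # computed once at injection
    path = [cur]
    while port != "LOCAL":
        step_x, step_y = STEP[port]
        nxt = (cur[0] + step_x, cur[1] + step_y)
        # Look-ahead step: the current router already knows its own output
        # port from the header, so it computes the port for the NEXT router.
        # In hardware this can overlap with the other pipeline stages.
        next_port = xy_route(nxt, dst)
        cur, port = nxt, next_port
        path.append(cur)
    return path

if __name__ == "__main__":
    print(lookahead_traverse((0, 0), (2, 1)))
```

The point of the sketch is that no router waits on its own routing computation: the port it needs arrives with the packet, and the computation it performs serves the downstream router.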
The current, and almost certainly the next, generation of multiprocessor hardware relies on wired connectivity between many-core processors in a hierarchy of cores, chips, boards and racks. At present, all of these components are interconnected using high-speed serial or bus connections. The performance requirements of NoC infrastructures in future technology nodes cannot be met by relying only on material innovation with traditional scaling. The continuing demand for low-power, high-speed interconnects as technology scales requires looking beyond conventional planar metal/dielectric-based interconnect infrastructures.
