On the area and energy scalability of wireless network-on-chip: a model-based benchmarked design space exploration by Abadal Cavallé, Sergi et al.
On the Area and Energy Scalability of Wireless
Network-on-Chip: A Model-based Benchmarked
Design Space Exploration
Sergi Abadal∗, Mario Iannazzo∗, Mario Nemirovsky†, Albert Cabellos-Aparicio∗,
Heekwan Lee‡ and Eduard Alarcón∗
∗NaNoNetworking Center in Catalonia (N3Cat)
Universitat Politècnica de Catalunya, Barcelona, Spain
Email: {abadal,acabello}@ac.upc.edu, mario.enrique.iannazzo@estudiant.upc.edu, eduard.alarcon@upc.edu
†ICREA Senior Research Professor at
Barcelona Supercomputing Center (BSC), Barcelona, Spain
Email: mario.nemirovsky@bsc.es
‡Samsung Advanced Institute of Technology (SAIT)
South Korea
Email: heekwan.lee@samsung.com
Abstract—Networks-on-Chip (NoCs) are emerging as the way
to interconnect the processing cores and the memory within
a chip multiprocessor. As recent years have seen a significant
increase in the number of cores per chip, it is crucial to guarantee
the scalability of NoCs in order to prevent communication from
becoming the next performance bottleneck in multicore processors.
Among other alternatives, the concept of Wireless Network-on-
Chip (WNoC) has been proposed, wherein on-chip antennas
would provide native broadcast capabilities leading to enhanced
network performance. Since energy consumption and chip area
are the two primary constraints, this work aims to explore
the area and energy implications of scaling a WNoC in terms of
(a) the number of cores within the chip, and (b) the capacity of
each link in the network. To this end, an integral design space
exploration is performed, covering implementation aspects (area
and energy), communication aspects (link capacity) and network-
level considerations (number of cores and network architecture).
The study is entirely based upon analytical models, which
allow the WNoC scalability to be benchmarked against a baseline
NoC. Eventually, this investigation will provide qualitative and
quantitative guidelines for the design of future transceivers for
wireless on-chip communication.
Index Terms—Network-on-Chip, Wireless Network-on-Chip,
Multicore Processors, Design Space Exploration, Emerging Inter-
connect Technologies, On-chip Antennas, Wireless Transceivers,
Area, Power
I. INTRODUCTION
In the ever-changing world of microprocessor design, mul-
ticore architectures are currently the dominant trend for both
conventional and high-performance computing. These archi-
tectures consist of the interconnection of several independent
processors or cores, as well as of a multilevel cache to improve
the memory throughput. Communication among these ele-
ments is required for the implementation of diverse signaling
schemes essential for the correct operation of a multiprocessor
and largely impacts upon the computation performance. As the
number of cores within these processing systems increases,
their communication needs rise dramatically, to the point of
turning communication into the major performance bottleneck
of current multicore architectures.
With the aim of coping with the increasing on-chip com-
munication requirements, a common practice has been to
replace traditional bus architectures with networks of on-
chip wires and routers [1]. This approach, also referred to
as Network-on-Chip (NoC), can be understood as the appli-
cation of networking principles and methods upon a set of
electrical interconnects [2], [3]. However, NoCs enabled by
these interconnects present fundamental limitations that point
towards a reduced scalability beyond several tens of cores.
As thoroughly discussed in [4] and references therein, the
available energy for interconnects will soon be under the 100
fJ/bit barrier and will not be enough to cover the requirements
of electrical wires (Table I). Also, their decreasing multicast
performance is foreseen to be a significant issue in future
many-core architectures (more details in Section II).
As a consequence of such limited scalability, considerable
research efforts have been directed towards extending the
original concept of NoC to other interconnect technologies.
Diverse examples can be found in the literature, including
the employment of vertical vias within stacked architectures
[5], [6], of on-chip transmission lines for the transmission of
modulated RF signals [7] or of nanophotonic interconnects
enabling optical on-chip communication [4], [8]. Such emerg-
ing technologies may be used either to completely replace
traditional NoCs [9], [10] or to follow a hybrid approach which
leverages the capabilities of different types of interconnects
[11], [12], aiming to ensure the scalability of on-chip networks
beyond thousands of cores.
In line with the recent research trends, the possibility of
implementing on-chip wireless communication by means of
integrated antennas has been proposed [13]. The resulting
wireless NoCs have garnered considerable interest from the
Fig. 1. Model-based approach employed in this work.
community by virtue of, among others, their native broadcast
and multicast capabilities [14]. Since the medium is shared
among the cores, either multiplexing techniques or medium
access control (MAC) protocols are required to achieve multi-
user communication [15]. As a result, the concept of wireless
NoC has been thus far analyzed in the form of specific net-
work architectures and benchmarked employing traffic patterns
from a set of standard applications. Alternatively, we aim to
provide an interconnect-driven view of this research area by
performing, as the main contribution, a circuit-oriented design
space exploration of wireless NoC.
The employed methodology is summarized in Figure 1. The
investigation is entirely based on analytical models and com-
pares how the area and energy consumption of wireless NoC
scale as a function of the size and bandwidth requirements
of the network for a given architecture. The results are then
compared with those of a baseline electrical NoC (the interested
reader will find data for a 64-core 48-bit instance in Table I)
and of a selection of emerging alternatives. We expect that
this design space exploration will allow for the identification
of the scenarios wherein wireless NoC will potentially outper-
form other interconnect technologies. Further, it will provide
guidelines for the design not only of future transceivers
and protocols for wireless on-chip communication, but also
of network architectures that leverage different interconnect
technologies.
The remainder of this paper is organized as follows. In Section II,
we present a case study that motivates the aim
of this paper. In Section III, we review the state of the art
of the wireless on-chip networking field. After introducing
the analytical framework and general assumptions in Section
IV, the area and energy models for the different interconnect
technologies are described in Sections V and VI, respectively.
The results of the design space exploration are discussed in
Section VII. Section VIII concludes the paper.
II. MOTIVATION
As the integration of a higher number of cores in the same
chip is enabled, the general trend is to scale current multicore
architectures and then to address the resulting increase in
communication demands by means of enhanced on-chip net-
works. Provided that the architecture defines the characteristics
TABLE I
BASELINE NOC PARAMETERS
Parameter                Value   Unit
System
  Chip Area              400     mm²
  CMOS Technology Node   32      nm
  Operation Frequency    5       GHz
  Supply Voltage         1       V
  Topology               Mesh    -
  Number of Links        224     -
Link (per hop)
  Capacity               240     Gbps
  Energy                 540     fJ/bit
  Area                   0.009   mm²
  Static Power           3.8     mW
Router (per hop)
  Energy                 220     fJ/bit
  Area                   0.11    mm²
  Static Power           64      mW
Link and router figures were obtained with ORION [16] assuming a 64-core
system with a datapath width of 48 bits, as well as four virtual channels per
port and a buffer size of four flits per virtual channel.
of these communication demands, multicore processors have
been designed taking into consideration the NoC capabilities.
For instance, multicast has traditionally been a costly commu-
nication in chip environments and has been widely avoided.
This tendency continues because, in conventional NoCs, multicast
messages are broken down into multiple unicast packets and
generate large levels of contention. The work in [17] shows
that conventional NoC latency and throughput suffer a degra-
dation proportional to the multicast traffic intensity and reports
significant reductions even for 1% of multicast traffic in a 4x4
mesh. It is expected that such impact will further increase in
larger networks, as the number of destinations per message
may potentially grow with the number of cores.
Even though on-chip multicast communications have been
traditionally avoided, some architectural methods will need
multicast in order to scale. For instance, cache coherency
protocols normally avoid multicast by storing the state of each
shared variable in a directory. This produces area and energy
overheads proportional to the number of cores and may not
be affordable in many-core systems. Instead, broadcast-based
implementations do not store the state of each variable but
need to issue a broadcast for each coherence operation [18].
In this case, it is shown that improving the NoC multicast
performance results in a significant reduction of both the
interconnect power and execution time for a set of benchmark
applications [19]. The introduction of an effective platform for
the service of multicast messages would be highly beneficial
in this context, but, more importantly, could open the door for
new many-core architectures.
Aware of the importance of this issue, explicit support
for multicast communications within conventional electronic
NoCs has been widely proposed for moderately sized mul-
tiprocessors [17], [19]–[22]. Still, the scalability of these
solutions in terms of performance and cost has not been
discussed in the literature. Figure 2 plots the delay-throughput
characteristic of a two-dimensional electrical mesh in the
presence of broadcast traffic (as a particular case of multi-
[Figure 2 plot: "Electrical Broadcast Test"; axes: Latency [cycles] and Throughput [%]; one curve per N = {16, 36, 64, 144, 256, 576}.]
Fig. 2. Simulated delay-throughput characteristic of electrical meshed NoCs
as a function of the number of nodes, considering pure broadcast traffic.
Links are optimally repeated, with a link width of 64 at a clock frequency
of 5 GHz; whereas routers implement unbalanced tree multicasting with a
minimum routing latency per flit of 4 clock cycles. The throughput is measured in
transmission and is expressed as a percentage of the link capacity.
cast), showing a considerable performance deterioration as the
number of cores is scaled. The results were obtained using
the PhoenixSim framework [23], a cycle-accurate simulator
that includes a wide variety of tools and methods for the
evaluation of NoCs. In light of this, it remains unclear whether
the aforementioned improvements will suffice to enable the use
of traditional architectures in many-core processors.
Alternatively, the introduction of emergent interconnect
technologies has opened a wide range of possibilities for cost-
effective multicast on-chip communications. In 3D NoCs, the
reduced distance among cores both physically and in terms of
number of hops inherently allows for an improved multicast
performance. Also, the employment of one-to-all or all-to-
all channels by means of global RF transmission lines and
nanophotonic waveguides has been inspected [9], [11], [24].
In the case of wireless NoC, the native broadcast capabilities
of such technique show great promise towards implementing
efficient architectural methods for many-core processors, as
detailed and quantified in the following sections. It is important
to note, though, that each of the aforementioned options
presents its particular trade-offs in terms of area, energy and
communication performance.
III. WIRELESS NETWORK-ON-CHIP
The constant improvement in the operating speeds of tran-
sistors has enabled the implementation of multi-GHz digital
and RF circuits. In this context, the concept of on-chip
antenna becomes a possibility since an antenna of a few
millimeters in size is able to radiate at these frequencies
[13]. Also, transceivers suited to the needs of the wireless
chip communications have been developed: a wide variety of
millimeter-wave implementations can be found in the literature
covering many alternatives in terms of technology generation,
modulation or transceiver architecture [25]–[30]. For transmis-
sion ranges of up to a few centimeters, these provide high
multigigabit data rates and it is expected that these figures
will keep increasing as technology evolves. A factor that aims
to quantify the maturity of technology within this context is
proposed in Section IV.
In light of the availability of both on-chip antennas and of
appropriate transceivers, their employment to build Wireless
Networks-on-Chip (WNoC) has been proposed. In this ap-
proach, information is radiated and propagates within the chip
package following different propagation mechanisms [31].
Planar antennas can be used in spite of their typically low
gain in the co-planar direction, in which case communication
takes place by means of space waves that are reflected upon the
chip package. Alternatively, thanks to their potentially larger
radiation efficiency in the chip plane, three-dimensional anten-
nas could lead to achieving wireless communication through
surface waves [32]. However, such antennas require complex
MEMS (microelectromechanical systems) technologies for
their fabrication.
As information may potentially reach any core regardless
of its location, WNoC offers native broadcast capabilities,
as well as the possibility of implementing flexible and one-
hop communications. Multicast messages may actually be
conveyed to the receivers in a few clock cycles, as opposed to
in conventional NoCs. However, as the core density increases,
the size of the millimeter-wave antennas may restrict the scope
of WNoC to hybrid architectures wherein the wireless plane
is employed to interconnect clusters of cores. Although such a
wireless backbone approach allows a reduction of the network
diameter and has been shown to outperform conventional
NoCs [33]–[36], its potential for broadcast-based communi-
cations is limited by the performance of the electrical edges
of the network.
As further CMOS advancements push the operating frequen-
cies towards the terahertz band [37], [38], the implementation
of micrometer antennas becomes feasible. Moreover, novel
planar antennas based on graphene promise to be able to
radiate within this frequency band while being two orders
of magnitude smaller than their metallic counterparts
[39], [40]. In order to drive the antennas, transmitters and
receivers for multigigabit communication at frequencies ranging
from 0.1 to 0.4 THz have already been proposed [41]–
[45]. Additionally, components reaching frequencies of 0.8
THz are under intense research [46]–[49], thus far leading
to the appearance of transmitters and detectors for terahertz
imaging and sensing [50]–[52].
Assuming an evolution similar to that of millimeter-wave
transceivers, terahertz implementations could provide data
rates of hundreds of gigabits per second at the chip scale. By
virtue of this and the potentially reduced size of these terahertz
systems, architectures implementing wireless communication
at the core level can be envisaged [14] and will be considered
throughout this work. In many-core processors, this approach
will likely generate extremely high levels of contention when
accessing the shared medium. Multiplexing techniques may
not be suitable in this scenario due to the large number of
channels required and the implications of this fact upon the
complexity of the transceiver. Instead, a MAC protocol could
arbitrate access to a single broadband channel and enable
the development of broadcast-based WNoC architectures. In
transmission, packets are serialized into bits and broadcast
regardless of the number of intended destinations; whereas
the receiver deserializes the incoming bits and then accepts
or discards the packet after decoding its address. Buffer
requirements for this process will be affordable as long as
the packet rate after deserialization (C/L, where C is the link
capacity in bits per second and L is the packet length in bits)
is below the system clock frequency.
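The buffering condition above (packet rate C/L below the system clock frequency) can be checked numerically. The following is a minimal sketch: the function names and the example values (a 16 Gbps link with 64-bit packets) are illustrative assumptions, while the 5 GHz clock is the operation frequency listed in Table I.

```python
def packet_rate(link_capacity_bps: float, packet_length_bits: int) -> float:
    """Packet rate after deserialization, C / L, in packets per second."""
    if packet_length_bits <= 0:
        raise ValueError("packet length must be positive")
    return link_capacity_bps / packet_length_bits


def buffers_affordable(link_capacity_bps: float,
                       packet_length_bits: int,
                       clock_hz: float) -> bool:
    """True when the packet rate C / L stays below the system clock frequency."""
    return packet_rate(link_capacity_bps, packet_length_bits) < clock_hz


if __name__ == "__main__":
    # Hypothetical example: 16 Gbps link, 64-bit packets, 5 GHz clock.
    rate = packet_rate(16e9, 64)
    print(f"packet rate: {rate / 1e6:.0f} Mpackets/s")
    print("affordable:", buffers_affordable(16e9, 64, 5e9))
```

Under these assumed values the packet rate is well below the clock frequency; the condition would only be violated for much larger capacities or much shorter packets.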
Since the bandwidth is shared among the nodes, the ex-
pected aggregated throughput of WNoC will be extremely low
when compared to a wired NoC. In light of this, first uses of
this broadcast-based platform may be restricted to serving a
selection of control and signaling messages. These are latency-
critical, often dense multicasts, and require lower bandwidths
as they generally represent a small fraction of all the on-
chip traffic. The approach is only feasible provided that this
wireless control plane complements a throughput-oriented
wired NoC that compensates for the low WNoC bandwidth
by transporting the rest of the communication flows. Such a
hybrid NoC could potentially reduce the latency of time-critical
control messages while avoiding a deterioration of the
wired NoC performance, potentially opening the door for new
multiprocessor architectures.
IV. FRAMEWORK AND GENERAL CONSIDERATIONS
Given the stringent requirements of the on-chip communi-
cation scenario, in this work we explore the area and energy
implications of scaling a WNoC system in terms of (a) the
number of effective receivers or network size, and (b) the
capacity of a wireless link. The results of this implementation
study are compared to those of representative examples of
conventional and photonic NoC configurations, being aware
of the main differences among them. For instance, since
we assume that all nodes share a single broadband channel,
the network capacity in WNoC is equal to the capacity of a single link.
Further, the need for a MAC protocol implies that the effective
network throughput in this case will be significantly lower
than the network capacity. In contrast, the network capacity
in wireline NoCs is the sum of the capacities of all the
dedicated links that can simultaneously transmit data. Even
though wireline NoCs will therefore yield much larger network
throughput figures than WNoC for similar link capacities,
the comparison will be performed at the link level. From a
network throughput perspective, it remains unclear whether
this large gap in nominal capacity will be compensated by
the inherent difference in communication typology (i.e. local
against global, unicast against broadcast). In future work, we
will address this issue by investigating both the minimum
wireless capacity requirements of different multiprocessor ar-
chitectures, as well as the potential performance improvements
of adding a wireless control plane.
The Maturity Factor
On the one hand, the relation between the area/energy of a
WNoC and its size in number of nodes can be easily described
by means of simple models, as shown in Sections V-A and
TABLE II
SUMMARY OF TRANSCEIVER SPECIFICATIONS
Technology                   40 - 130 nm CMOS,
                             130 - 250 nm SiGe BiCMOS
Transceiver Architecture     Impulse Radio (IR),
                             Continuous Wave (CW)
Modulation                   On-Off Keying (OOK),
                             Amplitude Shift Keying (ASK),
                             Phase Shift Keying (BPSK, QPSK),
                             Frequency Shift Keying (FSK),
                             Quadrature Amplitude Modulation (QAM)
Operation Frequency (fc)     8 - 820 GHz
Transmission Range (dmax)    1.4 - 210 cm
Data Rate (R)                2 - 18 Gbps
[Figure 3 plots: Area (mm²) and Energy (pJ/bit/cm^(1/2)) versus Data Rate (Gbps).]
Fig. 3. Area and energy of state-of-the-art wireless transceivers [25]–[29],
[41]–[44], [53] as a function of their data rate.
VI-A. On the other hand, it is not straightforward to assess
how the area and energy of a wireless transceiver scale with
its maximum achievable data rate due to the number of factors
involved. Such data rate, which is referred to as link capacity
throughout the paper, depends on the transceiver bandwidth
and the spectral efficiency of the selected modulation: in the
transceiver design process, a given architecture is chosen in
order to achieve the target bandwidth while implementing
the selected modulation. On top of that, the frequency band
wherein the communication will take place imposes additional
requirements on some components of the transceiver, again
depending on the architecture. Finally, the maturity of the
employed technology should be taken into consideration,
especially when reaching extremely high frequency bands.
In order to extract a trend from the state of the art, a
generally accepted approach is to represent the area or energy
efficiency as a function of the data rate of different transceiver
implementations, as done in Figure 3. However, the tendency
shown by such scatter plots is unclear and only covers a range
between 2 and 18 Gbps, rendering its extrapolation inadequate
for the purpose of this work.
In light of the complexity of the analysis and of the
heterogeneity of the state of the art in the field (see Table
II), we will consider the following. Let us define the Maturity
Factor as:
$$M = SE \cdot Q \quad [\mathrm{bps/Hz}] \qquad (1)$$
where $SE = R/B$ is the spectral efficiency of the employed
modulation or data rate over the operation bandwidth, and
$Q = B/f_c$ is the transceiver quality factor or its bandwidth over
the operation frequency. Therefore:
$$M = \frac{R}{f_c} \qquad (2)$$
In summary, the maturity factor tries to evaluate the effi-
ciency of implementing a given modulation and bandwidth in
order to yield a target data rate operating at a target frequency
band. As technology matures, we expect highly optimized
transceivers leading to increasing maturity factors, that is,
higher data rates for similar area and energy values. For a
transceiver at a given operation frequency and with certain
area and energy efficiency figures, we will a priori assume a
maturity value in order to extract a projected data rate. This
way, a rough estimate of the area and energy efficiency of
future wireless transceivers can be obtained.
Figure 4 shows the maturity factor of several state-of-the-
art transceivers [25]–[30], [41]–[44], [53] as a function of
their frequency. We observe factors of up to 35% at the 60
GHz band followed by a decrease below 5% when reaching
sub-THz frequencies. These values will be used throughout
this work as reference guidelines indicating the maturity of
wireless transceivers at a given frequency. We will consider
that initial designs could achieve a maturity factor of up to
10%, while refined implementations may reach 20% and
well-established transceivers could provide 30%. However,
this rule of thumb may find exceptions as novel technologies
are introduced. For instance, an impressive 100-Gbps wireless
transmission was recently accomplished using a photonic
transmitter at a carrier frequency of 237.5 GHz, resulting in a
maturity factor above 42% [54].
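As a worked illustration of Equation (2), the sketch below computes the maturity factor of a transceiver from its data rate and operation frequency, and inverts the relation to project a data rate from an assumed maturity value. The function names are hypothetical; the 100 Gbps at 237.5 GHz example is the one quoted above from [54], and the 300 GHz / 20% case is only an illustrative input.

```python
def maturity_factor(data_rate_gbps: float, carrier_ghz: float) -> float:
    """Maturity factor M = R / f_c (Equation 2), in bps/Hz."""
    if carrier_ghz <= 0:
        raise ValueError("carrier frequency must be positive")
    return data_rate_gbps / carrier_ghz


def projected_data_rate_gbps(maturity: float, carrier_ghz: float) -> float:
    """Projected data rate R = M * f_c for an assumed maturity factor."""
    return maturity * carrier_ghz


if __name__ == "__main__":
    # Photonic transmitter example from the text: 100 Gbps at 237.5 GHz.
    m = maturity_factor(100.0, 237.5)
    print(f"M = {m:.3f}")

    # Illustrative projection: a 'refined' design (M = 20%) at 300 GHz.
    print(f"R = {projected_data_rate_gbps(0.20, 300.0):.1f} Gbps")
```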
Eventually, the feasibility of the WNoC approach will be
determined by the data rate requirements of the system. These
could be met with current designs as transceivers with such
performance have already been proposed [30]. Data rates up
to 60 Gbps may be achievable in the near future provided
that either technologies at 100-300 GHz mature and reach
a reasonable factor of 20%, or initial designs appear in the
terahertz band. In order to reach speeds above 60 Gbps, mid-term
efforts are required in order to raise the maturity of
transceivers in the terahertz band close to well-established
levels.
V. AREA MODELS
While integration levels have been constantly increasing
over the years, die sizes have practically stayed constant.
Recently, 3D stacking techniques have emerged allowing the
integration of devices in various vertically stacked layers. Still,
the chip area is a finite resource that needs to be carefully
managed: the area devoted to a given NoC will not be available
for the core implementation and vice versa.
In order to calculate the area overhead of an on-chip
interconnection network, we will use the following general
expression:
[Figure 4 plot: Maturity Factor versus Operation Frequency (GHz).]
Fig. 4. Maturity factor as a function of the operation frequency of the
transceiver for proposals [25]–[29], [41]–[44], [53].
$$A = N_{TX} A_{TX} + N_{RX} A_{RX} + N_L A_L + N_R A_R \qquad (3)$$
where $N_i$ and $A_i$ indicate the number of components of type
$i$ and their mean area occupancy, with the types divided into
transmitters (TX), receivers (RX), links (L) and routers,
switches or other arbitration mechanisms (R). In the following,
we will detail the analytical models that relate the number of
components and their area to the number of nodes of the net-
work and the targeted link capacity in wireless, electrical and
photonic NoCs. Note that the area figures will be independent
of the traffic typology, as the considered NoCs are designed
to support both unicast and multicast.
A. Wireless NoC Area Models
In the case of wireless on-chip communication, physical
links are not needed in order to convey the information
from the transmitter to the receiver. Moreover, switches or
routers are not required if we assume one-hop communication.
Therefore, the only components that occupy chip area are the
antennas and the transceivers needed to modulate the data and
to drive the signals to the antenna. We will assume one antenna
and one transceiver per node, even though configurations with
multiple antennas could be devised. Also, the analysis does
not consider the area occupied by the logic required for the
MAC protocol. For all this, Equation (3) can be reduced to:
$$A = N_{TX} A_{TX} + N_{RX} A_{RX} = N (A_{ant} + A_{txrx}) \qquad (4)$$
where $N$ is the number of nodes in the network, $A_{ant}$
is the antenna area and $A_{txrx}$ is the transceiver area. The
antenna and transceiver area will be mainly determined by the
on-chip communication requirements. In order to achieve a
given goal, the wireless plane must provide a certain effective
network throughput which depends on the MAC protocol that
arbitrates medium access and, more importantly, the data rate
of each transceiver. Generally, higher data rates require higher
bandwidths which, in turn, require communication in higher
frequency bands.
Fortunately, this tendency imposes a downscaling of the
antenna size. Due to the planar nature of a chip, we will
consider the employment of patch antennas. The dimensions
of such antennas are as follows: the width (W ) is comparable
to a wavelength λ, while the length L must be approximately
λ/2. Therefore, for a given operation frequency f :
$$A_{ant} \approx \frac{c_0^2}{2\,\epsilon_{eff} f^2} \qquad (5)$$
where $c_0$ is the speed of light and $\epsilon_{eff}$ is the effective
permittivity of the antenna. In order to fulfill the bandwidth
requirements $B$ at such resonance frequency $f_c$, the antenna
must yield a quality factor of $Q \approx B/f_c$. In order to simplify
the analysis, we will consider that this quality factor will be
achieved by means of techniques that do not largely affect the
area occupied by the antenna, e.g., the quality factor in patch
antennas is mainly determined by the distance between the
patch and the ground plane.
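The antenna area model of Equation (5) follows from a width of one wavelength and a length of half a wavelength, with the wavelength taken in the medium. A minimal sketch of the calculation is given below; the function name and the effective permittivity of 4.0 used in the example are assumptions made purely for illustration and are not values taken from this work.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def patch_antenna_area_mm2(freq_hz: float, eps_eff: float) -> float:
    """Patch antenna area from Equation (5): A = c0^2 / (2 * eps_eff * f^2).

    Uses W = lambda and L = lambda / 2, where lambda = c0 / (f * sqrt(eps_eff)).
    """
    if freq_hz <= 0 or eps_eff <= 0:
        raise ValueError("frequency and permittivity must be positive")
    wavelength_m = C0 / (freq_hz * math.sqrt(eps_eff))
    area_m2 = wavelength_m * (wavelength_m / 2.0)
    return area_m2 * 1e6  # m^2 -> mm^2


if __name__ == "__main__":
    # eps_eff = 4.0 is a hypothetical value used only for illustration.
    for f_ghz in (60, 300, 1000):
        area = patch_antenna_area_mm2(f_ghz * 1e9, 4.0)
        print(f"{f_ghz:5d} GHz -> {area:.4f} mm^2")
```

Since the area scales with the inverse square of the frequency, doubling the operation frequency reduces the antenna footprint by a factor of four.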
In the case of the transceiver, the relation between the area
and peak data rate is calculated as discussed in Section IV.
A given maturity factor $M$ is assumed so that the data rate
requirement $R$ can be achieved by operating at least at a
frequency $f_c = R/M$. The area for such a transceiver can be
extrapolated with data from the state of the art, which points
towards a decrease in area when the frequency is upscaled
(see Figure 5). The observed tendency may
stem from the strong downsizing that is applied to the passive
RF components of a transceiver when the operation frequency
is increased. On the other hand, the scaling of active RF
components remains unclear and should be inspected in future
work with the aim of obtaining an accurate area scaling model
for wireless on-chip transceivers. In this work, we will use
a model obtained by applying fitting methods to the data
represented in Figure 5, which yielded the following equation:
$$A_{txrx} = \frac{206.1}{f_c + 27.22} \quad [\mathrm{mm^2}] \qquad (6)$$
wherein fc is expressed in GHz. Rational fitting was chosen
on the grounds that it delivers the most accurate result among
the possible fittings and that it does not yield negative values
for high frequencies. The weight of each data point is assigned
in inverse proportion to the operation frequency, implying
that implementations for well-established technologies at low
frequencies are more representative than initial designs at
the terahertz band. The resulting coefficient of determination,
which evaluates the goodness of fit, is 0.68 (with 1 being an
exact fit).
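Combining the maturity factor with Equations (4) and (6), the wireless area for a target data rate can be sketched as follows. The function names are hypothetical, and the antenna area is passed in as a parameter rather than computed, so the example values for it are assumptions; the fitted coefficients are those of Equation (6).

```python
def required_frequency_ghz(data_rate_gbps: float, maturity: float) -> float:
    """Minimum operation frequency f_c = R / M for a target data rate."""
    if maturity <= 0:
        raise ValueError("maturity factor must be positive")
    return data_rate_gbps / maturity


def transceiver_area_mm2(fc_ghz: float) -> float:
    """Fitted transceiver area of Equation (6), with f_c in GHz."""
    return 206.1 / (fc_ghz + 27.22)


def wnoc_area_mm2(n_nodes: int, antenna_area_mm2: float,
                  txrx_area_mm2: float) -> float:
    """Total wireless NoC area of Equation (4): N * (A_ant + A_txrx)."""
    return n_nodes * (antenna_area_mm2 + txrx_area_mm2)


if __name__ == "__main__":
    # Illustrative inputs: 60 Gbps per link with a maturity factor of 20%.
    fc = required_frequency_ghz(60.0, 0.20)
    a_txrx = transceiver_area_mm2(fc)
    print(f"f_c = {fc:.0f} GHz, A_txrx = {a_txrx:.3f} mm^2")

    # The antenna area (0.1 mm^2) is a placeholder assumption.
    print(f"A = {wnoc_area_mm2(64, 0.1, a_txrx):.1f} mm^2 for 64 nodes")
```

Given the reported coefficient of determination of 0.68, any figure produced this way is best read as a rough estimate rather than a precise prediction.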
B. Electronic NoC Area Models
Two steps have been performed in order to calculate the
area of an electronic NoC. First, the number of elements
that constitute a given architecture can be easily derived by
observing how its topology scales with the number of nodes.
Once the topology is fixed, the area of each element can be
calculated by means of simulation taking into consideration
[Figure 5 plot: Area (mm²) versus Operation Frequency (GHz), with the fitted curve labeled "Extrapolation of the state of the art".]
Fig. 5. Area of state-of-the-art wireless transceivers [25]–[29], [41]–[44],
[50], [53], [55]–[59] as a function of their central frequency.
the topology and the target capacity. Our analysis has been
performed by means of ORION, a widely recognized power-
area simulator for on-chip interconnection networks [16].
Let us assume that each node has two line drivers, one for
transmission (TX) and one for reception (RX). A typical line
driver consists of an inverter and a D flip-flop, and ORION
allows the user to calculate their area occupancy for a given
technology node. In the case of the on-chip wires, ORION
evaluates the number of repeaters needed for each link (L)
based on its length (which is determined by the topology) and
technology node. The area of each repeater is then calculated,
added to the physical area of the wire and multiplied by the
number of parallel wires in a link, i.e. datapath width. Finally,
the chip area of each router (R) is assessed by breaking the
router down to the transistor level, calculating the number of
transistors needed and multiplying it by the size of a transistor
for a given technology node. The final result will depend on
parameters such as the number of ports, the size of the buffers
or the datapath width.
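The first step, deriving the number of elements from the topology, can be illustrated for a square two-dimensional mesh. The sketch below is not part of ORION and its function name is hypothetical; it assumes one router per node and counts each router-to-router connection as two unidirectional links, an assumption that reproduces the 224 links listed in Table I for 64 cores.

```python
import math


def mesh_element_counts(n_nodes: int) -> tuple[int, int]:
    """Return (routers, unidirectional links) for a square 2D mesh.

    Assumes one router per node. A k x k mesh has 2 * k * (k - 1)
    neighbor pairs, each counted here as two unidirectional links.
    """
    side = math.isqrt(n_nodes)
    if side * side != n_nodes:
        raise ValueError("n_nodes must be a perfect square")
    routers = n_nodes
    links = 4 * side * (side - 1)
    return routers, links


if __name__ == "__main__":
    for n in (16, 64, 256):
        routers, links = mesh_element_counts(n)
        print(f"N = {n:3d}: {routers} routers, {links} links")
```

These counts would then be multiplied by the per-element areas obtained from the simulator, following the general expression of Equation (3).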
C. Photonic NoC Area Models
A photonic on-chip network essentially includes modu-
lators, waveguides, switches, filters and photodetectors. In
the transmitting side (TX), we will assume that modulators
are made of one active ring resonator, whereas receivers
(RX) consist of a passive ring resonator-based filter and a
photodetector. Switches can also be devised by employing
ring resonators as building blocks [12]. Finally, we also
consider that all ring resonators are of the same size. Given
these assumptions, the area of a given architecture can be
approximated as:
$$A \approx N_{ring} A_{ring} + N_{det} A_{det} + \sum_i A_{wg,i} \qquad (7)$$
where $N_{ring}$ and $N_{det}$ are the number of ring resonators and
photodetectors, respectively. $A_{ring} = W_{ring}^2$ is the area of
each ring, or the square of its pitch, $A_{det}$ is the photodetector
area, and $A_{wg,i}$ is the area of waveguide $i$. As in conventional
electronic NoCs, the specific network architecture will determine
the exact number of components as a function of the
number of nodes and the target link capacity. The interested
reader can find more details in [60], including a more detailed
description of the architectures as well as the area and insertion
loss values used in the analysis.
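As an illustration of Equation (7), the following Python sketch adds up the three area contributions. The default ring and photodetector areas are taken from Table III; the function names, the component counts in the usage line, and the approximation of each waveguide footprint as length times pitch are assumptions made here for illustration and are not part of the models in [60].

```python
def waveguide_area_um2(length_um, pitch_um=2.0):
    """Assumed footprint of one waveguide: length times pitch (Table III pitch)."""
    return length_um * pitch_um


def photonic_area_mm2(n_rings, n_detectors, waveguide_areas_um2,
                      ring_area_um2=64.0, detector_area_um2=20.0):
    """Equation (7): ring, photodetector and waveguide areas, in mm^2."""
    total_um2 = (n_rings * ring_area_um2
                 + n_detectors * detector_area_um2
                 + sum(waveguide_areas_um2))
    return total_um2 * 1e-6  # 1 mm^2 = 1e6 um^2


# Hypothetical component counts, chosen only to exercise the formula.
example_area = photonic_area_mm2(1000, 500, [waveguide_area_um2(10_000.0)])
```

In an actual evaluation, the component counts would be derived from the chosen architecture, the number of nodes and the target link capacity.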
VI. ENERGY MODELS
The power consumed by any communication network can
be classified into two main groups: static and dynamic. The static
or zero-load power is consumed independently of
the traffic being served, whereas the dynamic power is a
load-dependent component. Due to their distinct nature, the static and
dynamic components are usually expressed in different units. Static
power Pstatic is expressed in Watts and gives insight into
the energy that is consumed invariably through time to, for
instance, keep the circuitry active; whereas the dynamic component
Ebit is expressed in Joules per bit and gives insight into the
energy required to physically transmit one bit of data without
errors from the transmitter to the intended receivers for a given
interconnect technology.
As a rule of thumb, we will calculate the power consumed
by a given on-chip network by using the following formula:
P = Pstatic + Ebit · T (8)
where T is the network throughput in bits per second. In a
reverse process, we can also calculate the energy required
to convey one bit of information from the transmitter to the
intended receivers, operating at a given throughput:
$$ E^{T}_{bit} = \frac{P_{static}}{T} + E_{bit} \qquad (9) $$
where the throughput T is ideally equivalent to the link
capacity considering one transmission flow and no packet loss.
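The relation between Equations (8) and (9) can be checked numerically with a minimal Python sketch. The function names and the numerical values below are hypothetical and serve only to show that dividing the total power by the throughput yields the energy per bit.

```python
def network_power_w(p_static_w, e_bit_j, throughput_bps):
    """Equation (8): total power as static power plus load-dependent power."""
    return p_static_w + e_bit_j * throughput_bps


def energy_per_bit_j(p_static_w, e_bit_j, throughput_bps):
    """Equation (9): energy per delivered bit at a given throughput."""
    if throughput_bps <= 0:
        raise ValueError("throughput must be positive")
    return p_static_w / throughput_bps + e_bit_j


# Hypothetical values: 1 W static power, 1 pJ/bit, 80 Gbps throughput.
p_total = network_power_w(1.0, 1e-12, 80e9)
e_total = energy_per_bit_j(1.0, 1e-12, 80e9)
```

Note how the static term dominates at low throughput, since it is amortized over fewer bits.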
A. Wireless NoC Energy Models
Unlike in traditional wireless networks, the network nodes
in a WNoC are integrated within the same platform and share
the same power supply. Moreover, we will assume one shared
channel and enough transmission power so that each wireless
message is received by all the processing cores. In this context,
the energy consumed in the transmission and reception of
one bit is independent of whether the message is unicast or
multicast and can be expressed as:
$$ E^{T}_{bit,W} = E^{tx}_{bit} + N \cdot E^{rx}_{bit} \qquad (10) $$
where $E^{tx}_{bit}$ and $E^{rx}_{bit}$ are the mean energy consumption in
transmission and reception, respectively. Leakage currents of
the N−1 inactive transmitters, as well as the power consumed
by the logic required to implement the MAC protocol are
neglected. For a transceiver implementation with measured
power in transmission Ptx and measured power in reception
Prx, both for a data rate R and a given transmission range,
the equation above can also be expressed as:
$$ E^{T}_{bit,W} = \frac{P_{tx} + N \cdot P_{rx}}{R} \qquad (11) $$
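A short Python sketch of Equation (11) is given below. The function name and the transceiver figures in the usage line are hypothetical and do not correspond to any of the surveyed implementations.

```python
def wnoc_energy_per_bit_j(p_tx_w, p_rx_w, data_rate_bps, n_cores):
    """Equation (11): one transmitter and N receivers active per transmitted bit."""
    if data_rate_bps <= 0:
        raise ValueError("data rate must be positive")
    return (p_tx_w + n_cores * p_rx_w) / data_rate_bps


# Hypothetical transceiver: 50 mW in TX, 20 mW in RX, 10 Gbps, 64 cores.
e_wnoc = wnoc_energy_per_bit_j(0.05, 0.02, 10e9, 64)
```

Since every core receives every message under the shared-channel assumption, the receiver term grows linearly with the number of cores.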
[Figure 6 plot: Energy (pJ/bit/cm^{1/2}) versus Operation Frequency (GHz), including the curve labeled "Extrapolation of the state of the art".]
Fig. 6. Energy efficiency figure of merit of state-of-the-art wireless
transceivers [25]–[29], [41]–[44], [53] as a function of their central frequency.
It is important to remark that Equation (11) expresses the
energy per bit of a specific wireless transceiver yielding a data
rate R. Since both metrics depend on several factors such
as the selected modulation, the transceiver architecture, the
transmission range or the maturity of the employed technology,
analytically obtaining a model that relates both the energy
efficiency of wireless communication and its data rate is
deemed highly challenging. Instead, as discussed in Section
IV, we will assume a maturity factor M so that a target
peak data rate R can be achieved by operating at least at
a frequency $f_c = R/M$. We further consider that, when applying the
MAC protocol, such a data rate will yield an effective network
throughput that meets the communication requirements set by
the multiprocessor.
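The role of the maturity factor can be sketched in Python as follows, reading M as the ratio between the peak data rate and the operating frequency (M = R/f_c). This reading is an assumption consistent with Section VII, where a link capacity of 80 Gbps is paired with frequencies of 400 and 800 GHz for factors of 20% and 10%; the function name is hypothetical.

```python
def required_frequency_ghz(data_rate_gbps, maturity_factor):
    """Minimum operating frequency f_c = R / M for a target peak data rate R."""
    if maturity_factor <= 0:
        raise ValueError("maturity factor must be positive")
    return data_rate_gbps / maturity_factor


# An 80 Gbps target with a 20% maturity factor requires about 400 GHz.
fc_20 = required_frequency_ghz(80.0, 0.2)
fc_10 = required_frequency_ghz(80.0, 0.1)
```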
This way, a generic trend can be extracted from the state
of the art in wireless transceivers. The authors of [61] propose
and discuss a figure of merit for wireless transceivers which
encompasses both their energy efficiency $E_{bit}$ and transmission
range $d_{max}$ by means of the following expression:
$\Phi = E_{bit} / \sqrt{d_{max}}$. Figure 6 shows how this figure of merit scales
as a function of the frequency for implementations [25]–[29],
[41]–[44], [53]. A fitting approach similar to the one used in
Section V-A provided the following relation:
$$ \frac{E_{bit}}{\sqrt{d_{max}}} = \frac{1.41 \cdot 10^{3}}{f_c + 28.81} \quad [\mathrm{pJ/bit/cm^{1/2}}] \qquad (12) $$
with a coefficient of determination of 0.65. In this case,
$E_{bit} = E^{tx}_{bit} + E^{rx}_{bit}$ and $f_c$ is expressed in GHz. Energy values can
be extrapolated for frequencies beyond 400 GHz using the
equation above.
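The fitted relation of Equation (12) can be evaluated with the following Python sketch. The function names are hypothetical; the coefficients are those of the fit, and the transmission range is left as a free parameter.

```python
import math


def transceiver_fom_pj(fc_ghz):
    """Equation (12): E_bit / sqrt(d_max) in pJ/bit/cm^(1/2), with f_c in GHz."""
    return 1.41e3 / (fc_ghz + 28.81)


def transceiver_energy_per_bit_pj(fc_ghz, d_max_cm):
    """Energy per bit (TX plus RX) in pJ for a given transmission range in cm."""
    return transceiver_fom_pj(fc_ghz) * math.sqrt(d_max_cm)


# The fit predicts a lower energy figure at higher operating frequencies.
fom_260 = transceiver_fom_pj(260.0)
fom_400 = transceiver_fom_pj(400.0)
```

Given the coefficient of determination of 0.65, values obtained this way are better read as a trend than as a prediction for a specific transceiver.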
The dependence on the transmission range is an important
aspect to consider since, under the assumption that any
transmitter should be able to reach any receiver, the nodes
located at the chip edges will need a longer range than
more central nodes. This has two main implications. On the
one hand, central nodes need less transmission power to fulfill
the sensitivity requirements at the chip edges; therefore, their
power amplifier can be tuned to consume less power. On
the other hand, central nodes receive transmissions with high
power since the link budget is calculated considering the
worst case, that is, reaching the chip edges. In this case,
the requirements for the low noise amplifiers are significantly
relaxed. In our analysis, we will calculate the average
energy per bit over all the on-chip transmitters following the
aforementioned considerations with static power allocation.
Finally, unless otherwise noted, we will assume
$E^{tx}_{bit} = E^{rx}_{bit} = E_{bit}/2$.
B. Electronic NoC Energy Models
Again, ORION is employed to determine both the static
and dynamic power of an electronic NoC. In the former case,
we will consider the power due to leakage currents in wires
and routers. ORION breaks down these digital circuits to the
transistor level and uses experimentally-validated values for
quiescent currents. In the latter case, ORION provides means
to calculate the energy required to perform one hop within the
network, which includes the energy required to (1) transmit
one bit of data through an on-chip wire of fixed length and
(2) read one bit of data from a router buffer, route it and write
it into the next router buffer.
Assuming a throughput T equal to the link capacity C, the
energy per bit in an electronic NoC is:
$$ E^{T}_{bit,E} = \frac{P_{leakage}}{C} + H \cdot E_{b,hop} \qquad (13) $$
where $P_{leakage}$ is the power due to leakage currents and $E_{b,hop}$
is the average energy required for one bit to perform one hop.
$H$ is the average distance between transmitter and receiver
in number of hops and depends solely on the network
topology. For a 2D mesh of $N$ cores, $H_{ucast} = 2\sqrt{N}/3$, whereas
$H_{bcast} = N-1$ considering a routing algorithm that minimizes
the number of hops needed to deliver the message once to all
the destinations.
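Equation (13) and the two hop counts for a 2D mesh can be sketched in Python as shown below. The function names and the leakage and per-hop energy values are hypothetical; in the study these quantities are obtained from ORION.

```python
import math


def mesh_hops(n_cores, broadcast=False):
    """Average hop count in a 2D mesh: 2*sqrt(N)/3 (unicast) or N-1 (broadcast)."""
    if broadcast:
        return n_cores - 1
    return 2.0 * math.sqrt(n_cores) / 3.0


def emesh_energy_per_bit_j(p_leakage_w, link_capacity_bps, e_hop_j,
                           n_cores, broadcast=False):
    """Equation (13): leakage amortized over the capacity plus per-hop energy."""
    hops = mesh_hops(n_cores, broadcast)
    return p_leakage_w / link_capacity_bps + hops * e_hop_j


# Hypothetical figures: 0.5 W leakage, 80 Gbps links, 1 pJ/bit/hop, 256 cores.
e_unicast = emesh_energy_per_bit_j(0.5, 80e9, 1e-12, 256)
e_broadcast = emesh_energy_per_bit_j(0.5, 80e9, 1e-12, 256, broadcast=True)
```

The difference between the two results illustrates the gap between unicast and broadcast energy in a conventional mesh.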
C. Photonic NoC Energy Models
The power consumption in a photonic NoC is mainly driven
by three components, namely, the laser power, the ring heating
and the energy required to perform the electrooptic (E/O)
and optoelectric (O/E) conversions at the modulators and
photodetectors, respectively.
Laser Power: Since integrating individual laser sources on
a chip is currently unfeasible, it is generally accepted that
light in a photonic NoC is supplied by an external multi-
wavelength source. This light is coupled, modulated and then
guided within the chip towards the intended receiver. In order
to fulfill the sensitivity requirements at the receiver, the laser
must transmit enough power to compensate for the losses
incurred by the components found in the light path. Moreover
and unless practical real-time laser management systems are
made available [62], the laser power needs to be statically
allocated to the worst case scenario. In this context, a power
budget analysis is performed following the expression:
$$ P_{laser}\,(\mathrm{dBW}) = S_{RX}\,(\mathrm{dBW}) + \sum_i L_i\,(\mathrm{dB}) \qquad (14) $$
where Plaser is the electrical power consumed by the laser,
SRX is the receiver sensitivity, and Li is the loss of component
i, which includes both the laser and coupling efficiencies. The
size and architecture of the network, as well as the target
link capacity, will determine the number of components in the
critical light path. The interested reader will find more details
and a comparison of the laser power for different architectures
in [60].
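The power budget of Equation (14) can be illustrated with a brief Python sketch. The function name and the list of losses in the usage line are hypothetical; the sensitivity of −30 dBm is the value listed in Table III, and the conversion from dBm to dBW subtracts 30 dB.

```python
def laser_power_w(sensitivity_dbm, losses_db):
    """Equation (14): required laser power in watts for the worst-case light path."""
    sensitivity_dbw = sensitivity_dbm - 30.0  # dBm to dBW
    p_laser_dbw = sensitivity_dbw + sum(losses_db)
    return 10.0 ** (p_laser_dbw / 10.0)


# Hypothetical worst-case path with two 10 dB loss contributions.
p_laser = laser_power_w(-30.0, [10.0, 10.0])
```

Because losses add up in decibels, the required laser power grows exponentially with the number of lossy components in the critical light path.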
Ring Heating: Another source of static power consumption in photonic
NoCs is the power needed to maintain ring resonators tuned
to the desired frequency. Such components are extremely
temperature-sensitive as small variations produce a shift in
their resonant frequency. The power needed to keep ring
resonators thermally tuned is:
Pheat = Nring · Pring (15)
where Nring is the number of ring resonators in the
architecture, and Pring is the power needed to keep one
ring finely tuned (see Table III). As noted in Section
V-C, we will assume one ring per modulator and filter in all
cases.
E/O and O/E Conversions: The dynamic power consumption
in a photonic NoC is mainly due to the energy required to
convert one electronic bit to light and vice versa. In this case,
we will consider fixed values demonstrated in the literature,
which are shown in Table III. Similarly to the wireless NoC, the
energy required for the transmission and reception of one bit
will depend on the number $k$ of simultaneous receivers:
$$ E_{bit} = E^{tx}_{bit} + k \cdot E^{rx}_{bit} \qquad (16) $$
The parameter k depends on the photonic
NoC architecture. Typically, point-to-point (k = 1) optical
communication is implemented and a separate broadcast
channel (k = N) is employed for multi-receiver transmissions
[9]. Alternatively, a broadcast-based architecture would deliver
any message to all the receivers, which would check the
destination address and discard the message if necessary [24].
Assuming a throughput T equal to the link capacity C and
using Equations (14)-(16), the energy per bit in a photonic
NoC is:
$$ E^{T}_{bit,P} = \frac{P_{laser} + P_{heat}}{C} + E_{bit} \qquad (17) $$
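Equations (15) to (17) can be combined in a compact Python sketch. The default ring heating power and conversion energies are taken from Table III, under the assumption made here that the E/O figure corresponds to the transmitter and the O/E figure to each receiver. The function name, the laser power and the ring count in the usage line are hypothetical.

```python
def photonic_energy_per_bit_j(p_laser_w, n_rings, link_capacity_bps, k_receivers,
                              p_ring_w=26e-6, e_tx_j=82e-15, e_rx_j=50e-15):
    """Equation (17): static laser and heating power amortized over C, plus E_bit."""
    p_heat_w = n_rings * p_ring_w            # Equation (15)
    e_bit_j = e_tx_j + k_receivers * e_rx_j  # Equation (16)
    return (p_laser_w + p_heat_w) / link_capacity_bps + e_bit_j


# Hypothetical setup: 1 mW laser, 1000 rings, 80 Gbps, point-to-point (k = 1).
e_photonic = photonic_energy_per_bit_j(1e-3, 1000, 80e9, 1)
```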
VII. BENCHMARKED DESIGN SPACE EXPLORATION
In this section, the results of the design space exploration
are presented. We compare a small selection of architectures,
namely:
• EMesh: a conventional electrical mesh.
We consider one 5-port router per core and bidirectional
links connecting neighboring routers.
• WMesh: a WNoC-based architecture accounting for one
communication unit (antenna and transceiver) per core.
We assume that all cores share the same broadband
channel and that a tailor-made MAC protocol arbitrates
medium access.
• OBus: a photonic bus arbitrated by means of an all-optical
token-based scheme.
• OXBar1: an optical crossbar, wherein each core is tuned
to a unique wavelength in transmission and broadcasts
its messages to the rest of the cores. For more details on this
architecture, see [60].
• OXBar2: another optical crossbar, wherein each core is
associated with a unique data waveguide. Through this
dedicated channel, a given core is able to receive data
modulated by any of the other cores. For more details on
this architecture, see [60].

TABLE III
PHOTONIC NOC PARAMETERS
Parameter                   Value     Units     Ref.
Ring Losses                 0.01-1    dB        [60]
Ring Area                   64        µm²       [60]
Ring Heating Power          26        µW/ring   [63]
Propagation Loss            0.5       dB/cm     [64]
Bending Loss                0.15      dB        [64]
Waveguide Pitch             2         µm        [64]
E/O Conversion              82        fJ/bit    [10]
O/E Conversion              50        fJ/bit    [10]
Photodetector Area          20        µm²       [24]
Photodetector Sensitivity   −30       dBm       [60]

Tables I and III show a summary of the technological
parameters used in the study. The number of cores
is swept between 4 and 1024, whereas the link capacity
is scaled up to 250 Gbps. Note that, when the number of
cores is increased, the network capacity remains constant in
WMesh and grows proportionally to that increase in the rest
of alternatives.
A. Area
Figure 7 shows the area-network size plane of the design
space, corresponding to fixing the link capacity to a value of
80 Gbps. The electrical and wireless options show a linear
behavior, while photonic NoCs grow with the square of the
number of cores due to the quadratic scaling in number of
components [60].
In the WNoC case, three different operation frequencies
have been chosen, namely 260, 400 and 800 GHz. Taking
into account the targeted link capacity, such frequencies lead
to maturity factors not exceeding 30%, in consonance with the
values shown in the state of the art (see Fig. 4). From an area
overhead perspective, high frequencies are beneficial since
they entail lower area both for the antenna and the transceiver,
according to the tendency pointed out in Section V-A. Nevertheless,
the area occupation in most cases is higher than
that of the electrical and photonic alternatives. Considerable
transceiver area optimization is needed in order to enable size
compatibility with massive multicore architectures: reducing
the area of an 800-GHz transceiver to 0.1 mm² would yield
an overhead of 27% in a 1000-core processor. By employing
graphene-based nano-antennas [39], [40], such area overhead
would be further reduced to 25%.
[Figure 7 plot: Area (% of 400 mm²) versus Number of cores for EMesh, WMesh [260GHz], WMesh [400GHz], WMesh [800GHz], OBus, OXBar1 and OXBar2.]
Fig. 7. Area scaling as a function of the number of cores for different
interconnect technologies and architectures. The link capacity is set to 80
Gbps.
Figure 8 shows the area-capacity plane of the design space,
corresponding to fixing the network size to a value of 256
nodes. It can be observed that both electronic and photonic
NoCs show a linear growth of area with respect to the link
capacity, since higher bandwidth requirements are generally
fulfilled by means of additional wires and circuitry.
In the wireless case, we consider different preset maturity
factors and then scale the operation frequency in accordance
with the link capacity objectives. Once the operation frequency
is chosen, the area is calculated using the model presented in
Section V-A. Such an approach explains the negative slope of
the WNoC area plots: higher bandwidth requirements imply
an increase in the operation frequency, which in turn entails a
reduction in the size of both the antenna and the transceiver.
As a result, it is expected that WNoC will be able to compete
with the electrical and photonic alternatives at high link
capacities, given the extremely high operation frequencies
required for transmission. It is important to note, though,
that such possibility is limited by the state of technology as
it determines the maximum frequency at which circuits can
operate. This may also imply that higher maturity factors may
need to be sought in order to increase the link capacity of a
WNoC over a given value.
B. Energy
Figure 9 shows the energy-network size plane of the design
space for a fixed link capacity of 80 Gbps. There are several
aspects to be noted:
• In a conventional NoC, there is a considerable gap
between the energy per bit in a unicast transmission and
in a broadcast transmission. In both cases, conventional
designs outperform wireless and photonic NoCs.
• WNoCs follow a trend similar to that of conventional NoCs.
The options working at higher frequencies come closer
to achieving an energy efficiency comparable to that of
conventional NoCs, in accordance with the extrapolation
proposed in Figure 6.
• In a photonic NoC, the energy figures can be considered
independent of whether the transmission is unicast or
multicast by virtue of the extremely low energy needed
for the O/E conversions. However, despite such potential
for low energy transmissions, the photonic NoC
configurations scale poorly due to their high laser power
requirements, especially at high core counts [60].
[Figure 8 plot: Area (% of 400 mm²) versus Link Capacity (Gbps) for EMesh, WMesh [10%], WMesh [20%], WMesh [30%], OBus, OXBar1 and OXBar2.]
Fig. 8. Area scaling as a function of the link capacity for different
interconnect technologies and architectures. The number of nodes is set to
256.
[Figure 9 plot: Energy per Bit (J, logarithmic scale) versus Number of cores for EMesh, EMesh (Bcast), WMesh [260GHz], WMesh [400GHz], WMesh [800GHz], OBus, OXBar1 and OXBar2.]
Fig. 9. Energy per bit scaling as a function of the network size for different
interconnect technologies and architectures. The link capacity is set to 80
Gbps.
Figure 10 shows the energy-capacity plane of the design
space. On the one hand, it is observed that conventional
NoCs yield an energy efficiency which is almost invariant
with respect to the link capacity. On the other hand, the
energy efficiency of WNoCs not only improves with the link
capacity, but also outperforms conventional NoCs at some
point, provided that the trend observed in the state of the
art continues in future transceivers (see Figure 6). Finally,
our results confirm that the different photonic NoC options
do not scale well, as their efficiency substantially deteriorates
for high link capacities. This is mainly due to the steep
increase in number of components leading to an extremely
high accumulated loss and, eventually, to unaffordable laser
power requirements.
[Figure 10 plot: Energy per Bit (J, logarithmic scale) versus Link Capacity (Gbps) for EMesh, EMesh (Bcast), WMesh [10%], WMesh [20%], WMesh [30%], OBus, OXBar1 and OXBar2.]
Fig. 10. Energy per bit scaling as a function of the link capacity for different
interconnect technologies and architectures. The network size is fixed to 256
nodes.
C. Area-Energy Figure of Merit
As seen in the previous sections, a given on-chip network
may scale remarkably well in terms of area and perform poorly
in terms of energy, or vice versa. In order to evaluate both the
area and energy scalability of each solution, we propose the
following figure of merit:
$$ \mathrm{FoM} = \frac{1}{A \cdot E^{T}_{bit}} \quad [\mathrm{bits/J/mm^2}] \qquad (18) $$
This performance metric can be understood as the average
number of bits that can be effectively transmitted for (a) each
consumed joule of energy and (b) square millimeter of chip
real estate. It is therefore an indicator of the joint energy and
area efficiency of a given on-chip network. A large value of
this figure of merit is desired.
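The figure of merit of Equation (18) is straightforward to compute, as the following Python sketch shows. The function name and the example area and energy values are hypothetical and do not correspond to any of the evaluated architectures.

```python
def figure_of_merit(area_mm2, energy_per_bit_j):
    """Equation (18): bits per joule and per square millimeter (higher is better)."""
    if area_mm2 <= 0 or energy_per_bit_j <= 0:
        raise ValueError("area and energy per bit must be positive")
    return 1.0 / (area_mm2 * energy_per_bit_j)


# Hypothetical network occupying 100 mm^2 and consuming 10 pJ/bit.
fom = figure_of_merit(100.0, 1e-11)
```

Halving either the area or the energy per bit doubles the metric, so both resources are weighted equally.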
On the one hand, Figure 11 shows how the figure of merit
scales as a function of the network size in number of cores.
Again, electrical and wireless NoCs show a similar trend,
while a rapid decrease of the figure of merit is observed in
photonic NoCs. Overall, conventional NoC yields the best
performance. On the other hand, Figure 12 shows how the
figure of merit scales as a function of the link capacity, in
a network consisting of 256 cores. In this case, the analysis
is slightly more complex. While it is clear that the optical
crossbars scale poorly with the link capacity, the rest of options
yield similar performance. According to our analysis, the
optical bus shows the best performance for low link capacities,
whereas wireless NoCs could yield an improved efficiency for
high link capacities if the scaling trends observed in the state
of the art continue.
[Figure 11 plot: Proposed Figure of Merit (bits/J/mm², logarithmic scale) versus Number of cores for EMesh, EMesh (Bcast), WMesh [260GHz], WMesh [400GHz], WMesh [800GHz], OBus, OXBar1 and OXBar2.]
Fig. 11. Scaling of the proposed figure of merit (higher is better) as a
function of the number of cores for different interconnect technologies and
architectures. The link capacity is set to 80 Gbps.
[Figure 12 plot: Proposed Figure of Merit (bits/J/mm², logarithmic scale) versus Link Capacity (Gbps) for EMesh, EMesh (Bcast), WMesh [10%], WMesh [20%], WMesh [30%], OBus, OXBar1 and OXBar2.]
Fig. 12. Scaling of the proposed figure of merit (higher is better) as a function
of the link capacity for different interconnect technologies and architectures.
The network size is fixed to 256 nodes.
D. Discussion and Open Challenges
Results revealed in previous sections indicate that, in abso-
lute terms, the baseline NoC performs remarkably better than
its potential alternatives. However, it is important to note that
the technologies employed for electrical on-chip wires and
routers are thus far much more optimized than nanophotonic or
wireless chip-area technologies, which are still in their infancy
and may substantially improve in the coming years. In the
specific case of WNoC, the efficiency of the communication
could be improved at different levels of design:
• At the transceiver level: Unlike in traditional wireless
systems, all the on-chip wireless transceivers share the
same power supply and, therefore, the energy per bit
metric encompasses the energy consumed by the transmitter
and all the receivers within the transmission range (see
Equation (11)). Thus far, we assumed $E^{tx}_{bit} = E^{rx}_{bit}$ in
order to simplify the analysis. However, the ratio between
such figures could be chosen in the transceiver design
process. To this end, a model accounting for the trade-offs
between transceiver energy consumption, radiated power
and received power would enable the optimization of the
energy efficiency.
• At the circuit level: In this work, we considered a
heterogeneous set of transceivers implementing different
modulations and aiming at different communication scenarios,
which are not necessarily oriented to low area
and low power. Novel and optimized circuit topologies
could allow for a substantial improvement of the area
and energy efficiencies in wireless chip communication.
• At the technology level: The performance of a given wireless
transceiver is undeniably limited by the underlying
technology. Generally, technological advancements lead
to higher operation frequency, lower area and potential
for lower energy consumption. The trend set by current
state-of-the-art transceivers will continue provided that
the employed technologies evolve accordingly. However,
the advent of a new technology bringing disruptive improvements,
such as the graphene technology [65]–[68],
may allow going beyond the predicted performance.

TABLE IV
DOMINANT AREA AND ENERGY SCALABILITY TRENDS
Architecture   Area        Energy
EMesh          O(NC)       O(N)
WMesh          O(N/C)      O(N/C)
OBus           O(N^2)      O(α^N β^C)
OXBar1         O(N^2 C)    O(γ^N)
OXBar2         O(N^2 C)    O(δ^N ε^C)
(α, β, γ, δ, ε are constants)

For the sake of fairness, the comparison must account for
the structural tendencies rather than for the absolute area
and energy values. Table IV summarizes the trends obtained
through the application of fitting methods to the area and
energy plots. We can observe that WMesh offers a good
area and energy scalability with respect to the number of nodes
and an excellent scalability with respect to the link capacity.
From this, we can infer that the concept of WNoC is better
suited to the case of high data rate requirements leading to a
very high radiation frequency. Conversely, in small networks
working at lower speeds, electrical and photonic interconnects
are expected to offer improved area and energy efficiencies. It
is important to remark that these results do not include the area
and energy of the circuits required to implement the
MAC protocol. However, SD-MAC [69] represents the only
MAC protocol for WNoC implemented to date and consumes
very low area and energy (∼0.01 mm² and ∼70 pJ/packet
in 0.18 µm CMOS), suggesting that the impact of including
the MAC protocol within the analysis is negligible in light
of the results shown in this paper. This aspect will be further
addressed in future work.
VIII. CONCLUSIONS
The area and energy scalability of WNoC in terms of (a) the
number of cores within a multiprocessor and (b) the capacity
of each link in the network has been analyzed and compared
to those of conventional and optical NoCs. In support of this
study, we modeled the area and energy efficiencies of high-
speed transceivers by means of extrapolation with respect to
the state of the art and proposed a figure of merit encompassing
both metrics. Although it is shown that the baseline NoC
outperforms the wireless and optical alternatives in absolute
terms, such comparison is implementation-dependent and does
not reveal the fundamental scalability trends. A further analy-
sis of the results shows that WNoC offers good scalability both
in area and energy, especially with respect to the link capacity.
This outcome confirms the feasibility of WNoC, which may
take a central role in future multiprocessors given the rising
importance of multicast communication in such a scenario.
ACKNOWLEDGMENT
The authors gratefully acknowledge support from Samsung
through the Global Research Outreach (GRO) program and
from INTEL through the Doctoral Student Honor Program.
This work has also been partially supported by the FI-AGAUR
grant of the Catalan Government and the FPU grant of the
Spanish Ministry of Education.
REFERENCES
[1] W. Dally and B. Towles, “Route packets, not wires: on-chip intercon-
nection networks,” in Proceedings of the 38th IEEE Design Automation
Conference. ACM, 2001, pp. 684–689.
[2] L. Benini and G. De Micheli, “Networks on chips: a new SoC paradigm,”
Computer, vol. 35, no. 1, pp. 70–78, 2002.
[3] T. Bjerregaard and S. Mahadevan, “A survey of research and practices
of Network-on-chip,” ACM Computing Surveys, vol. 38, no. 1, pp. 1–51,
Jun. 2006.
[4] D. A. B. Miller, “Device Requirements for Optical Interconnects to
Silicon Chips,” Proceedings of the IEEE, vol. 97, no. 7, pp. 1166–1185,
Jul. 2009.
[5] A. W. Topol, D. C. La Tulipe, L. Shi, D. J. Frank, K. Bernstein,
S. E. Steen, A. Kumar, G. U. Singco, A. M. Young, K. W. Guarini,
and M. Ieong, “Three-dimensional integrated circuits,” IBM Journal of
Research and Development, vol. 50, no. 4, pp. 491–506, Jul. 2006.
[6] B. S. Feero and P. P. Pande, “Networks-on-Chip in a Three-Dimensional
Environment: A Performance Evaluation,” IEEE Transactions on Com-
puters, vol. 58, no. 1, pp. 32–45, Jan. 2009.
[7] E. Socher and M.-C. F. Chang, “Can RF Help CMOS Processors?” IEEE
Communications Magazine, vol. 45, no. 8, pp. 104–111, Aug. 2007.
[8] R. G. Beausoleil, P. J. Kuekes, G. S. Snider, S.-y. Wang, and R. S.
Williams, “Nanoelectronic and Nanophotonic Interconnect,” Proceed-
ings of the IEEE, vol. 96, no. 2, pp. 230–247, Feb. 2008.
[9] D. Vantrease, R. Schreiber, M. Monchiero, M. McLaren, N. Jouppi,
M. Fiorentino, A. Davis, N. Binkert, R. Beausoleil, and J. Ahn,
“Corona: System implications of emerging nanophotonic technology,”
ACM SIGARCH Computer Architecture News, vol. 36, no. 3, pp. 153–
164, 2008.
[10] N. Kirman and J. F. Martínez, “A Power-efficient All-optical On-chip
Interconnect Using Wavelength-based Oblivious Routing,” ACM Sigplan
Notices, vol. 45, no. 3, pp. 15–28, 2010.
[11] M.-C. F. Chang, E. Socher, S.-W. Tam, J. Cong, and G. Reinman, “RF
interconnects for communications on-chip,” Proceedings of the 2008
international symposium on Physical design - ISPD ’08, p. 78, 2008.
[12] A. Shacham, K. Bergman, and L. P. Carloni, “Photonic networks-on-
chip for future generations of chip multiprocessors,” IEEE Transactions
on Computers, vol. 57, no. 9, pp. 1246–1260, Sep. 2008.
[13] K. K. O, K. Kim, B. Floyd, J. Mehta, H. Yoon, C.-M. Hung, D. Bravo,
T. Dickson, X. Guo, R. Li, N. Trichy, J. Caserta, W. Bomstad, J. Branch,
D.-J. Yang, J. Bohorquez, E. Seok, L. Gao, A. Sugavanam, J.-J. Lin,
J. Chen, and J. Brewer, “On-chip antennas in silicon ICs and their
application,” IEEE Transactions on Electron Devices, vol. 52, no. 7,
pp. 1312–1323, 2005.
[14] S. Abadal, E. Alarcón, M. C. Lemme, M. Nemirovsky, and A. Cabellos-Aparicio,
“Graphene-enabled Wireless Communication for Massive
Multicore Architectures,” IEEE Communications Magazine, vol. 51,
no. 11, pp. 137–143, 2013.
[15] S. Deb, A. Ganguly, P. P. Pande, B. Belzer, and D. Heo, “Wireless
NoC as Interconnection Backbone for Multicore Chips : Promises and
Challenges,” IEEE Journal on Emerging and Selected Topics in Circuits
and Systems (JETCAS), vol. 2, no. 2, pp. 228–239, 2012.
[16] A. Kahng, B. Li, L. Peh, and K. Samadi, “Orion 2.0: A fast and accurate
noc power and area model for early-stage design space exploration,” in
Proceedings of Design, Automation & Test in Europe, 2009, pp. 423–8.
[17] N. E. Jerger, L.-S. Peh, and M. Lipasti, “Virtual Circuit Tree Multi-
casting: A Case for On-Chip Hardware Multicast Support,” in 2008
International Symposium on Computer Architecture. IEEE, Jun. 2008,
pp. 229–240.
[18] M. Lodde, J. Flich, and M. E. Acacio, “Heterogeneous NoC Design
for Efficient Broadcast-based Coherence Protocol Support,” in 2012
IEEE/ACM Sixth International Symposium on Networks-on-Chip. IEEE,
May 2012, pp. 59–66.
[19] T. Krishna, L. Peh, B. Beckmann, and S. K. Reinhardt, “Towards the
ideal on-chip fabric for 1-to-many and many-to-1 communication,” in
44th Annual IEEE/ACM International Symposium on Microarchitecture
(MICRO-44), vol. 2, 2011, pp. 71–82.
[20] S. Rodrigo, J. Flich, J. Duato, and M. Hummel, “Efficient unicast
and multicast support for CMPs,” 2008 41st IEEE/ACM International
Symposium on Microarchitecture, pp. 364–375, Nov. 2008.
[21] F. A. Samman, T. Hollstein, and M. Glesner, “Multicast parallel pipeline
router architecture for network-on-chip,” in Proceedings of the Confer-
ence on Design, Automation and Test in Europe (DATE). ACM Press,
2008, pp. 1396–1401.
[22] R. Manevich, I. Walter, I. Cidon, and A. Kolodny, “Best of both worlds:
A bus enhanced NoC (BENoC),” 2009 3rd ACM/IEEE International
Symposium on Networks-on-Chip, pp. 173–182, 2009.
[23] J. Chan, G. Hendry, A. Biberman, K. Bergman, and L. P. Carloni,
“PhoenixSim: A Simulator for Physical-Layer Analysis of Chip-Scale
Photonic Interconnection Networks,” in Proceedings of the Conference
on Design, Automation and Test in Europe (DATE), 2010, pp. 691–696.
[24] G. Kurian, J. Miller, J. Psota, J. Eastep, J. Liu, J. Michel, L. Kimerling,
and A. Agarwal, “ATAC: A 1000-Core Cache-Coherent Processor with
On-Chip Optical Network,” in Proceedings of the PACT. ACM, 2010,
pp. 477–488.
[25] W.-H. Chen, S. Joo, S. Sayilir, R. Willmot, T.-Y. Choi, D. Kim, J. Lu,
D. Peroulis, and B. Jung, “A 6-Gb/s Wireless Inter-Chip Data Link
Using 43-GHz Transceivers and Bond-Wire Antennas,” IEEE Journal
of Solid-State Circuits, vol. 44, no. 10, pp. 2711–2721, Oct. 2009.
[26] H. Wang, M.-H. Hung, Y.-C. Yeh, and J. Lee, “A 60-GHz FSK
Transceiver with Automatically-Calibrated Demodulator in 90-nm
CMOS,” in IEEE Symposium on VLSI Circuits (VLSIC), 2010, pp. 95–
96.
[27] K. Kawasaki, Y. Akiyama, K. Komori, M. Uno, H. Takeuchi, T. Itagaki,
Y. Hino, Y. Kawasaki, K. Ito, and A. Hajimiri, “A Millimeter-Wave
Intra-Connect Solution,” IEEE Journal of Solid-State Circuits, vol. 45,
no. 12, pp. 2655–2666, 2010.
[28] X. Yu, S. P. Sah, S. Deb, P. P. Pande, B. Belzer, and D. Heo,
“A wideband body-enabled millimeter-wave transceiver for wireless
Network-on-Chip,” 2011 IEEE 54th International Midwest Symposium
on Circuits and Systems (MWSCAS), pp. 1–4, Aug. 2011.
[29] K. Okada, K. Kondou, M. Miyahara, M. Shinagawa, and H. Asada, “Full
Four-Channel 6.3-Gb/s 60-GHz CMOS TRX With Low-Power Analog
and Digital Baseband Circuitry,” IEEE Journal of Solid-State Circuits,
vol. 48, no. 1, pp. 46–65, 2013.
[30] S. Kawai, R. Minami, Y. Tsukui, Y. Takeuchi, H. Asada, and A. Musa,
“Direct-Conversion Transceiver in 65-nm CMOS,” in Radio Frequency
Integrated Circuits Symposium (RFIC), 2013 IEEE, no. 1, 2013, pp.
137–140.
[31] Y. P. Zhang, Z. M. Chen, and M. Sun, “Propagation Mechanisms
of Radio Waves Over Intra-Chip Channels With Integrated Antennas:
Frequency-Domain Measurements and Time-Domain Analysis,” IEEE
Transactions on Antennas and Propagation, vol. 55, no. 10, pp. 2900–
2906, Oct. 2007.
[32] P. Nenzi, F. Tripaldi, V. Varlamava, F. Palma, and M. Balucani, “On-
Chip THz 3D Antennas,” in Electronic Components and Technology
Conference (ECTC), 2012 IEEE 62nd, 2012, pp. 102–108.
[33] A. Ganguly, K. Chang, S. Deb, P. P. Pande, B. Belzer, and C. Teuscher,
“Scalable Hybrid Wireless Network-on-Chip Architectures for Multi-
Core Systems,” IEEE Transactions on Computers, vol. 60, no. 10, pp.
1485–1502, 2010.
[34] C. Wang, W.-H. Hu, and N. Bagherzadeh, “A Wireless Network-on-
Chip Design for Multicore Platforms,” in 19th International Euromicro
Conference on Parallel, Distributed and Network-Based Processing.
IEEE, Feb. 2011, pp. 409–416.
[35] S.-B. Lee, S.-W. Tam, I. Pefkianakis, S. Lu, M.-C. F. Chang, C. Guo,
G. Reinman, C. Peng, M. Naik, L. Zhang, and J. Cong, “A scalable
micro wireless interconnect structure for CMPs,” in Proceedings of the
Mobicom’09. New York, New York, USA: ACM Press, 2009, p. 217.
[36] D. Matolak, A. Kodi, S. Kaya, D. DiTomaso, S. Laha, and W. Rayess,
“Wireless networks-on-chips: architecture, wireless channel, and de-
vices,” IEEE Wireless Communications, vol. 19, no. 5, pp. 58–65, 2012.
[37] S. Sankaran, C. Mao, E. Seok, D. Shim, C. Cao, R. Han, D. J. Arenas,
D. B. Tanner, S. Hill, C.-M. Hung, and K. K. O, “Towards terahertz
operation of CMOS,” in International Solid-State Circuits Conference,
2009, pp. 202–204.
[38] E. Seok, D. Shim, and C. Mao, “Progress and challenges towards tera-
hertz CMOS integrated circuits,” IEEE Journal of Solid-State Circuits,
vol. 45, no. 8, pp. 1554–1564, 2010.
[39] I. Llatser, C. Kremers, A. Cabellos-Aparicio, J. M. Jornet, E. Alarcón,
and D. N. Chigrin, “Graphene-based nano-patch antenna for terahertz
radiation,” Photonics and Nanostructures - Fundamentals and Applica-
tions, vol. 10, no. 4, pp. 353–358, 2012.
[40] J. M. Jornet and I. F. Akyildiz, “Graphene-based Plasmonic Nano-
Antenna for Terahertz Band Communication in Nanonetworks,” IEEE
Journal on Selected Areas in Communications, vol. 31, no. 12, pp. 685–
694, Dec. 2013.
[41] N. Ono, M. Motoyoshi, K. Takano, K. Katayama, R. Fujimoto, and
M. Fujishima, “135 GHz 98 mW 10 Gbps ASK transmitter and receiver
chipset in 40 nm CMOS,” in IEEE Symposium on VLSI Circuits (VLSIC),
2012, pp. 50–51.
[42] E. Laskin, P. Chevalier, B. Sautreuil, and S. Voinigescu, “A 140-GHz
double-sideband transceiver with amplitude and frequency modulation
operating over a few meters,” in IEEE BCTM, 2009, pp. 178–181.
[43] S. Hu, Y.-Z. Xiong, B. Zhang, L. Wang, T.-G. Lim, M. Je, and
M. Madihian, “A SiGe BiCMOS TX/RX Chipset With On-Chip SIW
Antennas for Terahertz Applications,” IEEE Journal of Solid-State
Circuits, vol. 47, no. 11, pp. 2654–2664, 2012.
[44] J.-D. Park, S. Kang, S. Thyagarajan, E. Alon, and A. Niknejad, “A
260 GHz fully integrated CMOS transceiver for wireless chip-to-chip
communication,” in IEEE Symposium on VLSI Circuits (VLSIC), 2012,
pp. 48–49.
[45] B. Khamaisi, S. Jameson, and E. Socher, “A 210–227
GHz Transmitter With Integrated On-Chip Antenna in 90 nm CMOS
Technology,” IEEE Transactions on Terahertz Science and Technology,
vol. 3, no. 2, pp. 141–150, 2013.
[46] A. Lisauskas, S. Boppel, M. Mundt, V. Krozer, and H. G. Roskos, “Sub-
harmonic Mixing With Field-Effect Transistors: Theory and Experiment
at 639 GHz High Above fT,” IEEE Sensors Journal, vol. 13, no. 1, pp.
124–132, 2013.
[47] R. Han and E. Afshari, “A High-Power Broadband Passive Terahertz
Frequency Doubler in CMOS,” IEEE Transactions on Microwave Theory
and Techniques, vol. 61, no. 3, pp. 1150–1160, 2013.
[48] F. Golcuk, O. D. Gurbuz, and G. M. Rebeiz, “A 0.39–0.44 THz 2x4
Amplifier-Quadrupler Array With Peak EIRP of 34 dBm,” IEEE Trans-
actions on Microwave Theory and Techniques, vol. 61, no. 12, pp. 4483–
4491, 2013.
[49] H. Rucker, B. Heinemann, and A. Fox, “Half-Terahertz SiGe BiCMOS
Technology,” in Silicon Monolithic Integrated Circuits in RF Systems
(SiRF), 2012 IEEE 12th Topical Meeting on, 2012, pp. 133–136.
[50] E. Öjefors, J. Grzyb, B. Heinemann, B. Tillack, and U. R. Pfeiffer,
“A 820 GHz SiGe chipset for terahertz active imaging applications,” in
IEEE International Solid-State Circuits Conference (ISSCC), 2011, pp.
224–225.
[51] R. A. Hadi, J. Grzyb, B. Heinemann, and U. R. Pfeiffer, “A Terahertz
Detector Array in a SiGe HBT Technology,” IEEE Journal of Solid-State
Circuits, vol. 48, no. 9, pp. 2002–2010, 2013.
[52] U. R. Pfeiffer, J. Grzyb, H. Sherry, A. Cathelin, and A. Kaiser,
“Toward low-NEP room-temperature THz MOSFET direct detectors in
CMOS technology,” in 2013 38th International Conference on Infrared,
Millimeter, and Terahertz Waves (IRMMW-THz). IEEE, Sep. 2013, pp.
1–2.
[53] T. Abe, Y. Yuan, H. Ishikuro, and T. Kuroda, “A 2Gb/s 150mW
UWB direct-conversion coherent transceiver with IQ-switching carrier
recovery scheme,” in IEEE International Solid-State Circuits Conference
(ISSCC), 2012, pp. 442–444.
[54] S. Koenig, D. Lopez-Diaz, J. Antes, F. Boes, R. Henneberger, A. Leuther,
A. Tessmann, R. Schmogrow, D. Hillerkuss, R. Palmer, T. Zwick,
C. Koos, W. Freude, O. Ambacher, J. Leuthold, and I. Kallfass, “Wireless
sub-THz communication system with high data rate,” Nature Photonics,
vol. 7, no. 12, pp. 977–981, Oct. 2013.
[55] L. Zhou, Z. Chen, C.-C. Wang, F. Tzeng, V. Jain, and P. Heydari, “A
2Gbps RF-correlation-based impulse-radio UWB transceiver front-end in
130nm CMOS,” in IEEE RFIC, 2009, pp. 65–68.
[56] I. Sarkas, S. Nicolson, A. Tomkins, E. Laskin, P. Chevalier, and
B. Sautreuil, “An 18-Gb/s, Direct QPSK Modulation SiGe BiCMOS
TRX for Last Mile Links in the 70–80 GHz Band,” IEEE Journal of
Solid-State Circuits, vol. 45, no. 10, pp. 1968–1980, 2010.
[57] C. Wagner, H.-P. Forstner, G. Haider, A. Stelzer, and H. Jager, “A 79-
GHz radar transceiver with switchable TX and LO feedthrough in SiGe,”
in Bipolar/BiCMOS Circuits and Technology Meeting, 2008, pp. 105–
108.
[58] I. Sarkas, J. Hasch, A. Balteanu, and S. Voinigescu, “A Fundamental
Frequency 120-GHz SiGe BiCMOS Distance Sensor With Integrated
Antenna,” IEEE Transactions on Microwave Theory and Techniques,
vol. 60, no. 3, pp. 795–812, 2012.
[59] Y. Zhao, E. Öjefors, K. Aufinger, T. Meister, and U. Pfeiffer, “A 160-
GHz Subharmonic Transmitter and Receiver Chipset in an SiGe HBT
Technology,” IEEE Transactions on Microwave Theory and Techniques,
vol. 60, no. 10, pp. 3286–3299, 2012.
[60] S. Abadal, A. Cabellos-Aparicio, J. A. Lázaro, M. Nemirovsky,
E. Alarcón, and J. Solé-Pareta, “Area and Laser Power Scalability
Analysis in Photonic Networks-on-Chip,” in 17th International Conference
in Optical Network Design and Modeling (ONDM), 2013.
[61] J. Gorisse, D. Morche, and J. Jantunen, “Wireless transceivers for
gigabit-per-second communications,” 10th IEEE International NEWCAS
Conference, pp. 545–548, Jun. 2012.
[62] C. Chen and A. Joshi, “Runtime Management of Laser Power in Silicon-
Photonic Multibus NoC Architecture,” IEEE Journal of Selected Topics
in Quantum Electronics, vol. 19, no. 2, 2013.
[63] J. Ahn, M. Fiorentino, R. G. Beausoleil, N. Binkert, A. Davis, D. Fattal,
N. P. Jouppi, M. McLaren, C. M. Santori, R. S. Schreiber, S. M. Spillane,
D. Vantrease, and Q. Xu, “Devices and architectures for photonic chip-
scale integration,” Applied Physics A, vol. 95, no. 4, pp. 989–997, Feb.
2009.
[64] J. Cardenas, C. Poitras, and J. Robinson, “Low loss etchless silicon
photonic waveguides,” Optics Express, vol. 17, no. 6, pp. 4752–7, 2009.
[65] A. K. Geim and K. S. Novoselov, “The rise of graphene,” Nature
Materials, vol. 6, no. 3, pp. 183–191, Mar. 2007.
[66] Y. Wu, K. A. Jenkins, A. Valdes-Garcia, D. B. Farmer, Y. Zhu, A. A. Bol,
C. Dimitrakopoulos, W. Zhu, F. Xia, P. Avouris, and Y.-M. Lin, “State-
of-the-art graphene high-frequency electronics,” Nano Letters, vol. 12,
no. 6, pp. 3062–7, Jun. 2012.
[67] Y. Wu, D. B. Farmer, F. Xia, and P. Avouris, “Graphene Electronics:
Materials, Devices, and Circuits,” Proceedings of the IEEE, vol. 101,
no. 7, pp. 1620–1637, Jul. 2013.
[68] S.-J. Han, A. V. Garcia, S. Oida, K. A. Jenkins, and W. Haensch,
“Graphene radio frequency receiver integrated circuit,” Nature
Communications, vol. 5, 2014.
[69] D. Zhao and Y. Wang, “SD-MAC: Design and Synthesis of a
Hardware-Efficient Collision-Free QoS-Aware MAC Protocol for Wire-
less Network-on-Chip,” IEEE Transactions on Computers, vol. 57, no. 9,
pp. 1230–1245, 2008.
Sergi Abadal received the BSc and MSc degrees
in Telecommunication Engineering from the Technical
University of Catalunya, Barcelona, Spain, in 2010
and 2011, respectively. From September 2009 to
May 2010, he was a visiting researcher at the Broad-
band Wireless Networking Lab, Georgia Institute of
Technology, Atlanta. Since 2011, he has been pursuing his
PhD at the NaNoNetworking Center in Catalunya
(N3Cat, http://www.n3cat.upc.edu) at UPC. In 2013,
he received an award from Intel within its Doctoral
Student Honor Program. His current research interests
are graphene-based wireless and nanophotonic communications for on-chip
networks.
Mario Iannazzo received the MSc degree in elec-
trical engineering from the Technical University of
Catalunya and the MA in digital arts from the
Pompeu Fabra University, Barcelona, Spain, in 1998
and 2006, respectively. From 1998 to 2004, he was
an AMS IC design engineer at Nokia Mobile Phones
Oy, Oulu, Finland. From 2005 to 2006, he was an IC
patent engineer at Oficina Ponti, Barcelona, Spain.
From 2007 to 2009, he was an IC consultant
engineer at Alten GmbH, München, Germany, working at
Infineon Technologies AG as an RF IC test engineer.
From 2010 to 2011, he was an AMS IC design engineer at Decawave Ltd,
Dublin, Ireland. In 2011, he joined the Department of Electronic Engineering,
Technical University of Catalunya, as a PhD candidate. His current research
interests include the areas of graphene transistor modelling, circuit and
transceiver design.
Mario Nemirovsky received the Telecommunica-
tions Engineering degree from the National Univer-
sity of La Plata, Argentina, in 1980, and his PhD
in Electrical and Computer Engineering from the
University of California, Santa Barbara, in 1990,
where he was an adjunct professor from 1991 to
1998. After serving as chief architect at companies such
as Apple Inc., National Semiconductor and General
Motors (GM), he founded several renowned
start-ups, including FlowStorm Networks, Xstream Logic,
ConSentry Networks and Miraveo. In 2007, he
became an ICREA Senior Research Professor at the Barcelona Supercomputing
Center (BSC). Mario holds more than 60 issued patents. He pioneered
the concepts of Massively Multithreading (MMT) processing for
high-performance processors and the now well-established Simultaneous
Multithreading (SMT) architecture. He also architected the GM engine control
used in all GM cars for over 20 years. His current research interests
include multithreaded multicore systems, high performance systems, network
processors and Big Data.
Albert Cabellos-Aparicio received BSc (2001),
MSc (2005) and PhD (2008) degrees in Computer
Science Engineering from the Technical University
of Catalunya. He has been an assistant professor in the
Computer Architecture Department and a researcher
in the Broadband Communications Group since
2005. In 2010, he joined the NaNoNetworking Center
in Catalunya (http://www.n3cat.upc.edu) where he is
the Scientific Director. He is an editor of the Elsevier
Journal on Nano Computer Network and founder
of the ACM NANOCOM conference, the IEEE
MONACOM workshop and the N3Summit. He has also founded the LISPmob
open-source initiative (http://lispmob.org) along with Cisco. He has been a
visiting researcher at Cisco Systems and Agilent Technologies and a visiting
professor at the Royal Institute of Technology (KTH) and the Massachusetts
Institute of Technology (MIT). He has given more than 10 invited talks (MIT,
Cisco, Intel, MIET, Northeastern Univ., etc.) and co-authored more than
15 journal and 40 conference papers. His main research interests are future
architectures for the Internet and nano-scale communications.
Heekwan Lee received the BSc degree in Electrical
Engineering from Yonsei University, Seoul, Korea,
in 1996 and the MA degree in mathematics from
the University of Southern California (USC), Los
Angeles, in 1999. He received the MSc and PhD
degrees from the Department of Electrical Engineering
at USC in 2001 and 2005, respectively. After
graduation, he joined the Samsung Advanced Institute of
Technology. He now works at DMC in Samsung
Electronics. His current research interests include
coding theory, cryptography, information theory,
and security.
Eduard Alarcón (S’96, M’01) received the MSc
(national award) and PhD degrees in EE from UPC
Barcelona, Spain, in 1995 and 2000, respectively,
where he became Associate Professor in 2001. He
has been a visiting professor at the University of
Colorado at Boulder, USA (2003) and KTH Stockholm
(2011). He has coauthored more than 250 scientific
publications, 4 book chapters and 4 patents, and has
been involved in different national, EU and US R&D
projects. His research interests include
on-chip energy management circuits, energy harvesting
and wireless energy transfer, and nanocommunications. He was elected IEEE
CAS Society distinguished lecturer and member of the IEEE CAS Board of
Governors (2010–2013), received a Best Paper Award at IEEE MWSCAS’98,
and has served as co-editor of 4 journal special issues and 5 conference special sessions, TPC
co-chair and TPC member of 15 IEEE conferences, and Associate Editor
for IEEE TCAS-I, TCAS-II, JETCAS, JOLPE and Nano Communication
Networks.
