NoCs: a Short History of Success and a Long Future
Giovanni De Micheli
Federico Angiolini
With credits to Charles Janac
A LOOK BACK
(c) Giovanni De Micheli
20th Century: The (Mini)bus
But buses run out of gas…
§ Not enough parallelism for increasing core counts
§ Power: all transactions essentially broadcast
§ Zero composability
§ Physical issues: timing, routing
[Figure: a single shared bus connecting Masters 1-3 and Slaves 1-3]
Bus Evolution
[Figure: bus evolution along two axes. Protocol evolution: Bus, Bus++, v2.0. Topology evolution: multiple buses, crossbar.]
The birth of the NoC
Using a network to replace global wiring has 
advantages of structure, performance, and modularity.
Dally & Towles, 2001
We propose borrowing models, techniques, and 
tools from the network design field and applying 
them to SoC design.
Benini & De Micheli, 2002
We explain why the shared bus, which is today's dominant 
template, will not meet the performance requirements of 
tomorrow’s systems. We present an alternative 
interconnection in the form of switching networks.
Guerrier & Greiner, 2000
NoCs: A broad literature
 
SPIN: a Scalable, Packet Switched, On-chip Micro-network
 
Adrijean Adriahantenaina (UPMC/LIP6) 
Hervé Charlery (UPMC/LIP6) 
Alain Greiner (UPMC/LIP6) 
Laurent Mortiez (UPMC/LIP6) 
Cesar Albenes Zeferino (UFRGS) 
 
Abstract 
This paper presents the SPIN micro-network that is a 
generic, scalable interconnect architecture for system on 
chip. The SPIN architecture relies on packet switching 
and point-to-point bi-directional links between the routers 
implementing the micro-network. SPIN gives the system 
designer the simple view of a single shared address space 
and provides a variable number of VCI compliant 
communication interfaces for both initiators (masters) and 
targets (slaves).  Performance comparisons between a 
classical PI-bus based interconnect and the SPIN micro-
network are analyzed. 
Keywords 
Systems-on-Chip. Networks-on-Chip. Embedded Systems. 
1. Introduction 
Technology scaling improvements will allow the 
building of Systems-on-Chip (SoCs) with several 
dozen to hundreds of components within a 
four-billion-transistor chip by the end of this decade [1]. This will also 
allow the development of new applications in the fields of 
telecommunication, entertainment and consumer 
electronics. Such systems will require communication 
templates providing several dozens of Gbit/s [2], which still 
must be reusable to meet time-to-market requirements. 
The reusable communication templates typically used in 
current SoCs are based on the bus approach, using either a 
single shared bus [3] or a hierarchy of buses [4]. However, 
such an approach has strong drawbacks: a bus does not scale 
with the system size as the bandwidth is shared by all the 
components attached to it. Furthermore, as the number of 
cores increases, the capacitance load grows, degrading the 
bus operating frequency.  
Some recent works [1][2][5] have proposed the use of 
integrated switching networks as an alternative approach to 
interconnect cores in SoCs. The overall idea is that such 
networks, also called  Networks-on-Chip (NoCs), meet three 
of the major key communication requirements for future 
SoCs: reusability, scalability, and parallelism. 
This work presents the evaluation and comparison of two 
on-chip communication templates based on bus and NoC 
approaches. The communication architectures are compared 
by simulating cycle-true, RT-level models running synthetic 
workloads.  
This paper is structured as follows. In section 2, we present 
the generic architecture used to compare the two 
communication templates, which are described in section 3. 
The bit-true, cycle-true simulation environment is presented 
in section 4. A first experiment analyzing the global latency 
as a function of the number of terminals is presented in 
section 5. A second experiment describing the latency as a 
function of the offered load is described in section 6. 
Finally, section 7 presents some concluding remarks. 
2. The Generic Architecture  
The architecture used in this work to evaluate and compare 
the performance of the two communication architectures is 
shown in Figure 1. It is based on two kinds of components: 
initiators and targets. The system can have different number 
of cores for each type. The "initiator" components are 
traffic generators, which send requests to the "target" 
components. The "target" component sends a response as 
soon as it  receives a request.  
               
Figure 1. The reference system: initiator cores (init0, init1) and target cores (target0, target1) attach as terminals to the communication architecture through wrappers that adapt each native interface to VCI.
All the components in the system are VCI-compliant [6] and, 
since the communication architectures are typically based 
Proceedings of the Design,Automation and Test in Europe Conference and Exhibition (DATE’03)
1530-1591/03 $17.00 © 2003 IEEE
SOC DESIGNS
Networks on Chips: A New SoC Paradigm
Luca Benini, University of Bologna
Giovanni De Micheli, Stanford University

On-chip micronetworks, designed with a layered methodology, will
meet the distinctive challenges of providing functionally correct,
reliable operation of interacting system-on-chip components.

System-on-chip (SoC) designs provide integrated solutions to challenging
design problems in the telecommunications, multimedia, and consumer
electronics domains. Much of the progress in these fields hinges on the
designers' ability to conceive complex electronic engines under strong
time-to-market pressure. Success will rely on using appropriate design and
process technologies, as well as on the ability to interconnect existing
components (including processors, controllers, and memory arrays)
reliably, in a plug-and-play fashion.
By the end of the decade, SoCs, using 50-nm transistors operating below
one volt, will grow to 4 billion transistors running at 10 GHz, according
to the International Technology Roadmap for Semiconductors. The major
challenge designers of these systems must overcome will be to provide for
functionally correct, reliable operation of the interacting components.
On-chip physical interconnections will present a limiting factor for
performance and, possibly, energy consumption.
Silicon technologies face other challenges. Synchronization of future
chips with a single clock source and negligible skew will be extremely
difficult, if not impossible. The most likely synchronization paradigm for
future chips (globally asynchronous and locally synchronous) involves
using many different clocks. In the absence of a single timing reference,
SoC chips become distributed systems on a single silicon substrate. Global
control of the information traffic is unlikely to succeed because the
system needs to keep track of each component's states. Thus, components
will initiate data transfers autonomously, according to their needs. The
global communication pattern will be fully distributed, with little or no
global coordination.
As SoC complexity scales, capturing the system's functionality with fully
deterministic operation models will become increasingly difficult. As
global wires span multiple clock domains, synchronization failures in
communicating between different domains will be rare but unavoidable
events [1]. Moreover, energy and device reliability concerns will impose
small logic swings and power supplies, most likely less than one volt.
Electrical noise due to crosstalk, electromagnetic interference, and
radiation-induced charge injection will likely produce data errors, also
called upsets. Thus, transmitting digital values on wires will be
inherently unreliable and nondeterministic. Other causes of nondeterminism
include design components with a high level of abstraction and coarse
granularity and distributed communication control.
Focusing on using probabilistic metrics such as average values or variance
to quantify design objectives such as performance and power will lead to a
major change in design methodologies. Overall, SoC design will be based on
both deterministic and stochastic models. Creating complex SoCs requires a
modular, component-based approach to both hardware and software design.
Based on the premise that interconnect technology will be the limiting
factor for achieving SoCs' operational goals, we postulate that the
layered design of reconfigurable micronetworks, which exploits the methods
and tools used for general networks, can best achieve efficient
communication on SoCs.
0018-9162/02/$17.00 © 2002 IEEE
70 Computer
Networks on Chips
Æthereal Network on Chip: Concepts, Architectures, and Implementations
Kees Goossens, John Dielissen, and Andrei Rădulescu
Philips Research Laboratories

Editor's note:
Many SoC applications require guaranteed levels of service and
performance. Can networks on chips (NoCs) enable such guarantees? Here,
the authors demonstrate that the Æthereal network can. This particular NoC,
developed at Philips Research Laboratories, encompasses hardware, a
programming model, and a design flow. Read on to find out about the details.
André Ivanov, University of British Columbia

CONTINUING ADVANCES in semiconductor technology enable the integration of
increasing numbers of IP blocks in a single SoC. Interconnect
infrastructures, such as buses, switches, and networks on chips (NoCs),
combine the IPs into a working SoC. Moreover, the industry expects
platform-based SoC design to evolve to communication-centric design, with
NoCs as a central enabling technology [1].
In this article, we introduce the Æthereal NoC [2-4]. The tenet of the
Æthereal NoC is that guaranteed services (GSs), such as uncorrupted,
lossless, ordered data delivery; guaranteed throughput; and bounded
latency, are essential for the efficient construction of robust SoCs. One
reason is that many IPs have inherent performance requirements, such as a
minimum throughput (for real-time streaming data) or bounded latency (for
interrupts). Furthermore, because the traffic of different IPs does not
interfere with each other, the IPs' behaviors are decoupled; thus, the IPs
can be designed and tested independently of each other and the NoC. This
aids in the compositional design and programming of SoCs.
GSs require resource reservations for the worst case. To exploit the NoC
capacity unused by GS traffic, we also provide best-effort services
(BESs). GSs serve critical (for example, real-time) traffic, and BESs
serve noncritical communication.
Many architectures that implement BESs already exist, but our concept of
contention-free routing is one of the first to offer guaranteed services
(throughput and latency, in particular) in addition to BESs. GSs require
resource reservation; the Æthereal NoC thus requires configuration and
programming. We offer alternative programming models and router
architectures to facilitate design space exploration: A system architect
can optimize a NoC with either a distributed programming model (for
scalability) or a centralized programming model (for low cost). In the
latter case, he can choose between a NoC without BESs or one with normal
or improved BES performance. Of course, better services cost more. All
alternative NoCs are based on contention-free routing, and, as a result,
the Æthereal design flow can generate, program, and simulate them [2].
Performance guarantees in networks
Researchers have paid much attention to the problem of building networks
(both on- and off-chip) with predictable performance [5-8]. Fundamentally,
there are two reasons for unpredictable network behavior: First, the
network can drop packets as a result of buffer overflows, misrouting,
router failure, and so forth. A given drop rate provides only a
statistical reasoning about packet arrival, not a hard, 100% guarantee.
Second, even if the network does not drop any packets, packets share
resources (such as wires and buffers) with other packets. When two packets
attempt to use the same resource at the same time, contention occurs and
the network must either delay or drop one of the packets. Delayed packets
often delay the packets following them, causing network congestion.
414 0740-7475/05/$20.00 © 2005 IEEE Copublished by the IEEE CS and the IEEE CASS IEEE Design & Test of Computers
The concept in a nutshell
Image credit: iNoCs
The Main Promises
§ Scalability
§ Hundreds/thousands of connected cores
§ Tunable Power/Frequency/Area
§ Once packetization has happened, links can be tuned locally:
- wide or narrow, fast or slow, ..
§ Easier design closure 
§ Fewer, shorter, point-to-point wires
§ Decentralized nature suited to multiple clock/power domains
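The "wide or narrow, fast or slow" trade-off can be made concrete with a little arithmetic. The sketch below is only an illustration: it assumes a packet is cut into flits as wide as the link and that a link moves one flit per cycle. The function names and the 576-bit example packet are hypothetical and do not come from any particular NoC.

```python
import math

def flits_per_packet(packet_bits: int, link_width_bits: int) -> int:
    """Number of link-wide flits needed to carry one packet."""
    if packet_bits <= 0 or link_width_bits <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(packet_bits / link_width_bits)

def serialization_ns(packet_bits: int, link_width_bits: int, freq_mhz: float) -> float:
    """Time to push one packet through a link at one flit per cycle."""
    cycles = flits_per_packet(packet_bits, link_width_bits)
    return cycles * 1000.0 / freq_mhz

# Hypothetical 576-bit packet (64-bit header + 512-bit payload) on a 1 GHz link.
for width in (32, 64, 128):
    print(width, flits_per_packet(576, width), serialization_ns(576, width, 1000.0))
```

Under these assumptions a narrow link saves wires but pays for it in serialization cycles, which is why the width can be chosen per link rather than globally.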
NoC synthesis: xPIPES
[Figure: xPIPES building blocks. A master attaches to an Initiator NI and a slave to a Target NI; each NI contains header and payload registers, a pending-transaction register, an FSM, and a routing LUT, with signals ni_request, ni_response, ni_receive, and ni_resend. Buffered switches sit between the NIs. Core-side interfaces: OCP, AHB, AXI. Callouts: source routing, parametric link width, protocol interoperability.]
[Bertozzi et al. 2005]
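Source routing with a routing LUT in the network interface can be sketched as follows. This is a minimal illustration under stated assumptions, not the xPIPES implementation: the LUT contents, the `Packet` fields, and the function names are all hypothetical. The point is only that the route is looked up once at the initiator NI and then consumed hop by hop, so switches need no routing tables.

```python
from dataclasses import dataclass, field

# Hypothetical routing LUT held in the initiator NI:
# destination name -> output port to take at each switch along the path.
ROUTING_LUT = {
    "mem0": [1, 2],
    "uart": [1, 0, 3],
}

@dataclass
class Packet:
    dest: str
    payload: bytes
    route: list = field(default_factory=list)  # remaining hops, head first

def ni_packetize(dest: str, payload: bytes) -> Packet:
    """Initiator NI: look the route up once and embed a copy in the header."""
    return Packet(dest=dest, payload=payload, route=list(ROUTING_LUT[dest]))

def switch_forward(pkt: Packet) -> int:
    """Switch: pop the next output port from the header; no table lookup."""
    return pkt.route.pop(0)

pkt = ni_packetize("uart", b"\x42")
hops = []
while pkt.route:
    hops.append(switch_forward(pkt))
print(hops)  # [1, 0, 3]
```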
Layout-aware NoC Synthesis
Credit: iNoCs, 2009
[Angiolini, Murali et al.]
Quick, Broad Academic Adoption
§ Within a few years, hundreds of papers per year
§ Favourite subjects:
§ Topology
§ Architecture, esp. switches, buffering, …
§ Routing algorithms and implementations
§ Simulation
§ Physical implementation (signaling, asynchronous, …)
§ Fault tolerance
§ QoS, mapping
§ Design tools (EDA)
Interesting Research Trends
§ Some research was strongly inspired by WANs and 
supercomputers, e.g.:
§ Virtual channels
§ Deeply pipelined switches
§ Hypercube topologies
§ Store-and-forward switching
§ Virtual output queuing
§ Dynamic routing
§ How did this go?
Learnings Set in Quickly
§ A NoC is not the same as a wide-area network
§ In most cases, opposite tradeoffs:
§ WANs need to minimize cable count, but cable length is unlimited. 
Router area and power are secondary. Software on top implements 
much of the stack. Accepted latency: milliseconds
§ NoC wires are comparatively inexpensive but they must be short. 
Area and power are severely limited. Must also work without any 
software. Accepted latency: sometimes <1 ns!
§ Led to quick adjustments and an opposite current of innovation 
(for some types of designs)
§ Low power NoCs, bufferless routing, combinational NoCs
So What Happened in the Real World?
“Academia invents complex solutions to 
problems, and evaluates them in a 
simplified context.
“Industry tries to find the simplest 
solutions because the context is already 
complex.”
José Duato
Initial Industrial Adoption
§ A few designs based on the 
“mesh” approach for CMP and 
high-end computing
§ Notably, Tilera TILE64 (~2007) 
(64 cores, 11 W), Intel SCC (~2009) 
(48 x86 cores, 125 W)
§ A huge number of designs based 
on heterogeneous MPSoCs, often 
low-power
§ E.g. ST (2006), TI (2008), NEC, 
Samsung, LG, MobilEye,
Toshiba, Qualcomm…
Who Designs These NoCs? 
§ Specialized vendors: Arteris, Sonics, NetSpeed (acquired by Intel)
§ ARM
§ Academic spinoffs: e.g. Silistix, iNoCs (IP acquired by Arteris)
§ In-house teams are still the lion’s share of the market
§ Interconnect and related services are seen as key differentiator
§ Crucial for functionality, design time, performance, power
§ Specialized designs have sometimes fundamentally different and unique 
traffic patterns (e.g. GPUs, network processors, FPGAs…)
§ With ever-increasing complexity, this may shift
NOCS TODAY
The NoC is the Data Highway of the SoC
§ Only IP that traverses the chip
§ Changes between projects
§ If it does not function properly,
the SoC does not work
§ Contains the longest SoC wires
§ Carries most of the interesting data
§ Helps define the SoC architecture
§ Must support SoC performance requirements
§ Often the last IP to be frozen, has to fit into available channel space
§ When timing closure becomes schedule-critical!
§ Changes multiple times in response to architecture and marketing ECOs
Slide courtesy 
A NoC is Many NoCs
§ At the very minimum, most implementations separate 
request and response networks (avoid deadlocks)
§ NoCs may further separate message types
§ E.g. for cache coherence (see later); TILE64 has 5 networks…
§ NoCs are likely to be partitioned into “subsystem NoCs”
§ either co-designed or reused
§ Plus additional “interwoven” NoCs for:
§ Configuration of main NoCs
§ Performance monitoring/statistics
§ Debug
§ …
NoC Interconnect Technology Enables Better SoCs - Faster
§ NoC technology allows isolation of individual fabrics so they can be 
managed quickly and easily
§ Capturing both logical interconnect topologies and physical constraints
§ Enabling rapid delivery of SoC interconnect for architectural, logical and 
layout success
[Figure: an SoC partitioned into separate fabrics (FlexNoC Main NoC, Ncore Cache Coherent NoC, CSR (Service) NoC, Memory NoC, Video NoC), with the flow Plan, Optimize, Implement.]
Slide courtesy 
Quo vadis: two main drivers in SoC design
§ Mobile communication
§ Autonomous vehicles
NoCs Cover Design Space of SoC Requirements
§ Architecture Flexibility:  Unlimited 
topologies, Support for standard protocols & 
heterogeneous coherency, multiple caching 
levels to reduce off-chip accesses
§ Performance: 166 MHz-2 GHz frequency 
@16nm, >1 Tbit/s bandwidth with 512-bit links
§ Power:  <0.5mW idle power/1M gates@16nm, 
one cycle power domain wake up, 3-level 
clock gating, etc.
§ Area:  Endpoint NoC = Lower 
area/Interconnect function (vs hybrid buses or 
corner router NoCs)
§ Productivity:  Design exploration, multi-level 
modelling, auto test bench generation, 
physical awareness, design flexibility, 
derivative SoC NoCs can be built in 3 days
§ Safety: Resilience – ISO 26262 ASIL B-D 
capable, functionally safe domains
§ Security:  Customer extensible firewalls and 
access controls
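The bandwidth figure in the Performance bullet follows from simple arithmetic. The sketch below assumes a link moves one full-width word per clock cycle and ignores packet headers and protocol overhead; the function name is illustrative.

```python
def link_bandwidth_gbit_s(width_bits: int, freq_mhz: float) -> float:
    """Peak raw bandwidth of one link, assuming one transfer per cycle."""
    return width_bits * freq_mhz / 1000.0

# A 512-bit link at the top and bottom of the quoted frequency range.
print(link_bandwidth_gbit_s(512, 2000.0))  # 1024.0 Gbit/s, i.e. just over 1 Tbit/s
print(link_bandwidth_gbit_s(512, 166.0))
```

Under these assumptions a single 512-bit link at 2 GHz already exceeds 1 Tbit/s of raw bandwidth; sustained throughput would be lower once overheads are counted.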
Slide courtesy 
[Figure: requirement profiles of a Car ADAS SoC and a Mobility SoC across Performance, Power, Area, Productivity, Safety, and Security. Quality is assumed and vitally important.]
Mobility – Original Killer App for NoC Technology
§ Application Processors, Modems
§ Many initiators to many targets
§ Required NoC due to needs for:
§ Low power for battery life
§ Efficient area for cost
§ Performance for response time
§ Productivity due to short SoC cycles
§ Multiple power domain flexibility
§ All at the same time!
Slide courtesy 
NoCs Replaced Cascaded Crossbars
Socket protocols on both sides: AXI, AHB, APB, OCP, …
Cascaded crossbar architecture + bridges
§ AXI crossbars chained through AXI2AHB / AHB2AXI bridges
§ Just one level of arbitration per crossbar
§ Address decode and context tracking duplicated
§ Lots of wires, congested area
§ Pipe insertion? Clock crossing? Power crossing?
§ Protocol restriction
NoC: efficient transport-based architecture, flexible & scalable
§ NIUs at the edges, configurable topology
§ Fewer wires, no congestion
§ Address decode & context tracking done once
§ Protocol decoupling
§ Flexible pipe insertion
§ Clock/power crossing anywhere
§ Transaction packet: header (Burst, OPC, Addr, User, Id) plus data bytes
Slide courtesy 
Automotive: the new driver for NoCs
Slide courtesy 
Chassis Control (4)
Vision Camera (4-16) (6)
ADAS / Machine Learning (1-4) (2)
Dashboard / HUD (2)
Radar and/or LIDAR (4-6) (6)
V2X / V2I / WAN Modem (2-3) (3)
Engine Control (2)
Infotainment (1)
§ Electrification + ECU Consolidation + Automated Driving = 
Disruption
§ Total SoCs/car = 24 (avg.)
§ 60M electronically enabled cars by 2025
§ → ~1.4B SoCs per year (plus electronic infrastructure SoCs)
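The volume estimate above is a one-line multiplication. The check below assumes the 60M figure is an annual car volume, which the slide implies but does not state outright.

```python
socs_per_car = 24              # average from the slide
cars_per_year = 60_000_000     # assumed to be an annual volume

total_socs = socs_per_car * cars_per_year
print(f"{total_socs / 1e9:.2f} billion SoCs per year")  # 1.44 billion
```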
End to End Resilience for ISO26262 Compliance
§ Unit duplication - fault detection
§ ECC at interface & in-transport
§ Packet Consistency checkers
[Figure: an ASIL D NoC connecting an automotive CPU, DMA, camera, display, UART, memory controller (DRAM), ROM/Flash, and peripherals. Callouts: ECC at the core interfaces; duplicated units in lock-step with a checker that raises a fault to the safety controller; packet consistency checkers; a software-programmable firewall toward a NoC without safety goals; a non-duplicated transport network.]
Slide courtesy 
§ Safety Controller 
§ Fault reporting logic BIST
§ Multi ASIL Level Support
§ ARM Cortex® R5/R7 support
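Two of the mechanisms listed on this slide, unit duplication with a lock-step checker and an in-transport data check, can be sketched in a few lines. This is a simplified illustration: the functions are hypothetical, and a single even-parity bit stands in for a real ECC code (parity only detects an odd number of bit flips and corrects nothing).

```python
from typing import Callable, Tuple

def lockstep(unit: Callable[[int], int], shadow: Callable[[int], int], x: int) -> Tuple[int, bool]:
    """Run a unit and its duplicate on the same input; flag a fault on mismatch."""
    out, ref = unit(x), shadow(x)
    return out, out != ref

def parity(word: int) -> int:
    """Even-parity bit over a data word (a stand-in for a real ECC code)."""
    return bin(word).count("1") & 1

def check_in_transport(word: int, parity_bit: int) -> bool:
    """Return True when the received word is consistent with its parity bit."""
    return parity(word) == parity_bit

def healthy(x: int) -> int:
    return (x + 1) & 0xFF

def faulty(x: int) -> int:
    return healthy(x) ^ 0x04  # models one flipped output bit

print(lockstep(healthy, healthy, 7))  # (8, False): outputs agree, no fault
print(lockstep(healthy, faulty, 7))   # (8, True): checker reports a fault

sent = 0b1011_0010
print(check_in_transport(sent, parity(sent)))           # True
print(check_in_transport(sent ^ 0b1000, parity(sent)))  # False: bit flipped in flight
```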
Cache Coherence
§ Many SoC designs use
some form of cache coherence
§ ARM ACE, CHI sockets
§ Synchronization among processors
§ Also for CPU/GPU 
heterogeneous computing
§ Often coherent & non-coherent islands
§ Adds significant requirements to the NoC
§ In-NoC directory services and related management
§ Multiple networks support the cache coherence messages 
with sufficient performance and without deadlocks
Credit: ARM
Physical Design
§ Increasingly critical in recent nodes
§ Well-known issue: global wire delay is worsening
§ NoC has to feature configurable pipelining
§ Design QoR/PPA depends significantly on input floorplan
§ Wire routing also problematic
§ Wire-intensive, but must fit in remaining channel space
§ Useful to narrow links as much as possible, 
but only where latency is expendable
§ Interplay with domain partitioning and power management
§ NoCs must be partitioned in domains, individually power-managed 
(status and control signals to flush the packets in flight)
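Configurable pipelining amounts to cutting a long wire into segments that each fit within one clock period. The sketch below is a first-order estimate only: it assumes wire delay grows linearly with length, and the delay of 0.25 ns/mm and the function name are hypothetical placeholders rather than data for any real process node.

```python
import math

def pipeline_registers(length_mm: float, delay_ns_per_mm: float, freq_mhz: float) -> int:
    """Registers to insert so that each wire segment fits in one clock period."""
    period_ns = 1000.0 / freq_mhz
    cycles = max(1, math.ceil(length_mm * delay_ns_per_mm / period_ns))
    return cycles - 1

print(pipeline_registers(6.0, 0.25, 1000.0))  # 1
print(pipeline_registers(6.0, 0.25, 2000.0))  # 2
print(pipeline_registers(2.0, 0.25, 1000.0))  # 0
```

Every inserted register adds a cycle of latency, which is one reason the result depends so strongly on the input floorplan.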
Quality of Service
§ Expectations of fine-grained control
§ Especially CPU ↔ memory traffic should have “zero latency”
§ Extremely complex combinations of high-bandwidth, low-latency 
requirements
§ Combined with complex memory mappings 
- interleaving, address spaces, …
§ Requirement for multiple HW tuning knobs 
§ Buffers, arbitration, priorities, …
§ QoS encompasses  notions like “security” (from attackers) 
and “safety” (from faults)
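One of the simplest hardware tuning knobs mentioned above is the arbitration policy. The sketch below shows one possible scheme, strict priority with round-robin among equal priorities; it is an assumption chosen for illustration, not the arbiter of any specific NoC, and the class name is hypothetical.

```python
from typing import List, Optional

class PriorityRoundRobinArbiter:
    """Grant the highest-priority requester; rotate among ties."""

    def __init__(self, n_ports: int) -> None:
        self.n_ports = n_ports
        self.last_grant = n_ports - 1  # so that port 0 is first in the rotation

    def grant(self, priorities: List[Optional[int]]) -> Optional[int]:
        """priorities[i] is None if port i is idle, else its priority (higher wins)."""
        active = [p for p in priorities if p is not None]
        if not active:
            return None
        top = max(active)
        # Scan ports starting just after the previous winner.
        for offset in range(1, self.n_ports + 1):
            port = (self.last_grant + offset) % self.n_ports
            if priorities[port] == top:
                self.last_grant = port
                return port
        return None

arb = PriorityRoundRobinArbiter(3)
print(arb.grant([1, 1, None]))  # 0
print(arb.grant([1, 1, None]))  # 1: equal priorities take turns
print(arb.grant([1, 1, 2]))     # 2: the urgent request wins immediately
print(arb.grant([1, 1, None]))  # 0
```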
UPCOMING CHALLENGES
Automotive: Super Computer on Wheels
§ Level 4 ADAS will require 30-40 TOPS of processing
§ Functional safety versus performance
§ Power consumption versus performance
§ Security versus performance
§ Sensor fusion (cameras, lidars, radars, ultrasonics) needs 
1-2 Tbit/s of bandwidth
§ Need highly sophisticated yet functionally safe NoC
§ Near real time latency
§ On-chip cache hierarchy to minimize off-chip DRAM accesses
§ Architectural flexibility
§ CPU subsystem, vision subsystem, deep learning, power management, 
cache coherency islands and high bandwidth memory on a single SoC
§ Interconnect needs to support both 2D implementations and 3D approaches 
using chiplets
§ Power management: a car has only a 300 W budget for ~80 chips, 
less than 12 W for air cooling
Slide courtesy 
IoT: Working in Milliwatts
§ The IoT market is still evolving 
and killer applications are 
still emerging
§ We already know that IoT applications 
will require fast computing, 
but in a challenging power and area budget
§ Sensor networks, wearables, embedded
§ IoT applications will soon need NoCs on a massive scale
§ Power management?
§ Extreme resource conservation?
Emerging Artificial Intelligence Applications
§ New algorithms & large data sets 
→ hardware architecture evolution
§ How you move the data determines
§ Performance
§ Power
§ Scalability
§ Regular arrays, suited for e.g. meshes
§ Switches + buffers → “corner routers”
§ Assisted mesh generation, editing
§ Dedicated optimizations (routing, buffering)?
§ Broadcast for updating of CNN weights
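For regular arrays mapped onto a mesh, a common routing choice is dimension-ordered (XY) routing. The slides do not name a routing algorithm, so the sketch below is an assumption used purely to illustrate how a packet could traverse a mesh of corner routers; coordinates and names are hypothetical.

```python
from typing import List, Tuple

Coord = Tuple[int, int]

def xy_route(src: Coord, dst: Coord) -> List[Coord]:
    """Dimension-ordered routing: travel along X first, then along Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path

print(xy_route((0, 0), (2, 1)))            # [(0, 0), (1, 0), (2, 0), (2, 1)]
print(len(xy_route((3, 3), (0, 1))) - 1)   # 5 hops
```

Because every packet finishes its X movement before turning into Y, the allowed turns cannot form a cycle, which is why this scheme is deadlock-free on a mesh.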
Slide courtesy 
AI/Mesh Background: Array of Processors
§ Each node has a router and controller
§ Controller provides access to one or more processing units
§ Controller provides interfaces (sockets) to the corner router
§ Processing units give access to Controller/Router
§ Note: Router may include integrated controller
Controller
Router
Slide courtesy 
AI Mesh Generation
Slide courtesy 
AI Challenges (Hard Stuff) – Predictability, Safety & Reliability
§ How do you verify a deep learning system?
§ How do you debug the Neural Network black box?
§ What are the ethics and biases of these systems?
§ What does it mean to make a Neural Network “safe”?
§ No one wants a stale AI algorithm → hardware evolution
Slide courtesy 
ONE LAST THING…
…WE STILL HAVE TO GET RIGHT
What About NoC Design Software?
§ NoC design flows are extremely complex
§ Rich specifications (sockets, connectivity, address maps, …)
§ Design entry
§ Schematic editing
§ Floorplan viewing and tuning (placement, pipelining)
§ Reporting
§ Iterations and ECOs
§ Interaction with simulators, back-annotation from placement/routing, etc.
§ Generation of RTL and many collaterals
- Scripts, IP-XACT, documentation, verification…
§ …how much is design automation supporting NoC design?
Distilling Lessons to Do Better
1. The market was not ready ten years ago (but may be now)
§ Automation is distrusted until necessary, i.e. on really large designs
2. The problem is very, very complex
§ Large set of requirements that are often conflicting: 
latency, bandwidth, power, area, wirelength, frequency, traffic 
priorities, resilience, cache coherence, deadlock freedom…
3. Solution domains are extremely different
§ NetSpeed offers a heavily assisted flow to generate mesh-like NoCs, 
but few types of chips look like meshes. 
Domain-specific tools and algorithms needed
4. Push-button automation is the wrong goal
§ Real-world flows are iterative and often demand last-minute ECOs. 
Automation must be piecemeal and users must be able to override it 
at any stage 
CONCLUSIONS
Outlook
§ NoCs have been with us for almost 20 years
§ An unmitigated success in academia and, 
with a few years' delay, also in industry
§ Different research avenues all turned out to be relevant
§ New promising avenues are optical and wireless NoCs
§ We are not done yet:
§ Chip complexity keeps scaling up and new challenges 
(e.g. resilience, coherence) become prominent
§ NoCs must cover extremely-high-performance as well as 
extremely-low-power designs, seamlessly
§ We must remove EDA bottlenecks to enable the next 
generations of designs
Thank you
