A Complete Network-On-Chip Emulation Framework by Genko, Nicolas et al.
A Complete Network-On-Chip
Emulation Framework
Nicolas Genko, LSI/EPFL Switzerland
David Atienza, DACYA/UCM Spain
G. De Micheli, LSI/EPFL Switzerland
J.M. Mendias, DACYA/UCM Spain
R. Hermida, DACYA/UCM Spain
F. Catthoor, IMEC Belgium
Outline
• Introduction
• NoC Emulation architecture
• NoC Emulation Flow
• Experimental results
• Conclusion
Motivations
• Growing interest in NoCs.
• Buses can no longer meet system requirements.
• Need for structured and reliable interconnect.
• NoC Design tools
• Synthesis tools like Xpipes allow us to quickly
generate NoC models.
• NoC models need to be accurately tested to
see how well they fit each concrete application.
–Simulation (cycle accurate).
–Emulation on FPGA.
Previous Work
• NoC software simulation:
• High level models in C/C++.
• To evaluate the latency of NoCs.
• To evaluate the throughput of NoCs.
• NoC implementation on FPGAs:
• For functional validation.
• To show the effectiveness of NoCs.
• To validate NoCs features.
NoC Emulation on FPGA
• Emulation can achieve significant speed-ups over
cycle-accurate simulation:
• Up to four orders of magnitude faster.
• Real inputs with millions of packets can be
emulated.
• Emulation on FPGA enables functional
validation of NoC systems.
• FPGA emulation is a way to run NoCs
accurately.
• Extensive profiling of statistics and NoC
emulation features are provided.
NoC Architecture to be emulated
• Our focus:
• Switch topology
• Switch parameters
–Number of inputs
–Number of outputs
–Size of buffers
NoC Emulation architecture
• A processor (e.g. PowerPC):
orchestrates the whole process.
• A monitor: displays on a PC
screen the information
extracted from the NoC emulation
components.
• The emulation platform.
• The processor accesses
each component through
its specific address.
• In our design, we allow up to 4
internal buses and 1024
devices on each internal bus.
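The address-based access described above can be illustrated with a short sketch. The field layout below (2 bits for the bus, 10 bits for the device, 4 bits for a register offset) is an assumption chosen only to match the stated limits of 4 buses and 1024 devices per bus; the slides do not give the actual memory map, and all names are hypothetical.

```python
# Hypothetical address layout: [bus:2][device:10][register:4].
# Only the limits (4 buses, 1024 devices per bus) come from the slides.
BUS_BITS = 2       # 4 internal buses
DEVICE_BITS = 10   # 1024 devices per bus
REG_BITS = 4       # assumed: 16 registers per device


def encode_address(bus: int, device: int, reg: int = 0) -> int:
    """Pack (bus, device, register) into one integer address."""
    if not 0 <= bus < (1 << BUS_BITS):
        raise ValueError("bus index out of range")
    if not 0 <= device < (1 << DEVICE_BITS):
        raise ValueError("device index out of range")
    if not 0 <= reg < (1 << REG_BITS):
        raise ValueError("register index out of range")
    return (bus << (DEVICE_BITS + REG_BITS)) | (device << REG_BITS) | reg


def decode_address(addr: int) -> tuple:
    """Unpack an address into (bus, device, register)."""
    reg = addr & ((1 << REG_BITS) - 1)
    device = (addr >> REG_BITS) & ((1 << DEVICE_BITS) - 1)
    bus = (addr >> (DEVICE_BITS + REG_BITS)) & ((1 << BUS_BITS) - 1)
    return bus, device, reg
```

With this layout, the processor would reach register 0 of device 5 on bus 1 at `encode_address(1, 5)`.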
Stochastic Traffic
• Uniform Model; Parameters:
• Length of packets.
• Interval between packets.
• Burst Model; Parameters:
• Transition probabilities in a 2-state Markov chain.
• Other models possible (e.g. Poisson…).
• Trace driven traffic generators:
• Generate traffic from a trace recorded on a real-life
application.
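The two stochastic models above can be sketched in software as follows. This is a minimal illustration, not the hardware traffic generator of the platform: the function names are hypothetical, "interval" is assumed to mean idle cycles between the end of one packet and the start of the next, and the burst model is assumed to emit one flit per cycle while in the burst state.

```python
import random


def uniform_traffic(n_packets, packet_length, interval):
    """Yield (start_cycle, length) for fixed-length, evenly spaced packets.

    Assumption: `interval` counts idle cycles between two packets.
    """
    cycle = 0
    for _ in range(n_packets):
        yield cycle, packet_length
        cycle += packet_length + interval


def burst_traffic(n_cycles, p_start, p_stop, seed=0):
    """Simulate a 2-state Markov chain (IDLE / BURST).

    p_start: probability of IDLE -> BURST on a given cycle.
    p_stop:  probability of BURST -> IDLE on a given cycle.
    Returns a list of booleans, True when a flit is emitted that cycle.
    """
    rng = random.Random(seed)  # seeded, like a register-initialized generator
    bursting = False
    emitted = []
    for _ in range(n_cycles):
        if bursting:
            if rng.random() < p_stop:
                bursting = False
        elif rng.random() < p_start:
            bursting = True
        emitted.append(bursting)
    return emitted
```

A low `p_stop` gives long bursts, and a low `p_start` gives long idle periods, so the two transition probabilities together set both the average load and the burstiness.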
Example of TG structure
• A bank of registers:
• For traffic parameterization.
• For random initialization.
• A packet generator which generates various traffic
patterns.
• A Network interface:
• Converts a traffic pattern into flits for the NoC.
• Can be adapted for any type of NoC.
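The network interface's role of turning a packet into flits can be sketched as below. The flit format (a head flit carrying the destination, followed by body flits and a final tail flit) and the 4-byte flit payload are assumptions for illustration; the slides do not specify the flit format, which depends on the NoC being emulated.

```python
from dataclasses import dataclass


@dataclass
class Flit:
    kind: str       # "head", "body" or "tail" (assumed flit types)
    dest: int       # destination identifier
    payload: bytes  # empty for the head flit in this sketch


def packetize(dest: int, data: bytes, flit_bytes: int = 4) -> list:
    """Split one packet into a head flit plus payload flits.

    The last payload flit is marked "tail"; an empty packet yields
    only the head flit.
    """
    chunks = [data[i:i + flit_bytes] for i in range(0, len(data), flit_bytes)]
    flits = [Flit("head", dest, b"")]
    for i, chunk in enumerate(chunks):
        kind = "tail" if i == len(chunks) - 1 else "body"
        flits.append(Flit(kind, dest, chunk))
    return flits
```

Adapting the interface to another type of NoC would then amount to changing `Flit` and `packetize` while leaving the packet generator untouched.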
Statistics reports and analysis
• Stochastic receptors:
• Histograms, which show an image of the
received traffic.
• Total running time.
• Trace driven receptors:
• Latency analyzer.
• Congestion counter.
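The statistics listed above can be modeled in software with a small receptor class. This is only a sketch under assumptions: the class and method names are hypothetical, the histogram is taken over packet lengths, and a packet is counted as "congested" when its latency exceeds an assumed zero-load latency. The slides do not define how the hardware congestion counter works.

```python
from collections import Counter


class TrafficReceptor:
    """Collects a histogram, latencies, and a simple congestion count."""

    def __init__(self, zero_load_latency):
        self.zero_load_latency = zero_load_latency  # assumed reference value
        self.length_histogram = Counter()  # packet length -> packets seen
        self.latencies = []
        self.congested_packets = 0
        self.total_running_time = 0  # cycle of the last received packet

    def receive(self, sent_cycle, recv_cycle, length):
        latency = recv_cycle - sent_cycle
        self.length_histogram[length] += 1
        self.latencies.append(latency)
        if latency > self.zero_load_latency:
            self.congested_packets += 1
        self.total_running_time = max(self.total_running_time, recv_cycle)

    def average_latency(self):
        if not self.latencies:
            return 0.0
        return sum(self.latencies) / len(self.latencies)

    def congestion_rate(self):
        if not self.latencies:
            return 0.0
        return self.congested_packets / len(self.latencies)
```

A stochastic receptor would use only the histogram and running time, while a trace-driven receptor would also read the latency and congestion figures.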
NoC Emulation flow: Approach
• Objective:  A HW/SW emulation environment.
• Provide a versatile emulation platform.
• Avoid frequent hardware re-synthesis.
• HW part: network of switches to emulate
any NoC packet-switching
intercommunication scheme.
• It can emulate different types of NoC and
compare their features.
• SW part: a processor configures and controls
the NoC emulation platform, selecting the features
to emulate and handling statistics acquisition.
NoC Emulation flow: Overview
1) Platform compilation: Setup of
NoC parameters, type of TG/TR.
2) Physical synthesis.
3) Platform initialization: Set up the
software with emulation
parameters.
4) Software compilation.
5) Emulation on FPGA: The
emulation runs according to the
user-specific setup.
6) Final report: The user visualizes
the results of the emulation on
the screen of his/her PC.
Emulation setting
• Platform settings:
• Topology, type of generators…
• Software settings:
• Traffic definition, orchestration of the emulation…
• Ease of use:
• Simplicity of the flow.
• Speed of the emulation.
FPGA reports
• Platform with:
• 4 TG, 4 TR
• 6 switches
7387 Xilinx slices (80%)

Device            Number of slices   FPGA percentage (%)
TG stochastic     719                7.8
TG trace driven   652                7.0
TR stochastic     371                4.0
TR trace driven   690                7.4
Control module    18                 0.2
FPGA reports
• Platform speed:
50 MHz
• The speed was chosen according to the
capabilities of our Virtex 2 Pro FPGA.
Simulation mode      Speed (cycles/sec)   Simulation time for 16 Mpackets   Simulation time for 1000 Mpackets
Verilog (ModelSim)   3.2K                 13h53’                            36 days 4h
SystemC (MPARM)      20K                  2h13’                             5 days 19h
Our Emulation        50M                  3.2 sec                           3’20’’
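The figures in this comparison can be cross-checked with simple arithmetic. The sketch below assumes an average of 10 cycles per packet; that value is not stated in the slides but is inferred from the emulation row (16 Mpackets in 3.2 sec at 50M cycles/sec is 160M cycles). The function names are hypothetical.

```python
CYCLES_PER_PACKET = 10  # inferred from the table, not stated in the slides


def run_time_seconds(n_packets, cycles_per_second,
                     cycles_per_packet=CYCLES_PER_PACKET):
    """Time needed to process n_packets at a given simulation speed."""
    return n_packets * cycles_per_packet / cycles_per_second


def split_duration(seconds):
    """Return (days, hours, minutes, seconds) for a duration in seconds."""
    total = int(round(seconds))
    days, rest = divmod(total, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, secs = divmod(rest, 60)
    return days, hours, minutes, secs


if __name__ == "__main__":
    speeds = [("Verilog (ModelSim)", 3_200),
              ("SystemC (MPARM)", 20_000),
              ("Our Emulation", 50_000_000)]
    for mode, speed in speeds:
        for n_packets in (16_000_000, 1_000_000_000):
            t = run_time_seconds(n_packets, speed)
            print(mode, n_packets, split_duration(t))
```

Under this assumption the computed durations agree with the table to within rounding, for example 36 days 4h for Verilog and 3 minutes 20 seconds for emulation at 1000 Mpackets. The ratio between 50M and 3.2K cycles/sec is about 15,600, which matches the claim of up to four orders of magnitude.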
Experimental setup
• Each TG generates
traffic at 45%
of the maximum
bandwidth, with two
routing possibilities
in two cases.
• Two inter-switch
links carry a
90% traffic load.
Experimental results
• With stochastic
traffic devices.
• Run-time vs.
Number of sent
packets.
• Burst traffic creates
more congestion on
the NoC than
uniform traffic.
Experimental results
• With trace driven
traffic devices.
• Congestion rate vs.
number of
packets per burst, for a
varying number of
flits per packet.
Measure of congestion according
to burst’s length in flits.
Experimental results
• With trace-driven
traffic devices.
• Average latency
vs. number of
packets per burst.
• The latency
reaches a
maximum.
• The maximum is a
function of the
congestion rate
(90%).
Conclusion
• Significant speed-ups in NoC studies are
possible with our NoC emulation platform.
• Our HW/SW emulation efficiently avoids
time-consuming HW re-synthesis when
many NoC parameters change.
• With larger FPGAs, it will be possible to
emulate very large NoCs (tens of switches in
the next generation of FPGAs).
