Simulation Environment for Link Energy Estimation in Networks-on-Chip with Virtual Channels
Network-on-chip (NoC) is the most promising design paradigm for the
interconnect architecture of a multiprocessor system-on-chip (MPSoC). On the
downside, a NoC has a significant impact on the overall energy consumption of
the system. NoC simulators are highly relevant for design space exploration
even at an early stage. Since NoC links consume up to 50% of the energy, a
realistic model of link energy consumption in NoC simulators is important. This
work presents a simulation environment that, for the first time, implements a
technique to precisely estimate the data-dependent link energy consumption in
NoCs with virtual channels. Our model works at a high level of abstraction,
making it feasible to estimate the energy requirements at an early design
stage. Additionally, it enables the fast evaluation and early exploration of
low-power coding techniques. The presented model is applicable to 2D and 3D
NoCs. A case study for an image processing application shows that the current
link model underestimates the link energy consumption by up to a factor of
four. In contrast, the technique presented in this paper estimates the energy
quantities precisely, with an error below 1% compared to results obtained by
precise but computationally expensive bit-level simulation.
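The notion of data-dependent link energy can be illustrated with a minimal
sketch. It assumes a simplified model in which link energy is proportional to
the number of bit toggles between consecutive flits on the link, and it ignores
coupling between adjacent wires. The function names and the per-toggle energy
value are hypothetical and are not taken from the simulation environment
described above.

```python
from typing import Iterable, List


def count_toggles(flits: Iterable[int], initial: int = 0) -> int:
    """Count bit toggles on a link that transmits the given flits in order."""
    previous = initial
    toggles = 0
    for flit in flits:
        # XOR marks the wires whose value changes between consecutive flits.
        toggles += bin(previous ^ flit).count("1")
        previous = flit
    return toggles


def link_energy_joules(flits: Iterable[int],
                       energy_per_toggle_j: float,
                       initial: int = 0) -> float:
    """Simplified energy estimate: toggles times a constant energy per toggle."""
    return count_toggles(flits, initial) * energy_per_toggle_j


def interleave(packet_a: List[int], packet_b: List[int]) -> List[int]:
    """Alternate flits of two equally long packets, as two virtual channels
    sharing one physical link might do."""
    mixed: List[int] = []
    for flit_a, flit_b in zip(packet_a, packet_b):
        mixed.extend([flit_a, flit_b])
    return mixed


# Hypothetical 8-bit flits and a hypothetical energy per toggle.
ENERGY_PER_TOGGLE_J = 1e-12
packet_a = [0xFF, 0xFF]
packet_b = [0x00, 0x00]

sequential = packet_a + packet_b             # packets sent back to back
interleaved = interleave(packet_a, packet_b)  # flits alternate on the link

print(count_toggles(sequential))   # 16
print(count_toggles(interleaved))  # 32
print(link_energy_joules(interleaved, ENERGY_PER_TOGGLE_J))
```

In this toy case the same flits cause twice as many toggles when two virtual
channels interleave on the link, which is why a model that ignores the actual
flit order on the link can misestimate the energy.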
Ratatoskr: An open-source framework for in-depth power, performance and area analysis in 3D NoCs
We introduce ratatoskr, an open-source framework for in-depth power,
performance and area (PPA) analysis in NoCs for 3D-integrated and heterogeneous
System-on-Chips (SoCs). It covers all layers of abstraction by providing a NoC
hardware implementation at RT level, a NoC simulator at cycle-accurate level
and an application model at transaction level. Through this comprehensive approach,
ratatoskr can provide the following specific PPA analyses: Dynamic power of
links can be measured within 2.4% accuracy of bit-level simulations while
maintaining cycle-accurate simulation speed. Router power is determined from RT
level synthesis combined with cycle-accurate simulations. The performance of
the whole NoC can be measured both via cycle-accurate and RT level simulations.
The performance of individual routers is obtained from RT level including
gate-level verification. The NoC area is calculated from RT level. Despite
these manifold features, ratatoskr offers easy two-step user interaction:
First, a single point of entry allows design parameters to be set, and second,
PPA reports are generated automatically. For both the input and the output,
different levels of abstraction can be chosen for high-level rapid network
analysis or low-level improvement of architectural details. The synthesized NoC
model achieves reductions of up to 32% in total router power and 3% in router
area in comparison to a conventional standard router. As a forward-thinking and
unique feature not found in other NoC PPA-measurement tools, ratatoskr supports
heterogeneous 3D integration, which is one of the most promising integration
paradigms for upcoming SoCs. Thereby, ratatoskr lays the groundwork for
designing their communication architectures.
NoCs in Heterogeneous 3D SoCs: Co-Design of Routing Strategies and Microarchitectures
Heterogeneous 3D System-on-Chips (3D SoCs) are the most promising design
paradigm to combine sensing and computing within a single chip. A special
characteristic of communication networks in heterogeneous 3D SoCs is the
varying latency and throughput in each layer. As shown in this work, this
variance drastically degrades the network performance. We contribute a
co-design of routing algorithms and router microarchitecture that overcomes
these performance limitations. First, we analyze the challenges of
heterogeneity: we propose technology-aware communication models that identify
layers in which packets are transmitted more slowly. The
communication models are precise for latency and throughput under zero load.
The technology model has an area error and a timing error of less than 7.4% for
various commercial technologies from 90 nm to 28 nm. Second, we demonstrate how
to overcome the limitations of heterogeneity by proposing two novel routing
algorithms, Z+(XY)Z- and ZXYZ, that improve latency by up to 6.5x compared to
conventional dimension-order routing. Furthermore, we propose a high
vertical-throughput router microarchitecture that is tailored to the routing
algorithms and fully overcomes the limitations of slower layers. We
achieve a throughput increase of 2x to 4x compared to a conventional router.
Thereby, the dynamic power of routers is reduced by up to 41.1%, and flit
latency improves by up to 2.26x, at small total router area costs between
2.1% and 10.4% for realistic technologies and application scenarios.
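For orientation, the conventional dimension-order routing used as the baseline
above can be sketched as follows. This is a minimal illustration that assumes a
3D mesh with integer (x, y, z) router coordinates and an X-then-Y-then-Z
traversal order; the function names are hypothetical. The Z+(XY)Z- and ZXYZ
algorithms are not specified in the abstract and are therefore not sketched
here.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]


def _step_towards(value: int, target: int) -> int:
    """Move one position along a dimension towards the target."""
    return value + (1 if target > value else -1)


def next_hop_xyz(current: Coord, destination: Coord) -> Coord:
    """Dimension-order routing: resolve the X offset first, then Y, then Z."""
    x, y, z = current
    dest_x, dest_y, dest_z = destination
    if x != dest_x:
        return (_step_towards(x, dest_x), y, z)
    if y != dest_y:
        return (x, _step_towards(y, dest_y), z)
    if z != dest_z:
        return (x, y, _step_towards(z, dest_z))
    return current  # already at the destination router


def route_xyz(source: Coord, destination: Coord) -> List[Coord]:
    """Return all routers visited from source to destination, inclusive."""
    path = [source]
    while path[-1] != destination:
        path.append(next_hop_xyz(path[-1], destination))
    return path


print(route_xyz((0, 0, 0), (2, 1, 1)))
# [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0), (2, 1, 1)]
```

Because the traversal order is fixed, such a route takes no account of which
layer transmits packets more slowly, which is the limitation the abstract
attributes to heterogeneous 3D SoCs.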