1,573 research outputs found
Fast, Accurate and Detailed NoC Simulations
Network-on-Chip (NoC) architectures have a wide variety of parameters that can be adapted to the designer's requirements. Fast exploration of this parameter space is only possible at a high-level and several methods have been proposed. Cycle and bit accurate simulation is necessary when the actual router's RTL description needs to be evaluated and verified. However, extensive simulation of the NoC architecture with cycle and bit accuracy is prohibitively time consuming. In this paper we describe a simulation method to simulate large parallel homogeneous and heterogeneous network-on-chips on a single FPGA. The method is especially suitable for parallel systems where lengthy cycle and bit accurate simulations are required. As a case study, we use a NoC that was modelled and simulated in SystemC. We simulate the same NoC on the described FPGA simulator. This enables us to observe the NoC behavior under a large variety of traffic patterns. Compared with the SystemC simulation we achieved a speed-up of 80-300, without compromising the cycle and bit level accuracy
A case study for NoC based homogeneous MPSoC architectures
The many-core design paradigm requires flexible and modular hardware and software components to provide the required scalability to next-generation on-chip multiprocessor architectures. A multidisciplinary approach is necessary to consider all the interactions between the different components of the design. In this paper, a complete design methodology that tackles at once the aspects of system level modeling, hardware architecture, and programming model has been successfully used for the implementation of a multiprocessor network-on-chip (NoC)-based system, the NoCRay graphic accelerator. The design, based on 16 processors, after prototyping with field-programmable gate array (FPGA), has been laid out in 90-nm technology. Post-layout results show very low power, area, as well as 500 MHz of clock frequency. Results show that an array of small and simple processors outperform a single high-end general purpose processo
Heracles: Fully Synthesizable Parameterized MIPS-Based Multicore System
Heracles is an open-source complete multicore system written in Verilog. It is fully parameterized and can be reconfigured and synthesized into different topologies and sizes. Each processing node has a 7-stage pipeline, fully bypassed, microprocessor running the MIPS-III ISA, a 4-stage input-buffer, virtual-channel router, and a local variable-size shared memory. Our design is highly modular with clear interfaces between the core, the memory hierarchy, and the on-chip network. In the baseline design, the microprocessor is attached to two caches, one instruction cache and one data cache, which are oblivious to the global memory organization. The memory system in Heracles can be configured as one single global shared memory (SM), or distributed shared memory (DSM), or any combination thereof. Each core is connected to the rest of the network of processors by a parameterized, realistic, wormhole router. We show different topology configurations of the system, and their synthesis results on the Xilinx Virtex-5 LX330T FPGA board. We also provide a small MIPS cross-compiler toolchain to assist in developing software for Heracles
Design and implementation of the Quarc network on-chip
Networks-on-Chip (NoC) have emerged as alternative to buses to provide a packet-switched communication medium for modular development of large Systems-on-Chip. However, to successfully replace its predecessor, the NoC has to be able to efficiently exchange all types of traffic including collective communications. The latter is especially important for e.g. cache updates in multicore systems. The Quarc NoC architecture has been introduced as a Networks-on-Chip which is highly efficient in exchanging all types of traffic including broadcast and multicast. In this paper we present the hardware implementation of the switch architecture and the network adapter (transceiver) of the Quarc NoC. Moreover, the paper presents an analysis and comparison of the cost and performance between the Quarc and the Spidergon NoCs implemented in Verilog targeting the Xilinx Virtex FPGA family. We demonstrate a dramatic improvement in performance over the Spidergon especially for broadcast traffic, at no additional hardware cost
A Benes Based NoC Switching Architecture for Mixed Criticality Embedded Systems
Multi-core, Mixed Criticality Embedded (MCE) real-time systems require high
timing precision and predictability to guarantee there will be no interference
between tasks. These guarantees are necessary in application areas such as
avionics and automotive, where task interference or missed deadlines could be
catastrophic, and safety requirements are strict. In modern multi-core systems,
the interconnect becomes a potential point of uncertainty, introducing major
challenges in proving behaviour is always within specified constraints,
limiting the means of growing system performance to add more tasks, or provide
more computational resources to existing tasks.
We present MCENoC, a Network-on-Chip (NoC) switching architecture that
provides innovations to overcome this with predictable, formally verifiable
timing behaviour that is consistent across the whole NoC. We show how the
fundamental properties of Benes networks benefit MCE applications and meet our
architecture requirements. Using SystemVerilog Assertions (SVA), formal
properties are defined that aid the refinement of the specification of the
design as well as enabling the implementation to be exhaustively formally
verified. We demonstrate the performance of the design in terms of size,
throughput and predictability, and discuss the application level considerations
needed to exploit this architecture
Automated Hardware Prototyping for 3D Network on Chips
Vor mehr als 50 Jahren stellte IntelÂź MitbegrĂŒnder Gordon Moore eine Prognose zum Entwicklungsprozess der Transistortechnologie auf. Er prognostizierte, dass sich die Zahl der Transistoren in integrierten Schaltungen alle zwei Jahre verdoppeln wird. Seine Aussage ist immer noch gĂŒltig, aber ein Ende von Moores Gesetz ist in Sicht. Mit dem Ende von Mooreâs Gesetz mĂŒssen neue Aspekte untersucht werden, um weiterhin die Leistung von integrierten Schaltungen zu steigern. Zwei mögliche AnsĂ€tze fĂŒr "More than Mooreâ sind 3D-Integrationsverfahren und heterogene Systeme. Gleichzeitig entwickelt sich ein Trend hin zu Multi-Core Prozessoren, basierend auf Networks on chips (NoCs).
Neben dem Ende des Mooreschen Gesetzes ergeben sich bei immer kleiner werdenden TechnologiegröĂen, vor allem jenseits der 60 nm, neue Herausforderungen. Eine Schwierigkeit ist die WĂ€rmeableitung in groĂskalierten integrierten Schaltkreisen und die daraus resultierende Ăberhitzung des Chips. Um diesem Problem in modernen Multi-Core Architekturen zu begegnen, muss auch die Verlustleistung der Netzwerkressourcen stark reduziert werden. Diese Arbeit umfasst eine durch Hardware gesteuerte Kombination aus Frequenzskalierung und Power Gating fĂŒr 3D On-Chip Netzwerke, einschlieĂlich eines FPGA Prototypen. DafĂŒr wurde ein Takt-synchrones 2D Netzwerk auf ein dreidimensionales asynchrones Netzwerk mit mehreren Frequenzbereichen erweitert. ZusĂ€tzlich wurde ein skalierbares Online-Power-Management System mit geringem Ressourcenaufwand entwickelt.
Die Verifikation neuer Hardwarekomponenten ist einer der zeitaufwendigsten Schritte im Entwicklungsprozess hochintegrierter digitaler Schaltkreise. Um diese Aufgabe zu beschleunigen und um eine parallele Softwareentwicklung zu ermöglichen, wurde im Rahmen dieser Arbeit ein automatisiertes und benutzerfreundliches Tool fĂŒr den Entwurf neuer Hardware Projekte entwickelt. Eine grafische BenutzeroberflĂ€che zum Erstellen des gesamten Designablaufs, vom Erstellen der Architektur, Parameter Deklaration, Simulation, Synthese und Test ist Teil dieses Werkzeugs. Zudem stellt die GröĂe der Architektur fĂŒr die Erstellung eines Prototypen eine besondere Herausforderung dar. FrĂŒhere Arbeiten haben es versĂ€umt, eine schnelles und unkompliziertes Prototyping, insbesondere von Architekturen mit mehr als 50 Prozessorkernen, zu realisieren. Diese Arbeit umfasst eine Design Space Exploration und FPGA-basierte Prototypen von verschiedenen 3D-NoC Implementierungen mit mehr als 80 Prozessoren
- âŠ