358 research outputs found

    High-level services for networks-on-chip

    Get PDF
    Future technology trends envision that next-generation Multiprocessors Systems-on- Chip (MPSoCs) will be composed of a combination of a large number of processing and storage elements interconnected by complex communication architectures. Communication and interconnection between these basic blocks play a role of crucial importance when the number of these elements increases. Enabling reliable communication channels between cores becomes therefore a challenge for system designers. Networks-on-Chip (NoCs) appeared as a strategy for connecting and managing the communication between several design elements and IP blocks, as required in complex Systems-on-Chip (SoCs). The topic can be considered as a multidisciplinary synthesis of multiprocessing, parallel computing, networking, and on- chip communication domains. Networks-on-Chip, in addition to standard communication services, can be employed for providing support for the implementation of system-level services. This dissertation will demonstrate how high-level services can be added to an MPSoC platform by embedding appropriate hardware/software support in the network interfaces (NIs) of the NoC. In this dissertation, the implementation of innovative modules acting in parallel with protocol translation and data transmission in NIs is proposed and evaluated. The modules can support the execution of the high-level services in the NoC at a relatively low cost in terms of area and energy consumption. Three types of services will be addressed and discussed: security, monitoring, and fault tolerance. With respect to the security aspect, this dissertation will discuss the implementation of an innovative data protection mechanism for detecting and preventing illegal accesses to protected memory blocks and/or memory mapped peripherals. The second aspect will be addressed by proposing the implementation of a monitoring system based on programmable multipurpose monitoring probes aimed at detecting NoC internal events and run-time characteristics. As last topic, new architectural solutions for the design of fault tolerant network interfaces will be presented and discussed

    A Scalable and Adaptive Network on Chip for Many-Core Architectures

    Get PDF
    In this work, a scalable network on chip (NoC) for future many-core architectures is proposed and investigated. It supports different QoS mechanisms to ensure predictable communication. Self-optimization is introduced to adapt the energy footprint and the performance of the network to the communication requirements. A fault tolerance concept allows to deal with permanent errors. Moreover, a template-based automated evaluation and design methodology and a synthesis flow for NoCs is introduced

    A case study for NoC based homogeneous MPSoC architectures

    Get PDF
    The many-core design paradigm requires flexible and modular hardware and software components to provide the required scalability to next-generation on-chip multiprocessor architectures. A multidisciplinary approach is necessary to consider all the interactions between the different components of the design. In this paper, a complete design methodology that tackles at once the aspects of system level modeling, hardware architecture, and programming model has been successfully used for the implementation of a multiprocessor network-on-chip (NoC)-based system, the NoCRay graphic accelerator. The design, based on 16 processors, after prototyping with field-programmable gate array (FPGA), has been laid out in 90-nm technology. Post-layout results show very low power, area, as well as 500 MHz of clock frequency. Results show that an array of small and simple processors outperform a single high-end general purpose processo

    Driving the Network-on-Chip Revolution to Remove the Interconnect Bottleneck in Nanoscale Multi-Processor Systems-on-Chip

    Get PDF
    The sustained demand for faster, more powerful chips has been met by the availability of chip manufacturing processes allowing for the integration of increasing numbers of computation units onto a single die. The resulting outcome, especially in the embedded domain, has often been called SYSTEM-ON-CHIP (SoC) or MULTI-PROCESSOR SYSTEM-ON-CHIP (MP-SoC). MPSoC design brings to the foreground a large number of challenges, one of the most prominent of which is the design of the chip interconnection. With a number of on-chip blocks presently ranging in the tens, and quickly approaching the hundreds, the novel issue of how to best provide on-chip communication resources is clearly felt. NETWORKS-ON-CHIPS (NoCs) are the most comprehensive and scalable answer to this design concern. By bringing large-scale networking concepts to the on-chip domain, they guarantee a structured answer to present and future communication requirements. The point-to-point connection and packet switching paradigms they involve are also of great help in minimizing wiring overhead and physical routing issues. However, as with any technology of recent inception, NoC design is still an evolving discipline. Several main areas of interest require deep investigation for NoCs to become viable solutions: • The design of the NoC architecture needs to strike the best tradeoff among performance, features and the tight area and power constraints of the onchip domain. • Simulation and verification infrastructure must be put in place to explore, validate and optimize the NoC performance. • NoCs offer a huge design space, thanks to their extreme customizability in terms of topology and architectural parameters. Design tools are needed to prune this space and pick the best solutions. • Even more so given their global, distributed nature, it is essential to evaluate the physical implementation of NoCs to evaluate their suitability for next-generation designs and their area and power costs. This dissertation performs a design space exploration of network-on-chip architectures, in order to point-out the trade-offs associated with the design of each individual network building blocks and with the design of network topology overall. The design space exploration is preceded by a comparative analysis of state-of-the-art interconnect fabrics with themselves and with early networkon- chip prototypes. The ultimate objective is to point out the key advantages that NoC realizations provide with respect to state-of-the-art communication infrastructures and to point out the challenges that lie ahead in order to make this new interconnect technology come true. Among these latter, technologyrelated challenges are emerging that call for dedicated design techniques at all levels of the design hierarchy. In particular, leakage power dissipation, containment of process variations and of their effects. The achievement of the above objectives was enabled by means of a NoC simulation environment for cycleaccurate modelling and simulation and by means of a back-end facility for the study of NoC physical implementation effects. Overall, all the results provided by this work have been validated on actual silicon layout

    Quarc: a novel network-on-chip architecture

    Get PDF
    This paper introduces the Quarc NoC, a novel NoC architecture inspired by the Spidergon NoC. The Quarc scheme significantly outperforms the Spidergon NoC through balancing the traffic which is the result of the modifications applied to the topology and the routing elements.The proposed architecture is highly efficient in performing collective communication operations including broadcast and multicast. We present the topology, routing discipline and switch architecture for the Quarc NoC and demonstrate the performance with the results obtained from discrete event simulations

    Introduction de mécanismes de tolérance aux pannes franches dans les architectures de processeur « many-core » à mémoire partagée cohérente

    Get PDF
    The always increasing performance demands of applications such as cryptography, scientific simulation, network packets dispatching, signal processing or even general-purpose computing has made of many-core architectures a necessary trend in the processor design. These architectures can have hundreds or thousands of processor cores, so as to provide important computational throughputs with a reasonable power consumption. However, their important transistor density makes many-core architectures more prone to hardware failures. There is an augmentation in the fabrication process variability, and in the stress factors of transistors, which impacts both the manufacturing yield and lifetime. A potential solution to this problem is the introduction of fault-tolerance mechanisms allowing the processor to function in a degraded mode despite the presence of defective internal components. We propose a complete in-the-field reconfiguration-based permanent failure recovery mechanism for shared-memory many-core processors. This mechanism is based on a firmware (stored in distributed on-chip read-only memories) executed at each hardware reset by the internal processor cores without any external intervention. It consists in distributed software procedures, which locate the faulty components (cores, memory banks, and network-on-chip routers), reconfigure the hardware architecture, and provide a description of the functional hardware infrastructure to the operating system. Our proposal is evaluated using a cycle-accurate SystemC virtual prototype of an existing many-core architecture. We evaluate both its latency, and its silicon cost.L'augmentation continue de la puissance de calcul requise par les applications telles que la cryptographie, la simulation, ou le traitement du signal a fait évoluer la structure interne des processeurs vers des architectures massivement parallèles (dites « many-core »). Ces architectures peuvent contenir des centaines, voire des milliers de cœurs afin de fournir une puissance de calcul importante avec une consommation énergétique raisonnable. Néanmoins, l'importante densité de transistors fait que ces architectures sont très susceptibles aux pannes matérielles. L'augmentation dans la variabilité du processus de fabrication, et dans les facteurs de stress des transistors, dégrade à la fois le rendement de fabrication, et leur durée de vie. Nous proposons donc un mécanisme complet de tolérance aux pannes franches, permettant les architectures « many-core » à mémoire partagée cohérente de fonctionner dans un mode dégradé. Ce mécanisme s'appuie sur un logiciel embarqué et distribué dans des mémoires sur puce (« firmware »), qui est exécuté par les cœurs à chaque démarrage du processeur. Ce logiciel implémente plusieurs algorithmes distribués permettant de localiser les composants défaillants (cœurs, bancs mémoires, et routeurs des réseaux sur puce), de reconfigurer l'architecture matérielle, et de fournir une cartographie de l'infrastructure matérielle fonctionnelle au système d'exploitation. Le mécanisme supporte aussi bien des défauts de fabrication, que des pannes de vieillissement après que la puce est en service dans l'équipement. Notre proposition est évaluée en utilisant un prototype virtuel précis au cycle d'une architecture « many-core » existante

    Interconnect architectures for dynamically partially reconfigurable systems

    Get PDF
    Dynamically partially reconfigurable FPGAs (Field-Programmable Gate Arrays) allow hardware modules to be placed and removed at runtime while other parts of the system keep working. With their potential benefits, they have been the topic of a great deal of research over the last decade. To exploit the partial reconfiguration capability of FPGAs, there is a need for efficient, dynamically adaptive communication infrastructure that automatically adapts as modules are added to and removed from the system. Many bus and network-on-chip (NoC) architectures have been proposed to exploit this capability on FPGA technology. However, few realizations have been reported in the public literature to demonstrate or compare their performance in real world applications. While partial reconfiguration can offer many benefits, it is still rarely exploited in practical applications. Few full realizations of partially reconfigurable systems in current FPGA technologies have been published. More application experiments are required to understand the benefits and limitations of implementing partially reconfigurable systems and to guide their further development. The motivation of this thesis is to fill this research gap by providing empirical evidence of the cost and benefits of different interconnect architectures. The results will provide a baseline for future research and will be directly useful for circuit designers who must make a well-reasoned choice between the alternatives. This thesis contains the results of experiments to compare different NoC and bus interconnect architectures for FPGA-based designs in general and dynamically partially reconfigurable systems. These two interconnect schemes are implemented and evaluated in terms of performance, area and power consumption using FFT (Fast Fourier Transform) andANN(Artificial Neural Network) systems as benchmarks. Conclusions drawn from these results include recommendations concerning the interconnect approach for different kinds of applications. It is found that a NoC provides much better performance than a single channel bus and similar performance to a multi-channel bus in both parallel and parallel-pipelined FFT systems. This suggests that a NoC is a better choice for systems with multiple simultaneous communications like the FFT. Bus-based interconnect achieves better performance and consume less area and power than NoCbased scheme for the fully-connected feed-forward NN system. This suggests buses are a better choice for systems that do not require many simultaneous communications or systems with broadcast communications like a fully-connected feed-forward NN. Results from the experiments with dynamic partial reconfiguration demonstrate that buses have the advantages of better resource utilization and smaller reconfiguration time and memory than NoCs. However, NoCs are more flexible and expansible. They have the advantage of placing almost all of the communication infrastructure in the dynamic reconfiguration region. This means that different applications running on the FPGA can use different interconnection strategies without the overhead of fixed bus resources in the static region. Another objective of the research is to examine the partial reconfiguration process and reconfiguration overhead with current FPGA technologies. Partial reconfiguration allows users to efficiently change the number of running PEs to choose an optimal powerperformance operating point at the minimum cost of reconfiguration. However, this brings drawbacks including resource utilization inefficiency, power consumption overhead and decrease in system operating frequency. The experimental results report a 50% of resource utilization inefficiency with a power consumption overhead of less than 5% and a decrease in frequency of up to 32% compared to a static implementation. The results also show that most of the drawbacks of partial reconfiguration implementation come from the restrictions and limitations of partial reconfiguration design flow. If these limitations can be addressed, partial reconfiguration should still be considered with its potential benefits.Thesis (Ph.D.) -- University of Adelaide, School of Electrical and Electronic Engineering, 201

    Software-based and regionally-oriented traffic management in Networks-on-Chip

    Get PDF
    Since the introduction of chip-multiprocessor systems, the number of integrated cores has been steady growing and workload applications have been adapted to exploit the increasing parallelism. This changed the importance of efficient on-chip communication significantly and the infrastructure has to keep step with these new requirements. The work at hand makes significant contributions to the state-of-the-art of the latest generation of such solutions, called Networks-on-Chip, to improve the performance, reliability, and flexible management of these on-chip infrastructures
    corecore