378 research outputs found

    Scalability of broadcast performance in wireless network-on-chip

    Get PDF
    Networks-on-Chip (NoCs) are currently the paradigm of choice to interconnect the cores of a chip multiprocessor. However, conventional NoCs may not suffice to fulfill the on-chip communication requirements of processors with hundreds or thousands of cores. The main reason is that the performance of such networks drops as the number of cores grows, especially in the presence of multicast and broadcast traffic. This not only limits the scalability of current multiprocessor architectures, but also sets a performance wall that prevents the development of architectures that generate moderate-to-high levels of multicast. In this paper, a Wireless Network-on-Chip (WNoC) where all cores share a single broadband channel is presented. Such design is conceived to provide low latency and ordered delivery for multicast/broadcast traffic, in an attempt to complement a wireline NoC that will transport the rest of communication flows. To assess the feasibility of this approach, the network performance of WNoC is analyzed as a function of the system size and the channel capacity, and then compared to that of wireline NoCs with embedded multicast support. Based on this evaluation, preliminary results on the potential performance of the proposed hybrid scheme are provided, together with guidelines for the design of MAC protocols for WNoC.Peer ReviewedPostprint (published version

    A Scalable & Energy Efficient Graphene-Based Interconnection Framework for Intra and Inter-Chip Wireless Communication in Terahertz Band

    Get PDF
    Network-on-Chips (NoCs) have emerged as a communication infrastructure for the multi-core System-on-Chips (SoCs). Despite its advantages, due to the multi-hop communication over the metal interconnects, traditional Mesh based NoC architectures are not scalable in terms of performance and energy consumption. Folded architectures such as Torus and Folded Torus were proposed to improve the performance of NoCs while retaining the regular tile-based structure for ease of manufacturing. Ultra-low-latency and low-power express channels between communicating cores have also been proposed to improve the performance of conventional NoCs. However, the performance gain of these approaches is limited due to metal/dielectric based interconnection. Many emerging interconnect technologies such as 3D integration, photonic, Radio Frequency (RF), and wireless interconnects have been envisioned to alleviate the issues of a metal/dielectric interconnect system. However, photonic and RF interconnects need the additional physically overlaid optical waveguides or micro-strip transmission lines to enable data transmission across the NoC. Several on-chip antennas have shown to improve energy efficiency and bandwidth of on-chip data communications. However, the date rates of the mm-wave wireless channels are limited by the state-of-the-art power-efficient transceiver design. Recent research has brought to light novel graphene based antennas operating at THz frequencies. Due to the higher operating frequencies compared to mm-wave transceivers, the data rate that can be supported by these antennas are significantly higher. Higher operating frequencies imply that graphene based antennas are just hundred micrometers in size compared to dimensions in the range of a millimeter of mm-wave antennas. Such reduced dimensions are suitable for integration of several such transceivers in a single NoC for relatively low overheads. In this work, to exploit the benefits of a regular NoC structure in conjunction with emerging Graphene-based wireless interconnect. We propose a toroidal folding based NoC architecture. The novelty of this folding based approach is that we are using low power, high bandwidth, single hop direct point to point wireless links instead of multihop communication that happens through metallic wires. We also propose a novel phased based communication protocol through which multiple wireless links can be made active at a time without having any interference among the transceiver. This offers huge gain in terms of performance as compared to token based mechanism where only a single wireless link can be made active at a time. We also propose to extend Graphene-based wireless links to enable energy-efficient, phase-based chip-to-chip communication to create a seamless, wireless interconnection fabric for multichip systems as well. Through cycle-accurate system-level simulations, we demonstrate that such designs with torus like folding based on THz links instead of global wires along with the proposed phase based multichip systems. We provide estimates that they are able to provide significant gains (about 3 to 4 times better in terms of achievable bandwidth, packet latency and average packet energy when compared to wired system) in performance and energy efficiency in data transfer in a NoC as well as multichip system. Thus, realization of these kind of interconnection framework that could support high data rate links in Tera-bits-per-second that will alleviate the capacity limitations of current interconnection framework

    Recent Trends and Considerations for High Speed Data in Chips and System Interconnects

    Get PDF
    This paper discusses key issues related to the design of large processing volume chip architectures and high speed system interconnects. Design methodologies and techniques are discussed, where recent trends and considerations are highlighted

    An Energy and Performance Exploration of Network-on-Chip Architectures

    Get PDF
    In this paper, we explore the designs of a circuit-switched router, a wormhole router, a quality-of-service (QoS) supporting virtual channel router and a speculative virtual channel router and accurately evaluate the energy-performance tradeoffs they offer. Power results from the designs placed and routed in a 90-nm CMOS process show that all the architectures dissipate significant idle state power. The additional energy required to route a packet through the router is then shown to be dominated by the data path. This leads to the key result that, if this trend continues, the use of more elaborate control can be justified and will not be immediately limited by the energy budget. A performance analysis also shows that dynamic resource allocation leads to the lowest network latencies, while static allocation may be used to meet QoS goals. Combining the power and performance figures then allows an energy-latency product to be calculated to judge the efficiency of each of the networks. The speculative virtual channel router was shown to have a very similar efficiency to the wormhole router, while providing a better performance, supporting its use for general purpose designs. Finally, area metrics are also presented to allow a comparison of implementation costs

    Smart Chips for Smart Surroundings -- 4S

    Get PDF
    The overall mission of the 4S project (Smart Chips for Smart Surroundings) was to define and develop efficient flexible, reconfigurable core building blocks, including the supporting tools, for future Ambient System Devices. Reconfigurability offers the needed flexibility and adaptability, it provides the efficiency needed for these systems, it enables systems that can adapt to rapidly changing environmental conditions, it enables communication over heterogeneous wireless networks, and it reduces risks: reconfigurable systems can adapt to standards that may vary from place to place or standards that have changed during and after product development. In 4S we focused on heterogeneous building blocks such as analogue, hardwired functions, fine and coarse grain reconfigurable tiles and microprocessors. Such a platform can adapt to a wide application space without the need for specialized ASICs. A novel power aware design flow and runtime system was developed. The runtime system decides dynamically about the near-optimal application mapping to the given hardware platform. The overall concept was verified on hardware platforms based on an existing SoC and in a second step with novel silicon. DRM (Digital Radio Mondiale) and MPEG4 Video applications have been implemented on the platforms demonstrating the adaptability of the 4S concept

    Understanding the Impact of On-chip Communication on DNN Accelerator Performance

    Full text link
    Deep Neural Networks have flourished at an unprecedented pace in recent years. They have achieved outstanding accuracy in fields such as computer vision, natural language processing, medicine or economics. Specifically, Convolutional Neural Networks (CNN) are particularly suited to object recognition or identification tasks. This, however, comes at a high computational cost, prompting the use of specialized GPU architectures or even ASICs to achieve high speeds and energy efficiency. ASIC accelerators streamline the execution of certain dataflows amenable to CNN computation that imply the constant movement of large amounts of data, thereby turning on-chip communication into a critical function within the accelerator. This paper studies the communication flows within CNN inference accelerators of edge devices, with the aim to justify current and future decisions in the design of the on-chip networks that interconnect their processing elements. Leveraging this analysis, we then qualitatively discuss the potential impact of introducing the novel paradigm of wireless on-chip network in this context.Comment: ICECS201

    Computing and communications for the software-defined metamaterial paradigm: a context analysis

    Get PDF
    Metamaterials are artificial structures that have recently enabled the realization of novel electromagnetic components with engineered and even unnatural functionalities. Existing metamaterials are specifically designed for a single application working under preset conditions (e.g., electromagnetic cloaking for a fixed angle of incidence) and cannot be reused. Software-defined metamaterials (SDMs) are a much sought-after paradigm shift, exhibiting electromagnetic properties that can be reconfigured at runtime using a set of software primitives. To enable this new technology, SDMs require the integration of a network of controllers within the structure of the metamaterial, where each controller interacts locally and communicates globally to obtain the programmed behavior. The design approach for such controllers and the interconnection network, however, remains unclear due to the unique combination of constraints and requirements of the scenario. To bridge this gap, this paper aims to provide a context analysis from the computation and communication perspectives. Then, analogies are drawn between the SDM scenario and other applications both at the micro and nano scales, identifying possible candidates for the implementation of the controllers and the intra-SDM network. Finally, the main challenges of SDMs related to computing and communications are outlined.Peer ReviewedPostprint (published version

    Evolution of Publications, Subjects, and Co-authorships in Network-On-Chip Research From a Complex Network Perspective

    Get PDF
    The academia and industry have been pursuing network-on-chip (NoC) related research since two decades ago when there was an urgency to respond to the scaling and technological challenges imposed on intra-chip communication in SoC designs. Like any other research topic, NoC inevitably goes through its life cycle: A. it started up (2000-2007) and quickly gained traction in its own right; B. it then entered the phase of growth and shakeout (2008-2013) with the research outcomes peaked in 2010 and remained high for another four/five years; C. NoC research was considered mature and stable (2014-2020), with signs showing a steady slowdown. Although from time to time, excellent survey articles on different subjects/aspects of NoC appeared in the open literature, yet there is no general consensus on where we are in this NoC roadmap and where we are heading, largely due to lack of an overarching methodology and tool to assess and quantify the research outcomes and evolution. In this paper, we address this issue from the perspective of three specific complex networks, namely the citation network, the subject citation network, and the co-authorship network. The network structure parameters (e.g., modularity, diameter, etc.) and graph dynamics of the three networks are extracted and analyzed, which helps reveal and explain the reasons and the driving forces behind all the changes observed in NoC research over 20 years. Additional analyses are performed in this study to link interesting phenomena surrounding the NoC area. They include: (1) relationships between communities in citation networks and NoC subjects, (2) measure and visualization of a subject\u27s influence score and its evolution, (3) knowledge flow among the six most popular NoC subjects and their relationships, (4) evolution of various subjects in terms of number of publications, (5) collaboration patterns and cross-community collaboration among the authors in NoC research, (6) interesting observation of career lifetime and productivity among NoC researchers, and finally (7) investigation of whether or not new authors are chasing hot subjects in NoC. All these analyses have led to a prediction of publications, subjects, and co-authorship in NoC research in the near future, which is also presented in the paper

    Architecting a One-to-many Traffic-Aware and Secure Millimeter-Wave Wireless Network-in-Package Interconnect for Multichip Systems

    Get PDF
    With the aggressive scaling of device geometries, the yield of complex Multi Core Single Chip(MCSC) systems with many cores will decrease due to the higher probability of manufacturing defects especially, in dies with a large area. Disintegration of large System-on-Chips(SoCs) into smaller chips called chiplets has shown to improve the yield and cost of complex systems. Therefore, platform-based computing modules such as embedded systems and micro-servers have already adopted Multi Core Multi Chip (MCMC) architectures overMCSC architectures. Due to the scaling of memory intensive parallel applications in such systems, data is more likely to be shared among various cores residing in different chips resulting in a significant increase in chip-to-chip traffic, especially one-to-many traffic. This one-to-many traffic is originated mainly to maintain cache-coherence between many cores residing in multiple chips. Besides, one-to-many traffics are also exploited by many parallel programming models, system-level synchronization mechanisms, and control signals. How-ever, state-of-the-art Network-on-Chip (NoC)-based wired interconnection architectures do not provide enough support as they handle such one-to-many traffic as multiple unicast trafficusing a multi-hop MCMC communication fabric. As a result, even a small portion of such one-to-many traffic can significantly reduce system performance as traditional NoC-basedinterconnect cannot mask the high latency and energy consumption caused by chip-to-chipwired I/Os. Moreover, with the increase in memory intensive applications and scaling of MCMC systems, traditional NoC-based wired interconnects fail to provide a scalable inter-connection solution required to support the increased cache-coherence and synchronization generated one-to-many traffic in future MCMC-based High-Performance Computing (HPC) nodes. Therefore, these computation and memory intensive MCMC systems need an energy-efficient, low latency, and scalable one-to-many (broadcast/multicast) traffic-aware interconnection infrastructure to ensure high-performance. Research in recent years has shown that Wireless Network-in-Package (WiNiP) architectures with CMOS compatible Millimeter-Wave (mm-wave) transceivers can provide a scalable, low latency, and energy-efficient interconnect solution for on and off-chip communication. In this dissertation, a one-to-many traffic-aware WiNiP interconnection architecture with a starvation-free hybrid Medium Access Control (MAC), an asymmetric topology, and a novel flow control has been proposed. The different components of the proposed architecture are individually one-to-many traffic-aware and as a system, they collaborate with each other to provide required support for one-to-many traffic communication in a MCMC environment. It has been shown that such interconnection architecture can reduce energy consumption and average packet latency by 46.96% and 47.08% respectively for MCMC systems. Despite providing performance enhancements, wireless channel, being an unguided medium, is vulnerable to various security attacks such as jamming induced Denial-of-Service (DoS), eavesdropping, and spoofing. Further, to minimize the time-to-market and design costs, modern SoCs often use Third Party IPs (3PIPs) from untrusted organizations. An adversary either at the foundry or at the 3PIP design house can introduce a malicious circuitry, to jeopardize an SoC. Such malicious circuitry is known as a Hardware Trojan (HT). An HTplanted in the WiNiP from a vulnerable design or manufacturing process can compromise a Wireless Interface (WI) to enable illegitimate transmission through the infected WI resulting in a potential DoS attack for other WIs in the MCMC system. Moreover, HTs can be used for various other malicious purposes, including battery exhaustion, functionality subversion, and information leakage. This information when leaked to a malicious external attackercan reveals important information regarding the application suites running on the system, thereby compromising the user profile. To address persistent jamming-based DoS attack in WiNiP, in this dissertation, a secure WiNiP interconnection architecture for MCMC systems has been proposed that re-uses the one-to-many traffic-aware MAC and existing Design for Testability (DFT) hardware along with Machine Learning (ML) approach. Furthermore, a novel Simulated Annealing (SA)-based routing obfuscation mechanism was also proposed toprotect against an HT-assisted novel traffic analysis attack. Simulation results show that,the ML classifiers can achieve an accuracy of 99.87% for DoS attack detection while SA-basedrouting obfuscation could reduce application detection accuracy to only 15% for HT-assistedtraffic analysis attack and hence, secure the WiNiP fabric from age-old and emerging attacks
    corecore