664 research outputs found

    An efficient asynchronous spatial division multiplexing router for network-on-chip on the hardware platform

    Get PDF
    The quasi-delay-insensitive (QDI) based asynchronous network-on-chip (ANoC) has several advantages over clock-based synchronous network-on-chips (NoCs). The asynchronous router uses a virtual channel (VC) as a primary flow-control mechanism however, the spatial division multiplexing (SDM) based mechanism performs better over input traffics over VC. This manuscript uses an asynchronous spatial division multiplexing (ASDM) based router for NoC architecture on a field-programmable gate array (FPGA) platform. The ASDM router is configurable to different bandwidths and VCs. The ASDM router mainly contains input-output (I/O) buffers, a switching allocator, and a crossbar unit. The 4-phase 1-of-4 dual-rail protocol is used to construct the I/O buffers. The performance of the ASDM router is analyzed in terms of lower urinary tract symptoms (LUTs) (chip area), delay, latency, and throughput parameters. The work is implemented using Verilog-HDL with Xilinx ISE 14.7 on artix-7 FPGA. The ASDM router achieves % chip area and obtains 0.8 ns of latency with a throughput of 800 Mfps. The proposed router is compared with existing asynchronous approaches with improved latency and throughput metrics

    Doctor of Philosophy

    Get PDF
    dissertationCommunication surpasses computation as the power and performance bottleneck in forthcoming exascale processors. Scaling has made transistors cheap, but on-chip wires have grown more expensive, both in terms of latency as well as energy. Therefore, the need for low energy, high performance interconnects is highly pronounced, especially for long distance communication. In this work, we examine two aspects of the global signaling problem. The first part of the thesis focuses on a high bandwidth asynchronous signaling protocol for long distance communication. Asynchrony among intellectual property (IP) cores on a chip has become necessary in a System on Chip (SoC) environment. Traditional asynchronous handshaking protocol suffers from loss of throughput due to the added latency of sending the acknowledge signal back to the sender. We demonstrate a method that supports end-to-end communication across links with arbitrarily large latency, without limiting the bandwidth, so long as line variation can be reliably controlled. We also evaluate the energy and latency improvements as a result of the design choices made available by this protocol. The use of transmission lines as a physical interconnect medium shows promise for deep submicron technologies. In our evaluations, we notice a lower energy footprint, as well as vastly reduced wire latency for transmission line interconnects. We approach this problem from two sides. Using field solvers, we investigate the physical design choices to determine the optimal way to implement these lines for a given back-end-of-line (BEOL) stack. We also approach the problem from a system designer's viewpoint, looking at ways to optimize the lines for different performance targets. This work analyzes the advantages and pitfalls of implementing asynchronous channel protocols for communication over long distances. Finally, the innovations resulting from this work are applied to a network-on-chip design example and the resulting power-performance benefits are reported

    Doctor of Philosophy

    Get PDF
    dissertationPortable electronic devices will be limited to available energy of existing battery chemistries for the foreseeable future. However, system-on-chips (SoCs) used in these devices are under a demand to offer more functionality and increased battery life. A difficult problem in SoC design is providing energy-efficient communication between its components while maintaining the required performance. This dissertation introduces a novel energy-efficient network-on-chip (NoC) communication architecture. A NoC is used within complex SoCs due it its superior performance, energy usage, modularity, and scalability over traditional bus and point-to-point methods of connecting SoC components. This is the first academic research that combines asynchronous NoC circuits, a focus on energy-efficient design, and a software framework to customize a NoC for a particular SoC. Its key contribution is demonstrating that a simple, asynchronous NoC concept is a good match for low-power devices, and is a fruitful area for additional investigation. The proposed NoC is energy-efficient in several ways: simple switch and arbitration logic, low port radix, latch-based router buffering, a topology with the minimum number of 3-port routers, and the asynchronous advantages of zero dynamic power consumption while idle and the lack of a clock tree. The tool framework developed for this work uses novel methods to optimize the topology and router oorplan based on simulated annealing and force-directed movement. It studies link pipelining techniques that yield improved throughput in an energy-efficient manner. A simulator is automatically generated for each customized NoC, and its traffic generators use a self-similar message distribution, as opposed to Poisson, to better match application behavior. Compared to a conventional synchronous NoC, this design is superior by achieving comparable message latency with half the energy

    Comparing energy and latency of asynchronous and synchronous NoCs for embedded SoCs

    Get PDF
    Journal ArticlePower consumption of on-chip interconnects is a primary concern for many embedded system-on-chip (SoC) applications. In this paper, we compare energy and performance characteristics of asynchronous (clockless) and synchronous network on-chip implementations, optimized for a number of SoC designs. We adapted the COSI-2.0 framework with ORION 2.0 router and wire models for synchronous network generation. Our own tool, ANetGen, specifies the asynchronous network by determining the topology with simulated-annealing and router locations with force-directed placement. It uses energy and delay models from our 65 nm bundled-data router design. SystemC simulations varied traffic burstiness using the self-similar b-model. Results show that the asynchronous network provided lower median and maximum message latency, especially under bursty traffic, and used far less router energy with a slight overhead for the interrouter wires

    Circuit design and analysis for on-FPGA communication systems

    No full text
    On-chip communication system has emerged as a prominently important subject in Very-Large- Scale-Integration (VLSI) design, as the trend of technology scaling favours logics more than interconnects. Interconnects often dictates the system performance, and, therefore, research for new methodologies and system architectures that deliver high-performance communication services across the chip is mandatory. The interconnect challenge is exacerbated in Field-Programmable Gate Array (FPGA), as a type of ASIC where the hardware can be programmed post-fabrication. Communication across an FPGA will be deteriorating as a result of interconnect scaling. The programmable fabrics, switches and the specific routing architecture also introduce additional latency and bandwidth degradation further hindering intra-chip communication performance. Past research efforts mainly focused on optimizing logic elements and functional units in FPGAs. Communication with programmable interconnect received little attention and is inadequately understood. This thesis is among the first to research on-chip communication systems that are built on top of programmable fabrics and proposes methodologies to maximize the interconnect throughput performance. There are three major contributions in this thesis: (i) an analysis of on-chip interconnect fringing, which degrades the bandwidth of communication channels due to routing congestions in reconfigurable architectures; (ii) a new analogue wave signalling scheme that significantly improves the interconnect throughput by exploiting the fundamental electrical characteristics of the reconfigurable interconnect structures. This new scheme can potentially mitigate the interconnect scaling challenges. (iii) a novel Dynamic Programming (DP)-network to provide adaptive routing in network-on-chip (NoC) systems. The DP-network architecture performs runtime optimization for route planning and dynamic routing which, effectively utilizes the in-silicon bandwidth. This thesis explores a new horizon in reconfigurable system design, in which new methodologies and concepts are proposed to enhance the on-FPGA communication throughput performance that is of vital importance in new technology processes

    Comparison of multi-layer bus interconnection and a network on chip solution

    Get PDF
    Abstract. This thesis explains the basic subjects that are required to take in consideration when designing a network on chip solutions in the semiconductor world. For example, general topologies such as mesh, torus, octagon and fat tree are explained. In addition, discussion related to network interfaces, switches, arbitration, flow control, routing, error avoidance and error handling are provided. Furthermore, there is discussion related to design flow, a computer aided designing tools and a few comprehensive researches. However, several networks are designed for the minimum latency, although there are also versions which trade performance for decreased bus widths. These designed networks are compared with a corresponding multi-layer bus interconnection and both synthesis and register transfer level simulations are run. For example, results from throughput, latency, logic area and power consumptions are gathered and compared. It was discovered that overall throughput was well balanced with the network on chip solutions, although its maximum throughput was limited by protocol conversions. For example, the multi-layer bus interconnection was capable of providing a few times smaller latencies and higher throughputs when only a single interface was injected at the time. However, with parallel traffic and high-performance requirements a network on chip solution provided better results, even though the difference decreased when performance requirements were lower. Furthermore, it was discovered that the network on chip solutions required approximately 3–4 times higher total cell area than the multi-layer bus interconnection and that resources were mainly located at network interfaces and switches. In addition, power consumption was approximately 2–3 times higher and was mostly caused by dynamic consumption.Monitasoisen vĂ€ylĂ€arkkitehtuurin ja tietokoneverkkomaisen ratkaisun vertailua. TiivistelmĂ€. Tutkielmassa kĂ€sitellÀÀn tĂ€rkeimpiĂ€ aihealueita, jotka tulee huomioida suunniteltaessa tietokoneverkkomaisia vĂ€ylĂ€ratkaisuja puolijohdemaailmassa. Esimerkiksi yleiset rakenteet, kuten verkko-, torus-, kahdeksankulmio- ja puutopologiat kĂ€sitellÀÀn lyhyesti. LisĂ€ksi alustetaan verkon liitĂ€ntĂ€kohdat, kytkimet, vuorottelu, vuon hallinta, reititys, virheiden vĂ€lttely ja -kĂ€sittely. Lopuksi kerrotaan suunnitteluvuon oleellisimmat vĂ€livaiheet ja niihin soveltuvia kaupallisia työkaluja, sekĂ€ kĂ€sitellÀÀn lyhyesti muutaman aiemman julkaisun tuloksia. Tutkielmassa kĂ€ytetÀÀn suunnittelutyökalua muutaman tietokoneverkkomaisen ratkaisun toteutukseen ja tavoitteena on saavuttaa pienin mahdollinen latenssi. Toisaalta myös hieman suuremman latenssin versioita suunnitellaan, mutta pienemmillĂ€ vĂ€ylĂ€nleveyksillĂ€. LisĂ€ksi suunniteltuja tietokoneverkkomaisia ratkaisuja vertaillaan perinteisempÀÀn monitasoiseen vĂ€ylĂ€arkkitehtuuriin. Esimerkiksi synteesi- ja simulaatiotuloksia, kuten logiikan vaatimaa pinta-alaa, tehonkulutusta, latenssia ja suorituskykyĂ€, vertaillaan keskenÀÀn. Tutkielmassa selvisi, ettĂ€ suunnittelutyökalulla toteutetut tietokoneverkkomaiset ratkaisut mahdollistivat tasaisemman suorituskyvyn, joskin niiden suurin saavutettu suorituskyky ja pienin latenssi mÀÀrĂ€ytyivĂ€t protokollan kÀÀnnöksen aiheuttamasta viiveestĂ€. Tutkielmassa havaittiin, ettĂ€ perinteisemmillĂ€ menetelmillĂ€ saavutettiin noin kaksi kertaa suurempi suorituskyky ja pienempi latenssi, kun verkossa ei ollut muuta liikennettĂ€. Rinnakkaisen liikenteen lisÀÀntyessĂ€ tietokoneverkkomainen ratkaisu tarjosi keskimÀÀrin paremman suorituskyvyn, kun sille asetetut tehokkuusvaateet olivat suuret, mutta suorituskykyvaatimuksien laskiessa erot kapenivat. LisĂ€ksi huomattiin, ettĂ€ tietokoneverkkomaisten ratkaisujen kĂ€yttĂ€mĂ€ pinta-ala oli noin 3–4 kertaa suurempi kuin monitasoisella vĂ€ylĂ€arkkitehtuurilla ja ettĂ€ resurssit sijaitsivat enimmĂ€kseen verkon liittymĂ€kohdissa ja kytkimissĂ€. LisĂ€ksi tehonkulutuksen huomattiin olevan noin 2–3 kertaa suurempi, joskin sen havaittiin koostuvan pÀÀosin dynaamisesta kulutuksesta

    Exploration and Design of High Performance Variation Tolerant On-Chip Interconnects

    Get PDF
    Siirretty Doriast

    Planning assistance for the NASA 30/20 GHz program. Network control architecture study.

    Get PDF
    Network Control Architecture for a 30/20 GHz flight experiment system operating in the Time Division Multiple Access (TDMA) was studied. Architecture development, identification of processing functions, and performance requirements for the Master Control Station (MCS), diversity trunking stations, and Customer Premises Service (CPS) stations are covered. Preliminary hardware and software processing requirements as well as budgetary cost estimates for the network control system are given. For the trunking system control, areas covered include on board SS-TDMA switch organization, frame structure, acquisition and synchronization, channel assignment, fade detection and adaptive power control, on board oscillator control, and terrestrial network timing. For the CPS control, they include on board processing and adaptive forward error correction control
    • 

    corecore