39 research outputs found

    An Application-Specific Design Methodology for On-chip Crossbar Generation

    Get PDF
    Designing a power-efficient interconnection architec- ture for MultiProcessor Systems-on-Chips (MPSoCs) satisfying the application performance constraints is a nontrivial task. In order to meet the tight time-to-market constraints and to effec- tively handle the design complexity, it is essential to provide a computer-aided design tool support for automating this task. In this paper, we address the issue of “application-specific design of optimal crossbar architecture” satisfying the performance re- quirements of the application and optimal binding of the cores onto the crossbar resources. We present a simulation-based design approach that is based on the analysis of the actual traffic trace of the application, considering local variations in traffic rates, temporal overlap among traffic streams, and criticality of traffic streams. Our approach is physical design aware, where the wiring complexity of the crossbar architecture is also considered during the design process. This leads to detecting timing violations on the wires early in the design cycle and to having accurate estimates of the power consumption on the wires. We apply our methodology onto several MPSoC designs, and the synthesized crossbar plat- forms are validated for performance by cycle-accurate SystemC simulation of the designs. The crossbar matrix power consumption values are based on the synthesis of the register transfer level models of the designs, obtained using industry standard tools. The experimental case studies show large reduction in communication architecture power consumption (45.3% on average) and total wirelength (38% on average) for the MPSoC designs when com- pared with traditional design approaches. The synthesized cross- bar designs also lead to large reduction in transaction latencies (up to 7×) when compared with the existing design approaches

    Driving the Network-on-Chip Revolution to Remove the Interconnect Bottleneck in Nanoscale Multi-Processor Systems-on-Chip

    Get PDF
    The sustained demand for faster, more powerful chips has been met by the availability of chip manufacturing processes allowing for the integration of increasing numbers of computation units onto a single die. The resulting outcome, especially in the embedded domain, has often been called SYSTEM-ON-CHIP (SoC) or MULTI-PROCESSOR SYSTEM-ON-CHIP (MP-SoC). MPSoC design brings to the foreground a large number of challenges, one of the most prominent of which is the design of the chip interconnection. With a number of on-chip blocks presently ranging in the tens, and quickly approaching the hundreds, the novel issue of how to best provide on-chip communication resources is clearly felt. NETWORKS-ON-CHIPS (NoCs) are the most comprehensive and scalable answer to this design concern. By bringing large-scale networking concepts to the on-chip domain, they guarantee a structured answer to present and future communication requirements. The point-to-point connection and packet switching paradigms they involve are also of great help in minimizing wiring overhead and physical routing issues. However, as with any technology of recent inception, NoC design is still an evolving discipline. Several main areas of interest require deep investigation for NoCs to become viable solutions: • The design of the NoC architecture needs to strike the best tradeoff among performance, features and the tight area and power constraints of the onchip domain. • Simulation and verification infrastructure must be put in place to explore, validate and optimize the NoC performance. • NoCs offer a huge design space, thanks to their extreme customizability in terms of topology and architectural parameters. Design tools are needed to prune this space and pick the best solutions. • Even more so given their global, distributed nature, it is essential to evaluate the physical implementation of NoCs to evaluate their suitability for next-generation designs and their area and power costs. This dissertation performs a design space exploration of network-on-chip architectures, in order to point-out the trade-offs associated with the design of each individual network building blocks and with the design of network topology overall. The design space exploration is preceded by a comparative analysis of state-of-the-art interconnect fabrics with themselves and with early networkon- chip prototypes. The ultimate objective is to point out the key advantages that NoC realizations provide with respect to state-of-the-art communication infrastructures and to point out the challenges that lie ahead in order to make this new interconnect technology come true. Among these latter, technologyrelated challenges are emerging that call for dedicated design techniques at all levels of the design hierarchy. In particular, leakage power dissipation, containment of process variations and of their effects. The achievement of the above objectives was enabled by means of a NoC simulation environment for cycleaccurate modelling and simulation and by means of a back-end facility for the study of NoC physical implementation effects. Overall, all the results provided by this work have been validated on actual silicon layout

    A Framework for Cosynthesis of Memory and Communication Architectures for MPSoC

    Full text link

    A Topology Design Customization Approach for STNoC

    Full text link
    To support high bandwidth SoCs, a communication design flow is necessary for the design space exploration respecting tight design requirements. In order to exploit the benefits introduced by the NoC approach for the on-chip communication, the paper presents a design flow for the core mapping and customization of the network topology applied to STNoC, the Network on-Chip developed by STMicroelectronics. Starting from ring topology, the proposed application-specific flow tries to find a set of customized topologies, optimized in terms of performance and area/energy overhead, by adding links. The generated STNoC custom topologies provide a reduced cost with respect to the spidergon topology

    Comparison of multi-layer bus interconnection and a network on chip solution

    Get PDF
    Abstract. This thesis explains the basic subjects that are required to take in consideration when designing a network on chip solutions in the semiconductor world. For example, general topologies such as mesh, torus, octagon and fat tree are explained. In addition, discussion related to network interfaces, switches, arbitration, flow control, routing, error avoidance and error handling are provided. Furthermore, there is discussion related to design flow, a computer aided designing tools and a few comprehensive researches. However, several networks are designed for the minimum latency, although there are also versions which trade performance for decreased bus widths. These designed networks are compared with a corresponding multi-layer bus interconnection and both synthesis and register transfer level simulations are run. For example, results from throughput, latency, logic area and power consumptions are gathered and compared. It was discovered that overall throughput was well balanced with the network on chip solutions, although its maximum throughput was limited by protocol conversions. For example, the multi-layer bus interconnection was capable of providing a few times smaller latencies and higher throughputs when only a single interface was injected at the time. However, with parallel traffic and high-performance requirements a network on chip solution provided better results, even though the difference decreased when performance requirements were lower. Furthermore, it was discovered that the network on chip solutions required approximately 3–4 times higher total cell area than the multi-layer bus interconnection and that resources were mainly located at network interfaces and switches. In addition, power consumption was approximately 2–3 times higher and was mostly caused by dynamic consumption.Monitasoisen väyläarkkitehtuurin ja tietokoneverkkomaisen ratkaisun vertailua. Tiivistelmä. Tutkielmassa käsitellään tärkeimpiä aihealueita, jotka tulee huomioida suunniteltaessa tietokoneverkkomaisia väyläratkaisuja puolijohdemaailmassa. Esimerkiksi yleiset rakenteet, kuten verkko-, torus-, kahdeksankulmio- ja puutopologiat käsitellään lyhyesti. Lisäksi alustetaan verkon liitäntäkohdat, kytkimet, vuorottelu, vuon hallinta, reititys, virheiden välttely ja -käsittely. Lopuksi kerrotaan suunnitteluvuon oleellisimmat välivaiheet ja niihin soveltuvia kaupallisia työkaluja, sekä käsitellään lyhyesti muutaman aiemman julkaisun tuloksia. Tutkielmassa käytetään suunnittelutyökalua muutaman tietokoneverkkomaisen ratkaisun toteutukseen ja tavoitteena on saavuttaa pienin mahdollinen latenssi. Toisaalta myös hieman suuremman latenssin versioita suunnitellaan, mutta pienemmillä väylänleveyksillä. Lisäksi suunniteltuja tietokoneverkkomaisia ratkaisuja vertaillaan perinteisempään monitasoiseen väyläarkkitehtuuriin. Esimerkiksi synteesi- ja simulaatiotuloksia, kuten logiikan vaatimaa pinta-alaa, tehonkulutusta, latenssia ja suorituskykyä, vertaillaan keskenään. Tutkielmassa selvisi, että suunnittelutyökalulla toteutetut tietokoneverkkomaiset ratkaisut mahdollistivat tasaisemman suorituskyvyn, joskin niiden suurin saavutettu suorituskyky ja pienin latenssi määräytyivät protokollan käännöksen aiheuttamasta viiveestä. Tutkielmassa havaittiin, että perinteisemmillä menetelmillä saavutettiin noin kaksi kertaa suurempi suorituskyky ja pienempi latenssi, kun verkossa ei ollut muuta liikennettä. Rinnakkaisen liikenteen lisääntyessä tietokoneverkkomainen ratkaisu tarjosi keskimäärin paremman suorituskyvyn, kun sille asetetut tehokkuusvaateet olivat suuret, mutta suorituskykyvaatimuksien laskiessa erot kapenivat. Lisäksi huomattiin, että tietokoneverkkomaisten ratkaisujen käyttämä pinta-ala oli noin 3–4 kertaa suurempi kuin monitasoisella väyläarkkitehtuurilla ja että resurssit sijaitsivat enimmäkseen verkon liittymäkohdissa ja kytkimissä. Lisäksi tehonkulutuksen huomattiin olevan noin 2–3 kertaa suurempi, joskin sen havaittiin koostuvan pääosin dynaamisesta kulutuksesta

    Energy Efficient Network Generation for Application Specific NoC

    Get PDF
    Networks-on-Chip is emerging as a communication platform for future complex SoC designs, composed of a large number of homogenous or heterogeneous processing resources. Most SoC platforms are customized to the domainspecific requirements of their applications, which communicate in a specific, mostly irregular way. The specific but often diverse communication requirements among cores of the SoC call for the design of application-specific network of SoC for improved performance in terms of communication energy, latency, and throughput. In this work, we propose a methodology for the design of customized irregular network architecture of SoC. The proposed method exploits priori knowledge of the application2019;s communication characteristic to generate an energy optimized network and corresponding routing tables

    Application-Specific Topology Design Customization for STNoC

    Full text link

    Interconnect design for the edge computing system-on-chip

    Get PDF
    Nowadays the majority of system-on-chips are designed by placing various IP blocks such as CPUs, memories and accelerators on the same chip. With the advantage of silicon manufacturing technologies, it has become possible to place hundreds of CPU cores and other design blocks on the same chip. A communication system that transfers data between chip components largely affects overall chip performance, computational speed and response time for external events. Firstly, this thesis studies the main on-chip interconnect design paradigms. According to the presented research, various architectures may be chosen for an interconnect design depending on the required complexity and number of subsystems. The shared and hybrid bus interconnects are one of the oldest means of on-chip communication. They are efficient for small systems with no more than ten IP blocks. The crossbars or bus matrix interconnects can help to build on-chip communication systems which can efficiently interconnect dozens of system-on-chip modules. The networks-on-chip can provide a communication solution for large scale chip designs with hundreds of IP blocks. The second part of this thesis focuses on the novel Ballast chip implementation and its interconnect design. The Ballast is a heterogeneous multiprocessor chip designed for edge computing and general-purpose computing applications. In this thesis Ballast interconnect was designed from scratch by using a cascaded crossbar approach by connecting three open-sourced AXI protocol bus matrices. The designed interconnect allows to efficiently connect 6 bus masters with 9 slaves and provides up to 9,6 GB/s bandwidth for the most productive CPU subsystem
    corecore