459 research outputs found

    A global wire planning scheme for Network-on-Chip.

    Get PDF
    As technology scales down, the interconnect for on-chip global communication becomes the delay bottleneck. In order to provide well-controlled global wire delay and efficient global communication, a packet switched Network-on-Chip (NoC) architecture was proposed by different authors. In this paper, the NoC system parameters constrained by the interconnections are studied. Predictions on scaled system parameters such as clock frequency, resource size, global communication bandwidth and inter-resource delay are made for future technologies. Based on these parameters, a global wire planning scheme is proposed

    Energy and performance models for clocked and asynchronous communication

    Get PDF
    Journal ArticleParameterized first-order models for throughput, energy, and bandwidth are presented in this paper. Models are developed for many common pipeline methodologies, including clocked flopped, clocked time-borrowing latch protocols, asynchronous two-cycle, four-cycle, delay-insensitive, and source synchronous. The paper focuses on communication costs which have the potential to throttle design performance as scaling continues. The models can also be applied to logic. The equations share common parameters to allow apples-to-apples comparisons against different design targets and pipeline methodologies. By applying the parameters to various design targets, one can determine when unclocked communication is superior at the physical level to clocked communication in terms of energy for a given bandwidth. Comparisons between protocols at fixed targets also allow designers to understand tradeoffs between implementations that have a varying degree of timing assumptions and design requirements

    Limits on Fundamental Limits to Computation

    Full text link
    An indispensable part of our lives, computing has also become essential to industries and governments. Steady improvements in computer hardware have been supported by periodic doubling of transistor densities in integrated circuits over the last fifty years. Such Moore scaling now requires increasingly heroic efforts, stimulating research in alternative hardware and stirring controversy. To help evaluate emerging technologies and enrich our understanding of integrated-circuit scaling, we review fundamental limits to computation: in manufacturing, energy, physical space, design and verification effort, and algorithms. To outline what is achievable in principle and in practice, we recall how some limits were circumvented, compare loose and tight limits. We also point out that engineering difficulties encountered by emerging technologies may indicate yet-unknown limits.Comment: 15 pages, 4 figures, 1 tabl

    A Survey Addressing on High Performance On-Chip VLSI Interconnect

    Get PDF
    With the rapid increase in transmission speeds of communication systems, the demand for very high-speed lowpower VLSI circuits is on the rise. Although the performance of CMOS technologies improves notably with scaling, conventional CMOS circuits cannot simultaneously satisfy the speed and power requirements of these applications. In this paper we survey the state of the art of on-chip interconnect techniques for improving performance, power and delay optimization and also comparative analysis of various techniques for high speed design have been discussed

    Skybridge: 3-D Integrated Circuit Technology Alternative to CMOS

    Full text link
    Continuous scaling of CMOS has been the major catalyst in miniaturization of integrated circuits (ICs) and crucial for global socio-economic progress. However, scaling to sub-20nm technologies is proving to be challenging as MOSFETs are reaching their fundamental limits and interconnection bottleneck is dominating IC operational power and performance. Migrating to 3-D, as a way to advance scaling, has eluded us due to inherent customization and manufacturing requirements in CMOS that are incompatible with 3-D organization. Partial attempts with die-die and layer-layer stacking have their own limitations. We propose a 3-D IC fabric technology, Skybridge[TM], which offers paradigm shift in technology scaling as well as design. We co-architect Skybridge's core aspects, from device to circuit style, connectivity, thermal management, and manufacturing pathway in a 3-D fabric-centric manner, building on a uniform 3-D template. Our extensive bottom-up simulations, accounting for detailed material system structures, manufacturing process, device, and circuit parasitics, carried through for several designs including a designed microprocessor, reveal a 30-60x density, 3.5x performance per watt benefits, and 10X reduction in interconnect lengths vs. scaled 16-nm CMOS. Fabric-level heat extraction features are shown to successfully manage IC thermal profiles in 3-D. Skybridge can provide continuous scaling of integrated circuits beyond CMOS in the 21st century.Comment: 53 Page

    Constant-degree graph expansions that preserve the treewidth

    Full text link
    Many hard algorithmic problems dealing with graphs, circuits, formulas and constraints admit polynomial-time upper bounds if the underlying graph has small treewidth. The same problems often encourage reducing the maximal degree of vertices to simplify theoretical arguments or address practical concerns. Such degree reduction can be performed through a sequence of splittings of vertices, resulting in an _expansion_ of the original graph. We observe that the treewidth of a graph may increase dramatically if the splittings are not performed carefully. In this context we address the following natural question: is it possible to reduce the maximum degree to a constant without substantially increasing the treewidth? Our work answers the above question affirmatively. We prove that any simple undirected graph G=(V, E) admits an expansion G'=(V', E') with the maximum degree <= 3 and treewidth(G') <= treewidth(G)+1. Furthermore, such an expansion will have no more than 2|E|+|V| vertices and 3|E| edges; it can be computed efficiently from a tree-decomposition of G. We also construct a family of examples for which the increase by 1 in treewidth cannot be avoided.Comment: 12 pages, 6 figures, the main result used by quant-ph/051107

    Low-swing signaling for energy efficient on-chip networks

    Get PDF
    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011.Cataloged from PDF version of thesis.Includes bibliographical references (p. 65-69).On-chip networks have emerged as a scalable and high-bandwidth communication fabric in many-core processor chips. However, the energy consumption of these networks is becoming comparable to that of computation cores, making further scaling of core counts difficult. This thesis makes several contributions to low-swing signaling circuit design for the energy efficient on-chip networks in two separate projects: on-chip networks optimized for one-to-many multicasts and broadcasts, and link designs that allow on-chip networks to approach an ideal interconnection fabric. A low-swing crossbar switch, which is based on tri-state Reduced-Swing Drivers (RSDs), is presented for the first project. Measurement results of its test chip fabricated in 45nm SOI CMOS show that the tri-state RSD-based crossbar enables 55% power savings as compared to an equivalent full-swing crossbar and link. Also, the measurement results show that the proposed crossbar allows the broadcast-optimized on-chip networks using a single pipeline stage for physical data transmission to operate at 21% higher data rate, when compared with the full-swing networks. For the second project, two clockless low-swing repeaters, a Self-Resetting Logic Repeater (SRLR) and a Voltage-Locked Repeater (VLR), have been proposed and analyzed in simulation only. They both require no reference clock, differential signaling, and bias current. Such digital-intensive properties enable them to approach energy and delay performance of a point-to-point interconnect of variable lengths. Simulated in 45nm SOI CMOS, the 10mm SRLR featured with high energy efficiency consumes 338fJ/b at 5.4Gb/s/ch while the 10mm VLR raises its data rate up to 16.OGb/s/ch with 427fJ/b.by Sunghyun Park.S.M

    On-Chip Transparent Wire Pipelining (invited paper)

    Get PDF
    Wire pipelining has been proposed as a viable mean to break the discrepancy between decreasing gate delays and increasing wire delays in deep-submicron technologies. Far from being a straightforwardly applicable technique, this methodology requires a number of design modifications in order to insert it seamlessly in the current design flow. In this paper we briefly survey the methods presented by other researchers in the field and then we thoroughly analyze the solutions we recently proposed, ranging from system-level wire pipelining to physical design aspects

    Throughput-driven floorplanning with wire pipelining

    Get PDF
    The size of future high-performance SoC is such that the time-of-flight of wires connecting distant pins in the layout can be much higher than the clock period. In order to keep the frequency as high as possible, the wires may be pipelined. However, the insertion of flip-flops may alter the throughput of the system due to the presence of loops in the logic netlist. In this paper, we address the problem of floorplanning a large design where long interconnects are pipelined by inserting the throughput in the cost function of a tool based on simulated annealing. The results obtained on a series of benchmarks are then validated using a simple router that breaks long interconnects by suitably placing flip-flops along the wires
    • 

    corecore