173 research outputs found

    Throughput-driven floorplanning with wire pipelining

    Get PDF
    The size of future high-performance SoC is such that the time-of-flight of wires connecting distant pins in the layout can be much higher than the clock period. In order to keep the frequency as high as possible, the wires may be pipelined. However, the insertion of flip-flops may alter the throughput of the system due to the presence of loops in the logic netlist. In this paper, we address the problem of floorplanning a large design where long interconnects are pipelined by inserting the throughput in the cost function of a tool based on simulated annealing. The results obtained on a series of benchmarks are then validated using a simple router that breaks long interconnects by suitably placing flip-flops along the wires

    On-Chip Transparent Wire Pipelining (invited paper)

    Get PDF
    Wire pipelining has been proposed as a viable mean to break the discrepancy between decreasing gate delays and increasing wire delays in deep-submicron technologies. Far from being a straightforwardly applicable technique, this methodology requires a number of design modifications in order to insert it seamlessly in the current design flow. In this paper we briefly survey the methods presented by other researchers in the field and then we thoroughly analyze the solutions we recently proposed, ranging from system-level wire pipelining to physical design aspects

    Design and modelling of variability tolerant on-chip communication structures for future high performance system on chip designs

    Get PDF
    The incessant technology scaling has enabled the integration of functionally complex System-on-Chip (SoC) designs with a large number of heterogeneous systems on a single chip. The processing elements on these chips are integrated through on-chip communication structures which provide the infrastructure necessary for the exchange of data and control signals, while meeting the strenuous physical and design constraints. The use of vast amounts of on chip communications will be central to future designs where variability is an inherent characteristic. For this reason, in this thesis we investigate the performance and variability tolerance of typical on-chip communication structures. Understanding of the relationship between variability and communication is paramount for the designers; i.e. to devise new methods and techniques for designing performance and power efficient communication circuits in the forefront of challenges presented by deep sub-micron (DSM) technologies. The initial part of this work investigates the impact of device variability due to Random Dopant Fluctuations (RDF) on the timing characteristics of basic communication elements. The characterization data so obtained can be used to estimate the performance and failure probability of simple links through the methodology proposed in this work. For the Statistical Static Timing Analysis (SSTA) of larger circuits, a method for accurate estimation of the probability density functions of different circuit parameters is proposed. Moreover, its significance on pipelined circuits is highlighted. Power and area are one of the most important design metrics for any integrated circuit (IC) design. This thesis emphasises the consideration of communication reliability while optimizing for power and area. A methodology has been proposed for the simultaneous optimization of performance, area, power and delay variability for a repeater inserted interconnect. Similarly for multi-bit parallel links, bandwidth driven optimizations have also been performed. Power and area efficient semi-serial links, less vulnerable to delay variations than the corresponding fully parallel links are introduced. Furthermore, due to technology scaling, the coupling noise between the link lines has become an important issue. With ever decreasing supply voltages, and the corresponding reduction in noise margins, severe challenges are introduced for performing timing verification in the presence of variability. For this reason an accurate model for crosstalk noise in an interconnection as a function of time and skew is introduced in this work. This model can be used for the identification of skew condition that gives maximum delay noise, and also for efficient design verification

    Interconnect Challenges and Carbon Nanotube as Interconnect in Nano VLSI Circuits

    Get PDF
    This chapter discusses about the behavior of Carbon Nanotube (CNT) different structures which can be used as interconnect in Very Large Scale (VLSI) circuits in nanoscale regime. Also interconnect challenges in VLSI circuits which lead to use CNT as interconnect instead of Cu, is reviewed. CNTs are classified into three main types including Single-walled Carbon Nanotube (SWCNT), CNT Bundle, and Multi-walled Carbon Nanotube (MWCNT). Because of extremely high quantum resistance of a SWCNT which is about 6.45 k℩, rope or bundle of CNTs are used which consist of parallel CNTs in order to overcome the high delay time due to the high intrinsic (quantum) resistance. Also MWCNTs which consist of parallel shells, present much less delay time with respect to SWCNTs, for the application as interconnects. In this chapter, first a short discussion about interconnect challenges in VLSI circuits is presented. Then the repeater insertion technique for the delay reduction in the global interconnects will be studied. After that, the parameters and circuit model of a CNT will be discussed. Then a brief review about the different structures of CNT interconnects including CNT bundle and MWCNT will be presented. At the continuation, the time domain behavior of a CNT bundle interconnect in a driver-CNT bundle-load configuration will be discussed and analyzed. In this analysis, CNT bundle is modeled as a transmission line circuit model. At the end, a brief study of stability analysis in CNT interconnects will be presented

    Throughput-Centric Wave-Pipelined Interconnect Circuits for Gigascale Integration

    Get PDF
    The central thesis of this research is that VLSI interconnect design strategies should shift from using global wires that can support only a single binary transition during the latency of the line to global wires that can sustain multiple bits traveling simultaneously along the length of the line. It is shown in this thesis that such throughput-centric multibit transmission can be achieved by wave-pipelining the interconnects using repeaters. A holistic analysis of wave-pipelined interconnect circuits, along with the full-custom optimization of these circuits, is performed in this research. With the help of models and methodologies developed in this thesis, the design rules for repeater insertion are crafted to simultaneously optimize performance, power, and area of VLSI global interconnect networks through a simultaneous application of voltage scaling and wire sizing. A qualitative analysis of latency, throughput, signal integrity, power dissipation, and area is performed that compares the results of design optimizations in this work to those of conventional global interconnect circuits. The objective of this thesis is to study the circuit- and system-level opportunities of voltage scaling, wire sizing, and repeater insertion in wave-pipelined global interconnect networks that are implemented in deep submicron technologies.Ph.D.Committee Chair: Davis, Jeffrey; Committee Member: Kohl, Paul; Committee Member: Meindl, James; Committee Member: Swaminathan, Madhavan; Committee Member: Wills, D. Scot

    A novel low-swing voltage driver design and the analysis of its robustness to the effects of process variation and external disturbances

    Get PDF
    arket forces are continually demanding devices with increased functionality/unit area; these demands have been satisfied through aggressive technology scaling which, unfortunately, has impacted adversely on the global interconnect delay subsequently reducing system performance. Line drivers have been used to mitigate the problems with delay; however, these have a large power consumption. A solution to reducing the power dissipation of the drivers is to use lower supply voltages. However, by adopting a lower power supply voltage, the performance of the line drivers for global interconnects is impaired unless low-swing signalling techniques are implemented. Low-swing signalling techniques can provide high speed signalling with low power consumption and hence can be used to drive global on-chip interconnect. Most of the proposed low-swing signalling schemes are immune to noise as they have a good SNR. However, they tend to have a large penalty in area and complexity as they require additional circuitry such as voltage generators and low-Vth devices. Most of the schemes also incorporate multiple Vdd and reference voltages which increase the overall circuit complexity. A diode-connected driver circuit has the best attributes over other low-swing signalling techniques in terms of low power, low delay, good SNR and low area overhead. By incorporating a diode-connected configuration at the output, it can provide high speed signalling due to its high driving capability. However, this configuration also has its limitations as it has issues with its adaptability to process variations, as well as an issue with leakage currents. To address these limitations, two novel driver schemes have been designed, namely, nLVSD and mLVSD, which, additionally, have improvements in performance and power consumption. Comparisons between the proposed schemes with the existing diode-connected driver circuits (MJ and DDC) showed that the nLVSD and mLVSD drivers have approximately 46% and 50% less delay. The name MJ originates from the driver’s designer called Juan A. Montiel-Nelson, while DDC stands for dynamic diode-connected. In terms of power consumption, the nLVSD and mLVSD drivers also produce 43% and 7% improvement. Additionally, the mLVSD driver scheme is the most robust as its SNR is 14 to 44% higher compared to other diode-connected driver circuits. On the other hand, the nLVSD driver has 6% lower SNR compared to the MJ driver, even though it is 19% more robust than the DDC driver. However, since its SNR is still above 1, its improved performance and reduced power consumption, as well other advantages it has over other diode-connected driver circuits can compensate for this limitation. Regarding the robustness to external disturbances, the proposedmdriver circuits are more robust to crosstalk effects as the nLVSD and mLVSD drivers are approximately 35% and 7% more robust than other diode-connected drivers. Furthermore, the mLVSD driver is 5%, 33% and 47% more tolerant to SEUs compared to the nLVSD, MJ and DDC driver circuits respectively, whilst the MJ and DDC drivers are 26% and 40% less tolerant to SEUs iii compared to the nLVSD circuit. A comparison between the four schemes was also undertaken in the presence of ±3σ process and voltage (PV) variations. The analysis indicated that both proposed driver schemes are more robust than other diode-connected driver schemes, namely, the MJ and DDC driver circuits. The MJ driver scheme deviates approximately 18% and 35% more in delay and power consumption compared to the proposed schemes. The DDC driver has approximately 20% and 57% more variations in delay and power consumption in comparison to the proposed schemes. In order to further improve the robustness of the proposed driver circuits against process variation and environmental disturbances, they were further analysed to identify which process variables had the most impact on circuit delay and power consumption, as well as identifying several design techniques to mitigate problems with environmental disturbances. The most significant process parameters to have impact on circuit delay and power consumption were identified to be Vdd, tox, Vth, s, w and t. The impact of SEUs on the circuit can be reduced by increasing the bias currents whilst design methods such as increasing the interconnect spacing can help improve the circuit robustness against crosstalk. Overall it is considered that the proposed nLVSD and mLVSD circuits advance the state of the art in driver design for on-chip signalling applications.EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    Dynamically tunable memory hierarchy

    Get PDF
    Journal ArticleThe widespread use of repeaters in long wires creates the possibility of dynamically sizing regular on-chip structures. We present a tunable cache and translation lookaside buffer (TLB) hierarchy that leverages repeater insertion to dynamically trade off size for speed and power consumption on a per-application phase basis using a novel configuration management algorithm. In comparison to a conventional design that is fixed at a single design point targeted to the average application, the dynamically tunable cache and TLB hierarchy can be tailored to the needs of each application phase. The configuration algorithm dynamically detects phase changes and selects a configuration based on the application's ability to tolerate different hit and miss latencies in order to improve the memory energy-delay product. We evaluate the performance and energy consumption of our approach and project the effects of technology scaling trends on our design

    High-Speed and Low-Energy On-Chip Communication Circuits.

    Full text link
    Continuous technology scaling sharply reduces transistor delays, while fixed-length global wire delays have increased due to less wiring pitch with higher resistance and coupling capacitance. Due to this ever growing gap, long on-chip interconnects pose well-known latency, bandwidth, and energy challenges to high-performance VLSI systems. Repeaters effectively mitigate wire RC effects but do little to improve their energy costs. Moreover, the increased complexity and high level of integration requires higher wire densities, worsening crosstalk noise and power consumption of conventionally repeated interconnects. Such increasing concerns in global on-chip wires motivate circuits to improve wire performance and energy while reducing the number of repeaters. This work presents circuit techniques and investigation for high-performance and energy-efficient on-chip communication in the aspects of encoding, data compression, self-timed current injection, signal pre-emphasis, low-swing signaling, and technology mapping. The improved bus designs also consider the constraints of robust operation and performance/energy gains across process corners and design space. Measurement results from 5mm links on 65nm and 90nm prototype chips validate 2.5-3X improvement in energy-delay product.Ph.D.Electrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/75800/1/jseo_1.pd
    • 

    corecore