145 research outputs found

    Repeater insertion to minimise delay in coupled interconnects.

    Signalling over long interconnect is a dominant issue in electronic chip design in current technologies, with device sizes getting smaller and circuits becoming ever larger. Repeater insertion is a well-established technique to minimise the propagation delay over long resistive interconnect. In deep sub-micron technologies, as the wires are spaced closer together and signal rise and fall times go into the sub-nanosecond region, the coupling between interconnects assumes great significance. The resulting crosstalk has implications for data throughput and signal integrity. Depending on the data correlation on the coupled lines, the delay can either decrease or increase. In this paper we attempt to quantify the effect of worst-case capacitive crosstalk in parallel buses and look at how it affects repeater insertion in particular. We develop analytic expressions for the delay, buffer size and buffer number that are suitable for a priori timing analyses and signal integrity estimations. All equations are checked against a dynamic circuit simulator (SPECTRE

    Interconnect Challenges and Carbon Nanotube as Interconnect in Nano VLSI Circuits

    This chapter discusses the behavior of the different Carbon Nanotube (CNT) structures that can be used as interconnect in Very Large Scale Integration (VLSI) circuits in the nanoscale regime. It also reviews the interconnect challenges in VLSI circuits that motivate using CNTs as interconnect instead of Cu. CNTs are classified into three main types: Single-walled Carbon Nanotube (SWCNT), CNT Bundle, and Multi-walled Carbon Nanotube (MWCNT). Because of the extremely high quantum resistance of a SWCNT, which is about 6.45 kΩ, ropes or bundles of parallel CNTs are used to overcome the high delay time caused by the high intrinsic (quantum) resistance. MWCNTs, which consist of parallel shells, also present much less delay time than SWCNTs when used as interconnects. In this chapter, first a short discussion of interconnect challenges in VLSI circuits is presented. Then the repeater insertion technique for delay reduction in global interconnects is studied. After that, the parameters and circuit model of a CNT are discussed. Then a brief review of the different structures of CNT interconnects, including CNT bundle and MWCNT, is presented. Next, the time-domain behavior of a CNT bundle interconnect in a driver-CNT bundle-load configuration is discussed and analyzed. In this analysis, the CNT bundle is modeled as a transmission line circuit. Finally, a brief study of stability analysis in CNT interconnects is presented

    A Survey Addressing on High Performance On-Chip VLSI Interconnect

    With the rapid increase in transmission speeds of communication systems, the demand for very high-speed, low-power VLSI circuits is on the rise. Although the performance of CMOS technologies improves notably with scaling, conventional CMOS circuits cannot simultaneously satisfy the speed and power requirements of these applications. In this paper we survey the state of the art of on-chip interconnect techniques for improving performance and optimizing power and delay, and present a comparative analysis of various techniques for high-speed design

    Optimising bandwidth over deep sub-micron interconnect.

    In deep sub-micron (DSM) circuits, proper analysis of interconnect delay is very important. When relatively long wires are placed in parallel, it is essential to include the effects of crosstalk on delay. In a parallel wire structure, the exact spacing and size of the wires determine both the resistance and the distribution of the capacitance between the ground plane and the adjacent signal-carrying conductors, and have a direct effect on the delay. Repeater insertion, depending on whether it is optimal or constrained, affects the delay in different ways. Considering all these effects, we show that there is a clear optimum configuration for the wires which maximises the total bandwidth. Our analysis is valid for lossy interconnects, as are typical of wires in DSM technologies

    On dynamic delay and repeater insertion.

    In deep sub-micron technologies, as the wires are placed ever closer and signal rise and fall times go into the sub-nanosecond region, increased crosstalk has implications for data throughput and signal integrity. Depending on the data correlation on the coupled lines, the delay can either decrease or increase. Here we show that in uniform coupled lines, the response for several important switching configurations has a dominant pole characteristic. This allows easy prediction of the average, worst-case and best-case delay of buffered lines. We show that the repeater numbering and sizing can be optimised to deal with crosstalk under different constraints to best match the application. Area and power issues are considered and all equations are checked against a dynamic circuit simulator (SPECTRE)

    High-performance long NoC link using delay-insensitive current-mode signaling

    A high-performance long-range NoC link enables efficient implementation of network-on-chip topologies which inherently require high-performance long-distance point-to-point communication, such as torus and fat-tree structures. In addition, the performance of other topologies, such as mesh, can be improved by using a high-performance link between a few selected remote nodes. We present a novel implementation of a high-performance long-range NoC link based on multilevel current-mode signaling and delay-insensitive two-phase 1-of-4 encoding. Current-mode signaling reduces the communication latency of long wires significantly compared to voltage-mode signaling, making it possible to achieve high throughput without pipelining and/or using repeaters. The performance of the proposed multilevel current-mode interconnect is analyzed and compared with two reference voltage-mode interconnects. These two reference interconnects are designed using two-phase 1-of-4 encoded voltage-mode signaling, one with pipeline stages and the other using optimal repeater insertion. The proposed multilevel current-mode interconnect achieves higher throughput and lower latency than the two reference interconnects. Its throughput at 8 mm wire length is 1.222 GWord/s, which is 1.58 and 1.89 times higher than the pipelined and optimal repeater insertion interconnects, respectively. Furthermore, its power consumption is less than that of the optimal repeater insertion voltage-mode interconnect: at 10 mm wire length its power consumption is 0.75 mW, while that of the reference repeater insertion interconnect is 1.066 mW. The effect of crosstalk is analyzed using four-bit parallel data transfer with the best-case and worst-case switching patterns and a transmission line model which has both capacitive coupling and inductive coupling.

    Design and modelling of variability tolerant on-chip communication structures for future high performance system on chip designs

    Incessant technology scaling has enabled the integration of functionally complex System-on-Chip (SoC) designs with a large number of heterogeneous systems on a single chip. The processing elements on these chips are integrated through on-chip communication structures which provide the infrastructure necessary for the exchange of data and control signals, while meeting the strenuous physical and design constraints. The use of vast amounts of on-chip communication will be central to future designs where variability is an inherent characteristic. For this reason, in this thesis we investigate the performance and variability tolerance of typical on-chip communication structures. Understanding the relationship between variability and communication is paramount for designers, i.e. to devise new methods and techniques for designing performance- and power-efficient communication circuits in the face of the challenges presented by deep sub-micron (DSM) technologies. The initial part of this work investigates the impact of device variability due to Random Dopant Fluctuations (RDF) on the timing characteristics of basic communication elements. The characterization data so obtained can be used to estimate the performance and failure probability of simple links through the methodology proposed in this work. For the Statistical Static Timing Analysis (SSTA) of larger circuits, a method for accurate estimation of the probability density functions of different circuit parameters is proposed. Moreover, its significance for pipelined circuits is highlighted. Power and area are among the most important design metrics for any integrated circuit (IC) design. This thesis emphasises the consideration of communication reliability while optimizing for power and area. A methodology is proposed for the simultaneous optimization of performance, area, power and delay variability for a repeater-inserted interconnect. Similarly, for multi-bit parallel links, bandwidth-driven optimizations have also been performed. Power- and area-efficient semi-serial links, less vulnerable to delay variations than the corresponding fully parallel links, are introduced. Furthermore, due to technology scaling, the coupling noise between the link lines has become an important issue. With ever-decreasing supply voltages, and the corresponding reduction in noise margins, severe challenges are introduced for performing timing verification in the presence of variability. For this reason, an accurate model for crosstalk noise in an interconnection as a function of time and skew is introduced in this work. This model can be used for the identification of the skew condition that gives maximum delay noise, and also for efficient design verification

    Performance and power optimization in VLSI physical design

    As VLSI technology enters the nanoscale regime, a great deal of effort has been made to reduce interconnect delay. Among the available approaches, buffer insertion stands out as an effective technique for timing optimization. A dramatic rise in on-chip buffer density has been witnessed. For example, in two recent IBM ASIC designs, 25% of gates are buffers. In this thesis, three buffer insertion algorithms are presented for performance and power optimization. The second chapter focuses on improving circuit performance under inductance effects. The new algorithm works under the dynamic programming framework and runs in provably linear time for multiple buffer types due to two novel techniques: restrictive cost bucketing and efficient delay update. The experimental results demonstrate that our linear-time algorithm consistently outperforms all known RLC buffering algorithms in terms of both solution quality and runtime. That is, the new algorithm uses fewer buffers, runs in shorter time, and the buffered tree has better timing. The third chapter presents a method to guarantee high-fidelity signal transmission in a global bus. It proposes a new redundant via insertion technique to reduce via variation and signal distortion in a twisted differential line. In addition, a new buffer insertion technique is proposed to synchronize the transmitted signals, thus further improving the effectiveness of the twisted differential line. Experimental results demonstrate that a 6 GHz signal can be transmitted with high fidelity using the new approaches. In contrast, only a 100 MHz signal can be reliably transmitted using a single-ended bus with power/ground shielding. Compared to the conventional twisted differential line structure, our new techniques can reduce the magnitude of noise by 45%, as witnessed in our simulation. The fourth chapter proposes a buffer insertion and gate sizing algorithm for circuits with a million-plus gates. The algorithm takes a combinational circuit as input instead of individual nets and greatly reduces the buffer and gate cost of the entire circuit. The algorithm has two main features: 1) a circuit partition technique based on the criticality of the primary inputs, which provides scalability for the algorithm, and 2) a linear programming formulation of the non-linear delay versus cost tradeoff, which formulates simultaneous buffer insertion and gate sizing as a linear programming problem. Experimental results on ISCAS85 circuits show that even without the circuit partition technique, the new algorithm achieves a 17X speedup compared with the path-based algorithm. At the same time, the new algorithm saves 16.0% buffer cost, 4.9% gate cost and 5.8% total cost, and results in less circuit delay