881 research outputs found

    High-Speed and Low-Energy On-Chip Communication Circuits.

    Full text link
    Continuous technology scaling sharply reduces transistor delays, while fixed-length global wire delays have increased due to less wiring pitch with higher resistance and coupling capacitance. Due to this ever growing gap, long on-chip interconnects pose well-known latency, bandwidth, and energy challenges to high-performance VLSI systems. Repeaters effectively mitigate wire RC effects but do little to improve their energy costs. Moreover, the increased complexity and high level of integration requires higher wire densities, worsening crosstalk noise and power consumption of conventionally repeated interconnects. Such increasing concerns in global on-chip wires motivate circuits to improve wire performance and energy while reducing the number of repeaters. This work presents circuit techniques and investigation for high-performance and energy-efficient on-chip communication in the aspects of encoding, data compression, self-timed current injection, signal pre-emphasis, low-swing signaling, and technology mapping. The improved bus designs also consider the constraints of robust operation and performance/energy gains across process corners and design space. Measurement results from 5mm links on 65nm and 90nm prototype chips validate 2.5-3X improvement in energy-delay product.Ph.D.Electrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/75800/1/jseo_1.pd

    Design of variation-tolerant synchronizers for multiple clock and voltage domains

    Get PDF
    PhD ThesisParametric variability increasingly affects the performance of electronic circuits as the fabrication technology has reached the level of 32nm and beyond. These parameters may include transistor Process parameters (such as threshold voltage), supply Voltage and Temperature (PVT), all of which could have a significant impact on the speed and power consumption of the circuit, particularly if the variations exceed the design margins. As systems are designed with more asynchronous protocols, there is a need for highly robust synchronizers and arbiters. These components are often used as interfaces between communication links of different timing domains as well as sampling devices for asynchronous inputs coming from external components. These applications have created a need for new robust designs of synchronizers and arbiters that can tolerate process, voltage and temperature variations. The aim of this study was to investigate how synchronizers and arbiters should be designed to tolerate parametric variations. All investigations focused mainly on circuit-level and transistor level designs and were modeled and simulated in the UMC90nm CMOS technology process. Analog simulations were used to measure timing parameters and power consumption along with a “Monte Carlo” statistical analysis to account for process variations. Two main components of synchronizers and arbiters were primarily investigated: flip-flop and mutual-exclusion element (MUTEX). Both components can violate the input timing conditions, setup and hold window times, which could cause metastability inside their bistable elements and possibly end in failures. The mean-time between failures is an important reliability feature of any synchronizer delay through the synchronizer. The MUTEX study focused on the classical circuit, in addition to a number of tolerance, based on increasing internal gain by adding current sources, reducing the capacitive loading, boosting the transconductance of the latch, compensating the existing Miller capacitance, and adding asymmetry to maneuver the metastable point. The results showed that some circuits had little or almost no improvements, while five techniques showed significant improvements by reducing τ and maintaining high tolerance. Three design approaches are proposed to provide variation-tolerant synchronizers. wagging synchronizer proposed to First, the is significantly increase reliability over that of the conventional two flip-flop synchronizer. The robustness of the wagging technique can be enhanced by using robust τ latches or adding one more cycle of synchronization. The second approach is the Metastability Auto-Detection and Correction (MADAC) latch which relies on swiftly detecting a metastable event and correcting it by enforcing the previously stored logic value. This technique significantly reduces the resolution time down from uncertain synchronization technique is proposed to transfer signals between Multiple- Voltage Multiple-Clock Domains (MVD/MCD) that do not require conventional level-shifters between the domains or multiple power supplies within each domain. This interface circuit uses a synchronous set and feedback reset protocol which provides level-shifting and synchronization of all signals between the domains, from a wide range of voltage-supplies and clock frequencies. Overall, synchronizer circuits can tolerate variations to a greater extent by employing the wagging technique or using a MADAC latch, while MUTEX tolerance can suffice with small circuit modifications. Communication between MVD/MCD can be achieved by an asynchronous handshake without a need for adding level-shifters.The Saudi Arabian Embassy in London, Umm Al-Qura University, Saudi Arabi

    Low-Power and Low-Noise Clock Generator for High-Speed ADCs

    Get PDF
    The rapid development of high-performance communication technologies reflects a clear trend in demanding requirements imposed on analog-to-digital converters (ADCs). Thus, it appears that these requirements imply higher frequencies not only for the input signal but also higher sampling frequencies, which translates into a higher sensitivity of the circuit to thermal noise and consequent increase in phase-noise. This arises as to the main purpose of this document, which will seek, as its main objective, the development of an architecture that allows the generation of multiple clock signals at high input frequencies with low jitter and low power dissipation to make ADCs more efficient and faster. This dissertation proposes an architecture implemented by a Clock Buffer that converts a differential input signal into a single-ended output signal, a Digital Buffer that transforms a sine wave into a square wave, and finally a Multi Clock Phase Generator (MPCG), consisting of Shift Registers. Both architectures are implemented in 130 nm CMOS technology. The architecture is powered by a LVDS signal with an amplitude of 200 mV and a frequency of 1 GHz, in order to output 8 square wave clock signals with an amplitude of 1.2 V and with a frequency of 125 MHz. The signals obtained at the output later will feed an architecture of 8 Time-Interleaved ADCs. The total area of the implemented circuit is about 8054.3 μm2, for a dissipated power of 5.3 mW and a jitter value of 1.13 ps. This new architecture will be aimed at all types of entities that work with devices that are made up of high-speed performance ADCs, to improve the operation of these same devices, making the processing from a continuous signal to a discrete signal as efficiently as possible.O rápido desenvolvimento das tecnologias de comunicação de alto desempenho, reflete uma tendência clara na exigência dos requisitos impostos aos conversores analógico-digital (ADCs). Deste modo, verifica-se que estes requisitos implicam elevadas frequências não só sinal de entrada, como também frequências elevadas de amostragem o que se traduz numa maior sensibilidade do circuito ao ruído térmico e consequente aumento ruído de fase. Esta problemática, surge como propósito principal deste documento, no qual se procurará, como objetivo principal, o desenvolvimento de uma arquitetura que permita gerar múltiplos sinais de relógio a altas frequências de entrada e períodos de amostragem, com um baixo jitter e baixa energia consumida de forma a tornar mais eficiente e rápido o funcionamento de ADCs. Ruido térmico. Esta dissertação propõe uma arquitetura composta por um amplificador de sinal de relógio que converte o duplo sinal de entrada num único sinal de saída, um amplificador digital que transforma uma onda sinusoidal numa onda quadrada e por fim um gerador de fase múltipla de sinais de relógio (MPCG), constituído por registos de deslocamento. Ambas as arquiteturas são implementadas em tecnologia CMOS de 130 nm. A arquitetura é alimentada com um sinal LVDS de 200 mV de amplitude e com uma frequência de 1 GHz, de forma a obter à saída 8 sinais de relógio de onda quadrada com uma amplitude de 1,2 V e com 125 MHz de frequência. Os sinais obtidos à saída posteriormente alimentarão uma arquitetura de 8 canais com multiplexagem temporal. A área total do circuito implementado é cerca de 8054,3 μm2, para uma potência dissipada de 5,3 mW e para um valor de jitter de 1,13 ps. Esta nova arquitetura será direcionada para todo o tipo de entidades que trabalham com dispositivos que são constituídos por ADCs de alta velocidade de desempenho, de forma a poder melhorar o funcionamento desses mesmos dispositivos, tornando o processamento de sinal continuo para sinal discreto o mais eficiente possível

    Information Leakage of Flip-Flops in DPA-Resistant Logic Styles

    Get PDF
    This contribution discusses the information leakage of flip-flops for different DPA-resistant logic styles. We show that many of the proposed side-channel resistant logic styles still employ flip-flops that leak data-dependent information. Furthermore, we apply simple models for the leakage of masked flip-flops to design a new attack on circuits implemented using masked logic styles. Contrary to previous attacks on masked logic styles, our attack does not predict the mask bit and does not need detailed knowledge about the attacked device, e.g., the circuit layout. Moreover, our attack works even if all the load capacitances of the complementary logic signals are perfectly balanced and even if the PRNG is ideally unbiased. Finally, after performing the attack on DRSL, MDPL, and iMDPL circuits we show that single-bit masks do not influence the exploitability of the revealed leakage of the masked flip-flops

    Power Reduction Techniques in Clock Distribution Networks with Emphasis on LC Resonant Clocking

    Get PDF
    In this thesis we propose a set of independent techniques in the overall concept of LC resonant clocking where each technique reduces power consumption and improve system performance. Low-power design is becoming a crucial design objective due to the growing demand on portable applications and the increasing difficulties in cooling and heat removal. The clock distribution network delivers the clock signal which acts as a reference to all sequential elements in the synchronous system. The clock distribution network consumes a considerable amount of power in synchronous digital systems. Resonant clocking is an emerging promising technique to reduce the power of the clock network. The inductor used in resonant clocking enables the conversion of the electric energy stored on the clock capacitance to magnetic energy in the inductor and vice versa. In this thesis, the concept of the slack in the clock skew has been extended for an LC fully-resonant clock distribution network. This extra slack in comparison to standard clock distribution networks can be used to reduce routing complexity, achieve reduction in wire elongation, total wire length, and power consumption. Simulation results illustrate that by utilizing the proposed approach, an average reduction of 53% in the number of wire elongations and 11% reduction in total wire length can be achieved. A dual-edge clocking scheme introduced in the literature to enable the operation of the flip-flop at the rising- and falling edges of the clock has been modified. The interval by which the charging elements in the flip-flop are being switched-on was reduced causing a reduction in power consumption. Simulating the flip-flop in STMicroelectronics 90-nm technology shows correct functionality of the Sense Amplifier flip-flop with a resonant clock signal of 500 MHz and a throughput of 1 GHz under process, voltage, and temperature (PVT) variations. Modeling the resonant system with the proposed flip-flop illustrates that dual-edge compared to single-edge triggering can achieve up to 58% reduction in power consumption when the clock capacitance is the dominating factor. The application of low-swing clocking to LC resonant clock distribution network has been investigated on-chip. The proposed low-swing resonant clocking scheme operates with one voltage supply and does not require an additional supply voltage. The Differential Conditional Capturing flip-flop introduced in the literature was modified to operate with a low-swing sinusoidal clock. Low-swing resonant clocking achieved around 5.8% reduction in total power with 5.7% area overhead. Modeling the clock network with the proposed flip-flop illustrates that low-swing clocking can achieve up to 58% reduction in the power consumption of the resonant clock. An analytical approach was introduced to estimate the required driver strength in the clock generator. Using the proposed approach early in the design stage reduces area and power overhead by eliminating the need for programmable switches in the driving circuit

    Low power VLSI design of a fir filter using dual edge triggered clocking strategy

    Get PDF
    Digital signal processing is an area of science and engineering that has developed rapidly over the past 30 years. This rapid development is a result of the significant advances in digital computer technology and integrated–circuit fabrication. DSP processors are a diverse group, most share some common features designed to support fast execution of the repetitive, numerically intensive computations characteristic of digital signal processing algorithms. The most often cited of these features is the ability to perform a multiply-accumulate operation (often called a "MAC") in a single instruction cycle. Hence in this project a DSP Processor is designed which can perform the basic DSP Operations like convolution, fourier transform and filtering. The processor designed is a simple 4-bit processor which has single data line of 8-bits and a single address bus of 16-bits. With a set of branch instructions the project DSP will operate as a CISC processor with strong math capabilities and can perform the above mentioned DSP operations. The application I have taken is the low power FIR filter using dual edge clocking strategy. It combines two novel techniques for the power reduction which is : multi stage clock gating and a symmetric two-phase level-sensitive clocking with glitch aware re-distribution of data-path registers. Simulation results confirm a 42% reduction in power over single edge triggered clocking with clock gating.Also to further reduce the power consumption the a low power latch circuit is used. Thanks to a partial pass-transistor logic, it trades time for energy, being particularly suitable for low power low-frequency applications. Simulation results confirm the power reduction. This technique discussed can be implemented to portable devices which needs longer battery life and to ASIC’

    A Comparative Analysis for Low-voltage, Low-power, and Low-energy Flip-flops

    Get PDF
    Recently, several flip-flops have been proposed to increase their speed while reducing their power and energy consumption. Flip-flop power is dependent on data activity and in many applications data activity is between 5-15%. In such cases, significantly large clock power and energy is wasted. This thesis explores performance of seven advanced D flip-flops in terms of power, delay, and energy using 65nm CMOS process technology. The main objective of this research is to compare and contrast recent flip-flops under different voltage and data activity conditions, and draw conclusions. Transmission gate flip-flop (TGFF) is used as a reference flip-flop, and based on comparison result TGFF has shown power-performance trade-offs. 18-T single-phase clocked static flip-flops (18TSPC, TSPC18), and Low-power at low data activity flip-flop (LLFF) are the fastest alternatives suitable for higher performance. However, LLFF consumes more power on higher data activities, where as TSPC18 and 18TSPC consume more power on lower data activities. For lower data activities Topologically compressed flip-flop (TCFF) is power efficient amongst all, but poor in performance. Furthermore, post-layout simulation result illustrates that LLFF is the most energy efficient amongst all up to 20% of data activities, and 18TSPC is energy efficient for higher data activities

    Design methodologies for variation-aware integrated circuits

    Get PDF
    The scaling of VLSI technology has spurred a rapid growth in the semiconductor industry. With the CMOS device dimension scaling to and beyond 90nm technology, it is possible to achieve higher performance and to pack more complex functionalities on a single chip. However, the scaling trend has introduced drastic variation of process and design parameters, leading to severe variability of chip performance in nanometer regime. Also, the manufacturing community projects CMOS will scale for three to four more generations. Since the uncertainties due to variations are expected to increase in each generation, it will significantly impact the performance of design and consequently the yield. Another challenging issue in the nanometer IC design is the high power consumption due to the greater packing density, higher frequency of operation and excessive leakage power. Moreover, the circuits are usually over-designed to compensate for uncertainties due to variations. The over-designed circuits not only make timing closure difficult but also cause excessive power consumption. For portable electronics, excessive power consumption may reduce battery life; for non-portable systems it may impose great difficulties in cooling and packaging. The objective of my research has been to develop design methodologies to address variations and power dissipation for reliable circuit operation. The proposed work has been divided into three parts: the first part addresses the issues related with power/ground noise induced by clock distribution network and proposes techniques to reduce power/ground noise considering the effects of process variations. The second part proposes an elastic pipeline scheme for random circuits with feedback loops. The proposed scheme provides a low-power solution that has the same variation tolerance as the conventional approaches. The third section deals with discrete buffer and wire sizing for link-based non-tree clock network, which is an energy efficient structure for skew tolerance to variations. For the power/ground noise problem, our approach could reduce the peak current and the delay variations by 50% and 51% respectively. Compared to conventional approach, the elastic timing scheme reduces power dissipation by 20% − 27%. The sizing method achieves clock skew reduction of 45% with a small increase in power dissipation
    corecore