8 research outputs found

    Sincronização em sistemas integrados a alta velocidade (Synchronization in high-speed integrated systems)

    Doctorate in Electrical Engineering. Distributing the clock simultaneously everywhere (low skew) and periodically everywhere (low jitter) in high-performance Integrated Circuits (ICs) has become an increasingly difficult and time-consuming task due to technology scaling. As transistor dimensions shrink and more functionality is packed into an IC, clock precision is increasingly affected by Process, Voltage and Temperature (PVT) variations. This thesis addresses the problem of clock uncertainty in high-performance ICs in order to determine the limits of the synchronous design paradigm. In pursuit of this main goal, the thesis proposes four new uncertainty models with different underlying principles and scopes. The first model targets uncertainty in static CMOS inverters. Its main advantage is that it depends only on parameters that can easily be obtained, so it can provide information on upcoming constraints very early in the design stage. The second model addresses uncertainty in repeaters with RC interconnects, allowing the designer to optimise the repeater's size and spacing for a given uncertainty budget with low computational effort. The third model can be used to predict jitter accumulation in cascaded repeaters, such as clock trees or delay lines.
Because it takes into consideration correlations among variability sources, it can also be useful to promote floorplan-based power and clock distribution design in order to minimise jitter accumulation. A fourth model is proposed to analyse uncertainty in systems with multiple synchronous domains. It can be easily incorporated in an automatic tool to determine the best topology for a given application or to evaluate the system's tolerance to power-supply noise. Finally, using the proposed models, the thesis discusses clock precision trends. Results show that limits in clock precision are ultimately imposed by dynamic uncertainty, which is expected to continue increasing with technology scaling. The thesis therefore advocates the search for solutions at other abstraction levels, and not only at the physical level, that may increase system performance with a smaller impact on the assumptions behind the synchronous design paradigm.

    Design Techniques for Energy Efficient Multi-GB/S Serial I/O Transceivers

    Total I/O bandwidth demand is growing in high-performance systems due to the emergence of many-core microprocessors, and in mobile devices to support the next generation of multimedia features. High-speed serial I/O energy efficiency must improve in order to enable continued scaling of these parallel computing platforms in applications ranging from data centers to smart mobile devices. In the first work, a low-power forwarded-clock I/O transceiver architecture is presented that employs a high degree of output/input multiplexing, supply-voltage scaling with data rate, and low-voltage circuit techniques to enable low-power operation. The transmitter utilizes a 4:1 output-multiplexing voltage-mode driver along with 4-phase clocking that is efficiently generated from a passive poly-phase filter. The output driver voltage swing is accurately controlled from 100-200 mVppd using a low-voltage pseudo-differential regulator that employs a partial negative-resistance load for improved low-frequency gain. 1:8 input de-multiplexing is performed at the receiver equalizer output with 8 parallel input samplers clocked from an 8-phase injection-locked oscillator that provides more than 1 UI of de-skew range. Low-power high-speed serial I/O transmitters that include equalization to compensate for frequency-dependent channel loss are required to meet the aggressive link energy efficiency targets of future systems. The second work presents a low-power serial link transmitter design whose output stage combines a voltage-mode driver, which offers low static power dissipation, with current-mode equalization, which offers low complexity and low dynamic power dissipation. The use of current-mode equalization decouples the equalization settings from the termination impedance, allowing a significant reduction in pre-driver complexity relative to segmented voltage-mode drivers.
Proper transmitter series termination is set with an impedance control loop that adjusts the on-resistance of the output transistors in the voltage-mode portion of the driver. Further reductions in dynamic power dissipation are achieved by scaling the serializer and local clock distribution supply with data rate. Finally, a scalable quarter-rate transmitter is presented that employs an analog-controlled, impedance-modulated 2-tap voltage-mode equalizer and achieves fast power-state transitioning with a replica-biased regulator and ILO clock generation. Capacitively-driven 2 mm global clock distribution and automatic phase calibration allow for aggressive supply scaling.

    Precise Timing of Digital Signals: Circuits and Applications

    With the rapid advances in process technologies, the performance of state-of-the-art integrated circuits is improving steadily. The drive for higher performance is accompanied by increased emphasis on meeting timing constraints not only at the design phase but during device operation as well. Fortunately, technology advancements allow for even more precise control of the timing of digital signals, an advantage which can be used to provide solutions that address some of the emerging timing issues. In this thesis, circuit and architectural techniques for the precise timing of digital signals are explored. These techniques are demonstrated in applications addressing timing issues in modern digital systems. A methodology for slow-speed timing characterization of high-speed pipelined datapaths is proposed. The technique uses a clock-timing circuit to create shifted versions of a slow-speed clock. These clocks control the data flow in the pipeline in the test mode. Test results show that the design provides an average timing resolution of 52.9ps in 0.18μm CMOS technology. Results also demonstrate the ability of the technique to track the performance of high-speed pipelines at a reduced clock frequency and to test the clock-timing circuit itself. In order to achieve higher resolutions than that of an inverter/buffer stage, a differential (vernier) delay line is commonly used. To allow for the design of differential delay lines with programmable delays, a digitally-controlled delay element is proposed. The delay element is monotonic and achieves a highly linear transfer characteristic (digital code vs. delay). Using the proposed delay element, a sub-1ps resolution is demonstrated experimentally in 0.18μm CMOS. The proposed delay element with a fixed delay step of 2ps is used to design a high-precision all-digital phase aligner.
High-precision phase alignment has many applications in modern digital systems such as high-speed memory controllers, clock-deskew buffers, and delay- and phase-locked loops. The design is based on a differential delay line and a variation-tolerant phase detector using redundancy. Experimental results show that the phase aligner's range is from -264ps to +247ps, which corresponds to an average delay step of approximately 2.43ps. For various input phase difference values, test results show that the difference is reduced to less than 2ps at the output of the phase aligner. On-chip time measurement is another application that requires precise timing. It has applications in modern automatic test equipment and on-chip characterization of jitter and skew. In order to achieve a small conversion time, a flash time-to-digital converter is proposed. Mismatch between the various delay comparators limits the time measurement precision. This is demonstrated through an experiment in which a 6-bit, 2.5ps resolution flash time-to-digital converter provides an effective resolution of only 4 bits. The converter achieves a maximum conversion rate of 1.25GSa/s.
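To put the time-to-digital converter figures in perspective, the following minimal sketch works out what a 4-bit effective resolution would mean in picoseconds. It assumes that "effective resolution" is measured over the same full-scale range as the nominal 6-bit converter; the abstract does not state how the figure was derived, and the function and variable names are illustrative only.

```python
def tdc_effective_lsb_ps(nominal_bits, nominal_lsb_ps, effective_bits):
    """Return (full-scale range, effective LSB), both in picoseconds.

    Assumption (not from the abstract): the effective bits span the same
    full-scale range as the nominal converter.
    """
    full_scale_ps = (2 ** nominal_bits) * nominal_lsb_ps
    effective_lsb_ps = full_scale_ps / (2 ** effective_bits)
    return full_scale_ps, effective_lsb_ps


# Figures quoted in the abstract: 6-bit, 2.5 ps nominal, 4 bits effective.
full_scale, eff_lsb = tdc_effective_lsb_ps(6, 2.5, 4)
print(f"full-scale range: {full_scale} ps, effective LSB: {eff_lsb} ps")
```

Under that assumption, losing two bits to comparator mismatch widens the usable time step by a factor of four relative to the nominal 2.5 ps.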

    Characterization and mitigation of process variation in digital circuits and systems

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009, by Nigel Anthony Drego. Cataloged from PDF version of thesis. Includes bibliographical references (p. 155-166). Process variation threatens to negate a whole generation of scaling in advanced process technologies due to performance and power spreads of greater than 30-50%. Mitigating this impact requires a thorough understanding of the variation sources, magnitudes and spatial components at the device, circuit and architectural levels. This thesis explores the impacts of variation at each of these levels and evaluates techniques to alleviate them in the context of digital circuits and systems. At the device level, we propose isolation and measurement of variation in the intrinsic threshold voltage of a MOSFET using sub-threshold leakage currents. Analysis of the measured data, from a test-chip implemented on a 0.18 μm CMOS process, indicates that variation in MOSFET threshold voltage is a truly random process dependent only on device dimensions. Further decomposition of the observed variation reveals no systematic within-die variation components nor any spatial correlation. A second test-chip capable of characterizing spatial variation in digital circuits is developed and implemented in a 90nm triple-well CMOS process. Measured variation results show that the within-die component of variation is small at high voltages but is an increasing fraction of the total variation as power-supply voltage decreases. Once again, the data shows no evidence of within-die spatial correlation and only weak systematic components. Evaluation of adaptive body-biasing and voltage scaling as variation mitigation techniques proves voltage scaling is more effective in performance modification with reduced impact to idle power compared to body-biasing.
Finally, the addition of power-supply voltages in a massively parallel multicore processor is explored to reduce the energy required to cope with process variation. An analytic optimization framework is developed and analyzed; using a custom simulation methodology, total energy of a hypothetical 1K-core processor based on the RAW core is reduced by 6-16% with the addition of only a single voltage. Analysis of yield versus required energy demonstrates that a combination of disabling poor-performing cores and additional power-supply voltages results in an optimal trade-off between performance and energy.

    Circuit design for logic automata

    Thesis (S.M.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009, by Kailiang Chen. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (p. 143-148). The Logic Automata model is a universal distributed computing structure which pushes parallelism to the bit-level extreme. This new model differs drastically from conventional computer architectures in that it exposes, rather than hides, the physics underlying the computation by accommodating data processing and storage in a local and distributed manner. Based on Logic Automata, highly scalable computing structures for digital and analog processing have been developed, and they are verified at the transistor level in this thesis. The Asynchronous Logic Automata (ALA) model is derived by adding temporal locality, i.e., asynchrony in data exchanges, to the spatial locality of the Logic Automata model. As a demonstration of this incrementally extensible, clockless structure, we designed an ALA cell library in 90 nm CMOS technology and established a "pick-and-place" design flow for fast ALA circuit layout. The work flow gracefully aligns the description of computer programs and circuit realizations, providing a simpler and more scalable solution for Application Specific Integrated Circuit (ASIC) designs, which are currently limited by global constraints such as the clock and long interconnects. The potential of the ALA circuit design flow is tested with example applications for mathematical operations. The same Logic Automata model can also be augmented by relaxing the digital states into analog ones for interesting analog computations.
The Analog Logic Automata (AnLA) model merges the Analog Logic principle with the Logic Automata architecture, embedding efficient processing in a scalable construction. In order to study the unique properties of this mixed-signal computing structure, we designed and fabricated an AnLA test chip in AMI 0.5 μm CMOS technology. Chip tests of an AnLA Noise-Locked Loop (NLL) circuit, as well as application tests of AnLA image processing and Error-Correcting Code (ECC) decoding, show the large potential of the AnLA structure.

    메모리 인터페이스를 위한 20Gbps급 직렬화 송수신기 설계 (Design of a 20 Gbps-class serializing transceiver for memory interfaces)

    Thesis (Ph.D.), Seoul National University Graduate School, Department of Electrical and Computer Engineering, August 2013. 정덕균. Various types of serial links for current and future memory interfaces are presented in this thesis. First, a PHY design for commercial GDDR3 memory is proposed. The GDDR3 PHY consists of a read path, a write path, and a command path. The write path and command path calibrate skew using a VDL (variable delay line), while the read path calibrates skew using a DLL (delay-locked loop) and a VDL. There are four data channels and one command/address channel. Each data channel consists of one clock signal (DQS) and eight data signals (DQ). The data channels operate at 1.2Gbps (1.08Gbps~1.2Gbps), and the command/address channel operates at 600Mbps (540Mbps~600Mbps). In particular, this thesis concentrates on DLL design for high speed and for SSN (simultaneous switching noise). Second, a serial link design for silicon photonics is proposed. Silicon photonics is the strongest candidate for the next-generation memory interface. The design of a modulator driver for the modulator, and of a TIA (trans-impedance amplifier) and LA (limiting amplifier) for the photodiode, is discussed. The link operates above 12.5Gbps but consumes considerable power, 7.2mW/Gbps (transmitter core) and 2mW/Gbps (receiver core), because it is connected to an optical device with large parasitic capacitance. An overall receiver that includes a CDR (clock and data recovery) is also implemented. Many chips are fabricated in 65nm and 0.13μm CMOS processes. Finally, an electrical serial link for a 20Gbps memory link is proposed. The overall architecture is a forwarded-clock architecture that is simple and intuitive and needs no additional synchronizer. This open-loop, delay-matched, streamlined receiver finds the optimum sampling point with a DCDL (digitally controlled delay line) controller and is expected to consume little power by virtue of its structure. Only a two-phase half-rate clock is transmitted through the clock channel, but half-rate time-interleaved sampling is performed with the aid of a PRBS chaser whose initial value can be set.
A CMOS chip is fabricated in a 65nm process and occupies 2500um x 2500um (transceiver). Power consumption is expected to be about 2.6mW(2.4mW)/Gbps for the transmitter and 4.1mW(2.7mW)/Gbps for the receiver. Further improvement in power consumption is expected in a more advanced process.
Contents:
Abstract; Contents; List of Figures; List of Tables
Chapter 1. Introduction: 1.1 Motivation; 1.2 Thesis Organization
Chapter 2. A Serial Link PHY Design for GDDR3 Memory Interface: 2.1 Introduction; 2.2 GDDR3 Memory Interface Architecture (2.2.1 Read Path Architecture; 2.2.2 Write Path Architecture; 2.2.3 Command Path Architecture); 2.3 DLL Design for Memory Interface (2.3.1 SSN (Simultaneous Switching Noise); 2.3.2 DLL Architecture; 2.3.3 Voltage Controlled Delay Line (VCDL); 2.3.4 Hysteresis Coarse Lock Detector (HCLD); 2.3.5 Dynamic Phase Detector and Charge Pump); 2.4 Simulation Result; 2.5 Conclusion
Chapter 3. Optical Front-End Serial Link Design for 20 Gbps Memory Interface: 3.1 Silicon Photonics Introduction; 3.2 Optical Front-End Transmitter Design (3.2.1 Modulator Driver Requirements; 3.2.2 Modulator Driver Design - Current Mode Driver; 3.2.3 Modulator Driver Design - Current Mode Driver); 3.3 Optical Front-End Receiver Design (3.3.1 Optical Receiver Back End Requirements; 3.3.2 Optical Receiver Back End Design - TIA; 3.3.3 Optical Receiver Back End Design - LA, Driver; 3.3.4 Optical Receiver Back End Design - CDR); 3.4 Measurement and Simulation Results (3.4.1 Measurement and Simulation Environments; 3.4.2 Optical TX Front End Measurement and Simulation; 3.4.3 Optical RX Front End Measurement and Simulation; 3.4.4 Optical RX Back End Simulation; 3.4.5 Optical-Electrical Overall Measurements; 3.4.6 Die Photo and Layout); 3.5 Conclusion
Chapter 4. Electrical Front-End Serial Link Design for 20Gbps Memory Interface: 4.1 Introduction; 4.2 Conventional Electrical Front-End High Speed Serial Link Architectures; 4.3 Design Concept and Proposed Serial Link Architecture - Open Loop Delay Matched Stream Lined Receiver (4.3.1 Proposed Overall Architecture; 4.3.2 Design Concept; 4.3.3 Proposed Protocol and Locking Process); 4.4 Optimum Point Search Algorithm Based DCDL Controller Design; 4.5 DCDL (Digitally Controlled Delay Line) Design; 4.6 DFE (Decision Feedback Equalizer) and Other Blocks Design; 4.7 Simulation Results; 4.8 Power Expectation and Chip Layout; 4.9 Conclusion
Chapter 5. Conclusion
Bibliography

    Clock Generation Design for Continuous-Time Sigma-Delta Analog-To-Digital Converter in Communication Systems

    Software-defined radio, a highly digitized wireless receiver, has drawn considerable attention in modern communication systems because it can not only benefit from advanced technologies but also exploit extensive digital calibration through digital signal processing (DSP) to optimize receiver performance. The continuous-time (CT) bandpass sigma-delta (ΣΔ) modulator, used as an RF-to-digital converter, has been regarded as a potential solution for software-defined radio. The demand to support multiple standards motivates the development of a broadband CT bandpass ΣΔ modulator that can cover most of the commercial spectrum from 1GHz to 4GHz in a modern communication system. Clock generation, a major building block in radio frequency (RF) integrated circuits (ICs), usually uses a phase-locked loop (PLL) to provide the clock frequency required to modulate/demodulate the information-bearing signals. This work explores the design of clock generation in RF ICs. First, a 2-16 GHz frequency synthesizer is proposed to provide the sampling clocks for a programmable continuous-time bandpass ΣΔ modulator in a software radio receiver system. In the frequency synthesizer, a single-sideband mixer combines feed-forward and regenerative mixing techniques to achieve the wide frequency range. Furthermore, to optimize the excess loop delay in the wideband system, a phase-tunable clock distribution network and a clock-controlled quantizer are proposed. Also, the false locking of regenerative mixing is solved by controlling the self-oscillation frequency of the CML divider. The proposed frequency synthesizer achieves excellent jitter performance and efficient power consumption. Phase noise and quadrature phase accuracy are a common trade-off in a quadrature voltage-controlled oscillator. A larger coupling ratio is preferred to obtain good phase accuracy, but it degrades phase noise performance.
To address these fundamental trade-offs, a phasor-based analysis is used to explain bi-modal oscillation and compute the quadrature phase errors caused by inevitable component mismatches. Also, the ISF is used to estimate the noise contribution of each major noise source. A CSD QVCO is first proposed to eliminate the undesired bi-modal oscillation and enhance the quadrature phase accuracy. The second work presents a DCC QVCO. Its dynamic current-clipping coupling network reduces noise injection into the LC tank at the most vulnerable instants (zero-crossing points). Hence, it allows the use of a strong coupling ratio to minimize the quadrature phase sensitivity to mismatches without degrading the phase noise performance. The proposed DCC QVCO is implemented in a 130-nm CMOS technology. The measured phase noise is -121 dBc/Hz at 1MHz offset from a 5GHz carrier. The QVCO consumes 4.2mW with a 1-V power supply, resulting in an outstanding Figure of Merit (FoM) of 189 dBc/Hz. The frequency divider is one of the most power-hungry building blocks in a PLL-based frequency synthesizer. A complementary injection-locked frequency divider is proposed as a low-power solution. With the complementary injection schemes, the dividers can realize both even and odd division moduli, with a locking range of more than 100% to overcome PVT variation. The proposed dividers feature excellent phase noise. They can be used for multiple-phase generation, programmable phase-switching frequency dividers, and phase-skewing circuits.
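The quoted figure of merit can be cross-checked against the other numbers in the abstract. The sketch below assumes the conventional oscillator FoM definition, |L(Δf)| + 20·log10(f0/Δf) − 10·log10(P / 1 mW), which the abstract does not spell out; the function and argument names are illustrative only.

```python
import math


def vco_fom_dbc_hz(phase_noise_dbc_hz, carrier_hz, offset_hz, power_w):
    """Magnitude of the conventional oscillator figure of merit, in dBc/Hz.

    Assumed definition (not stated in the abstract):
    FoM = -L(offset) + 20*log10(carrier/offset) - 10*log10(P / 1 mW)
    """
    return (-phase_noise_dbc_hz
            + 20 * math.log10(carrier_hz / offset_hz)
            - 10 * math.log10(power_w / 1e-3))


# Figures quoted in the abstract: -121 dBc/Hz at 1 MHz offset from 5 GHz, 4.2 mW.
fom = vco_fom_dbc_hz(-121.0, 5e9, 1e6, 4.2e-3)
print(f"FoM = {fom:.2f} dBc/Hz")
```

Under this definition the result rounds to the 189 dBc/Hz stated in the abstract, so the reported phase noise, carrier frequency, and power figures are mutually consistent.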