21 research outputs found

    ๊ณ ์† ์‹œ๋ฆฌ์–ผ ๋งํฌ๋ฅผ ์œ„ํ•œ ๊ณ ๋ฆฌ ๋ฐœ์ง„๊ธฐ๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•˜๋Š” ์ฃผํŒŒ์ˆ˜ ํ•ฉ์„ฑ๊ธฐ

    Get PDF
    ํ•™์œ„๋…ผ๋ฌธ(๋ฐ•์‚ฌ) -- ์„œ์šธ๋Œ€ํ•™๊ต๋Œ€ํ•™์› : ๊ณต๊ณผ๋Œ€ํ•™ ์ „๊ธฐยท์ •๋ณด๊ณตํ•™๋ถ€, 2022. 8. ์ •๋•๊ท .In this dissertation, major concerns in the clocking of modern serial links are discussed. As sub-rate, multi-standard architectures are becoming predominant, the conventional clocking methodology seems to necessitate innovation in terms of low-cost implementation. Frequency synthesis with active, inductor-less oscillators replacing LC counterparts are reviewed, and solutions for two major drawbacks are proposed. Each solution is verified by prototype chip design, giving a possibility that the inductor-less oscillator may become a proper candidate for future high-speed serial links. To mitigate the high flicker noise of a high-frequency ring oscillator (RO), a reference multiplication technique that effectively extends the bandwidth of the following all-digital phase-locked loop (ADPLL) is proposed. The technique avoids any jitter accumulation, generating a clean mid-frequency clock, overall achieving high jitter performance in conjunction with the ADPLL. Timing constraint for the proper reference multiplication is first analyzed to determine the calibration points that may correct the existent phase errors. The weight for each calibration point is updated by the proposed a priori probability-based least-mean-square (LMS) algorithm. To minimize the time required for the calibration, each gain for the weight update is adaptively varied by deducing a posteriori which error source dominates the others. The prototype chip is fabricated in a 40-nm CMOS technology, and its measurement results verify the low-jitter, high-frequency clock generation with fast calibration settling. The presented work achieves an rms jitter of 177/223 fs at 8/16-GHz output, consuming 12.1/17-mW power. As the second embodiment, an RO-based ADPLL with an analog technique that addresses the high supply sensitivity of the RO is presented. Unlike prior arts, the circuit for the proposed technique does not extort the RO voltage headroom, allowing high-frequency oscillation. Further, the performance given from the technique is robust over process, voltage, and temperature (PVT) variations, avoiding the use of additional calibration hardware. Lastly, a comprehensive analysis of phase noise contribution is conducted for the overall ADPLL, followed by circuit optimizations, to retain the low-jitter output. Implemented in a 40-nm CMOS technology, the frequency synthesizer achieves an rms jitter of 289 fs at 8 GHz output without any injected supply noise. Under a 20-mVrms white supply noise, the ADPLL suppresses supply-noise-induced jitter by -23.8 dB.๋ณธ ๋…ผ๋ฌธ์€ ํ˜„๋Œ€ ์‹œ๋ฆฌ์–ผ ๋งํฌ์˜ ํด๋ฝํ‚น์— ๊ด€์—ฌ๋˜๋Š” ์ฃผ์š”ํ•œ ๋ฌธ์ œ๋“ค์— ๋Œ€ํ•˜์—ฌ ๊ธฐ์ˆ ํ•œ๋‹ค. ์ค€์†๋„, ๋‹ค์ค‘ ํ‘œ์ค€ ๊ตฌ์กฐ๋“ค์ด ์ฑ„ํƒ๋˜๊ณ  ์žˆ๋Š” ์ถ”์„ธ์— ๋”ฐ๋ผ, ๊ธฐ์กด์˜ ํด๋ผํ‚น ๋ฐฉ๋ฒ•์€ ๋‚ฎ์€ ๋น„์šฉ์˜ ๊ตฌํ˜„์˜ ๊ด€์ ์—์„œ ์ƒˆ๋กœ์šด ํ˜์‹ ์„ ํ•„์š”๋กœ ํ•œ๋‹ค. LC ๊ณต์ง„๊ธฐ๋ฅผ ๋Œ€์‹ ํ•˜์—ฌ ๋Šฅ๋™ ์†Œ์ž ๋ฐœ์ง„๊ธฐ๋ฅผ ์‚ฌ์šฉํ•œ ์ฃผํŒŒ์ˆ˜ ํ•ฉ์„ฑ์— ๋Œ€ํ•˜์—ฌ ์•Œ์•„๋ณด๊ณ , ์ด์— ๋ฐœ์ƒํ•˜๋Š” ๋‘๊ฐ€์ง€ ์ฃผ์š” ๋ฌธ์ œ์ ๊ณผ ๊ฐ๊ฐ์— ๋Œ€ํ•œ ํ•ด๊ฒฐ ๋ฐฉ์•ˆ์„ ํƒ์ƒ‰ํ•œ๋‹ค. ๊ฐ ์ œ์•ˆ ๋ฐฉ๋ฒ•์„ ํ”„๋กœํ† ํƒ€์ž… ์นฉ์„ ํ†ตํ•ด ๊ทธ ํšจ์šฉ์„ฑ์„ ๊ฒ€์ฆํ•˜๊ณ , ์ด์–ด์„œ ๋Šฅ๋™ ์†Œ์ž ๋ฐœ์ง„๊ธฐ๊ฐ€ ๋ฏธ๋ž˜์˜ ๊ณ ์† ์‹œ๋ฆฌ์–ผ ๋งํฌ์˜ ํด๋ฝํ‚น์— ์‚ฌ์šฉ๋  ๊ฐ€๋Šฅ์„ฑ์— ๋Œ€ํ•ด ๊ฒ€ํ† ํ•œ๋‹ค. ์ฒซ๋ฒˆ์งธ ์‹œ์—ฐ์œผ๋กœ์จ, ๊ณ ์ฃผํŒŒ ๊ณ ๋ฆฌ ๋ฐœ์ง„๊ธฐ์˜ ๋†’์€ ํ”Œ๋ฆฌ์ปค ์žก์Œ์„ ์™„ํ™”์‹œํ‚ค๊ธฐ ์œ„ํ•ด ๊ธฐ์ค€ ์‹ ํ˜ธ๋ฅผ ๋ฐฐ์ˆ˜ํ™”ํ•˜์—ฌ ๋’ท๋‹จ์˜ ์œ„์ƒ ๊ณ ์ • ๋ฃจํ”„์˜ ๋Œ€์—ญํญ์„ ํšจ๊ณผ์ ์œผ๋กœ ๊ทน๋Œ€ํ™” ์‹œํ‚ค๋Š” ํšŒ๋กœ ๊ธฐ์ˆ ์„ ์ œ์•ˆํ•œ๋‹ค. ๋ณธ ๊ธฐ์ˆ ์€ ์ง€ํ„ฐ๋ฅผ ๋ˆ„์  ์‹œํ‚ค์ง€ ์•Š์œผ๋ฉฐ ๋”ฐ๋ผ์„œ ๊นจ๋—ํ•œ ์ค‘๊ฐ„ ์ฃผํŒŒ์ˆ˜ ํด๋ฝ์„ ์ƒ์„ฑ์‹œ์ผœ ์œ„์ƒ ๊ณ ์ • ๋ฃจํ”„์™€ ํ•จ๊ป˜ ๋†’์€ ์„ฑ๋Šฅ์˜ ๊ณ ์ฃผํŒŒ ํด๋ฝ์„ ํ•ฉ์„ฑํ•œ๋‹ค. ๊ธฐ์ค€ ์‹ ํ˜ธ๋ฅผ ์„ฑ๊ณต์ ์œผ๋กœ ๋ฐฐ์ˆ˜ํ™”ํ•˜๊ธฐ ์œ„ํ•œ ํƒ€์ด๋ฐ ์กฐ๊ฑด๋“ค์„ ๋จผ์ € ๋ถ„์„ํ•˜์—ฌ ํƒ€์ด๋ฐ ์˜ค๋ฅ˜๋ฅผ ์ œ๊ฑฐํ•˜๊ธฐ ์œ„ํ•œ ๋ฐฉ๋ฒ•๋ก ์„ ํŒŒ์•…ํ•œ๋‹ค. ๊ฐ ๊ต์ • ์ค‘๋Ÿ‰์€ ์—ฐ์—ญ์  ํ™•๋ฅ ์„ ๊ธฐ๋ฐ˜์œผ๋กœํ•œ LMS ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ํ†ตํ•ด ๊ฐฑ์‹ ๋˜๋„๋ก ์„ค๊ณ„๋œ๋‹ค. ๊ต์ •์— ํ•„์š”ํ•œ ์‹œ๊ฐ„์„ ์ตœ์†Œํ™” ํ•˜๊ธฐ ์œ„ํ•˜์—ฌ, ๊ฐ ๊ต์ • ์ด๋“์€ ํƒ€์ด๋ฐ ์˜ค๋ฅ˜ ๊ทผ์›๋“ค์˜ ํฌ๊ธฐ๋ฅผ ๊ท€๋‚ฉ์ ์œผ๋กœ ์ถ”๋ก ํ•œ ๊ฐ’์„ ๋ฐ”ํƒ•์œผ๋กœ ์ง€์†์ ์œผ๋กœ ์ œ์–ด๋œ๋‹ค. 40-nm CMOS ๊ณต์ •์œผ๋กœ ๊ตฌํ˜„๋œ ํ”„๋กœํ† ํƒ€์ž… ์นฉ์˜ ์ธก์ •์„ ํ†ตํ•ด ์ €์†Œ์Œ, ๊ณ ์ฃผํŒŒ ํด๋ฝ์„ ๋น ๋ฅธ ๊ต์ • ์‹œ๊ฐ„์•ˆ์— ํ•ฉ์„ฑํ•ด ๋ƒ„์„ ํ™•์ธํ•˜์˜€๋‹ค. ์ด๋Š” 177/223 fs์˜ rms ์ง€ํ„ฐ๋ฅผ ๊ฐ€์ง€๋Š” 8/16 GHz์˜ ํด๋ฝ์„ ์ถœ๋ ฅํ•œ๋‹ค. ๋‘๋ฒˆ์งธ ์‹œ์—ฐ์œผ๋กœ์จ, ๊ณ ๋ฆฌ ๋ฐœ์ง„๊ธฐ์˜ ๋†’์€ ์ „์› ๋…ธ์ด์ฆˆ ์˜์กด์„ฑ์„ ์™„ํ™”์‹œํ‚ค๋Š” ๊ธฐ์ˆ ์ด ํฌํ•จ๋œ ์ฃผํŒŒ์ˆ˜ ํ•ฉ์„ฑ๊ธฐ๊ฐ€ ์„ค๊ณ„๋˜์—ˆ๋‹ค. ์ด๋Š” ๊ณ ๋ฆฌ ๋ฐœ์ง„๊ธฐ์˜ ์ „์•• ํ—ค๋“œ๋ฃธ์„ ๋ณด์กดํ•จ์œผ๋กœ์„œ ๊ณ ์ฃผํŒŒ ๋ฐœ์ง„์„ ๊ฐ€๋Šฅํ•˜๊ฒŒ ํ•œ๋‹ค. ๋‚˜์•„๊ฐ€, ์ „์› ๋…ธ์ด์ฆˆ ๊ฐ์†Œ ์„ฑ๋Šฅ์€ ๊ณต์ •, ์ „์••, ์˜จ๋„ ๋ณ€๋™์— ๋Œ€ํ•˜์—ฌ ๋ฏผ๊ฐํ•˜์ง€ ์•Š์œผ๋ฉฐ, ๋”ฐ๋ผ์„œ ์ถ”๊ฐ€์ ์ธ ๊ต์ • ํšŒ๋กœ๋ฅผ ํ•„์š”๋กœ ํ•˜์ง€ ์•Š๋Š”๋‹ค. ๋งˆ์ง€๋ง‰์œผ๋กœ, ์œ„์ƒ ๋…ธ์ด์ฆˆ์— ๋Œ€ํ•œ ํฌ๊ด„์  ๋ถ„์„๊ณผ ํšŒ๋กœ ์ตœ์ ํ™”๋ฅผ ํ†ตํ•˜์—ฌ ์ฃผํŒŒ์ˆ˜ ํ•ฉ์„ฑ๊ธฐ์˜ ์ €์žก์Œ ์ถœ๋ ฅ์„ ๋ฐฉํ•ดํ•˜์ง€ ์•Š๋Š” ๋ฐฉ๋ฒ•์„ ๊ณ ์•ˆํ•˜์˜€๋‹ค. ํ•ด๋‹น ํ”„๋กœํ† ํƒ€์ž… ์นฉ์€ 40-nm CMOS ๊ณต์ •์œผ๋กœ ๊ตฌํ˜„๋˜์—ˆ์œผ๋ฉฐ, ์ „์› ๋…ธ์ด์ฆˆ๊ฐ€ ์ธ๊ฐ€๋˜์ง€ ์•Š์€ ์ƒํƒœ์—์„œ 289 fs์˜ rms ์ง€ํ„ฐ๋ฅผ ๊ฐ€์ง€๋Š” 8 GHz์˜ ํด๋ฝ์„ ์ถœ๋ ฅํ•œ๋‹ค. ๋˜ํ•œ, 20 mVrms์˜ ์ „์› ๋…ธ์ด์ฆˆ๊ฐ€ ์ธ๊ฐ€๋˜์—ˆ์„ ๋•Œ์— ์œ ๋„๋˜๋Š” ์ง€ํ„ฐ์˜ ์–‘์„ -23.8 dB ๋งŒํผ ์ค„์ด๋Š” ๊ฒƒ์„ ํ™•์ธํ•˜์˜€๋‹ค.1 Introduction 1 1.1 Motivation 3 1.1.1 Clocking in High-Speed Serial Links 4 1.1.2 Multi-Phase, High-Frequency Clock Conversion 8 1.2 Dissertation Objectives 10 2 RO-Based High-Frequency Synthesis 12 2.1 Phase-Locked Loop Fundamentals 12 2.2 Toward All-Digital Regime 15 2.3 RO Design Challenges 21 2.3.1 Oscillator Phase Noise 21 2.3.2 Challenge 1: High Flicker Noise 23 2.3.3 Challenge 2: High Supply Noise Sensitivity 26 3 Filtering RO Noise 28 3.1 Introduction 28 3.2 Proposed Reference Octupler 34 3.2.1 Delay Constraint 34 3.2.2 Phase Error Calibration 38 3.2.3 Circuit Implementation 51 3.3 IL-ADPLL Implementation 55 3.4 Measurement Results 59 3.5 Summary 63 4 RO Supply Noise Compensation 69 4.1 Introduction 69 4.2 Proposed Analog Closed Loop for Supply Noise Compensation 72 4.2.1 Circuit Implementation 73 4.2.2 Frequency-Domain Analysis 76 4.2.3 Circuit Optimization 81 4.3 ADPLL Implementation 87 4.4 Measurement Results 90 4.5 Summary 98 5 Conclusions 99 A Notes on the 8REF 102 B Notes on the ACSC 105๋ฐ•

    Design and modelling of clock and data recovery integrated circuit in 130 nm CMOS technology for 10 Gb/s serial data communications

    Get PDF
    This thesis describes the design and implementation of a fully monolithic 10 Gb/s phase and frequency-locked loop based clock and data recovery (PFLL-CDR) integrated circuit, as well as the Verilog-A modeling of an asynchronous serial link based chip to chip communication system incorporating the proposed concept. The proposed design was implemented and fabricated using the 130 nm CMOS technology offered by UMC (United Microelectronics Corporation). Different PLL-based CDR circuits topologies were investigated in terms of architecture and speed. Based on the investigation, we proposed a new concept of quarter-rate (i.e. the clocking speed in the circuit is 2.5 GHz for 10 Gb/s data rate) and dual-loop topology which consists of phase-locked and frequency-locked loop. The frequency-locked loop (FLL) operates independently from the phase-locked loop (PLL), and has a highly-desired feature that once the proper frequency has been acquired, the FLL is automatically disabled and the PLL will take over to adjust the clock edges approximately in the middle of the incoming data bits for proper sampling. Another important feature of the proposed quarter-rate concept is the inherent 1-to-4 demultiplexing of the input serial data stream. A new quarter-rate phase detector based on the non-linear early-late phase detector concept has been used to achieve the multi-Giga bit/s speed and to eliminate the need of the front-end data pre-processing (edge detecting) units usually associated with the conventional CDR circuits. An eight-stage differential ring oscillator running at 2.5 GHz frequency center was used for the voltage-controlled oscillator (VCO) to generate low-jitter multi-phase clock signals. The transistor level simulation results demonstrated excellent performances in term of locking speed and power consumption. In order to verify the accuracy of the proposed quarter-rate concept, a clockless asynchronous serial link incorporating the proposed concept and communicating two chips at 10 Gb/s has been modelled at gate level using the Verilog-A language and time-domain simulated

    Energy-efficient wireline transceivers

    Get PDF
    Power-efficient wireline transceivers are highly demanded by many applications in high performance computation and communication systems. Apart from transferring a wide range of data rates to satisfy the interconnect bandwidth requirement, the transceivers have very tight power budget and are expected to be fully integrated. This thesis explores enabling techniques to implement such transceivers in both circuit and system levels. Specifically, three prototypes will be presented: (1) a 5Gb/s reference-less clock and data recovery circuit (CDR) using phase-rotating phase-locked loop (PRPLL) to conduct phase control so as to break several fundamental trade-offs in conventional receivers; (2) a 4-10.5Gb/s continuous-rate CDR with novel frequency acquisition scheme based on bang-bang phase detector (BBPD) and a ring oscillator-based fractional-N PLL as the low noise wide range DCO in the CDR loop; (3) a source-synchronous energy-proportional link with dynamic voltage and frequency scaling (DVFS) and rapid on/off (ROO) techniques to cut the link power wastage at system level. The receiver/transceiver architectures are highly digital and address the requirements of new receiver architecture development, wide operating range, and low power/area consumption while being fully integrated. Experimental results obtained from the prototypes attest the effectiveness of the proposed techniques

    Hybrid NRZ/Multi-Tone Signaling for High-Speed Low-Power Wireline Transceivers

    Get PDF
    Over the past few decades, incessant growth of Internet networking traffic and High-Performance Computing (HPC) has led to a tremendous demand for data bandwidth. Digital communication technologies combined with advanced integrated circuit scaling trends have enabled the semiconductor and microelectronic industry to dramatically scale the bandwidth of high-loss interfaces such as Ethernet, backplane, and Digital Subscriber Line (DSL). The key to achieving higher bandwidth is to employ equalization technique to compensate the channel impairments such as Inter-Symbol Interference (ISI), crosstalk, and environmental noise. Therefore, todayรขs advanced input/outputs (I/Os) has been equipped with sophisticated equalization techniques to push beyond the uncompensated bandwidth of the system. To this end, process scaling has continually increased the data processing capability and improved the I/O performance over the last 15 years. However, since the channel bandwidth has not scaled with the same pace, the required signal processing and equalization circuitry becomes more and more complicated. Thereby, the energy efficiency improvements are largely offset by the energy needed to compensate channel impairments. In this design paradigm, re-thinking about the design strategies in order to not only satisfy the bandwidth performance, but also to improve power-performance becomes an important necessity. It is well known in communication theory that coding and signaling schemes have the potential to provide superior performance over band-limited channels. However, the choice of the optimum data communication algorithm should be considered by accounting for the circuit level power-performance trade-offs. In this thesis we have investigated the application of new algorithm and signaling schemes in wireline communications, especially for communication between microprocessors, memories, and peripherals. A new hybrid NRZ/Multi-Tone (NRZ/MT) signaling method has been developed during the course of this research. The system-level and circuit-level analysis, design, and implementation of the proposed signaling method has been performed in the frame of this work, and the silicon measurement results have proved the efficiency and the robustness of the proposed signaling methodology for wireline interfaces. In the first part of this work, a 7.5 Gb/s hybrid NRZ/MT transceiver (TRX) for multi-drop bus (MDB) memory interfaces is designed and fabricated in 40 nm CMOS technology. Reducing the complexity of the equalization circuitry on the receiver (RX) side, the proposed architecture achieves 1 pJ/bit link efficiency for a MDB channel bearing 45 dB loss at 2.5 GHz. The measurement results of the first prototype confirm that NRZ/MT serial data TRX can offer an energy-efficient solution for MDB memory interfaces. Motivated by the satisfying results of the first prototype, in the second phase of this research we have exploited the properties of multi-tone signaling, especially orthogonality among different sub-bands, to reduce the effect of crosstalk in high-dense wireline interconnects. A four-channel transceiver has been implemented in a standard CMOS 40 nm technology in order to demonstrate the performance of NRZ/MT signaling in presence of high channel loss and strong crosstalk noise. The proposed system achieves 1 pJ/bit power efficiency, while communicating over a MDB memory channel at 36 Gb/s aggregate data rate

    Digital enhancement techniques for fractional-N frequency synthesizers

    Get PDF
    Meeting the demand for unprecedented connectivity in the era of internet-of-things (IoT) requires extremely energy efficient operation of IoT nodes to extend battery life. Managing the data traffic generated by trillions of such nodes also puts severe energy constraints on the data centers. Clock generators that are essential elements in these systems consume significant power and therefore must be optimized for low power and high performance. The focus of this thesis is on improving the energy efficiency of frequency synthesizers and clocking modules by exploring design techniques at both the architectural and circuit levels. In the first part of this work, a digital fractional-N phase locked loop (FNPLL) that employs a high resolution time-to-digital converter (TDC) and a truly ฮ”ฮฃ fractional divider to achieve low in-band noise with a wide bandwidth is presented. The fractional divider employs a digital-to-time converter (DTC) to cancel out ฮ”ฮฃ quantization noise in time domain, thus alleviating TDC dynamic range requirements. The proposed digital architecture adopts a narrow range low-power time-amplifier based TDC (TA-TDC) to achieve sub 1ps resolution. Fabricated in 65nm CMOS process, the prototype PLL achieves better than -106dBc/Hz in-band noise and 3MHz PLL bandwidth at 4.5GHz output frequency using 50MHz reference. The PLL achieves excellent jitter performance of 490fsrms, while consumes only 3.7mW. This translates to the best reported jitter-power figure-of-merit (FoM) of -240.5dB among previously reported FNPLLs. Phase noise performance of ring oscillator based digital FNPLLs is severely compromised by conflicting bandwidth requirements to simultaneously suppress oscillator phase and quantization noise introduced by the TDC, ฮ”ฮฃ fractional divider, and digital-to-analog converter (DAC). As a consequence, their FoM that quantifies the power-jitter tradeoff is at least 25dB worse than their LC-oscillator based FNPLL counterparts. In the second part of this thesis, we seek to close this performance gap by extending PLL bandwidth using quantization noise cancellation techniques and by employing a dual-path digital loop filter to suppress the detrimental impact of DAC quantization noise. A prototype was implemented in a 65nm CMOS process operating over a wide frequency range of 2.0GHz-5.5GHz using a modified extended range multi-modulus divider with seamless switching. The proposed digital FNPLL achieves 1.9psrms integrated jitter while consuming only 4mW at 5GHz output. The measured in-band phase noise is better than -96 dBc/Hz at 1MHz offset. The proposed FNPLL achieves wide bandwidth up to 6MHz using a 50 MHz reference and its FoM is -228.5dB, which is at about 20dB better than previously reported ring-based digital FNPLLs. In the third part, we propose a new multi-output clock generator architecture using open loop fractional dividers for system-on-chip (SoC) platforms. Modern multi-core processors use per core clocking, where each core runs at its own speed. The core frequency can be changed dynamically to optimize for performance or power dissipation using a dynamic frequency scaling (DFS) technique. Fast frequency switching is highly desirable as long as it does not interrupt code execution; therefore it requires smooth frequency transitions with no undershoots. The second main requirement in processor clocking is the capability of spread spectrum frequency modulation. By spreading the clock energy across a wide bandwidth, the electromagnetic interference (EMI) is dramatically reduced. A conventional PLL clock generation approach suffers from a slow frequency settling and limited spread spectrum modulation capabilities. The proposed open loop fractional divider architecture overcomes the bandwidth limitation in fractional-N PLLs. The fractional divider switches the output frequency instantaneously and provides an excellent spread spectrum performance, where precise and programmable modulation depth and frequency can be applied to satisfy different EMI requirements. The fractional divider has unlimited modulation bandwidth resulting in spread spectrum modulation with no filtering, unlike fractional-N PLL; consequently it achieves higher EMI reduction. A prototype fractional divider was implemented in a 65nm CMOS process, where the measured peak-to-peak jitter is less than 27ps over a wide frequency range from 20MHz to 1GHz. The total power consumption is about 3.2mW for 1GHz output frequency. The all-digital implementation of the divider occupies the smallest area of 0.017mm2 compared to state-of-the-art designs. As the data rate of serial links goes higher, the jitter requirements of the clock generator become more stringent. Improving the jitter performance of conventional PLLs to less than (200fsrms) always comes with a large power penalty (tens of mWs). This is due to the PLL coupled noise bandwidth trade-off, which imposes stringent noise requirements on the oscillator and/or loop components. Alternatively, an injection-locked clock multiplier (ILCM) provides many advantages in terms of phase noise, power, and area compared to classical PLLs, but they suffer from a narrow lock-in range and a high sensitivity to PVT variations especially at a large multiplication factor (N). In the fourth part of this thesis, a low-jitter, low-power LC-based ILCM with a digital frequency-tracking loop (FTL) is presented. The proposed FTL relies on a new pulse gating technique to continuously tune the oscillator's free-running frequency. The FTL ensures robust operation across PVT variations and resolves the race condition existing in injection locked PLLs by decoupling frequency tuning from the injection path. As a result, the phase locking condition is only determined by the injection path. This work also introduces an accurate theoretical large-signal analysis for phase domain response (PDR) of injection locked oscillators (ILOs). The proposed PDR analysis captures the asymmetric nature of ILO's lock-in range, and the impact of frequency error on injection strength and phase noise performance. The proposed architecture and analysis are demonstrated by a prototype fabricated in 65 nm CMOS process with active area of 0.25mm2. The prototype ILCM multiplies the reference frequency by 64 to generate an output clock in the range of 6.75GHz-8.25GHz. A superior jitter performance of 190fsrms is achieved, while consuming only 2.25mW power. This translates to a best FoM of -251dB. Unlike conventional PLLs, ILCMs have been fundamentally limited to only integer-N operation and cannot synthesize fractional-N frequencies. In the last part of this thesis, we extend the merits of ILCMs to fractional-N and overcome this fundamental limitation. We employ DTC-based QNC techniques in order to align injected pulses to the oscillator's zero crossings, which enables it to pull the oscillator toward phase lock, thus realizing a fractional-N ILCM. Fabricated in 65nm CMOS process, a prototype 20-bit fractional-N ILCM with an output range of 6.75GHz-8.25GHz consumes only 3.25mW. It achieves excellent jitter performance of 110fsrms and 175fsrms in integer- and fractional-N modes respectively, which translates to the best-reported FoM in both integer- (-255dB) and fractional-N (-252dB) modes. The proposed fractional-N ILCM also features the first-reported rapid on/off capability, where the transient absolute jitter performance at wake-up is bounded below 4ps after less than 4ns. This demonstrates almost instantaneous phase settling. This unique capability enables tremendous energy saving by turning on the clock multiplier only when needed. This energy proportional operation leverages idle times to save power at the system-level of wireline and wireless transceivers

    Clock Generator Circuits for Low-Power Heterogeneous Multiprocessor Systems-on-Chip

    Get PDF
    In this work concepts and circuits for local clock generation in low-power heterogeneous multiprocessor systems-on-chip (MPSoCs) are researched and developed. The targeted systems feature a globally asynchronous locally synchronous (GALS) clocking architecture and advanced power management functionality, as for example fine-grained ultra-fast dynamic voltage and frequency scaling (DVFS). To enable this functionality compact clock generators with low chip area, low power consumption, wide output frequency range and the capability for ultra-fast frequency changes are required. They are to be instantiated individually per core. For this purpose compact all digital phase-locked loop (ADPLL) frequency synthesizers are developed. The bang-bang ADPLL architecture is analyzed using a numerical system model and optimized for low jitter accumulation. A 65nm CMOS ADPLL is implemented, featuring a novel active current bias circuit which compensates the supply voltage and temperature sensitivity of the digitally controlled oscillator (DCO) for reduced digital tuning effort. Additionally, a 28nm ADPLL with a new ultra-fast lock-in scheme based on single-shot phase synchronization is proposed. The core clock is generated by an open-loop method using phase-switching between multi-phase DCO clocks at a fixed frequency. This allows instantaneous core frequency changes for ultra-fast DVFS without re-locking the closed loop ADPLL. The sensitivity of the open-loop clock generator with respect to phase mismatch is analyzed analytically and a compensation technique by cross-coupled inverter buffers is proposed. The clock generators show small area (0.0097mm2 (65nm), 0.00234mm2 (28nm)), low power consumption (2.7mW (65nm), 0.64mW (28nm)) and they provide core clock frequencies from 83MHz to 666MHz which can be changed instantaneously. The jitter performance is compliant to DDR2/DDR3 memory interface specifications. Additionally, high-speed clocks for novel serial on-chip data transceivers are generated. The ADPLL circuits have been verified successfully by 3 testchip implementations. They enable efficient realization of future low-power MPSoCs with advanced power management functionality in deep-submicron CMOS technologies.In dieser Arbeit werden Konzepte und Schaltungen zur lokalen Takterzeugung in heterogenen Multiprozessorsystemen (MPSoCs) mit geringer Verlustleistung erforscht und entwickelt. Diese Systeme besitzen eine global-asynchrone lokal-synchrone Architektur sowie Funktionalitรคt zum Power Management, wie z.B. das feingranulare, schnelle Skalieren von Spannung und Taktfrequenz (DVFS). Um diese Funktionalitรคt zu realisieren werden kompakte Taktgeneratoren benรถtigt, welche eine kleine Chipflรคche einnehmen, wenig Verlustleitung aufnehmen, einen weiten Bereich an Ausgangsfrequenzen erzeugen und diese sehr schnell รคndern kรถnnen. Sie sollen individuell pro Prozessorkern integriert werden. Dazu werden kompakte volldigitale Phasenregelkreise (ADPLLs) entwickelt, wobei eine bang-bang ADPLL Architektur numerisch modelliert und fรผr kleine Jitterakkumulation optimiert wird. Es wird eine 65nm CMOS ADPLL implementiert, welche eine neuartige Kompensationsschlatung fรผr den digital gesteuerten Oszillator (DCO) zur Verringerung der Sensitivitรคt bezรผglich Versorgungsspannung und Temperatur beinhaltet. Zusรคtzlich wird eine 28nm CMOS ADPLL mit einer neuen Technik zum schnellen Einschwingen unter Nutzung eines Phasensynchronisierers realisiert. Der Prozessortakt wird durch ein neuartiges Phasenmultiplex- und Frequenzteilerverfahren erzeugt, welches es ermรถglicht die Taktfrequenz sofort zu รคndern um schnelles DVFS zu realisieren. Die Sensitivitรคt dieses Frequenzgenerators bezรผglich Phasen-Mismatch wird theoretisch analysiert und durch Verwendung von kreuzgekoppelten Taktverstรคrkern kompensiert. Die hier entwickelten Taktgeneratoren haben eine kleine Chipflรคche (0.0097mm2 (65nm), 0.00234mm2 (28nm)) und Leistungsaufnahme (2.7mW (65nm), 0.64mW (28nm)). Sie stellen Frequenzen von 83MHz bis 666MHz bereit, welche sofort geรคndert werden kรถnnen. Die Schaltungen erfรผllen die Jitterspezifikationen von DDR2/DDR3 Speicherinterfaces. Zusรคtzliche kรถnnen schnelle Takte fรผr neuartige serielle on-Chip Verbindungen erzeugt werden. Die ADPLL Schaltungen wurden erfolgreich in 3 Testchips erprobt. Sie ermรถglichen die effiziente Realisierung von zukรผnftigen MPSoCs mit Power Management in modernsten CMOS Technologien

    Design and realization of a 2.4 Gbps - 3.2 Gbps clock and data recovery circuit

    Get PDF
    This thesis presents the design, verification, system integration and the physical realization of a high-speed monolithic phase-locked loop (PLL) based clock and data recovery (CDR) circuit. The architecture of the CDR has been realized as a two-loop structure consisting of coarse and fine loops, each of which is capable of processing the incoming low-speed reference clock and high-speed random data. At start up, the coarse loop provides fast locking to the system frequency with the help of the reference clock. After the VCO clock reaches a proximity of system frequency , the LOCK signal is generated and the coarse loop is tumed off, while the fine loop is tumed on. Fine loop tracks the phase of the generated clock with respect to the data and aligns the VCO clock such that its rising edge is in the middle of data eye. The speed and symmetry of sub-blocks in fine loop are extremely important, since all asymmetric charging effects, skew and setup/hold problems in this loop translate into a static phase error at the clock output. The entire circuit architecture is built with a special low-voltage circuit design technique. All analogue as well as digital sub-blocks of the CDR architecture presented in this work operate on a differential signalling, which significantly makes the design more complex while ensuring a more robust perforrnance. Other important features of this CDR include small area, single power supply, low power consumption, capability to operate at very high data rates, and the ability to handle between 2.4 Gbps and 3.2 Gbps data rate. The CDR architecture was realized using a conventional 0.13-mikrometer digital CMOS technology (Foundry: UMC), which ensures a lower overall cost and better portability for the design. The CDR architecture presented in this work is capable of operating at sampling frequencies of up to 3.2 GHz, and still can achieve the robust phase alignrnent. The entire circuit is designed with single 1.2 V power supply .The overall power consumption is estimated as 18.6 mW at 3.2 GHz sampling rate. The overall silicon area of the CDR is approximately 0.3 mm^2 with its internal loop filter capacitors. Other researchers have reported similar featured PLL-based clock and data recovery circuits in terms of operating data rate, architecture and jitter performance. To the best of our knowledge, this clock recovery uses the advantage of being the first high-speed CDR designed in CMOS 0.13 mikrometer technology with the superiority on power consumption and area considerations among others. The CDR architecture presented in this thesis is intended, as a state-of-the-art clock recovery for high-speed applications such as optical communications or high bandwidth serial wireline communication needs. It can be used either as a stand-alone single-chip unit, or as an embedded intellectual property (IP) block that can be integrated with other modules on chip

    ๋ฐ์ดํ„ฐ ์ „์†ก๋กœ ํ™•์žฅ์„ฑ๊ณผ ๋ฃจํ”„ ์„ ํ˜•์„ฑ์„ ํ–ฅ์ƒ์‹œํ‚จ ๋‹ค์ค‘์ฑ„๋„ ์ˆ˜์‹ ๊ธฐ๋“ค์— ๊ด€ํ•œ ์—ฐ๊ตฌ

    Get PDF
    ํ•™์œ„๋…ผ๋ฌธ (๋ฐ•์‚ฌ)-- ์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› : ์ „๊ธฐยท์ปดํ“จํ„ฐ๊ณตํ•™๋ถ€, 2013. 2. ์ •๋•๊ท .Two types of serial data communication receivers that adopt a multichannel architecture for a high aggregate I/O bandwidth are presented. Two techniques for collaboration and sharing among channels are proposed to enhance the loop-linearity and channel-expandability of multichannel receivers, respectively. The first proposed receiver employs a collaborative timing scheme recovery which relies on the sharing of all outputs of phase detectors (PDs) among channels to extract common information about the timing and multilevel signaling architecture of PAM-4. The shared timing information is processed by a common global loop filter and is used to update the phase of the voltage-controlled oscillator with better rejection of per-channel noise. In addition to collaborative timing recovery, a simple linearization technique for binary PDs is proposed. The technique realizes a high-rate oversampling PD while the hardware cost is equivalent to that of a conventional 2x-oversampling clock and data recovery. The first receiver exploiting the collaborative timing recovery architecture is designed using 45-nm CMOS technology. A single data lane occupies a 0.195-mm2 area and consumes a relatively low 17.9 mW at 6 Gb/s at 1.0V. Therefore, the power efficiency is 2.98 mW/Gb/s. The simulated jitter is about 0.034 UI RMS given an input jitter value of 0.03 UI RMS, while the relatively constant loop bandwidth with the PD linearization technique is about 7.3-MHz regardless of the data-stream noise. Unlike the first receiver, the second proposed multichannel receiver was designed to reduce the hardware complexity of each lane. The receiver employs shared calibration logic among channels and yet achieves superior channel expandability with slim data lanes. A shared global calibration control, which is used in a forwarded clock receiver based on a multiphase delay-locked loop, accomplishes skew calibration, equalizer adaptation, and the phase lock of all channels during a calibration period, resulting in reduced hardware overhead and less area required by each data lane. The second forwarded clock receiver is designed in 90-nm CMOS technology. It achieves error-free eye openings of more than 0.5 UI across 9โˆ’ 28 inch Nelco 4000-6 microstrips at 4โˆ’ 7 Gb/s and more than 0.42 UI at data rates of up to 9 Gb/s. The data lane occupies only 0.152 mm2 and consumes 69.8 mW, while the rest of the receiver occupies 0.297 mm2 and consumes 56 mW at a data rate of 7 Gb/s and a supply voltage of 1.35 V.1. Introduction 1 1.1 Motivations 1.2 Thesis Organization 2. Previous Receivers for Serial-Data Communications 2.1 Classification of the Links 2.2 Clocking architecture of transceivers 2.3 Components of receiver 2.3.1 Channel loss 2.3.2 Equalizer 2.3.3 Clock and data recovery circuit 2.3.3.1. Basic architecture 2.3.3.2. Phase detector 2.3.3.2.1. Linear phase detector 2.3.3.2.2. Binary phase detector 2.3.3.3. Frequency detector 2.3.3.4. Charge pump 2.3.3.5. Voltage controlled oscillator and delay-line 2.3.4 Loop dynamics of PLL 2.3.5 Loop dynamics of DLL 3. The Proposed PLL-Based Receiver with Loop Linearization Technique 3.1 Introduction 3.2 Motivation 3.3 Overview of binary phase detection 3.4 The proposed BBPD linearization technique 3.4.1 Architecture of the proposed PLL-based receiver 3.4.2 Linearization technique of binary phase detection 3.4.3 Rotational pattern of sampling phase offset 3.5 PD gain analysis and optimization 3.6 Loop Dynamics of the 2nd-order CDR 3.7 Verification with the time-accurate behavioral simulation 3.8 Summary 4. The Proposed DLL-Based Receiver with Forwarded-Clock 4.1 Introduction 4.2 Motivation 4.3 Design consideration 4.4 Architecture of the proposed forwarded-clock receiver 4.5 Circuit description 4.5.1 Analog multi-phase DLL 4.5.2 Dual-input interpolating deley cells 4.5.3 Dedicated half-rate data samplers 4.5.4 Cherry-Hooper continuous-time linear equalizer 4.5.5 Equalizer adaptation and phase-lock scheme 4.6 Measurement results 5. Conclusion 6. BibliographyDocto
    corecore