111 research outputs found

    A Sub-Picosecond Hybrid DLL for Large-Scale Phased Array Synchronization

    A large-scale timing synchronization scheme for scalable phased arrays is presented. The approach uses a DLL co-designed with a subsequent 2.5 GHz PLL. The DLL employs low-noise fine/coarse delay tuning to reduce the in-band RMS jitter to 323 fs, an order of magnitude improvement over previous works at similar frequencies. The DLL was fabricated in a 65 nm bulk CMOS process and characterized from 27 MHz to 270 MHz. It consumes up to 3.3 mW from a 1 V power supply and has a small footprint of 0.036 mm^2.

    A Study on Multichannel Receivers with Enhanced Data-Channel Expandability and Loop Linearity

    Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, 2013. 2. 정덕균.

    Two types of serial data communication receivers that adopt a multichannel architecture for a high aggregate I/O bandwidth are presented. Two techniques for collaboration and sharing among channels are proposed to enhance the loop linearity and channel expandability of multichannel receivers, respectively. The first proposed receiver employs a collaborative timing recovery scheme, which relies on sharing the outputs of all phase detectors (PDs) among channels to extract common information about the timing and multilevel signaling architecture of PAM-4. The shared timing information is processed by a common global loop filter and is used to update the phase of the voltage-controlled oscillator with better rejection of per-channel noise. In addition to collaborative timing recovery, a simple linearization technique for binary PDs is proposed. The technique realizes a high-rate oversampling PD while the hardware cost is equivalent to that of a conventional 2x-oversampling clock and data recovery circuit. The first receiver, which exploits the collaborative timing recovery architecture, is designed in 45-nm CMOS technology. A single data lane occupies 0.195 mm^2 and consumes a relatively low 17.9 mW at 6 Gb/s and 1.0 V, giving a power efficiency of 2.98 mW/Gb/s. The simulated jitter is about 0.034 UI RMS for an input jitter of 0.03 UI RMS, while the loop bandwidth with the PD linearization technique stays relatively constant at about 7.3 MHz regardless of the data-stream noise.

    Unlike the first receiver, the second proposed multichannel receiver is designed to reduce the hardware complexity of each lane. The receiver shares calibration logic among channels and yet achieves superior channel expandability with slim data lanes. A shared global calibration control, used in a forwarded-clock receiver based on a multiphase delay-locked loop, accomplishes skew calibration, equalizer adaptation, and the phase lock of all channels during a calibration period, resulting in reduced hardware overhead and less area per data lane. The second forwarded-clock receiver is designed in 90-nm CMOS technology. It achieves error-free eye openings of more than 0.5 UI across 9-28 inch Nelco 4000-6 microstrips at 4-7 Gb/s and more than 0.42 UI at data rates of up to 9 Gb/s. The data lane occupies only 0.152 mm^2 and consumes 69.8 mW, while the rest of the receiver occupies 0.297 mm^2 and consumes 56 mW at a data rate of 7 Gb/s and a supply voltage of 1.35 V.

    Contents:
    1. Introduction: 1.1 Motivations; 1.2 Thesis Organization
    2. Previous Receivers for Serial-Data Communications: 2.1 Classification of the Links; 2.2 Clocking architecture of transceivers; 2.3 Components of receiver; 2.3.1 Channel loss; 2.3.2 Equalizer; 2.3.3 Clock and data recovery circuit; 2.3.3.1 Basic architecture; 2.3.3.2 Phase detector; 2.3.3.2.1 Linear phase detector; 2.3.3.2.2 Binary phase detector; 2.3.3.3 Frequency detector; 2.3.3.4 Charge pump; 2.3.3.5 Voltage controlled oscillator and delay-line; 2.3.4 Loop dynamics of PLL; 2.3.5 Loop dynamics of DLL
    3. The Proposed PLL-Based Receiver with Loop Linearization Technique: 3.1 Introduction; 3.2 Motivation; 3.3 Overview of binary phase detection; 3.4 The proposed BBPD linearization technique; 3.4.1 Architecture of the proposed PLL-based receiver; 3.4.2 Linearization technique of binary phase detection; 3.4.3 Rotational pattern of sampling phase offset; 3.5 PD gain analysis and optimization; 3.6 Loop Dynamics of the 2nd-order CDR; 3.7 Verification with the time-accurate behavioral simulation; 3.8 Summary
    4. The Proposed DLL-Based Receiver with Forwarded-Clock: 4.1 Introduction; 4.2 Motivation; 4.3 Design consideration; 4.4 Architecture of the proposed forwarded-clock receiver; 4.5 Circuit description; 4.5.1 Analog multi-phase DLL; 4.5.2 Dual-input interpolating delay cells; 4.5.3 Dedicated half-rate data samplers; 4.5.4 Cherry-Hooper continuous-time linear equalizer; 4.5.5 Equalizer adaptation and phase-lock scheme; 4.6 Measurement results
    5. Conclusion
    6. Bibliography
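
The 2.98 mW/Gb/s figure quoted for the first receiver is simply lane power divided by data rate. A minimal sketch of that arithmetic follows; the function name `mw_per_gbps` is hypothetical, and the second value is derived here from the quoted 69.8 mW and 7 Gb/s rather than stated in the abstract.

```python
def mw_per_gbps(power_mw: float, rate_gbps: float) -> float:
    """Power efficiency in mW/Gb/s (numerically the same as pJ/bit)."""
    return power_mw / rate_gbps


# First receiver: 17.9 mW per lane at 6 Gb/s (figures from the abstract).
first_rx = mw_per_gbps(17.9, 6.0)

# Second receiver data lane: 69.8 mW at 7 Gb/s. This ratio is NOT quoted
# in the abstract; it is computed here only for comparison.
second_rx_lane = mw_per_gbps(69.8, 7.0)

print(f"first receiver lane:  {first_rx:.2f} mW/Gb/s")
print(f"second receiver lane: {second_rx_lane:.2f} mW/Gb/s")
```

Because 1 mW/Gb/s equals 1 pJ/bit, the same helper can be used to compare entries that report energy per bit instead.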

    Fast Access Data Acquisition System


    Frequency Synthesizer Architectures for UWB MB-OFDM Alliance Application


    고속 DRAM μΈν„°νŽ˜μ΄μŠ€λ₯Ό μœ„ν•œ μ „μ•• 및 μ˜¨λ„μ— λ‘”κ°ν•œ 클둝 νŒ¨μŠ€μ™€ μœ„μƒ 였λ₯˜ ꡐ정기 섀계

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, 2021. 2. 정덕균.

    To cope with problems caused by the high-speed operation of the dynamic random access memory (DRAM) interface, several approaches focused on the clock path of the DRAM are proposed. Two delay-locked loop (DLL) based schemes are proposed: a forwarded-clock (FC) receiver (RX) with a self-tracking loop, and a quadrature error corrector. In addition, an open-loop scheme is presented for drift compensation in the clock distribution; it consumes less power and reduces design complexity.

    The FC RX uses DLLs to compensate for voltage and temperature (VT) drift in unmatched memory interfaces. The self-tracking loop consists of two-stage cascaded DLLs so that it can operate in a DRAM environment. With write training and the proposed DLL, the timing relationship between the data and the sampling clock is always optimal. The proposed scheme compensates for delay drift without relying on data transitions or re-training. The proposed FC RX is fabricated in a 65-nm CMOS process, and its active area, containing 4 data lanes, is 0.0329 mm^2. After write training is completed at a supply voltage of 1 V, the measured timing margin remains larger than 0.31 unit interval (UI) when the supply voltage drifts between 0.94 V and 1.06 V. At a data rate of 6.4 Gb/s, the proposed FC RX achieves an energy efficiency of 0.45 pJ/bit.

    In contrast to the scheme above, an open-loop voltage drift compensation method is proposed to minimize power consumption and occupied area. The overall clock distribution is composed of a current mode logic (CML) path and a CMOS path. In the proposed scheme, the architecture of the CML-to-CMOS converter (C2C) and the inverter is changed to compensate for supply voltage drift. The bias generator provides bias voltages to the C2C and inverters according to the supply voltage for delay adjustment. The proposed clock tree is fabricated in a 40-nm CMOS process and the active area is 0.004 mm^2. When the supply voltage is modulated by a 1 MHz sinusoid with 100 mV peak-to-peak swing around 1.1 V, the proposed scheme reduces the measured root-mean-square (RMS) jitter from 3.77 ps RMS to 1.61 ps RMS. At a 6 GHz output clock, the power consumption of the proposed scheme is 11.02 mW.

    A DLL-based quadrature error corrector (QEC) with a wide correction range is proposed for DRAM whose clocks are distributed over several millimeters. The quadrature error is corrected by adjusting delay lines using information from the phase error detector. The proposed error correction method minimizes the jitter added by phase error correction by setting at least one of the delay lines in the quadrature clock path to the minimum delay. In addition, the asynchronous calibration on-off scheme reduces power consumption after calibration is complete. The proposed QEC is fabricated in a 40-nm CMOS process and has an active area of 0.048 mm^2. It exhibits a wide correctable error range of 101.6 ps, and the remaining phase errors are less than 2.18° for clocks from 0.8 GHz to 2.3 GHz. At 2.3 GHz, the QEC contributes 0.53 ps RMS jitter, and the power consumption is reduced from 8.89 mW to 3.39 mW when the calibration is off.

    Contents:
    Chapter 1 Introduction: 1.1 Motivation; 1.2 Thesis Organization
    Chapter 2 Background on DRAM Interface: 2.1 Overview; 2.2 Memory Interface
    Chapter 3 Background on DLL: 3.1 Overview; 3.2 Building Blocks; 3.2.1 Delay Line; 3.2.2 Phase Detector; 3.2.3 Charge Pump; 3.2.4 Loop filter
    Chapter 4 Forwarded-Clock Receiver with DLL-based Self-tracking Loop for Unmatched Memory Interfaces: 4.1 Overview; 4.2 Proposed Separated DLL; 4.2.1 Operation of the Proposed Separated DLL; 4.2.2 Operation of the Digital Loop Filter in DLL; 4.3 Circuit Implementation; 4.4 Measurement Results; 4.4.1 Measurement Setup and Sequence; 4.4.2 VT Drift Measurement and Simulation
    Chapter 5 Open-loop-based Voltage Drift Compensation in Clock Distribution: 5.1 Overview; 5.2 Prior Works; 5.3 Voltage Drift Compensation Method; 5.4 Circuit Implementation; 5.5 Measurement Results
    Chapter 6 Quadrature Error Corrector with Minimum Total Delay Tracking: 6.1 Overview; 6.2 Prior Works; 6.3 Quadrature Error Correction Method; 6.4 Circuit Implementation; 6.5 Measurement Results
    Chapter 7 Conclusion
    Bibliography
    Abstract (in Korean)

    Doctor of Philosophy

    Dissertation.

    High-speed wireless communication systems (e.g., long-term evolution (LTE), Wi-Fi) operate with high bandwidth and large peak-to-average power ratios (PAPRs). This is largely due to the use of orthogonal frequency division multiplexing (OFDM) modulation, which is prevalent because it maximizes the spectral efficiency of the communication system. The power amplifier (PA) in the transmitter is the dominant energy consumer in the radio, largely because of the PAPR of the input signal. To reduce the energy consumption of the PA, an amplifier that simultaneously achieves high efficiency and high linearity is needed. Furthermore, to lower the cost for high-volume production, it is desirable to achieve complete System-on-Chip (SoC) integration. Linear amplifiers (e.g., Class-A, -B, -AB) are inefficient when amplifying signals with the large PAPR associated with modulation techniques such as those used in LTE and OFDM. Switching amplifiers (e.g., Class-D, -E, -F) are very promising because of their high efficiency compared to their linear counterparts. Linearization techniques for switching amplifiers have been intensively investigated due to their limited sensitivity to the input amplitude of the signal. Deep-submicron CMOS technology is mostly utilized for logic circuitry, and the Moore's law scaling of CMOS optimizes transistors to operate as high-speed, low-loss switches rather than as high-gain transistors. Hence, it is advantageous to use transistors in switching mode as switching amplifiers and to use high-speed digital logic circuitry to implement linearization systems and circuitry.

    In this work, several linearization architectures are investigated and demonstrated. An envelope elimination and restoration (EER) transmitter that comprises a class-E power amplifier and a current modulator controlled by a 10-bit digital-to-analog converter (DAC) is investigated. A pipelined switched-capacitor DAC is designed to control an open-loop transconductor that operates as a current modulator, modulating the amplitude of the current supplied to a class-E PA. Such a topology allows for increased filtering of the quantization noise that is problematic in most digital PAs (DPAs). The proposed quadrature and multiphase architecture can avoid the bandwidth expansion and delay mismatch associated with polar PAs. The multiphase switched-capacitor power amplifier (SCPA) was proposed after the quadrature SCPA, and it significantly improves the power efficiency.

    Sub-Picosecond Jitter Clock Generation for Time Interleaved Analog to Digital Converter

    Multi-GHz analog-to-digital converters (ADCs) are becoming increasingly popular in radar systems, software-defined radio (SDR) and wideband communications, because interleaving many sub-ADCs relaxes the sampling rate of each sub-ADC and so enables a much higher overall operating speed. Although the time-interleaved ADC has issues such as gain mismatch, offset mismatch and timing skew between ADC channels, these deterministic errors can be addressed by techniques from previous works, such as digital calibration. However, time-interleaved ADCs require a precise sample clock to achieve an acceptable effective-number-of-bits (ENOB), which can be degraded by jitter in the sample clock. The clock generation circuits presented in this work achieve sub-picosecond jitter performance in 180 nm CMOS, which is suitable for time-interleaved ADCs. Two different test chips were fabricated in 180 nm CMOS to investigate the low-jitter design technique. The low-jitter delay lines in the two chips were designed in two different ways, but both use the low-jitter design technique. In the first test chip, the measured RMS jitter is 0.1061 ps for each delay stage. The second chip uses the proposed low-jitter delay-locked loop, which operates from 80 MHz to 120 MHz and can therefore provide the time-interleaved ADC with a 2.4 GHz to 3.6 GHz low-jitter sample clock; the measured delay-stage jitter in the second test chip is 0.1085 ps.
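
The link between sample-clock jitter and ENOB mentioned above can be illustrated with the standard jitter-limited SNR relation, SNR = -20 log10(2π f_in σ_j), and ENOB = (SNR - 1.76) / 6.02. The sketch below is illustrative only: the 1.2 GHz input frequency is an assumed value (Nyquist for a 2.4 GS/s converter), and the quoted 0.1085 ps is a per-delay-stage figure, so treating it as the total sampling-clock jitter is an assumption, not a result from the work.

```python
import math


def jitter_limited_snr_db(f_in_hz: float, sigma_j_s: float) -> float:
    """SNR bound (dB) set by RMS aperture jitter for a full-scale sine input."""
    return -20.0 * math.log10(2.0 * math.pi * f_in_hz * sigma_j_s)


def enob_from_snr(snr_db: float) -> float:
    """Effective number of bits for a given SNR in dB."""
    return (snr_db - 1.76) / 6.02


# Ratio of output sample clock to DLL reference, from the quoted ranges.
ratio_low = 2.4e9 / 80e6     # low end of the range
ratio_high = 3.6e9 / 120e6   # high end of the range

# ASSUMPTIONS: 1.2 GHz input tone, and the per-stage jitter of 0.1085 ps
# used as if it were the total RMS jitter on the sampling clock.
snr_db = jitter_limited_snr_db(1.2e9, 0.1085e-12)
enob = enob_from_snr(snr_db)

print(f"clock ratio: {ratio_low:.0f}x to {ratio_high:.0f}x")
print(f"jitter-limited SNR: {snr_db:.1f} dB, ENOB: {enob:.2f} bits")
```

Both ends of the quoted range give the same 30x ratio between the DLL reference and the sample clock. Halving the jitter raises the jitter-limited SNR by about 6 dB, that is, roughly one bit of ENOB.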

    Clocking and Skew-Optimization For Source-Synchronous Simultaneous Bidirectional Links

    The continuous expansion of computing capabilities in mobile devices demands higher I/O bandwidth and dense parallel links supporting higher data rates. High-speed signaling leverages technology advancements to achieve higher data rates but is limited by the bandwidth of the electrical copper channel, which has not scaled accordingly. To meet the continuous data-rate demand, simultaneous bidirectional (SBD) signaling is an attractive alternative to unidirectional signaling, as it can work at lower clock speeds, exhibits better spectral efficiency and provides higher throughput in pad-limited PCBs. For a low-power and more robust system, the SBD transceiver should use a forwarded-clock system and per-pin de-skew circuits to correct the phase difference that develops between the data and the clock. The system can be configured in two roles, master and slave. To save more power, the system should have only one clock generator. The master has its own clock source and shares its clock with the slave through the clock channel, and the slave uses this forwarded clock to deserialize the inbound data and serialize the outbound data. The resulting clock-to-data skew can be corrected with a phase-tracking CDR. This thesis presents a low-power implementation of forwarded clocking and clock-to-data skew optimization for a 40 Gbps SBD transceiver. The design is implemented in 28 nm CMOS technology and consumes 8.8 mW for 20 Gbps NRZ data at a 0.9 V supply. The clocking occupies an area of 0.018 mm^2.