222 research outputs found

    Design of a high-density, low-power transceiver for next-generation HBM

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, August 2020. Deog-Kyoon Jeong. This thesis presents design techniques for a high-density, power-efficient transceiver for next-generation high-bandwidth memory (HBM). Unlike other memory interfaces, HBM uses a 3D-stacked package with through-silicon vias (TSVs) and a silicon interposer, so its transceiver must solve the problems that the 3D-stacked package and the TSVs introduce. First, a data (DQ) receiver for HBM is proposed with a self-tracking loop that tracks the phase skew between DQ and the data strobe (DQS) caused by voltage or thermal drift. The self-tracking loop achieves low power and small area by using an analog-assisted baud-rate phase detector. The proposed pulse-to-charge (PC) phase detector (PD) converts the phase skew into a voltage difference and detects the skew from that difference. An offset calibration scheme that compensates for mismatch in the PD is also proposed; it needs no additional sensing circuits because it takes advantage of the write training of HBM. Fabricated in 65 nm CMOS, the DQ receiver shows a power efficiency of 370 fJ/b at 4.8 Gb/s and occupies 0.0056 mm². Experimental results show that the DQ receiver operates without performance degradation under a ±10% supply variation. In a second prototype IC, a high-density transceiver for HBM with a feed-forward equalizer (FFE)-combined crosstalk (XT) cancellation scheme is presented. To compensate for the XT, the transmitter pre-distorts the amplitude of the FFE output according to the XT. Because the proposed XT cancellation (XTC) scheme reuses the FFE already implemented to equalize the channel loss, the additional circuitry for XTC is minimized. Thanks to the XTC scheme, the channel pitch can be reduced significantly, allowing high channel density.
    Moreover, the 3D-staggered channel structure removes the ground layer between vertically adjacent channels, which further reduces the cross-sectional area of the channel per lane. The test chip, which includes 6 data lanes, is fabricated in 65 nm CMOS technology. The 6-mm channels are implemented on chip to emulate the silicon interposer between the HBM and the processor. The operation of the XTC scheme is verified by simultaneously transmitting 4-Gb/s data over 6 consecutive channels with 0.5-µm pitch, and the XTC scheme reduces the XT-induced jitter by up to 78%. Measurement results show that the transceiver achieves a throughput of 8 Gb/s/µm. The transceiver occupies 0.05 mm² for 6 lanes and consumes 36.6 mW at 6 x 4 Gb/s.
    Contents:
    Chapter 1. Introduction: 1.1 Motivation; 1.2 Thesis Organization
    Chapter 2. Background on High-Bandwidth Memory: 2.1 Overview; 2.2 Transceiver Architecture; 2.3 Read/Write Operation (2.3.1 Read Operation; 2.3.2 Write Operation)
    Chapter 3. Background on Coupled Wires: 3.1 Generalized Model; 3.2 Effect of Crosstalk
    Chapter 4. DQ Receiver with Baud-Rate Self-Tracking Loop: 4.1 Overview; 4.2 Features of DQ Receiver for HBM; 4.3 Proposed Pulse-to-Charge Phase Detector (4.3.1 Operation of Pulse-to-Charge Phase Detector; 4.3.2 Offset Calibration; 4.3.3 Operation Sequence); 4.4 Circuit Implementation; 4.5 Measurement Result
    Chapter 5. High-Density Transceiver for HBM with 3D-Staggered Channel and Crosstalk Cancellation Scheme: 5.1 Overview; 5.2 Proposed 3D-Staggered Channel (5.2.1 Implementation of 3D-Staggered Channel; 5.2.2 Channel Characteristics and Modeling); 5.3 Proposed Feed-Forward-Equalizer-Combined Crosstalk Cancellation Scheme; 5.4 Circuit Implementation (5.4.1 Overall Architecture; 5.4.2 Transmitter with FFE-Combined XTC; 5.4.3 Receiver); 5.5 Measurement Result
    Chapter 6. Conclusion
    Bibliography
    Abstract in Korean
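The headline figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the helper names are made up here, and it assumes that the quoted throughput density is the per-lane data rate divided by the channel pitch, which is a reading of the abstract and not something it states.

```python
# Figures quoted in the abstract above; the helper names are illustrative.
LANES = 6
RATE_PER_LANE_GBPS = 4.0   # second prototype, per-lane data rate
CHANNEL_PITCH_UM = 0.5     # on-chip channel pitch
TOTAL_POWER_MW = 36.6      # all 6 lanes at 6 x 4 Gb/s


def energy_per_bit_pj(power_mw: float, rate_gbps: float) -> float:
    """Energy per bit in pJ/b: mW divided by Gb/s is pJ per bit."""
    return power_mw / rate_gbps


def throughput_density(rate_gbps: float, pitch_um: float) -> float:
    """Assumed definition: per-lane rate per micrometre of pitch (Gb/s/um)."""
    return rate_gbps / pitch_um


def power_mw(energy_fj_per_bit: float, rate_gbps: float) -> float:
    """Power in mW: fJ/b times Gb/s is uW, and dividing by 1000 gives mW."""
    return energy_fj_per_bit * rate_gbps / 1000.0


trx_energy = energy_per_bit_pj(TOTAL_POWER_MW, LANES * RATE_PER_LANE_GBPS)
density = throughput_density(RATE_PER_LANE_GBPS, CHANNEL_PITCH_UM)
rx_power = power_mw(370.0, 4.8)  # first prototype: 370 fJ/b at 4.8 Gb/s

print(f"transceiver energy: {trx_energy:.3f} pJ/b")
print(f"throughput density: {density:.1f} Gb/s/um")
print(f"DQ receiver power:  {rx_power:.3f} mW")
```

Under these assumptions the six-lane transceiver works out to roughly 1.5 pJ/b, the density matches the reported 8 Gb/s/µm, and the DQ receiver figure corresponds to a little under 2 mW.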

    Analog Baseband Filters and Mixed Signal Circuits for Broadband Receiver Systems

    Data transfer rates of communication systems continue to rise, fueled by aggressive demand for voice, video and Internet data. Device scaling enabled by modern lithography has paved the way for System-on-Chip solutions integrating compute-intensive digital signal processing. This trend, coupled with demand for low-power, battery-operated consumer devices, offers extensive research opportunities in analog and mixed-signal designs that enable modern communication systems. The first part of the research deals with broadband wireless receivers. To gain insight, we quantify the impact of undesired out-of-band blockers on the analog baseband in a broadband radio. We present a systematic evaluation of the dynamic range requirements at the baseband and A/D conversion boundary. A prototype UHF receiver, designed in 0.18 µm RFCMOS technology to support this research, integrates a hybrid continuous- and discrete-time analog baseband along with the RF front-end. The chip consumes 120 mW from a 1.8 V/2.5 V dual supply and achieves a noise figure of 7.9 dB and an IIP3 of -8 dBm (+2 dBm) at maximum gain (at 9 dB RF attenuation). High-linearity active RC filters are indispensable in wireless radios. A novel feed-forward OTA applicable to active RC filters in the analog baseband is presented. Simulation results from the chip prototype designed in 0.18 µm RFCMOS technology show an improvement in out-of-band linearity that translates to increased dynamic range in the presence of strong adjacent blockers. The second part of the research presents an adaptive clock-recovery system suitable for high-speed wireline transceivers. The main objective is to improve the trade-off between jitter tracking and jitter filtering in serial-link clock-recovery applications. A digital state machine that enables the proposed mixed-signal adaptation solution to achieve this objective is presented.
The advantages of the proposed mixed-signal solution operating at 10 Gb/s are supported by experimental results from the prototype in 0.18 µm RFCMOS technology.

    Design of energy-efficient high-speed wireline transceiver

    Energy efficiency has become the most important performance metric of integrated circuits used in many applications ranging from mobile devices to high-performance processors. The power problem permeates both computing and communication systems alike. Especially in the era of Big Data, continuously growing demand for higher communication bandwidth is driving the need for energy-efficient high-speed I/O serial links. However, the rate at which the energy efficiency of serial links is improving is much slower than the rate at which the required data transfer bandwidth is increasing. This dissertation explores two design approaches for energy-efficient communication systems. The first design approach maximizes the energy efficiency of a transceiver without any performance loss, and as a prototype, a source-synchronous multi-Gb/s transceiver that achieves excellent energy efficiency lower than 0.3pJ/bit is presented. To this end, the proposed transceiver employs aggressive supply voltage scaling, and multiplexed transmitter and receiver synchronized by low-rate multi-phase clocks are adopted to achieve high data rate even at a supply voltage close to the device threshold voltage. Phase spacing errors resulting from device mismatches are corrected using a self-calibration scheme. The proposed phase calibration method uses a single digital delay-locked loop (DLL) for calibrating all the phases, which makes the calibration process insensitive to the supply voltage level. Thanks to this technique, the proposed multi-Gb/s transceiver operates robustly and energy-efficiently at a very low supply voltage. Fabricated in a 65nm CMOS process, the energy efficiency and data rate of the prototype transceiver vary from 0.29pJ/bit to 0.58pJ/bit and 1Gb/s to 6Gb/s, respectively, as the supply voltage is varied from 0.45V to 0.7V. 
In the second approach, observing that the data traffic in a real system is bursty, a full-rate burst-mode transceiver that achieves rapid on/off operation needed for energy-proportional systems is presented. By injecting input data edges into the oscillator embedded in a classical type-II digital clock and data recovery (CDR) circuit, the proposed receiver achieves instantaneous phase-locking and input jitter filtering simultaneously. In other words, the proposed CDR combines the advantages of conventional feed-forward and feedback architectures to achieve energy-proportional operation. By controlling the number of data edges injected into the oscillator, both the jitter transfer bandwidth and the jitter tolerance corner are accurately controlled. The feedback loop also corrects for any frequency error and helps improve the CDR's immunity to oscillator frequency drift during the power-on and -off states. This also improves the CDR's tolerance to consecutive identical digits present in the input data. Fabricated in a 90nm CMOS process, the prototype receiver instantaneously locks onto the very first data edge and consumes 6.1mW at 2.2Gb/s. Owing to its short power-on time, the overall transceiver's energy efficiency varies only from 5.4pJ/bit to 10.7pJ/bit when the effective data rate is varied from 2.2Gb/s to 0.22Gb/s
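To see what the burst-mode figures imply, the sketch below converts the reported energy per bit into average power and compares it with a hypothetical always-on link. The always-on case is an assumption introduced purely for contrast (a link that keeps drawing its full-rate power while carrying only 0.22 Gb/s); it is not a measurement from the dissertation.

```python
def avg_power_mw(energy_pj_per_bit: float, effective_rate_gbps: float) -> float:
    """Average power in mW: pJ/bit times Gb/s is mW."""
    return energy_pj_per_bit * effective_rate_gbps


FULL_RATE_GBPS = 2.2   # effective data rate at full utilization
LOW_RATE_GBPS = 0.22   # effective data rate at one tenth of that

# Reported transceiver energy efficiency at the two effective data rates.
p_full = avg_power_mw(5.4, FULL_RATE_GBPS)
p_burst = avg_power_mw(10.7, LOW_RATE_GBPS)

# Hypothetical always-on link: full-rate power spent on low-rate traffic.
always_on_energy = p_full / LOW_RATE_GBPS

print(f"average power at 2.2 Gb/s:  {p_full:.2f} mW")
print(f"average power at 0.22 Gb/s: {p_burst:.3f} mW")
print(f"always-on energy at 0.22 Gb/s: {always_on_energy:.1f} pJ/bit")
```

With these numbers, the burst-mode transceiver's energy per bit roughly doubles for a tenfold drop in effective data rate, whereas the assumed always-on link would degrade tenfold.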

    Digital Centric Multi-Gigabit SerDes Design and Verification

    Advances in semiconductor manufacturing continue to shrink feature sizes and allow ever higher degrees of integration in application-specific integrated circuits (ASICs). The bandwidth requirements on the external interfaces of such systems on chip (SoCs) are therefore growing steadily. Yet, because the number of pins on these ASICs is not increasing at the same pace (known as pin limitation), the bandwidth per pin has to be increased. SerDes (Serializer/Deserializer) technology, which transfers data serially at very high data rates of 25 Gbps and more, is a key technology for overcoming pin limitation and exploiting the computing power available in today's SoCs. Because such SerDes blocks, together with the digital logic interfacing them, form complex mixed-signal systems, verifying performance and functional correctness is very challenging. In this thesis, a novel mixed-signal design methodology is proposed that tightly couples model and implementation in order to ensure consistency throughout the design cycles and thereby accelerate the overall implementation flow. A tool flow is presented that integrates well into state-of-the-art electronic design automation (EDA) environments and enables the use of this methodology in practice. Further, the design space of today's high-speed serial links is analyzed, and an architecture is proposed that pushes complexity into the digital domain in order to achieve robustness, portability between manufacturing processes, and scaling with advanced technology nodes. The all-digital phase-locked loop (PLL) and clock and data recovery (CDR) circuits that were developed are described in detail. The developed design flow was used to implement the SerDes architecture in a 28 nm silicon process and proved to be indispensable for future projects.

    Adaptive Receiver Design for High Speed Optical Communication

    Conventional input/output (IO) links consume power regardless of changes in the bandwidth demand of the system they are deployed in. Because the system is designed to satisfy the peak bandwidth demand, the IO links are idle most of the time but still consume power. In big data centers, the overall utilization ratio of IO links is less than 10%, which corresponds to a large amount of energy wasted on idle operation. This work demonstrates a 60 Gb/s high-sensitivity non-return-to-zero (NRZ) optical receiver in 14 nm FinFET technology with less than 7 ns power-on time. The power-on time includes data detection, analog bias settling, photodiode DC current cancellation, and phase locking by the clock and data recovery (CDR) circuit. The receiver autonomously detects the data demand on the link via a proposed link protocol and does not require any external enable or disable signals. The proposed link protocol is designed to minimize the off-state power consumption and power-on time of the link. In order to achieve a high data rate and high sensitivity while staying within the power budget, a 1-tap decision feedback equalization method is applied in the digital domain. The sensitivity is measured to be -8 dBm, -11 dBm, and -13 dBm OMA (optical modulation amplitude) at 60 Gb/s, 48 Gb/s, and 32 Gb/s data rates, respectively. The energy efficiency in always-on mode is around 2.2 pJ/bit for all data rates with the help of supply and bias scaling. The receiver incorporates a phase-interpolator-based CDR circuit with a jitter-tolerance corner frequency of approximately 80 MHz, thanks to the low-latency, full-custom CDR logic design. This work demonstrates the fastest reported CMOS optical receiver, running at almost twice the data rate of the state-of-the-art CMOS optical receiver at the time of publication. The data rate is comparable to that of BiCMOS optical receivers, but at a fraction of the power consumption.
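The abstract mentions a 1-tap decision feedback equalizer applied in the digital domain without describing its implementation. As a rough illustration of the general technique only, here is the textbook form: subtract a scaled copy of the previous decision from each sample before slicing. The toy channel, the tap value `H1`, and all function names are assumptions made for this sketch and do not come from the receiver described above.

```python
from typing import List


def slicer(value: float) -> int:
    """Decide an NRZ symbol: +1 for non-negative input, -1 otherwise."""
    return 1 if value >= 0.0 else -1


def dfe_1tap(samples: List[float], h1: float) -> List[int]:
    """Subtract h1 times the previous decision before slicing each sample."""
    decisions: List[int] = []
    previous = 0  # assume no symbol precedes the first sample
    for x in samples:
        decision = slicer(x - h1 * previous)
        decisions.append(decision)
        previous = decision
    return decisions


# Toy channel (an assumption): each received sample is the sent symbol plus
# H1 times the previous symbol, with H1 exaggerated so that ISI closes the eye.
H1 = 1.2
sent = [1, -1, -1, 1, 1, -1]
received = [s + H1 * p for s, p in zip(sent, [0] + sent[:-1])]

without_dfe = [slicer(x) for x in received]
with_dfe = dfe_1tap(received, H1)

print("sent:       ", sent)
print("without DFE:", without_dfe)
print("with DFE:   ", with_dfe)
```

In this noise-free toy case the plain slicer makes errors wherever the previous symbol's residue outweighs the current symbol, while the feedback tap removes that residue. A real receiver at 60 Gb/s would also have to meet the feedback timing, which this sketch ignores.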

    Toward realizing power scalable and energy proportional high-speed wireline links

    Growing computational demand and proliferation of cloud computing has placed high-speed serial links at the center stage. Due to saturating energy efficiency improvements over the last five years, increasing the data throughput comes at the cost of power consumption. Conventionally, serial link power can be reduced by optimizing individual building blocks such as output drivers, receiver, or clock generation and distribution. However, this approach yields very limited efficiency improvement. This dissertation takes an alternative approach toward reducing the serial link power. Instead of optimizing the power of individual building blocks, power of the entire serial link is reduced by exploiting serial link usage by the applications. It has been demonstrated that serial links in servers are underutilized. On average, they are used only 15% of the time, i.e. these links are idle for approximately 85% of the time. Conventional links consume power during idle periods to maintain synchronization between the transmitter and the receiver. However, by powering-off the link when idle and powering it back when needed, power consumption of the serial link can be scaled proportionally to its utilization. This approach of rapid power state transitioning is known as the rapid-on/off approach. For the rapid-on/off to be effective, ideally the power-on time, off-state power, and power state transition energy must all be close to zero. However, in practice, it is very difficult to achieve these ideal conditions. Work presented in this dissertation addresses these challenges. When this research work was started (2011-12), there were only a couple of research papers available in the area of rapid-on/off links. Systematic study or design of a rapid power state transitioning in serial links was not available in the literature. 
Since rapid-on/off with nanosecond granularity is not a standard in any wireline communication, even popular test equipment does not support testing such a feature, nor was any formal measurement methodology available. All these circumstances made the beginning difficult. However, these challenges provided a unique opportunity to explore new architectural techniques and identify trade-offs. The key contributions of this dissertation are as follows. The first and foremost contribution is understanding the underlying limitations of saturating energy efficiency improvements in serial links and why there is a compelling need to find alternative ways to reduce the serial link power. The second contribution is to identify potential power-saving techniques and evaluate the challenges they pose and the opportunities they present. The third contribution is the design of a 5Gb/s transmitter with a rapid-on/off feature. The transmitter achieves rapid-on/off capability in the voltage-mode output driver by using a fast digital regulator, and in the clock multiplier by accurate frequency pre-setting and periodic reference insertion. To ease timing requirements, an improved edge replacement logic circuit for the clock multiplier is proposed. Mathematical modeling of power-on time as a function of various circuit parameters is also discussed. The proposed transmitter demonstrates energy-proportional operation over wide variations of link utilization and is, therefore, suitable for energy-efficient links. Fabricated in 90nm CMOS technology, the voltage-mode driver and the clock multiplier achieve power-on times of only 2ns and 10ns, respectively. This dissertation highlights a key trade-off in the clock multiplier architecture: fast power-on-lock capability is achieved at the cost of jitter performance. The fourth contribution is the design of a 7GHz rapid-on/off LC-PLL based clock multiplier.
The phase locked loop (PLL) based multiplier was developed to overcome the limitations of the MDLL based approach. The proposed temperature-compensated LC-PLL achieves power-on-lock in 1ns. The fifth and biggest contribution of this dissertation is the design of a 7Gb/s embedded clock transceiver, which achieves rapid-on/off capability in the LC-PLL, current-mode transmitter and receiver. It was the first reported design of a complete transceiver, with an embedded clock architecture, having rapid-on/off capability. A background phase calibration technique in the PLL and CDR phase calibration logic in the receiver enable instantaneous lock on power-on. The proposed transceiver demonstrates power scalability over a wide range of link utilization and, therefore, helps in improving overall system efficiency. Fabricated in 65nm CMOS technology, the 7Gb/s transceiver achieves power-on-lock in less than 20ns. The transceiver achieves power scaling by 44x (63.7mW-to-1.43mW) and energy efficiency degradation by only 2.2x (9.1pJ/bit-to-20.5pJ/bit), when the effective data rate (link utilization) changes by 100x (7Gb/s-to-70Mb/s). The sixth and final contribution is the design of a temperature sensor to compensate for the frequency drifts due to temperature variations, during long power-off periods, in the fast power-on-lock LC-PLL. The proposed self-referenced VCO-based temperature sensor is designed with all-digital logic gates and achieves low supply sensitivity. This sensor is suitable for integration in processor and DRAM environments. The proposed sensor works on the principle of directly converting temperature information to frequency and finally to digital bits. A novel sensing technique is proposed in which temperature information is acquired by creating a threshold voltage difference between the transistors used in the oscillators.
Reduced supply sensitivity is achieved by employing junction capacitance, and the overhead of voltage regulators and an external ideal reference frequency is avoided. The effect of VCO phase noise on the sensor resolution is mathematically evaluated. Fabricated in a 65nm CMOS process, the prototype can operate with a supply ranging from 0.85V to 1.1V, and it achieves a supply sensitivity of 0.034 °C/mV and an inaccuracy of ±0.9 °C and ±2.3 °C from 0 to 100 °C after 2-point calibration, with and without static nonlinearity correction, respectively. It achieves a resolution of 0.3 °C, a resolution FoM of 0.3 (nJ/conv)·res², and a measurement (conversion) time of 6.5 µs.
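The power-scaling figures reported for the 7Gb/s transceiver can be checked against each other, and against an ideal energy-proportional link, with the short sketch below. The ideal link (power exactly proportional to utilization) is a reference point assumed here for comparison. The resulting overhead lumps together off-state power and transition energy and is not a number reported in the dissertation.

```python
P_FULL_MW = 63.7       # reported power at 7 Gb/s
P_LOW_MW = 1.43        # reported power at 70 Mb/s effective data rate
RATE_FULL_GBPS = 7.0
RATE_LOW_GBPS = 0.07


def energy_per_bit_pj(power_mw: float, rate_gbps: float) -> float:
    """Energy per bit in pJ: mW divided by Gb/s."""
    return power_mw / rate_gbps


e_full = energy_per_bit_pj(P_FULL_MW, RATE_FULL_GBPS)
e_low = energy_per_bit_pj(P_LOW_MW, RATE_LOW_GBPS)
power_scaling = P_FULL_MW / P_LOW_MW
energy_degradation = e_low / e_full

# Assumed reference: an ideal link whose power tracks utilization exactly.
utilization = RATE_LOW_GBPS / RATE_FULL_GBPS
ideal_low_power_mw = P_FULL_MW * utilization
overhead_mw = P_LOW_MW - ideal_low_power_mw

print(f"energy per bit: {e_full:.2f} -> {e_low:.2f} pJ/bit")
print(f"power scaling: {power_scaling:.1f}x")
print(f"energy degradation: {energy_degradation:.2f}x")
print(f"gap to ideal proportional link: {overhead_mw:.3f} mW")
```

The computed values agree with the reported 44x power scaling and roughly 2.2x energy degradation to within rounding, and the gap to the ideal link at 1% utilization comes out below 1 mW.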

    A 2.5 Gb/s SONET clock and data recovery macro cell

    Thesis (M.S.) -- Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1994. Includes bibliographical references (leaves 109-110). By Andrew Holden Dickson.

    Digital Intensive Mixed Signal Circuits with In-situ Performance Monitors

    University of Minnesota Ph.D. dissertation. November 2016. Major: Electrical/Computer Engineering. Advisor: Chris Kim. 1 computer file (PDF); x, 137 pages. Digital-intensive circuit design techniques for mixed-signal systems such as data converters, clock generators, and voltage regulators are gaining attention for the implementation of modern microprocessors and systems-on-chip (SoCs), because they fully utilize the benefits of CMOS technology scaling. Moreover, performance improvement schemes such as noise reduction, spur cancellation, and linearity improvement can be performed easily in the digital domain. In addition, the increasing speed and complexity of modern SoCs call for in-situ measurement schemes, primarily for high-volume testing. In-situ measurements not only obviate the need for expensive measurement equipment and probing techniques, but also reduce the test time significantly when a large number of chips must be tested. Several digital-intensive circuit design techniques are proposed in this dissertation, along with different in-situ performance monitors for a variety of mixed-signal systems. First, a novel beat frequency quantization technique is proposed in a two-step VCO-quantizer-based ADC implementation for direct digital conversion of low-amplitude bio-potential signals. Direct conversion removes the need for the area- and power-consuming analog front-end (AFE) used in conventional ADC designs. This prototype design is realized in a 65nm CMOS technology. The measured SNDR is 44.5dB for a 10mVpp, 300Hz signal, and the power consumption is only 38 µW. Next, three different clock generation circuits are presented: a phase-locked loop (PLL), a multiplying delay-locked loop (MDLL), and a frequency-locked loop (FLL). First, a 0.4-to-1.6GHz sub-sampling fractional-N all-digital PLL architecture is discussed that utilizes a D-flip-flop as a digital sub-sampler.
Measurement results from a 65nm CMOS test chip show 5dB lower phase noise at 100kHz offset frequency, compared to a conventional architecture. The digital PLL (DPLL) architecture is further extended to a digital MDLL implementation in order to suppress the VCO phase noise beyond the DPLL bandwidth. A zero-offset aperture phase detector (APD) and a digital-to-time converter (DTC) are employed for static phase-offset (SPO) cancellation. A unique in-situ detection circuit achieves high-resolution SPO measurement in the time domain. A 65nm test chip shows a 0.2-to-1.45GHz output frequency range while reducing the phase noise by 9dB compared to a DPLL. Next, a frequency-to-current converter (FTC) based fractional FLL is proposed for low-accuracy clock generation in an extremely small area for IoT applications. High-density deep-trench capacitors are used for area reduction. The test chip is fabricated in a 32nm SOI technology and takes only 0.0054 mm² of active area. A high-resolution in-situ period jitter measurement block is also incorporated in this design. Finally, a time-based digital low-dropout (DLDO) regulator architecture is proposed for fine-grain power delivery over a wide load current dynamic range and input/output voltage range in order to facilitate dynamic voltage and frequency scaling (DVFS). A high-resolution beat frequency detector dynamically adjusts the loop sampling frequency to reduce the ripple and settling time caused by load transients. A fixed steady-state voltage offset provides inherent active voltage positioning (AVP) for ripple reduction. Circuit simulations in a 65nm technology show more than 90% current efficiency for a 100X load current variation, and the regulator can operate over an input voltage range of 0.6V to 1.2V.
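For readers who think in bits, the 44.5dB SNDR quoted for the VCO-based ADC can be converted to an effective number of bits with the standard sine-wave relation ENOB = (SNDR - 1.76) / 6.02. The dissertation abstract does not quote an ENOB, so the value below is a derived convenience figure, stated relative to the 10mVpp input at which the SNDR was measured.

```python
def enob_from_sndr(sndr_db: float) -> float:
    """Effective number of bits from SNDR in dB (standard sine-wave relation)."""
    return (sndr_db - 1.76) / 6.02


enob = enob_from_sndr(44.5)  # SNDR reported for the 10 mVpp, 300 Hz input
print(f"ENOB: {enob:.2f} bits")
```

This comes out at roughly 7 bits.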

    Power-Proportional Optical Links

    The continuous increase in data transfer rate in short-reach links, such as chip-to-chip and between servers within a data-center, demands high-speed links. As power efficiency becomes ever more important in these links, power-efficient optical links need to be designed. Power efficiency in a link can be achieved by enabling power-proportional communication over the serial link. In power-proportional links, the power dissipated by a link is proportional to the amount of data communicated. Normally, data-rate demand is not constant, and the peak data-rate is not required all the time. If a link is not adapted according to the data-rate demand, there will be a fixed power dissipation, and the power efficiency of the link will degrade during the sub-maximal link utilization. Adapting links to real-time data-rate requirements reduces power dissipation. Power proportionality is achieved by scaling the power of the serial link linearly with the link utilization, and techniques such as variable data-rate and burst-mode can be adopted for this purpose. Links whose data rate (and hence power dissipation) can be varied in response to system demands are proposed in this work. Past works have presented rapidly reconfigurable bandwidth in variable data-rate receivers, allowing lower power dissipation for lower data-rate operation. However, maintaining synchronization during reconfiguration was not possible since previous approaches have introduced changes in front-end delay when they are reconfigured. This work presents a technique that allows rapid bandwidth adjustment while maintaining a near-constant delay through the receiver suitable for a power-scalable variable data-rate optical link. Measurements of a fabricated integrated circuit (IC) show nearly constant energy per bit across a 2× variation in data rate while introducing less than 10% of a unit interval (UI) of delay variation.
With continuously increasing data communication in data-centers, parallel optical links with ever-increasing per-lane data rates are being used to meet overall throughput demands. Simultaneously, power efficiency is becoming increasingly important for these links since they do not transmit useful data all the time. The burst-mode solution for vertical-cavity surface-emitting laser (VCSEL)-based point-to-point communication can be used to improve the links' energy efficiency during low link activity. The burst-mode technique for VCSEL-based links has not yet been deployed commercially. Past works have presented burst-mode solutions for single-channel receivers, allowing lower power dissipation during low link activity, and solutions for fast activation of the receivers. This work, in contrast, presents a novel technique that allows rapid activation of a front-end and fast locking of a clock-and-data-recovery (CDR) circuit for a multi-channel parallel link, utilizing opportunities arising from the parallel nature of many VCSEL-based links. The idea has been demonstrated through electrical and optical measurements of a fabricated IC at 10 Gbps, which show fast data detection and activation of the circuitry within 49 UIs while allowing the front-end to achieve better energy efficiency during low link activity. Simulation results are also presented in support of the proposed technique, which allows the CDR to lock within 26 UIs from when it is powered on.
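Activation and lock times in this abstract are given in unit intervals (UIs). A small conversion helper, sketched below with a made-up function name, turns them into absolute time. The 49-UI figure is stated for 10 Gbps, whereas applying the same rate to the simulated 26-UI lock time is an assumption, since the abstract does not give the data rate of that simulation.

```python
def ui_to_ns(n_ui: float, rate_gbps: float) -> float:
    """Convert unit intervals to nanoseconds: one UI lasts 1/rate (ns at Gb/s)."""
    return n_ui / rate_gbps


activation_ns = ui_to_ns(49, 10.0)  # measured at 10 Gbps per the abstract
lock_ns = ui_to_ns(26, 10.0)        # assumes the simulation also ran at 10 Gbps

print(f"activation within {activation_ns:.1f} ns")
print(f"CDR lock within {lock_ns:.1f} ns (under the assumed data rate)")
```

At 10 Gbps one UI is 100 ps, so 49 UIs correspond to 4.9 ns.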