Search CORE

8 research outputs found

A 2-40 Gb/s PAM4/NRZ dual-mode wireline transmitter with 4:1 MUX in 65-nm CMOS

Author: He Y
Lv F
Wang J
Wang Z
Wang Z
Yuan S
Zhang C
Zhao F
Zheng X
Publication venue: 'The Institute of Electronics Engineers of Korea'
Publication date
Field of study

This paper presents a 2-40 Gb/s dual-mode wireline transmitter supporting the four-level pulse amplitude modulation (PAM4) and non-return-to-zero (NRZ) modulation with a multiplexer (MUX)-based two-tap feed-forward equalizer (FFE). An edge-acceleration technique is proposed for the 4:1 MUX to increase the bandwidth. By utilizing a dedicated cascode current source, the output swing can achieve 900 mV with a level deviation of only 0.12% for PAM4. Fabricated in a 65-nm CMOS process, the transmitter consumes 117 mW and 89 mW at 40 Gb/s in PAM4 and NRZ at 1.2 V supply. © 2018, Institute of Electronics Engineers of Korea. All rights reserved

LJMU Research Online (Liverpool John Moores University)

Crossref

A 1.8-pJ/b, 12.5-25-Gb/s wide range all-digital clock and data recovery circuit

Author: Bauwelinck Johan
Moeneclaey Bart
Ramon Hannes
Rombouts Pieter
Torfs Guy
Verbeke Marijn
Yin Xin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018
Field of study

Recently, there has been a strong drive to replace established analog circuits for multi-gigabit clock and data recovery (CDR) by more digital solutions. We focused on phase locked loop-based all-digital CDR (AD-CDR) techniques which contain a digital loop filter (DLF) and a digital controlled oscillator (DCO) and pushed the digital integration up to a level where our DLF is entirely synthesized. To enable this, we found that extensive subsampling can be used to decrease the speed of the DLF while maintaining a good operation. Additionally, an Inverse Alexander phase detector and a 5.5-bit resolution DCO complete the AD-CDR architecture. As a result of the low complexity and digital architecture, the AD-CDR occupies a compact active chip area of 0.050 mm(2) and consumes only 46 mW at 25 Gb/s. This is the smallest area and the lowest power consumption compared with the state-of-the-art. In addition, our implementation is highly tunable due to the synthesized logic, and supports a wide operating range (12.5-25 Gb/s), which is a significantly larger range compared with the previous work. Finally, thanks to our digital architecture, the power dissipation decreases linearly while moving to the lower speeds of our operating range. This is in contrast with the most prior work, making our design truly adaptive

Ghent University Academic Bibliography

A 40-Gb/s serial link transceiver in 28-nm CMOS technology

Author
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date
Field of study

Crossref

State of the art in chip-to-chip interconnects

Author: Olaortua Garcia Andreu
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/06/2021
Field of study

This thesis presents a study of short-range links for chips mounted in the same package, on printed circuit boards or interposers. Implemented in CMOS technology between 7 and 250 nm, with links that operate at a data rate between 0,4 and 112 Gb/s/pin and with energy efficiencies from 0,3 to 67,7 pJ/bit. The links operate on channels with an attenuation lower than 50 dB. A comparison is made with graphical representations between the different articles that shows the correlation between the different essential metrics of chip-to-chip interconnects, as well as its evolution over the last 20 years.Esta tesis presenta un estudio de enlaces de corto alcance para chips montados en un mismo paquete, en placas de circuito impreso o intercaladores. Implementado en tecnología CMOS entre 7 y 250 nm, con enlaces que operan a una velocidad de datos entre 0,4 y 112 Gb/s/pin y con eficiencias energéticas de 0,3 a 67,7 pJ/bit. Los enlaces operan en canales con una atenuación inferior a 50 dB. Se realiza una comparación con representaciones gráficas entre los diferentes artículos que muestra la correlación entre las distintas métricas esenciales de las interconexiones chip a chip, así como su evolución en los últimos 20 años.Aquesta tesi presenta un estudi d'enllaços de curt abast per a xips muntats en el mateix paquet, en plaques de circuits impresos o interposers. Implementat en tecnologia CMOS entre 7 i 250 nm, amb enllaços que funcionen a una velocitat de dades entre 0,4 i 112 Gb/s/pin i amb eficiències energètiques de 0,3 a 67,7 pJ/bit. Els enllaços funcionen en canals amb una atenuació inferior a 50 dB. Es fa una comparació amb representacions gràfiques entre els diferents articles que mostra la correlació entre les diferents mètriques essencials d'interconnexions xip a xip, així com la seva evolució en els darrers 20 anys

UPCommons. Portal del coneixement obert de la UPC

Recommended from our members

Heterogeneous Integration on Silicon-Interconnect Fabric using fine-pitch interconnects (≤10 �m)

Author: Jangam SivaChandra
Publication venue: eScholarship, University of California
Publication date: 01/01/2020
Field of study

Today, the ever-growing data-bandwidth demand is pushing the boundaries of the traditional printed circuit board (PCB) based integration schemes. Moreover, with the apparent saturation of semiconductor scaling, commonly called Moore's law, system scaling warrants a paradigm shift in packaging technologies, assembly techniques, and integration methodologies. In this work, a superior alternative to PCBs called the Silicon-Interconnect Fabric (Si-IF) is investigated. The Si-IF is a silicon-based, package-less, fine-pitch, highly scalable, heterogeneous integration platform for wafer-scale systems. In this technology, unpackaged dielets are assembled on the Si-IF at small inter-dielet spacings (≤100 �m) using fine-pitch (≤10 �m) die-to-substrate interconnects. A novel assembly process using a solder-less direct metal-metal (gold-gold and copper-copper) thermal compression bonding was developed. Using this process, sub-10 �m pitch interconnects with a low specific contact resistance of ≤0.7 Ω-�m2 were successfully demonstrated. Because of the tightly packed Si-IF assembly, the communication links between the neighboring dies are short (≤500 �m) with low loss (≤2 dB), comparable to on-chip connections. Consequently, simple buffers can transfer data between dies using a Simple Universal Parallel intERface for chips (SuperCHIPS) at low latency (<30 ps), low energy per bit (≤0.03 pJ/b), and high data-rates (up to 10 Gbps/link), corresponding to an aggregate bandwidth up to 8 Tbps/mm. The benefits of the SuperCHIPS protocol were experimentally demonstrated to provide 5-90X higher data-bandwidth, 8-30X lower latency, and 5-40X lower energy per bit compared to existing integration schemes. This dissertation addresses the assembly technology and communication protocols of the Si-IF technology

eScholarship - University of California

저전력, 저면적 유선 송수신기 설계를 위한 회로 기술

Author: 배우람
Publication venue: 서울대학교 대학원
Publication date: 01/08/2016
Field of study

학위논문 (박사)-- 서울대학교 대학원 : 전기·컴퓨터공학부, 2016. 8. 정덕균.In this thesis, novel circuit techniques for low-power and area-efficient wireline transceiver, including a phase-locked loop (PLL) based on a two-stage ring oscillator, a scalable voltage-mode transmitter, and a forwarded-clock (FC) receiver based on a delay-locked-loop (DLL) based per-pin deskew, are proposed. At first, a two-stage ring PLL that provides a four-phase, high-speed clock for a quarter-rate TX in order to minimize power consumption is presented. Several analyses and verification techniques, ranging from the clocking architectures for a high-speed TX to oscillation failures in a two-stage ring oscillator, are addressed in this thesis. A tri-state-inverter–based frequency-divider and an AC-coupled clock-buffer are used for high-speed operations with minimal power and area overheads. The proposed PLL fabricated in the 65-nm CMOS technology occupies an active area of 0.009 mm2 with an integrated-RMS-jitter of 414 fs from 10 kHz to 100 MHz while consuming 7.6 mW from a 1.2-V supply at 10 GHz. The resulting figure-of-merit is -238.8 dB, which surpasses that of the state-of-the-art ring-PLLs by 4 dB. Secondly, a voltage-mode (VM) transmitter which offers a wide operation range of 6 to 32 Gb/s, controllable pre-emphasis equalization and output voltage swing without altering output impedance, and a power supply scalability is presented. A quarter-rate clocking architecture is employed in order to maximize the scalability and energy efficiency across the variety of operating conditions. A P-over-N VM driver is used for CMOS compatibility and wide voltage-swing range required for various I/O standards. Two supply regulators calibrate the output impedance of the VM driver across the wide swing and pre-emphasis range. A single phase-locked loop is used to provide a wide frequency range of 1.5-to-8 GHz. The prototype chip is fabricated in 65-nm CMOS technology and occupies active area of 0.48x0.36 mm2. The proposed transmitter achieves 250-to-600-mV single-ended swing and exhibits the energy efficiency of 2.10-to-2.93 pJ/bit across the data rate of 6-to-32 Gb/s. And last, this thesis describes a power and area-efficient FC receiver and includes an analysis of the jitter tolerance of the FC receiver. In the proposed design, jitter tolerance is maximized according to the analysis by employing a DLL-based de-skewing. A sample-swapping bang-bang phase-detector (SS-BBPD) eliminates the stuck locking caused by the finite delay range of the voltage-controlled delay line (VCDL), and also reduces the required delay range of the VCDL by half. The proposed FC receiver is fabricated in 65-nm CMOS technology and occupies an active area of 0.025 mm2. At a data rate of 12.5 Gb/s, the proposed FC receiver exhibits an energy efficiency of 0.36 pJ/bit, and tolerates 1.4-UIpp sinusoidal jitter of 300 MHz.Chapter 1. Introduction 1 1.1. Motivation 1 1.2. Thesis organization 5 Chapter 2. Phase-Locked Loop Based on Two-Stage Ring Oscillator 7 2.1. Overivew 7 2.2. Background and Analysis of a Two-stage Ring Oscillator 11 2.3. Circuit Implementation of The Proposed PLL 25 2.4. Measurement Results 33 Chapter 3. A Scalable Voltage-Mode Transmitter 37 3.1. Overview 37 3.2. Design Considerations on a Scalable Serial Link Transmitter 40 3.3. Circuit Implementation 46 3.4. Measurement Results 56 Chapter 4. Delay-Locked Loop Based Forwarded-Clock Receiver 62 4.1. Overview 62 4.2. Timing and Data Recovery in a Serial Link 65 4.3. DLL-Based Forwarded-Clock Receiver Characteristics 70 4.4. Circuit Implementation 79 4.5. Measurement Results 89 Chapter 5. Conclusion 94 Appendix 96 Appendix A. Design flow to optimize a high-speed ring oscillator 96 Appendix B. Reflection Issues in N-over-N Voltage-Mode Driver 99 Appendix C. Analysis on output swing and power consumption of the P-over-N voltage-mode driver 107 Appendix D. Loop Dynamics of DLL 112 Bibliography 121 Abstract 128Docto

SNU Open Repository and Archive

Low-power subsampling all-digital clock and data recovery techniques for multi-gigabit passive optical networks

Author: Verbeke Marijn
Publication venue
Publication date: 01/01/2018
Field of study

Ghent University Academic Bibliography

Design of High-Speed SerDes Transceiver for Chip-to-Chip Communications in CMOS Process

Author: Zheng Xuqiang
Publication venue
Publication date
Field of study

With the continuous increase of on-chip computation capacities and exponential growth of data-intensive applications, the high-speed data transmission through serial links has become the backbone for modern communication systems. To satisfy the massive data-exchanging requirement, the data rate of such serial links has been updated from several Gb/s to tens of Gb/s. Currently, the commercial standards such as Ethernet 400GbE, InfiniBand high data rate (HDR), and common electrical interface (CEI)-56G has been developing towards 40+ Gb/s. As the core component within these links, the transceiver chipset plays a fundamental role in balancing the operation speed, power consumption, area occupation, and operation range. Meanwhile, the CMOS process has become the dominant technology in modern transceiver chip fabrications due to its large-scale digital integration capability and aggressive pricing advantage. This research aims to explore advanced techniques that are capable of exploiting the maximum operation speed of the CMOS process, and hence provides potential solutions for 40+ Gb/s CMOS transceiver designs. The major contributions are summarized as follows. A low jitter ring-oscillator-based injection-locked clock multiplier (RILCM) with a hybrid frequency tracking loop that consists of a traditional phase-locked loop (PLL), a timing-adjusted loop, and a loop selection state-machine is implemented in 65-nm C-MOS process. In the ring voltage-controlled oscillator, a full-swing pseudo-differential delay cell is proposed to lower the device noise to phase noise conversion. To obtain high operation speed and high detection accuracy, a compact timing-adjusted phase detector tightly combined with a well-matched charge pump is designed. Meanwhile, a lock-loss detection and lock recovery is devised to endow the RILCM with a similar lock-acquisition ability as conventional PLL, thus excluding the initial frequency set- I up aid and preventing the potential lock-loss risk. The experimental results show that the figure-of-merit of the designed RILCM reaches -247.3 dB, which is better than previous RILCMs and even comparable to the large-area LC-ILCMs. The transmitter (TX) and receiver (RX) chips are separately designed and fab- ricated in 65-nm CMOS process. The transmitter chip employs a quarter-rate multi-multiplexer (MUX)-based 4-tap feed-forward equalizer (FFE) to pre-distort the output. To increase the maximum operating speed, a bandwidth-enhanced 4:1 MUX with the capability of eliminating charge-sharing effect is proposed. To produce the quarter-rate parallel data streams with appropriate delays, a compact latch array associated with an interleaved-retiming technique is designed. The receiver chip employs a two-stage continuous-time linear equalizer (CTLE) as the analog front-end and integrates an improved clock data recovery to extract the sampling clocks and retime the incoming data. To automatically balance the jitter tracking and jitter suppression, passive low-pass filters with adaptively-adjusted bandwidth are introduced into the data-sampling path. To optimize the linearity of the phase interpolation, a time-averaging-based compensating phase interpolator is proposed. For equalization, a combined TX-FFE and RX-CTLE is applied to compensate for the channel loss, where a low-cost edge-data correlation-based sign zero-forcing adaptation algorithm is proposed to automatically adjust the TX-FFE’s tap weights. Measurement results show that the fabricated transmitter/receiver chipset can deliver 40 Gb/s random data at a bit error rate of 16 dB loss at the half-baud frequency, while consuming a total power of 370 mW

University of Lincoln Institutional Repository