Search CORE

19 research outputs found

High Speed Reconfigurable NRZ/PAM4 Transceiver Design Techniques

Author: Roshan Zamir Ashkan
Publication venue
Publication date: 17/01/2019
Field of study

While the majority of wireline standards use simple binary non-return-to-zero (NRZ) signaling, four-level pulse-amplitude modulation (PAM4) standards are emerging to increase bandwidth density. This dissertation proposes efficient implementations for high speed NRZ/PAM4 transceivers. The first prototype includes a dual-mode NRZ/PAM4 serial I/O transmitter which can support both modulations with minimum power and hardware overhead. A source-series-terminated (SST) transmitter achieves 1.2Vpp output swing and employs lookup table (LUT) control of a 31-segment output digital-to-analog converter (DAC) to implement 4/2-tap feed-forward equalization (FFE) in NRZ/PAM4 modes, respectively. Transmitter power is improved with low-overhead analog impedance control in the DAC cells and a quarter-rate serializer based on a tri-state inverter-based mux with dynamic pre-driver gates. The transmitter is designed to work with a receiver that implements an NRZ/PAM4 decision feedback equalizer (DFE) that employs 1 finite impulse response (FIR) and 2 infinite impulse response (IIR) taps for first post-cursor and long-tail ISI cancellation, respectively. Fabricated in GP 65-nm CMOS, the transmitter occupies 0.060mm² area and achieves 16Gb/s NRZ and 32Gb/s PAM4 operation at 10.4 and 4.9 mW/Gb/s while operating over channels with 27.6 and 13.5dB loss at Nyquist, respectively. The second prototype presents a 56Gb/s four-level pulse amplitude modulation (PAM4) quarter-rate wireline receiver which is implemented in a 65nm CMOS process. The frontend utilize a single stage continuous time linear equalizer (CTLE) to boost the main cursor and relax the pre-cursor cancelation requirement, requiring only a 2-tap pre-cursor feed-forward equalization (FFE) on the transmitter side. A 2-tap decision feedback equalizer (DFE) with one finite impulse response (FIR) tap and one infinite impulse response (IIR) tap is employed to cancel first post-cursor and longtail inter-symbol interference (ISI). The FIR tap direct feedback is implemented inside the CML slicers to relax the critical timing of DFE and maximize the achievable data-rate. In addition to the per-slice main 3 data samplers, an error sampler is utilized for background threshold control and an edge-based sampler performs both PLL-based CDR phase detection and generates information for background DFE tap adaptation. The receiver consumes 4.63mW/Gb/s and compensates for up to 20.8dB loss when operated with a 2- tap FFE transmitter. The experimental results and comparison with state-of-the-art shows superior power efficiency of the presented prototypes for similar data-rate and channel loss. The usage of proposed design techniques are not limited to these specific prototypes and can be applied for any wireline transceiver with different modulation, data-rate and CMOS technology

Texas A&M Repository

Energy-Efficient Receiver Design for High-Speed Interconnects

Author: Chen Kuan-Chang
Publication venue
Publication date: 01/01/2022
Field of study

High-speed interconnects are of vital importance to the operation of high-performance computing and communication systems, determining the ultimate bandwidth or data rates at which the information can be exchanged. Optical interconnects and the employment of high-order modulation formats are considered as the solutions to fulfilling the envisioned speed and power efficiency of future interconnects. One common key factor in bringing the success is the availability of energy-efficient receivers with superior sensitivity. To enhance the receiver sensitivity, improvement in the signal-to-noise ratio (SNR) of the front-end circuits, or equalization that mitigates the detrimental inter-symbol interference (ISI) is required. In this dissertation, architectural and circuit-level energy-efficient techniques serving these goals are presented. First, an avalanche photodetector (APD)-based optical receiver is described, which utilizes non-return-to-zero (NRZ) modulation and is applicable to burst-mode operation. For the purposes of improving the overall optical link energy efficiency as well as the link bandwidth, this optical receiver is designed to achieve high sensitivity and high reconfiguration speed. The high sensitivity is enabled by optimizing the SNR at the front-end through adjusting the APD responsivity via its reverse bias voltage, along with the incorporation of 2-tap feedforward equalization (FFE) and 2-tap decision feedback equalization (DFE) implemented in current-integrating fashion. The high reconfiguration speed is empowered by the proposed integrating dc and amplitude comparators, which eliminate the RC settling time constraints. The receiver circuits, excluding the APD die, are fabricated in 28-nm CMOS technology. The optical receiver achieves bit-error-rate (BER) better than 1E−12 at −16-dBm optical modulation amplitude (OMA), 2.24-ns reconfiguration time with 5-dB dynamic range, and 1.37-pJ/b energy efficiency at 25 Gb/s. Second, a 4-level pulse amplitude modulation (PAM4) wireline receiver is described, which incorporates continuous time linear equalizers (CTLEs) and a 2-tap direct DFE dedicated to the compensation for the first and second post-cursor ISI. The direct DFE in a PAM4 receiver (PAM4-DFE) is made possible by the proposed CMOS track-and-regenerate slicer. This proposed slicer offers rail-to-rail digital feedback signals with significantly improved clock-to-Q delay performance. The reduced slicer delay relaxes the settling time constraint of the summer circuits and allows the stringent DFE timing constraint to be satisfied. With the availability of a direct DFE employing the proposed slicer, inductor-based bandwidth enhancement and loop-unrolling techniques, which can be power/area intensive, are not required. Fabricated in 28-nm CMOS technology, the PAM4 receiver achieves BER better than 1E−12 and 1.1-pJ/b energy efficiency at 60 Gb/s, measured over a channel with 8.2-dB loss at Nyquist frequency. Third, digital neural-network-enhanced FFEs (NN-FFEs) for PAM4 analog-to-digital converter (ADC)-based optical interconnects are described. The proposed NN-FFEs employ a custom learnable piecewise linear (PWL) activation function to tackle the nonlinearities with short memory lengths. In contrast to the conventional Volterra equalizers where multipliers are utilized to generate the nonlinear terms, the proposed NN-FFEs leverage the custom PWL activation function for nonlinear operations and reduce the required number of multipliers, thereby improving the area and power efficiencies. Applications in the optical interconnects based on micro-ring modulators (MRMs) are demonstrated with simulation results of 50-Gb/s and 100-Gb/s links adopting PAM4 signaling. The proposed NN-FFEs and the conventional Volterra equalizers are synthesized with the standard-cell libraries in a commercial 28-nm CMOS technology, and their power consumptions and performance are compared. Better than 37% lower power overhead can be achieved by employing the proposed NN-FFEs, in comparison with the Volterra equalizer that leads to similar improvement in the symbol-error-rate (SER) performance.</p

Caltech Theses and Dissertations

Design Techniques for High Performance Serial Link Transceivers

Author: Min Byungho
Publication venue
Publication date: 21/08/2017
Field of study

Increasing data rates over electrical channels with significant frequency-dependent loss is difficult due to excessive inter-symbol interference (ISI). In order to achieve sufficient link margins at high rates, I/O system designers implement equalization in the transmitters and are motivated to consider more spectrally-efficient modulation formats relative to the common PAM-2 scheme, such as PAM-4 and duobinary. The first work, reviews when to consider PAM-4 and duobinary formats, as the modulation scheme which yields the highest system margins at a given data rate is a function of the channel loss profile, and presents a 20Gb/s triple-mode transmitter capable of efficiently implementing these three modulation schemes and three-tap feedforward equalization. A statistical link modeling tool, which models ISI, crosstalk, random noise, and timing jitter, is developed to compare the three common modulation formats operating on electrical backplane channel models. In order to improve duobinary modulation efficiency, a low-power quarter-rate duobinary precoder circuit is proposed which provides significant timing margin improvement relative to full-rate precoders. Also as serial I/O data rates scale above 10 Gb/s, crosstalk between neighboring channels degrades system bit-error rate (BER) performance. The next work presents receive-side circuitry which merges the cancellation of both near-end and far-end crosstalk (NEXT/FEXT) and can automatically adapt to different channel environments and variations in process, voltage, and temperature. NEXT cancellation is realized with a novel 3-tap FIR filter which combines two traditional FIR filter taps and a continuous-time band-pass filter IIR tap for efficient crosstalk cancellation, with all filter tap coefficients automatically determined via an ondie sign-sign least-mean-square (SS-LMS) adaptation engine. FEXT cancellation is realized by coupling the aggressor signal through a differentiator circuit whose gain is automatically adjusted with a power-detection-based adaptation loop. In conclusion, the proposed architectures in the transmitter side and receiver side together are to be good solution in the high speed I/O serial links to improve the performance by overcome the physical channel loss and adjacent channel noise as the system becomes complicated

Texas A&M Repository

Design of Low-Power NRZ/PAM-4 Wireline Transmitters

Author: Yang Hae Woong
Publication venue
Publication date: 12/01/2021
Field of study

Rapid growing demand for instant multimedia access in a myriad of digital devices has pushed the need for higher bandwidth in modern communication hardwares ranging from short-reach (SR) memory/storage interfaces to long-reach (LR) data center Ethernets. At the same time, comprehensive design optimization of link system that meets the energy-efficiency is required for mobile computing and low operational cost at datacenters. This doctoral study consists of design of two low-swing wireline transmitters featuring a low-power clock distribution and 2-tap equalization in energy-efficient manners up to 20-Gb/s operation. In spite of the reduced signaling power in the voltage-mode (VM) transmit driver, the presence of the segment selection logic still diminishes the power saving benefit. The first work presents a scalable VM transmitter which offers low static power dissipation and adopts an impedance-modulated 2-tap equalizer with analog tap control, thereby obviating driver segmentation and reducing pre-driver complexity and dynamic power. Per-channel quadrature clock generation with injection-locked oscillators (ILO) allows the generation of rail-to-rail quadrature clocks. Energy efficiency is further improved with capacitively driven low-swing global clock distribution and supply scaling at lower data rates, while output eye quality is maintained at low voltages with automatic phase calibration of the local ILO-generated quarter-rate clocks. A prototype fabricated in a general purpose 65 nm CMOS process includes a 2 mm global clock distribution network and two transmitters that support an output swing range of 100-300mV with up to 12-dB of equalization. The transmitters achieve 8-16 Gb/s operation at 0.65-1.05 pJ/b energy efficiency. The second work involves a dual-mode NRZ/PAM-4 differential low-swing voltage-mode (VM) transmitter. The pulse-selected output multiplexing allows reduction of power supply and deterministic jitter caused by large on-chip parasitic inherent in the transmission-gate-based multiplexers in the earlier work. Analog impedance control replica circuits running in the background produce gate-biasing voltages that control the peaking ratio for 2-tap feed-forward equalization and PAM-4 symbol levels for high-linearity. This analog control also allows for efficient generation of the middle levels in PAM-4 operation with good linearity quantified by level separation mismatch ratio of 95%. In NRZ mode, 2-tap feedforward equalization is configurable in high-performance controlled-impedance or energy-efficient impedance-modulated settings to provide performance scalability. Analytic design consideration on dynamic power, data-rate, mismatch, and output swing brings optimal performance metric on the given technology node. The proof-of-concept prototype is verified on silicon with 65 nm CMOS process with improved performance in speed and energy-efficiency owing to double-stack NMOS transistors in the output stage. The transmitter consumes as low as 29.6mW in 20-Gb/s NRZ and 25.5mW in the 28-Gb/s PAM-4 operations

Texas A&M Repository

최적에 가까운 타이밍 적응을 위해 치우친 데이터 레벨과 눈 경사 디텍터를 사용한 최대 눈크기추적 클럭 및 데이터 복원회로 설계

Author: 주혜윤
Publication venue: 서울대학교 대학원
Publication date: 01/02/2021
Field of study

학위논문 (박사) -- 서울대학교 대학원 : 공과대학 전기·정보공학부, 2021. 2. 정덕균.이 논문에서는 최소-비트 비트 에러율 (BER)에 대한 최대 눈크기 추적 CDR (MET-CDR)의 설계가 제안되었다. 제안 된 CDR 은 최적의 샘플링 단계를 찾기 위해 반복 절차를 가진 BER 카운터 또는 아이 모니터가 필 요하지 않다. 에러 샘플러 출력에 가중치를 두어 더하여 얻은 치우친 데 이터 레벨 (biased dLev) 은 사전 커서 ISI(pre-cursor ISI) 의 정보도 고려한 눈 높이 정보를 추출한다. 델타 T 만큼의 시간 차이를 둔 지점에서 작동 하는 두 샘플러는 현재 눈 높이와 눈 기울기의 극성을 감지하고, 이 정보 를 통해 제안하는 CDR 은 눈 기울기가 0 이되는 최대 눈 높이로 수렴한 다. 측정 결과는 최대 눈 높이와 최소 BER 의 샘플링 위치가 잘 일치 함 을 보여준다. 28nm CMOS 공정으로 구현된 수신기 칩은 23.5dB 의 채널 손실이 있는 상태에서 26Gb/s 에서 동작 가능하다. 0.25UI 의 아이 오프닝 을 가지며, 87mW 의 파워를 소비한다.In this thesis, design of a maximum-eye-tracking CDR (MET-CDR) for minimum bit error rate (BER) is proposed. The proposed CDR does not require a BER coun-ter or an eye-opening monitor with any iterative procedure to find the near-optimal sampling phase. The biased data-level obtained from the weighted sum of error sampler outputs, UP and DN, extracts the actual eye height information in the presence of pre-cursor ISI. Two samplers operating on two slightly different tim-ings detect the current eye height and the polarity of the eye slope so that the CDR tracks the maximum eye height where the slope becomes zero. Measured results show that the sampling phase of the maximum eye height and that of the mini-mum BER match well. A prototype receiver fabricated in 28 nm CMOS process operates at 26 Gb/s with an eye-opening of 0.25 UI and consumes 87 mW while equalizing 23.5 dB of loss at 13 GHz.ABSTRACT I CONTENTS II LIST OF FIGURES IV LIST OF TABLES VIII CHAPTER 1 INTRODUCTION 1 1.1 MOTIVATION 1 1.2 THESIS ORGANIZATION 4 CHAPTER 2 BACKGROUNDS 5 2.1 RECEIVER FRONT-END 5 2.1.1 CHANNEL 7 2.1.2 EQUALIZER 17 2.1.3 CDR 32 2.2 PRIOR ARTS ON CLOCK RECOVERY 39 2.2.1 BB-CDR 39 2.2.2 BER-BASED CDR 41 2.2.3 EOM-BASED CDR 44 2.3 CONCEPT OF THE PROPOSED CDR 47 CHAPTER 3 MAXIMUM-EYE-TRACKING CDR WITH BIASED DATA-LEVEL AND EYE SLOPE DETECTOR 49 3.1 OVERVIEW 49 3.2 DESIGN OF MET-CDR 50 3.2.1 EYE HEIGHT INFORMATION FROM BIASED DATA-LEVEL 50 3.2.2 EYE SLOPE DETECTOR AND ADAPTATION ALGORITHM 60 3.2.3 ARCHITECTURE AND IMPLEMENTATION 67 3.2.4 VERIFICATION OF THE ALGORITHM 71 3.2.5 ANALYSIS ON THE BIASED DATA-LEVEL 76 3.3 EXPANSION OF MET-CDR TO PAM4 SIGNALING 84 3.3.1 MET-CDR WITH PAM4 84 3.3.2 CONSIDERATIONS FOR PAM4 87 CHAPTER 4 MEASUREMENT RESULTS 89 CHAPTER 5 CONCLUSION 99 APPENDIX A MATLAB CODE FOR SIMULATING RECEIVER WITH MET-CDR 100 BIBLIOGRAPHY 105 초 록 113Docto

SNU Open Repository and Archive

오프셋 제거기의 적응 제어 등화기와 보우-레이트 위상 검출기를 활용한 수신기 설계

Author: 이광호
Publication venue: 서울대학교 대학원
Publication date: 01/08/2021
Field of study

학위논문(박사) -- 서울대학교대학원 : 공과대학 전기·정보공학부, 2021.8. 염제완.In this thesis, designs of high-speed, low-power wireline receivers (RX) are explained. To be specific, the circuit techniques of DC offset cancellation, merged-summer DFE, stochastic Baud-rate CDR, and the phase detector (PD) for multi-level signal are proposed. At first, an RX with adaptive offset cancellation (AOC) and merged summer decision-feedback equalizer (DFE) is proposed. The proposed AOC engine removes the random DC offset of the data path by examining the random data stream's sampled data and edge outputs. In addition, the proposed RX incorporates a shared-summer DFE in a half-rate structure to reduce power dissipation and hardware complexity of the adaptive equalizer. A prototype chip fabricated in 40 nm CMOS technology occupies an active area of 0.083 mm2. Thanks to the AOC engine, the proposed RX achieves the BER of less than 10-12 in a wide range of data rates: 1.62-10 Gb/s. The proposed RX consumes 18.6 mW at 10 Gb/s over a channel with a 27 dB loss at 5 GHz, exhibiting a figure-of-merit of 0.068 pJ/b/dB. Secondly, a 40 nm CMOS RX with Baud-rate phase-detector (BRPD) is proposed. The RX includes two PDs: the BRPD employing the stochastic technique and the BRPD suitable for multi-level signals. Thanks to the Baud-rate CDR’s advantage, by not using an edge-sampling clock, the proposed CDR can reduce the power consumption by lowering the hardware complexity. Besides, the proposed stochastic phase detector (SPD) tracks an optimal phase-locking point that maximizes the vertical eye opening. Furthermore, despite residual inter-symbol interference, proposed BRPD for multi-level signal secures vertical eye margin, which is especially vulnerable in the multi-level signal. Besides, the proposed BRPD has a unique lock point with an adaptive DFE, unlike conventional Mueller-Muller PD. A prototype chip fabricated in 40 nm CMOS technology occupies an active area of 0.24 mm2. The proposed PAM-4 RX achieves the bit-error-rate less than 10-11 in 48 Gb/s and the power efficiency of 2.42 pJ/b.본 논문은 고속, 저전력으로 동작하는 유선 수신기의 설계에 대해 설명하고 있다. 구체적으로 말하면, 오프셋 상쇄, 병합된 서머를 사용하는 결정 피드백 등화기 기술, 확률적 보우 레이트 클럭과 데이터 복원기, 그리고 다중 레벨 신호에 적합한 위상 검출기를 제안한다. 첫째로, 적응 오프셋 제거 및 병합된 서머를 사용하는 결정 피드백 등화기를 갖춘 수신기를 제안한다. 제안된 적응 오프셋 제거 엔진은 임의의 데이터 스트림의 샘플링 데이터, 에지 출력을 검사하여 데이터 경로 상의 오프셋을 제거한다. 또한 하프 레이트 구조의 병합된 서머를 사용하는 결정 피드백 등화기는 전력의 사용과 하드웨어의 복잡성을 줄인다. 40 nm CMOS 기술로 제작된 프로토타입 칩은 0.083 mm2 의 면적을 가진다. 적응 오프셋 제거기 덕분에 제안된 수신기는 10-12 미만의 BER을 달성한다. 또한 제안된 수신기는 5GHz에서 27 dB의 로스를 갖는 채널에서 10 Gb/s의 속도에서 18.6 mW를 소비하며 0.068 pJ/b/dB의 FoM을 달성하였다. 두번째로, 보우 레이트 위상 검출기가 있는 40 nm CMOS 수신기가 제안되었다. 수신기에는 두개의 보우 레이트 위상 검출기를 포함한다. 하나는 확률론적 기법을 사용하는 보우 레이트 위상 검출기이다. 보우 레이트 클럭 데이터 복원기의 장점 덕분에 에지 샘플링 클럭을 사용하지 않음으로서 파워의 소모와 하드웨어의 복잡성을 줄였다. 또한 확률적 위상 검출기는 수직 아이 오프닝을 최대화하는 최적의 위상 지점을 찾을 수 있었다. 다른 위상 검출기는 다중 레벨 신호에 적합한 방식이다. 심볼 간 간섭이 다중 레벨 신호에 매우 취약한 문제가 있더라도 제안된 다중 레벨 신호용 보우 레이트 위상 검출기는 수직 아이 마진을 확보한다. 게다가 제안된 보우 레이트 위상 검출기는 기존의 뮬러-뮐러 위상 검출기와 달리 적응형 결정 피드백 등화기가 있더라도 유일한 락 지점을 갖는다. 프로토타입 칩은 0.24mm2의 면적을 가진다. 제안된 PAM-4 수신기는 48 Gb/s의 속도에서 10-11 미만의 BER을 가지고, 2.42 pJ/b의 FoM을 가진다.CHAPTER 1 INTRODUCTION 1 1.1 MOTIVATION 1 1.2 THESIS ORGANIZATION 5 CHAPTER 2 BACKGROUNDS 6 2.1 BASIC ARCHITECTURE IN SERIAL LINK 6 2.1.1 SERIAL COMMUNICATION 6 2.1.2 CLOCK AND DATA RECOVERY 8 2.1.3 MULTI-LEVEL PULSE-AMPLITUDE MODULATION 10 2.2 EQUALIZER 12 2.2.1 EQUALIZER OVERVIEW 12 2.2.2 DECISION-FEEDBACK EQUALIZER 15 2.2.3 ADAPTIVE EQUALIZER 18 2.3 CLOCK RECOVERY 21 2.3.1 2X OVERSAMPLING PD ALEXANDER PD 22 2.3.2 BAUD-RATE PD MUELLER MULLER PD 25 CHAPTER 3 AN ADAPTIVE OFFSET CANCELLATION SCHEME AND SHARED SUMMER ADAPTIVE DFE 28 3.1 OVERVIEW 28 3.2 AN ADAPTIVE OFFSET CANCELLATION SCHEME AND SHARED-SUMMER ADAPTIVE DFE FOR LOW POWER RECEIVER 31 3.3 SHARED SUMMER DFE 37 3.4 RECEIVER IMPLEMENTATION 42 3.5 MEASUREMENT RESULTS 45 CHAPTER 4 PAM-4 BAUD-RATE DIGITAL CDR 51 4.1 OVERVIEW 51 4.2 OVERALL ARCHITECTURE 53 4.2.1 PROPOSED BAUD-RATE CDR ARCHITECTURE 53 4.2.2 PROPOSED ANALOG FRONT-END STRUCTURE 59 4.3 STOCHASTIC PHASE DETECTION PAM-4 CDR 64 4.3.1 PROPOSED STOCHASTIC PHASE DETECTION 64 4.3.2 COMPARISON OF THE STOCHASTIC PD WITH SS-MMPD 70 4.4 PHASE DETECTION FOR MULTI-LEVEL SIGNALING 73 4.4.1 PROPOSED BAUD-RATE PHASE DETECTOR FOR MULTI-LEVEL SIGNAL 73 4.4.2 DATA LEVEL AND DFE COEFFICIENT ADAPTATION 79 4.4.3 PROPOSED PHASE DETECTOR 84 4.5 MEASUREMENT RESULT 88 4.5.1 MEASUREMENT OF THE PROPOSED STOCHASTIC BAUD-RATE PHASE DETECTION 94 4.5.2 MEASUREMENT OF THE PROPOSED BAUD-RATE PHASE DETECTION FOR MULTI-LEVEL SIGNAL 97 CHAPTER 5 CONCLUSION 103 BIBLIOGRAPHY 105 초 록 109박

SNU Open Repository and Archive

메모리 인터페이스를 위한 4 레벨 펄스 진폭 변조 쿼터 레이트 수신기 설계

Author: 박현규
Publication venue: 서울대학교 대학원
Publication date: 01/08/2022
Field of study

학위논문(박사) -- 서울대학교대학원 : 공과대학 전기·정보공학부, 2022. 8. 김수환.본 연구에서는 메모리 인터페이스를 위한 4 레벨 펄스 진폭 변조 (PAM-4) 수신기와 직교 클록을 생성하는 직교 신호 보정기를 제안된다. 데이터 센터에서 증가하는 IP 트래픽은 고속 및 저전력 메모리 인터페이스에 대한 수요를 증가시켜왔다. 이러한 요구를 만족시키기 위해 클럭 및 나이퀴스트 주파수를 높이지 않고도 데이터 전송률을 높일 수 있는 PAM-4 신호가 주목을 받고 있다. PAM-4 신호는 제로 비 복귀 신호 (NRZ) 보다 3배 낮은 수직 마진을 가지며, 이는 결정 피드백 이퀄라이저 내 슬라이스의 클럭-큐 딜레이를 증가시키며, 이로 인해 PAM-4 결정 피드백 이퀄라이저의 성능을 제한하는 요인이다. 본 연구에서는 인버터 기반의 합산기를 이용, 선택적으로 신호를 증폭시키는 결정 피드백 이퀄라이저를 사용함으로써 슬라이서의 전력 소모를 증가시키지 않으면서 슬라이서의 클럭-큐 딜레이를 줄일 수 있다. 또한, 적응형 지연 이득 컨트롤러를 포함하는 직교 신호 보정기는 높은 정확도와 빠른 스큐 보정으로 쿼드러처 클럭 간의 스큐를 교정할 수 있다. 선택적 눈 증폭 결정 피드백 이퀄라이저와 적응형 지연 이득 컨트롤러를 포함하는 직교 신호 보정기의 성능을 검증하기 위해 프로토타입 칩을 제작하였다. 제작된 칩은 65 nm CMOS 공정으로 제작되었다. 프로토타입 칩은 24 Gb/s/pin 에서 10-12 의 비트 에러율을 100 mUI 의 신호 너비로 달성하였다. 프로토타입 칩 내 PAM-4 수신기는 0.73 pJ/b 의 에너지 효율을 갖는다. 또한 적응형 지연 이득 컨트롤러를 포함하는 직교 신호 보정기는 3 GHz 쿼드러처 클럭 간 최대 21.2 ps 의 스큐를 0.8 ps 까지 줄일 수 있으며, 이 때 76.9 ns 의 교정 시간을 갖는다. 제안하는 직교 신호 보정기는 3 GHz 에서 2.15 mW/GHz 의 전력 효율을 갖는다.A four-level pulse amplitude modulation (PAM-4) receiver, and a quadrature signal corrector (QSC) that generates quadrature clocks for memory interfaces is presented. Increasing IP traffic in data centers has increased the demand for high-speed and low-power memory interfaces. To satisfy this demand, PAM-4 signaling, which can increase data-rate without increasing clock and Nyquist frequency, is received considerable attention. PAM- signaling has vertical which three times lower than non-return-to-zero (NRZ) signaling, which makes the clock-to-Q delay of the slicer in the decision feedback equalizer (DFE) increases. This makes the DFE difficult to satisfy the timing constraint. In this paper, by using a DFE with inverter-based summers, the clock-to-Q delay of the slicer can be reduced without increasing the power consumption of the slicers. Also, the QSC using an adaptive delay gain controller can correct the skew between the quadrature clock with low skew and short correction time. The prototype receiver including the DFE with the inverter-based summer and the QSC using the adaptive delay gain controller was fabricated in 65 nm CMOS process. The prototype chip can achieve a bit error rate (BER) of 10-12 at 24 Gb/s/pin, and at this time, an eye width of 100 mUI is secured. The efficiency of the receiver is 0.73 pJ/b. In addition, the QSC cna reduce the maximum 21.2 ps of skew between 3 GHz quadrature clocks to 0.8 ps and has a correction time of 76.9 ns. The efficiency of the QSC is 2.15 mW/GHz.ABSTRACT 1 CONTENTS 3 LIST OF FIGURES 5 LIST OF TABLE 9 CHAPTER 1 1 INTRODUCTION 1 1.1 MOTIVATION 1 1.2 PAM-4 SIGNALING 7 1.2.1 DESIGN CONSIDERATIONS ON PAM-4 RECEIVER 10 1.2.2 PRIOR WORKS 14 1.3 QUARTER-RATE ARCHITECTURE 18 1.3.1 DESIGN CONSIDERATION ON QUARTER-RATE ARCHITECTURE 20 1.3.2 PRIOR WORKS 25 1.4 SUMMARY 28 1.5 THESIS ORGANIZATION 30 CHAPTER 2 31 CONCEPTS OF DFE WITH INVERTER-BASED SUMMER 31 2.1 CONCEPTUAL ARCHITECTURE OF DFE WITH INVERTER-BASED SUMMER 32 2.2 DESIGN CONSIDERATION OF INVERTER-BASED SUMMER 37 CHAPTER 3 41 CONCEPTS OF QUADRATURE SIGNAL CORRECTOR USING ADAPTIVE DELAY GAIN CONTROLLER 41 3.1 OPERATION OF PROPOSED QUADRATURE SIGNAL CORRECTOR 42 3.2 LOOP FILTER INCLUDING ADAPTIVE DELAY GAIN CONTROLLER 45 CHAPTER 4 48 ARCHITECTURE AND IMPLEMENTATION 48 4.1 OVERALL ARCHITECTURE 49 4.2 ANALOG FRONT END 52 4.3 DECISION FEEDBACK EQUALIZER WITH INVERTER-BASED SUMMER 54 4.4 CLOCK PATH 62 4.5 QUADRATURE SIGNAL CORRECTOR WITH ADAPTIVE DELAY GAIN CONTROLLER 63 CHAPTER 5 70 EXPERIMENTAL RESULTS 70 5.1 EXPERIMENTAL SETUP 70 5.2 EXPERIMENTAL RESULTS 74 5.2.1 MEASUREMENT RESULTS OF PAM-4 RECEIVER WITH DECISION FEEDBACK EQUALIZER USING INVERTER-BASED SUMMER 74 5.2.2 MEASUREMENT RESULTS OF QUADRATURE SIGNAL CORRECTOR USING ADAPTIVE DELAY GAIN CONTROLLER 77 CHAPTER 6 83 CONCLUSION 83 BIBLIOGRAPHY 86박

SNU Open Repository and Archive

Design Techniques for High Pin Efficiency Wireline Transceivers

Author: Fan Yang-Hang
Publication venue
Publication date: 17/12/2020
Field of study

While the majority of wireline research investigates bandwidth improvement and how to overcome the high channel loss, pin efficiency is also critical in high-performance wireline applications. This dissertation proposes two different implementations for high pin efficiency wireline transceivers. The first prototype achieves twice pin efficiency than unidirectional signaling, which is 32Gb/s simultaneous bidirectional transceiver supporting transmission and reception on the same channel at the same time. It includes an efficient low-swing voltage-mode driver with an R-gm hybrid for signal separation, combining the continuous-time-linear-equalizer (CTLE) and echo cancellation (EC) in a single stage, and employing a low-complexity 5/4X CDA system. Support of a wide range of channels is possible with foreground adaptation of the EC finite impulse response (FIR) filter taps with a sign-sign least-mean-square (SSLMS) algorithm. Fabricated in TSMC 28-nm CMOS, the 32Gb/s SBD transceiver occupies

0.09 mm^{2}

area and achieves 16Gb/s uni-directional and 32Gb/s simultaneous bi-directional signals. 32Gb/s SBD operation consumes 1.83mW/Gb/s with 10.8dB channel loss at Nyquist rate. The second prototype presents an optical transmitter with a quantum-dot (QD) microring laser. This can support wavelength-division multiplexing allowing for high pin efficiency application by packing multiple high-bandwidth signals onto one optical channel. The development QD microring laser model accurately captures the intrinsic photonic high-speed dynamics and allows for the future co-design of the circuits and photonic device. To achieve higher bandwidth than intrinsic one, utilizing both techniques of optical injection locking (OIL) and 2-tap asymmetric Feed-forward equalizer (FFE) can perform 22Gb/s operation with 3.2mW/Gb/s. The first hybrid-integration directly-modulated OIL QD microring laser system is demonstrated

Texas A&M Repository

The truth about 2-level transition elimination in bang-bang PAM-4 CDRs

Author: Rombouts Pieter
Torfs Guy
Verbeke Marijn
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2021
Field of study

Reception of 4-level pulse amplitude modulation (PAM-4) requires a clock and data recovery (CDR) circuit, typically implemented by a PLL-like structure. An essential block in such a CDR is the phase detector which should detect whether the recovered clock leads or lags the incoming data edges. In typical implementations an incoming data edge is detected by sensing whether the incoming waveform crosses a data threshold level. However, there is some ambiguity in detecting the incoming data edge because PAM-4 modulation has 3 thresholds. If the waveform crosses multiple threshold levels, the level crossings will occur at different time instants due to the finite rise/fall time of the incoming waveform. In this work, we first analyze qualitatively and quantitatively CDR systems that use one threshold for phase adjustment. Here, eliminating the 2-level transitions decreases the amount of jitter injected by the phase detector. However, the available transitions for phase adjustment are also reduced, which lowers the CDR's robustness. Secondly, for CDR systems using three thresholds, a combination of two techniques: majority voting and elimination of 2-level transitions is investigated. We prove that in this case, the elimination of 2-level transitions is not needed and even gives a worse performance when implemented

Ghent University Academic Bibliography

Design of High-Speed SerDes Transceiver for Chip-to-Chip Communications in CMOS Process

Author: Zheng Xuqiang
Publication venue
Publication date
Field of study

With the continuous increase of on-chip computation capacities and exponential growth of data-intensive applications, the high-speed data transmission through serial links has become the backbone for modern communication systems. To satisfy the massive data-exchanging requirement, the data rate of such serial links has been updated from several Gb/s to tens of Gb/s. Currently, the commercial standards such as Ethernet 400GbE, InfiniBand high data rate (HDR), and common electrical interface (CEI)-56G has been developing towards 40+ Gb/s. As the core component within these links, the transceiver chipset plays a fundamental role in balancing the operation speed, power consumption, area occupation, and operation range. Meanwhile, the CMOS process has become the dominant technology in modern transceiver chip fabrications due to its large-scale digital integration capability and aggressive pricing advantage. This research aims to explore advanced techniques that are capable of exploiting the maximum operation speed of the CMOS process, and hence provides potential solutions for 40+ Gb/s CMOS transceiver designs. The major contributions are summarized as follows. A low jitter ring-oscillator-based injection-locked clock multiplier (RILCM) with a hybrid frequency tracking loop that consists of a traditional phase-locked loop (PLL), a timing-adjusted loop, and a loop selection state-machine is implemented in 65-nm C-MOS process. In the ring voltage-controlled oscillator, a full-swing pseudo-differential delay cell is proposed to lower the device noise to phase noise conversion. To obtain high operation speed and high detection accuracy, a compact timing-adjusted phase detector tightly combined with a well-matched charge pump is designed. Meanwhile, a lock-loss detection and lock recovery is devised to endow the RILCM with a similar lock-acquisition ability as conventional PLL, thus excluding the initial frequency set- I up aid and preventing the potential lock-loss risk. The experimental results show that the figure-of-merit of the designed RILCM reaches -247.3 dB, which is better than previous RILCMs and even comparable to the large-area LC-ILCMs. The transmitter (TX) and receiver (RX) chips are separately designed and fab- ricated in 65-nm CMOS process. The transmitter chip employs a quarter-rate multi-multiplexer (MUX)-based 4-tap feed-forward equalizer (FFE) to pre-distort the output. To increase the maximum operating speed, a bandwidth-enhanced 4:1 MUX with the capability of eliminating charge-sharing effect is proposed. To produce the quarter-rate parallel data streams with appropriate delays, a compact latch array associated with an interleaved-retiming technique is designed. The receiver chip employs a two-stage continuous-time linear equalizer (CTLE) as the analog front-end and integrates an improved clock data recovery to extract the sampling clocks and retime the incoming data. To automatically balance the jitter tracking and jitter suppression, passive low-pass filters with adaptively-adjusted bandwidth are introduced into the data-sampling path. To optimize the linearity of the phase interpolation, a time-averaging-based compensating phase interpolator is proposed. For equalization, a combined TX-FFE and RX-CTLE is applied to compensate for the channel loss, where a low-cost edge-data correlation-based sign zero-forcing adaptation algorithm is proposed to automatically adjust the TX-FFE’s tap weights. Measurement results show that the fabricated transmitter/receiver chipset can deliver 40 Gb/s random data at a bit error rate of 16 dB loss at the half-baud frequency, while consuming a total power of 370 mW

University of Lincoln Institutional Repository