Search CORE

5 research outputs found

An Eight lanes 7Gb/s/pin Source Synchronous Single-Ended RX with Equalization and Far-End Crosstalk Cancellation for Backplane Channels

Author: Aprile Cosimo
Braendli Matthias
Cevher Volkan
Cevrero Alessandro
Francese Pier Andrea
Kossel. Marcel
Kull Lukas
Leblebici Yusuf
Menolfi Christian
Morf Thomas
Oezkaya Ilter
Toifl Thomas
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 09/01/2018
Field of study

This paper presents a versatile crosstalk cancellation scheme for single-ended multi-lane backplane links. System-level investigations show that a scheme, which combines analog filters and decision-feedback crosstalk compensation on the receiver (RX) side only, can efficiently remove crosstalk patterns in straight channels as well as boards with reflections due to via stubs. An eight-lane single-ended RX has been manufactured in 32-nm SOI CMOS to validate our findings. A CTLE and eight-tap decision feedback equalizer equalize the channel without transmitter feedforward equalizer. A continuous time crosstalk canceller reduces precursors by nearest neighbors, while the residual postcursors from all aggressors are suppressed by direct feedback 7x8-tap decision-feedback crosstalk canceller (DFXC). Measurements with flip-chip packaged RX show that the RX macro can equalize both a 30-dB insertion loss single-ended channel with 0-dB signal-to-crosstalk at Nyquist and a channel with 28-dB attenuation with the signal-to-crosstalk ratio of 6 dB combined with reflections due to via stubs. The RX operates up to 7 Gb/s/pin with PRBS11 data at bit error rate (BER) <10⁻¹², and occupies 300x350 μm² with an energy efficiency of 5.9 mW/Gb/s from 1-V supply

Infoscience - École polytechnique fédérale de Lausanne

A 5 Gb/s Single-Ended Parallel Receiver With Adaptive Crosstalk-Induced Jitter Cancellation

Author: Kim B
Park HJ
Seon-Kyoo Lee
Sim JY
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date
Field of study

This paper presents an adaptive far-end crosstalk cancellation scheme for a single-ended parallel receiver. The adaptation engine is embedded in a single representative channel CDR, and the receiver efficiently reduces the crosstalk noise with a minimal cost in hardware and power consumption. In addition, the proposed scheme can be applied to any given CDR and equalizing circuits. The receiver is fabricated in 0.13 mu m CMOS technology and achieves a reduction of FEXT-induced jitter up to 75%. The receiver consumes 65 mW at 5 Gb/s (4.3 mW/Gb/s/pin) including a PLL for global clock distribution.X1198sciescopu

포항공과대학교

차세대 HBM 용 고집적, 저전력 송수신기 설계

Author: 고한곤
Publication venue: 서울대학교 대학원
Publication date: 01/08/2020
Field of study

학위논문 (박사) -- 서울대학교 대학원 : 공과대학 전기·정보공학부, 2020. 8. 정덕균.This thesis presents design techniques for high-density power-efficient transceiver for the next-generation high bandwidth memory (HBM). Unlike the other memory interfaces, HBM uses a 3D-stacked package using through-silicon via (TSV) and a silicon interposer. The transceiver for HBM should be able to solve the problems caused by the 3D-stacked package and TSV. At first, a data (DQ) receiver for HBM with a self-tracking loop that tracks a phase skew between DQ and data strobe (DQS) due to a voltage or thermal drift is proposed. The self-tracking loop achieves low power and small area by uti-lizing an analog-assisted baud-rate phase detector. The proposed pulse-to-charge (PC) phase detector (PD) converts the phase skew to a voltage differ-ence and detects the phase skew from the voltage difference. An offset calibra-tion scheme that can compensates for a mismatch of the PD is also proposed. The proposed calibration scheme operates without any additional sensing cir-cuits by taking advantage of the write training of HBM. Fabricated in 65 nm CMOS, the DQ receiver shows a power efficiency of 370 fJ/b at 4.8 Gb/s and occupies 0.0056 mm2. The experimental results show that the DQ receiver op-erates without any performance degradation under a ± 10% supply variation. In a second prototype IC, a high-density transceiver for HBM with a feed-forward-equalizer (FFE)-combined crosstalk (XT) cancellation scheme is pre-sented. To compensate for the XT, the transmitter pre-distorts the amplitude of the FFE output according to the XT. Since the proposed XT cancellation (XTC) scheme reuses the FFE implemented to equalize the channel loss, additional circuits for the XTC is minimized. Thanks to the XTC scheme, a channel pitch can be significantly reduced, allowing for the high channel density. Moreover, the 3D-staggered channel structure removes the ground layer between the verti-cally adjacent channels, which further reduces a cross-sectional area of the channel per lane. The test chip including 6 data lanes is fabricated in 65 nm CMOS technology. The 6-mm channels are implemented on chip to emulate the silicon interposer between the HBM and the processor. The operation of the XTC scheme is verified by simultaneously transmitting 4-Gb/s data to the 6 consecutive channels with 0.5-um pitch and the XTC scheme reduces the XT-induced jitter up to 78 %. The measurement result shows that the transceiver achieves the throughput of 8 Gb/s/um. The transceiver occupies 0.05 mm2 for 6 lanes and consumes 36.6 mW at 6 x 4 Gb/s.본 논문에서는 차세대 HBM을 위한 고집적 저전력 송수신기 설계 방법을 제안한다. 첫 번째로, 전압 및 온도 변화에 의한 데이터와 클럭 간 위상 차이를 보상할 수 있는 자체 추적 루프를 가진 데이터 수신기를 제안한다. 제안하는 자체 추적 루프는 데이터 전송 속도와 같은 속도로 동작하는 위상 검출기를 사용하여 전력 소모와 면적을 줄였다. 또한 메모리의 쓰기 훈련 (write training) 과정을 이용하여 효과적으로 위상 검출기의 오프셋을 보상할 수 있는 방법을 제안한다. 제안하는 데이터 수신기는 65 nm 공정으로 제작되어 4.8 Gb/s에서 370 fJ/b을 소모하였다. 또한 10 % 의 전압 변화에 대하여 안정적으로 동작하는 것을 확인하였다. 두 번째로, 피드 포워드 이퀄라이저와 결합된 크로스 토크 보상 방식을 활용한 고집적 송수신기를 제안한다. 제안하는 송신기는 크로스 토크 크기에 해당하는 만큼 송신기 출력을 왜곡하여 크로스 토크를 보상한다. 제안하는 크로스 토크 보상 방식은 채널 손실을 보상하기 위해 구현된 피드 포워드 이퀄라이저를 재활용함으로써 추가적인 회로를 최소화한다. 제안하는 송수신기는 크로스 토크가 보상 가능하기 때문에, 채널 간격을 크게 줄여 고집적 통신을 구현하였다. 또한 집적도를 더 증가시키기 위해 세로로 인접한 채널 사이의 차폐 층을 제거한 적층 채널 구조를 제안한다. 6개의 송수신기를 포함한 프로토타입 칩은 65 nm 공정으로 제작되었다. HBM과 프로세서 사이의 silicon interposer channel 을 모사하기 위한 6 mm 의 채널이 칩 위에 구현되었다. 제안하는 크로스 토크 보상 방식은 0.5 um 간격의 6개의 인접한 채널에 동시에 데이터를 전송하여 검증되었으며, 크로스 토크로 인한 지터를 최대 78 % 감소시켰다. 제안하는 송수신기는 8 Gb/s/um 의 처리량을 가지며 6 개의 송수신기가 총 36.6 mW의 전력을 소모하였다.CHAPTER 1 INTRODUCTION 1 1.1 MOTIVATION 1 1.2 THESIS ORGANIZATION 4 CHAPTER 2 BACKGROUND ON HIGH-BANDWIDTH MEMORY 6 2.1 OVERVIEW 6 2.2 TRANSCEIVER ARCHITECTURE 10 2.3 READ/WRITE OPERATION 15 2.3.1 READ OPERATION 15 2.3.2 WRITE OPERATION 19 CHAPTER 3 BACKGROUNDS ON COUPLED WIRES 21 3.1 GENERALIZED MODEL 21 3.2 EFFECT OF CROSSTALK 26 CHAPTER 4 DQ RECEIVER WITH BAUD-RATE SELF-TRACKING LOOP 29 4.1 OVERVIEW 29 4.2 FEATURES OF DQ RECEIVER FOR HBM 33 4.3 PROPOSED PULSE-TO-CHARGE PHASE DETECTOR 35 4.3.1 OPERATION OF PULSE-TO-CHARGE PHASE DETECTOR 35 4.3.2 OFFSET CALIBRATION 37 4.3.3 OPERATION SEQUENCE 39 4.4 CIRCUIT IMPLEMENTATION 42 4.5 MEASUREMENT RESULT 46 CHAPTER 5 HIGH-DENSITY TRANSCEIVER FOR HBM WITH 3D-STAGGERED CHANNEL AND CROSSTALK CANCELLATION SCHEME 57 5.1 OVERVIEW 57 5.2 PROPOSED 3D-STAGGERED CHANNEL 61 5.2.1 IMPLEMENTATION OF 3D-STAGGERED CHANNEL 61 5.2.2 CHANNEL CHARACTERISTICS AND MODELING 66 5.3 PROPOSED FEED-FORWARD-EQUALIZER-COMBINED CROSSTALK CANCELLATION SCHEME 72 5.4 CIRCUIT IMPLEMENTATION 77 5.4.1 OVERALL ARCHITECTURE 77 5.4.2 TRANSMITTER WITH FFE-COMBINED XTC 79 5.4.3 RECEIVER 81 5.5 MEASUREMENT RESULT 82 CHAPTER 6 CONCLUSION 93 BIBLIOGRAPHY 95 초 록 102Docto

SNU Open Repository and Archive

Design Techniques for High Performance Serial Link Transceivers

Author: Min Byungho
Publication venue
Publication date: 21/08/2017
Field of study

Increasing data rates over electrical channels with significant frequency-dependent loss is difficult due to excessive inter-symbol interference (ISI). In order to achieve sufficient link margins at high rates, I/O system designers implement equalization in the transmitters and are motivated to consider more spectrally-efficient modulation formats relative to the common PAM-2 scheme, such as PAM-4 and duobinary. The first work, reviews when to consider PAM-4 and duobinary formats, as the modulation scheme which yields the highest system margins at a given data rate is a function of the channel loss profile, and presents a 20Gb/s triple-mode transmitter capable of efficiently implementing these three modulation schemes and three-tap feedforward equalization. A statistical link modeling tool, which models ISI, crosstalk, random noise, and timing jitter, is developed to compare the three common modulation formats operating on electrical backplane channel models. In order to improve duobinary modulation efficiency, a low-power quarter-rate duobinary precoder circuit is proposed which provides significant timing margin improvement relative to full-rate precoders. Also as serial I/O data rates scale above 10 Gb/s, crosstalk between neighboring channels degrades system bit-error rate (BER) performance. The next work presents receive-side circuitry which merges the cancellation of both near-end and far-end crosstalk (NEXT/FEXT) and can automatically adapt to different channel environments and variations in process, voltage, and temperature. NEXT cancellation is realized with a novel 3-tap FIR filter which combines two traditional FIR filter taps and a continuous-time band-pass filter IIR tap for efficient crosstalk cancellation, with all filter tap coefficients automatically determined via an ondie sign-sign least-mean-square (SS-LMS) adaptation engine. FEXT cancellation is realized by coupling the aggressor signal through a differentiator circuit whose gain is automatically adjusted with a power-detection-based adaptation loop. In conclusion, the proposed architectures in the transmitter side and receiver side together are to be good solution in the high speed I/O serial links to improve the performance by overcome the physical channel loss and adjacent channel noise as the system becomes complicated

Texas A&M Repository

Learning-Based Hardware Design for Data Acquisition Systems

Author: Aprile Cosimo
Publication venue: Lausanne, EPFL
Publication date: 29/08/2018
Field of study

This multidisciplinary research work aims to investigate the optimized information extraction from signals or data volumes and to develop tailored hardware implementations that trade-off the complexity of data acquisition with that of data processing, conceptually allowing radically new device designs. The mathematical results in classical Compressive Sampling (CS) support the paradigm of Analog-to-Information Conversion (AIC) as a replacement for conventional ADC technologies. The AICs simultaneously perform data acquisition and compression, seeking to directly sample signals for achieving specific tasks as opposed to acquiring a full signal only at the Nyquist rate to throw most of it away via compression. Our contention is that in order for CS to live up its name, both theory and practice must leverage concepts from learning. This work demonstrates our contention in hardware prototypes, with key trade-offs, for two different fields of application as edge and big-data computing. In the framework of edge-data computing, such as wearable and implantable ecosystems, the power budget is defined by the battery capacity, which generally limits the device performance and usability. This is more evident in very challenging field, such as medical monitoring, where high performance requirements are necessary for the device to process the information with high accuracy. Furthermore, in applications like implantable medical monitoring, the system performances have to merge the small area as well as the low-power requirements, in order to facilitate the implant bio-compatibility, avoiding the rejection from the human body. Based on our new mathematical foundations, we built different prototypes to get a neural signal acquisition chip that not only rigorously trades off its area, energy consumption, and the quality of its signal output, but also significantly outperforms the state-of-the-art in all aspects. In the framework of big-data and high-performance computation, such as in high-end servers application, the RF circuits meant to transmit data from chip-to-chip or chip-to-memory are defined by low power requirements, since the heat generated by the integrated circuits is partially distributed by the chip package. Hence, the overall system power budget is defined by its affordable cooling capacity. For this reason, application specific architectures and innovative techniques are used for low-power implementation. In this work, we have developed a single-ended multi-lane receiver for high speed I/O link in servers application. The receiver operates at 7 Gbps by learning inter-symbol interference and electromagnetic coupling noise in chip-to-chip communication systems. A learning-based approach allows a versatile receiver circuit which not only copes with large channel attenuation but also implements novel crosstalk reduction techniques, to allow single-ended multiple lines transmission, without sacrificing its overall bandwidth for a given area within the interconnect's data-path

Infoscience - École polytechnique fédérale de Lausanne