Search CORE

13 research outputs found

High-Speed Link Modeling: Analog/Digital Equalization and Modulation Techniques

Author: Lee Keytaek
Publication venue
Publication date
Field of study

High-speed serial input-output (I/O) link has required advanced equalization and modulation techniques to mitigate inter-symbol interference (ISI) caused by multi-Gb/s signaling over band-limited channels. Increasing demands for transceiver power and area complexity has leveraged on-going interest in analog-to-digital converter (ADC) based link, which allows for robust equalization and flexible adaptation to advanced signaling. With diverse options in ISI control techniques, link performance analysis for complicated transceiver architectures is very important. This work presents advanced statistical modeling for ADC-based link, performance comparison of existing modulation and equalization techniques, and proposed hybrid ADC-based receiver that achieves further power saving in digital equalization. Statistical analysis precisely estimates high-speed link margins at given implementation constrains and low target bit-error-rate (BER), typically ranges from 1e-12 to 1e-15, by applying proper statistical bound of noise and distortion. The proposed statistical ADC-based link modeling utilizes bounded probability density function (PDF) of limited quantization distortion (4-6 bits) through digital feed-forward and decision feedback equalizers (FFE-DFE) to improve low target BER estimation. Based on statistical modeling, this work surveys the impact of insufficient equalization, jitter and crosstalk on modulation selection among two and four level pulse amplitude modulation (PAM-2 and PAM-4, respectively) and duobinary, and ADC resolution reduction performance by partial analog equalizer (PAE). While the information of channel loss at effective Nyquist frequency and signaling constellation loss initially guides modulation selection, the statistical analysis results show that PAM-4 best tolerates jitter and crosstalk, and duobinary requires the least equalization complexity. Meanwhile, despite robust digital equalization, high-speed ADC complexity and power consumption is still a critical bottleneck, so that PAE is necessitated to reduce ADC resolution requirement. Statistical analysis presents up to 8-bit resolution is required in 12.5Gb/s data communications at 46dB of channel loss without PAE, while 5-bit ADC is enough with 3-tap FFE PAE. For optimal ADC resolution reduction by PAE, digital equalizer complexity also increases to provide enough margin tolerating significant quantization distortion. The proposed hybrid receiver defines unreliable signal thresholds by statistical analysis and selectively takes additional digital equalization to save potentially increasing dynamic power consumption in digital. Simulation results report that the hybrid receiver saves at least 64% of digital equalization power with 3-tap FFE PAE in 12.5Gb/s data rate and up to 46dB loss channels. Finally, this work shows the use of embedded-DFE ADC in the hybrid receiver is limited by error propagation

Texas A&M Repository

Modeling and Design of Architectures for High-Speed ADC-Based Serial Links

Author: Shiva Kiran FNU
Publication venue
Publication date: 12/01/2021
Field of study

There is an ongoing dramatic rise in the volume of internet traffic. Standards such as 56Gb/s OIF very short reach (VSR), medium reach (MR) and long reach (LR) standards for chip to chip communication over channels with up to 10dB, 20dB and 30dB insertion loss at the PAM 4 Nyquist frequency, respectively, are being adopted. These standards call for the spectrally efficient PAM-4 signaling over NRZ signaling. PAM-4 signaling offers challenges such as a reduced SNR at the receiver, susceptibility to nonlinearities and increased sensitivity to residual ISI. Equalization provided by traditional mixed signal architectures can be insufficient to achieve the target BER requirements for very long reach channels. ADC-based receiver architectures for PAM-4 links take advantage of the more powerful equalization techniques, which lend themselves to easier and robust digital implementations, to extend the amount of insertion loss that the receiver can handle. However, ADC-based receivers can consume more power compared to mixed-signal implementations. Techniques that model the receiver performance to understand the various system trade-offs are necessary. This research presents a fast and accurate hybrid modeling framework to efficiently investigate system trade-offs for an ADC-based receiver. The key contribution being the addition of ADC related non-idealities such as quantization noise in the presence of integral and differential nonlinearities, and time-interleaving mismatch errors such as gain mismatch, bandwidth mismatch, offset mismatch and sampling skew. The research also presents a 52Gb/s ADC-based PAM-4 receiver prototype employing a 32-way time-interleaved, 2-bit/stage, 6-bit SAR ADC and a DSP with a 12-tap FFE and a 2-tap DFE. A new DFE architecture that reduces the complexity of a PAM-4 DFE to that of an NRZ DFE while simultaneously nearly doubling the maximum achievable data rate is presented. The receiver architecture also includes an analog front-end (AFE) consisting of a programmable two stage CTLE. A digital baud-rate CDR’s utilizing a Mueller-Muller phase detector sets the sampling phase. Measurement results show that for 32Gb/s operation a BER < 10⁻⁹ is achieved for a 30dB loss channel while for 52 Gb/s operation achieves a BER < 10⁻⁶ for a 31dB loss channel with a power efficiency of 8.06pj/bit

Texas A&M Repository

Toward realizing power scalable and energy proportional high-speed wireline links

Author: Anand Tejasvi
Publication venue
Publication date: 01/12/2015
Field of study

Growing computational demand and proliferation of cloud computing has placed high-speed serial links at the center stage. Due to saturating energy efficiency improvements over the last five years, increasing the data throughput comes at the cost of power consumption. Conventionally, serial link power can be reduced by optimizing individual building blocks such as output drivers, receiver, or clock generation and distribution. However, this approach yields very limited efficiency improvement. This dissertation takes an alternative approach toward reducing the serial link power. Instead of optimizing the power of individual building blocks, power of the entire serial link is reduced by exploiting serial link usage by the applications. It has been demonstrated that serial links in servers are underutilized. On average, they are used only 15% of the time, i.e. these links are idle for approximately 85% of the time. Conventional links consume power during idle periods to maintain synchronization between the transmitter and the receiver. However, by powering-off the link when idle and powering it back when needed, power consumption of the serial link can be scaled proportionally to its utilization. This approach of rapid power state transitioning is known as the rapid-on/off approach. For the rapid-on/off to be effective, ideally the power-on time, off-state power, and power state transition energy must all be close to zero. However, in practice, it is very difficult to achieve these ideal conditions. Work presented in this dissertation addresses these challenges. When this research work was started (2011-12), there were only a couple of research papers available in the area of rapid-on/off links. Systematic study or design of a rapid power state transitioning in serial links was not available in the literature. Since rapid-on/off with nanoseconds granularity is not a standard in any wireline communication, even the popular test equipment does not support testing any such feature, neither any formal measurement methodology was available. All these circumstances made the beginning difficult. However, these challenges provided a unique opportunity to explore new architectural techniques and identify trade-offs. The key contributions of this dissertation are as follows. The first and foremost contribution is understanding the underlying limitations of saturating energy efficiency improvements in serial links and why there is a compelling need to find alternative ways to reduce the serial link power. The second contribution is to identify potential power saving techniques and evaluate the challenges they pose and the opportunities they present. The third contribution is the design of a 5Gb/s transmitter with a rapid-on/off feature. The transmitter achieves rapid-on/off capability in voltage mode output driver by using a fast-digital regulator, and in the clock multiplier by accurate frequency pre-setting and periodic reference insertion. To ease timing requirements, an improved edge replacement logic circuit for the clock multiplier is proposed. Mathematical modeling of power-on time as a function of various circuit parameters is also discussed. The proposed transmitter demonstrates energy proportional operation over wide variations of link utilization, and is, therefore, suitable for energy efficient links. Fabricated in 90nm CMOS technology, the voltage mode driver, and the clock multiplier achieve power-on-time of only 2ns and 10ns, respectively. This dissertation highlights key trade-off in the clock multiplier architecture, to achieve fast power-on-lock capability at the cost of jitter performance. The fourth contribution is the design of a 7GHz rapid-on/off LC-PLL based clock multi- plier. The phase locked loop (PLL) based multiplier was developed to overcome the limita- tions of the MDLL based approach. Proposed temperature compensated LC-PLL achieves power-on-lock in 1ns. The fifth and biggest contribution of this dissertation is the design of a 7Gb/s embedded clock transceiver, which achieves rapid-on/off capability in LC-PLL, current-mode transmit- ter and receiver. It was the first reported design of a complete transceiver, with an embedded clock architecture, having rapid-on/off capability. Background phase calibration technique in PLL and CDR phase calibration logic in the receiver enable instantaneous lock on power-on. The proposed transceiver demonstrates power scalability with a wide range of link utiliza- tion and, therefore, helps in improving overall system efficiency. Fabricated in 65nm CMOS technology, the 7Gb/s transceiver achieves power-on-lock in less than 20ns. The transceiver achieves power scaling by 44x (63.7mW-to-1.43mW) and energy efficiency degradation by only 2.2x (9.1pJ/bit-to-20.5pJ/bit), when the effective data rate (link utilization) changes by 100x (7Gb/s-to-70Mb/s). The sixth and final contribution is the design of a temperature sensor to compensate the frequency drifts due to temperature variations, during long power-off periods, in the fast power-on-lock LC-PLL. The proposed self-referenced VCO-based temperature sensor is designed with all digital logic gates and achieves low supply sensitivity. This sensor is suitable for integration in processor and DRAM environments. The proposed sensor works on the principle of directly converting temperature information to frequency and finally to digital bits. A novel sensing technique is proposed in which temperature information is acquired by creating a threshold voltage difference between the transistors used in the oscillators. Reduced supply sensitivity is achieved by employing junction capacitance, and the overhead of voltage regulators and an external ideal reference frequency is avoided. The effect of VCO phase noise on the sensor resolution is mathematically evaluated. Fabricated in the 65nm CMOS process, the prototype can operate with a supply ranging from 0.85V to 1.1V, and it achieves a supply sensitivity of 0.034oC/mV and an inaccuracy of ±0.9oC and ±2.3oC from 0-100oC after 2-point calibration, with and without static nonlinearity correction, respectively. It achieves a resolution of 0.3oC, resolution FoM of 0.3(nJ/conv)res2 , and measurement (conversion) time of 6.5μs

Illinois Digital Environment for Access to Learning and Scholarship Repository

Design of High-Speed Power-Efficient A/D Converters for Wireline ADC-Based Receiver Applications

Author: Cai Shengchang
Publication venue
Publication date: 17/01/2019
Field of study

Serial input/output (I/O) data rates are increasing in order to support the explosion in network traffic driven by big data applications such as the Internet of Things (IoT), cloud computing and etc. As the high-speed data symbol times shrink, this results in an increased amount of inter-symbol interference (ISI) for transmission over both severe low-pass electrical channels and dispersive optical channels. This necessitates increased equalization complexity and consideration of advanced modulation schemes, such as four-level pulse amplitude modulation (PAM-4). Serial links which utilize an analog-to-digital converter (ADC) receiver front-end offer a potential solution, as they enable more powerful and flexible digital signal processing (DSP) for equalization and symbol detection and can easily support advanced modulation schemes. Moreover, the DSP back-end provides robustness to process, voltage, and temperature (PVT) variations, benefits from improved area and power with CMOS technology scaling and offers easy design transfer between different technology nodes and thus improved time-to-market. However, ADC-based receivers generally consume higher power relative to their mixed-signal counterparts because of the significant power consumed by conventional multi-GS/s ADC implementations. This motivates exploration of energy-efficient ADC designs with moderate resolution and very high sampling rates to support data rates at or above 50Gb/s. This dissertation presents two power-efficient designs of ≥25GS/s time-interleaved ADCs for ADC-based wireline receivers. The first prototype includes the implementation of a 6b 25GS/s time-interleaved multi-bit search ADC in 65nm CMOS with a soft-decision selection algorithm that provides redundancy for relaxed track-and-hold (T/H) settling and improved metastability tolerance, achieving a figure-of-merit (FoM) of 143fJ/conversion step and 1.76pJ/bit for a PAM-4 receiver design. The second prototype features the design of a 52Gb/s PAM-4 ADC-based receiver in 65nm CMOS, where the front-end consists of a 4-stage continuous-time linear equalizer (CTLE)/variable gain amplifier (VGA) and a 6b 26GS/s time-interleaved SAR ADC with a comparator-assisted 2b/stage structure for reduced digital-to-analog converter (DAC) complexity and a 3-tap embedded feed-forward equalizer (FFE) for relaxed ADC resolution requirement. The receiver front-end achieves an efficiency of 4.53bJ/bit, while compensating for up to 31dB loss with DSP and no transmitter (TX) equalization

Texas A&M Repository

Design of Low-Power NRZ/PAM-4 Wireline Transmitters

Author: Yang Hae Woong
Publication venue
Publication date: 12/01/2021
Field of study

Rapid growing demand for instant multimedia access in a myriad of digital devices has pushed the need for higher bandwidth in modern communication hardwares ranging from short-reach (SR) memory/storage interfaces to long-reach (LR) data center Ethernets. At the same time, comprehensive design optimization of link system that meets the energy-efficiency is required for mobile computing and low operational cost at datacenters. This doctoral study consists of design of two low-swing wireline transmitters featuring a low-power clock distribution and 2-tap equalization in energy-efficient manners up to 20-Gb/s operation. In spite of the reduced signaling power in the voltage-mode (VM) transmit driver, the presence of the segment selection logic still diminishes the power saving benefit. The first work presents a scalable VM transmitter which offers low static power dissipation and adopts an impedance-modulated 2-tap equalizer with analog tap control, thereby obviating driver segmentation and reducing pre-driver complexity and dynamic power. Per-channel quadrature clock generation with injection-locked oscillators (ILO) allows the generation of rail-to-rail quadrature clocks. Energy efficiency is further improved with capacitively driven low-swing global clock distribution and supply scaling at lower data rates, while output eye quality is maintained at low voltages with automatic phase calibration of the local ILO-generated quarter-rate clocks. A prototype fabricated in a general purpose 65 nm CMOS process includes a 2 mm global clock distribution network and two transmitters that support an output swing range of 100-300mV with up to 12-dB of equalization. The transmitters achieve 8-16 Gb/s operation at 0.65-1.05 pJ/b energy efficiency. The second work involves a dual-mode NRZ/PAM-4 differential low-swing voltage-mode (VM) transmitter. The pulse-selected output multiplexing allows reduction of power supply and deterministic jitter caused by large on-chip parasitic inherent in the transmission-gate-based multiplexers in the earlier work. Analog impedance control replica circuits running in the background produce gate-biasing voltages that control the peaking ratio for 2-tap feed-forward equalization and PAM-4 symbol levels for high-linearity. This analog control also allows for efficient generation of the middle levels in PAM-4 operation with good linearity quantified by level separation mismatch ratio of 95%. In NRZ mode, 2-tap feedforward equalization is configurable in high-performance controlled-impedance or energy-efficient impedance-modulated settings to provide performance scalability. Analytic design consideration on dynamic power, data-rate, mismatch, and output swing brings optimal performance metric on the given technology node. The proof-of-concept prototype is verified on silicon with 65 nm CMOS process with improved performance in speed and energy-efficiency owing to double-stack NMOS transistors in the output stage. The transmitter consumes as low as 29.6mW in 20-Gb/s NRZ and 25.5mW in the 28-Gb/s PAM-4 operations

Texas A&M Repository

오프셋 제거기의 적응 제어 등화기와 보우-레이트 위상 검출기를 활용한 수신기 설계

Author: 이광호
Publication venue: 서울대학교 대학원
Publication date: 01/08/2021
Field of study

학위논문(박사) -- 서울대학교대학원 : 공과대학 전기·정보공학부, 2021.8. 염제완.In this thesis, designs of high-speed, low-power wireline receivers (RX) are explained. To be specific, the circuit techniques of DC offset cancellation, merged-summer DFE, stochastic Baud-rate CDR, and the phase detector (PD) for multi-level signal are proposed. At first, an RX with adaptive offset cancellation (AOC) and merged summer decision-feedback equalizer (DFE) is proposed. The proposed AOC engine removes the random DC offset of the data path by examining the random data stream's sampled data and edge outputs. In addition, the proposed RX incorporates a shared-summer DFE in a half-rate structure to reduce power dissipation and hardware complexity of the adaptive equalizer. A prototype chip fabricated in 40 nm CMOS technology occupies an active area of 0.083 mm2. Thanks to the AOC engine, the proposed RX achieves the BER of less than 10-12 in a wide range of data rates: 1.62-10 Gb/s. The proposed RX consumes 18.6 mW at 10 Gb/s over a channel with a 27 dB loss at 5 GHz, exhibiting a figure-of-merit of 0.068 pJ/b/dB. Secondly, a 40 nm CMOS RX with Baud-rate phase-detector (BRPD) is proposed. The RX includes two PDs: the BRPD employing the stochastic technique and the BRPD suitable for multi-level signals. Thanks to the Baud-rate CDR’s advantage, by not using an edge-sampling clock, the proposed CDR can reduce the power consumption by lowering the hardware complexity. Besides, the proposed stochastic phase detector (SPD) tracks an optimal phase-locking point that maximizes the vertical eye opening. Furthermore, despite residual inter-symbol interference, proposed BRPD for multi-level signal secures vertical eye margin, which is especially vulnerable in the multi-level signal. Besides, the proposed BRPD has a unique lock point with an adaptive DFE, unlike conventional Mueller-Muller PD. A prototype chip fabricated in 40 nm CMOS technology occupies an active area of 0.24 mm2. The proposed PAM-4 RX achieves the bit-error-rate less than 10-11 in 48 Gb/s and the power efficiency of 2.42 pJ/b.본 논문은 고속, 저전력으로 동작하는 유선 수신기의 설계에 대해 설명하고 있다. 구체적으로 말하면, 오프셋 상쇄, 병합된 서머를 사용하는 결정 피드백 등화기 기술, 확률적 보우 레이트 클럭과 데이터 복원기, 그리고 다중 레벨 신호에 적합한 위상 검출기를 제안한다. 첫째로, 적응 오프셋 제거 및 병합된 서머를 사용하는 결정 피드백 등화기를 갖춘 수신기를 제안한다. 제안된 적응 오프셋 제거 엔진은 임의의 데이터 스트림의 샘플링 데이터, 에지 출력을 검사하여 데이터 경로 상의 오프셋을 제거한다. 또한 하프 레이트 구조의 병합된 서머를 사용하는 결정 피드백 등화기는 전력의 사용과 하드웨어의 복잡성을 줄인다. 40 nm CMOS 기술로 제작된 프로토타입 칩은 0.083 mm2 의 면적을 가진다. 적응 오프셋 제거기 덕분에 제안된 수신기는 10-12 미만의 BER을 달성한다. 또한 제안된 수신기는 5GHz에서 27 dB의 로스를 갖는 채널에서 10 Gb/s의 속도에서 18.6 mW를 소비하며 0.068 pJ/b/dB의 FoM을 달성하였다. 두번째로, 보우 레이트 위상 검출기가 있는 40 nm CMOS 수신기가 제안되었다. 수신기에는 두개의 보우 레이트 위상 검출기를 포함한다. 하나는 확률론적 기법을 사용하는 보우 레이트 위상 검출기이다. 보우 레이트 클럭 데이터 복원기의 장점 덕분에 에지 샘플링 클럭을 사용하지 않음으로서 파워의 소모와 하드웨어의 복잡성을 줄였다. 또한 확률적 위상 검출기는 수직 아이 오프닝을 최대화하는 최적의 위상 지점을 찾을 수 있었다. 다른 위상 검출기는 다중 레벨 신호에 적합한 방식이다. 심볼 간 간섭이 다중 레벨 신호에 매우 취약한 문제가 있더라도 제안된 다중 레벨 신호용 보우 레이트 위상 검출기는 수직 아이 마진을 확보한다. 게다가 제안된 보우 레이트 위상 검출기는 기존의 뮬러-뮐러 위상 검출기와 달리 적응형 결정 피드백 등화기가 있더라도 유일한 락 지점을 갖는다. 프로토타입 칩은 0.24mm2의 면적을 가진다. 제안된 PAM-4 수신기는 48 Gb/s의 속도에서 10-11 미만의 BER을 가지고, 2.42 pJ/b의 FoM을 가진다.CHAPTER 1 INTRODUCTION 1 1.1 MOTIVATION 1 1.2 THESIS ORGANIZATION 5 CHAPTER 2 BACKGROUNDS 6 2.1 BASIC ARCHITECTURE IN SERIAL LINK 6 2.1.1 SERIAL COMMUNICATION 6 2.1.2 CLOCK AND DATA RECOVERY 8 2.1.3 MULTI-LEVEL PULSE-AMPLITUDE MODULATION 10 2.2 EQUALIZER 12 2.2.1 EQUALIZER OVERVIEW 12 2.2.2 DECISION-FEEDBACK EQUALIZER 15 2.2.3 ADAPTIVE EQUALIZER 18 2.3 CLOCK RECOVERY 21 2.3.1 2X OVERSAMPLING PD ALEXANDER PD 22 2.3.2 BAUD-RATE PD MUELLER MULLER PD 25 CHAPTER 3 AN ADAPTIVE OFFSET CANCELLATION SCHEME AND SHARED SUMMER ADAPTIVE DFE 28 3.1 OVERVIEW 28 3.2 AN ADAPTIVE OFFSET CANCELLATION SCHEME AND SHARED-SUMMER ADAPTIVE DFE FOR LOW POWER RECEIVER 31 3.3 SHARED SUMMER DFE 37 3.4 RECEIVER IMPLEMENTATION 42 3.5 MEASUREMENT RESULTS 45 CHAPTER 4 PAM-4 BAUD-RATE DIGITAL CDR 51 4.1 OVERVIEW 51 4.2 OVERALL ARCHITECTURE 53 4.2.1 PROPOSED BAUD-RATE CDR ARCHITECTURE 53 4.2.2 PROPOSED ANALOG FRONT-END STRUCTURE 59 4.3 STOCHASTIC PHASE DETECTION PAM-4 CDR 64 4.3.1 PROPOSED STOCHASTIC PHASE DETECTION 64 4.3.2 COMPARISON OF THE STOCHASTIC PD WITH SS-MMPD 70 4.4 PHASE DETECTION FOR MULTI-LEVEL SIGNALING 73 4.4.1 PROPOSED BAUD-RATE PHASE DETECTOR FOR MULTI-LEVEL SIGNAL 73 4.4.2 DATA LEVEL AND DFE COEFFICIENT ADAPTATION 79 4.4.3 PROPOSED PHASE DETECTOR 84 4.5 MEASUREMENT RESULT 88 4.5.1 MEASUREMENT OF THE PROPOSED STOCHASTIC BAUD-RATE PHASE DETECTION 94 4.5.2 MEASUREMENT OF THE PROPOSED BAUD-RATE PHASE DETECTION FOR MULTI-LEVEL SIGNAL 97 CHAPTER 5 CONCLUSION 103 BIBLIOGRAPHY 105 초 록 109박

SNU Open Repository and Archive

Design of High-Speed SerDes Transceiver for Chip-to-Chip Communications in CMOS Process

Author: Zheng Xuqiang
Publication venue
Publication date
Field of study

With the continuous increase of on-chip computation capacities and exponential growth of data-intensive applications, the high-speed data transmission through serial links has become the backbone for modern communication systems. To satisfy the massive data-exchanging requirement, the data rate of such serial links has been updated from several Gb/s to tens of Gb/s. Currently, the commercial standards such as Ethernet 400GbE, InfiniBand high data rate (HDR), and common electrical interface (CEI)-56G has been developing towards 40+ Gb/s. As the core component within these links, the transceiver chipset plays a fundamental role in balancing the operation speed, power consumption, area occupation, and operation range. Meanwhile, the CMOS process has become the dominant technology in modern transceiver chip fabrications due to its large-scale digital integration capability and aggressive pricing advantage. This research aims to explore advanced techniques that are capable of exploiting the maximum operation speed of the CMOS process, and hence provides potential solutions for 40+ Gb/s CMOS transceiver designs. The major contributions are summarized as follows. A low jitter ring-oscillator-based injection-locked clock multiplier (RILCM) with a hybrid frequency tracking loop that consists of a traditional phase-locked loop (PLL), a timing-adjusted loop, and a loop selection state-machine is implemented in 65-nm C-MOS process. In the ring voltage-controlled oscillator, a full-swing pseudo-differential delay cell is proposed to lower the device noise to phase noise conversion. To obtain high operation speed and high detection accuracy, a compact timing-adjusted phase detector tightly combined with a well-matched charge pump is designed. Meanwhile, a lock-loss detection and lock recovery is devised to endow the RILCM with a similar lock-acquisition ability as conventional PLL, thus excluding the initial frequency set- I up aid and preventing the potential lock-loss risk. The experimental results show that the figure-of-merit of the designed RILCM reaches -247.3 dB, which is better than previous RILCMs and even comparable to the large-area LC-ILCMs. The transmitter (TX) and receiver (RX) chips are separately designed and fab- ricated in 65-nm CMOS process. The transmitter chip employs a quarter-rate multi-multiplexer (MUX)-based 4-tap feed-forward equalizer (FFE) to pre-distort the output. To increase the maximum operating speed, a bandwidth-enhanced 4:1 MUX with the capability of eliminating charge-sharing effect is proposed. To produce the quarter-rate parallel data streams with appropriate delays, a compact latch array associated with an interleaved-retiming technique is designed. The receiver chip employs a two-stage continuous-time linear equalizer (CTLE) as the analog front-end and integrates an improved clock data recovery to extract the sampling clocks and retime the incoming data. To automatically balance the jitter tracking and jitter suppression, passive low-pass filters with adaptively-adjusted bandwidth are introduced into the data-sampling path. To optimize the linearity of the phase interpolation, a time-averaging-based compensating phase interpolator is proposed. For equalization, a combined TX-FFE and RX-CTLE is applied to compensate for the channel loss, where a low-cost edge-data correlation-based sign zero-forcing adaptation algorithm is proposed to automatically adjust the TX-FFE’s tap weights. Measurement results show that the fabricated transmitter/receiver chipset can deliver 40 Gb/s random data at a bit error rate of 16 dB loss at the half-baud frequency, while consuming a total power of 370 mW

University of Lincoln Institutional Repository

Design of High-Speed CMOS Interface Circuits for Optical Communications

Author: 정규섭
Publication venue: 서울대학교 대학원
Publication date: 01/08/2017
Field of study

학위논문 (박사)-- 서울대학교 대학원 공과대학 전기·컴퓨터공학부, 2017. 8. 정덕균.The bandwidth requirement of wireline communications has increased ex-ponentially because of the ever-increasing demand for data centers and high-performance computing systems. However, it becomes difficult to satisfy the requirement with legacy electrical links which suffer from frequency-dependent losses due to skin effect, dielectric loss, channel reflections, and crosstalk, resulting in a severe bandwidth limitation. In order to overcome this challenge, it is necessary to introduce optical communication technology, which has been mainly used for long-reach communications, such as long-haul net-works and metropolitan area networks, to the medium- and short-reach com-munication systems. However, there still remain important issues to be resolved to facilitate the adoption of the optical technologies. The most critical challeng-es are the energy efficiency and the cost competitiveness as compared to the legacy copper-based electrical communications. One possible solution is silicon photonics that has long been investigated by a number of research groups. De-spite inherent incompatibility of silicon with the photonic world, silicon pho-tonics is promising and is the only solution that can leverage the mature CMOS technologies. In this thesis, we summarize the current status of silicon photonics and pro-vide the prospect of the optical interconnection. We also present key circuit techniques essential to the implementation of high-speed and low-power optical receivers. And then, we propose optical receiver architectures satisfying the aforementioned requirements with novel circuit techniques.CHAPTER 1 INTRODUCTION 1 1.1 MOTIVATION 1 1.2 THESIS ORGANIZATION 6 CHAPTER 2 BACKGROUND OF OPTICAL COMMUNICATION 7 2.1 OVERVIEW OF OPTICAL LINK 7 2.2 SILICON PHOTONICS 11 2.3 HYBRID INTEGRATION 22 2.4 SILICON-BASED PHOTODIODES 28 2.4.1 BASIC TERMINOLOGY 28 2.4.2 SILICON PD 29 2.4.3 GERMANIUM PD 32 2.4.4 INTEGRATION WITH WAVEGUIDE 33 CHAPTER 3 CIRCUIT TECHNIQUES FOR OPTICAL RECEIVER 35 3.1 BASIS OF TRANSIMPEDANCE AMPLIFIER 35 3.2 TOPOLOGY OF TIA 39 3.2.1 RESISTOR-BASED TIA 39 3.2.2 COMMON-GATE-BASED TIA 41 3.2.3 FEEDBACK-BASED TIA 44 3.2.4 INVERTER-BASED TIA 47 3.2.5 INTEGRATING RECEIVER 48 3.3 BANDWIDTH EXTENSION TECHNIQUES 49 3.3.1 INDUCTOR-BASED TECHNIQUE 49 3.3.2 EQUALIZATION 61 3.4 CLOCK AND DATA RECOVERY CIRCUITS 66 3.4.1 CDR BASIC 66 3.4.2 CDR EXAMPLES 68 CHAPTER 4 LOW-POWER OPTICAL RECEIVER FRONT-END 73 4.1 OVERVIEW 73 4.2 INVERTER-BASED TIA WITH RESISTIVE FEEDBACK 74 4.3 INVERTER-BASED TIA WITH RESISTIVE AND INDUCTIVE FEEDBACK 81 4.4 CIRCUIT IMPLEMENTATION 89 4.5 MEASUREMENT RESULTS 93 CHAPTER 5 BANDWIDTH- AND POWER-SCALABLE OPTICAL RECEIVER FRONT-END 96 5.1 OVERVIEW 96 5.2 BANDWIDTH AND POWER SCALABILITY 97 5.3 GM STABILIZATION 98 5.4 OVERALL BLOCK DIAGRAM OF RECEIVER 104 5.5 MEASUREMENT RESULTS 111 CHAPTER 6 CONCLUSION 118 BIBLIOGRAPHY 120 초 록 131Docto

SNU Open Repository and Archive

Energy-Efficient Receiver Design for High-Speed Interconnects

Author: Chen Kuan-Chang
Publication venue
Publication date: 01/01/2022
Field of study

High-speed interconnects are of vital importance to the operation of high-performance computing and communication systems, determining the ultimate bandwidth or data rates at which the information can be exchanged. Optical interconnects and the employment of high-order modulation formats are considered as the solutions to fulfilling the envisioned speed and power efficiency of future interconnects. One common key factor in bringing the success is the availability of energy-efficient receivers with superior sensitivity. To enhance the receiver sensitivity, improvement in the signal-to-noise ratio (SNR) of the front-end circuits, or equalization that mitigates the detrimental inter-symbol interference (ISI) is required. In this dissertation, architectural and circuit-level energy-efficient techniques serving these goals are presented. First, an avalanche photodetector (APD)-based optical receiver is described, which utilizes non-return-to-zero (NRZ) modulation and is applicable to burst-mode operation. For the purposes of improving the overall optical link energy efficiency as well as the link bandwidth, this optical receiver is designed to achieve high sensitivity and high reconfiguration speed. The high sensitivity is enabled by optimizing the SNR at the front-end through adjusting the APD responsivity via its reverse bias voltage, along with the incorporation of 2-tap feedforward equalization (FFE) and 2-tap decision feedback equalization (DFE) implemented in current-integrating fashion. The high reconfiguration speed is empowered by the proposed integrating dc and amplitude comparators, which eliminate the RC settling time constraints. The receiver circuits, excluding the APD die, are fabricated in 28-nm CMOS technology. The optical receiver achieves bit-error-rate (BER) better than 1E−12 at −16-dBm optical modulation amplitude (OMA), 2.24-ns reconfiguration time with 5-dB dynamic range, and 1.37-pJ/b energy efficiency at 25 Gb/s. Second, a 4-level pulse amplitude modulation (PAM4) wireline receiver is described, which incorporates continuous time linear equalizers (CTLEs) and a 2-tap direct DFE dedicated to the compensation for the first and second post-cursor ISI. The direct DFE in a PAM4 receiver (PAM4-DFE) is made possible by the proposed CMOS track-and-regenerate slicer. This proposed slicer offers rail-to-rail digital feedback signals with significantly improved clock-to-Q delay performance. The reduced slicer delay relaxes the settling time constraint of the summer circuits and allows the stringent DFE timing constraint to be satisfied. With the availability of a direct DFE employing the proposed slicer, inductor-based bandwidth enhancement and loop-unrolling techniques, which can be power/area intensive, are not required. Fabricated in 28-nm CMOS technology, the PAM4 receiver achieves BER better than 1E−12 and 1.1-pJ/b energy efficiency at 60 Gb/s, measured over a channel with 8.2-dB loss at Nyquist frequency. Third, digital neural-network-enhanced FFEs (NN-FFEs) for PAM4 analog-to-digital converter (ADC)-based optical interconnects are described. The proposed NN-FFEs employ a custom learnable piecewise linear (PWL) activation function to tackle the nonlinearities with short memory lengths. In contrast to the conventional Volterra equalizers where multipliers are utilized to generate the nonlinear terms, the proposed NN-FFEs leverage the custom PWL activation function for nonlinear operations and reduce the required number of multipliers, thereby improving the area and power efficiencies. Applications in the optical interconnects based on micro-ring modulators (MRMs) are demonstrated with simulation results of 50-Gb/s and 100-Gb/s links adopting PAM4 signaling. The proposed NN-FFEs and the conventional Volterra equalizers are synthesized with the standard-cell libraries in a commercial 28-nm CMOS technology, and their power consumptions and performance are compared. Better than 37% lower power overhead can be achieved by employing the proposed NN-FFEs, in comparison with the Volterra equalizer that leads to similar improvement in the symbol-error-rate (SER) performance.</p

Caltech Theses and Dissertations

Machine learning approach for high speed link modeling and IBIS-AMI model generation

Author: Wang Xinying
Publication venue
Publication date: 01/12/2020
Field of study

The high-speed link system is one of the major components in the networking infrastructure. Developing a high-performance behavioral model for such a system is crucial but challenging, especially when taking nonlinearity into account. This work reports modeling the high-speed link (HSL) system using machine learning and implementing the model into IBIS-AMI, an industrial standard for SerDes simulation and verification. We started with developing a Volterra series model by extracting the Volterra kernels using feed forward neural networks. We proposed a monomial power series neural network (MPSNN) which can extract Volterra kernels that relate to nonlinearity up to the third order. We developed an analytical mapping from neural network weights to Volterra kernels. The analytical mapping allows accurate time domain signal reconstruction with extracted Volterra kernels. We applied the MPSNN to model pulse amplitude modulation 4 level (PAM-4) and non-return-to-zero (NRZ) system. Volterra kernels up to the third order can be accurately identified. The curse of dimensionality associated with Volterra series impedes the practical applications of the Volterra series. The number of Volterra kernels increases exponentially with the increase in memory length and the nonlinearity order. The large number of Volterra kernels consume a vast amount of computational power during signal reconstruction. To address this challenge, we proposed a Laguerre-Volterra feed forward neural network (LVFFN). The input time-series signal is orthogonalized, in other words, Laguerre-expanded, before it is feed to the neural network. The dimension of the input signal is significantly reduced, which results in many fewer neurons in the hidden layer. We modeled the PAM-4 and NRZ system with LVFFN. The resulted model has the number of parameters that are up to six orders of magnitudes less than the Volterra series. We could also model just the receiver instead of the whole system to add more flexibility to the model in practical applications. The LVFFN model greatly addressed the curse of dimensionality associated with Volterra series. Then the next question addresses how are we going to use it. Since the machine learning based model is not a standardized model, it is difficult to be co-simulated with models generated by other approaches. To circumvent the challenges in model transportability and interoperability, we implemented the LVFFN model into the IBIS-AMI model, an industrial standard model that is compatible with most of the circuit simulators. We could simulate the LVFFN IBIS-AMI model in Keysight ADS and conduct the eye-diagram analysis. IBIS-AMI model generation is not trivial. It requires cross-disciplinary knowledge in signal integrity, HSL circuit, and software engineering. To facilitate the process of IBIS-AMI model generation, we developed a software, ezAMI, that can generate the IBIS-AMI model by clicks. The software is developed using Qt/C++ and is an open-source software. The software architecture and tutorial are introduced in this dissertation as well

Illinois Digital Environment for Access to Learning and Scholarship Repository