23 research outputs found

    State of the art in chip-to-chip interconnects

    This thesis presents a study of short-range links for chips mounted in the same package, on printed circuit boards, or on interposers. The surveyed links are implemented in CMOS technologies between 7 and 250 nm, operate at data rates between 0.4 and 112 Gb/s/pin, and achieve energy efficiencies from 0.3 to 67.7 pJ/bit. The links operate on channels with an attenuation lower than 50 dB. The surveyed articles are compared graphically to show the correlation between the essential metrics of chip-to-chip interconnects, as well as their evolution over the last 20 years

    A 60-Gb/s PAM4 Wireline Receiver With 2-Tap Direct Decision Feedback Equalization Employing Track-and-Regenerate Slicers in 28-nm CMOS

    This article describes a 4-level pulse amplitude modulation (PAM4) receiver incorporating continuous time linear equalizers (CTLEs) and a 2-tap direct decision feedback equalizer (DFE) for applications in wireline communication. A CMOS track-and-regenerate slicer is proposed and employed in the PAM4 receiver. The proposed slicer is designed for the purposes of improving the clock-to-Q delay as well as the output signal swing. A direct DFE in a PAM4 receiver is made possible with the proposed slicer by having rail-to-rail digital feedback signals available with reduced delay, and accordingly relaxing the settling time constraint of the summer. With the 2-tap direct DFE enabled by the proposed slicer, loop-unrolling and inductor-based bandwidth enhancement techniques, which can be area/power intensive, are not necessary at high data rates. The PAM4 receiver fabricated in 28-nm CMOS technology achieves bit-error-rate (BER) better than 1E-12, and energy efficiency of 1.1 pJ/b at 60 Gb/s, measured over a channel with 8.2-dB loss at Nyquist
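The direct decision feedback idea described above can be sketched behaviorally. This is a minimal illustration, not the article's circuit: it assumes an ideal noiseless channel with a unit main cursor and two post-cursor taps, nominal PAM4 levels of -3, -1, 1 and 3, and perfectly known tap weights. The names `slice_pam4`, `channel` and `dfe_2tap` and the tap values 0.4 and 0.2 are hypothetical.

```python
import random

PAM4_LEVELS = (-3, -1, 1, 3)

def slice_pam4(x):
    """Three-threshold PAM4 slicer: map a sample to the nearest nominal level."""
    if x < -2:
        return -3
    if x < 0:
        return -1
    if x < 2:
        return 1
    return 3

def channel(symbols, h1, h2):
    """Hypothetical channel: unit main cursor plus two post-cursor ISI terms."""
    out = []
    for n, s in enumerate(symbols):
        prev1 = symbols[n - 1] if n >= 1 else 0
        prev2 = symbols[n - 2] if n >= 2 else 0
        out.append(s + h1 * prev1 + h2 * prev2)
    return out

def dfe_2tap(samples, h1, h2):
    """Direct 2-tap DFE: subtract ISI estimated from the two previous decisions."""
    decisions = []
    for n, y in enumerate(samples):
        d1 = decisions[n - 1] if n >= 1 else 0
        d2 = decisions[n - 2] if n >= 2 else 0
        decisions.append(slice_pam4(y - h1 * d1 - h2 * d2))
    return decisions

random.seed(0)
tx = [random.choice(PAM4_LEVELS) for _ in range(1000)]
rx = channel(tx, h1=0.4, h2=0.2)

# Symbol errors without equalization versus with the 2-tap DFE.
no_eq_errors = sum(slice_pam4(y) != s for y, s in zip(rx, tx))
dfe_errors = sum(d != s for d, s in zip(dfe_2tap(rx, 0.4, 0.2), tx))
```

In this model each decision must be available before the next sample is sliced, which is the timing constraint that the abstract says the faster slicer relaxes.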

    High Speed Reconfigurable NRZ/PAM4 Transceiver Design Techniques

    While the majority of wireline standards use simple binary non-return-to-zero (NRZ) signaling, four-level pulse-amplitude modulation (PAM4) standards are emerging to increase bandwidth density. This dissertation proposes efficient implementations for high-speed NRZ/PAM4 transceivers. The first prototype includes a dual-mode NRZ/PAM4 serial I/O transmitter which can support both modulations with minimum power and hardware overhead. A source-series-terminated (SST) transmitter achieves 1.2Vpp output swing and employs lookup table (LUT) control of a 31-segment output digital-to-analog converter (DAC) to implement 4/2-tap feed-forward equalization (FFE) in NRZ/PAM4 modes, respectively. Transmitter power is improved with low-overhead analog impedance control in the DAC cells and a quarter-rate serializer based on a tri-state inverter-based mux with dynamic pre-driver gates. The transmitter is designed to work with a receiver that implements an NRZ/PAM4 decision feedback equalizer (DFE) that employs 1 finite impulse response (FIR) and 2 infinite impulse response (IIR) taps for first post-cursor and long-tail ISI cancellation, respectively. Fabricated in GP 65-nm CMOS, the transmitter occupies 0.060mm² area and achieves 16Gb/s NRZ and 32Gb/s PAM4 operation at 10.4 and 4.9 mW/Gb/s while operating over channels with 27.6 and 13.5dB loss at Nyquist, respectively. The second prototype is a 56Gb/s four-level pulse amplitude modulation (PAM4) quarter-rate wireline receiver implemented in a 65nm CMOS process. The front end utilizes a single-stage continuous time linear equalizer (CTLE) to boost the main cursor and relax the pre-cursor cancellation requirement, requiring only a 2-tap pre-cursor feed-forward equalization (FFE) on the transmitter side. A 2-tap decision feedback equalizer (DFE) with one finite impulse response (FIR) tap and one infinite impulse response (IIR) tap is employed to cancel first post-cursor and long-tail inter-symbol interference (ISI). 
The FIR tap direct feedback is implemented inside the CML slicers to relax the critical timing of the DFE and maximize the achievable data rate. In addition to the three main data samplers per slice, an error sampler is utilized for background threshold control, and an edge-based sampler both performs PLL-based CDR phase detection and generates information for background DFE tap adaptation. The receiver consumes 4.63mW/Gb/s and compensates for up to 20.8dB loss when operated with a 2-tap FFE transmitter. The experimental results and comparison with the state of the art show superior power efficiency of the presented prototypes for similar data rate and channel loss. The proposed design techniques are not limited to these specific prototypes and can be applied to any wireline transceiver with a different modulation, data rate, or CMOS technology
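The figures quoted in the abstract above can be related with some simple arithmetic. The sketch below assumes that the Nyquist frequency is half the symbol rate, that NRZ and PAM4 carry 1 and 2 bits per symbol, and that 1 mW/Gb/s equals 1 pJ/bit. The function names are illustrative, and the derived values are arithmetic only, not results reported by the dissertation.

```python
def nyquist_ghz(data_rate_gbps, bits_per_symbol):
    """Nyquist frequency in GHz: half the symbol (baud) rate."""
    baud_gbd = data_rate_gbps / bits_per_symbol
    return baud_gbd / 2

def total_power_mw(efficiency_mw_per_gbps, data_rate_gbps):
    """Total power implied by an efficiency in mW/Gb/s (numerically pJ/bit)."""
    return efficiency_mw_per_gbps * data_rate_gbps

# 16 Gb/s NRZ and 32 Gb/s PAM4 share the same 16 GBd symbol rate,
# so both modes have the same Nyquist frequency.
nrz_nyquist = nyquist_ghz(16, bits_per_symbol=1)
pam4_nyquist = nyquist_ghz(32, bits_per_symbol=2)

# Implied transmitter power in each mode (derived, not quoted in the abstract).
nrz_power = total_power_mw(10.4, 16)
pam4_power = total_power_mw(4.9, 32)
```

Under these assumptions both transmitter modes run at the same symbol rate, which is why PAM4 doubles the data rate without raising the Nyquist frequency.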

    Design of a Receiver Using an Equalizer with Adaptive Offset Cancellation and a Baud-Rate Phase Detector

    Doctoral thesis, Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, August 2021. 염제완. In this thesis, designs of high-speed, low-power wireline receivers (RX) are explained. Specifically, circuit techniques for DC offset cancellation, a merged-summer DFE, a stochastic Baud-rate CDR, and a phase detector (PD) for multi-level signals are proposed. First, an RX with adaptive offset cancellation (AOC) and a merged-summer decision-feedback equalizer (DFE) is proposed. The proposed AOC engine removes the random DC offset of the data path by examining the sampled data and edge outputs of a random data stream. In addition, the proposed RX incorporates a shared-summer DFE in a half-rate structure to reduce the power dissipation and hardware complexity of the adaptive equalizer. A prototype chip fabricated in 40 nm CMOS technology occupies an active area of 0.083 mm². Thanks to the AOC engine, the proposed RX achieves a BER of less than 10⁻¹² over a wide range of data rates: 1.62-10 Gb/s. The proposed RX consumes 18.6 mW at 10 Gb/s over a channel with a 27 dB loss at 5 GHz, exhibiting a figure-of-merit of 0.068 pJ/b/dB. Second, a 40 nm CMOS RX with a Baud-rate phase detector (BRPD) is proposed. The RX includes two PDs: a BRPD employing a stochastic technique and a BRPD suitable for multi-level signals. Because a Baud-rate CDR does not use an edge-sampling clock, the proposed CDR reduces power consumption by lowering hardware complexity. In addition, the proposed stochastic phase detector (SPD) tracks an optimal phase-locking point that maximizes the vertical eye opening. Furthermore, despite residual inter-symbol interference, the proposed BRPD for multi-level signals secures the vertical eye margin, which is especially vulnerable in multi-level signaling. The proposed BRPD also has a unique lock point with an adaptive DFE, unlike the conventional Mueller-Muller PD. A prototype chip fabricated in 40 nm CMOS technology occupies an active area of 0.24 mm². 
The proposed PAM-4 RX achieves a bit-error rate of less than 10⁻¹¹ at 48 Gb/s and a power efficiency of 2.42 pJ/b.
    Table of contents:
    Chapter 1. Introduction
      1.1 Motivation
      1.2 Thesis Organization
    Chapter 2. Background
      2.1 Basic Architecture in Serial Link
        2.1.1 Serial Communication
        2.1.2 Clock and Data Recovery
        2.1.3 Multi-Level Pulse-Amplitude Modulation
      2.2 Equalizer
        2.2.1 Equalizer Overview
        2.2.2 Decision-Feedback Equalizer
        2.2.3 Adaptive Equalizer
      2.3 Clock Recovery
        2.3.1 2x Oversampling PD: Alexander PD
        2.3.2 Baud-Rate PD: Mueller-Muller PD
    Chapter 3. An Adaptive Offset Cancellation Scheme and Shared-Summer Adaptive DFE
      3.1 Overview
      3.2 An Adaptive Offset Cancellation Scheme and Shared-Summer Adaptive DFE for Low Power Receiver
      3.3 Shared-Summer DFE
      3.4 Receiver Implementation
      3.5 Measurement Results
    Chapter 4. PAM-4 Baud-Rate Digital CDR
      4.1 Overview
      4.2 Overall Architecture
        4.2.1 Proposed Baud-Rate CDR Architecture
        4.2.2 Proposed Analog Front-End Structure
      4.3 Stochastic Phase Detection PAM-4 CDR
        4.3.1 Proposed Stochastic Phase Detection
        4.3.2 Comparison of the Stochastic PD with SS-MMPD
      4.4 Phase Detection for Multi-Level Signaling
        4.4.1 Proposed Baud-Rate Phase Detector for Multi-Level Signal
        4.4.2 Data Level and DFE Coefficient Adaptation
        4.4.3 Proposed Phase Detector
      4.5 Measurement Result
        4.5.1 Measurement of the Proposed Stochastic Baud-Rate Phase Detection
        4.5.2 Measurement of the Proposed Baud-Rate Phase Detection for Multi-Level Signal
    Chapter 5. Conclusion
    Bibliography
    Abstract (in Korean)
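The loss-normalized figure of merit quoted in the abstract above can be reproduced from the stated power, data rate and channel loss. The sketch assumes the figure of merit is defined as energy per bit divided by channel loss in dB, which matches the quoted unit of pJ/b/dB; the function names are illustrative.

```python
def energy_per_bit_pj(power_mw, data_rate_gbps):
    """Energy per bit in pJ/b: mW divided by Gb/s gives pJ/b directly."""
    return power_mw / data_rate_gbps

def loss_normalized_fom(power_mw, data_rate_gbps, loss_db):
    """Assumed figure of merit: energy per bit divided by channel loss, in pJ/b/dB."""
    return energy_per_bit_pj(power_mw, data_rate_gbps) / loss_db

# First receiver in the abstract: 18.6 mW at 10 Gb/s over a 27 dB channel.
epb = energy_per_bit_pj(18.6, 10.0)          # about 1.86 pJ/b
fom = loss_normalized_fom(18.6, 10.0, 27.0)  # about 0.0689 pJ/b/dB
```

The computed value of roughly 0.0689 pJ/b/dB is consistent with the 0.068 pJ/b/dB in the abstract if that figure was truncated to three decimal places.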

    Design of High-Speed Power-Efficient A/D Converters for Wireline ADC-Based Receiver Applications

    Serial input/output (I/O) data rates are increasing in order to support the explosion in network traffic driven by big data applications such as the Internet of Things (IoT) and cloud computing. As high-speed data symbol times shrink, inter-symbol interference (ISI) increases for transmission over both severe low-pass electrical channels and dispersive optical channels. This necessitates increased equalization complexity and consideration of advanced modulation schemes, such as four-level pulse amplitude modulation (PAM-4). Serial links which utilize an analog-to-digital converter (ADC) receiver front-end offer a potential solution, as they enable more powerful and flexible digital signal processing (DSP) for equalization and symbol detection and can easily support advanced modulation schemes. Moreover, the DSP back-end provides robustness to process, voltage, and temperature (PVT) variations, benefits from improved area and power with CMOS technology scaling, and offers easy design transfer between different technology nodes and thus improved time-to-market. However, ADC-based receivers generally consume higher power relative to their mixed-signal counterparts because of the significant power consumed by conventional multi-GS/s ADC implementations. This motivates exploration of energy-efficient ADC designs with moderate resolution and very high sampling rates to support data rates at or above 50Gb/s. This dissertation presents two power-efficient designs of ≥25GS/s time-interleaved ADCs for ADC-based wireline receivers. The first prototype includes the implementation of a 6b 25GS/s time-interleaved multi-bit search ADC in 65nm CMOS with a soft-decision selection algorithm that provides redundancy for relaxed track-and-hold (T/H) settling and improved metastability tolerance, achieving a figure-of-merit (FoM) of 143fJ/conversion step and 1.76pJ/bit for a PAM-4 receiver design. 
The second prototype features the design of a 52Gb/s PAM-4 ADC-based receiver in 65nm CMOS, where the front-end consists of a 4-stage continuous-time linear equalizer (CTLE)/variable gain amplifier (VGA) and a 6b 26GS/s time-interleaved SAR ADC with a comparator-assisted 2b/stage structure for reduced digital-to-analog converter (DAC) complexity and a 3-tap embedded feed-forward equalizer (FFE) for relaxed ADC resolution requirement. The receiver front-end achieves an efficiency of 4.53pJ/bit, while compensating for up to 31dB loss with DSP and no transmitter (TX) equalization
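Two relations behind the numbers in the abstract above can be sketched as follows. The first assumes the quoted fJ/conversion-step figure is the commonly used Walden figure of merit, P / (2^ENOB · fs), which the abstract does not state explicitly. The second assumes one ADC sample per PAM-4 symbol. The function names and the example operating point are hypothetical and are not values from the dissertation.

```python
def walden_fom_fj_per_step(power_mw, enob_bits, sample_rate_gsps):
    """Walden FoM = P / (2**ENOB * fs), returned in fJ per conversion step."""
    power_w = power_mw * 1e-3
    fs_hz = sample_rate_gsps * 1e9
    return power_w / (2 ** enob_bits * fs_hz) * 1e15

def baud_rate_sample_rate_gsps(data_rate_gbps, bits_per_symbol=2):
    """With one sample per symbol, the ADC sample rate equals the symbol rate."""
    return data_rate_gbps / bits_per_symbol

# HYPOTHETICAL operating point for illustration only (not from the dissertation):
# 100 mW, 5 effective bits, 25 GS/s.
example_fom = walden_fom_fj_per_step(power_mw=100.0, enob_bits=5.0, sample_rate_gsps=25.0)

# A 52 Gb/s PAM-4 link sampled once per symbol needs a 26 GS/s ADC,
# matching the sample rate quoted for the second prototype.
adc_rate = baud_rate_sample_rate_gsps(52.0)
```

Because the figure of merit scales with 2^ENOB, a moderate-resolution ADC can reach very high sample rates at a fixed power budget, which is the trade-off the abstract motivates.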

    Low Power Analog Processing for Ultra-High-Speed Receivers with RF Correlation

    Ultra-high-speed data communication receivers (Rxs) conventionally require analog-to-digital converters (ADCs) with high sampling rates, which pose design challenges in terms of adequate resolution and power. This leads to ultra-high-speed Rxs relying on expensive and bulky high-speed oscilloscopes, which are extremely inefficient for demodulation in terms of power and size. Designing energy-efficient mixed-signal and baseband units for ultra-high-speed Rxs requires a new approach, detailed in this paper, that circumvents the use of power-hungry ADCs by employing low-power analog processing. The low-power analog Rx employs direct demodulation with RF correlation using low-power comparators. The Rx supports multiple modulations up to 16-QAM, the highest-order modulation reported so far for direct demodulation with RF correlation. Simulations using MATLAB and Simulink R2020a indicate sufficient symbol-error rate (SER) performance at a symbol rate of 8 GS/s for the 71 GHz Urban Micro Cell and 140 GHz indoor channels. A power analysis against current analog, hybrid and digital beamforming approaches that require ADCs indicates considerable power savings. This novel approach can be adopted for ultra-high-speed Rxs envisaged for beyond fifth generation (B5G)/sixth generation (6G)/terahertz (THz) communication without the power-hungry ADCs, leading to low-power integrated design solutions

    Energy-Efficient Receiver Design for High-Speed Interconnects

    High-speed interconnects are of vital importance to the operation of high-performance computing and communication systems, determining the ultimate bandwidth or data rates at which information can be exchanged. Optical interconnects and high-order modulation formats are considered solutions for achieving the envisioned speed and power efficiency of future interconnects. A key factor common to both is the availability of energy-efficient receivers with superior sensitivity. To enhance the receiver sensitivity, either an improvement in the signal-to-noise ratio (SNR) of the front-end circuits or equalization that mitigates the detrimental inter-symbol interference (ISI) is required. In this dissertation, architectural and circuit-level energy-efficient techniques serving these goals are presented. First, an avalanche photodetector (APD)-based optical receiver is described, which utilizes non-return-to-zero (NRZ) modulation and is applicable to burst-mode operation. For the purposes of improving the overall optical link energy efficiency as well as the link bandwidth, this optical receiver is designed to achieve high sensitivity and high reconfiguration speed. The high sensitivity is enabled by optimizing the SNR at the front-end through adjusting the APD responsivity via its reverse bias voltage, along with the incorporation of 2-tap feedforward equalization (FFE) and 2-tap decision feedback equalization (DFE) implemented in current-integrating fashion. The high reconfiguration speed is enabled by the proposed integrating dc and amplitude comparators, which eliminate the RC settling time constraints. The receiver circuits, excluding the APD die, are fabricated in 28-nm CMOS technology. The optical receiver achieves bit-error-rate (BER) better than 1E−12 at −16-dBm optical modulation amplitude (OMA), 2.24-ns reconfiguration time with 5-dB dynamic range, and 1.37-pJ/b energy efficiency at 25 Gb/s. 
Second, a 4-level pulse amplitude modulation (PAM4) wireline receiver is described, which incorporates continuous time linear equalizers (CTLEs) and a 2-tap direct DFE dedicated to the compensation for the first and second post-cursor ISI. The direct DFE in a PAM4 receiver (PAM4-DFE) is made possible by the proposed CMOS track-and-regenerate slicer. This proposed slicer offers rail-to-rail digital feedback signals with significantly improved clock-to-Q delay performance. The reduced slicer delay relaxes the settling time constraint of the summer circuits and allows the stringent DFE timing constraint to be satisfied. With the availability of a direct DFE employing the proposed slicer, inductor-based bandwidth enhancement and loop-unrolling techniques, which can be power/area intensive, are not required. Fabricated in 28-nm CMOS technology, the PAM4 receiver achieves BER better than 1E−12 and 1.1-pJ/b energy efficiency at 60 Gb/s, measured over a channel with 8.2-dB loss at Nyquist frequency. Third, digital neural-network-enhanced FFEs (NN-FFEs) for PAM4 analog-to-digital converter (ADC)-based optical interconnects are described. The proposed NN-FFEs employ a custom learnable piecewise linear (PWL) activation function to tackle the nonlinearities with short memory lengths. In contrast to the conventional Volterra equalizers where multipliers are utilized to generate the nonlinear terms, the proposed NN-FFEs leverage the custom PWL activation function for nonlinear operations and reduce the required number of multipliers, thereby improving the area and power efficiencies. Applications in the optical interconnects based on micro-ring modulators (MRMs) are demonstrated with simulation results of 50-Gb/s and 100-Gb/s links adopting PAM4 signaling. The proposed NN-FFEs and the conventional Volterra equalizers are synthesized with the standard-cell libraries in a commercial 28-nm CMOS technology, and their power consumptions and performance are compared. 
More than 37% lower power overhead can be achieved by employing the proposed NN-FFEs, in comparison with the Volterra equalizer that leads to a similar improvement in the symbol-error-rate (SER) performance.

    Modeling and Design of Architectures for High-Speed ADC-Based Serial Links

    There is an ongoing dramatic rise in the volume of internet traffic. Standards such as the 56Gb/s OIF very short reach (VSR), medium reach (MR) and long reach (LR) standards for chip-to-chip communication over channels with up to 10dB, 20dB and 30dB insertion loss at the PAM-4 Nyquist frequency, respectively, are being adopted. These standards call for spectrally efficient PAM-4 signaling over NRZ signaling. PAM-4 signaling poses challenges such as a reduced SNR at the receiver, susceptibility to nonlinearities and increased sensitivity to residual ISI. Equalization provided by traditional mixed-signal architectures can be insufficient to achieve the target BER requirements for very long reach channels. ADC-based receiver architectures for PAM-4 links take advantage of more powerful equalization techniques, which lend themselves to easier and more robust digital implementations, to extend the amount of insertion loss that the receiver can handle. However, ADC-based receivers can consume more power compared to mixed-signal implementations. Techniques that model the receiver performance to understand the various system trade-offs are necessary. This research presents a fast and accurate hybrid modeling framework to efficiently investigate system trade-offs for an ADC-based receiver. The key contribution is the addition of ADC-related non-idealities such as quantization noise in the presence of integral and differential nonlinearities, and time-interleaving mismatch errors such as gain mismatch, bandwidth mismatch, offset mismatch and sampling skew. The research also presents a 52Gb/s ADC-based PAM-4 receiver prototype employing a 32-way time-interleaved, 2-bit/stage, 6-bit SAR ADC and a DSP with a 12-tap FFE and a 2-tap DFE. A new DFE architecture that reduces the complexity of a PAM-4 DFE to that of an NRZ DFE while simultaneously nearly doubling the maximum achievable data rate is presented. 
The receiver architecture also includes an analog front-end (AFE) consisting of a programmable two-stage CTLE. A digital baud-rate CDR utilizing a Mueller-Muller phase detector sets the sampling phase. Measurement results show that 32Gb/s operation achieves a BER < 10⁻⁹ over a 30dB loss channel, while 52Gb/s operation achieves a BER < 10⁻⁶ over a 31dB loss channel with a power efficiency of 8.06pJ/bit
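The Mueller-Muller phase detector mentioned above can be sketched with the textbook timing-error estimate e[n] = y[n]·d[n-1] - y[n-1]·d[n], where y are baud-rate samples and d are the sliced decisions. This is a behavioral illustration under stated assumptions, not the receiver's implementation: it uses NRZ data for simplicity, a hypothetical symmetric triangular pulse response, and no noise. With the sign convention used here, a positive average means the sampling phase is early.

```python
import random

def pulse(t):
    """Hypothetical symmetric triangular pulse response; t is in symbol periods."""
    return max(0.0, 1.0 - abs(t) / 1.5)

def sample_link(data, tau):
    """Baud-rate samples y[n] = sum_k d[k] * pulse(tau + n - k); tau > 0 samples late."""
    samples = []
    for n in range(len(data)):
        y = 0.0
        # The pulse spans less than two symbol periods on each side for small tau.
        for k in range(max(0, n - 2), min(len(data), n + 3)):
            y += data[k] * pulse(tau + (n - k))
        samples.append(y)
    return samples

def mm_phase_error(samples):
    """Average Mueller-Muller error: y[n]*d[n-1] - y[n-1]*d[n] with sign decisions."""
    decisions = [1 if y >= 0 else -1 for y in samples]
    errors = [
        samples[n] * decisions[n - 1] - samples[n - 1] * decisions[n]
        for n in range(1, len(samples))
    ]
    return sum(errors) / len(errors)

random.seed(1)
data = [random.choice((-1, 1)) for _ in range(5000)]

early = mm_phase_error(sample_link(data, tau=-0.2))    # positive: sampling early
centered = mm_phase_error(sample_link(data, tau=0.0))  # near zero at the pulse peak
late = mm_phase_error(sample_link(data, tau=0.2))      # negative: sampling late
```

A CDR loop would filter this error and use it to advance or retard the sampling clock, so only one sample per symbol is needed and no edge-sampling clock is required.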