3 research outputs found
High Speed Reconfigurable NRZ/PAM4 Transceiver Design Techniques
While the majority of wireline standards use simple binary non-return-to-zero (NRZ) signaling, four-level pulse-amplitude modulation (PAM4) standards are emerging to increase bandwidth density. This dissertation proposes efficient implementations for high-speed NRZ/PAM4 transceivers. The first prototype includes a dual-mode NRZ/PAM4 serial I/O transmitter that supports both modulations with minimal power and hardware overhead. A source-series-terminated (SST) transmitter achieves a 1.2 Vpp output swing and employs lookup table (LUT) control of a 31-segment output digital-to-analog converter (DAC) to implement 4/2-tap feed-forward equalization (FFE) in NRZ/PAM4 modes, respectively. Transmitter power efficiency is improved with low-overhead analog impedance control in the DAC cells and a quarter-rate serializer based on a tri-state inverter-based mux with dynamic pre-driver gates. The transmitter is designed to work with a receiver that implements an NRZ/PAM4 decision feedback equalizer (DFE) employing 1 finite impulse response (FIR) tap and 2 infinite impulse response (IIR) taps for first post-cursor and long-tail inter-symbol interference (ISI) cancellation, respectively. Fabricated in GP 65-nm CMOS, the transmitter occupies 0.060 mm² and achieves 16 Gb/s NRZ and 32 Gb/s PAM4 operation at 10.4 and 4.9 mW/Gb/s while operating over channels with 27.6 and 13.5 dB loss at Nyquist, respectively. The second prototype is a 56 Gb/s PAM4 quarter-rate wireline receiver implemented in a 65-nm CMOS process. The front end utilizes a single-stage continuous-time linear equalizer (CTLE) to boost the main cursor and relax the pre-cursor cancellation requirement, requiring only 2-tap pre-cursor FFE on the transmitter side. A 2-tap DFE with one FIR tap and one IIR tap is employed to cancel first post-cursor and long-tail ISI.
The FIR tap direct feedback is implemented inside the CML slicers to relax the critical timing of the DFE and maximize the achievable data rate. In addition to the three main data samplers per slice, an error sampler is used for background threshold control, and an edge-based sampler both performs PLL-based CDR phase detection and generates information for background DFE tap adaptation. The receiver consumes 4.63 mW/Gb/s and compensates for up to 20.8 dB of loss when operated with a 2-tap FFE transmitter. The experimental results and a comparison with the state of the art show superior power efficiency of the presented prototypes for similar data rate and channel loss. The proposed design techniques are not limited to these specific prototypes and can be applied to any wireline transceiver with a different modulation, data rate, or CMOS technology.
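As a rough behavioural illustration of the FIR-plus-IIR DFE idea described above, the following Python sketch cancels the first post-cursor with one FIR tap and an exponentially decaying tail with one first-order IIR tap. It is not the dissertation's circuit. The pulse response, the tap values (`h1`, `tail_gain`, `tail_decay`), the function names, and the noiseless channel with perfectly adapted taps are all assumptions made for illustration.

```python
import random

PAM4_LEVELS = (-3, -1, 1, 3)


def slice_pam4(x):
    """Return the PAM4 level closest to x (decision thresholds at -2, 0, +2)."""
    return min(PAM4_LEVELS, key=lambda level: abs(x - level))


def apply_channel(symbols, h1, tail_gain, tail_decay, tail_len=200):
    """Hypothetical channel with pulse response [1, h1, A, A*r, A*r^2, ...]."""
    pulse = [1.0, h1] + [tail_gain * tail_decay ** k for k in range(tail_len)]
    received = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, h in enumerate(pulse[: n + 1]):
            acc += h * symbols[n - k]
        received.append(acc)
    return received


def dfe_fir_iir(received, h1, tail_gain, tail_decay):
    """Behavioural DFE: one FIR tap (first post-cursor) plus one IIR tap (long tail)."""
    decisions = []
    iir_state = 0.0
    for n, y in enumerate(received):
        d1 = decisions[n - 1] if n >= 1 else 0
        d2 = decisions[n - 2] if n >= 2 else 0
        # First-order IIR feedback: a decaying sum of all decisions from d[n-2] back.
        iir_state = tail_decay * iir_state + tail_gain * d2
        decisions.append(slice_pam4(y - h1 * d1 - iir_state))
    return decisions


rng = random.Random(0)
symbols = [rng.choice(PAM4_LEVELS) for _ in range(500)]
received = apply_channel(symbols, h1=0.4, tail_gain=0.2, tail_decay=0.6)
recovered = dfe_fir_iir(received, h1=0.4, tail_gain=0.2, tail_decay=0.6)
unequalized = [slice_pam4(y) for y in received]  # slicing without any DFE, for comparison
```

The single IIR state stands in for an arbitrarily long run of FIR taps, which is why one tap is enough to cover a long exponential tail in this idealized model.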
Design of a Receiver Using an Adaptive-Control Equalizer with an Offset Canceller and a Baud-Rate Phase Detector
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, August 2021.
This thesis describes the design of high-speed, low-power wireline receivers (RX). Specifically, it proposes circuit techniques for DC offset cancellation, a merged-summer DFE, a stochastic Baud-rate CDR, and a phase detector (PD) for multi-level signals.
First, an RX with adaptive offset cancellation (AOC) and a merged-summer decision-feedback equalizer (DFE) is proposed. The proposed AOC engine removes the random DC offset of the data path by examining the sampled data and edge outputs of the random data stream. In addition, the proposed RX incorporates a shared-summer DFE in a half-rate structure to reduce the power dissipation and hardware complexity of the adaptive equalizer. A prototype chip fabricated in 40 nm CMOS technology occupies an active area of 0.083 mm². Thanks to the AOC engine, the proposed RX achieves a BER below 10⁻¹² over a wide range of data rates, from 1.62 to 10 Gb/s. The proposed RX consumes 18.6 mW at 10 Gb/s over a channel with 27 dB loss at 5 GHz, exhibiting a figure-of-merit of 0.068 pJ/b/dB.
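The figure-of-merit quoted above can be checked with simple arithmetic: power in mW divided by data rate in Gb/s gives energy per bit in pJ/b, and dividing that by the channel loss in dB gives pJ/b/dB. The short Python sketch below does this with the numbers from the abstract. The function names are hypothetical, and the normalization is inferred from the units rather than stated in the source.

```python
def energy_per_bit_pj(power_mw, data_rate_gbps):
    """Energy per bit in pJ/b: mW divided by Gb/s is numerically pJ/bit."""
    return power_mw / data_rate_gbps


def fom_pj_per_bit_per_db(power_mw, data_rate_gbps, loss_db):
    """Energy per bit normalized to the channel loss being compensated."""
    return energy_per_bit_pj(power_mw, data_rate_gbps) / loss_db


# Values taken from the abstract: 18.6 mW at 10 Gb/s over a 27 dB channel.
rx_energy = energy_per_bit_pj(18.6, 10.0)
rx_fom = fom_pj_per_bit_per_db(18.6, 10.0, 27.0)
```

With these inputs the result is roughly 1.86 pJ/b and 0.069 pJ/b/dB, which agrees with the quoted 0.068 pJ/b/dB if that figure was truncated rather than rounded.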
Second, a 40 nm CMOS RX with a Baud-rate phase detector (BRPD) is proposed. The RX includes two PDs: a BRPD employing a stochastic technique and a BRPD suited to multi-level signals. Because a Baud-rate CDR needs no edge-sampling clock, the proposed CDR lowers hardware complexity and therefore power consumption. In addition, the proposed stochastic phase detector (SPD) tracks an optimal phase-locking point that maximizes the vertical eye opening. Furthermore, despite residual inter-symbol interference, the proposed BRPD for multi-level signals secures the vertical eye margin, which is especially vulnerable in multi-level signaling. The proposed BRPD also has a unique lock point with an adaptive DFE, unlike the conventional Mueller-Muller PD. A prototype chip fabricated in 40 nm CMOS technology occupies an active area of 0.24 mm². The proposed PAM-4 RX achieves a bit-error rate below 10⁻¹¹ at 48 Gb/s with a power efficiency of 2.42 pJ/b.
CHAPTER 1 INTRODUCTION
1.1 MOTIVATION
1.2 THESIS ORGANIZATION
CHAPTER 2 BACKGROUNDS
2.1 BASIC ARCHITECTURE IN SERIAL LINK
2.1.1 SERIAL COMMUNICATION
2.1.2 CLOCK AND DATA RECOVERY
2.1.3 MULTI-LEVEL PULSE-AMPLITUDE MODULATION
2.2 EQUALIZER
2.2.1 EQUALIZER OVERVIEW
2.2.2 DECISION-FEEDBACK EQUALIZER
2.2.3 ADAPTIVE EQUALIZER
2.3 CLOCK RECOVERY
2.3.1 2X OVERSAMPLING PD: ALEXANDER PD
2.3.2 BAUD-RATE PD: MUELLER-MULLER PD
CHAPTER 3 AN ADAPTIVE OFFSET CANCELLATION SCHEME AND SHARED SUMMER ADAPTIVE DFE
3.1 OVERVIEW
3.2 AN ADAPTIVE OFFSET CANCELLATION SCHEME AND SHARED-SUMMER ADAPTIVE DFE FOR LOW POWER RECEIVER
3.3 SHARED SUMMER DFE
3.4 RECEIVER IMPLEMENTATION
3.5 MEASUREMENT RESULTS
CHAPTER 4 PAM-4 BAUD-RATE DIGITAL CDR
4.1 OVERVIEW
4.2 OVERALL ARCHITECTURE
4.2.1 PROPOSED BAUD-RATE CDR ARCHITECTURE
4.2.2 PROPOSED ANALOG FRONT-END STRUCTURE
4.3 STOCHASTIC PHASE DETECTION PAM-4 CDR
4.3.1 PROPOSED STOCHASTIC PHASE DETECTION
4.3.2 COMPARISON OF THE STOCHASTIC PD WITH SS-MMPD
4.4 PHASE DETECTION FOR MULTI-LEVEL SIGNALING
4.4.1 PROPOSED BAUD-RATE PHASE DETECTOR FOR MULTI-LEVEL SIGNAL
4.4.2 DATA LEVEL AND DFE COEFFICIENT ADAPTATION
4.4.3 PROPOSED PHASE DETECTOR
4.5 MEASUREMENT RESULT
4.5.1 MEASUREMENT OF THE PROPOSED STOCHASTIC BAUD-RATE PHASE DETECTION
4.5.2 MEASUREMENT OF THE PROPOSED BAUD-RATE PHASE DETECTION FOR MULTI-LEVEL SIGNAL
CHAPTER 5 CONCLUSION
BIBLIOGRAPHY
ABSTRACT IN KOREAN
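The thesis above contrasts its proposed Baud-rate phase detectors with the conventional Mueller-Muller PD. For orientation only, the Python sketch below shows the textbook Mueller-Muller timing function, e[n] = d[n-1]·y[n] − d[n]·y[n-1], whose average is proportional to the difference between the first post-cursor and the first pre-cursor. It is not the thesis's proposed detector. The triangular pulse, the NRZ data, the use of the transmitted symbols as ideal decisions, the sign convention, and all names are assumptions made for illustration.

```python
import random


def pulse(t):
    """Hypothetical symmetric triangular pulse spanning +/-2 UI, peaking at t = 0."""
    return max(0.0, 1.0 - abs(t) / 2.0)


def mm_pd_average(tau, num_symbols=20000, seed=1):
    """Average Mueller-Muller PD output when sampling tau UI after the pulse peak."""
    rng = random.Random(seed)
    d = [rng.choice((-1, 1)) for _ in range(num_symbols)]
    # Cursor h_k is the pulse value seen k symbols after the main cursor.
    cursors = {k: pulse(k + tau) for k in range(-2, 3)}

    def sample(n):
        # Received sample y[n] = sum over k of h_k * d[n - k].
        return sum(h * d[n - k] for k, h in cursors.items())

    total = 0.0
    count = 0
    for n in range(3, num_symbols - 2):
        # Ideal decisions: the transmitted symbols stand in for slicer outputs.
        total += d[n - 1] * sample(n) - d[n] * sample(n - 1)
        count += 1
    return total / count


late = mm_pd_average(0.2)     # sampling late: negative average in this convention
early = mm_pd_average(-0.2)   # sampling early: positive average
on_time = mm_pd_average(0.0)  # symmetric pulse: average close to zero
```

Because the average depends only on the balance between the first pre-cursor and post-cursor, the lock point of this detector moves when an equalizer changes those cursors, which is the interaction with an adaptive DFE that the abstract says its proposed detector avoids.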
Design of clock and data recovery circuits for energy-efficient short-reach optical transceivers
The increasing demand for cloud-based computing and social media
services mandates higher throughput (at least 56 Gb/s per data lane with 400
Gb/s total capacity) for short-reach optical links (with a reach typically less
than 2 km) inside data centres. The immediate consequence is large,
power-hungry data centres. To address these issues, intra-data-centre
connectivity by means of optical links needs continuous upgrading.
In recent years, the trend in the industry has shifted toward the use of more
complex modulation formats like PAM4 due to its spectral efficiency over
traditional NRZ. Another advantage is the reduced channel count,
which is more cost-effective considering the required area and the I/O density.
However, employing PAM4 results in more complex transceiver circuitry due
to the presence of multilevel transitions and a reduced noise budget. In addition,
providing higher speed while accommodating the stringent requirements
of higher density and energy efficiency (< 5 pJ/bit) makes the design of
optical links more challenging and requires innovative design techniques
at both the system and circuit level.
This work presents the design of a clock and data recovery (CDR) circuit as
one of the key building blocks for the transceiver modules used in such fibre-optic
links. Capable of working with the PAM4 signalling format, the proposed
CDR architecture targets data rates of 50–56 Gb/s while achieving the required
energy efficiency (< 5 pJ/bit).
At the system level, the design proposes a new PAM4 phase detector (PD) that provides a better
trade-off between bandwidth and systematic jitter generation in the CDR. By
using a digital loop controller (DLC), the CDR achieves a considerable area reduction
along with the flexibility to adjust the loop dynamics.
At the circuit level, the work applies different circuit techniques to mitigate
circuit imperfections. It presents a wideband analog front end (AFE),
suitable for a 56 Gb/s, 28-Gbaud PAM-4 signal, that uses an 8x-interleaved,
master/slave-based sample-and-hold circuit. In addition, the AFE is equipped with
a calibration scheme that corrects the errors associated with the sampling
channels' offset voltage and gain mismatches. The presented digital-to-phase
converter (DPC) features a modified phase interpolator (PI), a new quadrature
phase corrector (QPC), and multi-phase output with de-skewing capabilities.
The DPC (as a standalone block) and the CDR (as the main focus of this work)
were fabricated in 65-nm CMOS technology. Based on the measurements, the
DPC achieves a DNL/INL of 0.7/6 LSB, respectively, while consuming 40.5 mW
from a 1.05 V supply. Although the CDR was not fully operational with
the PAM4 input, the results from a 25-Gbaud PAM2 (NRZ) test setup were used
to estimate the performance. Under this scenario, the 1-UI JTOL bandwidth
was measured to be 2 MHz at a BER threshold of 10⁻⁴. The chip consumes 236
mW while operating over a 1–1.2 V supply range, achieving an energy efficiency
of 4.27 pJ/bit.
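The DNL/INL figures quoted for the DPC are linearity metrics of its code-to-phase transfer curve. The Python sketch below shows one common way to compute them from a list of measured output phases, taking the ideal step (1 LSB) as the full phase range divided by the number of codes. The source does not say how its figures were derived, so this definition, the 360-degree full scale, the function name, and the 8-code example data are all assumptions.

```python
def dnl_inl(phases_deg, full_scale_deg=360.0):
    """Return (dnl, inl) in LSB for phases measured at codes 0..N-1.

    Assumes the interpolator wraps around after N codes, so that
    1 LSB = full_scale_deg / N.
    """
    n = len(phases_deg)
    lsb = full_scale_deg / n
    # DNL: deviation of each code-to-code step from the ideal 1 LSB step.
    dnl = [(phases_deg[k + 1] - phases_deg[k]) / lsb - 1.0 for k in range(n - 1)]
    # INL: deviation of each code's phase from the ideal straight line through code 0.
    inl = [(phases_deg[k] - phases_deg[0]) / lsb - k for k in range(n)]
    return dnl, inl


# Hypothetical 8-code example: code 2 lands 9 degrees late (ideal step is 45 degrees).
example_phases = [0.0, 45.0, 99.0, 135.0, 180.0, 225.0, 270.0, 315.0]
example_dnl, example_inl = dnl_inl(example_phases)
worst_dnl = max(abs(x) for x in example_dnl)
worst_inl = max(abs(x) for x in example_inl)
```

In this example a single misplaced code produces a worst-case DNL and INL of about 0.2 LSB each. An INL much larger than the DNL, as in the measured 0.7/6 LSB, indicates that small step errors of the same sign accumulate along the transfer curve.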