33 research outputs found

    Design of energy-efficient high-speed wireline transceiver

    Get PDF
    Energy efficiency has become the most important performance metric of integrated circuits used in many applications ranging from mobile devices to high-performance processors. The power problem permeates both computing and communication systems alike. Especially in the era of Big Data, continuously growing demand for higher communication bandwidth is driving the need for energy-efficient high-speed I/O serial links. However, the rate at which the energy efficiency of serial links is improving is much slower than the rate at which the required data transfer bandwidth is increasing. This dissertation explores two design approaches for energy-efficient communication systems. The first design approach maximizes the energy efficiency of a transceiver without any performance loss, and as a prototype, a source-synchronous multi-Gb/s transceiver that achieves excellent energy efficiency lower than 0.3pJ/bit is presented. To this end, the proposed transceiver employs aggressive supply voltage scaling, and multiplexed transmitter and receiver synchronized by low-rate multi-phase clocks are adopted to achieve high data rate even at a supply voltage close to the device threshold voltage. Phase spacing errors resulting from device mismatches are corrected using a self-calibration scheme. The proposed phase calibration method uses a single digital delay-locked loop (DLL) for calibrating all the phases, which makes the calibration process insensitive to the supply voltage level. Thanks to this technique, the proposed multi-Gb/s transceiver operates robustly and energy-efficiently at a very low supply voltage. Fabricated in a 65nm CMOS process, the energy efficiency and data rate of the prototype transceiver vary from 0.29pJ/bit to 0.58pJ/bit and 1Gb/s to 6Gb/s, respectively, as the supply voltage is varied from 0.45V to 0.7V. In the second approach, observing that the data traffic in a real system is bursty, a full-rate burst-mode transceiver that achieves rapid on/off operation needed for energy-proportional systems is presented. By injecting input data edges into the oscillator embedded in a classical type-II digital clock and data recovery (CDR) circuit, the proposed receiver achieves instantaneous phase-locking and input jitter filtering simultaneously. In other words, the proposed CDR combines the advantages of conventional feed-forward and feedback architectures to achieve energy-proportional operation. By controlling the number of data edges injected into the oscillator, both the jitter transfer bandwidth and the jitter tolerance corner are accurately controlled. The feedback loop also corrects for any frequency error and helps improve the CDR's immunity to oscillator frequency drift during the power-on and -off states. This also improves the CDR's tolerance to consecutive identical digits present in the input data. Fabricated in a 90nm CMOS process, the prototype receiver instantaneously locks onto the very first data edge and consumes 6.1mW at 2.2Gb/s. Owing to its short power-on time, the overall transceiver's energy efficiency varies only from 5.4pJ/bit to 10.7pJ/bit when the effective data rate is varied from 2.2Gb/s to 0.22Gb/s

    최적 위상 검출 회로를 이용한 클럭 및 데이터 복원 회로에 관한 연구

    Get PDF
    학위논문 (박사)-- 서울대학교 대학원 : 전기·컴퓨터공학부, 2014. 8. 김재하.Bang-bang phase detectors are widely used for today's high-speed communication circuits such as phase-locked loops (PLLs), delay-locked loops (DLLs) and clock-and-data recovery loops (CDRs) because it is simple, fast, accurate and amenable to digital implementations. However, its hard nonlinearity poses difficulties in design and analyses of the bang-bang controlled timing loops. Especially, dithering in bang-bang controlled CDRs sets conflicting requirements on the phase adjustment resolution as one tries to maximize the tracking bandwidth and minimize jitter. A fine phase step is helpful to minimize the dithering, but it requires circuits with finer resolution that consumes large power and area. In this background, this dissertation introduces an optimal phase detection technique that can minimize the effect of dithering without requiring fine phase resolution. A novel phase interval detector that looks for a phase interval enclosing the desired lock point is shown to find the optimal phase that minimizes the timing error without dithering. A digitally-controlled, phase-interpolating DLL-based CDR fabricated in 65nm CMOS demonstrates that it can achieve small area of 0.026mm^2 and low jitter of 41mUIp-p with a coarse phase adjustment step of 0.11UI, while dissipating only 8.4mW at 5Gbps. For the theoretic basis, various analysis techniques to understand bang-bang controlled timing loops are also presented. The proposed techniques are explained for both linearized loop and non-linear one, and applied to the evaluation of the proposed phase detection technique.1 Introduction 1 1.1 Motivations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 Thesis Contribution and Organization . . . . . . . . . . . . . . . . . 6 2 Pseudo-Linear Analysis of Bang-Bang Controlled Loops 9 2.1 Model of a Second-Order, Bang-Bang Controlled Timing Loop . . . 9 2.2 Necessary Condition for the Pseudo-Linear Analysis . . . . . . . . . 12 2.3 Derivation of Necessity Condition for the Pseudo-Linear Analysis . . 17 2.4 A Linearized Model of the Bang-Bang Phase Detector . . . . . . . . 18 2.5 Linearized Gain of a Bang-Bang Phase Detector for Jitter Transfer and Jitter Generation Analyses . . . . . . . . . . . . . . . . . . . . . 21 2.6 Jitter Transfer and Jitter Generation Analyses . . . . . . . . . . . . 29 2.7 Linearized Gains of a Bang-bang Phase Detector for Jitter Tolerance Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 2.8 Jitter Tolerance Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 41 3 Nonlinear Analysis of Bang-Bang Controlled Loops 48 3.1 Transient Analysis of Bang-Bang Controlled Timing Loops . . . . . 48 3.2 Phase-portrait Analysis of Bang-Bang Controlled Timing Loops . . . 51 3.3 Markov-chain Analysis of Bang-Bang Controlled Timing Loops . . . 53 3.4 Analysis of Clock-and-Data Recovery Circuits . . . . . . . . . . . . . 57 3.4.1 Prediction of Bit-Error Rate . . . . . . . . . . . . . . . . . . 57 3.4.2 Eect of Transition Density . . . . . . . . . . . . . . . . . . . 58 3.4.3 Eect of Decimation . . . . . . . . . . . . . . . . . . . . . . . 61 3.4.4 Analysis of Oversampling Phase Detectors . . . . . . . . . . . 66 4 Design of Ditherless Clock and Data Recovery Circuit 75 4.1 Optimal Phase Detection . . . . . . . . . . . . . . . . . . . . . . . . 75 4.2 Proposed Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . 81 4.3 Analysis of the CDR with Phase Interval Detection . . . . . . . . . . 84 4.4 Circuit Implementation . . . . . . . . . . . . . . . . . . . . . . . . . 89 4.4.1 Sampling Receiver . . . . . . . . . . . . . . . . . . . . . . . . 89 4.4.2 Phase Detector . . . . . . . . . . . . . . . . . . . . . . . . . . 91 4.4.3 Digital Loop Filter . . . . . . . . . . . . . . . . . . . . . . . . 95 4.4.4 Phase Locked-Loop . . . . . . . . . . . . . . . . . . . . . . . . 98 4.4.5 Phase Interpolator . . . . . . . . . . . . . . . . . . . . . . . . 99 4.5 Built-In Self-Test Circuit for Jitter Tolerance Measurement . . . . . 102 4.6 Measurement Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 5 Conclusion 114 References 116Docto

    COMMUNICATION CHAIN SUB-BLOCK MODELLING AND IMPLEMENTATION IN FPGA

    Get PDF
    Současné obvody pro obnovu bitové synchronizace jsou stále ve většině případů založené na analogové smyčce fázového závěsu. Analogovou část obvodu je třeba pro každou výrobní technologii znovu navrhnout a migrace takového bloku mezi technologiemi je obtížná. V literatuře existují varianty obvodu CDR založené na asynchronním převzorkování, které jsou implementovány v čistě digitální technologii a jejich migrace je tak bezproblémová. Jako jejich hlavní nevýhody jsou uváděny vysoká složitost digitálního obvodu a menší přirozená odolnost těchto metod vůči jitteru přítomnému v signálu. Tato práce má za cíl ukázat, že tyto nevýhody plně digitálních obvodů CDR je možné překonat návrhem nových algoritmů a jejich optimalizací. Pro optimalizaci byly využity obvody FPGA, které umožňují měnit parametry zkoumaného obvodu v reálném čase (podobně jako při modelování na PC) a zároveň umožňují měření v reálných podmínkách, které je často obtížné modelovat. Rychlost měření je navíc mnohonásobně vyšší, než při použití simulace. Výstupem práce jsou optimalizované bloky CDR s výjimečně nízkými nároky na hardware a velmi malou spotřebou. Jejich chybovost je v reálných přenosových podmínkách zcela srovnatelná s běžnými obvody založenými na smyčce fázového závěsu. Tím byly odstraněny hlavní nevýhody plně digitálních obvodů CDR. Práce se dále zabývá měřením parametrů digitálních spojů. Byla vyvinuta nová metoda pro měření jitteru přímo v obvodu FPGA, která umožňuje charakterizovat přenosový kanál bez nutnosti připojení dalších přístrojů a tedy bez ovlivnění měřeného kanálu. Metoda vyžaduje jen použití minimálního množství hardwarových prostředků obvodu FPGA. Dále byl vyvinu specializovaný měřič distribuce chybovosti, který umožňuje podrobnou analýzu vlastností atmosférických optických spojů z hlediska možného protichybového zabezpečení přenosu dat. V návaznosti na tato měření byla navržena koncepce systému pro zabezpečení přenosu dat založeného na technice ARQ.Most modern clock and data recovery circuits (CDR) are based on analog blocks that need to be redesigned whenever the technology process is to be changed. On the other hand, CDR based blind oversampling architecture (BO-CDR) can be completely designed in a digital process which makes its migration very simple. The main disadvantages of the BO-CDR that are usually mentioned in a literature are complexity of its digital circuitry and finite phase resolution resulting in larger jitter sensitivity and higher error rate. This thesis will show that those problems can be solved by designing a new algorithm of BO-CDR and subsequent optimization. For this task an FPGA was selected as simulation and verification platform. This enables to change parameters of the optimized circuit in real time while measuring on real links (unlike a simulation using inaccurate link models). The output of this optimization is a new BO-CDR algorithm with heavily reduced complexity and very low error rate. A new FPGA-based method of jitter measurement was developed (primary for CDR analysis), which enables a quick link characterization without using probing or additional equipment. The new method requires only a minimum usage of FPGA resources. Finally, new measurement equipment was developed to measure bit error distribution on FSO links to be able to develop a suitable error correction scheme based on ARQ protocol.

    A 10Gb/s Full On-chip Bang-Bang Clock and Data Recovery System Using an Adaptive Loop Bandwidth Strategy

    Get PDF
    As demand for higher bandwidth I/O grows, the front end design of serial link becomes significant to overcome stringent timing requirements on noisy and bandwidthlimited channels. As a clock reconstructing module in a receiver, the recovered clock quality of Clock and Data Recovery is the main issue of the receiver performance. However, from unknown incoming jitter, it is difficult to optimize loop dynamics to minimize steady-state and dynamic jitter. In this thesis a 10 Gb/s adaptive loop bandwidth clock and data recovery circuit with on-chip loop filter is presented. The proposed system optimizes the loop bandwidth adaptively to minimize jitter so that it leads to an improved jitter tolerance performance. This architecture tunes the loop bandwidth by a factor of eight based on the phase information of incoming data. The resulting architecture performs as good as a maximum fixed loop bandwidth CDR while tracking high speed input jitter and as good as a minimum fixed bandwidth CDR while suppressing wide bandwidth steady-state jitter. By employing a mixed mode predictor, high updating rate loop bandwidth adaptation is achieved with low power consumption. Another relevant feature is that it integrates a typically large off-chip filter using a capacitance multiplication technique that employs dual charge pumps. The functionality of the proposed architecture has been verified through schematic and behavioral model simulations. In the simulation, the performance of jitter tolerance is confirmed that the proposed solution provides improved results and robustness to the variation of jitter profile. Its applicability to industrial standards is also verified by the jitter tolerance passing SONET OC-192 successfully

    최적에 가까운 타이밍 적응을 위해 치우친 데이터 레벨과 눈 경사 디텍터를 사용한 최대 눈크기추적 클럭 및 데이터 복원회로 설계

    Get PDF
    학위논문 (박사) -- 서울대학교 대학원 : 공과대학 전기·정보공학부, 2021. 2. 정덕균.이 논문에서는 최소-비트 비트 에러율 (BER)에 대한 최대 눈크기 추적 CDR (MET-CDR)의 설계가 제안되었다. 제안 된 CDR 은 최적의 샘플링 단계를 찾기 위해 반복 절차를 가진 BER 카운터 또는 아이 모니터가 필 요하지 않다. 에러 샘플러 출력에 가중치를 두어 더하여 얻은 치우친 데 이터 레벨 (biased dLev) 은 사전 커서 ISI(pre-cursor ISI) 의 정보도 고려한 눈 높이 정보를 추출한다. 델타 T 만큼의 시간 차이를 둔 지점에서 작동 하는 두 샘플러는 현재 눈 높이와 눈 기울기의 극성을 감지하고, 이 정보 를 통해 제안하는 CDR 은 눈 기울기가 0 이되는 최대 눈 높이로 수렴한 다. 측정 결과는 최대 눈 높이와 최소 BER 의 샘플링 위치가 잘 일치 함 을 보여준다. 28nm CMOS 공정으로 구현된 수신기 칩은 23.5dB 의 채널 손실이 있는 상태에서 26Gb/s 에서 동작 가능하다. 0.25UI 의 아이 오프닝 을 가지며, 87mW 의 파워를 소비한다.In this thesis, design of a maximum-eye-tracking CDR (MET-CDR) for minimum bit error rate (BER) is proposed. The proposed CDR does not require a BER coun-ter or an eye-opening monitor with any iterative procedure to find the near-optimal sampling phase. The biased data-level obtained from the weighted sum of error sampler outputs, UP and DN, extracts the actual eye height information in the presence of pre-cursor ISI. Two samplers operating on two slightly different tim-ings detect the current eye height and the polarity of the eye slope so that the CDR tracks the maximum eye height where the slope becomes zero. Measured results show that the sampling phase of the maximum eye height and that of the mini-mum BER match well. A prototype receiver fabricated in 28 nm CMOS process operates at 26 Gb/s with an eye-opening of 0.25 UI and consumes 87 mW while equalizing 23.5 dB of loss at 13 GHz.ABSTRACT I CONTENTS II LIST OF FIGURES IV LIST OF TABLES VIII CHAPTER 1 INTRODUCTION 1 1.1 MOTIVATION 1 1.2 THESIS ORGANIZATION 4 CHAPTER 2 BACKGROUNDS 5 2.1 RECEIVER FRONT-END 5 2.1.1 CHANNEL 7 2.1.2 EQUALIZER 17 2.1.3 CDR 32 2.2 PRIOR ARTS ON CLOCK RECOVERY 39 2.2.1 BB-CDR 39 2.2.2 BER-BASED CDR 41 2.2.3 EOM-BASED CDR 44 2.3 CONCEPT OF THE PROPOSED CDR 47 CHAPTER 3 MAXIMUM-EYE-TRACKING CDR WITH BIASED DATA-LEVEL AND EYE SLOPE DETECTOR 49 3.1 OVERVIEW 49 3.2 DESIGN OF MET-CDR 50 3.2.1 EYE HEIGHT INFORMATION FROM BIASED DATA-LEVEL 50 3.2.2 EYE SLOPE DETECTOR AND ADAPTATION ALGORITHM 60 3.2.3 ARCHITECTURE AND IMPLEMENTATION 67 3.2.4 VERIFICATION OF THE ALGORITHM 71 3.2.5 ANALYSIS ON THE BIASED DATA-LEVEL 76 3.3 EXPANSION OF MET-CDR TO PAM4 SIGNALING 84 3.3.1 MET-CDR WITH PAM4 84 3.3.2 CONSIDERATIONS FOR PAM4 87 CHAPTER 4 MEASUREMENT RESULTS 89 CHAPTER 5 CONCLUSION 99 APPENDIX A MATLAB CODE FOR SIMULATING RECEIVER WITH MET-CDR 100 BIBLIOGRAPHY 105 초 록 113Docto

    Energy-efficient wireline transceivers

    Get PDF
    Power-efficient wireline transceivers are highly demanded by many applications in high performance computation and communication systems. Apart from transferring a wide range of data rates to satisfy the interconnect bandwidth requirement, the transceivers have very tight power budget and are expected to be fully integrated. This thesis explores enabling techniques to implement such transceivers in both circuit and system levels. Specifically, three prototypes will be presented: (1) a 5Gb/s reference-less clock and data recovery circuit (CDR) using phase-rotating phase-locked loop (PRPLL) to conduct phase control so as to break several fundamental trade-offs in conventional receivers; (2) a 4-10.5Gb/s continuous-rate CDR with novel frequency acquisition scheme based on bang-bang phase detector (BBPD) and a ring oscillator-based fractional-N PLL as the low noise wide range DCO in the CDR loop; (3) a source-synchronous energy-proportional link with dynamic voltage and frequency scaling (DVFS) and rapid on/off (ROO) techniques to cut the link power wastage at system level. The receiver/transceiver architectures are highly digital and address the requirements of new receiver architecture development, wide operating range, and low power/area consumption while being fully integrated. Experimental results obtained from the prototypes attest the effectiveness of the proposed techniques

    Design of clock and data recovery circuits for energy-efficient short-reach optical transceivers

    Get PDF
    Nowadays, the increasing demand for cloud based computing and social media services mandates higher throughput (at least 56 Gb/s per data lane with 400 Gb/s total capacity 1) for short reach optical links (with the reach typically less than 2 km) inside data centres. The immediate consequences are the huge and power hungry data centers. To address these issues the intra-data-center connectivity by means of optical links needs continuous upgrading. In recent years, the trend in the industry has shifted toward the use of more complex modulation formats like PAM4 due to its spectral efficiency over the traditional NRZ. Another advantage is the reduced number of channels count which is more cost-effective considering the required area and the I/O density. However employing PAM4 results in more complex transceivers circuitry due to the presence of multilevel transitions and reduced noise budget. In addition, providing higher speed while accommodating the stringent requirements of higher density and energy efficiency (< 5 pJ/bit), makes the design of the optical links more challenging and requires innovative design techniques both at the system and circuit level. This work presents the design of a Clock and Data Recovery Circuit (CDR) as one of the key building blocks for the transceiver modules used in such fibreoptic links. Capable of working with PAM4 signalling format, the new proposed CDR architecture targets data rates of 50−56 Gb/s while achieving the required energy efficiency (< 5 pJ/bit). At the system level, the design proposes a new PAM4 PD which provides a better trade-off in terms of bandwidth and systematic jitter generation in the CDR. By using a digital loop controller (DLC), the CDR gains considerable area reduction with flexibility to adjust the loop dynamics. At the circuit level it focuses on applying different circuit techniques to mitigate the circuit imperfections. It presents a wideband analog front end (AFE), suitable for a 56 Gb/s, 28-Gbaud PAM-4 signal, by using an 8x interleaved, master/ slave based sample and hold circuit. In addition, the AFE is equipped with a calibration scheme which corrects the errors associated with the sampling channels’ offset voltage and gain mismatches. The presented digital to phase converter (DPC) features a modified phase interpolator (PI), a new quadrature phase corrector (QPC) and multi-phase output with de-skewing capabilities.The DPC (as a standalone block) and the CDR (as the main focus of this work) were fabricated in 65-nm CMOS technology. Based on the measurements, the DPC achieves DNL/INL of 0.7/6 LSB respectively while consuming 40.5 mW power from 1.05 V supply. Although the CDR was not fully operational with the PAM4 input, the results from 25-Gbaud PAM2 (NRZ) test setup were used to estimate the performance. Under this scenario, the 1-UI JTOL bandwidth was measured to be 2 MHz with BER threshold of 10−4. The chip consumes 236 mW of power while operating on 1 − 1.2 V supply range achieving an energyefficiency of 4.27 pJ/bit

    Optical Network Design, Modelling and Performance Evaluation for the Upgraded LHC at CERN

    Get PDF
    This thesis considers how advances in optical network and optoelectronic technologies may be utilised in particle physics applications. The research is carried out within a certain framework; CERN's Large Hadron Collider (LHC) upgrade. The focus is on the upgrade of the "last-tier" data links, those residing between the last information-processing stage and the accelerator. For that purpose, different network architectures, based on the Passive Optical Network (PON) architectural paradigm, are designed and evaluated. Firstly, a Time-Division Multiplexed (TDM) PON targeting timing, trigger and control applications is designed. The bi-directional, point-to-multipoint nature of the architecture leads to infrastructure efficiency increase. A custom protocol is developed and implemented using FPGAs. It is experimentally verified that the network design can deliver significantly higher data rate than the current infrastructure and meet the stringent latency requirements of the targeted application. Consequently, the design of a network that can be utilised to transmit all types of information at the upgraded LHC, the High-Luminosity LHC (HL-LHC) is discussed. The most challenging requirement is that of the high upstream data rate. As WDM offers virtual point-to-point connectivity, the possibility of using a Wavelength-Division Multiplexed (WDM) PON is theoretically investigated. The shortcomings of this solution are identified; these include high cost and complexity, therefore a simpler architecture is designed. This is also based on the PON paradigm and features the use of Reflective Electroabsorption Modulators (REAM) at the front-end (close to the particle collision point). Its performance is experimentally investigated and shown to meet the requirements of a unified architecture at the HL-LHC from a networking perspective. Finally, since the radiation resistance of optoelectronic components used at the front-end is of major importance, the REAM radiation hardness is experimentally investigated. Their radiation resistance limits are established, while new insights into the radiation damage mechanism are gained

    A 3.2 Gb/s CDR Using Semi-Blind Oversampling to Achieve High Jitter Tolerance

    No full text
    Abstract—A hybrid CDR is presented that embeds a 5 blind-oversampling CDR within a conventional phase-tracking CDR. This hybrid CDR has a jitter tolerance that is the product of the individual jitter tolerances. In this implementation, the jitter tolerance of a phase-tracking CDR alone is increased by a factor of 32 at frequencies below its loop filter’s bandwidth, while maintaining the high-frequency jitter tolerance of a 5 blind-oversampling CDR. Measured data from a 0.11 m CMOS test chip at 2.4 Gb/s confirm a 200 UI peak-to-peak jitter tolerance for a 200 kHz jitter. The test chip operates from 1.9 Gb/s to 3.5 Gb/s with a BER less than 10 II, consuming 115 mW at 2.4 Gb/s. Index Terms—Blind oversampling, clock and data recovery (CDR), jitter tolerance, oversampling, phase tracking. I
    corecore