20 research outputs found

    Source-synchronous I/O Links using Adaptive Interface Training for High Bandwidth Applications

    Get PDF
    Mobility is the key to the global business which requires people to be always connected to a central server. With the exponential increase in smart phones, tablets, laptops, mobile traffic will soon reach in the range of Exabytes per month by 2018. Applications like video streaming, on-demand-video, online gaming, social media applications will further increase the traffic load. Future application scenarios, such as Smart Cities, Industry 4.0, Machine-to-Machine (M2M) communications bring the concepts of Internet of Things (IoT) which requires high-speed low power communication infrastructures. Scientific applications, such as space exploration, oil exploration also require computing speed in the range of Exaflops/s by 2018 which means TB/s bandwidth at each memory node. To achieve such bandwidth, Input/Output (I/O) link speed between two devices needs to be increased to GB/s. The data at high speed between devices can be transferred serially using complex Clock-Data-Recovery (CDR) I/O links or parallely using simple source-synchronous I/O links. Even though CDR is more efficient than the source-synchronous method for single I/O link, but to achieve TB/s bandwidth from a single device, additional I/O links will be required and the source-synchronous method will be more advantageous in terms of area and power requirements as additional I/O links do not require extra hardware resources. At high speed, there are several non-idealities (Supply noise, crosstalk, Inter- Symbol-Interference (ISI), etc.) which create unwanted skew problem among parallel source-synchronous I/O links. To solve these problems, adaptive trainings are used in time domain to synchronize parallel source-synchronous I/O links irrespective of these non-idealities. In this thesis, two novel adaptive training architectures for source-synchronous I/O links are discussed which require significantly less silicon area and power in comparison to state-of-the-art architectures. First novel adaptive architecture is based on the unit delay concept to synchronize two parallel clocks by adjusting the phase of one clock in only one direction. Second novel adaptive architecture concept consists of Phase Interpolator (PI)-based Phase Locked Loop (PLL) which can adjust the phase in both direction and achieve faster synchronization at the expense of added complexity. With an increase in parallel I/O links, clock skew which is generated by the improper clock tree, also affects the timing margin. Incorrect duty cycle further reduces the timing margin mainly in Double Data Rate (DDR) systems which are generally used to increase the bandwidth of a high-speed communication system. To solve clock skew and duty cycle problems, a novel clock tree buffering algorithm and a novel duty cycle corrector are described which further reduce the power consumption of a source-synchronous system

    RF MEMS reference oscillators platform for wireless communications

    Get PDF
    A complete platform for RF MEMS reference oscillator is built to replace bulky quartz from mobile devices, thus reducing size and cost. The design targets LTE transceivers. A low phase noise 76.8 MHz reference oscillator is designed using material temperature compensated AlN-on-silicon resonator. The thesis proposes a system combining piezoelectric resonator with low loading CMOS cross coupled series resonance oscillator to reach state-of-the-art LTE phase noise specifications. The designed resonator is a two port fundamental width extensional mode resonator. The resonator characterized by high unloaded quality factor in vacuum is designed with low temperature coefficient of frequency (TCF) using as compensation material which enhances the TCF from - 3000 ppm to 105 ppm across temperature ranges of -40˚C to 85˚C. By using a series resonant CMOS oscillator, phase noise of -123 dBc/Hz at 1 kHz, and -162 dBc/Hz at 1MHz offset is achieved. The oscillator’s integrated RMS jitter is 106 fs (10 kHz–20 MHz), consuming 850 μA, with startup time is 250μs, achieving a Figure-of-merit (FOM) of 216 dB. Electronic frequency compensation is presented to further enhance the frequency stability of the oscillator. Initial frequency offset of 8000 ppm and temperature drift errors are combined and further addressed electronically. A simple digital compensation circuitry generates a compensation word as an input to 21 bit MASH 1 -1-1 sigma delta modulator incorporated in RF LTE fractional N-PLL for frequency compensation. Temperature is sensed using low power BJT band-gap front end circuitry with 12 bit temperature to digital converter characterized by a resolution of 0.075˚C. The smart temperature sensor consumes only 4.6 μA. 700 MHz band LTE signal proved to have the stringent phase noise and frequency resolution specifications among all LTE bands. For this band, the achieved jitter value is 1.29 ps and the output frequency stability is 0.5 ppm over temperature ranges from -40˚C to 85˚C. The system is built on 32nm CMOS technology using 1.8V IO device

    Inductorless Frequency Synthesizers for Low-Cost Wireless

    Get PDF
    AbstractThe quest for ubiquitous wireless connectivity, drives an increasing demand for compact and efficient means of frequency generation. Conventional synthesizer options, however, generally trade one requirement for the other, achieving either excellent levels of efficiency by leveraging LC-oscillators, or a very compact area by relying on ring-oscillators. This chapter describes a recently introduced class of inductorless frequency synthesizers, based on the periodic realignment of a ring-oscillator, that have the potential to break this tradeoff. After analyzing their jitter-power product, the conditions that ensure optimum performance are derived and a novel digital-to-time converter range-reduction technique is introduced, to enable low-jitter and low-power fractional-N frequency synthesis. A prototype, which implements the proposed design guidelines and techniques, has been fabricated in 65 nm CMOS. It occupies a core area of 0:0275 mm2^{2} 2 and covers the 1:6-to-3:0 GHz range, achieving an absolute rms jitter (integrated from 30 kHz-to-30 MHz) of 397 fs at 2:5 mW power. With a corresponding jitter-power figure-of-merit of −244 dB in the fractional-N mode, the prototype outperforms prior state-of-the-art inductorless frequency synthesizers

    고속 시리얼 링크를 위한 고리 발진기를 기반으로 하는 주파수 합성기

    Get PDF
    학위논문(박사) -- 서울대학교대학원 : 공과대학 전기·정보공학부, 2022. 8. 정덕균.In this dissertation, major concerns in the clocking of modern serial links are discussed. As sub-rate, multi-standard architectures are becoming predominant, the conventional clocking methodology seems to necessitate innovation in terms of low-cost implementation. Frequency synthesis with active, inductor-less oscillators replacing LC counterparts are reviewed, and solutions for two major drawbacks are proposed. Each solution is verified by prototype chip design, giving a possibility that the inductor-less oscillator may become a proper candidate for future high-speed serial links. To mitigate the high flicker noise of a high-frequency ring oscillator (RO), a reference multiplication technique that effectively extends the bandwidth of the following all-digital phase-locked loop (ADPLL) is proposed. The technique avoids any jitter accumulation, generating a clean mid-frequency clock, overall achieving high jitter performance in conjunction with the ADPLL. Timing constraint for the proper reference multiplication is first analyzed to determine the calibration points that may correct the existent phase errors. The weight for each calibration point is updated by the proposed a priori probability-based least-mean-square (LMS) algorithm. To minimize the time required for the calibration, each gain for the weight update is adaptively varied by deducing a posteriori which error source dominates the others. The prototype chip is fabricated in a 40-nm CMOS technology, and its measurement results verify the low-jitter, high-frequency clock generation with fast calibration settling. The presented work achieves an rms jitter of 177/223 fs at 8/16-GHz output, consuming 12.1/17-mW power. As the second embodiment, an RO-based ADPLL with an analog technique that addresses the high supply sensitivity of the RO is presented. Unlike prior arts, the circuit for the proposed technique does not extort the RO voltage headroom, allowing high-frequency oscillation. Further, the performance given from the technique is robust over process, voltage, and temperature (PVT) variations, avoiding the use of additional calibration hardware. Lastly, a comprehensive analysis of phase noise contribution is conducted for the overall ADPLL, followed by circuit optimizations, to retain the low-jitter output. Implemented in a 40-nm CMOS technology, the frequency synthesizer achieves an rms jitter of 289 fs at 8 GHz output without any injected supply noise. Under a 20-mVrms white supply noise, the ADPLL suppresses supply-noise-induced jitter by -23.8 dB.본 논문은 현대 시리얼 링크의 클락킹에 관여되는 주요한 문제들에 대하여 기술한다. 준속도, 다중 표준 구조들이 채택되고 있는 추세에 따라, 기존의 클라킹 방법은 낮은 비용의 구현의 관점에서 새로운 혁신을 필요로 한다. LC 공진기를 대신하여 능동 소자 발진기를 사용한 주파수 합성에 대하여 알아보고, 이에 발생하는 두가지 주요 문제점과 각각에 대한 해결 방안을 탐색한다. 각 제안 방법을 프로토타입 칩을 통해 그 효용성을 검증하고, 이어서 능동 소자 발진기가 미래의 고속 시리얼 링크의 클락킹에 사용될 가능성에 대해 검토한다. 첫번째 시연으로써, 고주파 고리 발진기의 높은 플리커 잡음을 완화시키기 위해 기준 신호를 배수화하여 뒷단의 위상 고정 루프의 대역폭을 효과적으로 극대화 시키는 회로 기술을 제안한다. 본 기술은 지터를 누적 시키지 않으며 따라서 깨끗한 중간 주파수 클락을 생성시켜 위상 고정 루프와 함께 높은 성능의 고주파 클락을 합성한다. 기준 신호를 성공적으로 배수화하기 위한 타이밍 조건들을 먼저 분석하여 타이밍 오류를 제거하기 위한 방법론을 파악한다. 각 교정 중량은 연역적 확률을 기반으로한 LMS 알고리즘을 통해 갱신되도록 설계된다. 교정에 필요한 시간을 최소화 하기 위하여, 각 교정 이득은 타이밍 오류 근원들의 크기를 귀납적으로 추론한 값을 바탕으로 지속적으로 제어된다. 40-nm CMOS 공정으로 구현된 프로토타입 칩의 측정을 통해 저소음, 고주파 클락을 빠른 교정 시간안에 합성해 냄을 확인하였다. 이는 177/223 fs의 rms 지터를 가지는 8/16 GHz의 클락을 출력한다. 두번째 시연으로써, 고리 발진기의 높은 전원 노이즈 의존성을 완화시키는 기술이 포함된 주파수 합성기가 설계되었다. 이는 고리 발진기의 전압 헤드룸을 보존함으로서 고주파 발진을 가능하게 한다. 나아가, 전원 노이즈 감소 성능은 공정, 전압, 온도 변동에 대하여 민감하지 않으며, 따라서 추가적인 교정 회로를 필요로 하지 않는다. 마지막으로, 위상 노이즈에 대한 포괄적 분석과 회로 최적화를 통하여 주파수 합성기의 저잡음 출력을 방해하지 않는 방법을 고안하였다. 해당 프로토타입 칩은 40-nm CMOS 공정으로 구현되었으며, 전원 노이즈가 인가되지 않은 상태에서 289 fs의 rms 지터를 가지는 8 GHz의 클락을 출력한다. 또한, 20 mVrms의 전원 노이즈가 인가되었을 때에 유도되는 지터의 양을 -23.8 dB 만큼 줄이는 것을 확인하였다.1 Introduction 1 1.1 Motivation 3 1.1.1 Clocking in High-Speed Serial Links 4 1.1.2 Multi-Phase, High-Frequency Clock Conversion 8 1.2 Dissertation Objectives 10 2 RO-Based High-Frequency Synthesis 12 2.1 Phase-Locked Loop Fundamentals 12 2.2 Toward All-Digital Regime 15 2.3 RO Design Challenges 21 2.3.1 Oscillator Phase Noise 21 2.3.2 Challenge 1: High Flicker Noise 23 2.3.3 Challenge 2: High Supply Noise Sensitivity 26 3 Filtering RO Noise 28 3.1 Introduction 28 3.2 Proposed Reference Octupler 34 3.2.1 Delay Constraint 34 3.2.2 Phase Error Calibration 38 3.2.3 Circuit Implementation 51 3.3 IL-ADPLL Implementation 55 3.4 Measurement Results 59 3.5 Summary 63 4 RO Supply Noise Compensation 69 4.1 Introduction 69 4.2 Proposed Analog Closed Loop for Supply Noise Compensation 72 4.2.1 Circuit Implementation 73 4.2.2 Frequency-Domain Analysis 76 4.2.3 Circuit Optimization 81 4.3 ADPLL Implementation 87 4.4 Measurement Results 90 4.5 Summary 98 5 Conclusions 99 A Notes on the 8REF 102 B Notes on the ACSC 105박

    INJECTION-LOCKING TECHNIQUES FOR MULTI-CHANNEL ENERGY EFFICIENT TRANSMITTER

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    PHY Link Design and Optimization For High-Speed Low-Power Communication Systems

    Get PDF
    The ever-growing demands for high-bandwidth data transfer have been pushing towards advancing research efforts in the field of high-performing communication systems. Studies on the performance of single chip, e.g. faster multi-core processors and higher system memory capacity, have been explored. To further enhance the system performance, researches have been focused on the improvement of data-transfer bandwidth for chip-to-chip communication in the high-speed serial link. Many solutions have been addressed to overcome the bottleneck caused by the non-idealties such as bandwidth-limited electrical channel that connects two link devices and varieties of undesired noise in the communication systems. Nevertheless, with these solutions data have run into limitations of the timing margins for high-speed interfaces running at multiple gigabits per second data rates on low-cost Printed Circuit Board (PCB) material with constrained power budget. Therefore, the challenge in designing a physical layer (PHY) link for high-speed communication systems turns out to be power-efficient, reliable and cost-effective. In this context, this dissertation is intended to focus on architectural design, system-level and circuit-level verification of a PHY link as well as system performance optimization in respective of power, reliability and adaptability in high-speed communication systems. The PHY is mainly composed of clock data recovery (CDR), equalizers (EQs) and high- speed I/O drivers. Symmetrical structure of the PHY link is usually duplicated in both link devices for bidirectional data transmission. By introducing training mechanisms into high-speed communication systems, the timing in one link device is adaptively aligned to the timing condition specified in the other link device despite of different skews or induced jitter resulting from process, voltage and temperature (PVT) variations in the individual link. With reliable timing relationships among the interface signals provided, the total system bandwidth is dramatically improved. On the other hand, interface training offers high flexibility for reuse without further investigation on high demanding components involved in high costs. In the training mode, a CDR module is essential for reconstructing the transmitted bitstream to achieve the best data eye and to detect the edges of data stream in asynchronous systems or source-synchronous systems. Generally, the CDR works as a feedback control system that aligns its output clock to the center of the received data. In systems that contain multiple data links, the overall CDR power consumption increases linearly with the increase in number of links as one CDR is required for each link. Therefore, a power-efficient CDR plays a significant role in such systems with parallel links. Furthermore, a high performance CDR requires low jitter generation in spite of high input jitter. To minimize the trade-off between power consumption and CDR jitter, a novel CDR architecture is proposed by utilizing the proportional-integral (PI) controller and three times sampling scheme. Meanwhile, signal integrity (SI) becomes critical as the data rate exceeds several gigabits per second. Distorted data due to the non-idealties in systems are likely to reduce the signal quality aggressively and result in intolerable transmission errors in worst case scenarios, thus affect the system effective bandwidth. Hence, additional trainings such as transmitter (Tx) and receiver (Rx) EQ trainings for SI purpose are inserted into the interface training. Besides, a simplified system architecture with unsymmetrical placement of adaptive Rx and Tx EQs in a single link device is proposed and analyzed by using different coefficient adaptation algorithms. This architecture enables to reduce a large number of EQs through the training, especially in case of parallel links. Meanwhile, considerable power and chip area are saved. Finally, high-speed I/O driver against PVT variations is discussed. Critical issues such as overshoot and undershoot interfering with the data are primarily accompanied by impedance mismatch between the I/O driver and its transmitting channel. By applying PVT compensation technique I/O driver impedances can be effectively calibrated close to the target value. Different digital impedance calibration algorithms against PVT variations are implemented and compared for achieving fast calibration and low power requirements

    데이터 전송로 확장성과 루프 선형성을 향상시킨 다중채널 수신기들에 관한 연구

    Get PDF
    학위논문 (박사)-- 서울대학교 대학원 : 전기·컴퓨터공학부, 2013. 2. 정덕균.Two types of serial data communication receivers that adopt a multichannel architecture for a high aggregate I/O bandwidth are presented. Two techniques for collaboration and sharing among channels are proposed to enhance the loop-linearity and channel-expandability of multichannel receivers, respectively. The first proposed receiver employs a collaborative timing scheme recovery which relies on the sharing of all outputs of phase detectors (PDs) among channels to extract common information about the timing and multilevel signaling architecture of PAM-4. The shared timing information is processed by a common global loop filter and is used to update the phase of the voltage-controlled oscillator with better rejection of per-channel noise. In addition to collaborative timing recovery, a simple linearization technique for binary PDs is proposed. The technique realizes a high-rate oversampling PD while the hardware cost is equivalent to that of a conventional 2x-oversampling clock and data recovery. The first receiver exploiting the collaborative timing recovery architecture is designed using 45-nm CMOS technology. A single data lane occupies a 0.195-mm2 area and consumes a relatively low 17.9 mW at 6 Gb/s at 1.0V. Therefore, the power efficiency is 2.98 mW/Gb/s. The simulated jitter is about 0.034 UI RMS given an input jitter value of 0.03 UI RMS, while the relatively constant loop bandwidth with the PD linearization technique is about 7.3-MHz regardless of the data-stream noise. Unlike the first receiver, the second proposed multichannel receiver was designed to reduce the hardware complexity of each lane. The receiver employs shared calibration logic among channels and yet achieves superior channel expandability with slim data lanes. A shared global calibration control, which is used in a forwarded clock receiver based on a multiphase delay-locked loop, accomplishes skew calibration, equalizer adaptation, and the phase lock of all channels during a calibration period, resulting in reduced hardware overhead and less area required by each data lane. The second forwarded clock receiver is designed in 90-nm CMOS technology. It achieves error-free eye openings of more than 0.5 UI across 9− 28 inch Nelco 4000-6 microstrips at 4− 7 Gb/s and more than 0.42 UI at data rates of up to 9 Gb/s. The data lane occupies only 0.152 mm2 and consumes 69.8 mW, while the rest of the receiver occupies 0.297 mm2 and consumes 56 mW at a data rate of 7 Gb/s and a supply voltage of 1.35 V.1. Introduction 1 1.1 Motivations 1.2 Thesis Organization 2. Previous Receivers for Serial-Data Communications 2.1 Classification of the Links 2.2 Clocking architecture of transceivers 2.3 Components of receiver 2.3.1 Channel loss 2.3.2 Equalizer 2.3.3 Clock and data recovery circuit 2.3.3.1. Basic architecture 2.3.3.2. Phase detector 2.3.3.2.1. Linear phase detector 2.3.3.2.2. Binary phase detector 2.3.3.3. Frequency detector 2.3.3.4. Charge pump 2.3.3.5. Voltage controlled oscillator and delay-line 2.3.4 Loop dynamics of PLL 2.3.5 Loop dynamics of DLL 3. The Proposed PLL-Based Receiver with Loop Linearization Technique 3.1 Introduction 3.2 Motivation 3.3 Overview of binary phase detection 3.4 The proposed BBPD linearization technique 3.4.1 Architecture of the proposed PLL-based receiver 3.4.2 Linearization technique of binary phase detection 3.4.3 Rotational pattern of sampling phase offset 3.5 PD gain analysis and optimization 3.6 Loop Dynamics of the 2nd-order CDR 3.7 Verification with the time-accurate behavioral simulation 3.8 Summary 4. The Proposed DLL-Based Receiver with Forwarded-Clock 4.1 Introduction 4.2 Motivation 4.3 Design consideration 4.4 Architecture of the proposed forwarded-clock receiver 4.5 Circuit description 4.5.1 Analog multi-phase DLL 4.5.2 Dual-input interpolating deley cells 4.5.3 Dedicated half-rate data samplers 4.5.4 Cherry-Hooper continuous-time linear equalizer 4.5.5 Equalizer adaptation and phase-lock scheme 4.6 Measurement results 5. Conclusion 6. BibliographyDocto

    Design of clock and data recovery circuits for energy-efficient short-reach optical transceivers

    Get PDF
    Nowadays, the increasing demand for cloud based computing and social media services mandates higher throughput (at least 56 Gb/s per data lane with 400 Gb/s total capacity 1) for short reach optical links (with the reach typically less than 2 km) inside data centres. The immediate consequences are the huge and power hungry data centers. To address these issues the intra-data-center connectivity by means of optical links needs continuous upgrading. In recent years, the trend in the industry has shifted toward the use of more complex modulation formats like PAM4 due to its spectral efficiency over the traditional NRZ. Another advantage is the reduced number of channels count which is more cost-effective considering the required area and the I/O density. However employing PAM4 results in more complex transceivers circuitry due to the presence of multilevel transitions and reduced noise budget. In addition, providing higher speed while accommodating the stringent requirements of higher density and energy efficiency (< 5 pJ/bit), makes the design of the optical links more challenging and requires innovative design techniques both at the system and circuit level. This work presents the design of a Clock and Data Recovery Circuit (CDR) as one of the key building blocks for the transceiver modules used in such fibreoptic links. Capable of working with PAM4 signalling format, the new proposed CDR architecture targets data rates of 50−56 Gb/s while achieving the required energy efficiency (< 5 pJ/bit). At the system level, the design proposes a new PAM4 PD which provides a better trade-off in terms of bandwidth and systematic jitter generation in the CDR. By using a digital loop controller (DLC), the CDR gains considerable area reduction with flexibility to adjust the loop dynamics. At the circuit level it focuses on applying different circuit techniques to mitigate the circuit imperfections. It presents a wideband analog front end (AFE), suitable for a 56 Gb/s, 28-Gbaud PAM-4 signal, by using an 8x interleaved, master/ slave based sample and hold circuit. In addition, the AFE is equipped with a calibration scheme which corrects the errors associated with the sampling channels’ offset voltage and gain mismatches. The presented digital to phase converter (DPC) features a modified phase interpolator (PI), a new quadrature phase corrector (QPC) and multi-phase output with de-skewing capabilities.The DPC (as a standalone block) and the CDR (as the main focus of this work) were fabricated in 65-nm CMOS technology. Based on the measurements, the DPC achieves DNL/INL of 0.7/6 LSB respectively while consuming 40.5 mW power from 1.05 V supply. Although the CDR was not fully operational with the PAM4 input, the results from 25-Gbaud PAM2 (NRZ) test setup were used to estimate the performance. Under this scenario, the 1-UI JTOL bandwidth was measured to be 2 MHz with BER threshold of 10−4. The chip consumes 236 mW of power while operating on 1 − 1.2 V supply range achieving an energyefficiency of 4.27 pJ/bit

    Injection locked ring oscillator design for application in Direct Time of Flight LIDAR

    Get PDF
    Diplomová práce přibližuje systémy LIDAR přímo měřící čas průletu a časově digitální převodníky určené k použití v těchto systémech. Představuje problematiku distribuce hodinových signálů napříč soubory časově digitálních převodníků v LIDAR systémech a věnuje se jednomu z nových řešení této problematiky, které je založené na injekcí zavěšených oscilátorech. Technika injekčního zavěšení oscilátorů je důkladně matematicky popsána. V programu Matlab byl vytvořen simulační model injekcí zavěšeného kruhového oscilátoru, který potvrzuje správnost uvedených analytických predikcí. Ve výrobní technologii ONK65 byl navržen injekcí zavěšený kruhový oscilátor stabilizovaný pomocí smyčky závěsu zpoždění, určený pro implementaci časově digitálního převodníku pro systém LIDAR. Navržený injekcí zavěšený kruhový oscilátor byl verifikován počítačovými simulacemi zohledňujícími vliv procesních, napěťových i teplotních variací. Oscilátor poskytuje specifikované časové rozlišení 50 pikosekund a dosahuje dvakrát nižší hodnoty fázového neklidu než ekvivalentní volnoběžný oscilátor v dané technologii.The diploma thesis provides an introduction to Direct Time of Flight LIDAR systems and Time to Digital Converters used in these systems. It discusses the problem of clock distribution in LIDAR Time to Digital Converter arrays, and examines one of the possible solutions to this problem based on injection locked oscillators. The injection locking phenomenon is thoroughly mathematically described and a Matlab model of an injection locked ring oscillator is presented, confirming the analytic predictions. In ONK65 processing technology, an injection locked ring oscillator biased by a delay locked loop meant specifically for application in Time to Digital Converters for LIDAR systems has been designed. The designed oscillator has been verified by computer simulations taking process, voltage and temperature variations into account and offers specified time resolution of 50 picosecond as well as two times less clock jitter than an equivalent free-running oscillator in the given processing technology.

    Topical Workshop on Electronics for Particle Physics

    Get PDF
    The purpose of the workshop was to present results and original concepts for electronics research and development relevant to particle physics experiments as well as accelerator and beam instrumentation at future facilities; to review the status of electronics for the LHC experiments; to identify and encourage common efforts for the development of electronics; and to promote information exchange and collaboration in the relevant engineering and physics communities
    corecore