146 research outputs found

    Design methodology for a referenceless clock and data recovery circuit based on a stochastic frequency detector

    Doctoral thesis, Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, August 2022. 정덕균 (Deog-Kyoon Jeong).
    In this thesis, a design of a high-speed, power-efficient, wide-range clock and data recovery (CDR) circuit without a reference clock is proposed. A frequency acquisition scheme using a stochastic frequency detector (SFD) based on the Alexander phase detector (PD) is utilized for the referenceless operation. Pattern histogram analysis is presented to analyze the frequency acquisition behavior of the SFD and is verified by simulation. Based on the information obtained by pattern histogram analysis, an SFD using autocovariance is proposed. With a direct-proportional path and a digital integral path, the proposed referenceless CDR achieves frequency lock at all measurable conditions, and the measured frequency acquisition time is within 7 μs. The prototype chip has been fabricated in a 40-nm CMOS process and occupies an active area of 0.032 mm^2. The proposed referenceless CDR achieves a BER of less than 10^(-12) at 32 Gb/s and exhibits an energy efficiency of 1.15 pJ/b at 32 Gb/s with a 1.0 V supply.
    Contents:
    Chapter 1 Introduction: 1.1 Motivation; 1.2 Thesis Organization
    Chapter 2 Backgrounds: 2.1 Clocking Architectures in Serial Link Interface; 2.2 General Considerations for Clock and Data Recovery (2.2.1 Overview; 2.2.2 Jitter; 2.2.3 CDR Jitter Characteristics); 2.3 CDR Architectures (2.3.1 PLL-Based CDR with External Reference Clock; 2.3.2 DLL/PI-Based CDR; 2.3.3 PLL-Based CDR without External Reference Clock); 2.4 Frequency Acquisition Scheme (2.4.1 Typical Frequency Detectors: 2.4.1.1 Digital Quadricorrelator Frequency Detector, 2.4.1.2 Rotational Frequency Detector; 2.4.2 Prior Works)
    Chapter 3 Design of the Referenceless CDR Using SFD: 3.1 Overview; 3.2 Proposed Frequency Detector (3.2.1 Motivation; 3.2.2 Pattern Histogram Analysis; 3.2.3 Introduction of Autocovariance to Stochastic Frequency Detector); 3.3 Circuit Implementation (3.3.1 Implementation of the Proposed Referenceless CDR; 3.3.2 Continuous-Time Linear Equalizer (CTLE); 3.3.3 Digitally-Controlled Oscillator (DCO)); 3.4 Measurement Results
    Chapter 4 Conclusion
    Appendix A Detailed Frequency Acquisition Waveforms of the Proposed SFD
    Bibliography
    Abstract in Korean
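
The abstract above builds its frequency detector on the Alexander phase detector. As a rough illustration of that building block only, here is a minimal sketch of the textbook early/late decision, not the thesis's stochastic frequency detector. The function names and the sign convention (-1 for early, +1 for late) are assumptions made for this example.

```python
def alexander_pd(prev_data, edge, curr_data):
    """Textbook Alexander (bang-bang) phase detector decision.

    prev_data and curr_data are consecutive data samples; edge is the
    sample taken between them. Returns -1 (clock early), +1 (clock late)
    or 0 (no transition). The sign convention is an assumption.
    """
    if prev_data == curr_data:
        return 0   # no data transition, so no timing information
    if edge == prev_data:
        return -1  # edge sample still shows the old bit: clock is early
    return 1       # edge sample already shows the new bit: clock is late


def pd_votes(data, edges):
    """Apply the detector to a bit sequence; edges[i] lies between data[i] and data[i+1]."""
    return [alexander_pd(data[i], edges[i], data[i + 1])
            for i in range(len(data) - 1)]


data = [0, 1, 1, 0, 1]
edges_early = [0, 1, 1, 0]  # every edge sample equals the preceding bit
edges_late = [1, 1, 0, 1]   # every edge sample equals the following bit
print(pd_votes(data, edges_early))
print(pd_votes(data, edges_late))
```

The detector yields no information when consecutive bits are equal, so its output depends on the data pattern.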

    A high speed serializer/deserializer design

    A Serializer/Deserializer (SerDes) is a circuit that converts parallel data into a serial stream and vice versa. It helps solve clock/data skew problems, simplifies data transmission, lowers power consumption and reduces chip cost. The goal of this project was to solve the challenges in high-speed SerDes design, which include low-jitter, wide-bandwidth and low-power design. A quarter-rate multiplexer/demultiplexer (MUX/DEMUX) was implemented. This quarter-rate structure decreases the required clock frequency from one half to one quarter of the data rate. It is shown that this significantly relaxes the design of the VCO at high speed and achieves lower power consumption. A novel multi-phase LC-ring oscillator was developed to supply a low-noise clock to the SerDes. The proposed VCO combines an LC tank with a ring structure to achieve both a wide tuning range (11%) and low phase noise (-110 dBc/Hz at 1 MHz offset). With this structure, a data rate of 36 Gb/s was realized with a measured peak-to-peak jitter of 10 ps using 0.18 µm SiGe BiCMOS technology. The power consumption is 3.6 W with a 3.4 V supply. At a 60 Gb/s data rate, the simulated peak-to-peak jitter was 4.8 ps using 65 nm CMOS technology. The power consumption is 92 mW with a 2 V supply. A time-to-digital converter (TDC) calibration circuit was designed to compensate for the phase mismatches among the multiple phases of the PLL clock using a three-dimensional fully depleted silicon-on-insulator (3D FDSOI) CMOS process. The 3D process separates the analog PLL portion and the digital calibration portion into different tiers. This eliminates the noise coupling through the common substrate that occurs in a 2D process. Mismatches caused by the vertical tier-to-tier interconnections and by temperature effects in the 3D process are attenuated by the proposed calibration circuit. The design strategy and circuits developed in this dissertation provide significant benefit to both wired and wireless applications.
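
To make the quarter-rate idea concrete, the following is a minimal behavioral sketch, assuming four equal-length parallel lanes that are interleaved lane 0 first. The helper names (`serialize`, `deserialize`, `clock_ghz`) are hypothetical and model only the bit ordering and the clock-rate arithmetic, not the circuit described in the dissertation.

```python
def serialize(lanes):
    """Interleave equal-length parallel lanes into one serial stream (lane 0 first)."""
    return [bit for word in zip(*lanes) for bit in word]


def deserialize(stream, n_lanes):
    """Split a serial stream back into n_lanes parallel lanes."""
    return [stream[i::n_lanes] for i in range(n_lanes)]


def clock_ghz(data_rate_gbps, bits_per_clock_cycle):
    """Clock frequency needed when each clock cycle carries the given number of bits."""
    return data_rate_gbps / bits_per_clock_cycle


lanes = [[1, 0], [0, 0], [1, 1], [0, 1]]  # four lanes, two bits each
stream = serialize(lanes)
print(stream)
print(deserialize(stream, 4) == lanes)

# For the 36 Gb/s figure quoted in the abstract:
print(clock_ghz(36, 2))  # half-rate clock in GHz
print(clock_ghz(36, 4))  # quarter-rate clock in GHz
```

At 36 Gb/s a quarter-rate architecture needs a 9 GHz clock instead of the 18 GHz a half-rate one would need, which is the relaxation of the VCO design that the abstract refers to.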

    MatLab® modeling of high-bit-rate communication interfaces

    Master's in Electronics and Telecommunications Engineering. High-speed digital data transmission is under continuous development. The constant increase in bit rates has led to the need for more sophisticated and complex receivers, systems that recover data transmitted over a dispersive channel that degrades the quality of the transmitted signal. The receiver must therefore compensate for the distortion introduced by the channel and synchronize the received signal, which, in addition to distortion, is affected by jitter. The distortion introduced by the channel is attenuated by equalization circuits that compensate the channel frequency response at the transmission rate, making it as flat as possible at the desired frequency. Synchronization of the received signal is achieved by clock and data recovery circuits, which usually recover a clock signal from the data transitions and use it to sample the received data. The main focus of this thesis is the modeling of a data receiver for a high-speed interface. Simulating the data receiver block requires modeling a transmission channel according to its characteristics. The proposed transmission system, from the transmitter to the output of the data recovery block, includes equalization filters for signal conditioning, of which several distinct architectures are studied. Two architectures are proposed for the clock and data recovery circuit. The first is a 2x oversampling clock and data recovery circuit based on a Phase Tracking architecture. The second is a 3x oversampling clock and data recovery circuit based on a Blind Sampling architecture. Both architectures are modeled in order to obtain and analyze their jitter tolerance curves. It is crucial to know how much jitter these circuits can tolerate while still recovering the data with a satisfactory bit error ratio. The results closely match the theoretical values: the 2x and 3x oversampling architectures show a jitter tolerance of approximately 12 UI and 23 UI, respectively, at low jitter frequencies.

    Design of energy efficient high speed I/O interfaces

    Energy efficiency has become a key performance metric for wireline high speed I/O interfaces. Consequently, design of low power I/O interfaces has garnered large interest that has mostly been focused on active power reduction techniques at peak data rate. In practice, most systems exhibit a wide range of data transfer patterns. As a result, low energy per bit operation at peak data rate does not necessarily translate to overall low energy operation. Therefore, I/O interfaces that can scale their power consumption with data rate requirement are desirable. Rapid on-off I/O interfaces have a potential to scale power with data rate requirements without severely affecting either latency or the throughput of the I/O interface. In this work, we explore circuit techniques for designing rapid on-off high speed wireline I/O interfaces and digital fractional-N PLLs. A burst-mode transmitter suitable for rapid on-off I/O interfaces is presented that achieves 6 ns turn-on time by utilizing a fast frequency settling ring oscillator in digital multiplying delay-locked loop and a rapid on-off biasing scheme for current mode output driver. Fabricated in 90 nm CMOS process, the prototype achieves 2.29 mW/Gb/s energy efficiency at peak data rate of 8 Gb/s. A 125X (8 Gb/s to 64 Mb/s) change in effective data rate results in 67X (18.29 mW to 0.27 mW) change in transmitter power consumption corresponding to only 2X (2.29 mW/Gb/s to 4.24 mW/Gb/s) degradation in energy efficiency for 32-byte long data bursts. We also present an analytical bit error rate (BER) computation technique for this transmitter under rapid on-off operation, which uses MDLL settling measurement data in conjunction with always-on transmitter measurements. This technique indicates that the BER bathtub width for 10^(−12) BER is 0.65 UI and 0.72 UI during rapid on-off operation and always-on operation, respectively. 
Next, a pulse response estimation-based technique is proposed enabling burst-mode operation for baud-rate sampling receivers that operate over high loss channels. Such receivers typically employ discrete time equalization to combat inter-symbol interference. Implementation details are provided for a receiver chip, fabricated in 65 nm CMOS technology, that demonstrates efficacy of the proposed technique. A low complexity pulse response estimation technique is also presented for low power receivers that do not employ discrete time equalizers. We also present techniques for implementation of highly digital fractional-N PLL employing a phase interpolator based fractional divider to improve the quantization noise shaping properties of a 1-bit ∆Σ frequency-to-digital converter. Fabricated in 65 nm CMOS process, the prototype calibration-free fractional-N Type-II PLL employs the proposed frequency-to-digital converter in place of a high resolution time-to-digital converter and achieves 848 fs rms integrated jitter (1 kHz-30 MHz) and -101 dBc/Hz in-band phase noise while generating 5.054 GHz output from 31.25 MHz input.
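
The power-scaling figures quoted for the burst-mode transmitter can be checked with a few lines of arithmetic. This is a minimal sketch using only the rounded numbers from the abstract; the function name `energy_per_bit` is an assumption for illustration. With the rounded powers, the low-rate efficiency comes out near 4.22 mW/Gb/s rather than the quoted 4.24, presumably because the abstract rounds the measured power.

```python
def energy_per_bit(power_mw, rate_gbps):
    """Energy efficiency in mW/(Gb/s), which is numerically equal to pJ/bit."""
    return power_mw / rate_gbps


peak_rate_gbps, peak_power_mw = 8.0, 18.29   # always-on operation at peak rate
low_rate_gbps, low_power_mw = 0.064, 0.27    # 64 Mb/s effective data rate

peak_eff = energy_per_bit(peak_power_mw, peak_rate_gbps)
low_eff = energy_per_bit(low_power_mw, low_rate_gbps)

print(f"rate change:  {peak_rate_gbps / low_rate_gbps:.0f}X")
print(f"power change: {peak_power_mw / low_power_mw:.1f}X")
print(f"efficiency:   {peak_eff:.2f} -> {low_eff:.2f} mW/Gb/s")
print(f"degradation:  {low_eff / peak_eff:.2f}X")
```

The point of the comparison is that power falls almost in proportion to the effective data rate, so energy per bit degrades by less than a factor of two across a 125X range.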

    Design and realization of a 2.4 Gbps - 3.2 Gbps clock and data recovery circuit

    This thesis presents the design, verification, system integration and physical realization of a high-speed monolithic phase-locked loop (PLL) based clock and data recovery (CDR) circuit. The CDR is realized as a two-loop structure consisting of coarse and fine loops, each of which is capable of processing the incoming low-speed reference clock and high-speed random data. At start-up, the coarse loop provides fast locking to the system frequency with the help of the reference clock. Once the VCO clock is close to the system frequency, the LOCK signal is generated and the coarse loop is turned off, while the fine loop is turned on. The fine loop tracks the phase of the generated clock with respect to the data and aligns the VCO clock so that its rising edge is in the middle of the data eye. The speed and symmetry of the sub-blocks in the fine loop are extremely important, since all asymmetric charging effects, skew and setup/hold problems in this loop translate into a static phase error at the clock output. The entire circuit architecture is built with a special low-voltage circuit design technique. All analogue and digital sub-blocks of the CDR architecture presented in this work use differential signalling, which makes the design significantly more complex while ensuring more robust performance. Other important features of this CDR include small area, a single power supply, low power consumption, and the ability to handle data rates between 2.4 Gbps and 3.2 Gbps. The CDR architecture was realized using a conventional 0.13-µm digital CMOS technology (foundry: UMC), which ensures a lower overall cost and better portability for the design. The CDR architecture presented in this work is capable of operating at sampling frequencies of up to 3.2 GHz while still achieving robust phase alignment. The entire circuit is designed with a single 1.2 V power supply. The overall power consumption is estimated as 18.6 mW at a 3.2 GHz sampling rate. The overall silicon area of the CDR is approximately 0.3 mm^2, including its internal loop filter capacitors. Other researchers have reported PLL-based clock and data recovery circuits with similar operating data rate, architecture and jitter performance. To the best of our knowledge, this is the first high-speed CDR designed in 0.13-µm CMOS technology, and it compares favorably with the others in power consumption and area. The CDR architecture presented in this thesis is intended as a state-of-the-art clock recovery circuit for high-speed applications such as optical communications or high-bandwidth serial wireline communication. It can be used either as a stand-alone single-chip unit or as an embedded intellectual property (IP) block that can be integrated with other modules on chip.

    Engineering evaluations and studies. Volume 3: Exhibit C

    High-rate multiplexer asymmetry and jitter, data-dependent amplitude variations, and transition density are discussed.

    Modelling and performance analysis of multigigabit serial interconnects using real number based analog verification methods

    The increasing importance of multigigabit transceiver circuits in modern chip design calls for new methods of analyzing and integrating these challenging building blocks. This work presents a design and analysis framework based on the SystemVerilog real number modeling approach. It further extends the resulting simulation capabilities by introducing additional higher-level numeric modelling and evaluation methods to support multigigabit statistical link budgeting procedures based on the Peak Distortion Algorithm.
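
The link budgeting mentioned above rests on peak distortion analysis. As a minimal sketch of the textbook form for binary signalling (not the SystemVerilog framework described in the abstract), the worst-case vertical eye margin is the cursor of the sampled pulse response minus the sum of the magnitudes of all other taps. The function name and the pulse-response values below are hypothetical.

```python
def worst_case_eye(pulse, cursor_index):
    """Worst-case one-sided eye margin from a baud-spaced pulse response.

    Peak distortion analysis assumes the data pattern that makes every
    inter-symbol interference (ISI) tap subtract from the cursor at once.
    A non-positive result means the worst-case eye is closed.
    """
    cursor = pulse[cursor_index]
    isi = sum(abs(h) for i, h in enumerate(pulse) if i != cursor_index)
    return cursor - isi


# Hypothetical pulse response: one pre-cursor, the cursor, two post-cursors.
pulse = [0.05, 0.6, 0.2, -0.1]
margin = worst_case_eye(pulse, cursor_index=1)
print(f"worst-case eye margin: {margin:.2f}")
```

Because this bound assumes the single worst data pattern, it is pessimistic compared with a statistical eye, which is one reason the two are often used together in link budgeting.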

    Digital Centric Multi-Gigabit SerDes Design and Verification

    Advances in semiconductor manufacturing still lead to ever-decreasing feature sizes and constantly allow higher degrees of integration in application-specific integrated circuits (ASICs). The bandwidth requirements on the external interfaces of such systems on chip (SoCs) are therefore steadily growing. Yet, as the number of pins on these ASICs is not increasing at the same pace (known as pin limitation), the bandwidth per pin has to be increased. SerDes (Serializer/Deserializer) technology, which allows data to be transferred serially at very high data rates of 25 Gbps and more, is a key technology to overcome pin limitation and exploit the computing power that can be achieved in today's SoCs. As such SerDes blocks, together with the digital logic interfacing them, form complex mixed-signal systems, verification of performance and functional correctness is very challenging. In this thesis a novel mixed-signal design methodology is proposed, which tightly couples model and implementation in order to ensure consistency throughout the design cycles and thereby accelerate the overall implementation flow. A tool flow is presented that integrates well into state-of-the-art electronic design automation (EDA) environments and enables the use of this methodology in practice. Further, the design space of today's high-speed serial links is analyzed and an architecture is proposed that pushes complexity into the digital domain in order to achieve robustness, portability between manufacturing processes and scaling with advanced node technologies. The all-digital phase-locked loop (PLL) and clock data recovery (CDR) that have been developed are described in detail. The developed design flow was used for the implementation of the SerDes architecture in a 28 nm silicon process and proved to be indispensable for future projects.

    Design of High-Speed SerDes Transceiver for Chip-to-Chip Communications in CMOS Process

    With the continuous increase of on-chip computation capacities and the exponential growth of data-intensive applications, high-speed data transmission through serial links has become the backbone of modern communication systems. To satisfy the massive data-exchange requirement, the data rate of such serial links has risen from several Gb/s to tens of Gb/s. Currently, commercial standards such as Ethernet 400GbE, InfiniBand high data rate (HDR), and common electrical interface (CEI)-56G have been developing towards 40+ Gb/s. As the core component within these links, the transceiver chipset plays a fundamental role in balancing the operation speed, power consumption, area occupation, and operation range. Meanwhile, the CMOS process has become the dominant technology in modern transceiver chip fabrication due to its large-scale digital integration capability and aggressive pricing advantage. This research aims to explore advanced techniques that are capable of exploiting the maximum operation speed of the CMOS process, and hence provides potential solutions for 40+ Gb/s CMOS transceiver designs. The major contributions are summarized as follows. A low-jitter ring-oscillator-based injection-locked clock multiplier (RILCM) with a hybrid frequency tracking loop that consists of a traditional phase-locked loop (PLL), a timing-adjusted loop, and a loop selection state machine is implemented in a 65-nm CMOS process. In the ring voltage-controlled oscillator, a full-swing pseudo-differential delay cell is proposed to lower the conversion of device noise to phase noise. To obtain high operation speed and high detection accuracy, a compact timing-adjusted phase detector tightly combined with a well-matched charge pump is designed. Meanwhile, a lock-loss detection and lock recovery scheme is devised to give the RILCM a lock-acquisition ability similar to that of a conventional PLL, thus removing the need for an initial frequency set-up aid and preventing the potential lock-loss risk. The experimental results show that the figure-of-merit of the designed RILCM reaches -247.3 dB, which is better than previous RILCMs and even comparable to the large-area LC-ILCMs. The transmitter (TX) and receiver (RX) chips are separately designed and fabricated in a 65-nm CMOS process. The transmitter chip employs a quarter-rate multi-multiplexer (MUX)-based 4-tap feed-forward equalizer (FFE) to pre-distort the output. To increase the maximum operating speed, a bandwidth-enhanced 4:1 MUX with the capability of eliminating the charge-sharing effect is proposed. To produce the quarter-rate parallel data streams with appropriate delays, a compact latch array associated with an interleaved-retiming technique is designed. The receiver chip employs a two-stage continuous-time linear equalizer (CTLE) as the analog front-end and integrates an improved clock data recovery circuit to extract the sampling clocks and retime the incoming data. To automatically balance jitter tracking and jitter suppression, passive low-pass filters with adaptively adjusted bandwidth are introduced into the data-sampling path. To optimize the linearity of the phase interpolation, a time-averaging-based compensating phase interpolator is proposed. For equalization, a combined TX-FFE and RX-CTLE is applied to compensate for the channel loss, where a low-cost edge-data correlation-based sign zero-forcing adaptation algorithm is proposed to automatically adjust the TX-FFE's tap weights. Measurement results show that the fabricated transmitter/receiver chipset can deliver 40 Gb/s random data at a bit error rate of 16 dB loss at the half-baud frequency, while consuming a total power of 370 mW.