56 research outputs found

    PAM4 Transmitter and Receiver Equalizers Optimization for High-Speed Serial Links

    As the telecommunications market evolves, the demand for faster data transfer and processing continues to increase. To meet this demand, Peripheral Component Interconnect Express (PCIe) has raised its data rates from PCIe Gen 1 (4 Gb/s) to PCIe Gen 5 (32 Gb/s). This evolution has brought new challenges due to high-speed interconnect effects, which can cause data loss and intersymbol interference. Under these conditions, the traditional non-return-to-zero (NRZ) modulation scheme has become a bottleneck because of bandwidth limitations in high-speed interconnects. The four-level pulse amplitude modulation (PAM4) scheme is being implemented in the next generation of PCIe (PCIe6), doubling the data rate without increasing the channel bandwidth. However, while PAM4 solves the bandwidth problem, it also brings new challenges in post-silicon equalization. Tuning the transmitter (Tx) and receiver (Rx) across different interconnect channels can be very time-consuming because of the multiple equalizers implemented in the serializer/deserializer (SerDes). Typical current industrial practices for SerDes equalizer tuning require massive lab measurements, since they are based on exhaustive enumeration methods, making the equalization process too lengthy and practically prohibitive under current silicon time-to-market commitments. In this master’s dissertation, a numerical method is proposed to optimize the transmitter and receiver equalizers of a PCIe6 link. The experimental results, tested in a MATLAB simulation environment, demonstrate the effectiveness of the proposed approach by delivering optimal PAM4 eye-diagram margins while significantly reducing the jitter. ITESO, A.C
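The claim that PAM4 doubles the data rate at the same symbol rate can be illustrated with a minimal sketch. The Gray-coded level map and the names `PAM4_LEVELS`, `nrz_encode`, and `pam4_encode` are illustrative assumptions, not taken from the dissertation.

```python
# Illustrative Gray-coded PAM4 map (an assumption): two bits select one of four levels.
PAM4_LEVELS = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}


def nrz_encode(bits):
    """NRZ: one symbol per bit (0 -> -1, 1 -> +1)."""
    return [1 if b else -1 for b in bits]


def pam4_encode(bits):
    """PAM4: one symbol per pair of bits, so the bit count must be even."""
    if len(bits) % 2:
        raise ValueError("PAM4 needs an even number of bits")
    return [PAM4_LEVELS[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


bits = [0, 0, 0, 1, 1, 1, 1, 0]
nrz_symbols = nrz_encode(bits)    # 8 symbols for 8 bits
pam4_symbols = pam4_encode(bits)  # 4 symbols for the same 8 bits
```

For the same number of bits, PAM4 sends half as many symbols as NRZ, so at an equal symbol rate it carries twice the data. The cost is a smaller spacing between adjacent levels, which is why equalizer tuning becomes harder.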

    A high speed serializer/deserializer design

    A Serializer/Deserializer (SerDes) is a circuit that converts parallel data into a serial stream and vice versa. It helps solve clock/data skew problems, simplifies data transmission, lowers power consumption, and reduces chip cost. The goal of this project was to solve the challenges in high-speed SerDes design, which included low-jitter, wide-bandwidth, and low-power design. A quarter-rate multiplexer/demultiplexer (MUX/DEMUX) was implemented. This quarter-rate structure decreases the required clock frequency from one half to one quarter of the data rate. It is shown that this significantly relaxes the design of the VCO at high speed and achieves lower power consumption. A novel multi-phase LC-ring oscillator was developed to supply a low-noise clock to the SerDes. The proposed VCO combines an LC tank with a ring structure to achieve both a wide tuning range (11%) and low phase noise (-110 dBc/Hz at 1 MHz offset). With this structure, a data rate of 36 Gb/s was realized with a measured peak-to-peak jitter of 10 ps using 0.18 µm SiGe BiCMOS technology. The power consumption is 3.6 W with a 3.4 V power supply. At a 60 Gb/s data rate, the simulated peak-to-peak jitter was 4.8 ps using 65 nm CMOS technology. The power consumption is 92 mW with a 2 V power supply. A time-to-digital converter (TDC) calibration circuit was designed to compensate for the phase mismatches among the multiple phases of the PLL clock using a three-dimensional fully depleted silicon-on-insulator (3D FDSOI) CMOS process. The 3D process separated the analog PLL portion from the digital calibration portion into different tiers. This eliminated the noise coupling through the common substrate that occurs in a 2D process. Mismatches caused by the vertical tier-to-tier interconnections and the temperature influence in the 3D process were attenuated by the proposed calibration circuit.
The design strategy and circuits developed in this dissertation provide significant benefit to both wired and wireless applications.
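The quarter-rate idea described in this abstract can be sketched behaviorally as follows. The function names and the round-robin lane ordering are assumptions made for illustration; the dissertation's circuit-level MUX/DEMUX is not modeled.

```python
def quarter_rate_clock_ghz(data_rate_gbps):
    """Clock frequency needed by a quarter-rate architecture, in GHz."""
    return data_rate_gbps / 4.0


def mux4(lanes):
    """Serialize four equal-length parallel lanes round-robin (assumed ordering)."""
    if len(lanes) != 4 or len({len(lane) for lane in lanes}) != 1:
        raise ValueError("expected four lanes of equal length")
    return [bit for group in zip(*lanes) for bit in group]


def demux4(stream):
    """Recover the four lanes by taking every fourth bit of the serial stream."""
    if len(stream) % 4:
        raise ValueError("stream length must be a multiple of 4")
    return [stream[i::4] for i in range(4)]


lanes = [[1, 0], [0, 0], [1, 1], [0, 1]]
serial = mux4(lanes)
recovered = demux4(serial)
```

Under this model, the 36 Gb/s and 60 Gb/s links quoted above would need 9 GHz and 15 GHz clocks respectively, instead of the 18 GHz and 30 GHz a half-rate design would require.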

    Digital Centric Multi-Gigabit SerDes Design and Verification

    Advances in semiconductor manufacturing continue to shrink feature sizes and allow ever higher degrees of integration in application-specific integrated circuits (ASICs). The bandwidth requirements on the external interfaces of such systems on chip (SoCs) are therefore steadily growing. Yet, because the number of pins on these ASICs is not increasing at the same pace (a constraint known as pin limitation), the bandwidth per pin has to be increased. SerDes (Serializer/Deserializer) technology, which allows data to be transferred serially at very high rates of 25 Gbps and more, is a key technology for overcoming pin limitation and exploiting the computing power available in today's SoCs. Because SerDes blocks, together with the digital logic interfacing them, form complex mixed-signal systems, verifying performance and functional correctness is very challenging. In this thesis, a novel mixed-signal design methodology is proposed that tightly couples model and implementation in order to ensure consistency throughout the design cycles and thereby accelerate the overall implementation flow. A tool flow is presented that integrates well into state-of-the-art electronic design automation (EDA) environments and enables the use of this methodology in practice. Further, the design space of today's high-speed serial links is analyzed, and an architecture is proposed that pushes complexity into the digital domain in order to achieve robustness, portability between manufacturing processes, and scaling with advanced node technologies. The all-digital phase-locked loop (PLL) and clock data recovery (CDR) that were developed are described in detail. The developed design flow was used to implement the SerDes architecture in a 28 nm silicon process and proved to be indispensable for future projects.

    Transmitter and Receiver Equalizers Optimization Methodologies for High-Speed Links in Industrial Computer Platforms Post-Silicon Validation

    As microprocessor design scales to nanometric technology, traditional post-silicon validation techniques are inadequate for achieving full system functional coverage. Physical complexity and extreme technology process variations make it difficult to guarantee performance over process, voltage, and temperature conditions. In addition, there is an increasingly higher number of mixed-signal circuits within microprocessors. Many of them correspond to high-speed input/output (HSIO) links. Improvements in signaling methods, circuits, and process technology have allowed HSIO data rates to scale beyond 10 Gb/s, where undesired effects can create multiple signal integrity problems. With all of these elements, post-silicon validation of HSIO links is difficult and time-consuming. One of the major challenges in electrical validation of HSIO links lies in the physical layer (PHY) tuning process, where equalization techniques are used to cancel these undesired effects. Typical current industrial practices for PHY tuning require massive lab measurements, since they are based on exhaustive enumeration methods. In this work, direct and surrogate-based optimization methods, including space mapping, are proposed based on suitable objective functions to efficiently tune the transmitter and receiver equalizers. The proposed methodologies are evaluated by lab measurements on realistic industrial post-silicon validation platforms, confirming a dramatic speed-up in PHY tuning and substantial performance improvement.

    A 40-Gb/s Quarter-Rate SerDes Transmitter and Receiver Chipset in 65-nm CMOS

    This paper presents a 40-Gb/s transmitter (TX) and receiver (RX) chipset for chip-to-chip communications in a 65-nm CMOS process. The TX implements a quarter-rate multi-multiplexer (MUX)-based four-tap feed-forward equalizer (FFE), where a charge-sharing-effect elimination technique is introduced into the 4:1 MUX to optimize its jitter performance and power efficiency. The RX employs a two-stage continuous-time linear equalizer as the analog front end and integrates a low-cost sign-based zero-forcing engine relying on edge-data correlation to automatically adjust the tap weights of the TX-FFE. By embedding low-pass filters with an adaptively adjusting bandwidth into the data-sampling path and adopting high-linearity compensating phase interpolators, the clock data recovery achieves both high jitter tolerance and low jitter generation. The fabricated TX and RX chipset delivers 40-Gb/s PRBS data at BER 16-dB loss at half-baud frequency, while consuming a total power of 370 mW

    A Holistic Formulation for System Margining and Jitter Tolerance Optimization in Industrial Post-Silicon Validation

    There is an increasingly higher number of mixed-signal circuits within microprocessors and systems on chip (SoC). A significant portion of them corresponds to high-speed input/output (HSIO) links. Post-silicon validation of HSIO links can be critical for making a product release qualification decision under aggressive launch schedules. The optimization of receiver analog circuitry in modern HSIO links is a very time-consuming post-silicon validation process. Current industrial practices are based on exhaustive enumeration methods to improve either the system margins or the jitter tolerance compliance test. In this paper, these two requirements are addressed in a holistic optimization-based approach. We propose a novel objective function based on these two metrics. Our method employs Kriging to build a surrogate model based on system margining and jitter tolerance measurements. The proposed method, tested with three different realistic server HSIO links, is able to deliver optimal system margins and guarantee jitter tolerance compliance while substantially decreasing the typical post-silicon validation time. ITESO, A.C

    Delay Flip-Flop (DFF) Metastability Impact on Clock and Data Recovery (CDR) and Phase-Locked Loop (PLL) Circuits

    Modeling delay flip-flops for binary (e.g., Alexander) phase detectors requires paying close attention to three important timing parameters: setup time, hold time, and clock-edge-to-output time (C2Q time for short). These parameters play a critical role in determining the state of the system at the circuit level. This study provides a guideline for designing an optimum DFF for an Alexander phase detector in a clock and data recovery circuit. Furthermore, it indicates the DFF timing requirements for a high-speed phase detector in a clock and data recovery circuit. The CDR was also modeled in Verilog-A, and the results were compared with those of a Simulink model. In the final design in 45 nm CMOS technology, for a 10 Gbps random sequence, the recovered clock contained 0.136 UI and 0.15 UI peak-to-peak jitter on the falling and rising edges respectively, and the lock time was 125 ns. The overall power dissipation was 21 mW from a 1 V supply voltage. Future work includes layout design and manufacturing of the proposed design.
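The decision logic of a binary (Alexander) phase detector can be sketched as below. This is an idealized model: the function name and the sign convention are assumptions, and it ignores the setup, hold, and C2Q effects that the study above is concerned with.

```python
def alexander_pd(prev_data, edge, curr_data):
    """Idealized Alexander (bang-bang) phase detector decision.

    prev_data and curr_data are consecutive data samples; edge is the sample
    taken at the clock edge between them.
    Returns -1 if the clock is early, +1 if it is late, and 0 when there is
    no data transition (assumed sign convention).
    """
    if prev_data == curr_data:
        return 0  # no transition, so no phase information
    # The edge sample still equals the old bit: the clock sampled before the
    # data transition, so the clock is early. Otherwise it is late.
    return -1 if edge == prev_data else 1


decisions = [alexander_pd(a, t, b) for a, t, b in [(0, 0, 1), (0, 1, 1), (1, 1, 1), (1, 0, 0)]]
```

In a real detector each of the three samples comes from a DFF, so a sample taken too close to a data transition may be metastable and resolve to either value. That case is exactly where the timing parameters listed above matter.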

    Clock And Data Recovery Using Bang-Bang PLL’s

    Thesis (M.Sc.) -- İstanbul Technical University, Institute of Science and Technology, 2008. In this work, bang-bang PLL structures, which are used extensively in clock and data recovery systems, are investigated. Behavioral models of the loop elements are created to enable faster system-level simulations. This step is mandatory for bang-bang systems, which are hard to analyze with simple calculations. Some non-idealities of real circuit elements are included in these models. System-level design issues of bang-bang PLLs are discussed, along with the effect of data transition density on system specifications. Transistor-level implementations of the loop elements are described. A popular delay cell with symmetric loads (Maneatis cell) is modified to be used effectively in a bang-bang loop. After the initial design, the VCO gain is very large, which is required to cover the operating frequency range over process and temperature corners. Large gain makes the system prone to external noise sources such as power supply and substrate noise. Therefore, an automatic calibration method is developed to reduce the VCO gain. This technique can be applied to any current-controlled oscillator in various phase-locked loops. After frequency lock is achieved, the oscillator current is sampled by a current-mode ADC, and a narrower operating range is generated around that point. Additionally, frequency variation due to temperature is compensated through the specifically designed reference current of the ADC. Finally, simulation results of the CDR and calibration circuits are given. The CDR is designed in 0.18 µm CMOS technology and can operate at a 5 Gb/s data rate.

    Design and realization of a 2.4 Gbps - 3.2 Gbps clock and data recovery circuit

    This thesis presents the design, verification, system integration, and physical realization of a high-speed monolithic phase-locked loop (PLL) based clock and data recovery (CDR) circuit. The architecture of the CDR has been realized as a two-loop structure consisting of coarse and fine loops, each of which is capable of processing the incoming low-speed reference clock and high-speed random data. At start-up, the coarse loop provides fast locking to the system frequency with the help of the reference clock. After the VCO clock comes close to the system frequency, the LOCK signal is generated and the coarse loop is turned off, while the fine loop is turned on. The fine loop tracks the phase of the generated clock with respect to the data and aligns the VCO clock such that its rising edge is in the middle of the data eye. The speed and symmetry of the sub-blocks in the fine loop are extremely important, since all asymmetric charging effects, skew, and setup/hold problems in this loop translate into a static phase error at the clock output. The entire circuit architecture is built with a special low-voltage circuit design technique. All analogue as well as digital sub-blocks of the CDR architecture presented in this work operate on differential signalling, which makes the design significantly more complex while ensuring more robust performance. Other important features of this CDR include small area, a single power supply, low power consumption, the capability to operate at very high data rates, and the ability to handle data rates between 2.4 Gbps and 3.2 Gbps. The CDR architecture was realized using a conventional 0.13 µm digital CMOS technology (Foundry: UMC), which ensures a lower overall cost and better portability for the design. The CDR architecture presented in this work is capable of operating at sampling frequencies of up to 3.2 GHz while still achieving robust phase alignment.
The entire circuit is designed with a single 1.2 V power supply. The overall power consumption is estimated as 18.6 mW at a 3.2 GHz sampling rate. The overall silicon area of the CDR is approximately 0.3 mm^2, including its internal loop filter capacitors. Other researchers have reported PLL-based clock and data recovery circuits with similar features in terms of operating data rate, architecture, and jitter performance. To the best of our knowledge, this clock recovery circuit has the advantage of being the first high-speed CDR designed in 0.13 µm CMOS technology, with superior power consumption and area compared with the others. The CDR architecture presented in this thesis is intended as a state-of-the-art clock recovery circuit for high-speed applications such as optical communications or high-bandwidth serial wireline communication. It can be used either as a stand-alone single-chip unit or as an embedded intellectual property (IP) block that can be integrated with other modules on chip.