106 research outputs found

    A high speed serializer/deserializer design

    Get PDF
    A Serializer/Deserializer (SerDes) is a circuit that converts parallel data into a serial stream and vice versa. It helps solve clock/data skew problems, simplifies data transmission, lowers the power consumption and reduces the chip cost. The goal of this project was to solve the challenges in high speed SerDes design, which included the low jitter design, wide bandwidth design and low power design. A quarter-rate multiplexer/demultiplexer (MUX/DEMUX) was implemented. This quarter-rate structure decreases the required clock frequency from one half to one quarter of the data rate. It is shown that this significantly relaxes the design of the VCO at high speed and achieves lower power consumption. A novel multi-phase LC-ring oscillator was developed to supply a low noise clock to the SerDes. This proposed VCO combined an LC-tank with a ring structure to achieve both wide tuning range (11%) and low phase noise (-110dBc/Hz at 1MHz offset). With this structure, a data rate of 36 Gb/s was realized with a measured peak-to-peak jitter of 10ps using 0.18microm SiGe BiCMOS technology. The power consumption is 3.6W with 3.4V power supply voltage. At a 60 Gb/s data rate the simulated peak-to-peak jitter was 4.8ps using 65nm CMOS technology. The power consumption is 92mW with 2V power supply voltage. A time-to-digital (TDC) calibration circuit was designed to compensate for the phase mismatches among the multiple phases of the PLL clock using a three dimensional fully depleted silicon on insulator (3D FDSOI) CMOS process. The 3D process separated the analog PLL portion from the digital calibration portion into different tiers. This eliminated the noise coupling through the common substrate in the 2D process. Mismatches caused by the vertical tier-to-tier interconnections and the temperature influence in the 3D process were attenuated by the proposed calibration circuit. The design strategy and circuits developed from this dissertation provide significant benefit to both wired and wireless applications

    Reporte de formación complementaria en área de concentración en Diseño de Circuitos Integrados

    Get PDF
    El objetivo de este documento es presentar los proyectos realizados en el área de Diseño de Circuitos Electrónicos Analógicos que mayor impacto representaron para la formación académica del alumno en el área de concentración seleccionada. En la asignatura de Diseño de Circuitos Integrados Analógicos se trabajó el proyecto Amplificador CMOS con compensación de offset para aplicación en bloques SerDes. En la asignatura de Diseño Avanzado de Circuitos Integrados Analógicos se realizó el proyecto Filtro OTA-C Pasa Bajas con sintonización automática asistida por PLL. Finalmente, en la materia de Diseño de Circuitos Integrados Digitales se llevó a cabo el proyecto Transmisor de Datos Digitales con impedancia de salida, énfasis y modulación de amplitud configurable

    고속 시리얼 링크를 위한 고리 발진기를 기반으로 하는 주파수 합성기

    Get PDF
    학위논문(박사) -- 서울대학교대학원 : 공과대학 전기·정보공학부, 2022. 8. 정덕균.In this dissertation, major concerns in the clocking of modern serial links are discussed. As sub-rate, multi-standard architectures are becoming predominant, the conventional clocking methodology seems to necessitate innovation in terms of low-cost implementation. Frequency synthesis with active, inductor-less oscillators replacing LC counterparts are reviewed, and solutions for two major drawbacks are proposed. Each solution is verified by prototype chip design, giving a possibility that the inductor-less oscillator may become a proper candidate for future high-speed serial links. To mitigate the high flicker noise of a high-frequency ring oscillator (RO), a reference multiplication technique that effectively extends the bandwidth of the following all-digital phase-locked loop (ADPLL) is proposed. The technique avoids any jitter accumulation, generating a clean mid-frequency clock, overall achieving high jitter performance in conjunction with the ADPLL. Timing constraint for the proper reference multiplication is first analyzed to determine the calibration points that may correct the existent phase errors. The weight for each calibration point is updated by the proposed a priori probability-based least-mean-square (LMS) algorithm. To minimize the time required for the calibration, each gain for the weight update is adaptively varied by deducing a posteriori which error source dominates the others. The prototype chip is fabricated in a 40-nm CMOS technology, and its measurement results verify the low-jitter, high-frequency clock generation with fast calibration settling. The presented work achieves an rms jitter of 177/223 fs at 8/16-GHz output, consuming 12.1/17-mW power. As the second embodiment, an RO-based ADPLL with an analog technique that addresses the high supply sensitivity of the RO is presented. Unlike prior arts, the circuit for the proposed technique does not extort the RO voltage headroom, allowing high-frequency oscillation. Further, the performance given from the technique is robust over process, voltage, and temperature (PVT) variations, avoiding the use of additional calibration hardware. Lastly, a comprehensive analysis of phase noise contribution is conducted for the overall ADPLL, followed by circuit optimizations, to retain the low-jitter output. Implemented in a 40-nm CMOS technology, the frequency synthesizer achieves an rms jitter of 289 fs at 8 GHz output without any injected supply noise. Under a 20-mVrms white supply noise, the ADPLL suppresses supply-noise-induced jitter by -23.8 dB.본 논문은 현대 시리얼 링크의 클락킹에 관여되는 주요한 문제들에 대하여 기술한다. 준속도, 다중 표준 구조들이 채택되고 있는 추세에 따라, 기존의 클라킹 방법은 낮은 비용의 구현의 관점에서 새로운 혁신을 필요로 한다. LC 공진기를 대신하여 능동 소자 발진기를 사용한 주파수 합성에 대하여 알아보고, 이에 발생하는 두가지 주요 문제점과 각각에 대한 해결 방안을 탐색한다. 각 제안 방법을 프로토타입 칩을 통해 그 효용성을 검증하고, 이어서 능동 소자 발진기가 미래의 고속 시리얼 링크의 클락킹에 사용될 가능성에 대해 검토한다. 첫번째 시연으로써, 고주파 고리 발진기의 높은 플리커 잡음을 완화시키기 위해 기준 신호를 배수화하여 뒷단의 위상 고정 루프의 대역폭을 효과적으로 극대화 시키는 회로 기술을 제안한다. 본 기술은 지터를 누적 시키지 않으며 따라서 깨끗한 중간 주파수 클락을 생성시켜 위상 고정 루프와 함께 높은 성능의 고주파 클락을 합성한다. 기준 신호를 성공적으로 배수화하기 위한 타이밍 조건들을 먼저 분석하여 타이밍 오류를 제거하기 위한 방법론을 파악한다. 각 교정 중량은 연역적 확률을 기반으로한 LMS 알고리즘을 통해 갱신되도록 설계된다. 교정에 필요한 시간을 최소화 하기 위하여, 각 교정 이득은 타이밍 오류 근원들의 크기를 귀납적으로 추론한 값을 바탕으로 지속적으로 제어된다. 40-nm CMOS 공정으로 구현된 프로토타입 칩의 측정을 통해 저소음, 고주파 클락을 빠른 교정 시간안에 합성해 냄을 확인하였다. 이는 177/223 fs의 rms 지터를 가지는 8/16 GHz의 클락을 출력한다. 두번째 시연으로써, 고리 발진기의 높은 전원 노이즈 의존성을 완화시키는 기술이 포함된 주파수 합성기가 설계되었다. 이는 고리 발진기의 전압 헤드룸을 보존함으로서 고주파 발진을 가능하게 한다. 나아가, 전원 노이즈 감소 성능은 공정, 전압, 온도 변동에 대하여 민감하지 않으며, 따라서 추가적인 교정 회로를 필요로 하지 않는다. 마지막으로, 위상 노이즈에 대한 포괄적 분석과 회로 최적화를 통하여 주파수 합성기의 저잡음 출력을 방해하지 않는 방법을 고안하였다. 해당 프로토타입 칩은 40-nm CMOS 공정으로 구현되었으며, 전원 노이즈가 인가되지 않은 상태에서 289 fs의 rms 지터를 가지는 8 GHz의 클락을 출력한다. 또한, 20 mVrms의 전원 노이즈가 인가되었을 때에 유도되는 지터의 양을 -23.8 dB 만큼 줄이는 것을 확인하였다.1 Introduction 1 1.1 Motivation 3 1.1.1 Clocking in High-Speed Serial Links 4 1.1.2 Multi-Phase, High-Frequency Clock Conversion 8 1.2 Dissertation Objectives 10 2 RO-Based High-Frequency Synthesis 12 2.1 Phase-Locked Loop Fundamentals 12 2.2 Toward All-Digital Regime 15 2.3 RO Design Challenges 21 2.3.1 Oscillator Phase Noise 21 2.3.2 Challenge 1: High Flicker Noise 23 2.3.3 Challenge 2: High Supply Noise Sensitivity 26 3 Filtering RO Noise 28 3.1 Introduction 28 3.2 Proposed Reference Octupler 34 3.2.1 Delay Constraint 34 3.2.2 Phase Error Calibration 38 3.2.3 Circuit Implementation 51 3.3 IL-ADPLL Implementation 55 3.4 Measurement Results 59 3.5 Summary 63 4 RO Supply Noise Compensation 69 4.1 Introduction 69 4.2 Proposed Analog Closed Loop for Supply Noise Compensation 72 4.2.1 Circuit Implementation 73 4.2.2 Frequency-Domain Analysis 76 4.2.3 Circuit Optimization 81 4.3 ADPLL Implementation 87 4.4 Measurement Results 90 4.5 Summary 98 5 Conclusions 99 A Notes on the 8REF 102 B Notes on the ACSC 105박

    CIRCUITS AND ARCHITECTURE FOR BIO-INSPIRED AI ACCELERATORS

    Get PDF
    Technological advances in microelectronics envisioned through Moore’s law have led to powerful processors that can handle complex and computationally intensive tasks. Nonetheless, these advancements through technology scaling have come at an unfavorable cost of significantly larger power consumption, which has posed challenges for data processing centers and computers at scale. Moreover, with the emergence of mobile computing platforms constrained by power and bandwidth for distributed computing, the necessity for more energy-efficient scalable local processing has become more significant. Unconventional Compute-in-Memory architectures such as the analog winner-takes-all associative-memory and the Charge-Injection Device processor have been proposed as alternatives. Unconventional charge-based computation has been employed for neural network accelerators in the past, where impressive energy efficiency per operation has been attained in 1-bit vector-vector multiplications, and in recent work, multi-bit vector-vector multiplications. In the latter, computation was carried out by counting quanta of charge at the thermal noise limit, using packets of about 1000 electrons. These systems are neither analog nor digital in the traditional sense but employ mixed-signal circuits to count the packets of charge and hence we call them Quasi-Digital. By amortizing the energy costs of the mixed-signal encoding/decoding over compute-vectors with many elements, high energy efficiencies can be achieved. In this dissertation, I present a design framework for AI accelerators using scalable compute-in-memory architectures. On the device level, two primitive elements are designed and characterized as target computational technologies: (i) a multilevel non-volatile cell and (ii) a pseudo Dynamic Random-Access Memory (pseudo-DRAM) bit-cell. At the level of circuit description, compute-in-memory crossbars and mixed-signal circuits were designed, allowing seamless connectivity to digital controllers. At the level of data representation, both binary and stochastic-unary coding are used to compute Vector-Vector Multiplications (VMMs) at the array level. Finally, on the architectural level, two AI accelerator for data-center processing and edge computing are discussed. Both designs are scalable multi-core Systems-on-Chip (SoCs), where vector-processor arrays are tiled on a 2-layer Network-on-Chip (NoC), enabling neighbor communication and flexible compute vs. memory trade-off. General purpose Arm/RISCV co-processors provide adequate bootstrapping and system-housekeeping and a high-speed interface fabric facilitates Input/Output to main memory

    CubeSat Data Transmission and Storage Throughput Optimization Through the Use of a Zynq SoC Based CubeSat Science Instrument Interface Electronics Board

    Get PDF
    The CubeSat standard sprang from the desire to create a satellite standard that would open the doors for universities and other lower budget research institutions by making it more feasible to get their work into space. Since then, many other institutions and industries have been adopting variations on the standard for their own use. As more people are seeking out to use the CubeSat standard as their main bus, the standards and practices of the community have grown and expanded and with this growth, new challenges have been created. One such challenge is the bandwidth limitation in the RF-downlink. When carrying payloads requiring what might seem to be a relatively small (science data) bandwidth requirement (on the order of thousands of bps), the RF-link to ground is overloaded. Many approaches in the past have been put forth to help alleviate this issue, unfortunately, none have been fully adopted. This paper presents a solution that takes advantage of new technology yet to be fully exploited in space applications. The key to the solution lies in removing the bandwidth requirements by enabling onboard post-data processing and compression. In order to achieve the high computational needs, while minimizing power consumption, a Xilinx Zynq-7000 SoC is used, creating a highly-programmable, open integration device. This report outlines the design, fabrication and testing of this solution. The completion of the Zynq Processing System CubeSat Science Instrument Interface Electronics Board (or ZPS-Board), ultimately demonstrates the feasibility of this solution. Additionally, this research is funded by NASA’s JPL, with secondary motives for the creating of a space application Zynq-7000 SoC based product. Upon successful completion of the ZPS-Board, the product creates a platform for JPL to perform environmental testing in order to study the effects and performance characteristics of the Zynq in space applications

    The Efficient Design of Time-to-Digital Converters

    Get PDF

    Design and realization of a 2.4 Gbps - 3.2 Gbps clock and data recovery circuit

    Get PDF
    This thesis presents the design, verification, system integration and the physical realization of a high-speed monolithic phase-locked loop (PLL) based clock and data recovery (CDR) circuit. The architecture of the CDR has been realized as a two-loop structure consisting of coarse and fine loops, each of which is capable of processing the incoming low-speed reference clock and high-speed random data. At start up, the coarse loop provides fast locking to the system frequency with the help of the reference clock. After the VCO clock reaches a proximity of system frequency , the LOCK signal is generated and the coarse loop is tumed off, while the fine loop is tumed on. Fine loop tracks the phase of the generated clock with respect to the data and aligns the VCO clock such that its rising edge is in the middle of data eye. The speed and symmetry of sub-blocks in fine loop are extremely important, since all asymmetric charging effects, skew and setup/hold problems in this loop translate into a static phase error at the clock output. The entire circuit architecture is built with a special low-voltage circuit design technique. All analogue as well as digital sub-blocks of the CDR architecture presented in this work operate on a differential signalling, which significantly makes the design more complex while ensuring a more robust perforrnance. Other important features of this CDR include small area, single power supply, low power consumption, capability to operate at very high data rates, and the ability to handle between 2.4 Gbps and 3.2 Gbps data rate. The CDR architecture was realized using a conventional 0.13-mikrometer digital CMOS technology (Foundry: UMC), which ensures a lower overall cost and better portability for the design. The CDR architecture presented in this work is capable of operating at sampling frequencies of up to 3.2 GHz, and still can achieve the robust phase alignrnent. The entire circuit is designed with single 1.2 V power supply .The overall power consumption is estimated as 18.6 mW at 3.2 GHz sampling rate. The overall silicon area of the CDR is approximately 0.3 mm^2 with its internal loop filter capacitors. Other researchers have reported similar featured PLL-based clock and data recovery circuits in terms of operating data rate, architecture and jitter performance. To the best of our knowledge, this clock recovery uses the advantage of being the first high-speed CDR designed in CMOS 0.13 mikrometer technology with the superiority on power consumption and area considerations among others. The CDR architecture presented in this thesis is intended, as a state-of-the-art clock recovery for high-speed applications such as optical communications or high bandwidth serial wireline communication needs. It can be used either as a stand-alone single-chip unit, or as an embedded intellectual property (IP) block that can be integrated with other modules on chip
    corecore