3 research outputs found

    Design of Low-Power NRZ/PAM-4 Wireline Transmitters

    Rapidly growing demand for instant multimedia access across a myriad of digital devices has pushed the need for higher bandwidth in modern communication hardware, ranging from short-reach (SR) memory/storage interfaces to long-reach (LR) data center Ethernet. At the same time, comprehensive design optimization of the link system is required to meet the energy-efficiency demands of mobile computing and to lower operational cost at data centers. This doctoral study presents the design of two low-swing wireline transmitters that feature low-power clock distribution and energy-efficient 2-tap equalization at up to 20-Gb/s operation. Although a voltage-mode (VM) transmit driver reduces signaling power, the segment selection logic it requires still diminishes the power-saving benefit. The first work presents a scalable VM transmitter that offers low static power dissipation and adopts an impedance-modulated 2-tap equalizer with analog tap control, thereby obviating driver segmentation and reducing pre-driver complexity and dynamic power. Per-channel quadrature clock generation with injection-locked oscillators (ILOs) produces rail-to-rail quadrature clocks. Energy efficiency is further improved with capacitively driven low-swing global clock distribution and supply scaling at lower data rates, while output eye quality is maintained at low voltages by automatic phase calibration of the local ILO-generated quarter-rate clocks. A prototype fabricated in a general-purpose 65 nm CMOS process includes a 2 mm global clock distribution network and two transmitters that support an output swing range of 100-300 mV with up to 12 dB of equalization. The transmitters achieve 8-16 Gb/s operation at 0.65-1.05 pJ/b energy efficiency. The second work is a dual-mode NRZ/PAM-4 differential low-swing VM transmitter. Its pulse-selected output multiplexing allows a reduction of the power supply and of the deterministic jitter caused by the large on-chip parasitics inherent in the transmission-gate-based multiplexers of the earlier work. Analog impedance control replica circuits running in the background produce gate-biasing voltages that set the peaking ratio for 2-tap feed-forward equalization and the PAM-4 symbol levels for high linearity. This analog control also allows efficient generation of the middle levels in PAM-4 operation, with good linearity quantified by a level separation mismatch ratio of 95%. In NRZ mode, the 2-tap feed-forward equalization is configurable in high-performance controlled-impedance or energy-efficient impedance-modulated settings to provide performance scalability. Analytical design considerations of dynamic power, data rate, mismatch, and output swing yield optimal performance metrics for the given technology node. The proof-of-concept prototype is verified on silicon in a 65 nm CMOS process, with improved speed and energy efficiency owing to double-stacked NMOS transistors in the output stage. The transmitter consumes as little as 29.6 mW in 20-Gb/s NRZ operation and 25.5 mW in 28-Gb/s PAM-4 operation.
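    The abstract quotes the first transmitter in pJ/b and the second in mW at a given data rate. As a rough aid for comparing the two, the sketch below converts the second work's quoted power figures into energy per bit, using the identity 1 mW / (1 Gb/s) = 1 pJ/b. The function name is hypothetical, and the resulting pJ/b values are derived here from the quoted numbers; they are not stated in the abstract.

```python
def energy_per_bit_pj(power_mw: float, data_rate_gbps: float) -> float:
    """Return energy per bit in pJ/b.

    Uses 1 mW / (1 Gb/s) = 1e-3 J/s / 1e9 b/s = 1e-12 J/b = 1 pJ/b.
    """
    if data_rate_gbps <= 0:
        raise ValueError("data rate must be positive")
    return power_mw / data_rate_gbps


# Power and data-rate pairs quoted in the abstract for the second transmitter.
nrz_pj = energy_per_bit_pj(29.6, 20.0)   # 20-Gb/s NRZ mode
pam4_pj = energy_per_bit_pj(25.5, 28.0)  # 28-Gb/s PAM-4 mode

print(f"NRZ:   {nrz_pj:.2f} pJ/b")   # NRZ:   1.48 pJ/b
print(f"PAM-4: {pam4_pj:.2f} pJ/b")  # PAM-4: 0.91 pJ/b
```

    These derived values describe a different prototype operating at higher data rates, so they are not directly comparable with the 0.65-1.05 pJ/b reported for the first work at 8-16 Gb/s.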

    Design of a 20-Gbps-Class Serializing Transceiver for Memory Interfaces

    Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, 2013. 8. 정덕균.

    Various types of serial link for current and future memory interfaces are presented in this thesis. First, a PHY design for commercial GDDR3 memory is proposed. The GDDR3 PHY consists of a read path, a write path, and a command path. The write and command paths calibrate skew using a variable delay line (VDL), while the read path calibrates skew using a delay-locked loop (DLL) and a VDL. There are four data channels and one command/address channel. Each data channel consists of one clock signal (DQS) and eight data signals (DQ). Each data channel operates at 1.2 Gbps (1.08-1.2 Gbps), and the command/address channel operates at 600 Mbps (540-600 Mbps). In particular, this thesis concentrates on DLL design that addresses high speed and simultaneous switching noise (SSN). Second, a serial link design for silicon photonics is proposed. Silicon photonics is the strongest candidate for the next-generation memory interface. The designs of a modulator driver for the modulator, and of a trans-impedance amplifier (TIA) and a limiting amplifier (LA) for the photodiode, are discussed. The link operates above 12.5 Gbps but consumes considerable power, 7.2 mW/Gbps in the transmitter core and 2 mW/Gbps in the receiver core, because it is connected to optical devices with large parasitic capacitance. An overall receiver that includes clock and data recovery (CDR) is also implemented. Multiple chips are fabricated in 65 nm and 0.13 µm CMOS processes. Finally, an electrical serial link for a 20 Gbps memory link is proposed. The overall architecture uses forwarded clocking and is simple and intuitive; it needs no additional synchronizer. This open-loop delay-matched streamlined receiver finds the optimum sampling point with a digitally controlled delay line (DCDL) controller and is expected to consume little power by virtue of its structure. Only a two-phase half-rate clock is transmitted through the clock channel, but half-rate time-interleaved sampling is performed with the aid of an initial-value-settable PRBS chaser. A CMOS chip is fabricated in a 65 nm process, and the transceiver occupies 2500 µm x 2500 µm. Power consumption is expected to be about 2.6 mW (2.4 mW)/Gbps in the transmitter and 4.1 mW (2.7 mW)/Gbps in the receiver. Further improvement in power consumption is expected in a more advanced process.

    ABSTRACT
    CONTENTS
    LIST OF FIGURES
    LIST OF TABLES
    CHAPTER 1 INTRODUCTION
        1.1 MOTIVATION
        1.2 THESIS ORGANIZATION
    CHAPTER 2 A SERIAL LINK PHY DESIGN FOR GDDR3 MEMORY INTERFACE
        2.1 INTRODUCTION
        2.2 GDDR3 MEMORY INTERFACE ARCHITECTURE
            2.2.1 READ PATH ARCHITECTURE
            2.2.2 WRITE PATH ARCHITECTURE
            2.2.3 COMMAND PATH ARCHITECTURE
        2.3 DLL DESIGN FOR MEMORY INTERFACE
            2.3.1 SSN (SIMULTANEOUS SWITCHING NOISE)
            2.3.2 DLL ARCHITECTURE
            2.3.3 VOLTAGE CONTROLLED DELAY LINE (VCDL)
            2.3.4 HYSTERESIS COARSE LOCK DETECTOR (HCLD)
            2.3.5 DYNAMIC PHASE DETECTOR AND CHARGE PUMP
        2.4 SIMULATION RESULT
        2.5 CONCLUSION
    CHAPTER 3 OPTICAL FRONT-END SERIAL LINK DESIGN FOR 20 GBPS MEMORY INTERFACE
        3.1 SILICON PHOTONICS INTRODUCTION
        3.2 OPTICAL FRONT-END TRANSMITTER DESIGN
            3.2.1 MODULATOR DRIVER REQUIREMENTS
            3.2.2 MODULATOR DRIVER DESIGN - CURRENT MODE DRIVER
            3.2.3 MODULATOR DRIVER DESIGN - CURRENT MODE DRIVER
        3.3 OPTICAL FRONT-END RECEIVER DESIGN
            3.3.1 OPTICAL RECEIVER BACK END REQUIREMENTS
            3.3.2 OPTICAL RECEIVER BACK END DESIGN - TIA
            3.3.3 OPTICAL RECEIVER BACK END DESIGN - LA, DRIVER
            3.3.4 OPTICAL RECEIVER BACK END DESIGN - CDR
        3.4 MEASUREMENT AND SIMULATION RESULTS
            3.4.1 MEASUREMENT AND SIMULATION ENVIRONMENTS
            3.4.2 OPTICAL TX FRONT END MEASUREMENT AND SIMULATION
            3.4.3 OPTICAL RX FRONT END MEASUREMENT AND SIMULATION
            3.4.4 OPTICAL RX BACK END SIMULATION
            3.4.5 OPTICAL-ELECTRICAL OVERALL MEASUREMENTS
            3.4.6 DIE PHOTO AND LAYOUT
        3.5 CONCLUSION
    CHAPTER 4 ELECTRICAL FRONT-END SERIAL LINK DESIGN FOR 20GBPS MEMORY INTERFACE
        4.1 INTRODUCTION
        4.2 CONVENTIONAL ELECTRICAL FRONT-END HIGH SPEED SERIAL LINK ARCHITECTURES
        4.3 DESIGN CONCEPT AND PROPOSED SERIAL LINK ARCHITECTURE - OPEN LOOP DELAY MATCHED STREAM LINED RECEIVER
            4.3.1 PROPOSED OVERALL ARCHITECTURE
            4.3.2 DESIGN CONCEPT
            4.3.3 PROPOSED PROTOCOL AND LOCKING PROCESS
        4.4 OPTIMUM POINT SEARCH ALGORITHM BASED DCDL CONTROLLER DESIGN
        4.5 DCDL (DIGITALLY CONTROLLED DELAY LINE) DESIGN
        4.6 DFE (DECISION FEEDBACK EQUALIZER) AND OTHER BLOCKS DESIGN
        4.7 SIMULATION RESULTS
        4.8 POWER EXPECTATION AND CHIP LAYOUT
        4.9 CONCLUSION
    CHAPTER 5 CONCLUSION
    BIBLIOGRAPHY
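    The abstract says the receiver finds its optimum sampling point with a DCDL controller but does not describe the search itself. The sketch below shows one generic way such a search is often done, and it is an assumption rather than the thesis's algorithm: sweep every delay code, record whether the received pattern was error-free at that code, and pick the centre of the widest error-free window. The function name and the example sweep data are hypothetical.

```python
from typing import Optional, Sequence


def center_of_widest_pass_window(passed: Sequence[bool]) -> Optional[int]:
    """Return the delay code at the centre of the longest run of passing codes.

    passed[i] is True when delay code i sampled the data without errors.
    Returns None when no code passed. Ties keep the earliest window.
    """
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for code, ok in enumerate(passed):
        if ok:
            if run_len == 0:
                run_start = code  # a new passing window begins here
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0  # window closed by a failing code
    if best_len == 0:
        return None
    # For even-length windows this picks the lower of the two middle codes.
    return best_start + (best_len - 1) // 2


# Hypothetical sweep result: codes 2..6 form the widest error-free window.
sweep = [False, False, True, True, True, True, True, False, False, True, False]
print(center_of_widest_pass_window(sweep))  # 4
```

    Choosing the window centre rather than the first passing code leaves roughly equal timing margin on both sides of the sampling point, which is the usual motivation for this kind of search.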

    Evaluating Techniques for Wireless Interconnected 3D Processor Arrays

    This thesis investigates the viability of a wireless interconnect network for a highly parallel computer. Its main theme is to project the performance of a wireless network used to connect the processors in a parallel machine of this design, and to investigate the new design opportunities that a wireless interconnect network can offer for parallel computing. A simulation environment is designed and implemented to carry out the tests. The results show that if the available radio spectrum is shared effectively between the building blocks of the parallel machine, there is a substantial chance of achieving high processor utilisation. Several factors play a major role in the performance of such a machine, among them the size of the machine, the size of the problem, and the communication and computation capabilities of each element of the machine. The results show that these factors set a limit on the number of nodes engaged in some classes of tasks. They also show promising potential for further expansion and evolution of the idea into new architectural opportunities, which are discussed at the end of this thesis. To build a real machine of this type, architects would need to solve a number of challenging problems, including heat dissipation, electric power delivery, and chip/board design; however, these issues are not part of this thesis and will be tackled in future.