181 research outputs found

    Design of a High-Density, Low-Power Transceiver for Next-Generation HBM

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, August 2020. Deog-Kyoon Jeong (정덕균).

    This thesis presents design techniques for a high-density, power-efficient transceiver for next-generation high bandwidth memory (HBM). Unlike other memory interfaces, HBM uses a 3D-stacked package built on through-silicon vias (TSVs) and a silicon interposer, and the transceiver must solve the problems this packaging causes. First, a data (DQ) receiver for HBM is proposed with a self-tracking loop that tracks the phase skew between DQ and the data strobe (DQS) caused by voltage or thermal drift. The self-tracking loop achieves low power and small area by using an analog-assisted baud-rate phase detector. The proposed pulse-to-charge (PC) phase detector (PD) converts the phase skew to a voltage difference and detects the skew from that difference. An offset calibration scheme that compensates for mismatch in the PD is also proposed; it needs no additional sensing circuits because it takes advantage of the write training of HBM. Fabricated in 65 nm CMOS, the DQ receiver shows a power efficiency of 370 fJ/b at 4.8 Gb/s and occupies 0.0056 mm². Experimental results show that the DQ receiver operates without performance degradation under a ±10% supply variation. In a second prototype IC, a high-density transceiver for HBM with a feed-forward-equalizer (FFE)-combined crosstalk (XT) cancellation scheme is presented. To compensate for the XT, the transmitter pre-distorts the amplitude of the FFE output according to the XT. Since the proposed XT cancellation (XTC) scheme reuses the FFE implemented to equalize the channel loss, additional circuits for the XTC are minimized. Thanks to the XTC scheme, the channel pitch can be significantly reduced, allowing high channel density. Moreover, the 3D-staggered channel structure removes the ground layer between vertically adjacent channels, which further reduces the cross-sectional area of the channel per lane. The test chip, which includes 6 data lanes, is fabricated in 65 nm CMOS technology. The 6-mm channels are implemented on chip to emulate the silicon interposer between the HBM and the processor. The XTC scheme is verified by simultaneously transmitting 4-Gb/s data over 6 consecutive channels with 0.5-µm pitch, where it reduces the XT-induced jitter by up to 78%. Measurements show that the transceiver achieves a throughput of 8 Gb/s/µm. The transceiver occupies 0.05 mm² for 6 lanes and consumes 36.6 mW at 6 x 4 Gb/s.

    Abstract (translated from Korean): This thesis proposes design methods for a high-density, low-power transceiver for next-generation HBM. First, it proposes a data receiver with a self-tracking loop that compensates for the phase difference between data and clock caused by voltage and temperature variation. The self-tracking loop reduces power consumption and area by using a phase detector that operates at the same rate as the data. A method is also proposed that compensates for the phase detector offset by using the memory's write training process. Fabricated in a 65 nm process, the data receiver consumes 370 fJ/b at 4.8 Gb/s and was confirmed to operate stably under a 10% supply voltage variation. Second, it proposes a high-density transceiver that uses a crosstalk compensation scheme combined with a feed-forward equalizer. The transmitter compensates for crosstalk by distorting its output by an amount corresponding to the crosstalk magnitude. The scheme minimizes additional circuitry by reusing the feed-forward equalizer implemented to compensate for channel loss. Because crosstalk can be compensated, the channel spacing is greatly reduced, achieving high-density communication. To increase density further, a stacked channel structure is proposed that removes the shielding layer between vertically adjacent channels. A prototype chip containing six transceivers was fabricated in a 65 nm process. Channels 6 mm long were implemented on chip to emulate the silicon interposer channel between the HBM and the processor. The crosstalk compensation scheme was verified by transmitting data simultaneously over six adjacent channels at 0.5 µm spacing and reduced crosstalk-induced jitter by up to 78%. The transceiver achieves a throughput of 8 Gb/s/µm, and the six transceivers consume 36.6 mW in total.

    Contents:
    Chapter 1. Introduction: 1.1 Motivation; 1.2 Thesis Organization
    Chapter 2. Background on High-Bandwidth Memory: 2.1 Overview; 2.2 Transceiver Architecture; 2.3 Read/Write Operation (2.3.1 Read Operation; 2.3.2 Write Operation)
    Chapter 3. Backgrounds on Coupled Wires: 3.1 Generalized Model; 3.2 Effect of Crosstalk
    Chapter 4. DQ Receiver with Baud-Rate Self-Tracking Loop: 4.1 Overview; 4.2 Features of DQ Receiver for HBM; 4.3 Proposed Pulse-to-Charge Phase Detector (4.3.1 Operation of Pulse-to-Charge Phase Detector; 4.3.2 Offset Calibration; 4.3.3 Operation Sequence); 4.4 Circuit Implementation; 4.5 Measurement Result
    Chapter 5. High-Density Transceiver for HBM with 3D-Staggered Channel and Crosstalk Cancellation Scheme: 5.1 Overview; 5.2 Proposed 3D-Staggered Channel (5.2.1 Implementation of 3D-Staggered Channel; 5.2.2 Channel Characteristics and Modeling); 5.3 Proposed Feed-Forward-Equalizer-Combined Crosstalk Cancellation Scheme; 5.4 Circuit Implementation (5.4.1 Overall Architecture; 5.4.2 Transmitter with FFE-Combined XTC; 5.4.3 Receiver); 5.5 Measurement Result
    Chapter 6. Conclusion
    Bibliography
    Abstract (in Korean)

    Photonic integration enabling new multiplexing concepts in optical board-to-board and rack-to-rack interconnects

    New broadband applications are causing datacenters to proliferate, raising the bar for interconnection speeds. So far, optical board-to-board and rack-to-rack interconnects have relied primarily on low-cost commodity optical components assembled in a single package. Although this concept proved successful in the first generations of optical-interconnect modules, scalability is a daunting issue as signaling rates extend beyond 25 Gb/s. In this paper, we present our work towards the development of two technology platforms for migration beyond InfiniBand enhanced data rate (EDR), introducing new concepts in board-to-board and rack-to-rack interconnects. The first platform is developed in the framework of the MIRAGE European project and relies on proven VCSEL technology, exploiting the inherent cost, yield, reliability and power consumption advantages of VCSELs. Wavelength multiplexing, PAM-4 modulation and multi-core fiber (MCF) multiplexing are introduced by combining VCSELs with integrated Si and glass photonics as well as BiCMOS electronics. An in-plane MCF-to-SOI interface is demonstrated, allowing coupling from the MCF cores to 340 × 400 nm Si waveguides. Development of a low-power VCSEL driver with an integrated feed-forward equalizer is reported, allowing PAM-4 modulation of a bandwidth-limited VCSEL beyond 25 Gbaud. The second platform, developed within the framework of the European project PHOXTROT, considers the use of modulation formats of increased complexity in the context of optical interconnects. Powered by the evolution of DSP technology and working towards an integration path between inter- and intra-datacenter traffic, this platform investigates optical interconnection system concepts capable of supporting 16QAM 40 GBd data traffic, exploiting the advancements of silicon and polymer technologies.
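
The line rates implied by the modulation formats named in this abstract follow from the number of bits per symbol. A minimal sketch, assuming raw (uncoded) rates on a single lane and a single polarization; the function name is illustrative and does not come from the paper:

```python
import math


def raw_bit_rate_gbps(symbol_rate_gbaud: float, constellation_size: int) -> float:
    """Raw bit rate = symbol rate * log2(number of constellation points)."""
    return symbol_rate_gbaud * math.log2(constellation_size)


pam4_at_25_gbaud = raw_bit_rate_gbps(25.0, 4)    # 2 bits/symbol -> 50.0 Gb/s
qam16_at_40_gbd = raw_bit_rate_gbps(40.0, 16)    # 4 bits/symbol -> 160.0 Gb/s
```

Since the abstract reports PAM-4 operation beyond 25 Gbaud, 50 Gb/s is the starting point rather than the limit for the VCSEL platform; FEC and line-coding overhead would lower the usable rate in both cases.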

    Foundry-Enabled Scalable All-to-All Optical Interconnects Using Silicon Nitride Arrayed Waveguide Router Interposers and Silicon Photonic Transceivers

    This paper summarizes our latest results on integrated all-to-all optical interconnect systems using compact, low-loss silicon nitride (SiN) arrayed waveguide grating routers (AWGRs) fabricated through AIM Photonics' multiple-project-wafer services. In particular, we have designed, taped out, and initially characterized a chip-scale silicon photonic low-latency interconnect optical network switch (Si-LIONS) system with an 8 × 8 200 GHz spacing cyclic SiN AWGR, 64 microdisk modulators, and 64 on-chip germanium photodetectors (PDs). The designed 8 × 8 SiN AWGR has a measured insertion loss of 1.8 dB and a crosstalk of -13 dB, with a footprint of 1.3 mm × 0.9 mm. We measured error-free performance of the microdisk modulator at 10 Gb/s with a 1 Vpp voltage swing. We demonstrated wavelength routing with error-free data transmission using the on-chip modulator, SiN AWGR, and an external PD. We have designed and taped out the optical interposer version of the all-to-all system using SiN waveguides and low-loss chip-to-interposer couplers. Finally, we illustrate our preliminary designs and results for 16 × 16 and 32 × 32 SiN AWGRs, and discuss the possibility of scaling beyond 1024 × 1024 all-to-all interconnections with a reduced number of wavelengths (e.g., 64) using the Thin-CLOS architecture.
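
The all-to-all property of a cyclic AWGR can be illustrated with a small routing table. The sketch below assumes the common textbook convention that input port i on wavelength index w exits at output port (i + w) mod N; the actual port-to-wavelength mapping of the fabricated device is not given in the abstract, and the function names are hypothetical.

```python
from typing import List


def awgr_output_port(input_port: int, wavelength_index: int, n_ports: int) -> int:
    """Output port under the assumed cyclic convention (i + w) mod N."""
    return (input_port + wavelength_index) % n_ports


def routing_table(n_ports: int) -> List[List[int]]:
    """table[i][w] is the output port reached from input i on wavelength w."""
    return [
        [awgr_output_port(i, w, n_ports) for w in range(n_ports)]
        for i in range(n_ports)
    ]


table = routing_table(8)

# Each input reaches every output exactly once across the 8 wavelengths.
all_to_all = all(sorted(row) == list(range(8)) for row in table)

# For any single wavelength, no two inputs land on the same output,
# so all 8 * 8 = 64 input/output paths can be active without contention.
contention_free = all(
    len({table[i][w] for i in range(8)}) == 8 for w in range(8)
)
```

With 8 ports this gives 64 simultaneous paths, which is consistent with the 64 modulators and 64 PDs mentioned above if each of the 8 nodes carries 8 of each (an assumption, not a statement from the paper).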

    Architecting a One-to-many Traffic-Aware and Secure Millimeter-Wave Wireless Network-in-Package Interconnect for Multichip Systems

    With the aggressive scaling of device geometries, the yield of complex Multi Core Single Chip (MCSC) systems with many cores will decrease due to the higher probability of manufacturing defects, especially in dies with a large area. Disintegration of large System-on-Chips (SoCs) into smaller chips called chiplets has been shown to improve the yield and cost of complex systems. Therefore, platform-based computing modules such as embedded systems and micro-servers have already adopted Multi Core Multi Chip (MCMC) architectures over MCSC architectures. Due to the scaling of memory-intensive parallel applications in such systems, data is more likely to be shared among cores residing in different chips, resulting in a significant increase in chip-to-chip traffic, especially one-to-many traffic. This one-to-many traffic originates mainly from maintaining cache coherence between many cores residing in multiple chips. One-to-many traffic is also exploited by many parallel programming models, system-level synchronization mechanisms, and control signals. However, state-of-the-art Network-on-Chip (NoC)-based wired interconnection architectures do not provide enough support, as they handle such one-to-many traffic as multiple unicast traffic over a multi-hop MCMC communication fabric. As a result, even a small portion of such one-to-many traffic can significantly reduce system performance, as a traditional NoC-based interconnect cannot mask the high latency and energy consumption caused by chip-to-chip wired I/Os. Moreover, with the increase in memory-intensive applications and the scaling of MCMC systems, traditional NoC-based wired interconnects fail to provide the scalable interconnection solution required to support the increased one-to-many traffic generated by cache coherence and synchronization in future MCMC-based High-Performance Computing (HPC) nodes.
Therefore, these computation- and memory-intensive MCMC systems need an energy-efficient, low-latency, and scalable one-to-many (broadcast/multicast) traffic-aware interconnection infrastructure to ensure high performance. Research in recent years has shown that Wireless Network-in-Package (WiNiP) architectures with CMOS-compatible Millimeter-Wave (mm-wave) transceivers can provide a scalable, low-latency, and energy-efficient interconnect solution for on- and off-chip communication. In this dissertation, a one-to-many traffic-aware WiNiP interconnection architecture with a starvation-free hybrid Medium Access Control (MAC), an asymmetric topology, and a novel flow control is proposed. The different components of the proposed architecture are individually one-to-many traffic-aware and, as a system, collaborate to provide the required support for one-to-many traffic communication in an MCMC environment. It is shown that such an interconnection architecture can reduce energy consumption and average packet latency by 46.96% and 47.08%, respectively, for MCMC systems. Despite providing performance enhancements, the wireless channel, being an unguided medium, is vulnerable to various security attacks such as jamming-induced Denial-of-Service (DoS), eavesdropping, and spoofing. Further, to minimize time-to-market and design costs, modern SoCs often use Third Party IPs (3PIPs) from untrusted organizations. An adversary either at the foundry or at the 3PIP design house can introduce malicious circuitry to jeopardize an SoC. Such malicious circuitry is known as a Hardware Trojan (HT). An HT planted in the WiNiP through a vulnerable design or manufacturing process can compromise a Wireless Interface (WI) to enable illegitimate transmission through the infected WI, resulting in a potential DoS attack on other WIs in the MCMC system.
Moreover, HTs can be used for various other malicious purposes, including battery exhaustion, functionality subversion, and information leakage. This information, when leaked to a malicious external attacker, can reveal important information regarding the application suites running on the system, thereby compromising the user profile. To address persistent jamming-based DoS attacks in WiNiP, this dissertation proposes a secure WiNiP interconnection architecture for MCMC systems that reuses the one-to-many traffic-aware MAC and existing Design for Testability (DFT) hardware along with a Machine Learning (ML) approach. Furthermore, a novel Simulated Annealing (SA)-based routing obfuscation mechanism is proposed to protect against a novel HT-assisted traffic analysis attack. Simulation results show that the ML classifiers can achieve an accuracy of 99.87% for DoS attack detection, while SA-based routing obfuscation can reduce application detection accuracy to only 15% for the HT-assisted traffic analysis attack, and hence secure the WiNiP fabric from age-old and emerging attacks.
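
The cost of handling one-to-many traffic as multiple unicasts can be illustrated with a toy count of link traversals. This is a sketch under stated assumptions, not the dissertation's simulator: nodes sit on a hypothetical 2D grid, wired packets follow shortest Manhattan paths, and a shared wireless channel delivers a message to all receivers in one transmission. It does not attempt to reproduce the reported 46.96% and 47.08% figures.

```python
from typing import List, Tuple

Coord = Tuple[int, int]  # (row, column) of a node on a hypothetical grid


def unicast_link_traversals(src: Coord, dests: List[Coord]) -> int:
    """Wired link traversals when a one-to-many message is replicated
    as one shortest-path (Manhattan) unicast per destination."""
    return sum(abs(src[0] - d[0]) + abs(src[1] - d[1]) for d in dests)


def wireless_broadcast_transmissions(dests: List[Coord]) -> int:
    """Transmissions on a shared wireless medium: one send reaches
    every listening receiver, regardless of how many there are."""
    return 1 if dests else 0


source = (0, 0)
sharers = [(0, 1), (1, 0), (1, 1)]  # e.g. other nodes holding a cache line

wired_cost = unicast_link_traversals(source, sharers)       # 1 + 1 + 2 = 4
wireless_cost = wireless_broadcast_transmissions(sharers)   # 1
```

The wired count grows with both the number of destinations and their distance, while the broadcast count stays constant, which is the intuition behind a one-to-many traffic-aware wireless fabric.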

    PIXAPP Photonics Packaging Pilot Line development of a silicon photonic optical transceiver with pluggable fiber connectivity

    This paper demonstrates how the PIXAPP Photonics Packaging Pilot Line uses its extensive packaging capabilities across its European partner network to design and assemble a highly integrated silicon photonic-based optical transceiver. The processes used are based on PIXAPP's open-access packaging design rules, or Assembly Design Kit (ADK). The transceiver was designed to have the Tx and Rx elements integrated onto a single silicon photonic chip, together with flip-chip control electronics, a hybrid laser and micro-optics. The transceiver used the on-chip micro-optics to enable a pluggable fiber connection, avoiding the need to bond optical fibers directly to the photonic chip. Finally, the packaged transceiver module was tested, showing 56 Gb/s loop-back modulation and demodulation, validating both the transmitter and receiver performance.

    A Scalable & Energy Efficient Graphene-Based Interconnection Framework for Intra and Inter-Chip Wireless Communication in Terahertz Band

    Network-on-Chips (NoCs) have emerged as a communication infrastructure for multi-core System-on-Chips (SoCs). Despite their advantages, traditional mesh-based NoC architectures are not scalable in terms of performance and energy consumption because of multi-hop communication over metal interconnects. Folded architectures such as Torus and Folded Torus were proposed to improve the performance of NoCs while retaining the regular tile-based structure for ease of manufacturing. Ultra-low-latency and low-power express channels between communicating cores have also been proposed to improve the performance of conventional NoCs. However, the performance gain of these approaches is limited by the metal/dielectric-based interconnection. Many emerging interconnect technologies such as 3D integration, photonic, Radio Frequency (RF), and wireless interconnects have been envisioned to alleviate the issues of a metal/dielectric interconnect system. However, photonic and RF interconnects need additional physically overlaid optical waveguides or micro-strip transmission lines to enable data transmission across the NoC. Several on-chip antennas have been shown to improve the energy efficiency and bandwidth of on-chip data communications. However, the data rates of mm-wave wireless channels are limited by state-of-the-art power-efficient transceiver design. Recent research has brought to light novel graphene-based antennas operating at THz frequencies. Because their operating frequencies are higher than those of mm-wave transceivers, the data rate that these antennas can support is significantly higher. Higher operating frequencies also imply that graphene-based antennas are just hundreds of micrometers in size, compared to dimensions in the range of a millimeter for mm-wave antennas. Such reduced dimensions are suitable for integrating several such transceivers in a single NoC at relatively low overhead.
In this work, to exploit the benefits of a regular NoC structure in conjunction with emerging graphene-based wireless interconnects, we propose a toroidal-folding-based NoC architecture. The novelty of this folding-based approach is that it uses low-power, high-bandwidth, single-hop direct point-to-point wireless links instead of multi-hop communication over metallic wires. We also propose a novel phase-based communication protocol through which multiple wireless links can be active at a time without interference among the transceivers. This offers a large performance gain compared to a token-based mechanism, where only a single wireless link can be active at a time. We also propose to extend graphene-based wireless links to enable energy-efficient, phase-based chip-to-chip communication, creating a seamless wireless interconnection fabric for multichip systems as well. Through cycle-accurate system-level simulations, we evaluate such designs, with torus-like folding based on THz links instead of global wires, along with the proposed phase-based multichip systems. We estimate that they provide significant gains in performance and energy efficiency of data transfer in a NoC as well as in a multichip system (about 3 to 4 times better in terms of achievable bandwidth, packet latency and average packet energy when compared to a wired system). Thus, realizing this kind of interconnection framework, which could support high-data-rate links at terabits per second, would alleviate the capacity limitations of current interconnection frameworks.
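
The benefit of torus-style wraparound links over a plain mesh can be seen in the average hop count between nodes. The sketch below is a toy model with hypothetical function names: it counts hops on a k × k grid with and without wraparound, averaged over all ordered node pairs (including a node paired with itself). It does not model the THz wireless links or the phase-based protocol, and it does not reproduce the 3 to 4 times gains quoted above.

```python
from itertools import product
from typing import Callable


def mesh_distance(a: int, b: int, k: int) -> int:
    """Hops along one dimension of a mesh (no wraparound)."""
    return abs(a - b)


def ring_distance(a: int, b: int, k: int) -> int:
    """Hops along one dimension of a torus (wraparound allowed)."""
    d = abs(a - b)
    return min(d, k - d)


def average_hops(k: int, dist: Callable[[int, int, int], int]) -> float:
    """Average hop count over all ordered node pairs of a k x k network."""
    nodes = list(product(range(k), repeat=2))
    total = sum(
        dist(ax, bx, k) + dist(ay, by, k)
        for (ax, ay) in nodes
        for (bx, by) in nodes
    )
    return total / (len(nodes) ** 2)


mesh_avg = average_hops(4, mesh_distance)    # 2.5 hops for a 4 x 4 mesh
torus_avg = average_hops(4, ring_distance)   # 2.0 hops for a 4 x 4 torus
```

The gap widens as k grows, which is why the wraparound links are attractive candidates for replacement by single-hop wireless links.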

    Laser written glass interposer for fiber coupling to silicon photonic integrated circuits

    Recent advancements in photonic-electronic integration push towards denser multichannel fiber-to-silicon-photonic-chip coupling solutions. However, current packaging schemes based on suitably polished fiber arrays do not provide sufficient scalability. Alternatively, lithographically patterned fused silica glass interposers have been proposed, allowing for the integration of fanout waveguides between a dense array of on-chip silicon waveguides and a cleaved fiber ribbon. In this paper, we propose the use of femtosecond laser inscription for the fabrication of the fused silica glass interposer, allowing for a monolithic integration of waveguides and V-grooves for fiber alignment. The waveguides obtained by Femtosecond Laser Direct Writing (FLDW) have a propagation loss of 0.88 dB/cm at 1550 nm. The mode-field diameter is 12.8 ± 0.4 µm, allowing for a coupling loss of 1.24 ± 0.32 dB when coupling to a standard single-mode optical fiber, passively aligned to the fused silica waveguide by insertion in a V-groove created by Femtosecond Laser Irradiation followed by Chemical Etching (FLICE). The average surface roughness of the etched waveguide facet is 160 ± 5 nm. Scattering loss when coupling to fiber is reduced by use of an index-matching adhesive for fiber fixation. A polished out-of-plane coupling mirror at an angle of 41.5° injects the light into standard grating couplers, providing a quasi-planar fiber-to-chip package. The excess loss of the proposed solution is limited to 2 dB per interface, including mirror, waveguide and fiber coupling losses.
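
The loss figures in this abstract are easier to compare once converted between decibels and transmitted power. A minimal sketch with hypothetical function names; the 0.5 cm waveguide length is an assumed example value (the abstract does not give the length), and the mirror loss is left out because it is not quoted separately.

```python
def db_to_fraction(loss_db: float) -> float:
    """Fraction of optical power transmitted for a given loss in dB."""
    return 10 ** (-loss_db / 10)


def waveguide_plus_fiber_loss_db(
    length_cm: float,
    propagation_db_per_cm: float = 0.88,  # reported at 1550 nm
    fiber_coupling_db: float = 1.24,      # reported nominal coupling loss
) -> float:
    """Propagation loss over the waveguide plus fiber coupling loss (dB)."""
    return propagation_db_per_cm * length_cm + fiber_coupling_db


budget_fraction = db_to_fraction(2.0)                # about 0.631 of the power
example_loss_db = waveguide_plus_fiber_loss_db(0.5)  # 0.44 + 1.24 = 1.68 dB
```

A 2 dB interface budget therefore corresponds to keeping about 63% of the optical power.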