181 research outputs found
Design of a High-Density, Low-Power Transceiver for Next-Generation HBM
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, August 2020. Deog-Kyoon Jeong (정덕균).
This thesis presents design techniques for a high-density, power-efficient transceiver for the next-generation high bandwidth memory (HBM). Unlike other memory interfaces, HBM uses a 3D-stacked package built with through-silicon vias (TSVs) and a silicon interposer. A transceiver for HBM must therefore solve the problems caused by the 3D-stacked package and the TSVs.
First, a data (DQ) receiver for HBM is proposed with a self-tracking loop that tracks the phase skew between DQ and the data strobe (DQS) caused by voltage or thermal drift. The self-tracking loop achieves low power and small area by utilizing an analog-assisted baud-rate phase detector. The proposed pulse-to-charge (PC) phase detector (PD) converts the phase skew to a voltage difference and detects the phase skew from that voltage difference. An offset calibration scheme that can compensate for a mismatch of the PD is also proposed. The proposed calibration scheme operates without any additional sensing circuits by taking advantage of the write training of HBM. Fabricated in 65 nm CMOS, the DQ receiver shows a power efficiency of 370 fJ/b at 4.8 Gb/s and occupies 0.0056 mm². The experimental results show that the DQ receiver operates without any performance degradation under a ±10% supply variation.
In a second prototype IC, a high-density transceiver for HBM with a feed-forward-equalizer (FFE)-combined crosstalk (XT) cancellation scheme is presented. To compensate for the XT, the transmitter pre-distorts the amplitude of the FFE output according to the XT. Since the proposed XT cancellation (XTC) scheme reuses the FFE implemented to equalize the channel loss, the additional circuitry for the XTC is minimized. Thanks to the XTC scheme, the channel pitch can be significantly reduced, allowing for a high channel density. Moreover, the 3D-staggered channel structure removes the ground layer between the vertically adjacent channels, which further reduces the cross-sectional area of the channel per lane. The test chip, which includes 6 data lanes, is fabricated in 65 nm CMOS technology. The 6-mm channels are implemented on chip to emulate the silicon interposer between the HBM and the processor. The operation of the XTC scheme is verified by simultaneously transmitting 4-Gb/s data over 6 consecutive channels with a 0.5-µm pitch, and the XTC scheme reduces the XT-induced jitter by up to 78%. The measurement results show that the transceiver achieves a throughput of 8 Gb/s/µm. The transceiver occupies 0.05 mm² for 6 lanes and consumes 36.6 mW at 6 × 4 Gb/s.
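The headline figures quoted for the two prototypes can be cross-checked with simple unit arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the energy-per-bit figures are total power divided by aggregate data rate and that throughput density is the per-lane data rate divided by the channel pitch.

```python
def power_mw(energy_fj_per_bit: float, rate_gbps: float) -> float:
    """Power in mW implied by an energy-per-bit figure at a given data rate."""
    # fJ/b * Gb/s = 1e-15 J/b * 1e9 b/s = 1e-6 W = 1e-3 mW
    return energy_fj_per_bit * rate_gbps * 1e-3


def energy_pj_per_bit(total_power_mw: float, rate_gbps: float) -> float:
    """Energy per bit in pJ/b implied by a total power and aggregate data rate."""
    # mW / (Gb/s) = 1e-3 W / 1e9 b/s = 1e-12 J/b = 1 pJ/b
    return total_power_mw / rate_gbps


def throughput_density(rate_gbps_per_lane: float, pitch_um: float) -> float:
    """Throughput per unit channel width in Gb/s/um (assumed definition)."""
    return rate_gbps_per_lane / pitch_um


if __name__ == "__main__":
    # First prototype: 370 fJ/b at 4.8 Gb/s
    print(f"DQ receiver power: {power_mw(370, 4.8):.3f} mW")
    # Second prototype: 36.6 mW at 6 lanes x 4 Gb/s
    print(f"Transceiver energy: {energy_pj_per_bit(36.6, 6 * 4):.3f} pJ/b")
    # Second prototype: 4 Gb/s per lane on a 0.5 um pitch
    print(f"Throughput density: {throughput_density(4, 0.5):.1f} Gb/s/um")
```

Under these assumptions the DQ receiver dissipates roughly 1.8 mW, the six-lane transceiver spends roughly 1.5 pJ/b, and the density works out to the quoted 8 Gb/s/µm.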
CHAPTER 1 INTRODUCTION
1.1 MOTIVATION
1.2 THESIS ORGANIZATION
CHAPTER 2 BACKGROUND ON HIGH-BANDWIDTH MEMORY
2.1 OVERVIEW
2.2 TRANSCEIVER ARCHITECTURE
2.3 READ/WRITE OPERATION
2.3.1 READ OPERATION
2.3.2 WRITE OPERATION
CHAPTER 3 BACKGROUNDS ON COUPLED WIRES
3.1 GENERALIZED MODEL
3.2 EFFECT OF CROSSTALK
CHAPTER 4 DQ RECEIVER WITH BAUD-RATE SELF-TRACKING LOOP
4.1 OVERVIEW
4.2 FEATURES OF DQ RECEIVER FOR HBM
4.3 PROPOSED PULSE-TO-CHARGE PHASE DETECTOR
4.3.1 OPERATION OF PULSE-TO-CHARGE PHASE DETECTOR
4.3.2 OFFSET CALIBRATION
4.3.3 OPERATION SEQUENCE
4.4 CIRCUIT IMPLEMENTATION
4.5 MEASUREMENT RESULT
CHAPTER 5 HIGH-DENSITY TRANSCEIVER FOR HBM WITH 3D-STAGGERED CHANNEL AND CROSSTALK CANCELLATION SCHEME
5.1 OVERVIEW
5.2 PROPOSED 3D-STAGGERED CHANNEL
5.2.1 IMPLEMENTATION OF 3D-STAGGERED CHANNEL
5.2.2 CHANNEL CHARACTERISTICS AND MODELING
5.3 PROPOSED FEED-FORWARD-EQUALIZER-COMBINED CROSSTALK CANCELLATION SCHEME
5.4 CIRCUIT IMPLEMENTATION
5.4.1 OVERALL ARCHITECTURE
5.4.2 TRANSMITTER WITH FFE-COMBINED XTC
5.4.3 RECEIVER
5.5 MEASUREMENT RESULT
CHAPTER 6 CONCLUSION
BIBLIOGRAPHY
ABSTRACT (IN KOREAN)
Photonic integration enabling new multiplexing concepts in optical board-to-board and rack-to-rack interconnects
New broadband applications are causing the datacenters to proliferate, raising the bar for higher interconnection speeds. So far, optical board-to-board and rack-to-rack interconnects relied primarily on low-cost commodity optical components assembled in a single package. Although this concept proved successful in the first generations of optical-interconnect modules, scalability is a daunting issue as signaling rates extend beyond 25 Gb/s. In this paper we present our work towards the development of two technology platforms for migration beyond Infiniband enhanced data rate (EDR), introducing new concepts in board-to-board and rack-to-rack interconnects.
The first platform is developed in the framework of the European project MIRAGE and relies on proven VCSEL technology, exploiting the inherent cost, yield, reliability and power consumption advantages of VCSELs. Wavelength multiplexing, PAM-4 modulation and multi-core fiber (MCF) multiplexing are introduced by combining VCSELs with integrated Si and glass photonics as well as BiCMOS electronics. An in-plane MCF-to-SOI interface is demonstrated, allowing coupling from the MCF cores to 340x400 nm Si waveguides. Development of a low-power VCSEL driver with integrated feed-forward equalizer is reported, allowing PAM-4 modulation of a bandwidth-limited VCSEL beyond 25 Gbaud.
The second platform, developed within the framework of the European project PHOXTROT, considers the use of modulation formats of increased complexity in the context of optical interconnects. Powered by the evolution of DSP technology and aiming at an integration path between inter- and intra-datacenter traffic, this platform investigates optical interconnection system concepts capable of supporting 16QAM 40 GBd data traffic, exploiting the advancements of silicon and polymer technologies.
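The symbol rates quoted for the two platforms translate into raw bit rates through the number of bits carried per symbol. The sketch below is a minimal illustration, not taken from the paper: the function name is hypothetical, and it assumes a single carrier and polarization with no coding or framing overhead.

```python
import math


def raw_bit_rate_gbps(symbol_rate_gbaud: float, constellation_points: int) -> float:
    """Raw line rate: symbol rate times bits per symbol (log2 of constellation size)."""
    bits_per_symbol = math.log2(constellation_points)
    return symbol_rate_gbaud * bits_per_symbol


if __name__ == "__main__":
    # PAM-4 has 4 amplitude levels, i.e. 2 bits per symbol
    print(raw_bit_rate_gbps(25, 4))
    # 16QAM has 16 constellation points, i.e. 4 bits per symbol
    print(raw_bit_rate_gbps(40, 16))
```

Under these assumptions, PAM-4 at 25 Gbaud corresponds to 50 Gb/s per lane, and 16QAM at 40 GBd corresponds to 160 Gb/s per carrier.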
Development of Silicon Photonic Multi Chip Module Transceivers
The exponential growth of data generation, driven in part by the proliferation of applications such as high definition streaming, artificial intelligence, and the internet of things, presents an impending bottleneck for electrical interconnects to fulfill data center bandwidth demands. Links now require bandwidths in excess of multiple Tbps while operating on the order of picojoules per bit, in addition to constraints on areal bandwidth densities and pin I/O bandwidth densities. Optical communications built on a silicon photonic platform offers a potential solution to develop power efficient, high bandwidth, low attenuation, small footprint links, all while building off the mature CMOS ecosystem. The development of silicon photonic foundries supporting multi project wafer runs with associated process design kit components supports a path towards widespread commercial production by increasing production volume while reducing fabrication and development costs. While silicon photonics can always be improved in terms of performance and yield, one of the central challenges is the integration of the silicon photonic integrated circuits with the driving electronic integrated circuits and data generating compute nodes such as CPUs, FPGAs, and ASICs. The co-packaging of the photonics with the electronics is crucial for adoption of silicon photonics in datacenters, as improper integration negates all the potential benefits of silicon photonics.
The work in this dissertation is centered around the development of silicon photonic multi chip module transceivers to aid in the deployment of silicon photonics within data centers. Section one focuses on silicon photonic integration and highlights multiple integrated transceiver prototypes. The central prototype features a photonic integrated circuit with bus waveguides with WDM microdisk modulators for the transmitter and WDM demuxes with drop ports to photodiodes for the receiver. The 2.5D integrated prototype utilizes a thinned silicon interposer and TIA electronic integrated circuits. The architecture, integration, characterization, performance, and scalability of the prototype are discussed. The development of this first prototype identified key design considerations necessary for designing multi chip module silicon photonic prototypes, which will be addressed in this section. Finally, other multi chip module silicon photonic prototypes will be overviewed. These include a 2.5D integrated transceiver with a different electronic integrated circuit TIA, a 3D integrated receiver, an active interposer network on chip, and a 2.5D integrated transceiver with custom electronic integrated circuits. Section two focuses on research that supports the development of silicon photonic transceivers. The thermal crosstalk from neighboring microdisk modulators as a function of modulator pitch is investigated. As modulators are placed at denser pitches to accommodate areal bandwidth density requirements in transceivers, this thermal crosstalk will become significant. In this section, designs and results from several iterations of custom microring modulators are reported. Custom microring modulators allow for scaling up the number of channels in microring transceivers by offering the ability to fabricate variable resonances and provide a platform for further innovation in bandwidth, free spectral range, and energy efficiency. 
The designs and results of higher order modulation format modulators, both microring based and Mach Zehnder based, are discussed. High order modulators offer a path towards scaling transceiver total throughput without having to increase the channel counts or component bandwidth. Together, the work in these two sections supports the development of silicon photonic transceivers to aid in the adoption of silicon photonics into data generating systems
Foundry-Enabled Scalable All-to-All Optical Interconnects Using Silicon Nitride Arrayed Waveguide Router Interposers and Silicon Photonic Transceivers
This paper summarizes our latest results of integrated all-to-all optical interconnect systems using compact, low-loss silicon nitride (SiN) arrayed waveguide grating routers (AWGRs) through AIM Photonics' multiple-project-wafer services. In particular, we have designed, taped out, and initially characterized a chip-scale silicon photonic low-latency interconnect optical network switch (Si-LIONS) system with an 8 × 8 cyclic SiN AWGR with 200 GHz channel spacing, 64 microdisk modulators, and 64 on-chip germanium photodetectors (PDs). The designed 8 × 8 SiN AWGR has a measured insertion loss of 1.8 dB and a crosstalk of -13 dB, with a footprint of 1.3 mm × 0.9 mm. We measured error-free performance of the microdisk modulator at 10 Gb/s with a 1 Vpp voltage swing. We demonstrated wavelength routing with error-free data transmission using the on-chip modulator, SiN AWGR, and an external PD. We have designed and taped out the optical interposer version of the all-to-all system using SiN waveguides and low-loss chip-to-interposer couplers. Finally, we illustrate our preliminary designs and results of 16 × 16 and 32 × 32 SiN AWGRs, and discuss the possibility of scaling beyond 1024 × 1024 all-to-all interconnections with a reduced number of wavelengths (e.g., 64) using the Thin-CLOS architecture.
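The all-to-all property of a cyclic AWGR can be illustrated with a common idealized routing model, in which the output port depends only on the input port and the wavelength index. The sketch below is an assumption-laden illustration rather than the paper's design: the function names, the port numbering, and the sign convention (input plus wavelength, modulo N) are all hypothetical choices.

```python
def awgr_output_port(input_port: int, wavelength_index: int, n_ports: int) -> int:
    """Idealized cyclic AWGR: the output port shifts by one for each wavelength step."""
    return (input_port + wavelength_index) % n_ports


def routing_table(n_ports: int) -> list:
    """table[i][w] is the output port reached from input i on wavelength w."""
    return [
        [awgr_output_port(i, w, n_ports) for w in range(n_ports)]
        for i in range(n_ports)
    ]


def is_all_to_all(table: list) -> bool:
    """Check that every input reaches every output, with no same-wavelength collisions."""
    n = len(table)
    every_output_reached = all(sorted(row) == list(range(n)) for row in table)
    no_collisions = all(
        len({table[i][w] for i in range(n)}) == n for w in range(n)
    )
    return every_output_reached and no_collisions


if __name__ == "__main__":
    table = routing_table(8)
    for i, row in enumerate(table):
        print(f"input {i}: outputs by wavelength {row}")
    print("all-to-all without contention:", is_all_to_all(table))
```

In this model an N × N device needs N wavelengths to connect every input to every output simultaneously, which is why scaling to large port counts motivates architectures that reduce the wavelength count.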
Research on High-Speed, High-Sensitivity Optical Receivers Using Silicon Photonics
Tohoku University; Hirohito Yamada (山田 博仁)
Heterogeneous Integration on Silicon-Interconnect Fabric using fine-pitch interconnects (≤10 µm)
Today, the ever-growing data-bandwidth demand is pushing the boundaries of the traditional printed circuit board (PCB) based integration schemes. Moreover, with the apparent saturation of semiconductor scaling, commonly called Moore's law, system scaling warrants a paradigm shift in packaging technologies, assembly techniques, and integration methodologies. In this work, a superior alternative to PCBs called the Silicon-Interconnect Fabric (Si-IF) is investigated. The Si-IF is a silicon-based, package-less, fine-pitch, highly scalable, heterogeneous integration platform for wafer-scale systems. In this technology, unpackaged dielets are assembled on the Si-IF at small inter-dielet spacings (≤100 µm) using fine-pitch (≤10 µm) die-to-substrate interconnects. A novel assembly process using a solder-less direct metal-metal (gold-gold and copper-copper) thermal compression bonding was developed. Using this process, sub-10 µm pitch interconnects with a low specific contact resistance of ≤0.7 Ω-µm² were successfully demonstrated. Because of the tightly packed Si-IF assembly, the communication links between the neighboring dies are short (≤500 µm) with low loss (≤2 dB), comparable to on-chip connections. Consequently, simple buffers can transfer data between dies using a Simple Universal Parallel intERface for chips (SuperCHIPS) at low latency (<30 ps), low energy per bit (≤0.03 pJ/b), and high data-rates (up to 10 Gbps/link), corresponding to an aggregate bandwidth up to 8 Tbps/mm. The benefits of the SuperCHIPS protocol were experimentally demonstrated to provide 5-90X higher data-bandwidth, 8-30X lower latency, and 5-40X lower energy per bit compared to existing integration schemes. This dissertation addresses the assembly technology and communication protocols of the Si-IF technology.
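The quoted link figures imply a link count and an I/O power per millimeter of die edge. The sketch below is a rough bound, not a result from the dissertation: the function names are hypothetical, and it assumes every link runs at the peak 10 Gbps rate and at the upper-bound energy of 0.03 pJ/b.

```python
def links_per_mm(aggregate_tbps_per_mm: float, link_gbps: float) -> float:
    """Number of links per mm of die edge needed to reach the aggregate bandwidth."""
    return aggregate_tbps_per_mm * 1000.0 / link_gbps


def io_power_w_per_mm(energy_pj_per_bit: float, aggregate_tbps_per_mm: float) -> float:
    """I/O power per mm of die edge at full bandwidth."""
    # pJ/b * Tb/s = 1e-12 J/b * 1e12 b/s = 1 W
    return energy_pj_per_bit * aggregate_tbps_per_mm


if __name__ == "__main__":
    print(f"links per mm: {links_per_mm(8, 10):.0f}")
    print(f"I/O power bound: {io_power_w_per_mm(0.03, 8):.2f} W/mm")
```

Under these assumptions, 8 Tbps/mm corresponds to 800 links per millimeter of edge and at most about 0.24 W/mm of I/O power.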
Architecting a One-to-many Traffic-Aware and Secure Millimeter-Wave Wireless Network-in-Package Interconnect for Multichip Systems
With the aggressive scaling of device geometries, the yield of complex Multi Core Single Chip (MCSC) systems with many cores will decrease due to the higher probability of manufacturing defects, especially in dies with a large area. Disintegration of large System-on-Chips (SoCs) into smaller chips called chiplets has been shown to improve the yield and cost of complex systems. Therefore, platform-based computing modules such as embedded systems and micro-servers have already adopted Multi Core Multi Chip (MCMC) architectures over MCSC architectures. Due to the scaling of memory-intensive parallel applications in such systems, data is more likely to be shared among various cores residing in different chips, resulting in a significant increase in chip-to-chip traffic, especially one-to-many traffic. This one-to-many traffic originates mainly from maintaining cache coherence between many cores residing in multiple chips. Besides, one-to-many traffic is also exploited by many parallel programming models, system-level synchronization mechanisms, and control signals. However, state-of-the-art Network-on-Chip (NoC)-based wired interconnection architectures do not provide enough support, as they handle such one-to-many traffic as multiple unicast transmissions over a multi-hop MCMC communication fabric. As a result, even a small portion of such one-to-many traffic can significantly reduce system performance, as traditional NoC-based interconnects cannot mask the high latency and energy consumption caused by chip-to-chip wired I/Os. Moreover, with the increase in memory-intensive applications and the scaling of MCMC systems, traditional NoC-based wired interconnects fail to provide the scalable interconnection solution required to support the increased one-to-many traffic generated by cache coherence and synchronization in future MCMC-based High-Performance Computing (HPC) nodes.
Therefore, these computation and memory intensive MCMC systems need an energy-efficient, low latency, and scalable one-to-many (broadcast/multicast) traffic-aware interconnection infrastructure to ensure high-performance.
Research in recent years has shown that Wireless Network-in-Package (WiNiP) architectures with CMOS compatible Millimeter-Wave (mm-wave) transceivers can provide a scalable, low latency, and energy-efficient interconnect solution for on and off-chip communication. In this dissertation, a one-to-many traffic-aware WiNiP interconnection architecture with a starvation-free hybrid Medium Access Control (MAC), an asymmetric topology, and a novel flow control has been proposed. The different components of the proposed architecture are individually one-to-many traffic-aware and as a system, they collaborate with each other to provide required support for one-to-many traffic communication in a MCMC environment. It has been shown that such interconnection architecture can reduce energy consumption and average packet latency by 46.96% and 47.08% respectively for MCMC systems.
Despite providing performance enhancements, the wireless channel, being an unguided medium, is vulnerable to various security attacks such as jamming-induced Denial-of-Service (DoS), eavesdropping, and spoofing. Further, to minimize time-to-market and design costs, modern SoCs often use Third Party IPs (3PIPs) from untrusted organizations. An adversary either at the foundry or at the 3PIP design house can introduce malicious circuitry to jeopardize an SoC. Such malicious circuitry is known as a Hardware Trojan (HT). An HT planted in the WiNiP through a vulnerable design or manufacturing process can compromise a Wireless Interface (WI) to enable illegitimate transmission through the infected WI, resulting in a potential DoS attack on other WIs in the MCMC system. Moreover, HTs can be used for various other malicious purposes, including battery exhaustion, functionality subversion, and information leakage. Such information, when leaked to a malicious external attacker, can reveal important details about the application suites running on the system, thereby compromising the user profile. To address persistent jamming-based DoS attacks in WiNiP, this dissertation proposes a secure WiNiP interconnection architecture for MCMC systems that reuses the one-to-many traffic-aware MAC and existing Design for Testability (DFT) hardware along with a Machine Learning (ML) approach. Furthermore, a novel Simulated Annealing (SA)-based routing obfuscation mechanism is proposed to protect against a novel HT-assisted traffic analysis attack. Simulation results show that the ML classifiers can achieve an accuracy of 99.87% for DoS attack detection, while SA-based routing obfuscation can reduce application detection accuracy to only 15% for the HT-assisted traffic analysis attack and hence secure the WiNiP fabric from both age-old and emerging attacks.
PIXAPP Photonics Packaging Pilot Line development of a silicon photonic optical transceiver with pluggable fiber connectivity
This paper demonstrates how the PIXAPP Photonics Packaging Pilot Line uses its extensive packaging capabilities across its European partner network to design and assemble a highly integrated silicon photonic-based optical transceiver. The processes used are based on PIXAPP's open access packaging design rules or Assembly Design Kit (ADK). The transceiver was designed to have the Tx and Rx elements integrated on to a single silicon photonic chip, together with flipchip control electronics, hybrid laser and micro-optics. The transceiver used the on-chip micro-optics to enable a pluggable fiber connection, avoiding the need to bond optical fibers directly to the photonic chip. Finally, the packaged transceiver module was tested, showing 56 Gb/s loop-back modulation and de-modulation, validating both the transmitter and receiver performance
A Scalable & Energy Efficient Graphene-Based Interconnection Framework for Intra and Inter-Chip Wireless Communication in Terahertz Band
Network-on-Chips (NoCs) have emerged as a communication infrastructure for the multi-core System-on-Chips (SoCs). Despite its advantages, due to the multi-hop communication over the metal interconnects, traditional Mesh based NoC architectures are not scalable in terms of performance and energy consumption. Folded architectures such as Torus and Folded Torus were proposed to improve the performance of NoCs while retaining the regular tile-based structure for ease of manufacturing. Ultra-low-latency and low-power express channels between communicating cores have also been proposed to improve the performance of conventional NoCs. However, the performance gain of these approaches is limited due to metal/dielectric based interconnection.
Many emerging interconnect technologies such as 3D integration, photonic, Radio Frequency (RF), and wireless interconnects have been envisioned to alleviate the issues of a metal/dielectric interconnect system. However, photonic and RF interconnects need additional physically overlaid optical waveguides or micro-strip transmission lines to enable data transmission across the NoC. Several on-chip antennas have been shown to improve the energy efficiency and bandwidth of on-chip data communications. However, the data rates of the mm-wave wireless channels are limited by the state-of-the-art power-efficient transceiver design. Recent research has brought to light novel graphene-based antennas operating at THz frequencies. Due to the higher operating frequencies compared to mm-wave transceivers, the data rate that can be supported by these antennas is significantly higher. Higher operating frequencies imply that graphene-based antennas are on the order of a hundred micrometers in size, compared to dimensions in the range of a millimeter for mm-wave antennas. Such reduced dimensions are suitable for integrating several such transceivers in a single NoC with relatively low overhead.
In this work, to exploit the benefits of a regular NoC structure in conjunction with emerging graphene-based wireless interconnects, we propose a toroidal-folding-based NoC architecture. The novelty of this folding-based approach is that it uses low-power, high-bandwidth, single-hop, direct point-to-point wireless links instead of multi-hop communication through metallic wires. We also propose a novel phase-based communication protocol through which multiple wireless links can be made active at a time without any interference among the transceivers. This offers a large performance gain over a token-based mechanism, in which only a single wireless link can be active at a time. We also propose to extend graphene-based wireless links to enable energy-efficient, phase-based chip-to-chip communication, creating a seamless wireless interconnection fabric for multichip systems as well. Through cycle-accurate system-level simulations, we evaluate such designs, with torus-like folding based on THz links instead of global wires, along with the proposed phase-based multichip systems. We estimate that they provide significant gains in performance and energy efficiency for data transfer in a NoC as well as in a multichip system (about 3 to 4 times better in terms of achievable bandwidth, packet latency, and average packet energy compared to a wired system). Thus, the realization of this kind of interconnection framework, which could support high-data-rate links in the terabits-per-second range, will alleviate the capacity limitations of current interconnection frameworks.
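The topological benefit of torus-style wraparound links over a plain mesh can be seen by comparing minimal hop counts. The sketch below is a generic textbook comparison, not the simulator or protocol used in this work: the function names are hypothetical, and it assumes a k × k grid with minimal routing, treating each wraparound link as a single hop regardless of whether it is wired or wireless.

```python
from itertools import product


def mesh_hops(src, dst):
    """Minimal hop count in a 2D mesh (Manhattan distance)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def torus_hops(src, dst, k):
    """Minimal hop count in a k x k torus, where each dimension wraps around."""
    dx = abs(src[0] - dst[0])
    dy = abs(src[1] - dst[1])
    return min(dx, k - dx) + min(dy, k - dy)


def average_hops(k, hop_fn):
    """Average hop count over all ordered pairs of distinct nodes."""
    nodes = list(product(range(k), repeat=2))
    pairs = [(s, d) for s in nodes for d in nodes if s != d]
    return sum(hop_fn(s, d) for s, d in pairs) / len(pairs)


if __name__ == "__main__":
    k = 8
    mesh_avg = average_hops(k, mesh_hops)
    torus_avg = average_hops(k, lambda s, d: torus_hops(s, d, k))
    print(f"{k}x{k} mesh average hops:  {mesh_avg:.3f}")
    print(f"{k}x{k} torus average hops: {torus_avg:.3f}")
```

Corner-to-corner traffic shows the effect most clearly: in a 4 × 4 grid the mesh needs 6 hops between opposite corners, while the torus needs 2.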
Laser written glass interposer for fiber coupling to silicon photonic integrated circuits
Recent advancements in photonic-electronic integration push towards denser multichannel fiber to silicon photonic chip coupling solutions. However, current packaging schemes based on suitably polished fiber arrays do not provide sufficient scalability. Alternatively, lithographically-patterned fused silica glass interposers have been proposed, allowing for the integration of fanout waveguides between a dense array of on-chip silicon waveguides and a cleaved fiber ribbon. In this paper, we propose the use of femtosecond laser inscription for the fabrication of the fused silica glass interposer, allowing for a monolithic integration of waveguides and V-grooves for fiber alignment. The waveguides obtained by Femtosecond Laser Direct Writing (FLDW) have a propagation loss of 0.88 dB/cm at 1550 nm. The mode-field diameter is 12.8 ± 0.4 µm, allowing for a coupling loss of 1.24 ± 0.32 dB when coupling to a standard single mode optical fiber, passively aligned to the fused silica waveguide by insertion in a V-groove created by Femtosecond Laser Irradiation followed by Chemical Etching (FLICE). The average surface roughness of the etched waveguide facet is 160 ± 5 nm. Scattering loss when coupling to fiber is reduced by use of an index-matching adhesive for fiber fixation. A polished out-of-plane coupling mirror at an angle of 41.5° injects the light into standard grating couplers, providing a quasi-planar fiber-to-chip package. The excess loss of the proposed solution is limited to 2 dB per interface, including mirror, waveguide and fiber coupling losses.
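The loss figures above combine additively in decibels, which makes a quick budget check straightforward. The sketch below is illustrative only: the function names are hypothetical, and the 0.5 cm waveguide length is an assumed value chosen for the example, since the abstract does not state the interposer waveguide length.

```python
def db_to_fraction(loss_db: float) -> float:
    """Fraction of optical power transmitted through a given loss in dB."""
    return 10 ** (-loss_db / 10)


def interface_loss_db(coupling_db: float, prop_db_per_cm: float,
                      length_cm: float, other_db: float = 0.0) -> float:
    """Total loss of one interface: fiber coupling + propagation + any other terms."""
    return coupling_db + prop_db_per_cm * length_cm + other_db


if __name__ == "__main__":
    assumed_length_cm = 0.5  # hypothetical waveguide length, not from the paper
    partial = interface_loss_db(1.24, 0.88, assumed_length_cm)
    print(f"coupling + propagation: {partial:.2f} dB")
    print(f"left for mirror within 2 dB: {2.0 - partial:.2f} dB")
    print(f"power transmitted at 2 dB: {db_to_fraction(2.0):.1%}")
```

A 2 dB interface loss corresponds to roughly 63% of the optical power being transmitted; with the assumed length, coupling and propagation together account for about 1.68 dB of that budget.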