An optical I/O core based on silicon photonics technology and optical/electrical assembly was developed as a fingertip-size optical module with high bandwidth density, low power consumption, and high-temperature operation. The advantages of the optical I/O core, including hybrid integration of a quantum dot laser diode and an optical pin, allow us to achieve 300-m transmission at 25 Gbps per channel without clock data recovery when optical I/O cores are mounted around a field-programmable gate array (FPGA).
key words: silicon photonics, bandwidth density, optical/electrical assembly, FPGA
Introduction
Big data analysis and artificial intelligence (AI) have grown increasingly important in the modern information and communication technology (ICT) society. Large quantities of data are collected by data centers (DCs) and analyzed not only in these centers but also in high-performance computers (HPCs). As the volume of such data grows, so does the computational complexity, and progress in AI therefore requires large computing power. However, although the processing speed of CPUs/GPUs has almost doubled over the last two years [1], current processing speeds are not enough to cope with the increase of data. To compensate, field-programmable gate arrays (FPGAs) have come into use in DCs and HPCs. For example, FPGAs are used for fixed-function hardware acceleration in high-throughput data processing, for software acceleration such as offloading portions of a software application running on the CPU to FPGAs, and as bridges and switches to connect different [...] footprint and high bandwidth (high bandwidth density), low power consumption, and high-temperature operation is required for mounting around an FPGA. In this work, we propose a fingertip-size optical module, which we call the "optical I/O core", to satisfy these requirements. We applied optical I/O cores to FPGA boards and demonstrated 300-m transmission at 25 Gbps per channel.
Design and Structure of Optical I/O Core
To cope with the increase in I/O capacity due to large-scale integration (LSI), optical transceivers must offer high bandwidth density and low power consumption in a high-temperature environment. The optical transceiver we developed, which we call the "optical I/O core", is shown in Fig. 1 [3]. The optical I/O core has a small footprint of 5 × 5 mm² and a maximum capacity of 300 Gbps (25 Gbps × 12 channels). As shown in the cross-sectional view in Fig. 1, the optical I/O core combines technologies based on both silicon photonics integration and optical/electrical assembly.
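The footprint and capacity figures above translate directly into a bandwidth density. The following is a minimal sketch of that arithmetic; the function name is hypothetical and only the numbers quoted in the text (12 channels, 25 Gbps, 5 × 5 mm²) are used.

```python
def bandwidth_density_gbps_per_mm2(channels: int, rate_gbps: float,
                                   width_mm: float, height_mm: float) -> float:
    """Aggregate capacity divided by module footprint (hypothetical helper)."""
    capacity_gbps = channels * rate_gbps      # 12 x 25 Gbps = 300 Gbps
    footprint_mm2 = width_mm * height_mm      # 5 mm x 5 mm = 25 mm^2
    return capacity_gbps / footprint_mm2


density = bandwidth_density_gbps_per_mm2(channels=12, rate_gbps=25.0,
                                         width_mm=5.0, height_mm=5.0)
print(f"{density:.1f} Gbps/mm^2")  # prints "12.0 Gbps/mm^2"
```

This counts only the 5 × 5 mm² chip footprint; the board-level mounting area, which includes fiber routing and electrical fan-out, is larger.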
Our silicon photonics integration technology is based on a 300-mm silicon-on-insulator (SOI) wafer process. ArF immersion lithography enables us to create low-loss, highly uniform optical waveguides. We achieved a low waveguide loss of 1.28 dB/cm in the O-band [4] and a highly uniform wavelength distribution of 2 nm (3σ) for a resonant peak of coupled resonator optical waveguides (CROWs) over the entire 300-mm wafer [5]. We also developed a germanium selective epitaxy process for photodetectors (PDs).
Based on these processes, we have developed a silicon photonics integrated circuit that includes the waveguide, a MOS-capacitor-type Si optical modulator, a Ge photodetector (Ge-PD), and a grating coupler (GC), as shown in the cross-sectional view in Fig. 1. Our MOS-capacitor-type Si modulator uses a vertical MOS junction in a Mach-Zehnder interferometer (MZI) configuration. This modulator has a high modulation efficiency of 0.16 V·cm, about ten times higher than that of a conventional Si optical modulator with only a lateral pn junction [6]. Furthermore, it operated at more than 25 Gbps when combined with a relatively low-impedance CMOS driver and a four-segment electrode structure [6]. We also designed a high-speed, high-efficiency surface-illuminated Ge-PD with an 1800-nm-thick Ge layer. By optimizing the anti-reflection coating stack, a high responsivity of 0.8-0.9 A/W was obtained uniformly across the wafer [7]. A bandwidth of about 15 GHz was obtained at a DC bias of 3 V. In a Ge-PD with such a thick Ge layer, the photo-carrier transit time is the main limit on the frequency bandwidth.
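The statement that transit time limits the Ge-PD bandwidth can be checked with a rough estimate. The sketch below uses the common textbook approximation f ≈ 0.45·v_sat/d for a fully depleted absorber. The prefactor, the assumption that the full 1800-nm layer is depleted, and the Ge saturation velocity of about 6 × 10⁴ m/s are assumptions made here for illustration, not values taken from the paper.

```python
def transit_time_bandwidth_hz(absorber_thickness_m: float,
                              saturation_velocity_m_per_s: float) -> float:
    """Transit-time-limited 3-dB bandwidth, f ~ 0.45 * v_sat / d.

    Textbook approximation for carriers drifting at saturation velocity
    across a fully depleted absorber of thickness d (assumption).
    """
    return 0.45 * saturation_velocity_m_per_s / absorber_thickness_m


# Assumed textbook saturation velocity for Ge (about 6e6 cm/s); not from the paper.
GE_SATURATION_VELOCITY_M_PER_S = 6.0e4
# Ge layer thickness quoted in the text.
GE_THICKNESS_M = 1800e-9

f_3db_hz = transit_time_bandwidth_hz(GE_THICKNESS_M, GE_SATURATION_VELOCITY_M_PER_S)
print(f"{f_3db_hz / 1e9:.1f} GHz")  # prints "15.0 GHz"
```

Under these assumptions the estimate is of the same order as the measured bandwidth of about 15 GHz, which is consistent with a transit-time limit. The exact agreement should not be over-interpreted, since it depends on the assumed velocity and prefactor.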
After completing the silicon photonics integrated circuit, we assembled a Fabry-Perot laser diode (FP-LD) and optical pins onto it to form an optical I/O core. Figure 2 shows a cross-sectional view of the part of the silicon photonics integrated circuit on which the FP-LD is assembled. As shown in Fig. 2, our LD bonding machine passively aligns the LD to the LD mounting stage on the silicon photonics integrated circuit, using alignment marks on both the LD and the circuit. The LD is bonded to the circuit with AuSn solder bumps. Using an infrared camera, the bonding machine achieves a horizontal alignment accuracy of ±0.5 μm between the LD and the silicon photonics integrated circuit [8]. The vertical position is determined by the fabrication process of the Si pedestals, so the vertical misalignment is within ±0.1 μm.
The optical coupling tolerance between the LD and the Si waveguide with a tip-tapered spot-size converter (SSC) was measured along the horizontal and vertical directions, as shown in Fig. 3. The minimum optical coupling loss between the LD and the Si waveguide was 2.4 dB. The alignment tolerances for 1-dB excess loss were ±0.7 μm in the horizontal direction and ±0.3 μm in the vertical direction. Therefore, by using the LD bonding machine, a coupling loss of less than 3 dB can be obtained. Furthermore, because of the flip-chip bonding, the coupling loss shows almost no temperature dependence up to 100 °C, as shown in [9].
For these LDs on the silicon photonics integrated circuit, high-temperature operation and high optical feedback tolerance are required. We applied quantum dot FP-LDs to address both requirements. Figure 4 compares the temperature dependence of the light output power of the quantum dot FP-LD and a quantum well FP-LD at an operating current of 200 mA. The quantum well layer was a strain-compensated quantum well structure consisting of five 6-nm-thick compressively strained (+1%) InGaAsP wells and six tensile-strained (−0.1%) InGaAsP barriers, sandwiched between 1.0-μm-composition InGaAsP separate confinement heterostructure (SCH) layers. The quantum dot layer, on the other hand, was composed of eight layers of InAs/GaAs quantum dots sandwiched between 1400-nm Al0.4Ga0.6As cladding layers. The quantum dot FP-LD maintains higher output power at high temperature than the quantum well FP-LD. Figure 5 shows the system for measuring optical feedback tolerance under near-end reflection. The system consists of an LD and an MZI with one output arm connected to the PD and the other terminated at an air facet. By shifting the phase of one arm of the MZI, the intensity of the light reflected from the air facet back to the LD can be changed. We measured and compared the optical feedback tolerance of the quantum dot FP-LD and the quantum well FP-LD. Figure 6 shows the estimated worst-case signal-to-noise ratio (SNR) versus the feedback condition (C_feedback), whose defining equation is given in Fig. 5. The C_feedback of the optical I/O core is −7.5 dB. In contrast to the quantum well FP-LD, the quantum dot FP-LD maintains an estimated worst-case SNR better than 40 dB over a wide range of C_feedback, which is sufficient for error-free operation at 25 Gbps [10]. These results show that the quantum dot FP-LD is suitable for silicon photonics integrated circuits.
Next, we briefly introduce the optical pin. The optical pin is a three-dimensional polymer waveguide used in place of an optical lens to connect the GC or the Ge-PD to multi-mode fiber (MMF). It is made of resin using two-step lithography: one step forms the core and the other forms the cladding. Thanks to this lithography process, it is easy to extend fabrication to a full-wafer process and reduce the assembly cost. The output wavelength of the optical I/O core has a high temperature dependence of 0.6 nm/°C because the FP-LD is used without temperature control. As a result, the radiation angle from the GC changes by 0.083 degree/nm as the FP-LD wavelength drifts. However, because the optical pin has a high refractive index contrast between the core and cladding, with a numerical aperture of more than 0.4, the varying light radiated from the GC remains confined within the core [11]. Therefore, the calculated misalignment tolerance for 1-dB excess loss between the optical pin with φ 35-μm cladding and GI50 MMF is larger than 10 μm over a temperature range from −45 °C to +85 °C.
As for the electrical assembly, we assembled a driver and a trans-impedance amplifier (TIA) integrated circuit (IC), together with through-glass vias (TGVs) for the electrical I/O. For miniaturization, the driver and the TIA IC are stacked on the silicon photonics integrated circuit by flip-chip bonding. The step between the GC or Ge-PD on the silicon photonics chip and the top surface of the IC chip makes it difficult to attach MMF directly to the GC or the Ge-PD. Installing the optical pin allows us to connect MMF to the optical I/O core easily. Furthermore, because this step also makes it difficult to connect the electrical contacts, we install TGVs to provide a flat layer above the IC chip so that the contacts can be connected easily.
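To give a feel for how much the GC radiation angle moves without temperature control, the two coefficients quoted above for the optical pin (0.6 nm/°C and 0.083 degree/nm) can be chained. This is a minimal sketch that assumes both dependencies are linear over the whole temperature range, which the text does not state; the function names are hypothetical.

```python
# Coefficients quoted in the text.
WAVELENGTH_DRIFT_NM_PER_C = 0.6     # FP-LD wavelength drift without temperature control
GC_ANGLE_SHIFT_DEG_PER_NM = 0.083   # change of GC radiation angle per nm of wavelength


def wavelength_shift_nm(t_min_c: float, t_max_c: float) -> float:
    """Wavelength excursion over a temperature span, assuming linear drift."""
    return WAVELENGTH_DRIFT_NM_PER_C * (t_max_c - t_min_c)


def gc_angle_shift_deg(t_min_c: float, t_max_c: float) -> float:
    """Resulting change of the GC radiation angle, assuming a linear dependence."""
    return GC_ANGLE_SHIFT_DEG_PER_NM * wavelength_shift_nm(t_min_c, t_max_c)


# Measured operating range (25 to 85 C) and calculated range (-45 to +85 C).
for t_min, t_max in [(25.0, 85.0), (-45.0, 85.0)]:
    d_lambda = wavelength_shift_nm(t_min, t_max)
    d_theta = gc_angle_shift_deg(t_min, t_max)
    print(f"{t_min:+.0f} C to {t_max:+.0f} C: "
          f"{d_lambda:.0f} nm drift, {d_theta:.2f} degree angle change")
```

Under the linearity assumption, the angle changes by a few degrees over these ranges, which is the variation the optical pin core has to capture.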
Characteristics of Optical I/O Core
In this section, we present the characteristics of the optical I/O core. Two characteristics are especially important for mounting optical I/O cores around FPGAs: high temperature operation and low power consumption.
The optical I/O core was mounted on an evaluation board and MMF was attached to it. We measured eye patterns and bit error rates (BERs) in a constant-temperature bath. Figure 7 shows the temperature dependence of the eye patterns and BERs at 25 Gbps with a pseudo-random bit sequence (PRBS) 31 pattern. Clear eye patterns were achieved from 25 °C up to 85 °C. Furthermore, BERs of less than 10^[...] were demonstrated at 20 °C and 85 °C [12]. This operation over a wide temperature range is achieved by means of the quantum dot LD and the optical pin.
Fig. 7 Temperature dependence of eye patterns and BER at 25 Gbps, PRBS31.
Figure 8 shows the bathtub curve of the BER at 25 Gbps with PRBS31. Thanks to the clear eye patterns, we obtained an eye opening of 0.34 unit interval (UI) at a BER of 10⁻¹² without clock data recovery (CDR). The power consumption of the driver and the TIA is already as low as 5 mW/Gbps at 25-Gbps operation thanks to the 28-nm CMOS process, and eliminating the CDR further reduces the power consumption of the optical I/O core.
Figure 9 shows the transmission characteristics of the optical I/O core. Two types of MMF were used: Corning ClearCurve LX MMF [13], which is optimized for the O-band, and conventional OM3 fiber, which is designed for the 850-nm wavelength. With the Corning ClearCurve fiber, transmission up to 300 m was achieved at 25 Gbps. A transmission length of 300 m is sufficient for applications in DC systems. Furthermore, with conventional OM3 fiber, 30-m transmission without an equalizer and 60-m transmission with an equalizer were achieved at 25 Gbps. A transmission length of 30-60 m is sufficient for applications in HPC systems.
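The eye-opening and power figures in this section are easier to compare with other links when converted to absolute units. The sketch below converts the 0.34-UI opening at 25 Gbps to picoseconds and the 5 mW/Gbps figure to milliwatts. It treats 5 mW/Gbps as the combined driver-plus-TIA figure for one channel and scales it linearly to 12 channels; both are interpretations made here, not statements from the paper. All function names are hypothetical.

```python
def unit_interval_ps(rate_gbps: float) -> float:
    """Duration of one bit period (1 UI) in picoseconds."""
    return 1000.0 / rate_gbps


def eye_opening_ps(opening_ui: float, rate_gbps: float) -> float:
    """Horizontal eye opening converted from UI to picoseconds."""
    return opening_ui * unit_interval_ps(rate_gbps)


def electrical_power_mw(efficiency_mw_per_gbps: float, rate_gbps: float,
                        channels: int = 1) -> float:
    """Power from an energy-efficiency figure, assuming linear scaling per channel."""
    return efficiency_mw_per_gbps * rate_gbps * channels


opening = eye_opening_ps(opening_ui=0.34, rate_gbps=25.0)
per_channel = electrical_power_mw(5.0, 25.0)
full_core = electrical_power_mw(5.0, 25.0, channels=12)  # assumes all 12 channels active

print(f"1 UI at 25 Gbps     : {unit_interval_ps(25.0):.0f} ps")
print(f"0.34 UI eye opening : {opening:.1f} ps")
print(f"Driver + TIA power  : {per_channel:.0f} mW per channel, "
      f"{full_core:.0f} mW for 12 channels")
```

With these inputs, one UI is 40 ps, the opening is about 13.6 ps, and the driver and TIA draw about 125 mW per channel.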
FPGA Applications
As FPGA performance increases, its application area widens to include not only communications but also DCs and HPCs. This is largely due to the FPGA's advantages in flexibility, high throughput, low latency, and low power consumption compared to CPUs and GPUs. In the server systems of DCs and HPCs, FPGAs serve as accelerators for CPUs and GPUs, which are connected through the FPGAs. Guaranteeing the bandwidth of the entire system is therefore important for improving the effective performance of server systems. For this purpose, we propose an FPGA board around which optical I/O cores are mounted, as shown in Fig. 10. A maximum of 24 Tx and Rx channels of optical I/O cores, with a capacity of 1.2 Tbps, were mounted around the FPGA. These FPGA boards were connected by 300-m MMF, and signals were transmitted at 25 Gbps per channel. BERs and eye patterns were then measured in the FPGA. As shown in Fig. 11, BERs of less than 10⁻¹² with good eye patterns were obtained. This shows that there is almost no degradation in BER when many channels of the optical I/O cores operate simultaneously.
The mounting area of 12 conventional quad small form-factor pluggables (QSFPs) (capacity = 1.2 Tbps) is about 180 cm². In contrast, the mounting area of eight optical I/O cores, also with a capacity of 1.2 Tbps, is about 11 cm². In other words, the mounting area of the optical I/O cores is less than 1/10 of that of the conventional QSFPs. This means that, for the same mounting area, optical I/O cores provide more than ten times the bandwidth of conventional optical modules in a server system. Furthermore, because optical I/O cores can be mounted close to the FPGA, thanks to their small size and high-temperature operation, the high-speed electrical line between the FPGA and the optical I/O cores becomes very short: less than 20 mm. This enables high-speed transmission at 25 Gbps without CDR and contributes to low power consumption.
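The mounting-area comparison above can be reproduced with a few lines of arithmetic. This is a minimal sketch using only the capacities and areas quoted in the text; the helper name is hypothetical.

```python
def areal_bandwidth_density(capacity_gbps: float, area_cm2: float) -> float:
    """Capacity per unit of board mounting area, in Gbps/cm^2 (hypothetical helper)."""
    return capacity_gbps / area_cm2


CAPACITY_GBPS = 1200.0   # 1.2 Tbps in both configurations
QSFP_AREA_CM2 = 180.0    # 12 conventional QSFPs
CORE_AREA_CM2 = 11.0     # eight optical I/O cores

qsfp_density = areal_bandwidth_density(CAPACITY_GBPS, QSFP_AREA_CM2)
core_density = areal_bandwidth_density(CAPACITY_GBPS, CORE_AREA_CM2)

print(f"QSFP             : {qsfp_density:.1f} Gbps/cm^2")
print(f"Optical I/O core : {core_density:.1f} Gbps/cm^2")
print(f"Area ratio       : {CORE_AREA_CM2 / QSFP_AREA_CM2:.3f}")
print(f"Density gain     : {core_density / qsfp_density:.1f}x")
```

The area ratio comes out at about 0.06 and the density gain at about 16, consistent with the "less than 1/10" and "more than ten times" statements.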
Conclusions
An optical I/O core based on silicon photonics technology and optical/electrical assembly was developed as a fingertip-size optical module with high bandwidth density, low power consumption, and high-temperature operation. The advantages of the optical I/O core, including hybrid integration of a quantum dot LD and an optical pin, allow us to achieve 300-m transmission at 25 Gbps per channel without CDR when optical I/O cores are mounted around an FPGA. This optical I/O core will contribute to high-bandwidth-density, low-power interconnection among FPGAs in data center systems and HPC systems.
Jun Ushida received the M.E. and Ph.D. degrees in physics from Osaka University, Osaka, Japan, in 1996 and 1999, respectively. In 1999, he joined NEC Corporation, Tsukuba, Japan, where he has been involved in research on nanophotonics including photonic crystals and silicon photonics. In 2012, he joined Photonics Electronics Technology Research Association (PETRA). He is a member of the Japan Society of Applied Physics (JSAP) and the Physical Society of Japan (JPS).
Masataka

