We demonstrate the co-design and co-integration of an ultra-compact silicon photonic receiver and a low-power-consumption (155 mW/channel) two-channel linear transimpedance amplifier array. Operation below the forward error correction (FEC) threshold is demonstrated at 28 Gbaud for both quadrature phase-shift keying (QPSK) and 16-quadrature amplitude modulation (16-QAM).
Introduction
Internet traffic is growing by more than 60% per year, which requires a constant evolution in transceiver technology to cope with the increasing demand. This has resulted in the deployment of 100 Gbit/s Ethernet coherent transceivers in long-haul networks, while a large effort is already dedicated to the evolution towards 400 Gbit/s. The major advantages of coherent communication include its spectral efficiency as well as the electronic compensation of linear and non-linear impairments of the transmission link [1]. In the near future, coherent transceivers are expected to become key components in metropolitan area networks, and in the longer term they will most likely also penetrate the access domain [2]-[4]. The size, power consumption and cost of these transceivers need to be significantly reduced for shorter-reach networks. Photonic integration is considered the only viable route to realize such compact, low-cost, and low-power transceivers. Several photonic integration platforms are being considered for the implementation of these devices, including planar lightwave circuits [5], InP-based photonic integrated circuits (PICs) [6], and silicon photonics [7]. For applications in access and metro networks, silicon photonic coherent transceivers have great potential. The high refractive index contrast available on this platform allows for ultra-compact devices. It also allows for the straightforward on-chip integration of polarization diversity, either by using 2-D grating structures [8] or by using polarization-insensitive spot size converters combined with a polarization rotator [9]. The compact size of silicon PICs, together with the economies of scale of silicon processing, enables low-cost chips. No hermetic packaging of these devices is required, which further reduces the cost of the overall packaged device.
While the footprint of integrated coherent transceivers is typically determined by large phase modulators [10], for shorter-reach links the IQ modulator can be implemented using electro-absorption modulators (EAMs) [11], [12]. Such modulators are much more compact (50 to 100 µm device length) than classical traveling-wave phase modulators. Recently, first-generation Ge EAMs integrated on the silicon photonic platform have been reported, operating at 56 Gb/s [13]. The power consumption of the analog front end is determined by the modulator drivers and trans-impedance amplifiers (TIAs). The EAM transmitter implementations in [11], [12] do not require digital-to-analog converters or a 50 Ω termination, which substantially reduces the overall power consumption and thereby makes the TIA power consumption a substantial part of the overall power dissipation. Analog-to-digital converters and digital signal processing contribute substantially to the overall power consumption and are therefore, in a first phase, envisioned to be implemented outside the transceiver package. With the advancement of complementary metal-oxide semiconductor (CMOS) technology, the cost and power consumption of digital computations decrease steadily over time, which will eventually enable the realization of transceivers with the digital functions inside the transceiver package. This paper reports on the realization of an important sub-block of the silicon coherent transceiver, as indicated in Fig. 1: a compact, low-power-consumption 28 Gbaud coherent receiver. This is realized through the co-design and co-integration of the electronic integrated circuit (EIC) and the PIC. The PIC is realized using imec's iSIPP25G platform. The device is based on a multi-mode interference 90° optical hybrid that has a very small footprint (13.7 µm by 155 µm) and does not require any direct current (DC) control for adjusting the relative phases between the output ports. This reduces power consumption and decreases chip size.
The 90° hybrid is connected to a pair of balanced high-speed Ge photodetectors (Ge PDs). By implementing the photocurrent subtraction on the chip, the number of bond pads required is reduced, leading to a further decrease in the PIC size. Also, because no DC decoupling capacitors are implemented on the photonic chip, the PIC size can be further reduced and a low-cost photonic integration technology with a single metallization layer can be used. While a single-polarization receiver is described in this proof-of-principle demonstration, a polarization-multiplexed coherent receiver with optimally placed grating coupler structures would only occupy about 0.5 mm by 0.5 mm. The 2-channel linear single-ended-input TIA array is designed in 0.13 µm SiGe BiCMOS technology. Besides linearity, which is needed to enable 16-quadrature amplitude modulation (16-QAM), the electronic circuit is optimized for low power consumption. As will be discussed later, a single TIA operating at 28 Gbaud consumes only 155 mW, a substantial improvement over previous demonstrations of integrated coherent receivers (ICRs) [7], [14], [15]. Reception of 28 Gbaud quadrature phase-shift keying (QPSK) and 16-QAM signals is demonstrated using the silicon coherent receiver integrated with the 2-channel TIA array. In both cases the receiver can operate below the forward error correction (FEC) limit ($3.8 \times 10^{-3}$ at 7% overhead). For QPSK, an optical signal-to-noise ratio (OSNR) of less than 12 dB/0.1 nm is required to realize this, which is within 2.5 dB of the theoretical limit. This demonstrates the potential of silicon photonics for coherent communication when co-designed and co-integrated with EICs.
Fig. 1. Envisaged silicon photonic coherent receiver consisting of an electronic (EIC) and photonic integrated circuit. In this paper, the ultra-compact coherent receiver for a single polarization with integrated low-power consumption linear TIA array is demonstrated (dashed line).
While recently the monolithic integration of a coherent receiver and TIA has been proposed and demonstrated [15] , a hybrid integration is preferred over a monolithic approach in this work as it allows independent optimization of the used technology for the photonics and electronics, allowing for commercial silicon foundry services to be used. This paper is organized as follows: In Section 2, the design and characterization of the silicon photonic integrated circuit is described. In Section 3, the design of the low-power linear TIA array is discussed, while in Section 4, the co-integration and the system experiments are described.
Silicon Photonic Integrated Circuit Design and Characterization
The silicon photonic integrated coherent receiver is realized in imec's iSIPP25G platform. The layout of the circuit is shown in Fig. 2(a). The circuit occupies an area of 0.3 mm by 0.7 mm. It consists of single-polarization fiber-to-chip grating couplers for coupling the advanced modulation format signal and the local oscillator to the chip. By properly positioning the grating couplers, the size of the coherent receiver can be further reduced to less than 0.45 mm by 0.25 mm. The fiber-to-chip grating coupler efficiency is −6.5 dB at a wavelength of 1550 nm. The −1 dB bandwidth is 20 nm. Higher-efficiency single-polarization grating couplers [16] as well as two-dimensional grating couplers for polarization diversity [8] can also be realized on this platform. Implementing polarization diversity using a focusing two-dimensional grating coupler would increase the device footprint only to about 0.5 mm by 0.5 mm. The 90° hybrid is realized using a 2 × 4 multi-mode interference (MMI) coupler. Compared to a recently reported silicon photonic single-polarization coherent receiver [17], the use of an MMI coupler leads to a smaller footprint and does not require additional thermal tuners to control the relative phases at the output of the 90° hybrid. As discussed above, this in turn reduces the number of bond pads, which further reduces the chip size and cost. The layout of the 2 × 4 MMI coupler is shown in Fig. 2(b). The back-end dielectric stack is removed for clarity. The device consists of deeply etched entrance and exit waveguides defined in a 220 nm thick waveguide layer, while the MMI itself is shallow etched (70 nm) in order to reduce phase errors and power imbalance (Fig. 2(c)).
Assuming an allowable phase difference deviation of ±5°, operation over the C-band is achieved. The simulated common mode rejection ratio, $\mathrm{CMRR} = 20 \log_{10}\left((T_{ij} - T_{ik})/(T_{ij} + T_{ik})\right)$, with $T_{ij}$ the power transmission from input $\mathrm{in}_i$ to output $\mathrm{ch}_j$, is better than −25 dB over the C-band (data not shown).
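As an illustration of the CMRR definition above, the following minimal Python sketch evaluates it for one balanced output pair. The function name `cmrr_db` and the example transmission values are hypothetical and are not simulation results from this work; the absolute value is added here only so that the logarithm stays defined regardless of which output is stronger.

```python
import math


def cmrr_db(t_ij: float, t_ik: float) -> float:
    """Common mode rejection ratio in dB for one balanced output pair.

    t_ij, t_ik: linear power transmissions from input i to outputs j and k.
    """
    if t_ij <= 0 or t_ik <= 0:
        raise ValueError("power transmissions must be positive")
    if t_ij == t_ik:
        return float("-inf")  # perfectly balanced pair
    # abs() is an addition for this sketch: it keeps log10 defined
    # when output k receives more power than output j.
    return 20.0 * math.log10(abs(t_ij - t_ik) / (t_ij + t_ik))


# Hypothetical example values (not taken from the paper).
example_cmrr = cmrr_db(0.25, 0.23)

# Relative imbalance |T_ij - T_ik| / (T_ij + T_ik) that corresponds to -25 dB.
imbalance_at_minus_25_db = 10.0 ** (-25.0 / 20.0)
```

With these example numbers the pair lands slightly below the −25 dB level quoted above, which corresponds to a relative power imbalance of roughly 5.6%.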
As there are no tuning elements for the 90° hybrid, the fabrication tolerance of the MMI was assessed. Fig. 3 shows the phase relations as a function of MMI length, width, etch depth and Si waveguide layer thickness, at a wavelength of 1550 nm (the wavelength used in the experimental work). For typical fabrication tolerances (±10 nm on waveguide widths and lengths, ±10 nm on etch depth and ±5 nm on silicon waveguide layer thickness), phase errors below 5° are obtained. Simulations (not shown here) also indicate that the common mode rejection ratio (CMRR) remains better than −20 dB in this fabrication window. A similar tolerance to fabrication variations is obtained at wavelengths at the edge of the C-band.
High-speed germanium photodetectors (Ge PDs) are implemented at the output of the 90° hybrid. The characteristics of an individual photodiode integrated on the same chip as the coherent receiver are shown in Fig. 4. Owing to their low junction capacitance, the individual PDs have a bandwidth above 50 GHz at −1 V bias, together with a dark current of less than 15 nA at −1 V bias and an on-chip responsivity of 0.5 A/W. Good uniformity over 200 mm wafers is obtained [18], which is important for chip yield and hence cost reduction. On-chip subtraction of the photocurrent was implemented in order to reduce the number of bond pads required and thereby, again, the chip size and cost. While this approach doubles the capacitance of the optical receiver and thereby reduces the bandwidth [19], the high bandwidth of the individual PD still allows for 28 Gbaud operation, as will be demonstrated below. Also, the on-chip current subtraction prevents a substantial DC photocurrent from the photodiodes from being injected into the TIA, which simplifies its design.
SiGe BiCMOS Linear Trans-Impedance Amplifier Array Design
The TIA array is fabricated in a 0.13 µm SiGe BiCMOS technology and consists of 2 identical channels. In order to interface with the balanced detectors on the photonic integrated circuit, the TIA channels have been arranged in a mirrored configuration, sharing a common ground, supply voltage (2.5 V for the analog parts and 1.2 V for the digital parts), and tunable bias voltage for the balanced photodiodes. The EIC was co-designed with the PIC, in the first place by taking the equivalent circuits of the balanced photodiodes and the interface parasitics into account and by providing on-chip biasing and decoupling for the balanced photodiodes. Additionally, by mirroring both TIAs in the layout, the distance between the two inputs is only 500 µm, which reduces the minimum PIC size needed. The TIAs were optimized for both 2- and 4-level input signals (i.e., QPSK and 16-QAM) in terms of linearity [20]. A balancing error integrator loop is implemented to remove the input DC offset [21]. The total chip area is 3000 µm × 900 µm, of which 1100 µm × 900 µm is occupied by each TIA. The serial peripheral interface controller is shared, and a single bias block provides a 100 µA reference current to each TIA. A microscope image of the TIA array with annotated functional blocks and sizing is provided in Fig. 5. Fig. 6 shows a simplified block diagram of the equivalent electrical representation of the photonic coherent receiver and the TIA array. The electrical signal path consists of an input stage, a main amplifier and an output stage. The input stage converts the current coming from the balanced photodiodes to a voltage signal through the feedback resistor $R_F$, which is implemented as an array of eight parallel nMOS transistors operated in their linear region. As such, the gain of this stage can be controlled digitally by turning on (decreasing $R_F$) or off (increasing $R_F$) the transistors of this array.
Contrary to other coherent receivers, the PD biasing is generated by the 2-channel TIA chip through a settable reverse bias control, as indicated in Fig. 6(b). The TIA input stage provides a fixed voltage of approximately 0.9 V to the bottom photodiode. The reverse bias control is then set to 1.8 V to match the voltage for the top photodiode, giving both photodiodes a reverse bias of 0.9 V. The decoupling capacitance $C_{DC}$ for the bias voltage is realized on-chip and close to the photodiodes, removing the need for discrete external decoupling capacitors. As discussed above, this minimizes the PIC footprint and allows using less complex PIC technologies with a single metallization layer. Furthermore, because there is no longer a need for a positive and a negative supply voltage as in traditional balanced photodiode biasing schemes [17], this topology requires only one electrical connection from the package to the EIC to realize the biasing. As the bias control is shared by both TIAs, the bias voltage is the same for both pairs of balanced photodiodes.
As the trans-impedance gain is typically inversely proportional to the bandwidth, a trade-off needs to be made. Fig. 7(a) shows the simulated 3 dB bandwidth of the TIA as a function of the trans-impedance. Fig. 7(b) shows the simulated trans-impedance gain of the complete TIA as a function of frequency for a trans-impedance of 133 Ω, as is used in the experiment. For this trans-impedance value the TIA has a simulated 3 dB bandwidth of approximately 26 GHz. This simulation assumes a total input capacitance of 100 fF for the PIC, including the PDs (∼20 fF per photodiode, i.e., ∼40 fF in balanced configuration) and the pad and wirebond capacitance (∼60 fF). The equivalent circuit is shown in the inset of Fig. 7(b).
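The capacitance budget above can be checked with a short Python sketch. It also computes the corner frequency that would result if the 133 Ω transimpedance acted as a plain resistor loading the input node, which is a deliberate simplification for illustration and not the model used in the simulations; all names below are hypothetical.

```python
import math

# Values quoted in the text (approximate).
C_PD = 20e-15             # capacitance per photodiode, in farads
C_PAD_WIREBOND = 60e-15   # pad plus wirebond capacitance, in farads
TRANSIMPEDANCE = 133.0    # transimpedance setting used in the experiment, in ohms


def total_input_capacitance(c_pd: float, n_pd: int, c_parasitic: float) -> float:
    """Sum the photodiode and parasitic capacitances at the TIA input."""
    return n_pd * c_pd + c_parasitic


def rc_corner_hz(r_ohm: float, c_farad: float) -> float:
    """First-order RC corner frequency, 1 / (2 * pi * R * C)."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


# Balanced pair: two photodiodes in parallel at the input node.
c_in = total_input_capacitance(C_PD, 2, C_PAD_WIREBOND)

# Simplifying assumption: treat the transimpedance as a passive load resistor.
f_naive = rc_corner_hz(TRANSIMPEDANCE, c_in)
```

Under this simplification the corner sits near 12 GHz, well below the simulated 26 GHz. In a feedback TIA the input resistance seen by the photodiodes is reduced by the loop gain of the input stage, which moves the input pole to a higher frequency; the open-loop gain is not given in the text, so it is not estimated here.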
Electronic/Photonic Co-Integration and System Experiments
The developed silicon photonic coherent receiver and the co-designed TIA array were co-integrated on a four-layer printed circuit board (PCB). Fig. 8(a) shows the PCB used for testing purposes. The PCB was not minimized in size, so as to enable easy testing and assembly. Both dies were placed in a cavity in the center of the PCB to minimize the required wire bond length between the TIA and the traces on the PCB. Care was taken during the assembly to place the 2-channel TIA die and the silicon PIC as close together as possible in order to minimize the lengths of the interconnecting wire bonds. The 2 × 2 differential outputs of the two-channel TIA were routed symmetrically to four high-speed connectors at the edge of the board. Due to limitations of the measurement setup in the lab, all measurements were done single-ended by terminating the complementary output of the differential signal with a DC block and a 50 Ω termination. Fig. 8(b) shows a close-up of the wire-bonded electronic and photonic dies on the PCB. For practical reasons the photonic die was diced larger than the actual coherent receiver size, as indicated in Fig. 8.
The measurement setup used to characterize the coherent receiver at 28 Gbaud is shown in Fig. 9. At the transmitter side, the light of a C-band external cavity laser operating at 1550.92 nm (linewidth 100 kHz) is split, to be used both as Rx local oscillator (LO) and as Tx source for ease of characterization. The signal part is guided through a LiNbO$_3$ Mach-Zehnder IQ modulator (IQ-MZM), where it is modulated with a pseudo-random bit sequence (PRBS) of length $2^{15} - 1$ at 28 Gbaud and amplified by an EDFA. The IQ-MZM is driven by two digital-to-analog converters (DACs) that generate the in-phase and quadrature parts of the symbols. Both QPSK (2 bits/symbol) and 16-QAM (4 bits/symbol) modulation formats are studied. For OSNR measurements, amplified spontaneous emission (ASE) noise is added to the modulated signal in a noise loading stage. A variable optical attenuator provides the desired signal power to the receiver. The LO is amplified by a second EDFA before being connected to the coherent receiver. Polarization controllers allow efficient coupling of TE-polarized light into the silicon photonic receiver through the fiber-to-chip grating couplers. The output of the TIA is read out by a 50 GHz, 160 GS/s real-time oscilloscope. In the digital domain, the captured data is processed offline in parallel in a distributed digital signal processing cloud. First, the digitized signals are down-sampled to 56 GS/s (for twofold oversampling), after which optical front-end impairments are compensated. Then the data is processed by a minimum mean squared error (MMSE) time domain equalizer (TDE). The weight coefficients of the TDE are heuristically updated, with a variable step size [22], using the least mean squares (LMS) algorithm for convergence and decision-directed LMS for transmission.
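For readers who want to reproduce a test pattern of the same length, the sketch below generates a PRBS of period $2^{15} - 1$ with a linear feedback shift register. The generator polynomial $x^{15} + x^{14} + 1$ is an assumption (it is the commonly used PRBS15 polynomial); the text does not state which polynomial or seed was used, and the function name is hypothetical.

```python
PRBS15_PERIOD = (1 << 15) - 1  # 32767 bits


def prbs15(n_bits: int, seed: int = 0x7FFF) -> list:
    """Generate n_bits of a PRBS15 sequence with a 15-bit Fibonacci LFSR.

    Assumed polynomial: x^15 + x^14 + 1 (taps on register stages 15 and 14).
    The seed must be a non-zero 15-bit value, since the all-zero state is
    a fixed point of the register.
    """
    if not 0 < seed <= 0x7FFF:
        raise ValueError("seed must be a non-zero 15-bit integer")
    state = seed
    bits = []
    for _ in range(n_bits):
        # XOR of the two oldest register stages forms the feedback bit.
        new_bit = ((state >> 14) ^ (state >> 13)) & 1
        bits.append(new_bit)
        # Shift left, insert the feedback bit, keep 15 bits.
        state = ((state << 1) | new_bit) & 0x7FFF
    return bits


# Two consecutive periods, to check that the pattern repeats.
two_periods = prbs15(2 * PRBS15_PERIOD)
first_period = two_periods[:PRBS15_PERIOD]
second_period = two_periods[PRBS15_PERIOD:]
```

For QPSK, consecutive bit pairs from such a sequence would drive the I and Q DACs; for 16-QAM, groups of four bits would. The exact bit-to-symbol mapping is not specified in the text.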
Note that a TDE is employed instead of a frequency domain equalizer because the latter has a lower adaptation gain and is therefore too slow to converge stably while still leaving enough symbols for accurate BER analysis. Every capture contains 560 k symbols. The weight update uses the least mean squares (LMS) algorithm for convergence and switches after 40 k symbols to decision-directed least mean squares (DD-LMS). Both algorithms have variable step sizes [23]. Subsequently, the small frequency offset between the transmitter and local oscillator lasers is removed by applying carrier phase estimation based on digital phase-locked loops [22]. Next, the symbols are demapped. The system BER is averaged over 1 and 2 million bits for QPSK and 16-QAM, respectively.
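The capture length and the BER averaging windows quoted above are mutually consistent, as the following arithmetic sketch shows. It assumes that the 40 k symbols used for LMS convergence are excluded from the BER count and that one capture is evaluated per BER point; neither is stated explicitly in the text, and the names are hypothetical.

```python
SYMBOLS_PER_CAPTURE = 560_000
TRAINING_SYMBOLS = 40_000          # symbols used before switching to DD-LMS
BITS_PER_SYMBOL = {"QPSK": 2, "16-QAM": 4}


def payload_bits(modulation: str) -> int:
    """Bits available for BER counting in one capture.

    Assumption: the training symbols are discarded before counting errors.
    """
    usable_symbols = SYMBOLS_PER_CAPTURE - TRAINING_SYMBOLS
    return usable_symbols * BITS_PER_SYMBOL[modulation]


qpsk_bits = payload_bits("QPSK")      # 1,040,000 bits
qam16_bits = payload_bits("16-QAM")   # 2,080,000 bits
```

Under these assumptions a single capture provides about 1 million bits for QPSK and about 2 million bits for 16-QAM, in line with the averaging windows stated above. With roughly $10^6$ bits, a BER at the FEC limit of $3.8 \times 10^{-3}$ corresponds to several thousand counted errors.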
No temperature control of the photonic integrated circuit is used during the measurements. For the 28 Gbaud QPSK measurement, 12 dBm fiber-coupled LO power (∼5 dBm on-chip) was used. The signal power was −2.5 dBm (∼−9.5 dBm on-chip). The transimpedance of the TIA was tuned to achieve optimal bit error rate (BER) performance for the given data rate (28 Gbaud) by trading off a lower gain (a transimpedance of 133 Ω) for a higher bandwidth. The reverse bias of 1.8 V for the balanced photodiodes (0.9 V per diode) was set through the TIA. The BER as a function of OSNR for 28 Gbaud QPSK is shown in Fig. 10(a), together with two representative constellation diagrams. The transmission is below the FEC limit (i.e., $3.8 \times 10^{-3}$ at 7% overhead) for an OSNR of 12 dB/0.1 nm. The OSNR penalty with respect to the theoretical minimum is less than 2.5 dB. For 28 Gbaud 16-QAM, our measurements are limited by the performance of the DACs on the transmitter side. Nevertheless, operation below the FEC threshold was realized, as shown in Fig. 10(b) and (c) together with a representative constellation diagram. In Fig. 10(b) the LO power (in fiber) is swept for a constant signal power (4.5 dBm in fiber), while in Fig. 10(c) the signal power is swept for a constant LO power (14.7 dBm in fiber). In both curves the error rate increases again beyond a certain input power. This is attributed to a combination of two nonlinear effects: a reduction of the responsivity and bandwidth of the Ge photodetectors at higher input power on the one hand, and degeneration of the trans-impedance amplifier due to the high input current, related to saturation of the output stage of the TIA, on the other hand. Nonlinear effects in the silicon waveguides (two-photon absorption, self-phase modulation, and cross-phase modulation) are too weak to explain this BER degradation.
In the experiment at 28 Gbaud (both for QPSK and 16-QAM operation) the receiver consumes 310 mW, yielding a low overall power consumption of 155 mW per TIA, a factor of three lower than in [7], [14] and a factor of 1.6 lower than in [15]. In Table 1, we compare our work to state-of-the-art integrated silicon coherent receivers, illustrating the low power consumption and small PIC footprint of the receiver demonstrated in this work. Amongst the single-polarization ICRs, we realize a PIC size reduction of a factor of 4. Assuming a polarization-division multiplexed (PDM) version of the presented PIC would be roughly twice as large (i.e., ∼0.5 mm²), this design would still have a substantially smaller footprint than the PDM-ICRs.
Conclusion
An ultra-compact silicon photonic coherent receiver (0.3 mm by 0.7 mm) integrated with a low-power-consumption 0.13 µm SiGe BiCMOS TIA (155 mW/channel) is demonstrated in this paper. Operation below the FEC threshold is obtained at 28 Gbaud for both QPSK and 16-QAM. For QPSK, the OSNR penalty with respect to the theoretical limit was less than 2.5 dB. This demonstration paves the way for the realization of low-power, low-cost and ultra-compact silicon photonic coherent transceivers.
Acknowledgment
The silicon photonic integrated circuit development was supported by the UGent Special Research Fund (BOF) GOA-electronic/photonic integration platform project. The TIA development was supported by the EU-funded FP7 ICT projects Mirage, Phoxtrot, and Discus.
TABLE 1. Comparison of state-of-the-art integrated silicon coherent receivers and the receiver demonstrated in this work
