Abstract: A broadband transimpedance amplifier (TIA) based on a regulated cascode (RGC) configuration with an L-matching network and a cascode amplifier is designed and analyzed. A novel single-to-differential amplifier is also designed to simplify the design of the following limiting amplifier (LA) and to improve immunity to common-mode noise. The TIA is implemented in a UMC 0.18 µm standard CMOS process. The measured transimpedance gain is 57 dBΩ with a −3 dB bandwidth of 8.1 GHz. The average equivalent input noise current spectral density is 29 pA/√Hz. The chip consumes 68 mW of DC power from a 1.8 V supply and occupies an area of 0.9 mm².
Introduction
Recently, a single-chip microprocessor that communicates directly using light, fabricated in a 45-nm CMOS silicon-on-insulator (SOI) process, was reported. This demonstration marks the beginning of an era of chip-scale electronic-photonic systems [1]. The optoelectronic integrated circuit (OEIC) has been proposed to overcome the power consumption, speed limitation, and crosstalk problems of conventional short-distance electrical interconnects. In addition, CMOS technologies have become very attractive because of their low cost, high integration level, and continuous process improvement. A high-performance OEIC could therefore be fully realized in a CMOS process with small chip area and low cost. The transimpedance amplifier (TIA) is the critical block in the OEIC, because its speed and sensitivity have a crucial impact on the performance of the entire optical receiver system. Moreover, to reject common-mode noise from the power supply and substrate, the TIA usually provides a differential signal to the limiting amplifier (LA). Unfortunately, for cost reasons, usually only one photodetector (PD) is used, which provides a single-ended input signal [2, 3, 4]. Therefore, high-performance TIA design faces two prominent challenges: the potentially large PD junction capacitance, which degrades both the system bandwidth and the noise performance, and the need to achieve a differential configuration.
The regulated cascode (RGC) configuration is a very popular TIA input stage because of its high gain, wide bandwidth, and comparatively low power consumption and noise [5]. Various bandwidth enhancement techniques have been used to design gigabit optical receivers based on the RGC configuration, such as improved auxiliary amplifiers [6, 7, 8], wideband matching networks [9, 10, 11, 12, 13, 14], and pole-zero cancellation [15, 16].
A typical TIA usually uses one of two pseudo-differential structures, employing either a passive low-pass filter (LPF) [9, 17] or a replica input [18, 19], but neither obtains a fully differential signal and both have non-negligible drawbacks. The passive low-pass filter is not suitable for low-frequency signals or data with long runs of "1", and the replica input greatly increases the chip area and power consumption. Ref. [20] proposes an active balun, but it reduces the overall bandwidth, and the DC level, gain, and bandwidth of one differential output are inconsistent with those of the other.
By introducing an L-matching network, π-matching networks, a cascode structure, and capacitive degeneration, this paper describes a high-speed fully differential TIA in a CMOS process with excellent bandwidth, noise, and gain performance. It consists of a modified regulated cascode (M-RGC) input stage and a novel single-to-differential amplifier (S-D AMP). The proposed circuit topology is detailed in Section 2, along with the analysis and optimization of bandwidth and noise performance. The measurement results of the proposed TIA are presented in Section 3, and conclusions are drawn in Section 4.
The proposed broadband differential TIA

A. Bandwidth analysis and enhancement
The conventional RGC configuration is shown in Fig. 1(a). An auxiliary amplifier is used to create a very low input impedance, which neutralizes the effect of the total capacitance C_T, including the PD junction capacitance and the pad parasitic capacitance [5]. The auxiliary branch of the conventional RGC is a common-source (CS) amplifier, which presents a large parasitic capacitance C_i,CS = C_gs2 + (1 + g_m2 R_B) C_gd2 at the input because of the Miller effect. On the one hand, C_i,CS and C_T form a very large capacitance at node X, which greatly degrades both bandwidth and noise performance. On the other hand, the circuit operates at gigabit rates, so C_i,CS creates a low-impedance path to ground at high frequencies, resulting in a large loss of signal current and hence gain degradation. Furthermore, the gain of the auxiliary amplifier should be high to create a very low input impedance. Therefore, it is necessary to improve the RGC by introducing the inductors L1 and L2 and a cascode configuration, as shown in Fig. 1(b). The most prominent features of the cascode amplifier are that it shields the Miller effect and boosts the gain. The total parasitic capacitance C_i,CAS at the cascode amplifier input is
According to Eq. (1), if the sizes of M2 and M3 are roughly the same, the Miller multiplication factor of C_gd2 is about 2 rather than (1 + g_m2 R_B). Furthermore, when an inductor L2 is inserted between node X and the gate of M2, the relationship between the drain current of M2 and V_x is
The effective transconductance G_m2 of M2 increases with frequency, which compensates for the gain reduction of the cascode amplifier caused by C_i,CAS at high frequencies [10]. The output resistance of the cascode auxiliary amplifier is expressed as
which is larger than that of the CS amplifier (r_o2 ∥ R_B). The gain of the auxiliary amplifier can be expressed as
The gain of the auxiliary amplifier is improved, so the input impedance of the RGC is further reduced. Compared with enhancing the gain by increasing the width W_2 of M2 or the value of R_B, the proposed auxiliary amplifier does not introduce a larger parasitic capacitance, and it avoids the possible peaking caused by the zero generated by the local feedback of the RGC stage, so a flat frequency response can be maintained. Moreover, R_s should be relatively large to minimize signal loss [5].
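The reduction in Miller multiplication can be illustrated with a short numerical sketch. The CS expression is the one given above; the cascode expression uses the textbook approximation that M2 sees roughly 1/g_m3 at its drain, so its Miller factor is about 1 + g_m2/g_m3 (this form is an assumption here, chosen to agree with the factor of about 2 quoted for equally sized M2 and M3). All component values are hypothetical and are not taken from the fabricated circuit.

```python
def cs_input_capacitance(c_gs2, c_gd2, g_m2, r_b):
    """Input capacitance of the common-source auxiliary amplifier.

    C_gd2 is multiplied by the Miller factor (1 + g_m2 * R_B).
    """
    return c_gs2 + (1.0 + g_m2 * r_b) * c_gd2


def cascode_input_capacitance(c_gs2, c_gd2, g_m2, g_m3):
    """Approximate input capacitance of the cascode auxiliary amplifier.

    Assumption: the drain of M2 sees about 1/g_m3, so the gain across
    C_gd2 is about g_m2/g_m3 and the Miller factor is (1 + g_m2/g_m3).
    """
    return c_gs2 + (1.0 + g_m2 / g_m3) * c_gd2


# Hypothetical device values, for illustration only.
C_GS2 = 60e-15   # F
C_GD2 = 20e-15   # F
G_M2 = 10e-3     # S
G_M3 = 10e-3     # S, same size as M2
R_B = 500.0      # ohm

c_i_cs = cs_input_capacitance(C_GS2, C_GD2, G_M2, R_B)
c_i_cas = cascode_input_capacitance(C_GS2, C_GD2, G_M2, G_M3)

print(f"C_i,CS  = {c_i_cs * 1e15:.1f} fF")
print(f"C_i,CAS = {c_i_cas * 1e15:.1f} fF")
```

With these assumed values the Miller factor drops from 6 to 2, which is the mechanism behind the smaller capacitance at node X.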
Wideband matching networks are very effective for both noise reduction and bandwidth enhancement without degrading other parameters of an existing amplifier [11, 12]. However, introducing passive components involves a trade-off between performance and cost. Therefore, only one inductor L1 is introduced at the RGC input to form a passive matching network with the circuit parasitic capacitances.
To achieve maximum bandwidth enhancement with maximally flat gain (a Butterworth-type response), the M-RGC can be approximated as a passive low-pass filter [12]. The small-signal model of the M-RGC, shown in Fig. 2(a), is not purely passive because the matching network interacts with the active devices of the RGC input stage. The matching network must therefore be converted to an equivalent passive one.
Looking from left to right at node X in Fig. 2(a) , the impedance at node X is expressed by
We neglect C_gs1 in this analysis because it is sufficiently small. The circuit can therefore be simplified as shown in Fig. 2(b), where C_1 is the combined parasitic capacitance at node X and Z_T(s) is expressed by
where R_O is the total output impedance, and ω_1 is the −3 dB cut-off frequency of the M-RGC. The RGC has two important poles, located at the input and output nodes, respectively [5]. The input pole of the M-RGC is pushed to a high frequency beyond the bandwidth of interest (ω_in ≈ 20 GHz) by the very small input resistance and the L-matching network. Therefore, the lowest pole (ω_out ≈ 14 GHz) is located at the output node and is expressed by
where C_O is the output node capacitance. However, the bandwidth of the M-RGC is lower than that given by Eq. (7), because the other poles, zeros, and parasitic capacitances also limit the bandwidth. The equivalent matching network is third-order and consists of C_T, L1, and C_1. Taking the dominant pole ω_1 of the M-RGC into account, the whole amplifier can be approximated as a fourth-order low-pass filter with a frequency-independent gain Z_T(0). From the small-signal circuit model in Fig. 2(b), the transfer function of the whole circuit can be derived as:
To give this transfer function a Butterworth response, the most convenient way is to map the coefficients of the denominator of Eq. (8) to the fourth-order Butterworth coefficients, as shown in Eq. (9).
The design parameters can be computed from these equations to implement an M-RGC with a Butterworth-type response [12]. The simulation results in Fig. 3 demonstrate that the bandwidth increases from 5.2 GHz to 11.8 GHz, a factor of about 2.27, with small peaking.
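The target of the coefficient mapping is the normalized fourth-order Butterworth denominator. The sketch below derives those coefficients from the Butterworth pole locations and checks the maximally flat magnitude; it only illustrates the target polynomial and does not solve for the circuit values of the M-RGC. The function names are illustrative and NumPy is assumed to be available.

```python
import numpy as np


def butterworth_denominator(order):
    """Coefficients (highest power first) of the normalized Butterworth
    denominator, built from its left-half-plane poles on the unit circle."""
    k = np.arange(1, order + 1)
    poles = np.exp(1j * np.pi * (2 * k + order - 1) / (2 * order))
    # The poles come in conjugate pairs, so the imaginary parts of the
    # expanded polynomial are only rounding residue.
    return np.real(np.poly(poles))


def magnitude(coeffs, omega):
    """|H(j*omega)| of the all-pole low-pass 1 / B(s), with omega normalized
    to the -3 dB cut-off frequency."""
    return 1.0 / abs(np.polyval(coeffs, 1j * omega))


coeffs = butterworth_denominator(4)
print("s^4..s^0 coefficients:", np.round(coeffs, 4))
print("|H| at 0.5, 1.0, 2.0:", [round(magnitude(coeffs, w), 4) for w in (0.5, 1.0, 2.0)])

# Bandwidth improvement reported from the simulation in the text.
bandwidth_ratio = 11.8 / 5.2
print(f"bandwidth ratio = {bandwidth_ratio:.2f}")
```

Once the denominator of the amplifier transfer function is normalized to its cut-off frequency, equating it term by term with these coefficients yields the design equations.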
At first glance, the only difference between the proposed M-RGC and the circuits of [8, 12] is the position of L2, as shown in Fig. 1. However, the operating principles are fundamentally different, and the proposed circuit achieves better performance. In Ref. [8], L2 is inserted between M2 and M3 as a π-matching network, which isolates C_gd2 and C_gs3 from each other and absorbs them into the passive network. However, C_gd2 and C_gs3 have little impact on the bandwidth in this circuit, so the bandwidth enhancement is not significant. In Ref. [12], the input impedance of the M-RGC is inductive because of the introduced inductor L2, so at high frequencies the photocurrent flows through the input parasitic capacitances (C_T, C_i,CS) rather than into the RGC input impedance, which decreases the RGC bandwidth [10].
B. Noise analysis and reduction
It is well known that the noise performance of the M-RGC, as the input stage, determines the sensitivity of the entire optical receiver system. To improve the accuracy of the noise analysis, the standard E_n-I_n noise model is employed [12]. The output noise contribution of M3 is negligible when a current source is used to model the noise of M3, so the noise generated by M3 is ignored in the following analysis. According to the noise model, the input-referred noise current of the M-RGC is expressed by
Where,
where k is the Boltzmann constant, T is the absolute temperature,
According to Eq. (10), R_D and R_s contribute most of the low-frequency noise, so R_s should be relatively large to minimize its noise current contribution. The dominant noise at high frequencies comes from M1 and M2, and the growth rate of the input-referred noise at high frequencies depends on C_T. The L-matching network effectively decreases the noise contribution of M1 and M2 within the desired bandwidth. To minimize the input-referred noise current spectral density, R_B and the width W_1 of M1 should be small [5]. As shown in Fig. 4(b), the simulated and analytical results are basically consistent; the proposed M-RGC has better noise performance than the circuits of [8, 12] because of the better use of the inductor L2, and its noise reaches a minimum within the desired bandwidth.
C. The novel S-D AMP
The active balun shown in Fig. 5(a) was proposed in Ref. [20]. A common-gate (CG) amplifier and a common-source (CS) amplifier are merged to obtain a fully differential signal at the two drain nodes. However, it degrades the overall bandwidth, and the DC level, gain, and bandwidth of one differential output are inconsistent with those of the other. Therefore, a novel S-D AMP based on the active balun is designed, which not only provides a fully differential signal but also improves the gain. The equivalent circuit is shown in Fig. 5(b).
It is well known that the output node capacitance C_O of the M-RGC increases when the S-D AMP is connected to the M-RGC. Unfortunately, the dominant pole of the M-RGC is located at the output node, so the bandwidth is reduced. Hence, inductors L3 and L4 are used to form two π-matching networks with the parasitic capacitances at nodes 1, 2, and 3, so that the parasitic capacitances are isolated from each other and absorbed into the passive networks [11]. The pole is thus pushed to a higher frequency, which effectively limits the bandwidth reduction, and the dominant pole moves to nodes A and B.
Moreover, two single-stage CS amplifiers with capacitive degeneration are added at the output nodes of the CS stage (R_2 and M6) and the CG stage (R_1 and M5), respectively, to make the two output signals as consistent as possible. The capacitive degeneration technique [12] contributes two zeros at (R_4 C_1)^−1 and (R_6 C_2)^−1, and two poles at (1 + g_m7 R_4)/(R_4 C_1) and (1 + g_m8 R_6)/(R_6 C_2). The two zeros can be used to compensate the dominant poles at nodes A and B, respectively, which makes the bandwidths of the differential outputs as consistent as possible and reduces the bandwidth degradation of the entire circuit caused by the S-D AMP. According to the small-signal model and Kirchhoff's laws, the transfer function of the S-D AMP is obtained.
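The zero and pole introduced by capacitive degeneration can be checked numerically. The sketch below assumes the usual model of a CS stage whose source is degenerated by a resistor R in parallel with a capacitor C, giving an effective transconductance g_m / (1 + g_m Z_s) with Z_s = R / (1 + sRC). The component values are hypothetical and are not those of the fabricated S-D AMP.

```python
import math


def degeneration_zero_pole_hz(g_m, r_deg, c_deg):
    """Zero and pole frequencies (in Hz) of a capacitively degenerated stage.

    zero: 1 / (R C), pole: (1 + g_m R) / (R C), both divided by 2*pi.
    """
    zero_hz = 1.0 / (2.0 * math.pi * r_deg * c_deg)
    pole_hz = (1.0 + g_m * r_deg) * zero_hz
    return zero_hz, pole_hz


def effective_gm(g_m, r_deg, c_deg, freq_hz):
    """Effective transconductance with the source impedance R || (1/sC)."""
    s = 2j * math.pi * freq_hz
    z_source = r_deg / (1.0 + s * r_deg * c_deg)
    return g_m / (1.0 + g_m * z_source)


# Hypothetical values, for illustration only.
G_M7 = 20e-3   # S
R_4 = 200.0    # ohm
C_1 = 100e-15  # F

zero_hz, pole_hz = degeneration_zero_pole_hz(G_M7, R_4, C_1)
gm_dc = abs(effective_gm(G_M7, R_4, C_1, 0.0))
gm_hf = abs(effective_gm(G_M7, R_4, C_1, 1e12))

print(f"zero = {zero_hz / 1e9:.2f} GHz, pole = {pole_hz / 1e9:.2f} GHz")
print(f"G_m at DC = {gm_dc * 1e3:.2f} mS, at 1 THz = {gm_hf * 1e3:.2f} mS")
```

Placing the zero at the frequency of the dominant pole at node A or B cancels that pole, so the stage bandwidth is then set by the higher pole at (1 + g_m R)/(R C), at the cost of a lower low-frequency transconductance.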
D. The Proposed Differential TIA
The proposed TIA circuit, shown in Fig. 6, consists of an M-RGC, an S-D AMP, and a buffer. The output buffer stage is designed to drive the 50 Ω loads required for measurement. In addition, active inductive peaking is used to alleviate the bandwidth degradation caused by the buffer. As shown in Fig. 7, the simulation results demonstrate that the bandwidth and gain of the two differential outputs are basically identical and are effectively enhanced with respect to the balun of Ref. [20]: the gain reaches 64 dBΩ with a −3 dB bandwidth of 11.4 GHz, and the equivalent input noise current spectral density is 15 pA/√Hz. With the S-D AMP, the gain is improved by 9 dB without significant degradation of the bandwidth or the overall noise.
Measurement results and analysis
The proposed TIA is simulated using Cadence Spectre RF and fabricated in UMC 0.18 µm 1P6M CMOS technology; the total die area is 1000 × 900 µm². The TIA chip consumes 68 mW of DC power from a 1.8 V supply and is mounted on a printed-circuit board (PCB) for measurement, as shown in Fig. 8(a). The characteristic impedance of the transmission lines on the PCB is 50 Ω to match the input impedance of the test equipment. The inductors L1 to L4 are implemented as planar spiral inductors to ensure a monolithic implementation. An on-chip MIM capacitor is used to emulate the PD capacitance. The total input parasitic capacitance C_T is about 0.35 pF. To reduce the influence of parasitics on the performance of the TIA, the interconnection lines and bonding wires, which have parasitic inductance and degrade the S-parameters especially at high frequencies, are kept as short as possible.
Three-port S-parameters of the CMOS TIA were measured using a Keysight Technologies E5071C network analyzer, as shown in Fig. 8(b). S21 and S31 are basically consistent and show good flatness. S11, S22, and S33 are mostly below −10 dB over the entire bandwidth. The effective transimpedance gain Z_T without the PD is calculated from the measured S-parameters by the following transform equation:
where Z_0 is 50 Ω. As shown in Fig. 8(b), the measured gain is 57 dBΩ with a −3 dB bandwidth of 8.1 GHz for an input capacitance of 0.35 pF. Fig. 9(a) shows the measured group delay of the fabricated TIA, which is calculated from the measured phase response. The group delay is about 30 ± 20 ps up to 7 GHz. There is a large fluctuation between 7 GHz and 8 GHz, which might be caused by the nonideality of the inductors and by inaccurately modeled parasitic inductances of the metal lines and bonding wires.
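As a rough illustration of how a transimpedance gain in dBΩ relates to measured S-parameters, the sketch below uses a conversion that is common in the TIA literature, Z_T = Z_0 · S21 / (1 − S11). This form is an assumption and may differ from the transform equation used above; the S-parameter values in the example are hypothetical, not measured data.

```python
import math


def transimpedance_from_s(s21, s11, z0=50.0):
    """Transimpedance in ohms from S21 and S11 (complex or real).

    Assumed conversion: Z_T = Z_0 * S21 / (1 - S11).
    """
    return z0 * s21 / (1.0 - s11)


def to_db_ohm(z_ohm):
    """Convert a transimpedance magnitude in ohms to dB-ohm."""
    return 20.0 * math.log10(abs(z_ohm))


def from_db_ohm(gain_db_ohm):
    """Convert a gain in dB-ohm back to ohms."""
    return 10.0 ** (gain_db_ohm / 20.0)


# Hypothetical single-frequency S-parameters (linear, not dB).
z_t = transimpedance_from_s(s21=10.0 + 0.0j, s11=0.2 + 0.0j)
z_t_db = to_db_ohm(z_t)
print(f"Z_T = {abs(z_t):.1f} ohm = {z_t_db:.1f} dB-ohm")

# The measured gain quoted in the text, expressed in ohms.
measured_gain_ohm = from_db_ohm(57.0)
print(f"57 dB-ohm = {measured_gain_ohm:.0f} ohm")
```

In practice the conversion would be applied at every frequency point of the measured sweep to obtain the gain curve and its −3 dB bandwidth.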
The measured output noise voltage is 0.95 mV, measured with a Keysight DSA-X 93204A oscilloscope as shown in Fig. 9(b). After subtracting the background noise (0.208 mV) contributed by the oscilloscope, the input-referred noise is about 29 pA/√Hz, calculated by Eq. (15).
The differences between the measured and simulated results are possibly due to inaccurate device models, additional parasitic elements not considered in the simulation, process variation, EM radiation loss, and the PCB.
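A common back-of-the-envelope estimate of the average input-referred noise density is sketched below. It assumes that the oscilloscope noise is uncorrelated with the TIA noise and is therefore removed in quadrature, and that the remaining output noise is divided by the transimpedance gain and by the square root of the −3 dB bandwidth. These assumptions may differ from Eq. (15), and the gain to use (differential or single-ended) is also an assumption, so the sketch shows how sensitive the estimate is to that choice.

```python
import math


def input_referred_noise_density(v_out_rms, v_background_rms, gain_ohm, bandwidth_hz):
    """Average input-referred noise current density in A/sqrt(Hz).

    Assumptions: uncorrelated background noise (quadrature subtraction)
    and a flat gain over the stated bandwidth.
    """
    v_tia_rms = math.sqrt(v_out_rms ** 2 - v_background_rms ** 2)
    i_in_rms = v_tia_rms / gain_ohm
    return i_in_rms / math.sqrt(bandwidth_hz)


V_OUT = 0.95e-3         # V, measured output noise from the text
V_BACKGROUND = 0.208e-3  # V, oscilloscope background noise from the text
BANDWIDTH = 8.1e9       # Hz, measured -3 dB bandwidth from the text

# Gain taken as the full 57 dB-ohm figure.
density_57 = input_referred_noise_density(
    V_OUT, V_BACKGROUND, 10.0 ** (57.0 / 20.0), BANDWIDTH)

# Hypothetical case: a gain 6 dB lower, e.g. one output of a differential pair.
density_51 = input_referred_noise_density(
    V_OUT, V_BACKGROUND, 10.0 ** (51.0 / 20.0), BANDWIDTH)

print(f"with 57 dB-ohm: {density_57 * 1e12:.1f} pA/sqrt(Hz)")
print(f"with 51 dB-ohm: {density_51 * 1e12:.1f} pA/sqrt(Hz)")
```

A 6 dB change in the assumed gain changes the estimate by a factor of about two, so the gain definition matters as much as the noise measurement itself.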
The eye diagrams are measured with a Keysight DSA-X 93204A oscilloscope and a J-BERT M8041A pattern generator, as shown in Fig. 10. For a typical TIA, the bandwidth is usually between 0.6B and 1.2B, where B is the bit rate [12]. The input signal voltage is 50 mVpp, and a 2^31 − 1 pseudo-random bit sequence (PRBS) is applied to the input at 10 Gb/s and 12.5 Gb/s. The sufficient bandwidth and gain flatness give the eye diagram a good opening at 10 Gb/s. The eye diagram at 12.5 Gb/s shows "double eyelids" because of waveform distortion, which might be caused by the multiple poles and zeros, large high-frequency noise, insufficient bandwidth, and large group delay fluctuations at high frequencies. Table I compares the performance with other works in similar processes and shows that the TIA achieves competitive performance with a large input capacitance. It should be noted, however, that the other works all use on-wafer test setups, whereas our high-frequency PCB test solution is closer to the actual application environment. In addition, for cost reasons, this work is fabricated in UMC 0.18 µm CMOS technology.
Conclusions
In this paper, the proposed TIA is designed and fabricated in UMC 0.18 µm CMOS technology and achieves competitive performance. The RGC configuration is analyzed and optimized: the cascode auxiliary amplifier and the L-matching network not only improve the RGC bandwidth significantly with maximally flat gain, but also improve its high-frequency noise performance. In addition, the novel S-D AMP not only provides a fully differential signal but also greatly enhances the gain of the circuit without degrading its noise and bandwidth performance, which improves immunity to common-mode noise and relaxes the design of the following LA. This work could provide guidance for further study of monolithic optical receivers.
