Abstract-Highly integrated electronic driver and receiver ICs with low power consumption are essential for the development of cost-effective multichannel fiber-optic transceivers with a small form factor. This paper presents the latest results of a two-channel 28 Gb/s driver array for optical duobinary modulation and a four-channel 25 Gb/s TIA array suited for both NRZ and optical duobinary detection. We demonstrate that 28 Gb/s duobinary signals can be efficiently generated on chip with a delay-and-add digital filter and that the driver power consumption can be significantly reduced by optimizing the drive impedance well above 50 Ω, without degrading the signal quality. To the best of our knowledge, this is the fastest modulator driver with on-chip duobinary encoding and precoding, consuming only 652 mW per channel at a differential output swing of 6 Vpp. The 4 × 25 Gb/s TIA shows a good sensitivity of −10.3 dBm average optical input power at 25 Gb/s for PRBS 2^31−1 and a low power consumption of 77 mW per channel. Both ICs were developed in a 130 nm SiGe BiCMOS process.
J. Verbrugghe, R. Vaernewyck, B. Moeneclaey, X. Yin, G. Torfs, X.-Z. Qiu and J. Bauwelinck are with INTEC/IMEC-iMinds, Ghent University, 9000 Gent, Belgium (e-mail: jochen.verbrugghe@intec.UGent.be; renato.vaernewyck@intec.UGent.be; bart.moeneclaey@intec.UGent.be; xin.yin@intec.UGent.be; guy.torfs@intec.UGent.be; xingzhi@intec.UGent.be; johan.bauwelinck@intec.UGent.be).
G. Maxwell is with Tyndall National Institute, Cork, U.K. (e-mail: graeme.maxwell@tyndall.ie).
R. Cronin is with CIP Technologies, Ipswich, IP5 3RE, U.K. (e-mail: richard.cronin@ciphotonics.com).
C. Lai and P. Townsend are with the Photonic Systems Group, Tyndall National Institute, University College Cork, Cork, U.K. (e-mail: caroline.lai@tyndall.ie; paul.townsend@tyndall.ie).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JLT.2014.2319112
opto-electronic subsystems. For cost-sensitive applications, coherent transmission and digital signal processing are not considered a practical solution. Instead, low-complexity modulation, multiplexing and signal processing are preferred as far as technology scaling economically allows. Applying multiple channels in parallel (e.g. fibers or wavelengths) is an effective approach to increase the system capacity, which can be further enhanced by applying higher line rates or higher-order modulation schemes that are still relatively easy to implement in electronics without ADCs and DSP (e.g. duobinary or 4-PAM). Integrating multiple transmitter and receiver blocks together, however, poses a number of challenges. As more circuits are combined on a single chip, power consumption and power distribution become much more critical, as the increased power consumption could lead to thermal problems or undesired voltage drops in the on-chip power supply nets. As such, reducing the power consumption of all electronic circuits is of major importance, especially when integrating high-swing modulator driver circuits together. We demonstrated this strategy for the first time in a 10 × 11 Gb/s EAM driver array consuming only 220 mW per channel [1]. As photonic components are often small compared to electronic ICs, especially in the case of photodiodes, size constraints on the array-integrated electronic channels lead to a number of challenges, e.g. impacting power supply decoupling, low-side cut-off frequencies or crosstalk. This research focused on the design of low-power driver and TIA circuits integrated into arrays. Delivering per-lane/wavelength line rates of 25-28 Gb/s enables low-cost 4 × 25 Gb/s approaches to provide 100 Gb/s connectivity in short-reach, inter-datacentre point-to-point links and metropolitan area networks.
As an alternative to the conventional non-return-to-zero (NRZ) on-off keying format, optical duobinary (ODB) modulation is a good choice for metro-scale applications, since it offers greater chromatic dispersion tolerance, whilst retaining the advantage of simple low-cost direct detection receivers [2]. The reported driver array IC incorporates duobinary precoding and encoding, while delivering a large differential output swing at very low power consumption [3]. To the best of our knowledge, such EAM driver arrays are not currently available on the market, whereas the TIA array offers performance similar to the state of the art, but with a small footprint. Section II presents the transmitter design, based on the duobinary driver array IC, and its experimental results, followed by a description of the receiver design in Section III. BER measurements are presented for both NRZ and duobinary modulation in a back-to-back configuration. Finally, Section IV gives the conclusions.
II. TRANSMITTER

A. EAM-Based Optical Duobinary Generation
Optical duobinary is a modulation format that is gaining interest in today's optical transmission. Thanks to its narrow optical spectrum, it is less sensitive to chromatic dispersion in long-haul single-mode links, but it is in fact suitable wherever the available bandwidth is limited, as the required bandwidth is about half that of NRZ. The electrical DB signal has three levels, denoted as −1, 0 and +1, which are translated into two optical intensities. The electrical 0 is transformed into a low optical intensity, whereas a high optical intensity is generated from both an electrical +1 and −1, but with a 180° optical phase difference [4]. In this way a conventional direct detection receiver is still viable in the optical link to detect the data signal. Current duobinary transmitters mostly employ Mach-Zehnder modulators (MZMs). On the other hand, an electroabsorption modulator (EAM) pair in an MZ or, in the case of reflective EAMs, in a Michelson configuration can also be utilized, having the advantage of, firstly, a small form factor, which makes them easier to integrate into an array [5], and secondly, a low modulation voltage, which reduces the power consumption.
The MZ (or Michelson) configuration operates as follows: the positive and negative (three-level) data outputs of the driver are fed to two separate EAMs, of which one is followed by a pi-phase shifter (π), as shown in Fig. 1. The electrical +1 and −1 levels guarantee an output with a large optical intensity, as in this case one of the two EAMs is transparent and thus turned on. The 180° phase difference is ensured by the pi-phase shifter. In the case of an electrical 0, both EAMs transmit a light signal with equal power; these signals add destructively due to the pi-phase shifter, giving a low optical intensity at the output.
The three-level duobinary signal is created with a delay-and-add filter, which gives the possibility to adjust the bit rate as desired. The delay is implemented by D flip-flops using a clock frequency equal to the bit rate. Generally the encoder is followed by a low-pass filter at half the bit rate. Here this functionality is achieved by the limited bandwidth of the driver stage. Due to the encoding, the received signal does not represent the original binary signal. This can be solved by a decoder at the receiver or a precoder at the transmitter. The decoder solution was not chosen for two reasons. Firstly, it can give rise to bit error propagation and secondly, an ambiguity arises due to the initial condition of the duobinary encoder. Therefore a precoder is used, implemented as a frequency divider. In [6] it is proven that the use of a precoder cancels the ambiguity caused by the initial condition.
Fig. 2 depicts the top-level block diagram of the driver IC. First the NRZ data signal is converted by the aforementioned duobinary precoder and encoder. A predriver block amplifies the three-level DB signal and drives the large capacitive input of the actual driver. The predriver is directly followed by the driver, which has a configurable modulation current I_MOD and two configurable bias currents I_BIAS for the positive and negative outputs. In Fig. 2 a pair of EAMs with 50 Ω input impedances is connected to the output through transmission lines.
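The precoder and delay-and-add encoder described above can be illustrated with a short behavioural sketch. It is not the on-chip implementation: the function names, the bit-level model, the toggle-type precoder rule p[k] = d[k] XOR p[k-1] (one common way to model a frequency-divider precoder) and the squared-level intensity model are all assumptions made for illustration. With this precoder polarity the detected intensity is the logical inverse of the data, so the sketch inverts it at the receiver.

```python
from typing import List


def precode(data: List[int], init: int = 0) -> List[int]:
    """Toggle-type precoder (assumed model): p[k] = d[k] XOR p[k-1]."""
    out = []
    prev = init
    for d in data:
        prev ^= d
        out.append(prev)
    return out


def delay_and_add(bits: List[int], init: int = 0) -> List[int]:
    """Delay-and-add encoder: map bits to -1/+1, add each symbol to the
    previous one and scale the sum to the three levels -1, 0, +1."""
    levels = []
    prev = 2 * init - 1
    for b in bits:
        cur = 2 * b - 1
        levels.append((cur + prev) // 2)
        prev = cur
    return levels


def detect(levels: List[int]) -> List[int]:
    """Direct detection sees intensity only: +1 and -1 both give a 'high',
    0 gives a 'low'. The inversion undoes the precoder polarity."""
    return [1 - lvl * lvl for lvl in levels]


def link(data: List[int], init: int) -> List[int]:
    """Precoder and encoder share the same (unknown) initial flip-flop state."""
    return detect(delay_and_add(precode(data, init), init))


data = [1, 0, 1, 1, 0, 0, 1]
print(link(data, init=0))
print(link(data, init=1))
```

In this model both initial states return the original bit sequence, and each received bit depends only on the current three-level symbol, which is why no error propagation occurs.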
To reduce the power consumption, different supply voltages are used to operate the different circuits with minimum headroom. The driver stage can be supplied from 4.8 V up to 6.6 V (Vcc2 in Fig. 2), whereas a standard low supply voltage of 2.5 V (Vcc1 in Fig. 2) is fed to all other building blocks. Fig. 3 shows the predriver and driver circuit. The predriver amplifies the duobinary signal to a level of typically 500 mVpp differentially and drives the input of the driver stage. Emitter degeneration resistor R_1 is used to linearize the predriver, as required by the three-level duobinary signal. The linearity is, however, programmable in order to adjust the position of the crosspoints, up to fully linear operation. The driver uses a cascode configuration to reduce both the driver output capacitance and the capacitive loading of the predriver output. The driver stage also makes use of emitter degeneration because of the duobinary signal, but it is not made fully linear in order to keep the power consumption low. The back termination resistors R_BT were chosen to be 
B. Duobinary Eye Diagrams
The transmitter has been fabricated in a 130 nm SiGe BiCMOS technology. Fig. 4 shows the die micrograph, with the data path running from bottom to top. The total chip area is 2200 μm × 1200 μm, determined by the number of I/O pads and the 400 μm pitch between the EAM outputs. This gives sufficient room for on-chip decoupling capacitance, which is over 1.2 nF for each supply.
The electrical duobinary eye diagram is shown in Fig. 5(a), measured at a data rate of 28 Gb/s by multiplexing four 2^31−1 pseudo-random bit sequences (PRBS). With a supply of 6.6 V, a swing of 6 Vpp was achieved, while both outputs were biased by the driver at a voltage of 1.5 V below Vcc2. The power consumption of the duobinary coding block was measured to be 127 mW/ch, while the driver consumption was only 525 mW/ch, of which 90 mW was consumed externally in the 50 Ω resistors. Per channel this gives a power consumption of only 652 mW.
A smaller differential swing of 3 Vpp is shown in Fig. 5(b). Due to the smaller swing, the corresponding modulation current is lower and the supply voltage can be reduced to 4.8 V, resulting in a reduction of the driver power consumption to 198 mW/ch, excluding the 127 mW for the duobinary coding. Thanks to the flexibility of the delay-and-add filter implementation, the transmission speed can go as low as 21 Gb/s, as is shown in Fig. 5(c).
Fig. 6. Optical duobinary eye diagrams at 25.3 Gb/s captured with a conventional PIN-PD receiver (440 μW/div, 6.5 ps/div): (a) generated by the presented driver chip and an MZM; (b) generated by a 7 GHz low-pass filter and EAMs in a Michelson configuration.
At the time of writing, the integrated duobinary transmitter with a Michelson EAM configuration was not yet operational. Results of the duobinary driver connected to a dual-drive Fujitsu FTM7937EZ MZM are shown in Fig. 6(a) at 25.3 Gb/s. The creation of ODB with an MZM is similar to that with EAMs in an MZ configuration, with the pi-phase shift generated by correct biasing. In comparison, the eye diagram depicted in Fig. 6(b) is achieved with the EAMs to be used in the assembly. It is generated by configuring the reflective EAMs in a Michelson structure and employing a 7 GHz low-pass Bessel filter to obtain the duobinary signal. Both eyes are clearly open.
When comparing the power consumption to other papers, it is important to consider both bit rate and output swing. To make the comparison clearer, a figure of merit is defined as the energy per bit divided by the output swing (lower is better). A comparison with state-of-the-art drivers with low energy consumption is given in Table I.
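As a worked example of this figure of merit, the sketch below evaluates it for the 6 Vpp, 28 Gb/s operating point using the per-channel power figures quoted above. The function names are illustrative, and which power contributions are counted in Table I is not restated here, so the choice to use the full 652 mW per channel is an assumption.

```python
def energy_per_bit_pj(power_mw: float, bit_rate_gbps: float) -> float:
    """Energy per bit in pJ/bit: 1 mW / (1 Gb/s) = 1 pJ/bit."""
    return power_mw / bit_rate_gbps


def figure_of_merit(power_mw: float, bit_rate_gbps: float, swing_vpp: float) -> float:
    """Energy per bit divided by differential output swing, in pJ/bit/V."""
    return energy_per_bit_pj(power_mw, bit_rate_gbps) / swing_vpp


coding_mw = 127.0   # duobinary coding block, per channel
driver_mw = 525.0   # driver, per channel (includes 90 mW dissipated externally)
total_mw = coding_mw + driver_mw

# Assumption: the FoM is evaluated on the total per-channel power.
fom_6vpp = figure_of_merit(total_mw, bit_rate_gbps=28.0, swing_vpp=6.0)

print(f"total power per channel: {total_mw:.0f} mW")
print(f"energy per bit: {energy_per_bit_pj(total_mw, 28.0):.1f} pJ/bit")
print(f"figure of merit: {fom_6vpp:.2f} pJ/bit/V")
```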
III. RECEIVER
A. Topology
The major circuits of the receiver assembly are represented in the block diagram shown in Fig. 7. Light from the fiber array is coupled to the 4-channel photo diode array. For each photo diode, both anode and cathode are bonded to the die in order to minimize loop inductance and susceptibility to interference. The channel pitch equals the photo diode array pitch of 250 μm. The photo diode responsivity (including fiber coupling loss) is 0.41 A/W, while its capacitance and series resistance are 115 fF and 10 Ω, respectively. Each channel consists of a transimpedance input stage, a single-ended to differential converter (S2D) stage, three low-gain, high-bandwidth programmable limiting stages and a 50 Ω output stage. A control loop removes the dc offset between the differential output signals by adjusting the dc voltage at the inverting terminal of the S2D stage, thus providing balanced differential output signals. The total small-signal differential gain is typically 69 dBΩ. The MON-block measures the average photo current and adaptively biases the transimpedance stage, increasing its current sinking capability for larger input photo currents. It also provides a photo current monitor output and supplies a filtered voltage of around 3.25 V to the photo diode cathode. The anode voltage is 900 mV. The MON-block can be disabled. The die core runs off a 2 V supply and draws 38.5 mA per channel.
Fig. 8 shows a circuit diagram of the transimpedance input stage. Its core is a conventional self-biased shunt-shunt feedback amplifier [11], providing a low-impedance input node. Common-emitter amplifier Q_0, R_C and emitter follower Q_2 make up the forward amplifier, while feedback resistor R_F and C_I represent the feedback path. C_I is the total capacitance at the input node and includes photo diode, bondpad and TIA input capacitance. Cascode Q_1 protects Q_0 from excessive collector-emitter voltage and reduces its Miller capacitance contribution to C_I. 
In addition, Q_1 provides a convenient low-impedance input for the current source I_b1, which provides extra bias current to Q_0. In order to reduce the base resistance noise of Q_0, the transistor is necessarily large, leading to increased base-emitter junction capacitance. This calls for a higher bias current in order to reduce the transit time through the base and improve the high-frequency response. Current sink I_b2 both biases Q_2 and sinks most of the photo current. Power consumption is reduced at low input power by adjusting I_b2 as the monitored dc photo current changes. Conventionally, the output is taken at the emitter of the follower Q_2, which is biased here around 800 mV. However, this would not leave any headroom for the tail current bias source in the S2D stage. Hence the output V_o is located at the collector of Q_1. Care must be taken to keep the capacitance at this node low, as the impedance level is somewhat higher than at the emitter of Q_2. The small-signal input-output transfer function can be approximated as
in which T equals the loop gain with low-frequency gain and poles
Input pole ω_1 is dominant. In order to leave sufficient phase margin and limit time-domain overshoot, care must be taken to place output pole ω_2 at or above the gain-bandwidth (GBW) product T_0 · ω_1. The natural frequency of the resulting two-pole closed-loop transfer function is the geometric mean of ω_1 and ω_2. In this design, some peaking is allowed to increase the bandwidth, and ω_2 is placed two times higher than GBW.
The input stage provides 50 dBΩ of gain and has a simulated post-layout bandwidth of 22 GHz.
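The effect of the pole placement can be explored numerically with a generic two-pole loop-gain model. The sketch below is an assumption-laden illustration, not the design's actual transfer function: it takes T(s) = T_0 / ((1 + s/ω_1)(1 + s/ω_2)), ignores all further poles and zeros, and uses an arbitrary T_0 = 100 with ω_1 normalised to 1. It reports the closed-loop quality factor and the loop phase margin for ω_2 placed at one and at two times GBW.

```python
import math


def closed_loop_q(t0: float, w1: float, w2: float) -> float:
    """Q of the denominator of T/(1+T) for the assumed two-pole loop gain.
    (1 + s/w1)(1 + s/w2) + t0 has natural frequency sqrt((1 + t0) w1 w2)."""
    return math.sqrt((1.0 + t0) * w1 * w2) / (w1 + w2)


def loop_gain_mag(w: float, t0: float, w1: float, w2: float) -> float:
    """|T(jw)| for the assumed two-pole loop gain."""
    return t0 / (math.hypot(1.0, w / w1) * math.hypot(1.0, w / w2))


def phase_margin_deg(t0: float, w1: float, w2: float) -> float:
    """Phase margin at the unity-gain crossover, found by bisection."""
    if t0 <= 2.0:
        raise ValueError("model assumes a loop gain well above unity")
    lo, hi = w1, 10.0 * t0 * max(w1, w2)
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # bisect on a logarithmic frequency axis
        if loop_gain_mag(mid, t0, w1, w2) > 1.0:
            lo = mid
        else:
            hi = mid
    wc = math.sqrt(lo * hi)
    return 180.0 - math.degrees(math.atan(wc / w1) + math.atan(wc / w2))


T0, W1 = 100.0, 1.0          # illustrative values, not taken from the design
GBW = T0 * W1
for k in (1.0, 2.0):         # omega_2 expressed as a multiple of GBW
    w2 = k * GBW
    print(f"w2 = {k:.0f} x GBW: Q = {closed_loop_q(T0, W1, w2):.2f}, "
          f"phase margin = {phase_margin_deg(T0, W1, w2):.1f} deg")
```

In this idealised model, moving ω_2 from GBW to twice GBW lowers Q and raises the phase margin; the real circuit contains additional parasitic poles, so these values are only indicative.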
The topology of the S2D stage and main amplifier stages is depicted in Fig. 9. It constitutes a variation of a Cherry-Hooper stage [12], in which a transimpedance amplifier Q_2, R_f, R_C is used as the load of a transconductor Q_1. Usually, simple differential pairs with emitter follower buffers are used in high-speed amplifiers. However, the available supply voltage is too low to allow for an extra base-emitter drop while keeping the tail current sources of the subsequent stages in the active region. Moreover, the area requirements of on-chip capacitors preclude the use of ac coupling. As is the case in an emitter follower, the negative feedback in the transimpedance amplifier load reduces the output impedance, effectively decoupling the next stage from the former. In addition, it also presents a low input impedance to the transconductance stage. Bias sources I_e source a part of Q_1's tail current I_b1, limiting the current through R_C. This avoids saturating Q_2 and keeps the dc level of the output nodes high enough to allow for dc coupling. In the three main amplifier stages, fixed resistive and capacitive emitter degeneration (R_E and C_E) is employed to create peaking in the input-output transfer function, hence increasing bandwidth. Furthermore, NMOS transistor M_1 acts as a variable resistance that changes the low-frequency degeneration. It is used to program the gain of the stages. Each main amplifier stage has a simulated post-layout bandwidth of over 35 GHz with programmable gain between approximately 2 and 4, resulting in a total main amplifier gain range of 18 to 36 dB. As indicated in Fig. 9, the degeneration is not used in the S2D stage. The output signal of the TIA stage is applied to V_ip, while V_im is driven to the same average voltage by the error amplifier of the balancing loop. The data signal is single-ended and travels through both an inverting and a noninverting path to V_om and V_op, respectively. 
It can be shown that in the case of an ideal bias tail current source I_b1, zero C_X and a perfectly symmetrical circuit, both outputs are equal in magnitude but opposite in sign [13]. However, as the capacitive reactance to ground at node X decreases at high frequencies, part of the signal current is diverted to ground, resulting in lower gain and excess phase shift in the V_op path. This effect is more pronounced when emitter degeneration is included. In order to make the transfer functions from the input to both outputs more equal at low and high frequencies, care has been taken to make the output capacitance of I_b1 small, and degeneration is avoided. Although this results in an earlier onset of clipping, this poses no problem in this application. The S2D stage has a fixed single-ended to differential gain of 3 dB and a simulated post-layout bandwidth of over 35 GHz. The output driver is a conventional cascoded differential pair with 50 Ω load resistors. Total simulated input-referred noise is 2.5 μA rms.
Fig. 10. Micrograph of the fabricated die.
Fig. 11. Photo of the assembled fibers, photo diode array and receiver die.
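Two of the figures quoted in this subsection can be cross-checked with simple arithmetic: the main amplifier gain range follows from cascading three stages with a voltage gain between about 2 and 4 each, and the per-channel power follows from the 2 V core supply and the 38.5 mA channel current. The short sketch below does this; the variable names are illustrative only.

```python
import math


def to_db(voltage_gain: float) -> float:
    """Convert a linear voltage gain to decibels."""
    return 20.0 * math.log10(voltage_gain)


N_STAGES = 3                      # main amplifier stages
GAIN_MIN, GAIN_MAX = 2.0, 4.0     # approximate programmable gain per stage

main_amp_min_db = to_db(GAIN_MIN ** N_STAGES)
main_amp_max_db = to_db(GAIN_MAX ** N_STAGES)

SUPPLY_V = 2.0                    # die core supply
CURRENT_MA = 38.5                 # supply current per channel
power_mw = SUPPLY_V * CURRENT_MA

print(f"main amplifier gain range: {main_amp_min_db:.1f} to {main_amp_max_db:.1f} dB")
print(f"power per channel: {power_mw:.0f} mW")
```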
B. Experimental Results
The receiver has been fabricated in a 130 nm SiGe BiCMOS technology. Fig. 10 shows the die micrograph. Each channel core occupies 250 μm × 800 μm (including bond pads), with an additional 210 μm × 310 μm for each balancing error amplifier, located at the edges of the chip. The total die size measures 2400 μm × 800 μm. The channel pitch is 250 μm, equal to the photo diode array pitch. Channels 1 and 4 are in the outer lanes, while channels 2 and 3 occupy the middle ones. Furthermore, channels 3 and 4 have extra test circuitry at their outputs. It follows that, even though all channel cores are identical, the topological and electrical differences will impact the respective channel performances. In order to mitigate performance degradation due to crosstalk, both V_DD and V_SS are electrically isolated for each channel. Other measures include careful use of deep trenches and guard rings.
The assembly with fiber array, photo diode array and receiver die is depicted in Fig. 11 . The die is placed in a cavity in order to reduce bond wire inductance, in particular of the ground bond wires. Transmission lines fan out radially from the chip to SMP connectors. An aluminium structure holds the fibers in place.
Measured differential output eye diagrams are shown in Fig. 12 for all channels with the MON-block disabled. Input data is PRBS 2^31−1 NRZ at 28 Gb/s, with extinction ratio and rms jitter of 14 dB and 950 fs, respectively. Average input photo current is 100 μA. Equipment used is an Agilent 86117A sampling amplifier with an Agilent 86107A precision timebase module to reduce oscilloscope jitter. All eyes are clearly open. The differential output amplitude is 400 mV peak-to-peak, while rms jitter amounts to 1.8 ps for channels 2-4. Channel 1, however, is somewhat noisier. The reason is a lower amount of power supply decoupling capacitance on the die for channel 1, as some of its area has been sacrificed for on-chip test structures. The eye diagrams suggest a high signal-to-noise ratio for channel 4, followed by channel 2, with channel 3 marginally worse. The measured small-signal −3 dB bandwidth (using an Agilent lightwave analyzer) is around 14 GHz. This is lower than expected. A possible cause is the long input bond wires, and we expect higher bandwidth with an improved assembly. It should be noted that the eye diagrams and all further measurements represent the combined performance of the photo diodes and receiver, including coupling loss and loss due to on-board transmission lines and connectors. Unfortunately, after the eye diagram measurements, the fiber and photo diode of channel 4 became misaligned, reducing the optical coupling to virtually nothing. As a fix would have required dismantling the assembly, further measurements on channel 4 were not pursued.
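To relate the average photo current used in these measurements to average optical power, the responsivity of 0.41 A/W (including fiber coupling loss) stated in Section III-A can be applied. The sketch below performs the conversion in both directions; the helper names are hypothetical and the relation assumes a linear photo diode response.

```python
import math

RESPONSIVITY_A_PER_W = 0.41   # includes fiber coupling loss


def photocurrent_to_dbm(current_ua: float) -> float:
    """Average optical power in dBm that produces the given photo current."""
    power_mw = current_ua / RESPONSIVITY_A_PER_W / 1000.0  # uA -> uW -> mW
    return 10.0 * math.log10(power_mw)


def dbm_to_photocurrent_ua(power_dbm: float) -> float:
    """Average photo current in uA for a given optical power in dBm."""
    power_uw = 10.0 ** (power_dbm / 10.0) * 1000.0          # dBm -> mW -> uW
    return power_uw * RESPONSIVITY_A_PER_W


eye_input_dbm = photocurrent_to_dbm(100.0)         # eye diagram measurement
sensitivity_ua = dbm_to_photocurrent_ua(-10.3)     # quoted PRBS 2^31-1 sensitivity

print(f"100 uA corresponds to {eye_input_dbm:.1f} dBm")
print(f"-10.3 dBm corresponds to {sensitivity_ua:.1f} uA")
```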
The bit error rate (BER) has been measured with an SHF 12100B pattern generator and an SHF 11100B error analyzer, using an off-the-shelf transmitter at 25 Gb/s, in line with the available transmitter and receiver bandwidth. The extinction ratio and rms jitter of the optical input signal were 14 dB and 950 fs, respectively. Fig. 13 shows the BER curves of channels 1-3 for an NRZ PRBS 2^7−1 input signal at 25 Gb/s. Channels 2 and 3 show a sensitivity of −11 dBm and −10.9 dBm at a BER of 10^−12, respectively. In line with expectations, channel 1 incurs an extra penalty of 1 dB due to lower power supply decoupling. Fig. 14 shows the BER for an NRZ PRBS 2^31−1 data pattern at a 25 Gb/s data rate. Compared to PRBS 2^7−1, channels 2 and 3 incur a power penalty of 0.7 dB, while the penalty for channel 1 is 2.4 dB. Also measured, but not shown, is the BER for a PRBS 2^15−1 pattern. The results are similar to those for PRBS 2^31−1. The penalty for longer pattern lengths is caused by a low-frequency high-pass pole in the data path, introduced by the balancing control loop. This acts similarly to ac coupling and hence causes droop [11]. The low cut-off pole frequency is Table II for channel 2 attacked by channel 3 and vice versa, for a 25 Gb/s PRBS 2^31−1 data pattern. The BER performance degradation is shown for an aggressor input power 5 dB and 8 dB higher than the sensitivity of the victim channel. In spite of the small channel pitch of 250 μm, a penalty of only 0.5 dB is observed for the +5 dB attacker. The major contributor to the crosstalk is inductive coupling between the bond wires of the adjacent channels, as the die substrate is high-ohmic and various on-chip isolation measures have been taken (separated supply rails and isolating trenches). This has been confirmed by 3-D field simulations in CST Studio. The extra penalty for the +8 dB attacker is a mere 0.1 dB. 
This can be explained by recognizing that, even though there is a twofold difference in input powers, in both cases the back-end stages of the attacking channel are limiting. Hence, the extra degradation must be caused by the front-end stages, in which the signals are smaller to begin with. In addition to NRZ, an optical duobinary back-to-back link has been measured, with the transmitter described in Section II. The photo diode acts as an intensity detector. The extinction ratio and peak-to-peak jitter were 9 dB and 13 ps, respectively. Fig. 15 depicts the performance of channels 2 and 3 for 25 Gb/s PRBS 2^7−1 and PRBS 2^31−1. For PRBS 2^7−1, the sensitivity is −10 dBm and −9.5 dBm for channels 2 and 3, respectively, indicating a power penalty of 1 dB to 1.4 dB as compared to NRZ signaling. An extra penalty of 2 dB is measured for PRBS 2^31−1. The degradation, particularly pronounced for the longer data patterns, is caused by the V-shaped eye opening typical for duobinary encoding and imperfect timing in the transmitter, in combination with limited total system bandwidth.
A comparison with the state of the art is given in Table III. For comparable performance with previous receivers, this work obtains tighter integration using a standard 250 μm channel pitch.
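The sensitivity and penalty figures reported for channels 2 and 3 can be tabulated to show how they relate to one another. The sketch below uses only the numbers quoted in this section; the dictionary layout and names are illustrative, and rounding to 0.1 dB is an assumption that matches the precision of the reported values.

```python
# Measured sensitivities at BER 1e-12, 25 Gb/s, PRBS 2^7-1 (dBm)
nrz_prbs7 = {"ch2": -11.0, "ch3": -10.9}
odb_prbs7 = {"ch2": -10.0, "ch3": -9.5}

# Reported extra penalties for the PRBS 2^31-1 pattern (dB)
NRZ_LONG_PATTERN_PENALTY_DB = 0.7
ODB_LONG_PATTERN_PENALTY_DB = 2.0

# Penalty of optical duobinary relative to NRZ for the short pattern
odb_vs_nrz = {ch: round(odb_prbs7[ch] - nrz_prbs7[ch], 1) for ch in nrz_prbs7}

# Sensitivities implied for the long pattern
nrz_prbs31 = {ch: round(s + NRZ_LONG_PATTERN_PENALTY_DB, 1) for ch, s in nrz_prbs7.items()}
odb_prbs31 = {ch: round(s + ODB_LONG_PATTERN_PENALTY_DB, 1) for ch, s in odb_prbs7.items()}

# Crosstalk penalty: 0.5 dB for the +5 dB aggressor plus 0.1 dB extra at +8 dB
crosstalk_8db = round(0.5 + 0.1, 1)

print("ODB vs NRZ penalty (dB):", odb_vs_nrz)
print("NRZ PRBS 2^31-1 sensitivity (dBm):", nrz_prbs31)
print("ODB PRBS 2^31-1 sensitivity (dBm):", odb_prbs31)
print("crosstalk penalty, +8 dB aggressor (dB):", crosstalk_8db)
```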
IV. CONCLUSION
With ever-increasing bandwidth demands, compact, low-power and low-cost multi-channel transmitter and receiver modules are an absolute necessity to scale up performance. This work focused on the development of a two-channel duobinary transmitter and a four-channel PIN-TIA receiver array, suited for both NRZ and optical duobinary reception. This research showed that 28 Gb/s duobinary signals can be efficiently generated on chip with a delay-and-add digital filter and that the driver power consumption can be significantly reduced by optimizing the drive impedance well above 50 Ω, without degrading the signal quality. To the best of our knowledge, this is the fastest modulator driver with on-chip duobinary encoding and precoding, consuming only 652 mW per channel at a differential output swing of 6 Vpp. The 4 × 25 Gb/s TIA shows a good sensitivity of −10.3 dBm at 25 Gb/s for PRBS 2^31−1 NRZ at a low power consumption of 77 mW per channel, while providing a transimpedance gain of 69 dBΩ. The channel pitch is 250 μm. The power penalty due to crosstalk is 0.6 dB for an adjacent aggressor with an input power 8 dB above the victim's sensitivity. BER measurements using the developed transmitter and receiver in a back-to-back link showed a sensitivity of −7.5 to −8 dBm for PRBS 2^31−1 optical duobinary. The ICs were developed in a 130 nm SiGe BiCMOS process and the respective subsystems are being evaluated in a metro system test bed at the time of writing.
