An integrated 2x28 Gb/s dual-channel duobinary driver IC is presented. Each channel has integrated coding blocks, transforming a non-return-to-zero input signal into a 3-level electrical duobinary signal to achieve an optical duobinary modulation. To the best of our knowledge this is the fastest modulator driver including on-chip duobinary encoding and precoding. Moreover, it only consumes 652 mW per channel at a differential output swing of 6 Vpp.
Introduction
Data centers are becoming one of the major consumers of electricity in the industrialized world, with about 1.3% of the world's electricity attributed to them [1]. Even though the growth rate slowed down from a doubling between 2000 and 2005 to an increase of about 50% in the following 5 years, the upward trend shows no sign of stopping. The increase in rack density in less developed markets will sustain this rising tendency [1], [2]. Less than 50% of the electricity is actually used for computing and communication, while the rest is overhead caused by cooling systems, power distribution, etc. [3]. As a consequence, lowering the power dissipated in communication will have a cascading effect on the total power consumption of data centers and will reduce the cost of expensive cooling and power supplies.
As data centers often suffer from fiber scarcity or do not own their fiber infrastructure, dense wavelength division multiplexing (DWDM) technologies are essential to deliver reach and capacity extension for inter-office (point-to-point) interconnects in data center environments. The optical duobinary (ODB) modulation format has a narrow optical spectrum, which significantly improves chromatic dispersion tolerance, facilitating operation on a 50 GHz and even on a 25 GHz grid.
In this paper, a low-power array of two 28 Gb/s duobinary modulator drivers is presented for short-reach inter-datacenter point-to-point links [4]. Two such driver arrays can be combined to realize a throughput of 112 Gb/s. Each channel of the driver array incorporates a duobinary encoder, transforming a non-return-to-zero (NRZ) input into a 3-level electrical duobinary signal. The driver array was designed in a 0.13 µm SiGe BiCMOS technology to drive a pair of electroabsorption modulators (EAMs), each in parallel with a 50 Ω termination resistor. While most papers report Mach-Zehnder modulator (MZM) usage for ODB [5], [6], EAMs were chosen in this work. The individual drivers can be operated from 20 to 28 Gb/s with a configurable differential output voltage swing between 3 and 6 Vpp. The bias voltage of the modulators can be configured over a range of 650 mV for every output swing. At a maximum voltage swing of 6 Vpp and a reverse EAM bias of 1.5 V, the total power consumption is just over 1.3 W, which corresponds to 652 mW per channel using a supply voltage of 6.6 V. As the duobinary encoding dissipates 127 mW per channel, this brings the consumption of the actual driver to 525 mW and its energy below 20 pJ per bit, which is very low in comparison with other work (see Section 5). To the best of our knowledge this is the fastest modulator driver including on-chip duobinary encoding and precoding reported so far. In [7] the duobinary coding is done on-chip as well, but at 5 Gb/s and with a lower output voltage swing.
The remainder of this work is organized as follows. The duobinary encoding is discussed in Section 2. The circuit techniques involved in the design of the driver are described in Section 3. Experimental results are presented in Section 4. Finally, the power consumption is discussed in Section 5, followed by the conclusions.
Duobinary Encoding

Optical Duobinary Signal Characteristics
ODB is a modulation format that is gaining interest in today's optical transmission [8]. Due to its narrow optical spectrum, it is less sensitive to chromatic dispersion in long-haul single-mode links, but it is in fact suitable wherever the available bandwidth is limited, as the required bandwidth is about half that of NRZ. The electrical duobinary (DB) signal has three levels, denoted as -1, 0 and +1, which are translated into two optical intensities. The electrical 0 is transformed into a low intensity, whereas a high optical intensity is generated from both an electrical +1 and -1, but with a 180° optical phase difference. In this way a conven-
The MZ (or Michelson) configuration operates as follows: the positive and negative (three-level) data outputs of the driver are fed to two separate EAMs, of which one is followed by a pi-phase shifter (π), as shown in Fig. 1. The electrical +1 and -1 levels guarantee an output with a large optical intensity, as in this case one of the two EAMs is transparent and thus turned on. The 180° phase difference is ensured by the pi-phase shifter. In case of an electrical 0, both EAMs transmit a light signal with equal power; these signals add destructively due to the pi-phase shifter, giving a low optical intensity at the output.
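To make the three-level format concrete, the following Python sketch generates a duobinary sequence and maps it to optical intensity. It is an illustration under stated assumptions, not the circuit of this work: the precoder is modelled as the textbook differential rule p[k] = d[k] XOR p[k-1], the encoder as an ideal delay-and-add on bipolar symbols (both are discussed under Electrical Coding), and the two EAMs as ideal field transmissions of 1, 0.5 or 0. All function names are hypothetical.

```python
from typing import List, Tuple


def precode(bits: List[int], init: int = 0) -> List[int]:
    """Assumed differential precoder: p[k] = d[k] XOR p[k-1], p[-1] = init."""
    out = []
    prev = init
    for d in bits:
        prev = d ^ prev
        out.append(prev)
    return out


def delay_and_add(precoded: List[int], init: int = 0) -> List[int]:
    """Ideal delay-and-add encoder; returns levels in {-1, 0, +1}."""
    levels = []
    prev = init
    for p in precoded:
        # Map 0/1 to -1/+1, add the current and the one-bit-delayed symbol,
        # then halve so the result is -1, 0 or +1.
        levels.append(((2 * p - 1) + (2 * prev - 1)) // 2)
        prev = p
    return levels


def optical_field(level: int) -> float:
    """Idealized two-EAM combiner; the arm behind the pi shifter subtracts."""
    # Assumed field transmissions (positive arm, negative arm) per level.
    arms = {1: (1.0, 0.0), 0: (0.5, 0.5), -1: (0.0, 1.0)}
    t_pos, t_neg = arms[level]
    return t_pos - t_neg


def transmit(bits: List[int], init: int = 0) -> Tuple[List[int], List[int]]:
    """Return the electrical levels and the bits seen by an intensity detector."""
    levels = delay_and_add(precode(bits, init), init)
    intensity = [optical_field(level) ** 2 for level in levels]
    # With the assumed precoder polarity, a dark symbol carries a data 1.
    received = [0 if i > 0.5 else 1 for i in intensity]
    return levels, received


if __name__ == "__main__":
    data = [1, 0, 1, 1, 0, 0, 1, 0]
    for start in (0, 1):
        lv, rx = transmit(data, start)
        print(start, lv, rx)
```

In this model the detected bits equal the input data for either initial state of the precoder, while the sign of the ±1 levels (and thus the optical phase) flips with the initial state. The inverted intensity polarity follows from the assumed precoder rule and could be removed by inverting the data, which costs nothing in a differential circuit.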
Electrical Coding
The three-level DB signal can be created in two ways: with an analog Bessel low pass filter (LPF) or with a delay-and-add filter [10]. In this design a delay-and-add filter was chosen, because this approach allows the bit rate to be adjusted as desired, whereas the LPF approach only gives good results at one particular data rate defined by its 3 dB cutoff frequency. The delay is implemented by two D-latches using a clock frequency equal to the bit rate, as depicted in a simplified representation in Fig. 2. The designed DB encoder uses buffer structures B1 and B2 following both D-latches to improve the signal quality. The adder is also followed by a buffer, B3, which is linear in order not to deform the three-level DB signal. Both inputs of the adder need to be synchronized with the clock (CLK) to ensure an exact delay of one bit period. Because the outputs of the latches are synchronized, the encoder is preceded by a D-latch (not shown in Fig. 2) and a buffer, B0. Not including B0 would cause an additional delay between the inputs of the adder, which should be one bit period. To shape the signal, the encoder is generally followed by an LPF at half the bit rate. Here this functionality is achieved by the limited bandwidth of
Due to the encoding, the received signal does not represent the original binary signal. This can be solved by a decoder at the receiver or a precoder at the transmitter. The precoder solution was chosen over the decoder solution since the decoder has two drawbacks. Firstly, it can cause catastrophic bit error propagation, and secondly, an ambiguity arises due to the initial condition of the DB encoder. As illustrated in Fig. 3, a frequency divider can perform the precoding while using the same clock as the D-latches. In [7] it is proven that the ambiguity due to the initial condition is cancelled by using a precoder. Even though the precoder in [7] is implemented differently, the functionality is the same.
Circuit Design
Top Level Block Diagram
Fig. 4 depicts the top level block diagram of the driver IC. The data input and clock input are both differentially matched to 100 Ω. The NRZ data signal is converted by the aforementioned DB precoder and encoder. A predriver block amplifies the input signal and drives the large capacitive input of the actual driver. The predriver is directly followed by the driver, which has a configurable modulation current and two configurable bias currents for both positive and negative outputs.
The modulation current and both bias currents of each channel are separately programmable with 4-bit resolution using a serial peripheral interface (SPI). The bias current setting controls the reverse EAM DC bias voltage with an accuracy better than 30 mV, to optimize the EAM settings according to the transmitting wavelength. For testing purposes both channels can be powered down independently by nullifying all tail currents.
To reduce the power consumption, several techniques are implemented in the driver array. First of all, different supply voltages are used to operate the different circuits with minimum headroom. A standard low supply voltage of 2.5 V (see Fig. 4) is fed to the precoder and encoder, the predriver and all other building blocks outside the data path, whereas the driver stage can be supplied from 4.8 V up to 6.6 V (Vcc2 in Fig. 4). The use of the 2.5 V supply in the predriver also has the advantage that no additional level shifters are needed at the output of the predriver to correctly set the output level.
Predriver
The predriver buffers the DB signal towards the last stage, the actual driver. Because the driver stage switches large currents, the transistors of this stage are large. Therefore, this last gain stage will have a high input capacitance, on the order of 1 pF. The predriver amplifies the DB signal to a level of typically 500 mVpp differentially and drives the input of the driver stage. The circuit is shown in Fig. 5 and consists of a differential pair, $Q_p$, followed by a pair of emitter followers, $Q_f$. Emitter degeneration, $R_1$, is used to linearize the predriver. This is necessary because of the 3-level duobinary signal, to ensure the crosspoints of the upper and lower eyes lie in the middle of the adjacent levels. The linearity is however programmable to adjust the position of the crosspoints.
The emitter followers serve as level shifters and impedance transformers. Typically a cascade of emitter followers is used to ensure a low drive impedance [11]. This has the disadvantage that both the supply voltage and the current consumption are high. For that reason only one follower is used in this work, which gives a satisfactorily low drive impedance, below 10 Ω. The current sources of the followers are used in a cascode configuration. The gates of these cascode transistors ($M_1$) are cross-coupled for peaking. The principle of this operation lies in the lower drive strength required by a bipolar device when turning on compared to turning off [12]. This can be understood by looking at the base currents of the driver transistors. When a bipolar device ($Q_1$ in Fig. 6) is turned off, the current surging out of the base will subtract from the emitter current of the preceding emitter follower, $Q_f$. To compensate for this current reduction, the tail current of the follower $Q_f$ should increase at every falling edge. This is done by the cross coupling, since the gate of the MOSFET that sinks the tail current of a falling edge receives an extra forward bias from the rising edge at the complementary predriver output.
Driver
The driver cascode configuration, as shown in Fig. 6, has several advantages. It reduces the capacitive loading of the predriver output, as the cascode input does not suffer from the Miller effect. It also reduces the driver output capacitance. The emitter degeneration, $R_E$, is again used to improve the linearity.
The back termination resistors, $R_{BT}$, were chosen to be 250 Ω. Compared with the typical 50 Ω back termination resistors, this gives a power reduction of 35% for the driver. This is possible when the driver output is DC-coupled to the EAM. With an AC-coupled output, a higher supply voltage would be required to satisfy the headroom requirement. The average current through the coupling capacitor, and thus through the 50 Ω load in parallel with the EAM, would be zero. As a consequence, $R_{BT}$ would be sourcing a current that is sunk into the load when the output is high. With a single-ended output swing of $V_{sw}$, this current is equal to $V_{sw}/(2 \cdot 50\,\Omega)$. When the output is low, $R_{BT}$ would be sourcing a current that is greater to achieve the desired swing. This means the headroom of the differential pair is decreased by $R_{BT} \cdot V_{sw}/(2 \cdot 50\,\Omega)$. With an AC-coupled output, increasing $R_{BT}$ would decrease the modulation current needed to maintain a certain output swing, but it would dramatically increase the supply voltage needed to satisfy the headroom requirement. With a DC-coupled output, increasing $R_{BT}$ also decreases the modulation current, but it has no effect on the supply voltage.
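The trade-off described above can be checked with a few lines of arithmetic. The Python sketch below is a first-order estimate under stated assumptions: the EAM is treated as an open circuit at DC, the DC-coupled output node sees the back termination in parallel with the 50 Ω load, and the single-ended swing is taken as half the 6 Vpp differential swing. The function names are hypothetical.

```python
def parallel(r1: float, r2: float) -> float:
    """Equivalent resistance of two resistors in parallel (ohm)."""
    return r1 * r2 / (r1 + r2)


def modulation_current(v_sw: float, r_bt: float, r_load: float = 50.0) -> float:
    """Current (A) needed for a single-ended swing v_sw with a DC-coupled load."""
    return v_sw / parallel(r_bt, r_load)


def ac_headroom_penalty(v_sw: float, r_bt: float, r_load: float = 50.0) -> float:
    """Headroom lost (V) with an AC-coupled output: R_BT * V_sw / (2 * R_load)."""
    return r_bt * v_sw / (2.0 * r_load)


if __name__ == "__main__":
    v_sw = 3.0  # assumed single-ended swing for 6 Vpp differential
    for r_bt in (50.0, 250.0):
        i_mod = modulation_current(v_sw, r_bt)
        penalty = ac_headroom_penalty(v_sw, r_bt)
        print(f"R_BT = {r_bt:5.0f} ohm: I_mod = {1e3 * i_mod:6.1f} mA, "
              f"AC-coupled headroom penalty = {penalty:4.1f} V")
```

Under these assumptions the modulation current drops from 120 mA to 72 mA when the back termination goes from 50 Ω to 250 Ω, while the AC-coupled headroom penalty would rise from 1.5 V to 7.5 V. The estimate covers the modulation current only, so it is not expected to reproduce the 35% driver power reduction exactly.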
Measurements
Fig. 7 shows the die micrograph. The SPI register is shown on the lower left side, with the data path running from bottom to top. The total chip area is 2.2 × 1.2 mm², determined by the number of I/O pads and the 400 µm pitch between the EAM outputs. This gives sufficient room for on-chip decoupling capacitance, which is over 1.2 nF for each supply.
The measurement setup is shown in Fig. 8. At the input a 4:1 multiplexer was used to generate a 28 Gb/s NRZ signal. The multiplexer used a clock signal at half the bit rate (14 GHz), while the duobinary coding blocks need a clock at the full bit rate (28 GHz). For this reason a clock doubler was used, followed by a rat-race coupler to transform the single-ended clock into a differential signal. At the output a bias-tee was placed to emulate the same DC operating conditions as when the driver is connected to an EAM (with the cathode to Vcc2) shunted with a 50 Ω termination resistor. This was necessary because the load is a 50 Ω oscilloscope (DC-coupled to ground), while the EAM load would be connected to Vcc2. 20 dB attenuators were added to avoid overloading the high-speed oscilloscope. In this measurement the output was measured differentially, but for simplicity a single-ended output structure is shown in Fig. 8.
The duobinary eye diagram is shown in Fig. 9(a), measured at a data rate of 28 Gb/s by multiplexing four 2^31-1 pseudo random bit sequences (PRBS). With a supply of 6.6 V, a swing of 6 Vpp was reached, leading to a gain of 20 dB since the driver input is 600 mVpp. Both outputs were biased by the driver at a voltage of 1.5 V below Vcc2. The power consumption of the duobinary coding block was measured to be 127 mW/ch, while the driver consumption was only 525 mW/ch, of which 90 mW was consumed externally in the 50 Ω resistors. Per channel this gives a power consumption of 652 mW.
A smaller differential swing of 3 Vpp is shown in Fig. 9(b). Due to the smaller swing, the corresponding modulation current is lower and the supply voltage can be reduced to 4.8 V, resulting in a reduction of the driver power consumption to 198 mW/ch, excluding the 127 mW for the duobinary coding.
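The figures quoted above can be cross-checked with a short Python sketch. It only restates the measured values from the text; the helper names are hypothetical and the per-channel totals are simple sums of the coding and driver contributions.

```python
import math


def voltage_gain_db(v_out_pp: float, v_in_pp: float) -> float:
    """Voltage gain in dB from peak-to-peak output and input swings."""
    return 20.0 * math.log10(v_out_pp / v_in_pp)


def channel_power_mw(coding_mw: float, driver_mw: float) -> float:
    """Total power per channel (mW): duobinary coding plus driver."""
    return coding_mw + driver_mw


if __name__ == "__main__":
    # 6 Vpp differential output from a 600 mVpp input.
    print(f"gain: {voltage_gain_db(6.0, 0.6):.1f} dB")
    # Measured contributions at 6 Vpp (6.6 V supply) and 3 Vpp (4.8 V supply).
    p_6v = channel_power_mw(127.0, 525.0)
    p_3v = channel_power_mw(127.0, 198.0)
    print(f"per channel at 6 Vpp: {p_6v:.0f} mW, array: {2 * p_6v:.0f} mW")
    print(f"per channel at 3 Vpp: {p_3v:.0f} mW")
```

The sum for two channels at 6 Vpp is 1304 mW, consistent with the total of just over 1.3 W stated in the introduction.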
Due to the implementation of the delay-and-add filter, the transmission speed can go as low as 21 Gb/s, as is shown in Fig. 9(c). For lower speeds the delay of the precoder becomes too small with respect to the bit period. This causes the clock and data to be too far out of sync to be compensated by the single D-latch that precedes the duobinary encoder. Adding an extra D-latch to the cascade would resolve this problem, but would increase the power consumption.
As mentioned in Section 3.2, the linearity of the predriver can be controlled. For an output voltage of 4 Vpp this can shift the crosspoint by 228 mV, which is more than 11% (Fig. 10). This feature can optimize the receiver sensitivity for optical duobinary according to the distance that is traversed. For back-to-back and short-reach transmission the crosspoint needs to be in the middle of the adjacent levels for optimal eye quality, while for long-reach transmission the eye quality will improve when the crosspoint is shifted towards the outer levels. Note that the adjustment does not work for output swings larger than 5 Vpp, as the predriver output needs to be as large as possible to switch all of the current in the driver stage. For this the predriver needs a minimal degeneration.
Discussion
When comparing the power consumption to other papers it is important to keep in mind both bit rate and output swing. The power consumption divided by the bit rate, which is equal to the energy per bit, is a good measure for a comparison. At typical operation (6 Vpp output swing) the energy is 23.28 pJ/b including the consumption of the duobinary coding blocks, and 18.75 pJ/b without. A plot of the energy per bit as a function of the differential output swing for different modulator driver circuits is shown in Fig. 11. A comparison of the state of the art with low energy consumption is given in Table 1.
Fig. 11 Energy per bit and output swing comparison (legend: [14], [15], [16], [17], [18]; this work, measured points: driver with ODB coding, driver w/o ODB coding)
To make the comparison clearer, a figure of merit (FoM) is defined as the energy per bit divided by the output swing. The resulting FoM for the referred papers and this work, with and without DB coding, can be seen in Table 1. Lines of equal FoM are also drawn in Fig. 11. In Fig. 11, the black line represents the measured consumption of the complete modulator driver, including the duobinary precoder and encoder. The lower dotted black line only considers the driver (without DB coding), as this gives a better comparison with other papers, where NRZ is utilized and therefore no coding blocks are present. As the most desirable region is at the bottom right (high swing and low energy per bit), it can be seen that this work gives the best performance, both with and without the coding blocks. The performance of [18] is the closest to this work; however, the distributed amplifier used there, with a single-ended output, occupies a total area of 6.7 mm², which is almost 4 times larger than the proposed array and thus far too large to be incorporated into an array.
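The energy per bit and the FoM defined above follow directly from the measured power, bit rate and swing. The Python sketch below evaluates both for the 6 Vpp operating point of this work; the function names are hypothetical, and the values for the other designs in Table 1 are not reproduced here.

```python
def energy_per_bit_pj(power_mw: float, bit_rate_gbps: float) -> float:
    """Energy per bit in pJ/b: power (mW) divided by bit rate (Gb/s)."""
    return power_mw / bit_rate_gbps


def figure_of_merit(power_mw: float, bit_rate_gbps: float, swing_vpp: float) -> float:
    """FoM in pJ/b/V: energy per bit divided by the differential output swing."""
    return energy_per_bit_pj(power_mw, bit_rate_gbps) / swing_vpp


if __name__ == "__main__":
    rate = 28.0   # Gb/s per channel
    swing = 6.0   # Vpp differential
    for label, power in (("with DB coding", 652.0), ("without DB coding", 525.0)):
        e_bit = energy_per_bit_pj(power, rate)
        fom = figure_of_merit(power, rate, swing)
        print(f"{label}: {e_bit:.2f} pJ/b, FoM = {fom:.3f} pJ/b/V")
```

A lower FoM is better, since it corresponds to less energy per bit for a given output swing, which is the bottom-right region of Fig. 11.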
It should be noted that only modulator drivers with an energy per bit lower than 40 pJ/b are listed and that some papers reported the power consumption excluding the dissipation in the external load. Furthermore, not all designs had a differential output, but their single-ended swing was doubled in the figure to make this comparison possible. For [18] this was not done: as it is a distributed amplifier, making the output differential is only possible by doubling the circuit, and thus the consumption. Note that, except for this work, no drivers at speeds lower than 40 Gb/s were included in this plot, as the energy per bit becomes too high at lower data rates. Moreover, only [16] and this work make use of SiGe BiCMOS, while the others utilise more expensive GaAs and InP technologies.
Conclusion
Results of a low-power SiGe modulator driver array for ODB transmission with a differential output voltage swing of 6 Vpp and a throughput of 2x28 Gb/s were presented. The precoding and encoding of the NRZ input signal into duobinary is performed on-chip. The duobinary coding consumes 127 mW/ch and the driver needs only 525 mW/ch, which gives a total of 652 mW/ch. To the best of our knowledge, the proposed IC is the fastest modulator driver including on-chip duobinary coding reported so far. Moreover, array integration and state-of-the-art power consumption are demonstrated.
Table 1 Comparison of the state of the art in low energy consumption
as DISCUS, Phoxtrot, MIRAGE and GreenTouch consortium. His current research interests include high-speed opto-electronic circuits and subsystems, with emphasis on burst-mode receiver and CDR/EDC for optical access networks, and low-power mixed-signal integrated circuit design for telecommunication applications. He is author or co-author of more than 50 national and international publications, both in journals and in proceedings of conferences.
Xing-Zhi Qiu
received the Ph.D. degree in applied sciences, electronics from Ghent University, Ghent, Belgium in 1993. She joined the department of information technology (INTEC) of Ghent University in 1986. She gained 27 years of R&D experience within the INTEC design laboratory in the field of high-speed O/E/O front-ends and physical layer hardware design for broadband optical networks in general and burst-mode receiver/transmitter technologies for passive optical networks in particular. She has been strongly involved in many EU-funded projects, managing high-speed opto-electronic analog/digital chip/sub-system designs within IMEC-INTEC/Ghent University. She is author or co-author of more than 150 publications and 6 patents on ASIC and telecom system designs.
Jochen
Efstratios Kehayas
is the co-founder and R&D director of Constelex Technology Enablers Ltd. He coordinates the company's R&D directions, performs technology transfer and intellectual property generation and is responsible for creating the company's product development roadmaps. Dr. Kehayas' current research activities are focused on the application of photonics for designing and developing amplification and transmission systems applicable to high-capacity terrestrial and space communication networks. Dr. Kehayas holds a B.Eng. degree from Southampton University, an M.Sc. degree from Imperial College London and a Ph.D. degree from National Technical University of Athens. He has authored and co-authored more than 70 scientific journal and conference publications in IEEE and OSA, including invited talks in major conferences. He is the technical manager of European co-funded research projects as well as the coordinator of product development projects funded by the European Space Agency under ARTES 5.2 and ECI frameworks, on new generation space-qualified photonic modules.
Johan Bauwelinck
was born in Sint-Niklaas, Belgium, in 1977. He received the engineering degree in applied electronics and a Ph.D. degree in applied sciences, electronics from Ghent University, Ghent, Belgium, in 2000 and 2005, respectively. He has been a research assistant in the INTEC design laboratory, Ghent University, since 2000 and he is currently a full-time tenure track professor. His research focuses on high-speed, high-frequency (opto-)electronic circuits and systems and he is a member of the ECOC technical program committee.
