## 84 Gbit/s SiGe BiCMOS duobinary serial data link including Serialiser/Deserialiser (SERDES) and 5-tap FFE

T. De Keulenaer, G. Torfs, Y. Ban, R. Pierco, R. Vaernewyck, A. Vyncke, Z. Li, J.H. Sinsky, B. Kozicki, X. Yin and J. Bauwelinck

The increasing demand for bandwidth fuels the development of high data rate electrical serial links. These links generally suffer from considerable frequency-dependent loss, introducing the need for equalisation at 10 Gbit/s and higher. Modulation schemes with better spectral efficiency than non-return-to-zero (NRZ), combined with feed-forward equalisation (FFE), allow the chip-to-chip data rate to be increased, with the drawback of a more complex, e.g. multi-level, receiver (Rx). The use of duobinary modulation (DB) is presented to realise a high-speed serial link. The increase in complexity of a DB Rx is limited, whereas the required channel bandwidth is reduced compared with NRZ. Furthermore, less equalisation is needed than for PAM4, because the roll-off required to create a duobinary modulated signal from an NRZ stream can incorporate the frequency-dependent loss of the link.

Introduction: In this Letter, a transmitter (Tx) and receiver (Rx) chipset targeting a serial data rate beyond 80 Gbit/s is presented. A block diagram of the complete transceiver chain is shown in Fig. 1. The Tx consists of a 4:1 multiplexer (MUX) connected to a 5-tap feed-forward equaliser (FFE) with an approximate delay of 12.5 ps between the taps. Together with the channel, the FFE creates an equivalent channel that transforms the NRZ from the MUX into duobinary at the Rx input (Fig. 1). The Rx chip includes a duobinary front-end connected to a 1:4 demultiplexer (DEMUX). A data rate of 84 Gbit/s is achieved across a differential link, which includes a parallel pair of 10 cm coaxial cables and the 5 cm grounded coplanar waveguide traces on the chip test boards.
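To make the NRZ-to-duobinary transformation concrete, the following minimal sketch idealises the equivalent channel (FFE plus link) as the textbook 1 + D response, sampled once per bit. This is an assumption for illustration only: the real link merely approximates this shape, and the function names are hypothetical, not taken from the Letter. The decoder mirrors the two-threshold comparison followed by an XOR that the receiver section describes, and the differential precoder is the standard companion to that decoder.

```python
"""Ideal duobinary signalling at one sample per bit (illustrative sketch).

Assumption: the FFE and channel together behave as an ideal 1 + D filter,
so each received sample is the sum of the current and the previous bit.
"""


def nrz_to_duobinary(bits, previous=0):
    """Apply y[n] = x[n] + x[n-1]; the result takes the levels 0, 1 or 2."""
    levels = []
    for bit in bits:
        levels.append(bit + previous)
        previous = bit
    return levels


def precode(data, state=0):
    """Differential precoder p[n] = d[n] XOR p[n-1]."""
    out = []
    for d in data:
        state ^= d
        out.append(state)
    return out


def decode(levels):
    """Slice the upper and lower eye, then XOR the two decisions.

    The middle level maps to 1 and both outer levels map to 0.
    """
    upper = [int(v > 1.5) for v in levels]  # decision on the upper eye
    lower = [int(v > 0.5) for v in levels]  # decision on the lower eye
    return [u ^ l for u, l in zip(upper, lower)]


data = [1, 0, 1, 1, 0, 0, 1]
levels = nrz_to_duobinary(precode(data))
assert set(levels) <= {0, 1, 2}   # three-level signal at the Rx input
assert decode(levels) == data     # round trip recovers the data
```

Without the precoder, the same decoder returns `x[n] XOR x[n-1]` instead of `x[n]`, which is why precoding (or a data pattern that is invariant under it) matters for this receiver structure.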
We believe this to be the fastest reported electrical duobinary transmission experiment to date [1]. The Tx and Rx chips are fabricated in STMicroelectronics 130 nm SiGe bipolar CMOS (BiCMOS9MW) technology and are mounted directly onto the printed circuit board (PCB) substrate using thermosonic flip-chip bonding. A serial peripheral interface (SPI) is used to control both chips and to adjust the bias currents, the comparison levels of the Rx, the tap weights of the FFE and the clock delays in the MUX and DEMUX.

Fig. 1 Architecture of presented transceiver chain (middle), time-domain transition from NRZ to duobinary (top) and spectral shaping of channel to receive duobinary (bottom)

Transmitter: The 4:1 MUX has a tree architecture with the final 2:1 selector stage feeding both the FFE driver and the MUX test output driver. To overcome problems related to the limited set-up and hold time of the input re-timer and to facilitate data alignment, the MUX input data can be delayed on-chip by 12.5 ps and the clock phase of the half-rate MUXs can be selected on-chip. The MUX provides a clean NRZ waveform beyond 80 Gbit/s, as shown in Fig. 6 by the eye diagram measured at the MUX output of the Tx test board.

The 5-tap FFE topology is shown in Fig. 2. The delays are implemented using meandered on-chip transmission lines with a length of $750\,\mu\text{m}$ between the gain cells. The measured delay between the cells, including the line loading, is 12.4 ps (±0.1 ps). The gain cells (bottom left of Fig. 2) are based on a Gilbert cell architecture, of which the gain is tuned linearly by changing the tail current difference between both differential pairs using an 8 bit monotonic digital-to-analogue converter (DAC). By keeping the summed current of both differential pairs constant, the current flowing through the transmission line termination resistors is constant, thus keeping the bias voltage of the FFE output buffer constant.

Fig.
2 Overview of 5-tap FFE (top) and circuit of gain cell (bottom)

Fig. 3 Block diagram of Rx chip and illustration of how unbalanced NRZ streams are first demultiplexed before they are decoded by XOR gate

Fig. 4 Block diagram of five-stage LSLA (top) and schematic of level-shifting stages (bottom left)

The Gilbert cell layout is relatively small compared with the meandered transmission lines. The gain cell is therefore split: the emitter followers (EFs) are placed as an input buffer at the input transmission line and the actual Gilbert cell is placed at the output transmission line. In this way, long leads and excessive loading of the transmission lines are avoided.

Receiver: The Rx block diagram is given in Fig. 3. A transimpedance amplifier (TIA), with inductive peaking to increase the bandwidth, is used as a matched linear input buffer, which allows noise and impedance matching to be achieved simultaneously [2]. The output of the TIA is connected to two parallel level-shifting limiting amplifiers (LSLAs), which convert the duobinary modulation (DB) stream into two NRZ-like streams, recovering the upper and lower eye of the DB signal. A block diagram of the LSLA is shown in Fig. 4. The level comparison is carried out in two stages having 8 and 4 bits of digital control. This allows the comparison level to be reinforced along the signal path, which is needed because the comparison of one of the duobinary eyes leads to mark densities that deviate from 50% (theoretical densities of 25 and 75% with balanced input data). The comparison level is set by tuning part of the EF current through the resistors $R_E$. Shunt capacitors $C_E$ are added to short $R_E$ in the data path (Fig. 4).

An asynchronous XOR introduces considerable jitter, so the DEMUX first clocks the two LSLA outputs into four data streams at half the bit rate. At this lower bit rate, the XOR operation is performed to decode the DB stream into two balanced NRZ streams (Fig.
3), followed by an additional demultiplexing operation resulting in four quarter-rate NRZ data streams.

Measurements and conclusion: The chips are mounted directly onto a Rogers RO4003C substrate using thermosonic flip-chip bonding. The traces, together with 1.85 mm screw-on angle mount connectors, were designed to have a smooth frequency response up to 67 GHz and beyond [3]. The Tx and Rx test boards are interconnected with two phase-matched 10 cm coaxial cables and 1.85 mm DC blocks. The total combined channel loss at the Nyquist frequency (42 GHz) is 14 dB, as can be seen from Fig. 5. Four identical PRBS7 streams with 0, 63, 95 and 31 bit delays are combined, resulting in a full-rate PRBS at the output of the MUX. This removes the need for precoding, since a precoded PRBS yields the same PRBS, simply shifted in time. The full-rate DB eye at the output of the 10 cm coaxial cable is shown in Fig. 6 together with the NRZ eye at the output of the MUX.

Fig. 5 Total channel loss (black), ideal duobinary shape (grey) and duobinary shaped channel (dark grey)

Fig. 6 84 Gbit/s output eye diagram of MUX (left) and 84 Gbit/s duobinary eye at input of Rx (right)

Fig. 7 Overview of 84 Gbit/s test set-up

The bit error rate (BER) measured on a DEMUX output channel was $5.3 \times 10^{-12}$, demonstrating successful data transmission at 84 Gbit/s across the serial link. The Rx chip with DEMUX and the Tx chip including MUX consume 1.25 W and 750 mW, respectively, from a 2.5 V supply and occupy $1.55 \times 4.59~\text{mm}^2$ and $1.93 \times 2.58~\text{mm}^2$, respectively. This results in an overall power consumption of 24 mW/Gbit/s for the 84 Gbit/s duobinary link (2 W in total divided by 84 Gbit/s), showing that the presented Rx and Tx operate at a very high data rate with reasonable power consumption. Chips with similar complexity, but working at lower data rates, are presented in [4, 5]. Raghavan et al.
[4] showed an efficiency of 43 mW/Gbit/s for a 44 Gbit/s link including CDR and SFI-5.2 interface, whereas Ming-Shuan [5] showed a low power consumption of 11 mW/Gbit/s at 40 Gbit/s, including a 5-tap FFE and a 3-tap finite impulse response (FIR) filter, but without a MUX and at a much lower rate. To conclude, our work shows that DB is a viable alternative to NRZ and PAM4 and that generating and decoding duobinary can be done with reasonable circuit complexity and power consumption at bit rates above 80 Gbit/s.

Acknowledgment: This work was supported by The Agency for Innovation by Science and Technology in Flanders (IWT).

© The Institution of Engineering and Technology 2015

13 November 2014

doi: 10.1049/el.2014.3817

One or more of the Figures in this Letter are available in colour online.

T. De Keulenaer, G. Torfs, Y. Ban, R. Pierco, R. Vaernewyck, A. Vyncke, Z. Li, X. Yin and J. Bauwelinck (*Department of INTEC, Ghent University – IMEC – iMinds, Ghent, Belgium*)

J.H. Sinsky and B. Kozicki (*Bell Labs, Alcatel-Lucent, Holmdel, USA*)

## References

- 1 Sinsky, J.H., et al.: '39.4 Gb/s duobinary transmission over 24.4 m of coaxial cable using a custom indium phosphide duobinary-to-binary converter integrated circuit', *IEEE Trans. Microw. Theory Tech.*, 2008, 56, (12), pp. 3162–3169, doi: 10.1109/TMTT.2008.2007430
- 2 Shahramian, S., et al.: 'Design methodology for a 40 GS/s track and hold amplifier in 0.18 μm SiGe BiCMOS technology', *IEEE J. Solid-State Circuits*, 2006, 41, (10), pp. 2233–2240, doi: 10.1109/JSSC.2006.878111
- 3 De Keulenaer, T., et al.: 'Measurements of millimeter wave test structures for high speed chip testing'. IEEE Workshop on Signal and Power Integrity, Ghent, Belgium, May 2014, pp. 1–4, doi: 10.1109/SaPIW.2014.6844529
- 4 Raghavan, B., et al.: 'A sub-2 W 39.8–44.6 Gb/s transmitter and receiver chipset with SFI-5.2 interface in 40 nm CMOS', *IEEE J. Solid-State Circuits*, 2013, 48, (12), pp.
3219–3228, doi: 10.1109/JSSC.2013.2279054
- 5 Ming-Shuan, C.: 'A 40 Gb/s TX and RX chip set in 65 nm CMOS', ISSCC Dig. Tech. Pap., San Francisco, CA, USA, February 2011, pp. 146–148, doi: 10.1109/ISSCC.2011.5746257