CMOS Transmitter using Pulse-Width Modulation Pre-Emphasis achieving 33dB Loss Compensation at 5-Gb/s by Schrader, J.H.R. et al.
388 4-900784-01-X 2005 Symposium on VLSI Circuits Digest of Technical Papers
24-4
CMOS Transmitter using Pulse-Width Modulation Pre-Emphasis 
achieving 33dB Loss Compensation at 5-Gb/s 
J.H.R. Schrader, E.A.M. Klumperink, J.L. Visschers(1), B. Nauta
IC Design Group, MESA+ Research Institute, University of Twente, Enschede, The Netherlands
(1) NIKHEF, Amsterdam, The Netherlands
E-mail: j.h.r.schrader@utwente.nl
Abstract
A digital transmitter pre-emphasis technique is presented 
that is based on pulse-width modulation, instead of finite 
impulse response (FIR) filtering. The technique fits well to 
future high-speed low-voltage CMOS processes. A 0.13µm 
CMOS transmitter achieves more than 5Gb/s (2-PAM) over 
25m of standard RG-58U low-end coaxial copper cable. The 
test chip compensates for up to 33dB of channel loss at the 
fundamental signaling frequency (2.5GHz), which is the 
highest figure reported in the literature. 
Keywords: pulse-width, modulation, pre-emphasis, transmitter, 
equalization, copper, cable and CMOS. 
Introduction
High-speed data-communication over lossy copper 
channels suffers from Inter-Symbol Interference (ISI). In fig. 
1a, the magnitude transfer function |S21| of 25m RG-58U 
low-cost, low-end, standard coaxial cable is shown. It can be 
seen that the channel exhibits 31dB of loss at the 
fundamental frequency (2.5GHz) for a 5Gb/s 2-PAM signal. 
The pulse response of this cable to a 200ps polar 
Non-Return to Zero (NRZ) pulse (fig. 1b) shows a very long 
tail that will interfere with neighboring symbols. To 
compensate, transmitter pre-emphasis and/or receiver 
equalization is necessary [1,2,3,4]. The latter, receiver 
equalization, typically involves several analog blocks with 
speed, accuracy and linearity requirements. On the other 
hand, transmitter pre-emphasis allows the use of a simple 
receiver that only needs to sample binary values [4]. 
Pre-emphasis methods found in literature are commonly 
based on symbol-spaced finite impulse response (FIR) 
filtering [1,2,3,4]. In order to flatten the channel response, 
the transmitted low-frequency amplitude is attenuated to 
match the loss figure at the fundamental frequency.  
[Fig. 1 plots: (left) |S21| [dB] versus frequency [Hz], 1G to 3G, 0 to -40dB; (right) cable output voltage versus time, 0 to 4ns.]
Fig. 1a (left): |S21| of 25m RG-58U low-cost, low-end standard 
coaxial cable. Fig 1b (right): Cable response to a 200ps pulse. 
The performance for 2-PAM pre-emphasis schemes can be 
evaluated by comparing the loss figure (in dB) at the 
fundamental frequency (e.g. 2.5GHz for a 5Gb/s signaling 
rate), at which error free transmission is still possible. 
Different work can be compared on a loss figure basis, as 
long as the shape of the S21 is comparable. This is 
approximately the case for all copper cables (either coaxial 
or twisted pair), assuming that the ratio between skin effect 
and dielectric loss is not extremely different. Such a 
comparison is better than a comparison on a cable length 
basis because the pre-emphasis designer faces the same 
challenge for a long low-loss cable as for a short high-loss 
(low-end) cable. In recent work, a combination of 
pre-emphasis and post-equalization has led to 27dB 
(18dB+9dB) loss compensation at a signaling rate of 5Gb/s 
[1]. None of the pre-emphasis FIR filters that use 2 taps 
[1,3,4] achieve more than 18dB. Using a more complex 
5-tap symbol-spaced FIR filter, a loss compensation of 30dB 
has been achieved at 3.125Gb/s [2]. In this work, as a 
proof-of-principle, it will be shown that a very simple 
pulse-width modulation (PWM) scheme is a good alternative 
for providing pre-emphasis. A similar technique was 
recently proposed by our group for on-chip communication 
over 1cm long RC-limited interconnects [5]. Our paper 
demonstrates the suitability of the PWM technique for 
transmission over ~25m low-end copper cables. This PWM 
pre-emphasis method for copper cables can achieve a loss 
compensation of 33dB at 5Gb/s, which is the highest figure 
compared to those found in literature. 
Pulse-Width Modulation Pre-Emphasis 
A 2-PAM transmitter without pre-emphasis would send only 
a plain polar NRZ pulse for every bit. However, the cable 
smears out the pulses (as shown in fig. 1b), leading to ISI. 
To compensate, a 2-tap symbol-spaced FIR filter sends a 
delayed (lower amplitude) inverse polarity pulse after each 
main NRZ pulse (during the next bit time). Loosely speaking, 
this inverse polarity pulse compensates for the energy in the 
long tail of the cable response, leading to a much narrower 
cable output pulse than for the transmitter without 
pre-emphasis. These 2-tap symbol-spaced FIR filters are 
easy to implement and generally used for pre-emphasis 
[1,3,4]. We observed that similar functionality might be 
obtained using a fixed amplitude, but variable (time-)width 
inverse pulse. In fig. 2, this is illustrated. In fig. 2a, the 
PWM transmitter pulse shapes are illustrated and in fig. 2b 
the simulated cable responses (25m RG-58U) are shown. 
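The 2-tap scheme described above can be illustrated with a short numerical sketch. It assumes an ideal symbol-spaced filter with taps [1, -a]; the tap weight `a` and the function names are illustrative and are not taken from the paper or from the cited designs.

```python
import math

def fir2_boost_db(a):
    """Nyquist-to-DC gain ratio (dB) of an ideal 2-tap filter with taps [1, -a].

    H(f) = 1 - a*exp(-j*2*pi*f*T), so |H| = 1 - a at DC
    and |H| = 1 + a at the fundamental frequency f = 1/(2T).
    """
    if not 0.0 <= a < 1.0:
        raise ValueError("tap weight must satisfy 0 <= a < 1")
    return 20.0 * math.log10((1.0 + a) / (1.0 - a))

def fir2_tap_for_boost(boost_db):
    """Tap weight a that yields the requested high-frequency boost."""
    r = 10.0 ** (boost_db / 20.0)
    return (r - 1.0) / (r + 1.0)

# Tap weight needed for an 18 dB boost (the 2-tap figure quoted from [1]).
a18 = fir2_tap_for_boost(18.0)
print(f"a = {a18:.3f}, boost = {fir2_boost_db(a18):.1f} dB")
```

With the peak output amplitude held fixed, the low-frequency amplitude in this model is (1 - a)/(1 + a) of the full swing, which shows why larger boosts demand finer amplitude resolution.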
[Fig. 2 plots: (right) cable output voltage versus time, 0 to 2ns.]
Fig 2a (left): Total PWM input pulse (solid) consists of 1st and 2nd
pulse (dashed/dotted). Fig. 2b (right): Simulated cable responses. 
[Fig. 3 plots: (right) cable output voltage versus time, 0 to 1.5ns, for duty-cycles of 50%, 57%, 75% and 100%.]
Fig 3a (left): PWM cable input pulse shapes for varying 
duty-cycles (200ps symbol duration). Fig 3b (right): Simulated 
cable responses (25m RG-58U) to PWM pulse shapes. 
Simulations have been made with an accurate cable model 
that includes skin effect and dielectric loss [7,8]. For a clear 
understanding the (linear) cable responses to both the first 
pulse and the second inverse pulse have been calculated 
separately (fig. 2: dashed and dotted lines), and added up to 
see the combined effect (solid line). The transmit pulse can 
be shaped by adjusting its duty-cycle from 50% to 100%, as 
is shown in fig. 3 (3a: cable input, 3b: simulated output for 
25m RG-58U). A value of 100% corresponds to 
transmission of a normal (polar NRZ) data signal (no 
pre-emphasis, resulting in ISI), and 50% to transmission of a 
Manchester coded data signal. Note that, for the correct 
duty-cycle setting, the cable output pulse becomes much 
narrower than the response to a plain polar NRZ pulse. 
Second, note that, as can be seen from fig. 3, the optimum 
duty-cycle is in-between 50% and 100%. 
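The superposition of step responses described above can be sketched numerically. The sketch below is a simplification and not the authors' cable model: it assumes a skin-effect-only line, H(s) = exp(-A*sqrt(s)), whose unit-step response is erfc(A/(2*sqrt(t))), and it ignores dielectric loss. The constant `A` is fitted to the 31dB loss at 2.5GHz quoted for the cable; all names and the 60% duty-cycle are illustrative choices.

```python
import math

T_SYMBOL = 200e-12   # symbol duration (5 Gb/s), from the paper
LOSS_DB = 31.0       # cable loss at 2.5 GHz quoted for 25 m RG-58U
F_FUND = 2.5e9       # fundamental frequency

# Skin-effect-only model: attenuation at angular frequency w is
# A*sqrt(w/2) nepers, and w/2 = pi*f. Dielectric loss is ignored (assumption).
NEPER_DB = 20.0 / math.log(10.0)
A = (LOSS_DB / NEPER_DB) / math.sqrt(math.pi * F_FUND)

def step_response(t):
    """Unit-step response of the skin-effect line."""
    if t <= 0.0:
        return 0.0
    return math.erfc(A / (2.0 * math.sqrt(t)))

def pwm_pulse_response(t, duty):
    """Response to +1 for duty*T, then -1 until T, then 0 (duty = 1 is NRZ)."""
    return (step_response(t)
            - 2.0 * step_response(t - duty * T_SYMBOL)
            + step_response(t - T_SYMBOL))

for t_ns in (0.4, 1.0, 2.0):
    t = t_ns * 1e-9
    print(f"t = {t_ns} ns: NRZ {pwm_pulse_response(t, 1.0):+.4f}, "
          f"PWM 60% {pwm_pulse_response(t, 0.6):+.4f}")
```

In this simplified model the long tail of the PWM pulse response is markedly smaller than that of the NRZ pulse, at the cost of a lower peak. Absolute values will differ from fig. 3b because the model omits dielectric loss.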
The PWM method does not tune the pulse amplitude (as 
for FIR pre-emphasis), but instead exploits timing resolution. 
This is beneficial in future high-speed low-voltage CMOS 
generations, and it allows (class-D) full switching to the 
supply voltages. Second, the implementation is very simple 
and can be low-area, low-power and digital. In comparison 
to 2-tap symbol-spaced FIR filters the PWM scheme has a 
higher switching frequency (due to the switching inside the 
symbol period) and spectral analysis shows that it achieves 
more high-frequency boost than 2-tap symbol-spaced filters, 
resulting in higher loss compensation. 
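The high-frequency boost of the PWM pulse can be estimated from its Fourier transform. The following sketch assumes an ideal rectangular pulse (+1 for duty*T, then -1 until T) with zero rise time; it is an idealized calculation with illustrative function names, not the spectral analysis performed by the authors.

```python
import cmath
import math

def pwm_pulse_spectrum(f, duty, t_symbol=200e-12):
    """Fourier transform of an ideal PWM pulse: +1 on [0, duty*T), -1 on [duty*T, T)."""
    if f == 0.0:
        return complex((2.0 * duty - 1.0) * t_symbol)
    jw = 1j * 2.0 * math.pi * f
    return (1.0 - 2.0 * cmath.exp(-jw * duty * t_symbol)
            + cmath.exp(-jw * t_symbol)) / jw

def pwm_boost_db(duty, t_symbol=200e-12):
    """Fundamental-to-DC magnitude ratio of the PWM pulse, relative to NRZ (duty = 1)."""
    if not 0.5 < duty <= 1.0:
        raise ValueError("duty must satisfy 0.5 < duty <= 1")
    f_fund = 1.0 / (2.0 * t_symbol)

    def ratio(d):
        return (abs(pwm_pulse_spectrum(f_fund, d, t_symbol))
                / abs(pwm_pulse_spectrum(0.0, d, t_symbol)))

    return 20.0 * math.log10(ratio(duty) / ratio(1.0))

for d in (0.75, 0.66, 0.55):
    print(f"duty {d:.2f}: boost {pwm_boost_db(d):.1f} dB")
```

In this idealized model the magnitude at the fundamental frequency does not depend on the duty-cycle, while the DC content scales with (2*duty - 1), so the relative boost equals 1/(2*duty - 1). Finite rise times and jitter in a real transmitter will limit the boost that can be reached near 50%.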
Circuit Building Blocks 
As shown in fig. 4 (operation principle), the pre-emphasis 
circuit XORs the data with a pulse-width modulated (PWM) 
clock in order to provide pre-emphasized data. The PWM 
clock is generated using an OR gate and a delay circuit.  
[Fig. 4 diagram: signals ClkA, ClkB, PWMClk, Data in and Pre-emp. data out; binary signal value versus time.]
Fig. 4a (top): operation principle. Fig 4b (bottom): signals. 
Fig. 5 Chip diagram: PRBS generator, clock buffers, pre-emphasis 
circuit and line driver. (All signals are differential). 
In fig. 5 the chip diagram is shown. Because a short 
differential delay is easier to make than a single short delay, 
the relative delay for clock B is created by delaying clk1
with delay1 and delaying clk2 with delay2. The differential 
delay is thus equal to (delay1-delay2). The XOR is 
implemented using a multiplexer (fig. 6a) that selects either 
D1 (non-inverted data) or D2 (inverted data). For optimum 
timing margin, D2 is delayed half a symbol time using a 
negative edge clocked flip-flop. The duty-cycle of the PWM 
pulse shape can be tuned between 50% and 100% by varying the 
relative phase-shift between the clocks from 0° to 180°. 
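The logic described in this section can be mimicked with a small behavioral model. This is a sketch under stated assumptions: ideal 50% duty-cycle clocks at the symbol rate, zero gate delay, and a multiplexer that passes D1 while the PWM clock is high (the polarity is an assumption). The half-symbol retiming of D2 is not modeled, and the function name and sample count are illustrative.

```python
def pwm_preemphasis(bits, duty, samples_per_symbol=100):
    """Return the pre-emphasized data as a list of +1/-1 samples."""
    if not 0.5 <= duty <= 1.0:
        raise ValueError("duty must lie between 0.5 and 1.0")
    half = samples_per_symbol // 2
    # Delay of ClkB relative to ClkA: 0 to half a period (0 to 180 degrees).
    delay = round((duty - 0.5) * samples_per_symbol)
    out = []
    for bit in bits:
        d1 = bool(bit)   # non-inverted data
        d2 = not d1      # inverted data
        for k in range(samples_per_symbol):
            clk_a = k < half
            clk_b = delay <= k < delay + half
            pwm_clk = clk_a or clk_b            # OR gate
            selected = d1 if pwm_clk else d2    # multiplexer acting as XOR
            out.append(1 if selected else -1)
    return out

wave = pwm_preemphasis([1, 0, 1], duty=0.75)
print(wave[:100].count(1), wave[:100].count(-1))
```

A duty-cycle of 1.0 reproduces a plain polar NRZ signal and 0.5 gives a Manchester coded signal, consistent with the limits stated in the text.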
A. Delay Circuit 
The time-shifted clock is generated using a variable delay 
circuit (fig. 6b) [6]. This circuit has a delay from in to out, 
which is mainly determined by the RC time at the output. By 
adding a negative resistance (positive feedback circuit in 
parallel to the output), the effective R can be changed and 
hence the RC-delay. The value of the negative resistance is 
controlled by the differential delay control voltage 
VdelP-VdelN, which divides the total bias current between the 
input differential pair and negative resistance pair. For VdelP
>> VdelN, the delay is minimized. As the total bias current 
through the output resistors is fixed, the output swing 
remains constant. The required tuning range of the 
delay-circuit depends on the desired symbol length and on 
the necessary duty-cycle range for pre-emphasis. The 
(continuous) tuning range can be enlarged by cascading 
multiple delay stages. For very large delay ranges this 
becomes impractical and it is more effective to combine 
a (continuously tunable) delay with (discrete, fixed) delay steps.  
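The dependence of the tuning range on symbol length can be made concrete with a small calculation. It assumes the idealized relation in which a differential delay of 0 to half a symbol period maps to a duty-cycle of 50% to 100%; the 1Gb/s example and the 100ps step size are hypothetical values chosen for illustration only.

```python
def required_delay(duty, t_symbol):
    """Differential delay (s) between the two clocks for a given duty-cycle."""
    if not 0.5 <= duty <= 1.0:
        raise ValueError("duty must lie between 0.5 and 1.0")
    return (duty - 0.5) * t_symbol

def split_delay(target, step):
    """Split a target delay into whole fixed steps plus a continuous remainder."""
    n_steps = int(target // step)
    return n_steps, target - n_steps * step

# At 5 Gb/s (200 ps symbols) the full range is 0 to 100 ps.
print(required_delay(1.0, 200e-12))

# At a hypothetical 1 Gb/s (1 ns symbols) an 85% duty-cycle needs 350 ps,
# e.g. three assumed 100 ps fixed steps plus 50 ps of continuous tuning.
print(split_delay(required_delay(0.85, 1e-9), 100e-12))
```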
In our prototype design, we aimed for flexibility to evaluate 
the new PWM-concept in various ways. Therefore the chip 
accepts external clocks, e.g. to accommodate very 
low bit rates for long, poor-quality cables. During normal operation, 
both inputs clk1 and clk2 (fig. 5) can just be connected to the 
same clock. 
Fig. 6a (left): CML multiplexer. Fig. 6b (right): CML delay tuning.
The whole test chip has been designed in CML to provide 
maximum supply noise rejection and minimum supply noise 
injection and keep timing noise as low as possible. Also, this 
guarantees equal up- and down- slew rates. 
B. Line Driver 
The line driver (fig. 7) consists of three stages. Each stage 
has three times the W/L dimensions and one third the 
resistance value of its predecessor. The final stage has 50Ω
on-chip output resistance and a tail current of 24mA. 
Nominal single-ended output swing is 600mVp-p
(Differential 1.2Vp-p).
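The quoted swing is consistent with a simple estimate for the final stage. The sketch assumes that the full tail current is switched into the on-chip resistor in parallel with a 50Ω load; the load value and the variable names are assumptions, while the tail current, on-chip resistance and scaling factor come from the text.

```python
def parallel(r1, r2):
    """Equivalent resistance of two resistors in parallel."""
    return r1 * r2 / (r1 + r2)

I_TAIL = 24e-3      # final-stage tail current (from the text)
R_ON_CHIP = 50.0    # final-stage on-chip output resistance (from the text)
R_LOAD = 50.0       # assumed load impedance seen at the output

# Single-ended peak-to-peak swing with full current switching.
swing_single = I_TAIL * parallel(R_ON_CHIP, R_LOAD)
swing_diff = 2.0 * swing_single

# Each stage has one third the resistance of its predecessor.
stage_resistances = [R_ON_CHIP * 3 ** k for k in (2, 1, 0)]

print(f"{swing_single * 1e3:.0f} mV single-ended, {swing_diff:.1f} V differential")
print(stage_resistances)
```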
Fig. 7 Three-stage differential line driver. 
Fig. 8 Chip microphotograph. 
Measurement Results
In fig. 8, a chip microphotograph is shown. Measurements 
have been made using an Agilent Digital Communication 
Analyzer (86100A) and an Anritsu pattern generator and 
BER tester (MP1632C with 2^31-1 PRBS). Eye diagrams have 
been generated at all speeds. An on-chip pattern generator 
was provided (2^7-1 PRBS) because the available external 
BER tester and pattern generator had a maximum speed of 
only 3.2Gb/s. 
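A 2^7-1 PRBS can be produced by a 7-bit linear feedback shift register. The sketch below uses feedback from stages 7 and 6, a commonly used choice; the paper does not state which polynomial or seed the on-chip generator uses, so both are assumptions here.

```python
def prbs7(n_bits, seed=0x7F):
    """Generate n_bits of a 2^7-1 PRBS from a 7-bit Fibonacci LFSR."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    bits = []
    for _ in range(n_bits):
        bits.append((state >> 6) & 1)                  # output the oldest bit
        feedback = ((state >> 6) ^ (state >> 5)) & 1   # XOR of stages 7 and 6
        state = ((state << 1) | feedback) & 0x7F
    return bits

sequence = prbs7(254)
print(sequence[:16])
```

The sequence repeats every 127 bits and contains 64 ones per period, as expected for a maximal-length sequence.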
Due to restrictions in measurement equipment and time 
limitations, no measurements could yet be made with twisted 
pair cable, although the transmitter has a differential output. 
So far only single-ended measurements have been made with 
coaxial RG-58U cable, using only one of the two transmitter 
outputs and terminating the other at 50Ω. The cable was 
connected to the test chip using a 50Ω differential probe 
(with 4 pins: ground-signal-signal-ground). The RG-58U 
cable is very low-cost, low-end, and standard. All chip I/Os 
have on-chip 50Ω termination and are ESD protected. 
A. Effect of Adjustments in PWM Duty-Cycle 
In fig. 7, the transmitter output eyes are shown for different 
duty-cycles. The left- and right edges in the eye diagrams 
correspond to the symbol edges. (Compare to fig. 2a). 
In fig. 8 the responses of a 10m RG-58U cable to the 
pre-emphasized data stream with different pre-emphasis 
duty-cycles are shown. It can be seen that there is an 
optimum duty-cycle (middle figure). Under-emphasis is 
shown in the left figure and over-emphasis in the right figure. 
Note that the time scale in fig. 7 and fig. 8 is the same. 
Fig. 7 Measured transmitter eyes at 5Gb/s with three different 
duty-cycle settings; left: no pre-emphasis (100%), middle: weak 
pre-emphasis (66%), right: strong pre-emphasis (55%). 
Horizontal axis = 20ps/div, vertical axis=100mV/div. 
Fig. 8 Measured eyes of cable response for transmitter settings 
shown in previous figure and 5Gb/s over 10m RG-58U cable.  
Horizontal axis = 20ps/div, vertical axis=20mV/div. 
B. Eye Diagrams for 25m RG-58U 
In fig. 9a (4Gb/s) and fig. 9b (5Gb/s), measured eye 
diagrams of the cable output for 25m RG-58U are shown. 
These two speeds are shown to illustrate the difference in 
eye shape. The cable loss at 2.5GHz is 31dB, and the total 
channel loss is approximately 33dB including additional 
parasitic losses in the path from chip to coaxial cable (probes, 
short wire, bias tee and connectors). Using a high-speed 
limiting amplifier and the BER tester, the BER has been 
tested up to 3.2Gb/s (due to limited speed of BER tester) and 
is <10^-12. From the clearly open eye diagrams it can be 
concluded that the pre-emphasis compensates enough 
channel loss to enable error free transmission at 5Gb/s. 
Fig. 9a (left): Measured output eye of 25m RG-58U at 4Gb/s. 
Horizontal axis = 20ps/div, vertical axis = 10mV/div. Fig. 9b 
(right): Measured output eye of 25m RG-58U at 5Gb/s. Horizontal 
axis = 20ps/div, vertical axis = 7.5mV/div.
At a channel loss of 33dB, the small cable output 
amplitude puts a high demand on receiver sensitivity and it 
might be necessary to use differential signaling. Using the 
fully differential transmitter capabilities would boost the 
(differential) swing at the cable output by 6dB while also 
rejecting common mode noise. 
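The decibel figures above translate into amplitudes as follows. This is a rough scale estimate only: it applies the 33dB loss at the fundamental frequency to the nominal 600mV single-ended swing, whereas the actual eye amplitude depends on the pulse shape and duty-cycle setting. The variable names are illustrative.

```python
import math

def db_to_amplitude_ratio(db):
    """Convert a loss or gain in dB to a voltage ratio."""
    return 10.0 ** (db / 20.0)

TX_SWING = 0.6          # nominal single-ended swing in volts (from the text)
CHANNEL_LOSS_DB = 33.0  # total channel loss at 2.5 GHz (from the text)

attenuation = db_to_amplitude_ratio(CHANNEL_LOSS_DB)
rx_scale = TX_SWING / attenuation

# Doubling the swing by using both outputs differentially.
differential_gain_db = 20.0 * math.log10(2.0)

print(f"attenuation factor {attenuation:.1f}, "
      f"received scale {rx_scale * 1e3:.1f} mV, "
      f"differential gain {differential_gain_db:.2f} dB")
```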
TABLE I 
PRE-EMPHASIS COMPARISON WITH OTHER WORK 
Ref.          Bit rate     Loss     Feature size   Type
[1] TX only   5Gb/s        18dB     0.13µm         2-tap FIR
[2] TX only   3.125Gb/s    30dB     0.11µm         5-tap FIR
[3]           8Gb/s        ~10dB    0.3µm          2-tap FIR
[4]           4Gb/s        ~10dB    0.25µm         2-tap FIR
this work     5Gb/s        33dB     0.13µm         PWM
TABLE II 
ELECTRICAL CHARACTERISTICS OF TRANSMITTER 
Baudrate (2-PAM)           5GBd
Unit interval (UI)         200ps
TX amp. (Vp-p), nominal    1.2V (differential), 600mV (single-ended)
Channel loss @ 2.5GHz      33dB
Vsup                       1.2V
Power (pre-emphasis)       12mW
Power (line driver)        42mW
Power (clock buffering)    39mW
Power (on-chip PRBS)       17mW
In table I, a comparison with other published work is 
given. It is shown that this work achieves the highest loss 
compensation (33dB) at a bit rate of 5Gb/s. From [1,2] only 
the transmitter pre-emphasis has been taken into account 
(not the receiver equalizer). In table II, the electrical 
characteristics are given. Power is hard to compare because 
most publications only give total figures. In our current 
proof-of-principle design, the clock-buffering takes quite a 
lot of the power budget, which can be improved if internal 
clocks are available in the IC, like in practical applications. 
Because of the simplicity of the pre-emphasis method, area 
and power can be very small. 
Conclusions
A new digital pre-emphasis technique based on pulse-width 
modulation (PWM) is introduced. The PWM method does 
not tune the pulse amplitude (as for FIR pre-emphasis), but 
instead exploits timing resolution. This fits well to future 
low-voltage high-speed CMOS processes. Using only 
single-ended measurements due to limitations in 
measurement equipment, successful transmission of a 
2-PAM 5Gb/s data signal over 25m of low-cost, low-end, 
standard RG-58U coaxial cable is demonstrated. This 
corresponds to a loss compensation of 33dB at the 
fundamental frequency of 2.5GHz, which is the highest 
figure reported in the literature. The main building blocks of the CML 
pre-emphasis circuit are a tunable delay, an OR gate and a 
multiplexer. The pre-emphasis technique is simple and can 
be implemented with low power and small area. 
Acknowledgements
The authors would like to thank Stichting FOM (funding), 
CERN and Giovanni Cervelli (organizing chip fabrication), 
Daniel Schinkel, Paulo Moreira and Hans Verkooijen 
(helpful discussions), Henk de Vries, Gerard Wienk and 
Joop Rövekamp (practical assistance), and Lanzarote for the 
beautiful views. 
References
[1] Y. Kudoh, M. Fukaishi and M. Mizuno, “A 0.13-µm CMOS 
5-Gb/s 10-m 28AWG cable transceiver with no-feedback-loop 
continuous-time post-equalizer”, IEEE J. Solid-State Circuits,
vol. 38, pp. 741-746, May 2003. 
[2] W. Gai, Y. Hidaka, Y. Koyanagi, J.H. Jiang, H. Osone and T. 
Horie, “A 4-channel 3.125 Gb/s/ch CMOS transceiver with 
30dB equalization”, Symposium on VLSI Circuits, Digest of 
Technical Papers, pp. 138-141, 2004. 
[3] R. Farjad-Rad, C.K. Yang, M. Horowitz, and T. Lee, “A 
0.3-µm CMOS 8Gb/s 4-PAM serial link transceiver”, IEEE J. 
Solid-State Circuits, vol. 35, pp. 757-764, May 2000. 
[4] M. Lee, W. Dally and P. Chiang, “Low-power area-efficient 
high-speed I/O circuit techniques”, IEEE J. Solid-State 
Circuits, vol. 35, pp. 1591-1599, Nov. 2000. 
[5] D. Schinkel, E. Mensink, E. Klumperink, E. van Tuijl, B. 
Nauta, “A 3Gb/s/ch transceiver for RC-limited on-chip 
interconnects”, IEEE International Solid State Circuits 
Conference 2005, in press. 
[6] B. Razavi, “Design of analog CMOS integrated circuits”, 
McGraw-Hill, 2001. 
[7] F.E. Gardiol, “Lossy transmission lines”, Artech House, 1987. 
[8] P. Grivet and P. W. Hawkes, “The physics of transmission 
lines at high and very high frequencies”, Academic Press, 
1970.
