Turkish Journal of Electrical Engineering and Computer Sciences
Volume 25

Number 3

Article 35

1-1-2017

Reconstruction of a single square pulse originally having 40 ps
width coming from a lossy and noisy channel in a point to point
interconnect
ALAK MAJUMDER
BIDYUT BHATTACHARYYA

Follow this and additional works at: https://journals.tubitak.gov.tr/elektrik
Part of the Computer Engineering Commons, Computer Sciences Commons, and the Electrical and
Computer Engineering Commons

Recommended Citation
MAJUMDER, ALAK and BHATTACHARYYA, BIDYUT (2017) "Reconstruction of a single square pulse
originally having 40 ps width coming from a lossy and noisy channel in a point to point interconnect,"
Turkish Journal of Electrical Engineering and Computer Sciences: Vol. 25: No. 3, Article 35.
https://doi.org/10.3906/elk-1507-209
Available at: https://journals.tubitak.gov.tr/elektrik/vol25/iss3/35

This Article is brought to you for free and open access by TÜBİTAK Academic Journals. It has been accepted for
inclusion in Turkish Journal of Electrical Engineering and Computer Sciences by an authorized editor of TÜBİTAK
Academic Journals. For more information, please contact academic.publications@tubitak.gov.tr.

Turkish Journal of Electrical Engineering & Computer Sciences
http://journals.tubitak.gov.tr/elektrik/

Turk J Elec Eng & Comp Sci
(2017) 25: 2055 – 2065
© TÜBİTAK
doi:10.3906/elk-1507-209

Research Article

Reconstruction of a single square pulse originally having 40 ps width coming from
a lossy and noisy channel in a point to point interconnect
Alak MAJUMDER¹,∗, Bidyut BHATTACHARYYA²
¹ Department of ECE, NIT Arunachal Pradesh, India
² Department of EE, NIT Agartala, Tripura, India
Received: 24.07.2015 • Accepted/Published Online: 18.08.2016 • Final Version: 29.05.2017

Abstract: The fundamental problem in high speed communication is that it suffers from many signal integrity issues due
to dispersion (caused by the variation of the dielectric with angular frequency), reflection (S_11), and insertion loss (S_12) of a
channel made of copper. When a pulse of width τ and magnitude V_0 is driven through a lossy channel, we observe a
reduction in magnitude (due to S_12 and S_11) and an increase in pulse width (due to dispersion). Dispersion causes different
values of skin effect and dielectric loss, leading to a different effective resistance at each segment as the pulse moves through
the channel. This impedance mismatch generates reflection noise, which makes identification of the received signal
difficult at the receiver. Modeling such a complex situation and reconstructing a high speed signal driven through
a lossy channel remain open problems for the research community. This work presents a method of designing a system
that can reconstruct a square wave pulse of 40 ps or less (corresponding to a data rate of 25 Gbit/s or more) after it is sent
over a lossy channel from transmitter to receiver. The received noisy signal (Signal-A) is sent through an RC circuit
to obtain a delayed signal (Signal-B). Both signals are then applied to the two terminals of a comparator.
The difference, ∆(t), between Signal-A and Signal-B is measured, and it is observed that the voltage difference (ϕ)
between the two highest peaks of ∆(t) provides a better way to determine the design criteria for the
threshold voltage, V_T, of the comparator for the reconstruction of the square pulse. It helps to eliminate the unwanted
oscillations at the output of the comparator. The design of the threshold voltage depends fully on the channel properties.
Key words: Driver, receiver, printed circuit board, inter symbol interference, delay line, reconstruction, decision
feedback equalizer, feed forward equalizer

1. Introduction
With the advancement of IC technology, the data rate between two communicating central processing units
(CPUs) inside a server has not increased drastically, because the performance of high speed digital circuits
is affected by the characteristics of channels (made of copper interconnects), including printed circuit boards
(PCBs), plastic packages, sockets, and edge connectors [1]. The bandwidth limited channels caused by channel
distortion and insertion loss are the limiting factor in the overall performance of electronic systems with high data
sampling rates. As the switching frequency and the interconnect density in packages and PCBs are increasing
day by day, the wiring cross sections of interconnects get reduced, which makes them lossy as compared to
previous generation designs [2]. In practice, modeling lumped resistance, capacitance, and inductance
per unit distance in a simulation tool introduces an artificial simulation error that effectively makes the intersymbol
interference (ISI) worse. ISI is generally observed when the second signal is driven before the first signal decays
completely at the receiver. This artifact, being an additive term, leads to a substantial apparent ISI. Extra
circuit components are then added to the system to eliminate an error that is purely due to circuit simulation.
Eventually, the cost of the product increases to compensate for a simulation error that does not even exist
in reality. Hence, efficient and accurate modeling and simulation of the on-chip channel are essential for
modern high speed digital design [3]. The modeling methodology and the effect of ISI are described in [4]. The
basic block diagram of a communication link is shown in Figure 1.

∗ Correspondence: majumder.alak@gmail.com

Figure 1. Basic communication link between two chips.

Two chips communicate via a channel, which is divided into N different segments such that T_r ≥ 2T_pd (T_r
= rise time of the driven pulse and T_pd = propagation delay) for every segment of that channel. This assumption
of the conventional lumped circuit is no longer valid when it comes to the analysis of high speed VLSI circuits.
The work done in [5] has shown that the traditional lumped circuit is not an ideal model when it is driven by
a pulse of less than 100 ps width over a certain length of interconnect. As the propagation time of an ideal
TEM mode over 1 mm length is typically about 8 ps on a silicon substrate or in PCBs (which have an almost similar
dielectric constant), a lumped model is improper for a high speed IC [6]. Recent developments in video applications,
along with the growth in the volume of data traffic, have raised the demand for high data rates in computer servers.
This demand requires data transmission from one CPU to another over the backplane of the server using point
to point interconnects, as established by Madrid et al. [7] from Intel. In 2007, Hollis et al. [8] designed a new
point to point interconnect using MUX and DEMUX trees (separated by distributed wire) in the driver and
receiver chip, respectively. They obtained a data rate of around 763 Mbit/s with a 1-mm wire length, dropping
slowly to 639 Mbit/s for a 10-mm wire. In mid 2008, Intel developed the advanced version of the point to point
interconnect, which is called the quick path interconnect (QPI) [9]. The QPI is used to connect a CPU to another
CPU or an I/O hub and is suitable for different system configurations. It operates at a clock rate of 2.4 GHz,
2.93 GHz, 3.2 GHz, 4.0 GHz, or 4.8 GHz. The 4.8 GHz frequency provides a data rate of approximately 10 Gbit/s
between two CPUs. Although it has increased the data rate, the problem of ISI still exists, as the
signal received by the receiver does not decay in less than one half of the clock period. Thus, the detection of
the digital output from the distorted signal received at the receiver has become impossible at higher frequency.
Even today, CPUs inside the server barely work beyond 7–8 Gbit/s. This bottleneck is caused by the
properties of the transmission line (channel) that connects the receiver to the transmitter. It can be solved
by designing a system with special sockets and edge connectors where the impedance is matched throughout
the whole interconnect. Another problem arises due to the variation in lengths from one CPU to another or
from one chip to another because of their geometrical locations. Thus, wires connecting two chips on a PCB
can have various lengths. For example, let us say the length of the copper wire (channel) connecting Chip-1 to
Chip-2 for Address Bus-1 is X. However, if the same two chips are to be connected for Address Bus-2, the wire may
have a length of Y (different from X). Thus, if there is a total of 200 I/Os, all the interconnect lengths from
Chip-1 to Chip-2 will be different, and this causes an additional problem for high speed interconnects. This
results in many different kinds of signal integrity problems in the channel and thus the design methodologies may
lead to a complex solution, because the difference in length forces one to lay out the board first and then to
design the output drivers of the transmitter and the receiver's port differently for every I/O in a given CPU.
The design also depends on the channel characteristics of each I/O layout in the PCB. In other words, the I/O
drivers on every address and data line will have different designs. When a square pulse of 40 ps width (corresponding
to a 25 Gbit/s data rate) is sent from the driving end through the channel, the receiver experiences a voltage
that is very different from the original signal for various reasons, such as impedance mismatch at various
points in the channel, cross talk, and all forms of losses due to radiation, skin effect, and dielectric loss. This
has made server design a challenging task when the data rate is more than 10 Gbit/s (< 100 ps pulse width).
Agilent Technologies [10] has developed equipment on the basis of decision feedback equalization (DFE) and
feed-forward equalization (FFE) to understand and to reconstruct the original signals by generating a better
eye diagram. However, that process also gives its best eye for data rates below 10 Gbit/s. In 2010, Seo [11]
proposed a new low-swing signaling technique for high bandwidth and low energy consumption. Pre-emphasized
bipolar signals were generated at the driver through series capacitors, whereas the receiver chip reconstructed
NRZ data from fast RZ pulses with single data rate (SDR) signaling. Double data rate (DDR) signaling, which
was found at DRAM PCB (made by Micron Technology Inc.), was also employed to further improve the data
rate of the on-chip bus system. However, they achieved only 2.5 Gbit/s and 4.9 Gbit/s data rates for SDR and DDR,
respectively. Bhattacharyya et al. [12] showed in 2004 that it is indeed possible to achieve a 25–100 Gbit/s
data rate in a point to point signaling scheme for a 24-inch channel made out of a PCB, two packages, two
sockets, and two backplane connectors. It was observed that an mV solution (given 1 mV is the expected signal
magnitude at the receiver for a pulse of 1000 mV having 40 ps width driven by the transmitter) exists for a 25
Gbit/s signaling scheme while using the interconnect technology that existed at that time at Intel. This 1
mV is obtained after subtracting all noises, including white noise and ISI. Here, we have analyzed a method by
which we can regenerate a signal of 40 ps width or less after it passes through a lossy channel in any point to
point interconnect.

2. Design methodology and the channel performance
In Figure 2, we have shown the interconnect (a lossy and mismatched channel) between two chips. The left side of
the dotted line (unmatched channel) is Chip-1 and the right side is Chip-2. Chip-1 is the driver/transmitter
and Chip-2 is the receiver. A square wave signal of pulse width τ and amplitude of about 1 V is driven by
the transmitter. The noisy signal (Signal-A) received by the receiver right after the bond pad (see Figure 3)
at time t = t_1 is fed to the noninverting terminal of the comparator inside Chip-2. In addition, an RC
delay circuit (delay τ) is used to generate a delayed version of the received signal, called Signal-B, at time t = t_2 ≈
(t_1 + τ), which is fed to the inverting terminal of the same comparator. The termination resistance, the driver's
output impedance, and the input ESD capacitance at the bond pad or at C4 bumps are R_T, R_D, and C_I,
respectively. Different kinds of connections such as bond pads, package traces, pins of the package, sockets,
and PCBs have different characteristic impedances, which leads to impedance mismatch at various locations
along the channel shown in Figure 1. This makes the interconnection noisy. Moreover, multiple reflections
at various interfaces lead to oscillations in those interconnections. The state of the art is always to match the
characteristic impedance (Z_0) of the channel to the termination resistances (R_T) connected at
either end of the channel, as shown in Figure 3. The channel we have used offers various mismatches
due to the discretization of the transmission line. The transmission line has 30 RLC segments with R = 100
mΩ, L = 250 pH, and C = 100 fF. The values of R_T, R_D, and C_I are kept at 50 Ω, 50 Ω, and 2 pF, respectively.
As real-life package pins have inductances of about 1–2 nH, we have inserted an inductance of 1 nH
after the 15th segment of the transmission line to create an inductive reflection due to impedance mismatch.
As the pins are mostly inductive, they exhibit higher impedance. The rise time (T_r) and fall time (T_f) of the
driving pulse are assumed to be 10 ps, whereas the ON time is assumed to be T_ON = 30 ps, which
gives a pulse width of 40 ps measured at the 50% points of the rising and falling edges. Thus, 40 ps is the
half-width of the driven pulse. Each segment of the transmission line is assumed to have a time delay of (LC)^(1/2), which is 5 ps for
our design. As this time delay is smaller than the rise time of the signal, each segment of the channel can
be considered as a lumped circuit. The values of R_τ and C_τ are chosen such that the product of these two
elements is also close to 40 ps, and that is one of the key ideas of our reconstruction model. In our model, we
have used R_τ = 4 kΩ and C_τ = 10 fF. Such resistors and capacitors can be built using poly-Si and an NMOS
gate on p-substrate silicon, respectively.
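
The numbers quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the lossless per-segment formulas sqrt(LC) for delay and sqrt(L/C) for characteristic impedance; the impedance and total-delay figures are derived here and are not stated in the text, and the total delay ignores the 1 nH lumped inductor. All names are illustrative.

```python
import math

# Per-segment values quoted in the text for the channel model.
R_SEG = 100e-3   # ohm
L_SEG = 250e-12  # henry
C_SEG = 100e-15  # farad
N_SEG = 30       # number of RLC segments

# Receiver delay network quoted in the text.
R_TAU = 4e3      # ohm
C_TAU = 10e-15   # farad


def segment_delay(l, c):
    """Lossless per-segment delay sqrt(L*C), in seconds."""
    return math.sqrt(l * c)


def characteristic_impedance(l, c):
    """Lossless characteristic impedance sqrt(L/C), in ohms (assumption)."""
    return math.sqrt(l / c)


t_seg = segment_delay(L_SEG, C_SEG)            # about 5 ps
z0 = characteristic_impedance(L_SEG, C_SEG)    # about 50 ohm
total_delay = N_SEG * t_seg                    # ignores the 1 nH lump
total_resistance = N_SEG * R_SEG               # series resistance of the line
rc_delay = R_TAU * C_TAU                       # about 40 ps

print(f"segment delay  : {t_seg * 1e12:.1f} ps")
print(f"Z0 (lossless)  : {z0:.1f} ohm")
print(f"line delay     : {total_delay * 1e12:.0f} ps")
print(f"line resistance: {total_resistance:.1f} ohm")
print(f"R_tau * C_tau  : {rc_delay * 1e12:.0f} ps")
```

Under these assumptions the segment impedance coincides with the 50 Ω terminations, so the mismatch in the model comes mainly from the lumped inductor and the discretization.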

Figure 2. Block diagram of proposed design methodology.

Figure 3. Schematic diagram to reconstruct driven signal by designing a comparator.

3. Results and discussion
In Figure 4, we have shown the driving pulse at the bond pad near Chip-1. We have also shown Signal-A and
Signal-B at the noninverting and inverting terminals of the comparator, right after the bond pad at Chip-2.
The right side scale of Figure 4 is for the pulse driven by the transmitter from Chip-1 and the left side scale
is for the voltages received at the two inputs of the comparator at Chip-2. The signal V_I(t), right before the
driver's impedance (R_D), is a square wave pulse with a full voltage swing of 1 V and a half-width of 40 ps (see
Figure 3). The voltage measured at the bond pad at Chip-1 is around 0.32 V. This is due to the channel
resistance (which is about 3 Ω), the two termination resistances R_T, and the driver's resistance R_D. Figure 4 shows
that Signal-A arrives at the receiver after some time, which is the time delay of the channel, i.e. the
time a signal takes to reach Chip-2 from Chip-1. It is also seen that the peak of Signal-B is shifted
from the peak of Signal-A due to R_τ and C_τ. The time difference between these two highest peak
voltages is approximately R_τ C_τ ≈ τ, which is 40 ps. If the comparator can sense a difference of a few mV between
the noninverting terminal (Signal-A) and the inverting terminal (Signal-B), then the output of the comparator will
switch back and forth between 1 V and 0 V depending on whether Signal-A is greater or less than Signal-B. It is
seen from Figure 4 that there are many occasions when Signal-A is greater than Signal-B, and therefore one will
have multiple square wave signals of different time widths at the output of the comparator inside Chip-2.
Under that condition, the output waveform of the comparator at Chip-2 will not be the same as the pulse driven,
V_I(t), at Chip-1. In order to fix this problem, a value of the threshold voltage (V_T) is to be determined that
will generate only one square wave pulse, the same as V_I(t). In order to determine that V_T, we have plotted
the difference in voltages between Signal-A and Signal-B in Figure 5. We denote this voltage by ∆(t), such that
∆(t) = (voltage of Signal-A – voltage of Signal-B). Figure 5 shows that at around 100 ps we have the highest
peak, which is approximately 60 mV (V_max1), and at around 500 ps we have the second highest peak, which is
about 20 mV (V_max2). If the value of V_T is chosen somewhere between these two maximum voltage points,
then we will be able to see a single square wave pulse at the comparator output. The average of these two
maximum values is around 40 mV, which is the right threshold value for our case, as this gives the maximum
voltage margin. This threshold voltage will also prevent the back and forth switching of the voltage caused
by signal oscillations. It is interesting to note that this average value (V_T) intersects the ∆(t) vs. time (t)
curve at two points and the time difference between these two points is 40 ps. This is also shown in Figure
5. For the solution to exist, a line drawn parallel to the time axis in Figure 5 at V_T needs to intersect the curve
∆(t) at two points such that the time difference between those two points is 40 ps, which is the desired pulse
width. If the second voltage spike is greater than 40 mV, then we cannot reconstruct the single square wave
pulse at the output of the comparator. The difference in voltages (ϕ) between the first maximum peak and the
second maximum peak is the actual signal strength we have to work with, and it depends on the channel
characteristics. In this case, the difference (ϕ) is also 40 mV. In order to prevent multiple switching, one can
also use the conventional comparator hysteresis loop [13]. The feedback path present in the loop will control
the generation of the RC delayed signal (Signal-B) based on the comparator output. This may require a clock
circuit to match the 40 ps timing and makes the design a bit complex. Figure 6 shows that the comparator
output is also a square wave pulse that looks exactly the same as the pulse driven by the transmitter, V_I(t). It is
also seen that the reconstruction occurred at that time slot when V_T is exactly equal to (V_max1 + V_max2)/2.
The reconstructed pulse intersects the ∆(t) curve at two points separated by close to 40 ps, which is the
pulse width we wanted. Figure 7 shows the final result of our methodology in a more detailed way. The dark
solid line is the square pulse we have reconstructed at the output of the comparator and the dark dotted line is
the pulse that we have sent from the driver. Each tick mark on the time axis is about 2.5 ps. We can see some
skew of the rise and fall times, which may produce a high frequency jitter effect. We have not investigated the
minimization of this jitter in this paper. In summary, the filter we have designed is unique and makes the final
output look almost identical to the driven pulse. In Figure 8, we have presented the summary of our process
flow to reconstruct the square pulse having a pulse width close to 40 ps.
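
The reconstruction chain described above (RC delay, difference ∆(t), thresholded comparator) can be sketched numerically. This is a minimal sketch and not the authors' circuit simulation: it assumes an ideal trapezoidal pulse stands in for Signal-A (no channel loss or reflections), models the R_τ C_τ network as a first-order low-pass filter, and uses a hypothetical threshold of 0.2 V chosen for this idealized 1 V waveform rather than the 40 mV derived for the real channel. All function and variable names are illustrative.

```python
import math


def trapezoid(t_ps, rise=10.0, on=30.0, fall=10.0, amp=1.0):
    """Idealized pulse: 10 ps edges, 30 ps flat top, 40 ps wide at 50%."""
    if t_ps < 0.0:
        return 0.0
    if t_ps < rise:
        return amp * t_ps / rise
    if t_ps < rise + on:
        return amp
    if t_ps < rise + on + fall:
        return amp * (1.0 - (t_ps - rise - on) / fall)
    return 0.0


def rc_lowpass(samples, dt_ps, tau_ps):
    """First-order RC low-pass with the input held constant over each step."""
    k = 1.0 - math.exp(-dt_ps / tau_ps)
    out, y = [], 0.0
    for x in samples:
        out.append(y)
        y += k * (x - y)
    return out


def comparator(delta, v_t):
    """Output 1 while delta exceeds the threshold, otherwise 0."""
    return [1 if d > v_t else 0 for d in delta]


def pulse_count_and_width(bits, dt_ps):
    """Number of rising edges and total high time of a 0/1 sequence."""
    rising = sum(1 for prev, cur in zip([0] + bits[:-1], bits) if cur and not prev)
    return rising, sum(bits) * dt_ps


DT_PS = 0.1    # simulation step (assumption)
TAU_PS = 40.0  # R_tau * C_tau from the text
V_T = 0.2      # hypothetical threshold for this idealized waveform

times = [i * DT_PS for i in range(2000)]          # 0 to 200 ps
signal_a = [trapezoid(t) for t in times]          # stand-in for Signal-A
signal_b = rc_lowpass(signal_a, DT_PS, TAU_PS)    # delayed Signal-B
delta = [a - b for a, b in zip(signal_a, signal_b)]
bits = comparator(delta, V_T)
n_pulses, width_ps = pulse_count_and_width(bits, DT_PS)

print(f"pulses: {n_pulses}, width: {width_ps:.1f} ps")
```

With these assumptions the comparator output contains a single pulse whose width is close to the 40 ps of the driven pulse. A realistic Signal-A would add the secondary peaks of ∆(t) that the threshold is meant to reject.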

Figure 4. Plot of driven square pulse, Signal-A, and Signal-B.

Figure 5. Plot of ∆(t), the difference in voltage between
Signal-A and delayed Signal-B.

Figure 6. Reconstruction of square wave by the new
methodology.

Figure 7. Final reconstruction of the square wave pulse
inside Chip-2. The edge jitter between the two signals
is in the range of 2.5 ps.

It may be worth mentioning that if we send square wave pulses of period T, then after a certain time
Signal-A will creep up to some average voltage V_AVG and Signal-B will also creep up to almost the same
average voltage due to the filtering effect of the channel. Under these circumstances, ∆(t) will fluctuate around
0 V after some time and the value of ∆(t) will never be more than 40 mV. Thus, keeping V_T equal to 40
mV, as done for a single square wave pulse, will still generate one square wave pulse at the receiver with a
pulse width close to 40 ps. Therefore, even though the signal at the receiver (Signal-A) looks bad, the process
proposed here can reconstruct the square wave signal that was originally sent from the drivers. The proposed
method requires generating Signal-B with a time delay τ = R_τ C_τ and then feeding both the received and the
delayed signal into the comparator, which in turn is designed for a given V_T, to produce a square wave pulse
inside Chip-2.

4. Effect of variation in V_T for various channel lengths and pulse width τ
As discussed in the previous section, we have defined V_T by the following equation for the value of α = 1
(where 0 < α < 2):

\[ V_T = \frac{\alpha\,(V_{\max 1} - V_{\max 2})}{2} + V_{\max 2}, \tag{1} \]


where V_max1 and V_max2 are the highest voltage peak and the second highest voltage peak of the ∆(t) curve, as
shown in Figure 5. In the previous section, we took α to be close to 1, which
makes V_T the average of V_max1 and V_max2. The range of α is 0 < α < 2. In principle,
α can be taken very small to bring V_T close to V_max2. As α approaches 2, the value of
V_T approaches V_max1. Thus, the minimum V_T can be taken close to V_max2 and the maximum V_T can
be close to V_max1. We have taken a total of 30 RLC segments of a channel with R = 100 mΩ, L = 250 pH, and C
= 100 fF, as mentioned earlier. There is also a lumped inductor of 1 nH between the 15th and
16th segments of this channel (30 RLC elements) to generate reflection noise.
In practice, such an inductance of 1 nH can originate from the socket, package pin, edge connectors, etc.
Keeping the number of channel segments fixed at 30, we have varied the half-width of the input pulse from
15 ps to 60 ps to see how it affects the value of the threshold (V_T) of the receiver chip for reconstruction of the
pulse. While varying the driven pulse width, we have also changed the value of the delay R_τ C_τ inside Chip-2,
so that it remains the same as the driven pulse width. In Figure 9, we have plotted V_max1, V_max2, and V_T
of the comparator as a function of pulse width (τ) using α = 1. As per the principle, V_T can have a range of
V_max2 < V_T < V_max1. When α equals 1, V_T is midway between V_max1 and V_max2. It is interesting
to see that V_T goes on increasing as the value of τ is increased from 15 ps to 60 ps. It means that the signal
strength is increasing, as both the voltage margins (V_max1 – V_T) and (V_T – V_max2) are increasing. Thus,
when τ increases, the design margin gets better and better. In each attempt, the comparator inside Chip-2
generates the same pulse width τ that was originally sent by the transmitter from Chip-1. In Table 1, we
have summarized all the data for 30 channel segments, and it is seen that when the pulse width is 60 ps the
voltage margin is about 35 mV, but when the pulse width is reduced to as small as 15 ps the voltage margin
is only about 5 mV. It is seen that V_T of the receiver is different in each case. However, changing V_T of the
receiver chip may not be a good option. Therefore, if we keep the V_T of the receiver chip fixed at 40 mV, then
with the increase in τ the design methodology will work, provided the second maximum of the curve ∆(t) is always
less than 40 mV. However, since the value of τ is increasing, the data rate is reduced, even though the
voltage margins are getting better. This fixed V_T of 40 mV still leaves margins (V_max1 – V_T = 2.2 mV and V_T –
V_max2 = 25.8 mV) for the design to work at a pulse width τ close to 30 ps, which corresponds to a data
rate of about 33.33 Gbit/s, as shown in Table 1.
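
Equation (1) and the two voltage margins discussed above can be evaluated directly from the peak values in Table 1. The following is a minimal sketch; the function names and the `TABLE1` dictionary are illustrative, and only the 30 ps and 40 ps rows are reproduced.

```python
def threshold(v_max1, v_max2, alpha=1.0):
    """Eq. (1): V_T = alpha * (V_max1 - V_max2) / 2 + V_max2."""
    if not 0.0 < alpha < 2.0:
        raise ValueError("alpha must satisfy 0 < alpha < 2")
    return alpha * (v_max1 - v_max2) / 2.0 + v_max2


def margins(v_max1, v_max2, v_t):
    """Return (V_max1 - V_T, V_T - V_max2) in the units of the inputs."""
    return v_max1 - v_t, v_t - v_max2


# Peak values in mV taken from Table 1: pulse width (ps) -> (V_max1, V_max2).
TABLE1 = {30: (42.2, 14.2), 40: (60.0, 20.0)}

vt_40 = threshold(*TABLE1[40])            # midpoint threshold at 40 ps
m_40 = margins(*TABLE1[40], vt_40)        # symmetric margins at 40 ps
m_30_fixed = margins(*TABLE1[30], 40.0)   # 30 ps pulse with V_T held at 40 mV

print(f"V_T at 40 ps: {vt_40:.1f} mV, margins: {m_40[0]:.1f} / {m_40[1]:.1f} mV")
print(f"30 ps with V_T = 40 mV: {m_30_fixed[0]:.1f} / {m_30_fixed[1]:.1f} mV")
```

The asymmetric result for the 30 ps row shows why a fixed threshold leaves little headroom on the V_max1 side as the pulse gets narrower.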
It is seen that for τ = 15 ps, which translates to a data rate of about 66.67 Gbit/s, the receiver V_T has
to be about 10 mV (10.7 mV in Table 1). As this gives only about 5 mV of margin for the logic to swing from logic 0
to logic 1 and vice versa, the white random noise has to be much less than 5 mV. The common mode noise, if
generated, may be eliminated, since we split the original signal into Signal-A and Signal-B to generate ∆(t).
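
The data rates and the 5 mV margin quoted above follow from simple arithmetic. A minimal sketch, assuming one bit occupies one pulse width (the convention used in the text) and taking the 15 ps peak values from Table 1; the function names are illustrative.

```python
def data_rate_gbps(pulse_width_ps):
    """Data rate in Gbit/s when one bit occupies one pulse width (in ps)."""
    if pulse_width_ps <= 0:
        raise ValueError("pulse width must be positive")
    return 1000.0 / pulse_width_ps


def half_phi_margin_mv(v_max1_mv, v_max2_mv):
    """Margin on either side of the midpoint threshold, i.e. phi / 2, in mV."""
    return (v_max1_mv - v_max2_mv) / 2.0


rates = {tau: round(data_rate_gbps(tau), 2) for tau in (15, 30, 40)}
margin_15ps = half_phi_margin_mv(15.6, 5.7)  # Table 1 row for 15 ps

for tau, rate in rates.items():
    print(f"{tau} ps -> {rate} Gbit/s")
print(f"margin at 15 ps: {margin_15ps:.2f} mV")
```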
The above discussion is based on the variation in driving pulse width (τ), where the time delay R_τ C_τ
at the receiver is kept equal to τ and the number of RLC segments of the channel is fixed at 30. However,
in reality, it is not always possible to vary R_τ C_τ at the receiver along with the driven pulse width. Hence,
we have also studied the performance of the proposed system when the driven pulse width is varied while the
channel segments and the time delay R_τ C_τ at the receiver are kept fixed. When
the driven pulse width is 30 ps and the product R_τ C_τ is kept at 40 ps, the values of V_max1 and
V_max2 are 48.5 mV and 15.7 mV. For α = 1, the value of V_T will be 32.1 mV. This new V_T will generate
an output pulse of 40 ps width in Chip-2, even though the pulse width generated by the driver is
30 ps. However, if we keep V_T equal to 40 mV, then the pulse width at the output of the comparator inside
Chip-2 will be only 26 ps and not 30 ps. Thus, if we choose V_T to be the midpoint of V_max1 and
V_max2, the pulse width inside Chip-2 will depend only on R_τ C_τ and not on the pulse width of the driver.
In Table 2, we have summarized the different widths of the received pulse under various conditions.

Figure 8. Proposed process flow to reconstruct square pulse sent by the transmitter.

Figure 9. Plot of threshold voltage V_T as a function of pulse width for α = 1, for which V_T is the midpoint of V_max1 and
V_max2.

Table 1. Threshold voltage (V_T) variation with respect to pulse width variation (number of RLC segments of the
channel is fixed at 30).

Pulse width   R_τ    C_τ    R_τ C_τ   V_max1   V_max2   Φ = V_max1 − V_max2   V_T = V_max2 + Φ/2
(ps)          (kΩ)   (fF)   (ps)      (mV)     (mV)     (mV)                  (mV)
15            1.5    10     15        15.6     5.7      10.0                  10.7
20            2.0    10     20        24.2     8.7      15.5                  16.5
25            2.5    10     25        33.2     11.6     21.6                  22.4
30            3.0    10     30        42.2     14.2     28.0                  28.2
35            3.5    10     35        51.1     16.7     34.3                  33.9
40            4.0    10     40        60.0     20.0     40.0                  40.0
45            4.5    10     45        68.1     19.9     48.3                  44.0
50            5.0    10     50        76.1     20.9     55.2                  48.5
55            5.5    10     55        83.6     21.0     62.6                  52.3
60            6.0    10     60        90.8     20.5     70.3                  55.6
Table 2. The received pulse width will be independent of the pulse width sent by the transmitter if the value of V_T is
dynamically changed.

Pulse width (ps)    R_τ    C_τ    R_τ C_τ   V_max1   V_max2   V_T    Pulse width (ps) at comparator
driven by Chip-1    (kΩ)   (fF)   (ps)      (mV)     (mV)     (mV)   output inside Chip-2
30                  4.0    10     40        48.5     15.7     32.1   40
40                  4.0    10     40        60       20       40     40
30                  3.0    10     30        48.5     15.7     32.1   30
30                  3.0    10     30        48.5     15.7     40     26
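
The dependence of the comparator output width on the threshold can be illustrated by measuring where a sampled ∆(t) crosses a given level. The following is a minimal sketch on a purely hypothetical triangular ∆(t) (50 mV peak, 100 ps base); it is not the waveform of Figure 5, so the widths it prints are not the values of Table 2. The helper names are illustrative.

```python
def crossing_times(times, values, level):
    """Linearly interpolated times at which the waveform crosses `level`."""
    found = []
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        v0, v1 = values[i], values[i + 1]
        # A crossing occurs when the two samples lie on different sides.
        if (v0 < level) != (v1 < level):
            found.append(t0 + (level - v0) * (t1 - t0) / (v1 - v0))
    return found


def width_above(times, values, level):
    """Width of the single interval during which the waveform is >= level."""
    xs = crossing_times(times, values, level)
    if len(xs) != 2:
        raise ValueError("expected exactly one pulse above the threshold")
    return xs[1] - xs[0]


# Hypothetical triangular delta(t): 0 mV at 0 ps, 50 mV at 50 ps, 0 mV at 100 ps.
T_PS = list(range(0, 101))
DELTA_MV = [50.0 - abs(t - 50) for t in T_PS]

width_mid = width_above(T_PS, DELTA_MV, 25.0)   # lower threshold
width_high = width_above(T_PS, DELTA_MV, 40.0)  # higher threshold

print(f"V_T = 25 mV -> {width_mid:.1f} ps")
print(f"V_T = 40 mV -> {width_high:.1f} ps")
```

Raising the threshold on the same waveform narrows the output pulse, which is the qualitative effect reported for the last two rows of Table 2.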

The methodology is also checked to determine the effect of the number of RLC segments, keeping the driven
pulse width and R_τ C_τ fixed at 40 ps. A higher number of RLC elements translates to an increase in channel
length. In this study, we have varied the channel length from 30 to 70 segments, which means that the channel
length is more than doubled. This study is important because when someone connects two chips (which
are in packages) on the PCB, the channel length may vary between two different I/O ports of these two chips.
In Figure 10, we have shown the plot of V_T as a function of channel length for a given pulse width τ, which
is 40 ps. It is interesting to see that, even though we have doubled the channel length, the threshold voltage
has not changed significantly. It has only changed from 40 mV to 34.5 mV, as shown in Table 3. Thus, for the
regeneration of the 40 ps pulse width, if the length of the channel connecting two chips is increased by a
factor of two, the voltage margin will be reduced by about 5.5 mV if V_T is kept at 40 mV. The margin for logic
1 will change from 20 mV to 14.5 mV (= 54.5 mV – 40 mV) and the voltage margin for logic 0 will change
to 25.5 mV (= 40 mV – 14.5 mV). These margins are calculated from Table 3. Therefore, length variation of
the channel has very little effect on the value of V_T, and the signal reconstruction is not affected drastically
by a higher number of channel segments. Table 3 summarizes the data of threshold voltage, V_T, for various
numbers of channel segments.

Figure 10. Plot of threshold voltage as a function of number of segments in channel.

Table 3. Number of RLC segments vs. V_T for the fixed pulse width, which is 40 ps in our case.

No. of RLC   R_τ    C_τ    R_τ C_τ   V_max1   V_max2   Φ = V_max1 − V_max2   V_T = V_max2 + Φ/2
elements     (kΩ)   (fF)   (ps)      (mV)     (mV)     (mV)                  (mV)
30           4.0    10     40        60.0     20.0     40.0                  40.0
40           4.0    10     40        58.4     18.4     40.0                  38.4
50           4.0    10     40        56.8     16.9     40.0                  36.9
60           4.0    10     40        55.5     15.5     40.1                  36.5
70           4.0    10     40        54.5     14.5     40.0                  34.5
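
The fixed-threshold margins discussed in this section can be recomputed from the peak values in Table 3. A minimal sketch, assuming the logic 1 margin is V_max1 − V_T and the logic 0 margin is V_T − V_max2 with V_T held at 40 mV; the dictionary and function names are illustrative.

```python
# Peak values in mV from Table 3: number of segments -> (V_max1, V_max2).
TABLE3 = {
    30: (60.0, 20.0),
    40: (58.4, 18.4),
    50: (56.8, 16.9),
    60: (55.5, 15.5),
    70: (54.5, 14.5),
}

V_T_FIXED = 40.0  # mV, threshold kept constant for every channel length


def fixed_threshold_margins(table, v_t):
    """Map each channel length to (logic 1 margin, logic 0 margin) in mV."""
    return {
        n: (round(v_max1 - v_t, 1), round(v_t - v_max2, 1))
        for n, (v_max1, v_max2) in table.items()
    }


margins_mv = fixed_threshold_margins(TABLE3, V_T_FIXED)

for n, (m1, m0) in margins_mv.items():
    print(f"{n} segments: logic 1 margin {m1} mV, logic 0 margin {m0} mV")
```

The logic 1 margin shrinks while the logic 0 margin grows as the channel lengthens, because both peaks of ∆(t) drop by a similar amount.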
5. Conclusion
In this work, we have shown a unique method to send a high frequency square wave pulse having pulse width
τ and to reconstruct the same pulse inside the receiver by generating a delayed signal using an R_τ C_τ delay
circuit whose delay is exactly the same as the driven pulse width (τ). It is also shown that there is a considerable margin
in this reconstruction process when both pulse width and channel length vary. The pulse width has been
varied from 15 ps to 60 ps, whereas the number of channel segments has been varied by more than a factor of two, from 30
to 70 segments. It is shown that keeping V_T fixed at 40 mV, if the driven pulse width changes from 40 ps to 30
ps, the present receiver is capable of reproducing a pulse of 30 ps width, corresponding to a data rate of 33.33
Gbit/s, at the comparator output with a very small voltage margin, close to 2 mV, provided the white noise is
less than that. However, in reality, this may be hard to achieve. If one changes V_T dynamically from 40 mV to
28 mV when the pulse width changes from 40 ps to 30 ps, then the voltage margin for the 30 ps pulse width will
be higher, about 14 mV for both logic 1 and logic 0, as shown in Table 1. It is also shown that,
keeping R_τ C_τ equal to 40 ps and changing the threshold voltage (V_T) dynamically from 40 mV to 32 mV, if
a pulse width of 30 ps is driven from the transmitter, then our methodology produces a pulse width of 40 ps
inside Chip-2.
Acknowledgment
We are thankful to the Department of Electronics and Information Technology (DEITY), Government of India,
for providing the financial grant under special manpower development program (SMDP) for VLSI to carry out
the work.
References
[1] Hall SH, Heck HL. Advanced Signal Integrity for High Speed Digital Designs. Hoboken, NJ, USA: John Wiley and
Sons, 2009.
[2] Johnson HW, Graham M. High Speed Digital Design: a Handbook of Black Magic. Upper Saddle River, NJ, USA:
Prentice Hall, 1993.
[3] Deutsch A, Coteus PW, Kopcsay GV, Smith HH, Surovic CW, Krauter BL, Edelstein DC, Restle PJ. On-chip
wiring design challenges for gigahertz operation. Proc IEEE 2001; 89, 4: 529-555.
[4] Tuuna S, Isoaho J, Tenhunen H. Analytical model for crosstalk and inter-symbol interference in point-to-point
buses. IEEE T Comput Aid D 2006; 25, 7: 1400-1410.
[5] Hasegawa H, Seki S. On-chip pulse transmission in very high speed LSI/VLSI. In: IEEE Microwave and Millimeter-Wave
Monolithic Circuits Symposium Digest; 1984; San Francisco: IEEE. pp. 29-33.
[6] Diao J. High speed on-chip interconnect modelling and reliability assessment. PhD, Rensselaer Polytechnic Institute,
Troy, NY, USA, 2006.
[7] Madrid A, Jacobson S, Bhattacharyya BK. Circuit design for point-to-point chip for high speed testing. In: US
Patent; No – 5532983: 2 July 1996.
[8] Hollis SJ. Pulse based on chip interconnect. PhD, University of Cambridge, Cambridge, UK, 2007.
[9] Intel. An introduction to the Intel® QuickPath Interconnect. Document number 320412-001US: 2009.
[10] Agilent Technology. Manual N5461A infiniium serial data equalization user’s guide: 2009.
[11] Seo J. High speed and low energy on chip communication circuits. PhD, University of Michigan, Ann Arbor, MI,
USA, 2010.
[12] Bhattacharyya BK, Rustein M. Can we ever send 25-100 Gb/s signals over 24” line length of printed circuit board
and still have mVolt signal at the receiver. In: IEEE 13th Topical Meeting on Electrical Performance of Electronic
Packaging; 25–27 October 2004; Portland, OR, USA: IEEE. pp. 15-18.
[13] Kay A, Claycomb T. Comparator with hysteresis reference design. Texas Instruments: 2014.


