Testable Design of Repeaterless Low Swing On-Chip Interconnect by Kadayinti, Naveen & Sharma, Dinesh K.
arXiv:1511.06726v1 [cs.AR] 20 Nov 2015
Testable Design of Repeaterless Low Swing
On-Chip Interconnect
K. Naveen* and Dinesh K. Sharma
Department of Electrical Engineering, Indian Institute of Technology Bombay,
Powai, Mumbai - 400076, India
*Email: naveen@ee.iitb.ac.in
Abstract—Repeaterless low swing interconnects use mixed
signal circuits to achieve high performance at low power. When
these interconnects are used in large scale and high volume
digital systems their testability becomes very important. This
paper discusses the testability of low swing repeaterless on-chip
interconnects with equalization and clock synchronization. The
transmitter is capacitively coupled and uses a weak driver.
The receiver samples the low swing input data
at the center of the data eye, converts it to rail-to-rail levels,
and synchronizes the data to the receiver’s clock domain.
The system is a mixed signal circuit and the digital components
are all scan testable. For the analog section, just a DC test has a
fault coverage of 50% of the structural faults. Simple techniques
allow integration of the analog components into the digital scan
chain increasing the coverage to 74%. Finally, a BIST with low
overhead enhances the coverage to 95% of the structural faults.
The design and simulations have been done in UMC 130 nm
CMOS technology.
Index Terms—Scan test, DFT, BIST, Repeaterless interconnect,
Mesochronous synchronizers.
I. INTRODUCTION
Low swing repeaterless interconnects have been researched
extensively for improving the performance of long intercon-
nects, while keeping the power consumption to acceptable
levels [1]–[4]. These techniques use low swing on the line
with equalization to enhance the bandwidth. Among the pro-
posed architectures for these circuits, the capacitively coupled
transmitter is one of the most promising architectures due
to its simplicity and robustness [5]–[7]. Such interconnect
links have unknown latencies which can be multiple cycles,
thus requiring appropriate clock synchronizing circuits at the
receivers [4]. The interconnect design reported in [4] uses
current mode signaling. At the receiver, a digitally controlled
delay line is used to generate a set of clock phases and a
foreground calibration routine selects the phase closest to the
center of the data eye. Though the system has the advantage
of using digital circuits for clock synchronization, it suffers
from phase quantization error and cannot track environmental
changes without interrupting normal operation. The
authors of [8] cite these problems and propose a background
phase synchronizer circuit that uses digital coarse correction
with analog fine correction. Thus, repeaterless interconnects
need to use sophisticated mixed signal circuits for achieving
high performance and robustness. While these repeaterless
interconnect solutions have shown considerable promise, they
will not be appealing for large scale deployment unless they
are testable. The digital components of these circuits are
typically simple and can be tested using fairly standardized test
methods. However, testing the analog components along with
the digital systems in large designs is a challenging problem.
This paper discusses the testability of repeaterless low swing
interconnects that use mixed signal circuits for achieving best
performance. The transmitter uses the capacitively coupled
feed-forward equalizer reported in [7]. A receiver that em-
ploys coarse digital correction and fine analog correction for
accurate adaptive synchronization is used [8]. While this paper
discusses testing of low swing interconnect using the above
circuits as transmitters and receivers, the solutions presented
can be used for other low swing interconnect systems as
well. The receiver synchronizer circuit is similar to a phase
or delay locked loop. BIST of PLLs is generally performed
by adding delays to the inputs of the phase detector and
capturing the divider’s outputs [9]. Such techniques are not
attractive for interconnect test as they require adding delays
in the clock or data path. Fault based testing that does not
interfere with the critical path is preferred for such circuits
[10]. The interconnect test in this paper uses standard scan test
for the digital components. Since the digital circuits are simple,
a 100% coverage is possible. For the analog sections, just a DC
test of the full link can detect 50.4% of the structural faults.
Simple techniques are used to integrate the analog components
into the digital scan chain which enhances the fault coverage
to 74.3%. The fault coverage can be increased to 94.8% by
using a BIST with a lock detector. The fault sets covered
by the scan test and the BIST overlap, but neither is a subset
of the other, so both tests are required to achieve 94.8%
coverage. The test circuits do not alter the critical path of
the design.
A. Notations and Fault models
The circuitry added only for the purpose of
testing is shaded grey in all the figures. These circuits are
turned off in normal operation. The structural fault model
[10] is used for the analog circuits.
B. Paper organization
The paper is organized as follows. Section II discusses the
architecture of the capacitively coupled transmitter and the
clock synchronizer with their test circuits. Section III describes
the BIST of the link. Section IV discusses simulation results and
Section V concludes the paper.
Fig. 1: Block diagram of the clock synchronizer system, divided into fine tuning and coarse tuning loops.
VCDL: voltage controlled delay line, Vc: control voltage, Sen: Scan enable signal, φTx: Transmitter clock phase, φRx:
Receiver clock phase, φd: sampling clock phase, Ten: Test mode enable.
II. CAPACITIVELY COUPLED TRANSMITTER AND CLOCK
SYNCHRONIZING RECEIVER
Fig. 1 shows the block diagram of the repeaterless inter-
connect. The transmitter is the capacitively coupled transmitter
from [7]. The receiver in Fig. 1 is a clock recovery circuit that
is used to generate a sampling clock that samples the data at
the center of the data eye [8]. The circuit has two control
loops for coarse and fine phase correction. The coarse phase
correction loop performs correction in discrete steps quantized
to the DLL phases. The fine correction loop performs continu-
ous correction using a voltage controlled delay line. The circuit
uses a phase detector to sense the phase difference between
the received data and the sampling clock. The error signal
from the phase detector is integrated and the integrated output
(Vc) controls a VCDL which delays the sampling clock. The
negative feedback loop thus formed pushes the sampling clock
to the center of the eye. The VCDL is designed to have a range
greater than one phase step of the DLL over a range of control
voltage corresponding to the window comparator thresholds.
If the fine control loop fails to lock while the control voltage
Vc is within the window thresholds, a coarse phase correction
request is issued by the window comparator and the control
voltage is reset to lie within the window by the strong charge
pump. This process repeats till lock is achieved. In steady state,
the DLL phase chosen is within the VCDL range from the
center of the data eye and the control voltage tunes the VCDL
to sample the data at the center of the data eye. Fig. 2 shows
the waveform of the control voltage and the chosen phase of
the DLL with time, as the circuit locks to the correct phase
from startup. The simulated circuit was designed in UMC 130
nm CMOS technology with a supply voltage of 1.2 V and a
data rate of 2.5 Gbps.
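The locking sequence described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the authors' simulation: the function name `simulate_lock`, the threshold values (0.4 V and 0.8 V), the charge pump step `dv`, and the VCDL range of 0.15 UI are all hypothetical. Phases are expressed in unit intervals (UI), and the phase detector is modeled as a simple early/late decision.

```python
def simulate_lock(target, phase_idx=0, n_phases=10, vl=0.4, vh=0.8,
                  dv=0.005, vcdl_range=0.15, steps=2000):
    """Behavioral sketch of the coarse/fine loops (all values are assumptions).

    target     : position of the data eye center, in UI (0 <= target < 1)
    phase_idx  : DLL phase selected at startup
    vl, vh     : window comparator thresholds (assumed values)
    dv         : Vc change per update from the weak charge pump (assumed)
    vcdl_range : VCDL delay span, in UI, as Vc moves from vl to vh (assumed)
    Returns (final phase index, coarse correction count, |phase error| in UI).
    """
    vmid = 0.5 * (vl + vh)
    phase_step = 1.0 / n_phases
    gain = vcdl_range / (vh - vl)           # VCDL delay per volt, UI/V
    assert vcdl_range > phase_step          # VCDL range exceeds one DLL step
    vc, coarse_requests, err = vmid, 0, 0.0
    for _ in range(steps):
        sample = (phase_idx * phase_step + gain * (vc - vmid)) % 1.0
        err = (target - sample + 0.5) % 1.0 - 0.5   # wrapped to [-0.5, 0.5)
        vc += dv if err > 0 else -dv        # early/late decision, integrated
        if vc > vh:                         # window exceeded: step to next phase
            phase_idx = (phase_idx + 1) % n_phases
            vc = vmid                       # strong charge pump resets Vc
            coarse_requests += 1
        elif vc < vl:                       # window exceeded: step to previous phase
            phase_idx = (phase_idx - 1) % n_phases
            vc = vmid
            coarse_requests += 1
    return phase_idx, coarse_requests, abs(err)
```

In this model, with a 10 phase DLL and the loop starting at phase 0, the loop settles with Vc inside the window and never issues more than five coarse corrections, whatever the eye position.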
Once lock is achieved, the phase difference between the
sampling clock and the receiver clock can be found from the
coarse tuning control word to an accuracy within the VCDL
phase tuning range. If the sampling clock is less than half a
cycle from the receiver’s clock, the data is delayed by half a
clock cycle to ensure reliable crossover to the receiver clock
domain.
Since the receiver and the transmitter operate in different
clock domains, the circuit is tested using two separate scan
chains which are the data path scan chain (Scan chain A)
and the clock control path scan chain (Scan chain B). The
data path scan chain begins at the transmitter, goes through
the interconnect and the phase detector at the receiver. The
output of this scan chain is the retimed data output of the
phase detector. The clock control path scan chain begins at
the window comparator, goes through the strong and the weak
charge pumps, the control FSM, UP DOWN counter and
finally the lock detector block. When test is enabled, the coarse
correction loop’s clock input is driven from an external scan
clock as shown in Fig. 1. The divider in this circuit can be
shared across multiple such receivers in the chip and tested
separately.
Fig. 2: Evolution of the fine correction control voltage Vc and
coarse correction DLL phase of the synchronizer from startup
to lock condition. (Axes: voltage in V against time in µs; the plot marks the window thresholds VL and VH and the selected DLL phases φ0 to φ3.)
A. The data path scan chain
The low swing transmitter (Fig. 3) is the first component of
the data path scan chain. It uses series capacitors to boost the
high frequency content of the data, thus compensating for the
low pass characteristics of the interconnect. The two capacitors
effectively form a two bit feed-forward equalizer. A weak
driver in shunt with the capacitors drives the interconnect with
a current source which enables arbitrarily low data activity
factors. The values of the capacitors are optimized using the
worst case design method described in [7]. A single ended
version is shown for brevity, but the actual implementation used
a differential interconnect. Flip-flops are added to probe the
driver side of the series capacitors (the shaded flip-flops in
Fig. 3) and thus enable the scan chain to cover all the nodes
up to the series capacitors. The additional latch in the data
path is added to optionally introduce a half cycle delay at the
transmitter, which is required for testing the phase detector at
the receiver. This latch is transparent during normal operation
and it can be absorbed into the buffer that drives the line.
Since the interconnect has high latency, it is not in the critical
path. Hence the delay added by the latch does not degrade the
maximum operating frequency of the system.
Fig. 3: The capacitive feed-forward equalizer with the weak
driver. All the flip flops are clocked by the same transmitter
clock, which is not shown.
Fig. 4: The termination of the interconnect at the receiver.
Fig. 4 shows the circuit diagram of the receiver termination
with the test circuit. It uses comparators with programmed
offset for the DC test. Each comparator is a single stage opamp
followed by an inverter (Fig. 5).
The input transistors of the comparator in Fig. 5 are deliberately
mismatched to have an offset of 15 mV. The interconnect
is designed for a logic swing of 60 mV and when the circuit
has no faults the comparator gets an input of 30 mV. The
input transistor sizes are 0.5µ/0.5µ and 0.8µ/0.5µ, which is
sufficient to overcome any mismatch due to the manufacturing
process. The window comparator compares the bias generated
at the receiver with another voltage divider bias generator
in the clock recovery circuit. The window comparator is
constructed using two comparators with a programmed offset
of +15 mV and -15 mV. Any fault in the weak driver or the
series capacitors at the transmitter or the termination resistor
at the receiver results in a mismatch in the two arms of
the differential interconnect. All such faults are detected by
the comparators. Some faults, like drain open in one of the
transistors of the transmission gate resistor, result in a dynamic
mismatch. This is not detectable at DC. Hence the window
comparator is designed to operate at the scan frequency (which
is assumed to be 100 MHz) and these faults are detected with
a simple toggling data pattern during scan. Common centroid
layout techniques can be used to reduce the inherent offset
in these comparators. Since these comparators are either used
at DC or at scan frequencies, the additional parasitics of the
common centroid layout are not a problem.
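The pass/fail decision made by these comparators can be sketched behaviorally as follows. The 15 mV offset and the 30 mV fault-free input are taken from the text; the function names, the use of a mirrored comparator for the logic 0 case, and the example fault values are assumptions made only for illustration.

```python
OFFSET_MV = 15.0   # programmed comparator offset quoted in the text

def offset_comparator(v_plus_mv, v_minus_mv, offset_mv=OFFSET_MV):
    """Output is 1 only if the differential input exceeds the built-in offset."""
    return int((v_plus_mv - v_minus_mv) > offset_mv)

def window_comparator(v_in_mv, v_ref_mv, offset_mv=OFFSET_MV):
    """Return (above, below); (0, 0) means v_in is within +/- offset of v_ref."""
    return (offset_comparator(v_in_mv, v_ref_mv, offset_mv),
            offset_comparator(v_ref_mv, v_in_mv, offset_mv))

def dc_test_passes(diff_mv_at_one, diff_mv_at_zero):
    """DC test with the line held at logic 1, then at logic 0.

    Arguments are the signed differential inputs seen by the receiver
    (+30 mV and -30 mV when fault free). Assumption: a mirrored comparator
    checks the logic 0 case.
    """
    ok_one = offset_comparator(diff_mv_at_one, 0.0) == 1
    ok_zero = offset_comparator(0.0, diff_mv_at_zero) == 1
    return ok_one and ok_zero

# Fault-free link: both polarities clear the 15 mV offset.
print(dc_test_passes(30.0, -30.0))
# Hypothetical fault that collapses the logic 1 level to 10 mV.
print(dc_test_passes(10.0, -30.0))
# Bias comparison: 20 mV of imbalance falls outside the window.
print(window_comparator(620.0, 600.0))
```

A fault that shrinks the received level below the offset, or that unbalances the two bias voltages by more than 15 mV, changes at least one comparator output and is therefore visible in the scan chain.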
Fig. 5: Comparator with offset used in the receiver termination.
All un-labelled transistors have W/L=0.5µ/0.5µ.
Fig. 6: Window comparator used in the receiver termination
circuit. All un-labelled transistors have W/L=0.5µ/0.5µ.
The last component in the data path scan chain is the
Alexander phase detector at the receiver. The circuit diagram
of the phase detector is shown in Fig. 7. When the link is
operated at the scan frequency, the phase detector always
asserts the UP signal. To test the other signal path, the half
cycle delay at the transmitter side is enabled, which makes the
phase detector assert the DN signal. Thus, with two passes
both these paths can be tested. The last flip-flop (which is
driven by either φRx or its complement) in the data path is used to insert
either a full or a half clock cycle delay. This is required for
transferring the data to the receiver clock domain. Half cycle
delay is chosen when the sampling clock is less than half a
clock cycle away from the receiver clock and this is done by
driving this last flip-flop with φRx. For test purposes this can
be controlled via the clock control path scan chain. When φRx
is chosen, it results in an increase in the length of Scan chain
A register by 1 flip-flop.
Fig. 7: The Alexander phase detector with scan.
B. Clock control path scan chain
The first circuit block in this scan chain comprises the
window comparator, the charge pumps and the control FSM.
Fig. 8 shows the circuit diagram of this part of the system. The
charge pump is an analog circuit and cannot be included in
the scan chain as is. To work around this problem the charge
pump is converted to a combinational circuit when scan test is
enabled. This is done by connecting the bias voltages for the
current sources in the charge pump to GND for the PMOS
source and to VDD for the NMOS sink. This essentially
converts the analog charge pump into a combinational circuit
with two inputs UP and DN and one output. Also when scan
is enabled, the window comparator’s input is connected to the
middle of the thresholds thus forcing the output to be “00”.
Two flip-flops are used to capture the comparator’s outputs,
which are read through the clock control path scan chain.
Scan chain A is used to make the phase detector assert
either the UP or the DN signal. This results in the control voltage Vc
being driven to a logic ‘1’ or ‘0’ respectively. When the scan
is disabled and the circuit is clocked before re-enabling scan
mode, the control FSM resets the control voltage to within the
window. Most of the faults in the charge pump result in the
control voltage not being reset to within the window or in not
being driven to the desired logic level when scan is enabled.
The comparators’ outputs can therefore detect most faults in the charge
pump circuit.
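As a rough illustration of this scan-mode behavior, the charge pump can be modeled as a pair of switches whose result is read back through the window comparator. The 1.2 V supply is taken from the text; the threshold values, the function names, and the open-switch fault flags are hypothetical and stand in for whatever structural fault is being considered.

```python
VDD = 1.2            # supply voltage from the text
VL, VH = 0.4, 0.8    # window thresholds: assumed values for illustration

def scan_mode_vc(up, dn, vc_prev, pull_up_open=False, pull_down_open=False):
    """Vc in scan mode, with the current sources acting as switches.

    pull_up_open / pull_down_open are hypothetical fault flags for an open
    in the PMOS or NMOS branch. If neither branch conducts, Vc holds its value.
    """
    if up and not dn and not pull_up_open:
        return VDD
    if dn and not up and not pull_down_open:
        return 0.0
    return vc_prev

def window_flags(vc):
    """Window comparator outputs (above VH, below VL) captured by the scan chain."""
    return (int(vc > VH), int(vc < VL))

def charge_pump_scan_test(pull_up_open=False, pull_down_open=False):
    """Apply UP, then DN, starting from mid-window; compare with expected flags."""
    vmid = 0.5 * (VL + VH)
    seen_up = window_flags(scan_mode_vc(1, 0, vmid, pull_up_open, pull_down_open))
    seen_dn = window_flags(scan_mode_vc(0, 1, vmid, pull_up_open, pull_down_open))
    return [seen_up, seen_dn] == [(1, 0), (0, 1)]
```

In this simplified model a branch that cannot drive Vc leaves the comparator outputs at "00", which differs from the expected response and flags the fault.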
To test the ring counter, it is first preloaded with a one-hot
value. Scan chain A is used to drive the control voltage in the
charge pump to logic ‘1’ or ‘0’. The window comparator’s
outputs enable the ring counter in this condition and the
UP/DN signal is driven to ‘1’ or ‘0’ depending on the control
voltage. This completes the pre-load step. By de-asserting
scan enable (Sen) and clocking the circuit, the ring counter can
be made to count in the desired direction before scan is enabled
again and the results are read out.
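The preload, clock, and read-back sequence for the ring counter can be sketched as below. This is a behavioral illustration only: the function names are hypothetical, and the mapping of a high control voltage to the "up" direction is an assumption, since the text states only that UP/DN follows the control voltage.

```python
def rotate(one_hot, up):
    """Rotate a one-hot list by one place; 'up' moves the 1 to the next index."""
    n = len(one_hot)
    if up:
        return [one_hot[(i - 1) % n] for i in range(n)]
    return [one_hot[(i + 1) % n] for i in range(n)]

def ring_counter_test(n_phases=10, start=3, vc_level=1, clocks=1):
    """Preload through scan, apply functional clocks, return the captured state.

    vc_level is the logic level forced on Vc through Scan chain A.
    Assumption: Vc at logic 1 selects counting up, logic 0 counting down.
    """
    state = [int(i == start) for i in range(n_phases)]   # one-hot preload
    above_vh = int(vc_level == 1)      # window comparator outputs
    below_vl = int(vc_level == 0)
    enable = above_vh | below_vl       # Vc outside the window enables counting
    up = bool(above_vh)
    for _ in range(clocks):            # clocks applied with Sen de-asserted
        if enable:
            state = rotate(state, up)
    return state                       # value scanned out afterwards

# Preloading position 3 and counting up once should move the 1 to position 4.
print(ring_counter_test(start=3, vc_level=1).index(1))
```

A counter fault shows up as a scanned-out word that is not one-hot or whose set bit is in the wrong position.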
The last circuit block in the clock control path scan chain is
the switch matrix. Defects in this block may make it impossible to
select a desired phase for locking or to deselect a
particular phase. This is tested by pre-loading the ring counter
with an all-zeroes pattern. This causes none of the phases to be
picked, resulting in Scan chain A not getting clocked. A simple
continuity test of Scan chain A can then detect a permanently
selected phase. Further, by pre-loading different one-hot values
into the ring counter and testing the continuity of Scan chain
A, all the paths in the switch matrix can be tested.
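The switch matrix procedure above amounts to one all-zeroes check followed by one check per phase. A minimal sketch is given below, assuming a simple fault model in which a path is either permanently selected or cannot be selected; the function names and the fault sets are hypothetical.

```python
def clock_present(select, stuck_on=frozenset(), stuck_off=frozenset()):
    """True if any DLL phase reaches the sampling clock (Scan chain A is clocked).

    stuck_on  : phase indices that are selected regardless of the control word
    stuck_off : phase indices that can never be selected
    """
    return any((bit or i in stuck_on) and i not in stuck_off
               for i, bit in enumerate(select))

def switch_matrix_test(n_phases=10, stuck_on=frozenset(), stuck_off=frozenset()):
    """Return a list of failure messages; an empty list means the test passed."""
    failures = []
    # All-zeroes preload: Scan chain A must NOT be clocked.
    if clock_present([0] * n_phases, stuck_on, stuck_off):
        failures.append("a phase is permanently selected")
    # One-hot preloads: Scan chain A must be clocked for every phase.
    for k in range(n_phases):
        select = [int(i == k) for i in range(n_phases)]
        if not clock_present(select, stuck_on, stuck_off):
            failures.append(f"phase {k} cannot be selected")
    return failures
```

Note that in this model a permanently selected phase is caught only by the all-zeroes step, because every one-hot pattern still produces a clock.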
Fig. 8: Control logic for generating UP/DN and Enable signals for the ring counter and UPst & DNst signals for the strong
charge pump. VH, VL are the upper and lower thresholds of window comparator respectively. All flip-flops are clocked with
the divided clock of the coarse control loop.
III. BIST
The Lock Detector in Fig. 1 is a simple saturating UP
counter. It logs the number of times a coarse correction request
has been issued. From any initial condition, the number of
coarse corrections needed can be no more than half the number
of DLL phases. The design used a 10 phase DLL and hence a 3
bit saturating UP counter is sufficient for the lock detector. For
BIST the interconnect is run with random data at speed. The
receiver is expected to lock within 2 µs, which corresponds
to 5000 cycles at 2.5 Gbps. Some of the faults in the charge
pump, which are not detectable in the scan test, can be detected
using this test. During scan test the charge pump’s current
sources were used as switches. This however masks a drain
source short fault in the current source transistors. The BIST
with the lock detector can detect such faults.
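The sizing of the lock detector and the length of the BIST window follow from simple arithmetic, sketched below. The pass criterion in `bist_pass` (a count above half the number of DLL phases indicates a fault) is an inference from the text rather than a stated rule, and all names are hypothetical.

```python
def lock_detector_bits(n_phases):
    """Bits needed to count up to n_phases // 2 coarse corrections."""
    max_corrections = n_phases // 2
    return max(1, max_corrections.bit_length())

def cycles_in_window(window_ps, bit_period_ps):
    """Number of bit periods in the lock window (integer picoseconds)."""
    return window_ps // bit_period_ps

class SaturatingUpCounter:
    """Up counter that stops at its maximum value instead of wrapping."""
    def __init__(self, bits):
        self.max_value = (1 << bits) - 1
        self.value = 0

    def increment(self):
        self.value = min(self.value + 1, self.max_value)

def bist_pass(coarse_requests, n_phases=10):
    """Assumed criterion: more than n_phases // 2 requests indicates a fault."""
    counter = SaturatingUpCounter(lock_detector_bits(n_phases))
    for _ in range(coarse_requests):
        counter.increment()
    return counter.value <= n_phases // 2

print(lock_detector_bits(10))            # 10 phases, at most 5 corrections: 3 bits
print(cycles_in_window(2_000_000, 400))  # 2 us at 2.5 Gbps (400 ps per bit): 5000
```

Because the counter saturates, a loop that keeps issuing coarse requests leaves the counter at its maximum value rather than wrapping back to a value that looks like a successful lock.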
The scan test of the charge pump covers only the main
charge pump path; the charge balancing path (which drives
the node Vp in Fig. 8) is not tested. Any fault in this second
path or in the amplifier in the charge pump results in
the node Vp drifting towards VDD or GND. This pushes one
of the current sources into the linear region and, as a result, causes
increased jitter in the recovered clock, which can degrade the
interconnect performance. The CP-BIST block in Fig. 8 is a
window comparator that is designed for a window of 150 mV.
Fig. 9 shows the circuit diagram of the window comparator
that is built with two comparators with a programmed offset of
150 mV. Once lock is achieved, the comparator output being
high is an indicator of a fault in the charge pump that was not
detected in the scan test.
The DLL in the receiver is not tested completely by this
BIST. This DLL can be treated as a stand-alone unit and using
the techniques reported in [11], [12] a complete test of the
DLL can be integrated with the interconnect test.
IV. SIMULATION RESULTS
The interconnect system was designed in UMC 130 nm
CMOS technology with a supply voltage of 1.2 V. The digital
components are tested using the scan test. Since the circuits
are logically simple in nature, the stuck at fault coverage is
100%. The digital coarse correction is operated at a divided
clock frequency which is in the range of scan test frequencies.
Hence the delay faults in this path are also tested with 100%
coverage.
Fig. 9: The window comparator used in the charge pump for
BIST. All un-labelled transistors have W/L=0.5µ/0.5µ.
TABLE I: Coverage of different types of faults
Defect               Coverage
Gate open            87.8%
Drain open           93.9%
Source open          93.9%
Gate drain short     93.9%
Gate source short    100%
Drain source short   100%
Capacitor short      100%
Total                94.8%
For the analog components, two DC tests with the intercon-
nect input at logic 1 and logic 0 respectively can detect 50.4%
of the structural faults in the circuit. Scan test of the analog
circuits in the receiver by converting the charge pump to a
combinational circuit enhances the coverage to 74.3%. Most
of the faults missed by the DC and scan test are detected in
the BIST which improves the fault coverage to 94.8%. Table I
tabulates the types of faults and their coverage statistics. The
total additional circuits required are shown in Table II.
TABLE II: Circuit and control input overhead
Entity                        Number
Flip-flop                     7
Comparators (DC)              4
Comparators (100 MHz)         2
D-Latch                       1
2×1 Multiplexer               2
3 bit saturating UP counter   1
Control signals               2
Logic gates                   6
V. CONCLUSIONS
This paper describes the test of repeaterless low swing
interconnects which use mixed signal circuits. The low swing
interconnect considered uses capacitive feed-forward equal-
ization at the transmitter and a clock synchronizing receiver
that uses digital coarse correction and analog fine correction.
The digital circuits are easily scan testable. Simple techniques
are used to test the analog components along with the digital
circuits for the scan test. A fault coverage of 50% is achieved
with a DC test, which is enhanced to 74% with scan test
and to 95% with a BIST. This enables the use of low swing
interconnect in large scale high volume digital systems.
ACKNOWLEDGEMENT
The authors would like to thank Prof. Maryam Shojaei
Baghini and Prof. Virendra Singh, both from IIT Bombay,
for useful discussions. The authors would also like to thank
Tata Consultancy Services (TCS) and the SMDP programme
of the Government of India for student scholarships and for
providing funds for EDA tools respectively.
REFERENCES
[1] Eisse Mensink, Daniel Schinkel, Eric Klumperink, Ed van Tuijl, and
Bram Nauta, “0.28pJ/b 2Gb/s/ch transceiver in 90nm CMOS for 10mm
on-chip interconnects,” in IEEE Int. Solid State Circuits Conf. (ISSCC)
Dig. of Tech. Papers, Feb. 2007, pp. 414–416.
[2] R. Ho, T. Ono, F. Liu, R. Hopkins, A. Chow, J. Schauer, and R. Drost,
“High-speed and low-energy capacitively-driven on-chip wires,” in IEEE
Int. Solid State Circuits Conf. (ISSCC) Dig. of Tech. Papers, 2007, pp.
412–414.
[3] Byungsub Kim and V. Stojanovic, “An energy-efficient equalized
transceiver for RC-dominant channels,” IEEE J. Solid-State Circuits,
vol. 45, no. 6, pp. 1186–1197, June 2010.
[4] Seung-Hun Lee, Seon-Kyoo Lee, Byungsub Kim, Hong-June Park, and
Jae-Yoon Sim, “Current-mode transceiver for silicon interposer channel,”
IEEE J. Solid-State Circuits, vol. 49, no. 9, pp. 2044–2053, Sept 2014.
[5] R. Ho, T. Ono, R.D. Hopkins, A. Chow, J. Schauer, F.Y. Liu, and
R. Drost, “High speed and low energy capacitively driven on-chip
wires,” IEEE J. Solid-State Circuits, vol. 43, no. 1, pp. 52–60, Jan.
2008.
[6] E. Mensink, D. Schinkel, E.A.M. Klumperink, E. van Tuijl, and
B. Nauta, “Power Efficient Gigabit Communication Over Capacitively
Driven RC-limited On-Chip Interconnects,” IEEE J. Solid-State Circuits,
vol. 45, no. 2, pp. 447–457, Feb. 2010.
[7] K. Naveen, M. Dave, M.S. Baghini, and D.K. Sharma, “A feed-forward
equalizer for capacitively coupled on-chip interconnect,” in Proc. 26th
IEEE Conf. VLSI Design, 2013, pp. 215–220.
[8] N. Kadayinti, M. Shojaei Baghini, and D. K. Sharma, “A Clock
Synchronizer for Repeaterless Low Swing On-Chip Links,” ArXiv e-
prints, Oct. 2015, http://arxiv.org/abs/1510.04241.
[9] Chun-Lung Hsu, Yiting Lai, and Shu-Wei Wang, “Built-in self-test for
phase-locked loops,” IEEE Trans. Instrum. Meas., vol. 54, no. 3, pp.
996–1002, June 2005.
[10] S. Kim and M. Soma, “An all-digital built-in self-test for high-speed
phase-locked loops,” IEEE Trans. Circuits Syst. II, vol. 48, no. 2, pp.
141–150, Feb. 2001.
[11] Cheng Jia and L. Milor, “A BIST circuit for DLL fault detection,” IEEE
Trans. VLSI Syst., vol. 16, no. 12, pp. 1687–1695, Dec. 2008.
[12] S. Sunter and A. Roy, “Purely digital BIST for any PLL or DLL,” in Proc.
12th IEEE Eur. Test Symp. (ETS), May 2007, pp. 185–192.
