Reduced pin-count testing, 3D SICs, time division multiplexing, test access mechanism, simultaneous bidirectional signaling by Soomro, Iftikhar A. & Samie, Mohammad
Received April 28, 2021, accepted May 7, 2021, date of publication May 17, 2021, date of current version May 28, 2021.
Digital Object Identifier 10.1109/ACCESS.2021.3081359
Reduced Pin-Count Test Strategy for 3D Stacked
ICs Using Simultaneous Bi-Directional Signaling
Based Time Division Multiplexing
IFTIKHAR A. SOOMRO, MOHAMMAD SAMIE, AND IAN K. JENNIONS
School of Aerospace, Transport and Manufacturing, Cranfield University, Cranfield MK43 0AL, U.K.
Corresponding author: Iftikhar A. Soomro (i.soomro@cranfield.ac.uk)
ABSTRACT 3D Stacked Integrated Circuits (SICs) offer a promising way to cope with the technology
scaling; however, the test access requirements are highly complicated due to increased transistor density and
a limited number of test channels. Moreover, although the vertical interconnects in 3D SIC are capable of
high-speed data transfer, the overall test speed is restricted by scan-chains that are not optimized for timing.
Reduced Pin-Count Testing (RPCT) has been effectively used under these scenarios. In particular, Time
Division Multiplexing (TDM) allows full utilization of interconnect bandwidth while providing low scan
frequencies supported by the scan chains. However, these methods rely on Uni-Directional Signaling (UDS),
in which a chip terminal (pin or a TSV) can either be used to transmit or receive data at a given time. This
requires that at least two chip terminals are available at every die interface (Tester-Die or Die-Die) to form a
single test channel. In this paper, we propose Simultaneous Bi-Directional Signaling (SBS), which allows a
chip terminal to be used simultaneously to send and receive data, thus forming a test channel using one pin
instead of two. We demonstrate how SBS can be used in conjunction with TDM to achieve reduced pin count
testing while using only half the number of pins compared to conventional TDM based methods, consuming
only 22.6% additional power. Alternatively, the advantage could be manifested as a test time reduction by
utilizing all available test channels, allowing more parallelism and test time reduction down to half compared
to UDS-based TDM. Experiments using 45nm technology suggest that the proposed method can operate at
up to 1.2 GHz test clock for a stack of 3-dies, whereas for higher frequencies, a binary-weighted transmitter
is proposed capable of up to 2.46 GHz test clock.
INDEX TERMS Reduced pin-count testing, 3D SICs, time division multiplexing, test access mechanism,
simultaneous bidirectional signaling.
I. INTRODUCTION
The transistor density in 2D integrated circuits has been expo-
nentially increasing following Moore’s law over the past sev-
eral decades; however, shrinking technology further is now
proving difficult due to thermal and power constraints [1].
A promising way forward is to stack the individual dies
vertically in the third dimension creating a single package
known as 3D Stacked Integrated Circuit (SIC). 3D SICs
overcome the problems of increasing interconnect path delays
and offer higher performance with a much smaller footprint.
The associate editor coordinating the review of this manuscript and
approving it for publication was Yong Chen.
This concept has been applied successfully to manufacture
processors, memories, and FPGAs [2]–[4].
One of the key enablers allowing 3D stacking of dies is the
vertical interconnects between the dies, known as Through
Silicon Vias (TSVs). However, TSVs bring about additional
challenges for testing [5]. First, TSVs occupy a significant
chip area, and therefore, there is a limit on the TSVs that
could be included in the design, and even more so for test-
ing purposes. The limited number of TSVs reduces the test
channel width available for transportation of test vectors to
and from the tester. Moreover, the increased number of dies
means higher transistor count and an increased number of
test patterns now need to be applied, and unlike 2D ICs, not
just once but at several instances during stacking of the dies,
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/ VOLUME 9, 2021
I. A. Soomro et al.: Reduced Pin-Count Test Strategy for 3D SICs Using SBS
such as pre-, mid-, and post-bond testing. It is evident that
as the test vector volume and the number of test instances
increase, the test access bottleneck caused by TSVs becomes
significant.
Nevertheless, TSVs allow very high bandwidth owing to
smaller channel resistance [6]. However, the same cannot be
utilized in testing as the scan chains restrict the maximum
shift frequency. The Flip-Flops, in the design, are converted
to scan-enabled flops after the functional front-end design to
enable scan-based tests. Consequently, the Flip-Flops, now
concatenated as shift registers forming a scan-chain, are not
optimized for timing. Because of these timing constraints
and the thermal and power constraints associated with higher
switching activity during testing, the maximum shift fre-
quency of the scan-chain is restricted to a few tens of MHz
and results in under-utilization of channel bandwidth and
increased test times. One solution is to send in the test data at
a high frequency and incorporate a mechanism to distribute
the data among multiple scan chains at a lower frequency,
such as by using Serializer-Deserializer (SerDes) or Time
Division Multiplexing (TDM). Because the full channel
bandwidth is utilized, fewer pins or TSVs are required; this
approach is termed Reduced Pin Count Testing (RPCT) [7]. RPCT
also results in a reduced number of test equipment channels
needed for testing a device, and the spare channels can be
utilized to test multiple devices in parallel, also known as
Multi-Site Testing [8], [9].
This paper proposes a TDM based RPCT technique
coupled with simultaneous Bi-Directional Signaling (SBS).
In contrast with the conventional TDM based technique,
which uses a communication channel in a uni-directional
fashion by either sending or receiving data at a particular time,
SBS allows simultaneous transmission and reception of the
data in a channel. The advantage of such a technique could
be a reduction in the number of test channels required for a
given Test Access Mechanism (TAM) or a decrease in test
time for a given number of test channels.
The rest of the paper is organized as follows. In section II,
the motivation for this work and the prior work in TAM
design methods using RPCT techniques and SBS is pre-
sented. In section III, we introduce the principle of operation
of TDM and SBS. SBS-TDM based test strategy for 3D SIC
is presented in section IV. Evaluation results of the proposed
SBS-based method versus UDS-based TDM test method
are presented in section V, followed by the conclusion in
section VI.
II. MOTIVATION AND PRIOR WORK
Testing is of vital importance as it ensures defect-free and
reliable devices. However, with the ever-advancing transistor
density, the test times and hence the cost of chips increase
significantly [10], which has made test cost reduction the focus
of significant research. The most commonly used Design for Test (DfT)
strategy is scan-based testing, which involves shifting-in the
test stimuli vectors serially, applying the test stimuli, and
scanning out of the response vectors. The test application
points are the memory elements or the flip-flops in the design,
which are modified such that these are observable and con-
trollable in test mode and are termed as scan-flops. The
scan-flops are then concatenated into serial shift-registers,
known as scan chains, such that the test data could be sequen-
tially scanned in and out using the core’s primary inputs
and outputs. The stimuli could be propagated through the
intermediate combinational logic, and the response could be
read out for comparison with the expected response. With
millions of flip-flops expected in modern core-based design,
a single serial scan mechanism such as an IEEE 1149.1 stan-
dard compliant JTAG port [11] may not be suitable in terms
of test times. Parallel test ports, such as Wrapper Parallel
Ports (WPP) of the IEEE 1500 Standard [12], allow the use
of multiple serial scan channels by temporarily using the
I/Os for test purposes. A test standard relying on similar
serial/parallel test access mechanisms has also been introduced for
3D-SICs [13].
Despite using the parallel test ports during testing,
the exponentially increasing transistor density and the lim-
ited number of chip terminals do not allow all the scan-
chains to be accessed at once. Therefore, the test process
is segmented into sessions, and during each session, only a
limited number of cores are accessed and tested. A number
of test patterns are scanned in one bit at a time, making
the entire process significantly time-consuming and costly.
In general, the test time increases with the test data volume
(and hence the chip complexity) and decreases with the avail-
able channel width for parallel test ports and the scan shift
frequency. Test compression methods have been frequently
employed to reduce the test data volume [14]. However,
this method requires additional on-chip resources such as
decompressors and compactors. Also, beyond a certain point,
test compression reduces the test coverage. Increasing the
available channel width for parallel test ports significantly
reduces the test time [15], [16], but the bottleneck, in this
case, is the limited number of chip terminals and TSVs
in the design. Increasing the scan shift frequency offers a
proportional reduction in test times; however, the most critical
limiting factor, in this case, is the scan chains, which
are not optimized for timing and operate at low frequency.
This results in the loss of usable tester and the chip ter-
minal/TSV bandwidth which are capable of much higher
speeds.
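As a rough illustration of the scaling relationship described above, the following sketch models scan shift time as test data volume divided by the product of channel width and shift frequency. The function name, the first-order model (which ignores capture cycles and test-session overhead), and the numbers are illustrative assumptions and are not taken from this work.

```python
def scan_shift_time(test_data_bits: int, channel_width: int, shift_freq_hz: float) -> float:
    """First-order estimate of scan shift time in seconds.

    Assumes each channel shifts one bit per shift-clock cycle and ignores
    capture cycles and session-switching overhead (a simplification).
    """
    if channel_width <= 0 or shift_freq_hz <= 0:
        raise ValueError("channel_width and shift_freq_hz must be positive")
    return test_data_bits / (channel_width * shift_freq_hz)


# Hypothetical numbers: 1 Gbit of test data, 8 channels, 50 MHz shift clock.
base = scan_shift_time(10**9, channel_width=8, shift_freq_hz=50e6)
wider = scan_shift_time(10**9, channel_width=16, shift_freq_hz=50e6)
faster = scan_shift_time(10**9, channel_width=8, shift_freq_hz=100e6)
print(base, wider, faster)
```

Under this simple model, doubling either the channel width or the shift frequency halves the shift time, which is why both levers appear in the discussion above.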
RPCT-based methods have two significant advantages.
First, they allow optimal utilization of the tester and I/O
bandwidths, and second, they use fewer chip terminals than
Full Pin Count Testing. Additionally, in 3D-SICs, the wafers
are thinned to expose the TSVs hidden in the substrate [5],
[17], and the thinned die may not be able to withstand the
forces exerted by the tester probes; and therefore, only lim-
ited test channels may be available. Hence, RPCT naturally
lends itself to testing 3D-SICs. Techniques such as SerDes
and TDM, which allow increasing the scan frequency while
addressing the scan chains’ low frequency, have been fre-
quently employed to achieve RPCT.
A TDM based access mechanism for serial Reconfig-
urable Scan Networks (RSNs), such as those based on IEEE
1687 standard (aka iJTAG) [18], was proposed in [19]. Unlike
traditional scan design, an RSN allows dynamic scan path
reduction as and when required. However, using a single
serial access interface limits the practicability of such an
approach for high volume test vector transportation. The
future SICs are expected to contain millions of flip-flops
necessitating high bandwidth scan-testing employing Parallel
Test Ports. The authors in [20] and [21] highlighted the
notion of using virtual TAMs and Serializer and De-serializer
(SerDes). In [22], the authors discuss a combination of test
vector compression and RPCT to reduce test times. The
authors in [23] proposed using Multi-Valued Logic (MVL)
for tester-to-chip communication. The use of MVL increases
the data rate for a given clock frequency; however, it necessitates
analog-to-digital converters with calibration schemes,
which may complicate the implementation and add significant
chip area.
Much of the above work has been focused on 2D chips, and
a majority of these methods are applicable to 3D designs as
well. Nonetheless, the implementation is not straightforward,
and 3D technology-specific concerns must be considered.
The Test Access Mechanism (TAM) design, which refers to
the insertion of required logic between the chip’s primary
I/Os and the individual cores to enable test vector transport,
is known to be an NP-Hard combinatorial problem. The
design choice of a TAM such that the test requirements of
all the dies in 3D SIC are met using limited resources signif-
icantly affects the test time. Several researchers focus on test
time reduction by optimal TAM design for 3D SICs [24], [25].
The authors in [26], [27] proposed a TDM based RPCT
test strategy in 3D-SICs. In [27], the authors showed sig-
nificant improvement in test times using TDM compared to
conventional TAM design methodology. The authors defined
‘global channels’ as communication channels traversing ver-
tically through multiple TSVs/dies, as opposed to point-to-
point communication channels (the same definition of ‘global
channels’ will be used in this article). The global channels
were operated at a higher frequency, and the data were mul-
tiplexed to dies, and in turn to the cores at lower frequencies.
The focus of the previous research work in testing and test
time reduction has been on conventional UDS signaling. This
simplifies the design as the standard I/O cells can be utilized;
however, Simultaneous Bidirectional Signaling (SBS) significantly
improves the throughput of communication channels,
which in turn can reduce test time. The author in [28] briefly
discussed the dynamics of using SBS in chip testing and its
future potential. The research in SBS has been mostly focused
on the normal mode (as opposed to test mode), point to point
communication links, focusing on throughput and power effi-
ciency. Following the initial concept of SBS by the authors
in [29], the researchers have proposed various methods to
enable SBS. Broadly, the SBS design methodology can be
classified into differential [30], [31] or single-ended designs
[29], [32]–[34], current mode [30], [31] or voltage mode
transceivers [33], [29], [34], or based on channel character-
istics such as on-chip [30], [35] or off-chip communication
[29], [32]. 3D SICs offer a promising prospect for SBS due
to the low channel impedance of the TSVs. The authors in
[36], [37] proposed single-ended SBS design methodologies
for use in 3D SICs and reported significant improvement in
chip area, throughput, and power consumption compared to
2 x UDS-based TSVs.
While both TDM/SerDes and SBS methods improve the
efficiency of a communication channel, the key difference is
that the former is aimed at maximizing the use of available
channel bandwidth, whereas the latter allows using a single
pin to form the communication channel. The previous works
have focused on using these methods separately in the chip
testing scenario, with SBS used to decrease the test time
[38], and TDM/SerDes used to minimize test channels (pins/TSVs)
[27]. Nevertheless, a combination of both methods
presents new prospects in the 3D SIC test, allowing mini-
mization of test resources as well as test time reduction. This,
however, presents several challenges. Firstly, the previous
research in SBS has been focused on conventional point-to-
point communication, which simplifies the implementation.
In this scenario, communication is required between a high-
frequency source at the near-end (Scan-in vectors from the
tester) and several far-end transmitters operating at much
slower speeds (Scan-out vectors from the scan chains), for
which the design is not straightforward. Secondly, unlike
[38], where SBS was used in the Full Pin Count Test (FPCT)
scenario with the low-frequency operation, the design con-
siderations in this case are complicated by high-performance
requirement.
This paper explores the feasibility of integrating SBS with
TDM based RPCT method, with a particular focus on its
application in testing 3D SICs. We present a potential SBS
transceiver design capable of high-frequency operation and
evaluate the design tradeoffs. The design challenges and
possible solutions for integrating SBS with 2D-TDM based
Test Access Mechanism, which requires a global channel
traversing through multiple TSVs, are studied. Moreover,
the strategy to extend TDM-SBS based test methodology
to pre-, mid-, and post-bond test instances of 3D SICs is
presented.
III. BACKGROUND
This section describes the operation of TDM based test
methodology, followed by an overview of Simultaneous Bidi-
rectional Signaling. The working concept is illustrated using
example cases that mainly focus on scan-based test architec-
tures. The same examples will be subsequently used as the
test cases for evaluation.
A. TIME DIVISION MULTIPLEXING
Consider an example 3D-SIC with three dies, as shown in
Fig. 1. It is assumed that every die is composed of 2 cores
with a total of 3 scan chains and that all dies are identical
(details are only shown for the first die for clarity). In general,
FIGURE 1. An example implementation of TDM for scan-test application
based on [27].
the number of scan chains is far greater than the available
number of test channels and depending on the test schedule,
only a small subset of cores are selected at a time. Therefore,
this example represents the case for a particular test session
in the overall test schedule. A single test channel is shown,
which originates at the scan-in chip terminal and terminates at
the bottom die’s (1st die) scan-out chip terminal. It is assumed
that the tester communicates with the 1st die through these
chip terminals.
The TDM design for this example is based on the method
proposed in [27] in which the authors proposed using separate
2D-TDM for the vertical (inter-die) and horizontal (intra-die)
communication. At the input, the incoming data is available
to all the dies, and in turn, the scan chains, using the global
TSV channel. The data is demultiplexed from the Scan-in
pin to the cores by controlling the scan chains’ clock signal,
such that data is scanned-in only to the scan chain which
receives the positive clock edge. At the scan-chain output,
the data is multiplexed on the scan-out pin using two tri-
state buffer stages. The first stage tri-state buffers (one for
every scan-chain) are controlled such that only one buffer
is active at a time, essentially serving as 3 to 1 multiplexer.
In the second stage, tri-state buffer (one for every die) in-
turn multiplexes the data onto the scan-out pin, one die at a
time.
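The demultiplexing and multiplexing described above can be sketched behaviourally as a round-robin assignment of time slots. This is a minimal model, not the circuit of [27]: the slot order (dies rotate every global clock cycle, chains rotate on each visit to a die), the function names, and the use of integer labels in place of bits are all assumptions made for illustration.

```python
from typing import List


def tdm_demux(stream: List[int], n_dies: int, chains_per_die: int) -> List[List[List[int]]]:
    """Distribute a serial stream over scan chains, one item per global clock.

    Slot t goes to die (t mod n_dies); successive visits to a die rotate
    over its scan chains. This slot order is an assumption of the sketch.
    """
    chains = [[[] for _ in range(chains_per_die)] for _ in range(n_dies)]
    for t, bit in enumerate(stream):
        die = t % n_dies
        chain = (t // n_dies) % chains_per_die
        chains[die][chain].append(bit)
    return chains


def tdm_mux(chains: List[List[List[int]]], n_bits: int) -> List[int]:
    """Interleave scan-chain contents back onto one pin using the same slot order."""
    n_dies = len(chains)
    chains_per_die = len(chains[0])
    out = []
    for t in range(n_bits):
        die = t % n_dies
        chain = (t // n_dies) % chains_per_die
        idx = t // (n_dies * chains_per_die)  # position within the chosen chain
        out.append(chains[die][chain][idx])
    return out


# Three dies with three scan chains each, as in the example of Fig. 1.
stream = list(range(18))  # labels 0..17 stand in for scan-in bits
chains = tdm_demux(stream, n_dies=3, chains_per_die=3)
recovered = tdm_mux(chains, len(stream))
print(chains[0][0], chains[1][0], chains[0][1])
```

In hardware only one tri-state buffer drives the scan-out pin in any slot; the sketch captures that exclusivity by selecting exactly one (die, chain) pair per global clock cycle.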
To appropriately demultiplex the data at the input and mul-
tiplex at the output, the above arrangement requires a control
circuit to select the appropriate die, core, and hence the scan-
chain at every clock cycle. This is achieved using a clock
divider circuit that can be constructed using shift registers as
a Ring Counter (RC). Fig. 2 (a) shows one implementation of
the clock divider circuit. The incoming Global Clock signal
(Gclk) is first divided to generate Die clock (Dclk). As there
are three dies in this example, a 3-bit RC is used to generate
three 120° out of phase die clocks (Dclk1,2,3) running at 1/3rd
FIGURE 2. Control circuit for TDM (a) Generation of core and die clocks
from global test clock using ring counters, the bottom RC generates die
clocks, the top RC generates core clocks (b) timing diagram of the die and
core clocks.
of the Gclk frequency. To ensure this, the RC is initialized
with a one-hot bit sequence, in which only one bit is set high
(the sequence 100 is used in this case). The die clock serves two
purposes. First, it allows multiplexing the data to the Scan-out
pin by activating the second stage tri-state buffer of only one
die at a time (Fig. 1). Secondly, it is used to derive the Core
Clock (Cclk) signals for the individual cores/scan chains,
as shown in Fig. 2(a). The Cclks serve two further purposes:
first, they allow demultiplexing the data from the scan-in
terminal to the scan chains, and secondly, they activate the
first stage tri-state buffers of the cores to be multiplexed at the
output.
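The behaviour of the one-hot ring counter can be sketched in a few lines. This is a behavioural model only; the function name and the choice of rotation direction are assumptions, and each step of the loop stands for one Gclk cycle.

```python
def ring_counter(n_bits: int, n_cycles: int):
    """Yield the one-hot state of an n-bit ring counter for each input clock cycle."""
    state = [1] + [0] * (n_bits - 1)  # one-hot initialization, e.g. 100 for 3 bits
    for _ in range(n_cycles):
        yield tuple(state)
        state = [state[-1]] + state[:-1]  # rotate the single '1' by one position


# Six Gclk cycles of a 3-bit ring counter: each output is high once every
# three cycles, i.e. at 1/3 of the input frequency, offset by one cycle each.
die_clocks = list(ring_counter(3, 6))
for cycle, (d1, d2, d3) in enumerate(die_clocks):
    print(cycle, d1, d2, d3)
```

Because exactly one output is high in every cycle, the same signals can both divide the clock and act as mutually exclusive enables for the second stage tri-state buffers.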
The number of flip-flops chosen for Cclk generation
determines the lowest achievable frequency (fmin) and
the number of scan chains serviceable by the Cclk. For k
flip-flops, a minimum frequency of Dclk/k can be achieved,
and at most k scan chains can be multiplexed. Multiples
of fmin can also be produced by OR-gating alternate Cclk
outputs, as shown in Fig. 2(a). In this way, clocks of different
frequencies can be provided to the scan chains depending on the
scan frequency supported by each core. For example, in Fig. 1,
Cclk13, which runs at twice the minimum core clock frequency
(Dclk/k), is used to serve the scan chain in core a
of the dies. Fig. 2(b) shows the timing diagram of the derived
clock signals. The clock division using this arrangement
produces a duty cycle with on-time equal to the clock period of
the Gclk, ensuring the second stage tri-state buffers are ‘on’
for the entire Gclk cycle.
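The effect of OR-gating alternate ring-counter outputs can be checked with a small behavioural sketch. It assumes a 4-bit ring counter (k = 4), which is a value chosen here for illustration and not stated in the text, and it counts pulses in units of the counter's input clock.

```python
def one_hot_outputs(k: int, n_cycles: int):
    """Waveforms of the k outputs of a one-hot ring counter over n_cycles input clocks."""
    return [[1 if cycle % k == i else 0 for cycle in range(n_cycles)] for i in range(k)]


def count_pulses(wave):
    """Count rising edges, treating a '1' in the first sample as a rising edge."""
    return sum(1 for i, v in enumerate(wave) if v == 1 and (i == 0 or wave[i - 1] == 0))


k = 4  # hypothetical number of flip-flops in the core-clock ring counter
cclk = one_hot_outputs(k, 16)
# OR-gate alternate outputs (first and third) to obtain a clock at 2 * fmin.
cclk13 = [a | b for a, b in zip(cclk[0], cclk[2])]

pulses_min = count_pulses(cclk[0])
pulses_or = count_pulses(cclk13)
print(pulses_min, pulses_or)
```

Over 16 input cycles a single output pulses four times (fmin = input/k), whereas the OR of two alternate outputs pulses eight times, i.e. at twice fmin.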
B. SIMULTANEOUS BI-DIRECTIONAL SIGNALING
Conventional TAM design using TDM requires separate out-
put and an input port to form a test channel, as shown
in Fig. 3(a). The proposed TAM design is based on Simul-
taneous Bi-Directional Signaling (SBS) in which a channel
could be formed using one pin only, as illustrated in Fig. 3(b).
SBS is different from traditional bi-directional pins in that the
latter can only be configured either as an input or an output at
a given time (half-duplex) and is therefore considered to be
UDS for test purposes.
The working principle of SBS is elaborated using an exam-
ple die consisting of a single scan chain of two-bit length,
FIGURE 3. An illustration of simultaneous Bi-directional signaling (a) Conventional Uni-directional signaling using two wires (b) Simultaneous
Bi-directional signaling using one wire (c) Block diagram of SBS working principle–test channel formation for a two-bit scan chain.
as illustrated in Fig. 3(c). Using a conventional UDS scheme,
two pins would be required, one to connect the input of
the scan chain to the Scan-In (SI) signal and another to
connect the output of the scan chain to the Scan-Out (SO)
signal (Fig. 3(a)). However, using SBS, a single pin can transmit
SO and receive SI simultaneously. To achieve this, the signal at
the chip terminal is ternary encoded instead of binary. The
scan-chain output SO is fed to a transmitter Tx1, which
could be designed as a buffer. Tx1 (with an output
impedance of R1) drives the chip terminal from one end,
whereas a similar transmitter Tx2 (with output impedance R2)
is assumed to be driving the same chip terminal with the SI
signal from another die (in Fig. 3(c), R1 and R2 depict the
transmitters’ internal impedances but are shown as external
resistors for clarity). Depending on the state of SI and
SO, the voltage Vx at the chip terminal node will be
pulled low (0 V) or high (Vdd) when both ends
are driven low or high, respectively; however, Vx will
take on an intermediate value (Vxm) when the two transmitters
are in opposite states (10 or 01). The value of Vxm
depends on the impedances R1 and R2, and assuming both to
be the same, Vxm equates to 1/2 Vdd.
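The three voltage levels follow from treating the two transmitters as ideal voltage sources behind their output impedances. The sketch below applies superposition to that resistive divider; the normalized supply of 1.0 and the default impedance values are assumptions for illustration, and the function name is hypothetical.

```python
def terminal_voltage(so: int, si: int, vdd: float = 1.0,
                     r1: float = 1.0, r2: float = 1.0) -> float:
    """Voltage Vx at the shared terminal driven by Tx1 (SO via R1) and Tx2 (SI via R2).

    Each transmitter is modelled as an ideal source (0 or vdd) in series
    with its output impedance; channel resistance is ignored.
    """
    v_so = so * vdd
    v_si = si * vdd
    return (v_so * r2 + v_si * r1) / (r1 + r2)


# Enumerate the four driver states for equal impedances.
levels = {(so, si): terminal_voltage(so, si) for so in (0, 1) for si in (0, 1)}
print(levels)
```

With equal impedances the two opposite-state cases (10 and 01) collapse onto the same mid-level, which is why the terminal voltage alone cannot identify the incoming bit and the receiver must also use its own outgoing bit.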
The ternary encoded signal Vx at the chip terminal can now
be used by each die to determine whether the incoming signal
(SI) is the same or opposite of the signal (SO) being trans-
mitted. The Ternary Decoder (TD) block shown in Fig. 3(c)
receives the Vx signal and the SO signal (taken just before
Tx1). To determine the Vx signal state, two reference voltages
are required, a high reference voltage Vrefh that is midway
between Vdd and Vxm, and a low reference voltage Vrefl
halfway between Vxm and the ground. An analog multiplexer
is employed, selecting Vrefh if SO is high and Vrefl when SO
is low. A voltage comparator circuit is used to compare the
Vx signal with the Vref (reference voltage from the Analog
Multiplexer). Table 1 lists all possible SI and SO values and
the state of the transceiver in each case. When SO is 1, Vx can
only take on the value of 1 (Vdd) or Vxm (1/2 Vdd). A high
value at Vx implies that SI must also be high, whereas a 1/2 Vdd
voltage level indicates that SI must be zero. The comparator
determines this by comparing Vx with Vref from the
Analog Multiplexer, which in this case (SO = 1) would be
Vrefh. Similarly, when SO is low, the comparator receives
the lower reference Vrefl and compares it with Vx, which
could either be low (meaning SI = 0) or 1/2 Vdd (meaning
SI = 1). In all cases, the Decoded Scan-In (DSI) signal
produced by the comparator (and hence the TD) is the same
as the original SI, as shown in Table 1.
FIGURE 4. Neural network presentation of the ternary decoder.
TABLE 1. SBS transceiver states.
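The decoding rule described above can be checked with a short sketch that enumerates the four SO/SI combinations. It assumes equal transmitter impedances, a normalized Vdd of 1.0, and an ideal comparator; the constant and function names are illustrative and do not come from the paper.

```python
VDD = 1.0                 # normalized supply (assumption)
VXM = VDD / 2             # mid-level for equal transmitter impedances
VREFH = (VDD + VXM) / 2   # midway between Vdd and Vxm
VREFL = VXM / 2           # midway between Vxm and ground


def terminal_level(so: int, si: int) -> float:
    """Ternary level at the shared terminal for equal impedances."""
    return (so + si) * VDD / 2


def ternary_decode(vx: float, so: int) -> int:
    """Recover SI from the terminal voltage and the locally transmitted bit SO."""
    vref = VREFH if so else VREFL  # analog multiplexer selects the reference
    return 1 if vx > vref else 0   # ideal comparator


rows = []
for so in (0, 1):
    for si in (0, 1):
        vx = terminal_level(so, si)
        rows.append((so, si, vx, ternary_decode(vx, so)))

for so, si, vx, dsi in rows:
    print(f"SO={so} SI={si} Vx={vx:.2f} DSI={dsi}")
```

In every row the decoded value equals the transmitted SI, which is the property the transceiver state table is meant to convey.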
A mathematical presentation of the ternary decoder can
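be constructed using the neural network shown in Fig. 4. The
decoder output is given by (1), where U(·) denotes the unit step function:

$$\mathrm{DSI} = U\Big(V_x - \big[\,V_{refl}\,U(V_{xm} - SO) + V_{refh}\,U(SO - V_{xm})\,\big]\Big) \tag{1}$$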
VGND and Vdd in Fig. 5 are constant values, while
Vx varies depending on the SO and SI voltages. The position
of Vx is determined by the superposition of SO and SI applied
to the resistive network formed by R1 and R2, calculated as:

$$V_x = \frac{R_2\,SO + R_1\,SI}{R_1 + R_2} \tag{2}$$

FIGURE 5. Hyperplanes created by the neurons in Fig. 4.
The neurons in the network of Fig. 4 generate two hyper-
planes, a fixed position hyperplane of Vxm, and a dynamic
position hyperplane of Vref. The network first compares the input
value Vx with the fixed hyperplane Vxm; then, based on the
result, it triggers Vrefl if Vx < Vxm, and Vrefh if
Vx > Vxm. Fig. 5 demonstrates a case in which Vrefl is triggered
as the valid hyperplane for the second neuron because
Vx < Vxm. Finally, in accordance with Table 1, the network
generates logic value ‘1’ as Vx > Vrefl.
IV. METHODOLOGY
This section demonstrates the feasibility of SBS-based
TDM by taking the UDS-based TDM test case presented
in section III and modifying it to include an SBS
transceiver, as shown in Fig. 6. The design of an SBS
transceiver is dependent on the characteristics of the chan-
nel; therefore, the design considerations are different for the
communication channel between the tester and the first Die,
and for inter-die communication (using TSVs). As the TSV
channel is much less resistive in nature [6], the design for
an SBS transceiver is relatively simple and can be achieved
using the core transistors. Therefore, to avoid complexity,
we assume that tester-to-die communication is done using the
existing UDS method, and we only propose SBS for inter-die
communication through TSVs.
A. TRANSCEIVER DESIGN
SBS transceivers can be implemented using several design
methods. Differential mode transceivers [39] are used for
high-frequency applications; however, the design is often
complicated, and the requirement of two wires limits its
use in TAM design where single-ended one wire systems
are preferable. Single-ended SBS transceivers for use in the
TDM scheme require three main components, the transmitter,
receiver, and the control circuit, including TDM switching
circuitry.
The transmitter design mostly depends on two factors,
the required performance and the power consumption. As the
SBS transmitter involves ternary coding, the design deviates
from the static CMOS logic to generate the intermediate
voltage levels, resulting in high static currents. The transmit-
ter was designed as an inverter, followed by a Transmission
Gate (TG) acting as an analog switch, as shown in Fig. 7 (a).
The TG allows turning off the transmitter when not in use;
for instance, to limit static current or during functional mode
FIGURE 6. SBS-TDM based test access mechanism for the example case
of Fig. 1.
operation. The TG also allows turning on and off the transmit-
ters only during the specified intervals to allow time-division
multiplexing. The transistor widths were carefully chosen to
find the right balance between performance (maximum sup-
ported frequency) and acceptable power consumption. This
trade-off is discussed further in section V. The transmitter
design in Fig. 7(a) is equivalent to a tri-stated inverter whose
intermediate nodes between the Pull-up PMOS transistors
and Pull-Down NMOS transistors are connected. While this
functionality could also be achieved using a tri-state inverter,
the transmission-gate based design has the advantage that,
during the on-state of the transmitter, the effective resistance
of the transmission gate is the parallel combination of the
PMOS and NMOS resistances and is therefore lower than that of
either the NMOS or the PMOS alone. This arrangement reduces
delay and improves performance.
The transmitter design must also account for the desired
Vxm voltage level when the transmitters at either end are in
the opposite states. Ignoring the TSV resistance and assuming
the pass gate as an ideal switch, the two transmitters’ equiv-
alent electrical model when sending opposite signals (10,01)
is shown in Fig. 7(b). If RP and RN denote the resistance of
the PMOS and NMOS transistors of the inverter, respectively,
the middle voltage level Vxm is given by:
$$V_{xm} = V_{dd}\,\frac{R_N}{R_P + R_N} \tag{3}$$












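where the on-resistances of the PMOS and NMOS transistors can be approximated as:

$$R_P = \frac{1}{\mu_p C_{ox}\,\frac{W_P}{L}\,(V_{GS} - V_t)} \tag{4}$$

$$R_N = \frac{1}{\mu_n C_{ox}\,\frac{W_N}{L}\,(V_{GS} - V_t)} \tag{5}$$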
FIGURE 7. SBS transceiver implementation (a) Transmitter (b) Equivalent
electrical model of the transmitter when sending opposite signals (10,01)
(c) Sense-Amplifier based Ternary Decoder (Receiver).
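For Vxm = 1/2 Vdd, the factors (VGS − Vt) can be
assumed to be similar for both PMOS and NMOS; also taking
the channel length L and Cox to be the same for both devices, the condition RP = RN gives:

$$\frac{W_P}{W_N} = \frac{\mu_n}{\mu_p} \tag{6}$$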
Therefore, to achieve Vxm = 1/2 Vdd, the width ratio
between the PMOS and NMOS transistors should be designed so that
the on-state current of the low-mobility PMOS is similar to
that of the higher-mobility NMOS. Moreover, the TG transistor
sizes are chosen to be the same as those of the inverter, ensuring
that the TG has a constant on-resistance with the same strength as the
inverter.
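The sizing argument can be illustrated numerically. The sketch below uses a first-order linear-region on-resistance, R = 1 / (mu * Cox * (W/L) * (VGS - Vt)), and a normalized mobility ratio of 2.5; both the model and every numeric value are assumptions chosen for illustration, not device data from this work.

```python
def on_resistance(mobility: float, width: float, length: float = 1.0,
                  cox: float = 1.0, v_overdrive: float = 1.0) -> float:
    """First-order on-resistance of a MOS transistor (normalized units)."""
    return 1.0 / (mobility * cox * (width / length) * v_overdrive)


def pmos_width_for_balance(w_n: float, mu_n: float, mu_p: float) -> float:
    """PMOS width that equalizes pull-up and pull-down resistance."""
    return w_n * mu_n / mu_p


MU_N, MU_P = 2.5, 1.0   # hypothetical normalized mobilities
W_N = 1.0               # normalized NMOS width
W_P = pmos_width_for_balance(W_N, MU_N, MU_P)

r_n = on_resistance(MU_N, W_N)
r_p = on_resistance(MU_P, W_P)

# Mid-level of a divider with the pull-up (PMOS) to Vdd and the pull-down (NMOS) to ground.
VDD = 1.0
vxm = VDD * r_n / (r_p + r_n)
print(W_P, r_p, r_n, vxm)
```

With the PMOS widened by the mobility ratio, the two resistances match and the opposite-state level sits at half the supply.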
The Ternary Decoder was designed using a sense-
amplifier-based voltage comparator [40], [41] and a pass-
gate analog multiplexer, as shown in Fig. 7 (c). The sense
amplifier is widely used as a voltage comparator due to its
robust operation and power efficiency and is commonly used
in high-performance Flash ADCs [42], [43]. The pass-gate
analog multiplexer selects the appropriate reference voltage
depending on the outgoing signal SI, i.e. the high reference
voltage Vrefh when SI = 0 and lower reference voltage Vrefl
when SI = 1. The reference voltage is fed to one sensing
input of the sense-amplifier (M9), whereas the other sensing
input (M8) receives the ternary coded Vx signal. M8 and
M9 act like variable resistors whose values are set by
the respective gate voltages (Vx and Vref). The transistors
M1 through M4 form two cross-coupled inverters. During the
positive clock half cycle, M5, M6, M2, and M4 turn on, charging
the cross-coupled nodes of both inverters to Vdd. During
the entire positive clock half cycle, the cross-coupled transistors
remain in the meta-stable state. During the negative clock half
cycle, M7 turns on, providing a discharge path to the cross-coupled
inverters; however, the inverters tend to discharge
at different rates depending on the on-resistances of, and hence
the currents through, M8 and M9, performing the comparator
action through regenerative feedback. The inverting output of
the sense amplifier is chosen as the TD’s output to reconstruct
the original signal, which was inverted by the transmitter
designed as an inverter.
FIGURE 8. (a) Bypass method for pre-bond testing. (b) Using SBS
transceivers as buffers to access higher dies.
B. PRE- AND MID-BOND TESTING
The implementation in Fig. 6 represents the case of post-bond
testing in which the dies have been assumed to be already
bonded. However, the dies may require testing before bond-
ing, also known as pre-bond testing. As we have considered
that the tester communicates using UDS, the pre-bond testing
can be undertaken using UDS methods by bypassing the SBS
transceivers. This can be achieved by multiplexing the output
of the tester side and die side TDs, with the SO and SI signals,
respectively, as shown in Fig. 8(a). The multiplexers may
be configured to select either SBS or UDS using a single-
bit register accessible through JTAG. To minimize power
consumption by the SBS transceiver when UDS is selected,
the TGs can be turned off, and the TD clock could be disabled
by using clock-gating; alternatively, the complete transceiver
circuits could be disabled by power-gating. Fig. 8(a) mainly
depicts the case for the 1st die; similarly, the SBS transceivers
may be bypassed for the other dies.
FIGURE 9. Pseudo-code for generation of 3D TDM based TAM output for
a 1-bit global channel assuming identical dies in the stack.
The mid-bond test instances are a subset of the post-bond
test problem. However, depending upon the number of dies
stacked, the clock divider circuit may be multiplexed/configured
accordingly, as would be the case in UDS based
TDM. Moreover, the addition or removal of dies may
affect the global channel’s electrical characteristics, affecting
the transmitters’ performance and power consumption. The
transmitter’s performance may be adjusted by using a binary-
weighted variable drive strength transmitter [29]. Depending
on the requirement, the drive strength may be adjusted by
enabling the desired inverters; for instance, a 3-stage transmitter
would allow the adjustment of drive strength from 1x
to 7x. Similar to the selection of UDS in pre-bond testing,
the transmitter strength may be configured using JTAG.
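The 1x to 7x range quoted for a 3-stage binary-weighted transmitter follows from summing the enabled stage weights. The sketch below is a minimal illustration; it assumes that stage i has a relative strength of 2^i and that the enable bits come from a JTAG-accessible register. The function names are hypothetical.

```python
def drive_strength(enable_bits):
    """Relative drive strength; enable_bits[i] enables the stage of weight 2**i."""
    return sum(bit << i for i, bit in enumerate(enable_bits))


def all_settings(n_stages=3):
    """Map every non-zero enable code to its resulting drive strength."""
    settings = {}
    for code in range(1, 2 ** n_stages):
        bits = tuple((code >> i) & 1 for i in range(n_stages))
        settings[bits] = drive_strength(bits)
    return settings


if __name__ == "__main__":
    for bits, strength in all_settings(3).items():
        print(f"enable (x1, x2, x4) = {bits} -> {strength}x")
```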
C. TEST SETUP
To test the proposed transceiver, the example 3D stacked die
and core structure shown in Fig. 1 was modified to include
SBS transceivers as shown in Fig. 6. The TSV was modelled
as a lumped RC circuit [6], assuming a TSV structure with a
length of 20 µm, a diameter of 5 µm, Tox (oxide thickness)
of 200 nm, and a substrate doping concentration of
2 × 10^15 /cm^3. The resultant RC model has a TSV capacitance
of 30 fF and a resistance of 100 mΩ.
FIGURE 10. Waveforms of various signals using SBS transceiver design of
Fig. 7 in the test setup of Fig. 6 simulated at 1.2 GHz Gclk frequency.
To validate the test structure’s output, the typical Capture,
Shift, and Update cycle of the scan-chain was ignored, and
continuous shifting was performed such that the output is the
same as the input. However, unless all
the scan-chains are of the same length and operate at the same
frequency, the multiplexed output from the scan chains will
be an interleaved form of the input. For instance, in the test
structure used in this example, core a is serviced using
Cclk1,3, which runs at twice fmin; the scan sequence through
this core will therefore appear earlier at the output than those
of the other scan-chains. To validate the output, the correct/expected
multiplexed output, SO(expected), for the TDM multiplexer
can be modeled using the proposed pseudo-code shown
in Fig. 9.
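The interleaving effect described above can be reproduced with a simple behavioural model. The sketch below is not the pseudo-code of Fig. 9, which is not reproduced in the text; it assumes a hypothetical slot table that assigns each global clock cycle to one scan chain, with a chain that runs at twice the minimum frequency appearing twice per frame. Chain names, lengths, and the slot table are illustrative assumptions.

```python
from collections import deque


def expected_tdm_output(si_bits, slot_table, chain_lengths, fill=0):
    """Model continuous shifting through time-multiplexed scan chains.

    slot_table[k] names the chain serviced in slot k of each frame;
    chain_lengths maps a chain name to its number of scan flops (>= 1).
    """
    if any(n < 1 for n in chain_lengths.values()):
        raise ValueError("every scan chain needs at least one flop")
    chains = {name: deque([fill] * n, maxlen=n)
              for name, n in chain_lengths.items()}
    out = []
    for t, bit in enumerate(si_bits):
        chain = chains[slot_table[t % len(slot_table)]]
        out.append(chain[0])   # oldest bit leaves the chain
        chain.append(bit)      # new bit enters; maxlen drops the oldest
    return out


if __name__ == "__main__":
    # Chain "a" is serviced twice per frame, i.e. at twice the rate of "b" and "c".
    slots = ["a", "b", "a", "c"]
    lengths = {"a": 2, "b": 1, "c": 1}
    print(expected_tdm_output([1, 0, 1, 1, 0, 0, 1, 0], slots, lengths))
```

With a single chain the output is simply the input delayed by the chain length; with several chains of different lengths or rates the same bits reappear in an interleaved order, which is why a reference model is needed to check SO.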
The test setup in Fig. 6 is limited to 3 dies, but
3D SICs with an arbitrary number of dies may be tested
using SBS based TDM. However, the signal integrity when
traversing multiple dies in a global TSV must be ensured.
In a UDS scheme, buffers can be inserted midway between
two consecutive TSVs; in an SBS scheme, digital buffers
cannot be used because of the ternary encoding. The authors
in [33] discuss the use of accelerators, mid-way latches,
and opposite-polarity transition encoding to improve SBS
performance in highly lossy global links, and the authors in
[44] propose using clamping circuits to address the same issue. As the
proposed design does not include any intermediate buffers,
there will be an upper limit on the number of dies supported
by the given transceiver design for a TDM global channel.
The dies higher up the stack may be accessed by using SBS
transceivers as buffers. For instance, for the test case in Fig. 6,
the dies beyond the 3rd die may be accessed using SBS
buffers as shown in Fig. 8(b). The use of this method has two
implications: 1) every SBS buffer instance requires the insertion
of flip-flops, resulting in an additional 1-bit delay in the scan
shift cycle, and 2) the overall test schedule would require
different sessions to test the buffered segments.
V. RESULTS AND DISCUSSION
Simulations were performed with Cadence Virtuoso using
45 nm technology and the standard cells from the 45 nm Nangate
Library [45], with a Vdd of 1 V. The proposed transceiver was
designed to achieve an operating frequency of 1.2 GHz at
the global TSV channel. The transmitter’s NMOS transistors
were designed with 360 nm width, and the PMOS transistors
were designed as 1.6× NMOS width, giving a middle voltage
level of approximately 0.5 Vdd. The Sense-Amplifier was
designed using 360 nm NMOS width and 1.6× NMOS width
for the PMOS transistors. The lower and upper reference voltages
were chosen to be 300 mV and 700 mV, respectively, for a
supply voltage of 1 V. From here on, we refer to the scan-in side
transceiver and the scan-out side transceivers as the near-end and
far-end transceivers, respectively. The transient simulation
results for the near-end transceiver at 1.2 GHz are
shown in Fig. 10. The near-end transmitter sends the scan-in
(SI) signal, which is demultiplexed to 9 scan chains in 3 dies.
Although there are 3 far-end transmitters, only one is active at
a time, so they essentially behave as a single transmitter; hence,
the scan-out (SO) signal as seen by the near-end transceiver is
shown as SO(expected) and was computed using the pseudo-
code in Fig. 9. The various intermediate signals in the TD
are also shown, and the output of the TD is denoted as Decoded
Scan-Out (DSO). The DSO signal appears as a unipolar return-
to-zero coded waveform due to the sense-amplifier’s nature,
and can be directly fed to the scan chains. The signal SO
in Fig. 10 shows the first scan-flop’s reconstructed output and
is similar to the SO(expected) signal delayed by 1 cycle.
FIGURE 11. Transceiver transient response with varying frequency.
Fig. 11 compares the transceiver behaviour with vary-
ing frequency. The output of the transmitter (Vx) and the
receiver (DSO) are shown for 0.6, 1.2, 1.8, and 2.2 GHz
clocks. It may be noted that although there are 3 output
states of the Vx signal (0, Vxm, and Vdd), there are six
possible transitions, depending upon the previous state, i.e.
rail-rail (0-1, 1-0), rail-mid (0-Vxm, 1-Vxm), and mid-rail
(Vxm-1, Vxm-0). Fig. 11 shows the rail-rail and rail-mid
transitions, whereas the mid-rail transitions are omitted for
clarity as the response is similar to rail-rail. For a given
Vx transition, the dashed red markers show the reference
voltage (low or high), which depends on the SI signal being
transmitted. Moreover, for ease of comparison, every interval
on the horizontal axis depicts one clock cycle with waveforms
for different frequencies accordingly scaled.
From Fig. 11, it is evident that increasing the frequency
results in additional gate delay and transient time relative to
the clock period. This results in reduced timing margins
and a smaller voltage difference between Vx and Vref at the
TD input. While the former affects both the transmitter and
receiver and is somewhat mitigated by accounting for timing
pessimism, the effect of the latter is rather significant, as smaller
voltage margins decrease the robustness of the TD, especially
in the presence of process variations and cross-coupling.
It is interesting to note that although the rail-rail transi-
tions involve a larger voltage swing than the rail-mid swing,
they are relatively faster (∼60% compared to the rail-mid
transitions). This is because the rail-rail transitions are only
possible when both near- and far-end transmitters are being
driven in the same direction, which effectively doubles the
drive strength. For the same reason, the mid-rail transitions
also exhibit similar behaviour, which may even be slightly
better due to reduced voltage swing. On the other hand, there
is effectively a single transmitter driving Vx for the case of
rail-mid transitions, resulting in relatively slower transitions.
Consequently, with increasing frequency, the transceiver per-
formance is likely to degrade for the rail-mid transitions first;
therefore, the transceiver may be designed considering the
rail-mid response as the worst case.
The transceiver operation was verified across all process
corners at 1.2 GHz. A maximum variation of 40 mV was
seen in the mid voltage level Vxm across the various design
corners. It may be noted that the Vxm seen by the near- and
far-end transceivers may differ slightly further due to parasitic
resistances. The minimum offset voltage (the difference
between Vref and Vx) required to correctly resolve the middle
voltage level Vxm from the Vrefl or Vrefh (low or high reference
voltage) levels was observed to be approximately 25 mV. This
gives a sufficient margin to account for voltage drops due to
parasitics and additional statistical offset due to variability in
the sense amplifier.
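The margin argument above can be checked with simple arithmetic. The sketch below assumes a nominal Vxm of 500 mV (0.5 Vdd at 1 V) and pessimistically treats the 40 mV corner variation as a shift of Vxm towards the reference; both are interpretive assumptions rather than values stated in this form in the text. The function name is hypothetical.

```python
def sensing_headroom_mv(vref_mv, vxm_mv=500, vxm_variation_mv=40,
                        min_offset_mv=25):
    """Voltage left over after corner variation and the minimum resolvable offset."""
    worst_case_gap = abs(vref_mv - vxm_mv) - vxm_variation_mv
    return worst_case_gap - min_offset_mv


if __name__ == "__main__":
    for name, vref in (("Vrefl", 300), ("Vrefh", 700)):
        print(f"{name}: headroom = {sensing_headroom_mv(vref)} mV")
```

Under these assumptions roughly 135 mV remains on either side of Vxm for parasitic voltage drops and sense-amplifier offset.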
A. POWER CONSUMPTION
The power consumed by the test circuit designed using UDS
and SBS based TDM is defined as the sum of the average power
consumed by the transmitters and receivers when a pseudo-
random binary sequence is used as the input. At 1.2 GHz
switching frequency, the total power consumption of the
complete channel, including 1 × near-end transceiver and 3 ×
far-end transceivers of the global channel, was 164.5 µW
for the SBS based design, which is 22.5% higher than
that of the UDS based scheme (134.2 µW), designed using
4x buffers as transceivers. The power consumption trend
of both designs with increasing frequency of operation is
shown in Fig. 12. The power consumption of both methods
increases with frequency. However, as the UDS-TDM can be
designed using static CMOS, the static power component is
minimal, and the overall power consumption is dominated by
the dynamic power, which increases considerably with
frequency at the rate of 10.9 µW/100 MHz. For the SBS based
design, the dynamic power consumption is lower at
3.9 µW/100 MHz, which can be attributed to the reduced voltage
swing due to 3-level encoding and the relatively smaller
transistor sizes. However, the major contributor to the overall
power in the SBS transceiver is the static power. As the static
power consumption remains independent of the frequency,
the power consumption of SBS at lower frequencies was
observed to be higher than that of UDS. Nevertheless, due to its lower
dynamic power consumption, SBS consumes less power at
higher frequencies than UDS.
FIGURE 12. Power consumption of a single channel for UDS (Fig. 1) and
SBS (Fig. 6) transceiver based TDM schemes.
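The trend described above can be approximated with a first-order model in which the power of each scheme is a frequency-independent term plus a term that grows linearly with frequency. The sketch below fits such a line through the reported 1.2 GHz operating point using the reported slopes; the strictly linear behaviour and the resulting intercepts and crossover frequency are extrapolations under that assumption, not values reported by the authors. All function names are hypothetical.

```python
def fit_linear_power(p_ref_uw, slope_uw_per_100mhz, f_ref_ghz=1.2):
    """Return (intercept_uw, slope) of P(f) = intercept + slope * f, f in 100 MHz units."""
    intercept_uw = p_ref_uw - slope_uw_per_100mhz * f_ref_ghz * 10.0
    return intercept_uw, slope_uw_per_100mhz


def power_uw(model, f_ghz):
    """Evaluate the linear model at a frequency given in GHz."""
    intercept_uw, slope = model
    return intercept_uw + slope * f_ghz * 10.0


def crossover_ghz(model_a, model_b):
    """Frequency at which two linear power models are equal."""
    (sa, ka), (sb, kb) = model_a, model_b
    return (sa - sb) / (kb - ka) / 10.0


sbs = fit_linear_power(164.5, 3.9)    # SBS: 164.5 uW at 1.2 GHz, 3.9 uW/100 MHz
uds = fit_linear_power(134.2, 10.9)   # UDS: 134.2 uW at 1.2 GHz, 10.9 uW/100 MHz

if __name__ == "__main__":
    print(f"SBS frequency-independent term ~ {sbs[0]:.1f} uW")
    print(f"UDS frequency-independent term ~ {uds[0]:.1f} uW")
    print(f"Estimated crossover ~ {crossover_ghz(sbs, uds):.2f} GHz")
```

Under this linear extrapolation the two curves would cross at about 1.6 GHz, consistent with the observation that SBS is the more expensive scheme at 1.2 GHz but becomes favourable at higher frequencies; the measured crossover should be read from Fig. 12.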
The static power consumption can be reduced by limit-
ing the static current when both transmitters transmit the
opposite signal, i.e., 10 or 01. To limit static currents and
conserve power, designs such as capacitive coupling-based
transmitters [37] or MOSR coupled inverters [38] have been
proposed. The capacitor-based design significantly improves
static power consumption by blocking the steady-state cur-
rent; however, the capacitor also blocks the noise discharge
path making it prone to coupling noise, which may be
significant in TDM based TAMs where global channels
are required. The MOSR based transmitter limits the static
power, but it also reduces the voltage swing and increases
transient times/delays, and therefore only finds its use in
low-frequency applications. The authors in [41] propose a
Sense-Amplifier Completion Detector (SACD) circuit that
can turn off one of the transmitters after the TD has compared
the inputs. The SACD was incorporated in the test circuit,
and the SI side TG was turned off after completion of the
sampling by the sense amplifier during every cycle. Fig. 12
also shows the power consumption of the SACD-based design,
which reduces the static power by almost 18%; however,
the additional circuit of the SACD adds to the dynamic power,
which increases from 3.9 to 6.4 µW/100 MHz.
TABLE 2. Binary-weighted transmitter performance.
Table 2 presents the transmitter’s average power consumption
and the maximum frequency when a binary-weighted
transmitter is used, as suggested in Section IV-B. The width
for the x1 transmitter was chosen to be 180 nm, which is
twice the minimum technology width. The PMOS transistors
were sized as 1.6 × NMOS. Similarly, the second and
third stage transmitters were sized x2 and x4. The maximum
frequency was estimated by measuring the 10-90% rise and
fall times for the rail-to-rail and rail-to-mid-level (Vxm) transitions.
The reported maximum frequency (fmax) is 40% (0.8 ×
0.5) of the frequency suggested by the transient response, to
account for the time required for sampling and sensing at the
receiver, i.e. fmax = 0.8 × 0.5 / max(rise time, fall time).
Clearly, decreasing the transmitter strength significantly
reduces power consumption; however, the weaker transistors
also limit the maximum achievable frequency. Therefore,
there exists a trade-off between power consumption and maximum
frequency, and an optimal transistor strength should be
configured.
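The fmax estimate described above can be written as a one-line function. The sketch below applies the stated 0.8 × 0.5 factor to the slower of the two edges; the rise and fall times in the example are hypothetical placeholders, not values from Table 2.

```python
def max_frequency_hz(rise_time_s, fall_time_s, sampling_derate=0.8):
    """fmax = 0.8 * 0.5 / max(rise time, fall time), using 10-90% edge times."""
    slowest_edge_s = max(rise_time_s, fall_time_s)
    return sampling_derate * 0.5 / slowest_edge_s


if __name__ == "__main__":
    # Hypothetical edge times, for illustration only.
    f = max_frequency_hz(rise_time_s=200e-12, fall_time_s=250e-12)
    print(f"fmax = {f / 1e9:.2f} GHz")
```

For these placeholder edge times the slower 250 ps edge sets the limit, giving 0.4 / 250 ps = 1.6 GHz.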
B. SIGNAL INTEGRITY UNDER CROSS-COUPLING
As the TSVs are a relatively large structure traversing through
the entire substrate, cross-coupling between TSVs becomes a
significant concern. The authors in [46] presented a lumped
RC model of the TSV cross-coupling, which was used
to study the SBS transceiver’s performance under cross-
coupling noise. The RC values were calculated assuming a
silicon resistivity of 6.89 Ω·cm and a TSV pitch of 10 µm.
As the test circuit traverses 2 TSVs in the global channel,
every TSV was assumed to be surrounded by 8 neighbouring
aggressors (for a 3 × 3 cluster), each driven by a
PRBS sequence; the resulting eye diagram of the Vx signal at the input
of the near-end (SI side) TD is shown in Fig. 13. For the given
transmitter design at 1.2 GHz, the transceiver was
verified to work satisfactorily under cross-coupling in all
process corners. Fig. 13 shows that most of the coupling
noise diminishes at the negative clock edge (for 50% duty
cycle); however, the sampling time may be delayed further
by increasing the clock’s duty cycle for more robust receiver
performance. Moreover, the coupling noise may be reduced
using TSV guard rings or using power/ ground TSVs among
the neighbouring TSVs [47].
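Aggressor stimuli of the kind used above are commonly produced with a linear-feedback shift register. The text does not state which PRBS polynomial was used; the sketch below assumes the widely used PRBS-7 sequence (x^7 + x^6 + 1) purely for illustration, with a hypothetical per-aggressor seed so that neighbouring TSVs are not driven in lockstep.

```python
def prbs7(n_bits, seed=0x7F):
    """Generate n_bits of a PRBS-7 sequence (polynomial x^7 + x^6 + 1)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero for an LFSR")
    bits = []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1   # taps at bits 7 and 6
        bits.append(new_bit)
        state = ((state << 1) | new_bit) & 0x7F
    return bits


def aggressor_patterns(n_aggressors=8, n_bits=127):
    """One pattern per neighbouring TSV, each started from a different seed."""
    return [prbs7(n_bits, seed=1 + 13 * k) for k in range(n_aggressors)]


if __name__ == "__main__":
    for k, pattern in enumerate(aggressor_patterns(n_bits=16)):
        print(k, "".join(map(str, pattern)))
```

The sequence repeats every 127 bits, so patterns longer than one period add no new transitions; a longer polynomial would be needed for more exhaustive coverage.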
For a given transmitter size, the eye height and width will
decrease with increasing frequency. For correct transceiver
operation, the transistor sizing of the transmitter or, in the case
of the binary-weighted transmitter, its configuration
must be selected to give an appropriate eye opening
at the desired frequency and power. The SA must be sized
to account for the affordable offset voltage margin for the
chosen reference voltages. The receiver sampling time may
be adjusted using the clock duty cycle to provide sufficient
timing margins.
FIGURE 13. Eye diagram for the Vx signal at near end receiver side at
1.2GHz under TSV Cross coupling (all process corners).
TABLE 3. Comparison with relevant works.
C. COMPARISON WITH RELEVANT PRIOR WORK
Table 3 compares this work with the other relevant work
regarding power consumption improvement over UDS and
maximum frequency. The results are compared with relevant
previous works in TDM based 3D TAM design [27] and
SBS designs for use in 3D SICs [36], [37]. The authors
in [27] reported 600µW average power consumption for a
global channel of 3 dies, which is much higher than the
SBS transceiver proposed in this paper (165µW). However,
the authors used toggle input, 6x strength transmitters, and
intermediate buffers resulting in higher power consumption.
The power consumption of SBS and UDS is affected by
various factors such as transistor technology, channel char-
acteristics, and the bit patterns used. Therefore, we compare
the percentage improvement in power consumption of the rel-
evant SBS design over UDS designed in the same technology
reported by the authors for a fair comparison. The SBS design
in [37] consumes an estimated 37% more power compared
to the UDS based scheme, which is higher than in this work
(22.5%). Park et al. [36] reported an SBS transceiver design
that was 33% more power-efficient than UDS transceivers for
2 × TSVs. However, the reported comparison was made at
the maximum supported frequency, where SBS is expected to be
more power-efficient, as suggested by the trend in Fig. 12.
Another notable factor causing the difference in the power
consumption and the maximum supported frequency is that
the designs proposed in [36], [37] were focused on a single
TSV channel for point to point communication involving
2 x transceivers. On the contrary, a TDM based channel
involves multiple dies/ TSVs in the channel involving multi-
ple transceivers (1 near-end and 3 far-end transceivers used in
our experiments). These factors increase power consumption
and result in a lower maximum frequency in this work than in
point-to-point communication channels.
Unlike full pin count test methods, which involve designing
an SBS transceiver at every chip terminal, RPCT based
methods use only a subset of chip terminals, and therefore
the area overheads are not a significant concern. In gen-
eral, an SBS-based transceiver occupies an area similar
to or slightly higher than the UDS counterpart designed for
a similar frequency range [36].
VI. CONCLUSION
A novel SBS-TDM based reduced pin count test strategy was
proposed for testing TSV based 3D SICs. Design considerations
for the transmitter, receiver, and the control circuitry
required for SBS based TDM, and the associated trade-offs, were
presented. The transmitter was designed as a suitably sized
inverter followed by a Transmission Gate, and the receiver
was designed as a Sense Amplifier. The power consumption of
the SBS based transceiver is dominated by the static power
consumption of the transmitter, which can be minimized
using appropriate transistor sizing and control. The limitations
of the SBS-based test strategy in terms of the channel’s
electrical characteristics and possible solutions for the incorporation
of SBS into pre-, mid-, and post-bond test instances
were discussed. The SBS transceiver can be bypassed for pre-
bond testing, allowing normal UDS communication, whereas
adjustable strength SBS transmitters along with SBS buffers
are proposed for mid-bond test insertions. Simulation results
using an example test case were presented in terms of power
consumption and performance. The proposed method
consumed 22.5% more power while utilizing only half the
number of TSVs compared to the UDS based design. The transceiver
performance was verified across all process corners under
cross-coupling from neighboring TSVs.
REFERENCES
[1] R. Sharma and K. Choi, Design of 3D Integrated Circuits and Systems.
Boca Raton, FL, USA: CRC Press, 2014, pp. 157–174.
[2] G. H. Loh, Y. Xie, and B. Black, ‘‘Processor design in 3D die-stacking
technologies,’’ IEEE Micro, vol. 27, no. 3, pp. 31–48, May 2007.
[3] M. B. Healy, K. Athikulwongse, R. Goel, M. M. Hossain, D. H. Kim,
Y.-J. Lee, D. L. Lewis, T.-W. Lin, C. Liu, M. Jung, B. Ouellette, M. Pathak,
H. Sane, G. Shen, D. H. Woo, X. Zhao, G. H. Loh, H.-H.-S. Lee, and
S. K. Lim, ‘‘Design and analysis of 3D-MAPS: A many-core 3D processor
with stacked memory,’’ in Proc. IEEE Custom Integr. Circuits Conf.,
Sep. 2010, pp. 1–4.
[4] C. Ababei, P. Maidee, and K. Bazargan, Exploring Potential Benefits of 3D
FPGA Integration. Berlin, Germany: Springer, 2004, pp. 874–880.
[5] H.-H.-S. Lee and K. Chakrabarty, ‘‘Test challenges for 3D integrated
circuits,’’ IEEE Des. Test. Comput., vol. 26, no. 5, pp. 26–35, Sep. 2009.
[6] G. Katti, M. Stucchi, K. De Meyer, and W. Dehaene, ‘‘Electrical modeling
and characterization of through silicon via for three-dimensional ICs,’’
IEEE Trans. Electron Devices, vol. 57, no. 1, pp. 256–262, Jan. 2010.
[7] H. Vranken, T. Waayers, H. Fleury, and D. Lelouvier, ‘‘Enhanced reduced
pin-count test for full-scan design,’’ J. Electron. Test. Theory Appl., vol. 18,
no. 2, pp. 129–143, 2002.
[8] A. C. Evans, ‘‘Applications of semiconductor test economics, and multisite
testing to lower cost of test,’’ in Proc. Int. Test Conf., 2003, pp. 113–123.
[9] A. H. Baba and K. S. Kim, ‘‘Framework for massively parallel testing
at wafer and package test,’’ in Proc. IEEE Int. Conf. Comput. Des. VLSI
Comput. Process., Oct. 2009, pp. 328–334.
[10] M. L. Bushnell and V. D. Agrawal, Essentials of Electronic Testing for
Digital, Memory and Mixed-Signal VLSI Circuits. Norwell, MA, USA:
Kluwer, 2002.
[11] IEEE Standard for Test Access Port and Boundary-Scan Architecture,
IEEE Standard 1149.1-2013 (Revision IEEE Std 1149.1-2001), 2013,
pp. 1–444.
[12] IEEE Standard Testability Method for Embedded Core-Based Integrated
Circuits, IEEE Standard 1500-2005, 2005, pp. 1–136.
[13] IEEE Standard for Test Access Architecture for Three-Dimensional Stacked
Integrated Circuits, IEEE Standard 1838-2019, 2020, pp. 1–73.
[14] N. A. Touba, ‘‘Survey of test vector compression techniques,’’ IEEE Des.
Test. Comput., vol. 23, no. 4, pp. 294–303, Apr. 2006.
[15] V. Iyengar, K. Chakrabarty, and E. J. Marinissen, ‘‘Test wrapper and test
access mechanism co-optimization for system-on-chip,’’ J. Electron. Test.
Theory Appl., vol. 18, pp. 213–230, Apr. 2002.
[16] X. Wu, Y. Chen, K. Chakrabarty, and Y. Xie, ‘‘Test-access mechanism
optimization for core-based three-dimensional SOCs,’’ in Proc. IEEE Int.
Conf. Comput. Design, Oct. 2008, pp. 212–218.
[17] E. J. Marinissen, ‘‘Challenges and emerging solutions in testing TSV-based
2½D- and 3D-stacked ICs,’’ in Proc. Design, Automat. Test Eur.
Conf. Exhib. (DATE), Mar. 2012, pp. 1277–1282.
[18] IEEE Standard for Access and Control of Instrumentation Embedded
Within a Semiconductor Device—1687, IEEE Standard 1687-2014, 2014,
pp. 1–283.
[19] M. A. Ansari, J. Jung, D. Kim, and S. Park, ‘‘Time-multiplexed 1687-
network for test cost reduction,’’ IEEE Trans. Comput.-Aided Design
Integr. Circuits Syst., vol. 37, no. 8, pp. 1681–1691, Aug. 2018.
[20] A. Sehgal, V. Iyengar, and K. Chakrabarty, ‘‘SOC test planning using
virtual test access architectures,’’ IEEE Trans. Very Large Scale Integr.
(VLSI) Syst., vol. 12, no. 12, pp. 1263–1276, Dec. 2004.
[21] A. Sanghani, B. Yang, K. Natarajan, and C. Liu, ‘‘Design and implementa-
tion of a time-division multiplexing scan architecture using serializer and
deserializer in GPU chips,’’ in Proc. 29th VLSI Test Symp., May 2011,
pp. 219–224.
[22] M. S. Kawoosa, R. K. Mittal, M. Jalasuthram, and R. A. Parekhji,
‘‘Towards single pin scan for extremely low pin count test,’’ in Proc. 31st
Int. Conf. VLSI Design 17th Int. Conf. Embedded Syst. (VLSID), Jan. 2018,
pp. 97–102.
[23] B. Li and V. D. Agrawal, ‘‘Applications of mixed-signal technology
in digital testing,’’ J. Electron. Test., vol. 32, no. 2, pp. 209–225,
Apr. 2016.
[24] B. Noia, K. Chakrabarty, S. K. Goel, E. J. Marinissen, and J. Verbree,
‘‘Test-architecture optimization and test scheduling for TSV-based 3-D
stacked ICs,’’ IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.,
vol. 30, no. 11, pp. 1705–1718, Nov. 2011.
[25] X. Wu, P. Falkenstern, K. Chakrabarty, and Y. Xie, ‘‘Scan-chain
design and optimization for three-dimensional integrated circuits,’’
ACM J. Emerg. Technol. Comput. Syst., vol. 5, no. 2, pp. 1–26,
Jul. 2009.
[26] M. A. Ansari, J. Jung, D. Kim, and S. Park, ‘‘Time-multiplexed test access
architecture for stacked integrated circuits,’’ IEICE Electron. Exp., vol. 13,
no. 14, 2016, Art. no. 20160314.
[27] P. Georgiou, F. Vartziotis, X. Kavousianos, and K. Chakrabarty, ‘‘Testing
3D-SoCs using 2-D time-division multiplexing,’’ IEEE Trans. Comput.-
Aided Design Integr. Circuits Syst., vol. 37, no. 12, pp. 3177–3185,
Dec. 2018.
[28] B. G. West, ‘‘Simultaneous bidirectional test data flow for a low-cost
wafer test strategy,’’ in Proc. Int. Test Conf. (ITC), vol. 1, Jan. 2003,
pp. 947–951.
[29] R. Mooney, C. Dike, and S. Borkar, ‘‘A 900 Mb/s bidirectional signaling
scheme,’’ IEEE J. Solid-State Circuits, vol. 30, no. 12, pp. 1538–1543,
Dec. 1995.
[30] H.-Y. Huang and R.-I. Pu, ‘‘Differential bidirectional transceiver for on-
chip long wires,’’ Microelectron. J., vol. 42, no. 11, pp. 1208–1215,
Nov. 2011.
[31] Y. Tomita, H. Tamura, M. Kibune, J. Ogawa, K. Gotoh, and T. Kuroda,
‘‘A 20-Gb/s simultaneous bidirectional transceiver using a resistor-
transconductor hybrid in 0.11-µm CMOS,’’ IEEE J. Solid-State Circuits,
vol. 42, no. 3, pp. 627–636, Mar. 2007.
[32] J.-Y. Sim, Y.-S. Sohn, S.-C. Heo, H.-J. Park, and S.-I. Cho, ‘‘A 1-Gb/s
bidirectional I/O buffer using the current-mode scheme,’’ IEEE J. Solid-
State Circuits, vol. 34, no. 4, pp. 529–535, Apr. 1999.
[33] C. J. Akl and M. A. Bayoumi, ‘‘Wiring-area efficient simultaneous bidirectional
point-to-point link for inter-block on-chip signaling,’’ in Proc. IEEE
Int. Freq. Control Symp. Expo., Jan. 2008, pp. 193–200.
[34] M.-K. Jeon and C. Yoo, ‘‘A single-ended simultaneous bidirectional
transceiver in 65-nm CMOS technology,’’ J. Semicond. Technol. Sci.,
vol. 16, no. 6, pp. 817–824, Dec. 2016.
[35] P. V. S. Rao and P. Mandal, ‘‘Current-mode full-duplex (CMFD) signaling
for high-speed chip-to-chip interconnect,’’ Microelectron. J., vol. 42, no. 7,
pp. 957–965, Jul. 2011.
[36] S. Park, A. Wang, U. Ko, L.-S. Peh, and A. P. Chandrakasan, ‘‘Enabling
simultaneously bi-directional TSV signaling for energy and area efficient
3D-ICs,’’ in Proc. Design, Automat. Test Eur. Conf. Exhib. (DATE), 2016,
pp. 163–168.
[37] M. T. L. Aung, E. Lim, T. Yoshikawa, and T. T.-H. Kim, ‘‘Design of
simultaneous bi-directional transceivers utilizing capacitive coupling for
3DICs in face-to-face configuration,’’ IEEE J. Emerg. Sel. Topics Circuits
Syst., vol. 2, no. 2, pp. 257–265, Jun. 2012.
[38] I. A. Soomro, M. Samie, and I. K. Jennions, ‘‘Test time reduction of 3-D
stacked ICs using ternary coded simultaneous bidirectional signaling in
parallel test ports,’’ IEEE Trans. Comput.-Aided Design Integr. Circuits
Syst., vol. 39, no. 12, pp. 5225–5237, Dec. 2020.
[39] N. Wary and P. Mandal, ‘‘Current-mode simultaneous bidirectional
transceiver for on-chip global interconnects,’’ in Proc. 6th Asia Symp. Qual.
Electron. Design (ASQED), Aug. 2015, pp. 19–24.
[40] T. Na, S.-H. Woo, J. Kim, H. Jeong, and S.-O. Jung, ‘‘Comparative study of
various latch-type sense amplifiers,’’ IEEE Trans. Very Large Scale Integr.
(VLSI) Syst., vol. 22, no. 2, pp. 425–429, Feb. 2014.
[41] T. Kobayashi, K. Nogami, T. Shirotori, and Y. Fujimoto, ‘‘A current-
controlled latch sense amplifier and a static power-saving input buffer
for low-power architecture,’’ IEEE J. Solid-State Circuits, vol. 28, no. 4,
pp. 523–527, Apr. 1993.
[42] Y. Chen, P. I. Mak, J. Yang, R. Yue, and Y. Wang, ‘‘Comparator with built-
in reference voltage generation and split-ROM encoder for a high-speed
flash ADC,’’ in Proc. Int. Symp. Signals, Circuits Syst. (ISSCS), 2015,
pp. 2–5.
[43] J. Yang, Y. Chen, H. Qian, Y. Wang, and R. Yue, ‘‘A 3.65 mW 5 bit 2GS/s
flash ADC with built-in reference voltage in 65 nm CMOS process,’’ in
Proc. IEEE 11th Int. Conf. Solid-State Integr. Circuit Technol. (ICSICT),
Oct. 2012, pp. 5–7.
[44] M. T. L. Aung, E. Lim, T. Yoshikawa, and T. T.-H. Kim, ‘‘A 3-Gb/s/ch
simultaneous bidirectional capacitive coupling transceiver for 3DICs,’’
IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 61, no. 9, pp. 706–710,
Sep. 2014.
[45] Nangate 45 nm FreePDK Library. Accessed: Oct. 3, 2019. [Online].
Available: https://si2.org/open-cell-library/
[46] T. Song, C. Liu, Y. Peng, and S. K. Lim, ‘‘Full-chip signal integrity analysis
and optimization of 3-D ICs,’’ IEEE Trans. Very Large Scale Integr. (VLSI)
Syst., vol. 24, no. 5, pp. 1636–1648, May 2016.
[47] Y. Peng, T. Song, D. Petranovic, and S. K. Lim, ‘‘Silicon effect-aware
full-chip extraction and mitigation of TSV-to-TSV coupling,’’ IEEE
Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 33, no. 12,
pp. 1900–1913, Dec. 2014.
IFTIKHAR A. SOOMRO received the B.E.
degree in electronics engineering and the M.S.
degree in electrical engineering (control sys-
tems) from the National University of Sciences
and Technology (NUST), Karachi, Pakistan,
in 2007 and 2017, respectively. He is currently
pursuing the Ph.D. degree in manufacturing engi-
neering (electronics) with the Integrated Vehi-
cle Health Management (IVHM) Centre, School
of Aerospace, Transport, and Manufacturing
(SATM), Cranfield University, U.K.
His research interests include reliability, mixed-signal sensor design,
testability, and prognostics in electronic circuits and systems.
Mr. Soomro was a recipient of the President’s, Rector’s, and CNS Gold
medals for his B.E. degree, and the President’s Gold Medal for his M.S.
degree.
MOHAMMAD SAMIE received the B.Sc. degree
in electronics from the Islamic Azad University of
Saveh, Iran, in 1997, the M.Sc. degree in electron-
ics from Shiraz University, Shiraz, Iran, in 2002,
and the Ph.D. degree in advanced electronics from
the University of the West of England, Bristol,
U.K., in 2012.
He is currently working as a Lecturer with the
School of Aerospace, Transport and Manufactur-
ing (SATM), Cranfield University, U.K. He is
also leading Seretonix, the Secure and Reliable Electronic Systems Group,
Cranfield University, with a focus on resilience and security of electronics.
He has accumulated a wide and varied experience in field programmable gate
arrays (FPGAs) and ASIC design, simulation, verification, and implemen-
tation, Toumaz in Didcot, U.K. He was involved with two EPSRC-funded
projects NFF and SABRE, where he was responsible for creating most of
the detailed designs and implementations. He has published 36 international
journals, conference papers, and book chapters, with two awarded as best
articles, on bio-inspired electronics.
IAN K. JENNIONS received the degree in
mechanical engineering and the Ph.D. degree in
CFD from Imperial College London, London.
He has worked for Rolls-Royce (twice), Gen-
eral Electric, and Alstom in a number of tech-
nical roles, gaining experience in aerodynamics,
heat transfer, fluid systems, mechanical design,
combustion, services, and IVHM. In July 2008,
he moved to Cranfield University, as a Profes-
sor, and the Director of the IVHM Centre which
is funded by a number of industrial companies, including Boeing, BAE
Systems, Thales, Meggitt, MOD, DRS, Alstom Transport, and Novartis.
He has led the development and growth of the IVHM Centre, in research
and education, since its inception. His career spans some 40 years, working
mostly for a variety of gas turbine companies. He has coauthored the book
No Fault Found—The Search for the Root Cause. He is the Editor of five
SAE books on IVHM and the recent The World of Civil Aerospace. He is a
Fellow of IMechE, RAeS, ASME, and PHM, a Contributing Member of the
HM-1 IVHM Committee, the Director of the PHM Society, and a Chartered
Engineer. He is the Chair of the SAE IVHM Steering Group. He represents
the Editorial Board of the International Journal of Condition Monitoring.
