Subthreshold Source-Coupled Logic Circuits for Ultra Low Power Applications by Tajalli, Armin et al.
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 43, NO. 7, JULY 2008 1699
Subthreshold Source-Coupled Logic Circuits for
Ultra-Low-Power Applications
Armin Tajalli, Student Member, IEEE, Elizabeth J. Brauer, Member, IEEE, Yusuf Leblebici, Senior Member, IEEE,
and Eric Vittoz, Life Fellow, IEEE
Abstract—This paper presents a novel approach for imple-
menting ultra-low-power digital components and systems using
source-coupled logic (SCL) circuit topology, operating in weak
inversion (subthreshold) regime. Minimum size pMOS transistors
with shorted drain-substrate contacts are used as gate-controlled,
very high resistivity load devices. Based on the proposed ap-
proach, the power consumption and the operation frequency of
logic circuits can be scaled down linearly by changing the tail bias
current of SCL gates over a very wide range spanning several
orders of magnitude, which is not achievable in subthreshold
CMOS circuits. Measurements in conventional 0.18 µm CMOS
technology show that the tail bias current of each gate can be set
as low as 10 pA, with a supply voltage of 300 mV, resulting in a
power–delay product of less than 1 fJ. Fundamental circuits such
as ring oscillators and frequency dividers, as well as more complex
digital blocks such as parallel multipliers designed by using the
STSCL topology have been experimentally characterized.
Index Terms—CMOS integrated circuits, CMOS logic circuit,
current-mode logic (CML), pipelining, power–delay product,
source-coupled logic (SCL), subthreshold CMOS, subthreshold
SCL, ultra-low-power circuits, weak inversion.
I. INTRODUCTION
The demand for implementing ultra-low-power digital systems in many modern applications such as mobile systems
[1], [2], sensor networks [3], [4], and implanted biomedical sys-
tems [5], has increased the importance of designing logic circuits
in subthreshold regime [6]. In subthreshold MOSFET operation,
current density is very low and the ratio of the transconductance
to bias current of the device is maximum [7], [8].
Meanwhile, the exponential relationship between drain current
and gate voltage makes this mode of operation very suitable for
implementing widely adjustable circuits [7], [9]. Conventional
CMOS logic circuits utilizing subthreshold transistors can typ-
ically operate with a very low power consumption [10]–[13],
which is mainly due to the dynamic (switching) power consumption
and depends quadratically on the supply voltage as
$P_{dyn} \propto f_{op} V_{DD}^2$ (where $f_{op}$ is the frequency of operation and $V_{DD}$
indicates the supply voltage). Hence, reducing the supply voltage
will result in reduction of power dissipation [1], [14] as well as
the output logic swing. Supply voltage reduction, on the other
hand, increases the delay in each gate which means the power
dissipation, logic swing, and speed of operation are tightly related
to each other. Meanwhile, the exponential relationship between
power dissipation and supply voltage in subthreshold regime
makes the accurate control of power consumption difficult. To
implement very low power digital systems, it is necessary to
minimize the energy dissipation at the system level in addition
to the gate level to achieve the desired performance [10].
Manuscript received November 19, 2007; revised February 10, 2008.
A. Tajalli, Y. Leblebici, and E. Vittoz are with the Swiss Federal Institute
of Technology (EPFL), CH-1015 Lausanne, Switzerland (e-mail: armin.tajalli@epfl.ch;
yusuf.leblebici@epfl.ch; eric.vittoz@epfl.ch).
E. J. Brauer is with the Electrical Engineering Department, Northern Arizona
University, Flagstaff, AZ 86911 USA (e-mail: elizabeth.brauer@nau.edu).
Digital Object Identifier 10.1109/JSSC.2008.922709
Source-coupled logic (SCL) circuits are widely used in mixed-
mode integrated circuits where supply noise and substrate noise
injection are crucial [15]. Reduced output voltage swing in
SCL circuits compared to the CMOS logic gates has made this
topology very suitable for high frequency applications [16], [17].
This paper explores the potentials of subthreshold SCL circuits
as an alternative solution for implementing ultra-low-power
digital systems. In this approach, the power consumption and
maximum speed of operation can be adjusted linearly through
the tail bias current of each gate over a very wide range [18],
[19], thus, efficiently decoupling the decision of output voltage
swing from power dissipation and delay.
To enable operation at very low current levels and to achieve
the desired performance specifications, special circuit techniques
have to be applied, [18]–[21], for implementing very low power
SCL circuits. In [20], the intrinsically limited output impedance
of deep-submicron, short-channel pMOS devices has been used
to implement very high value load resistances for SCL topology.
Here, a more general approach with much less sensitivity to
process and technology variations will be introduced [19].
This paper presents novel techniques for implementing sub-
threshold SCL (STSCL) gates where the bias current of each
cell can be set as low as 10 pA. In Section II, after a brief review
of SCL circuits, the proposed technique for implementing sub-
threshold SCL gates will be introduced. Section III discusses the
power-delay performance of the proposed circuit configuration.
Experimental results and comparison with conventional CMOS
circuits are presented in Section IV, followed by conclusions in
Section V.
II. SUBTHRESHOLD SOURCE-COUPLED LOGIC CIRCUITS
A. Conventional SCL Topology
In an SCL gate, the logic operation takes place mainly in cur-
rent domain. Therefore, the speed of operation can be inher-
ently high. Shown in Fig. 1, a logic network composed of nMOS
source-coupled differential pair switches steers the tail current $I_{SS}$
to one of the output branches based on the input logic
levels. The output load resistance $R_L$ converts the branch current
back to the voltage domain in order to drive the subsequent
SCL gates.
0018-9200/$25.00 © 2008 IEEE
Authorized licensed use limited to: EPF LAUSANNE. Downloaded on October 8, 2008 at 10:32 from IEEE Xplore.  Restrictions apply.
Fig. 1. A conventional SCL-based inverter/buffer circuit. The switching part
can be composed of a network of nMOS source-coupled pairs to implement
more complex logic functions [15]. The load resistances can be implemented
using pMOS devices biased in triode region.
The voltage swing at the output node ($V_{SW} = R_L I_{SS}$)
should be high enough to switch completely the input differential
pair of the next stage. Based on this observation, the voltage
swing should be larger than $\sqrt{2}\,V_{DSAT}$ ($V_{DSAT}$ is the
drain-source overdrive voltage of the input nMOS devices at zero
differential input) when the input nMOS devices are in strong
inversion [22], and larger than $4 n U_T$ when the devices are in
weak inversion [7] ($U_T = kT/q$ is the thermal voltage and $n$ is
the subthreshold slope factor). Therefore, the required voltage
swing when the devices are in subthreshold regime can be as low as
$4 n U_T$, which is about 150 mV at room temperature (assuming
$n = 1.5$). This swing in the subthreshold regime depends on the
subthreshold slope factor and
is independent of the threshold voltage of the nMOS switching
devices. Provided that the load resistance can be made suffi-
ciently high, this means that the switching operation of nMOS
devices has low dependence on the fabrication process varia-
tions. Therefore, as long as the tail bias current is higher than
junction leakage currents and output impedance of the devices
is much higher than the load resistance, the proposed topology
can operate properly as a logic circuit, even in aggressively
scaled deep-submicron technologies. Unlike CMOS logic cir-
cuits where the subthreshold channel leakage current is the dom-
inant leakage component, in STSCL topology the main leakage
currents are due to the p-n junctions of the MOS devices.
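As a quick numerical check of the swing requirement discussed above, the sketch below evaluates the thermal voltage and a weak-inversion swing bound of the form 4·n·U_T. The form of the bound, the function names, and the sample slope factors are assumptions made for illustration; the form is chosen because it reproduces the roughly 150 mV quoted for room temperature.

```python
# Physical constants (SI units).
K_B = 1.380649e-23       # Boltzmann constant, J/K
Q_E = 1.602176634e-19    # elementary charge, C


def thermal_voltage(temp_kelvin: float) -> float:
    """Thermal voltage U_T = kT/q in volts."""
    return K_B * temp_kelvin / Q_E


def min_swing_weak_inversion(n: float, temp_kelvin: float = 300.0) -> float:
    """Assumed minimum output swing 4*n*U_T (volts) for a weak-inversion pair."""
    return 4.0 * n * thermal_voltage(temp_kelvin)


if __name__ == "__main__":
    # Illustrative slope factors only; real values depend on the process.
    for n in (1.2, 1.35, 1.5):
        swing_mv = 1e3 * min_swing_weak_inversion(n)
        print(f"n = {n:.2f}: minimum swing = {swing_mv:.1f} mV at 300 K")
```

Under this assumption the bound contains no threshold-voltage term, which is the point made in the text: the required swing scales with temperature and the slope factor only.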
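The speed of operation in an SCL gate is mainly limited by
the time constant at the output node, which is
$$\tau = R_L C_L \approx \frac{V_{SW}}{I_{SS}} C_L \qquad (1)$$
where $R_L$ is the load resistance, $C_L$ the load capacitance at the
output node, $V_{SW}$ the output voltage swing, and $I_{SS}$ the tail
bias current. Based on this, the propagation delay is inversely proportional
to the tail bias current. Meanwhile, the circuit power–delay
product (PDP) is independent of $I_{SS}$ [15], [16], [23].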
B. Load Device Concept
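To maintain the desired output voltage swing at very low
bias current levels, it is necessary to increase the load resistance
value in inverse proportion to the reducing tail bias current as
$$R_L = \frac{V_{SW}}{I_{SS}} \qquad (2)$$
where $V_{SW}$ is the output voltage swing and $I_{SS}$ is the tail bias current.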
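To get a feel for the resistance values this relation implies, the following sketch evaluates the ratio of output swing to tail current over the range of bias currents discussed in this paper. The helper name and the 150 mV swing used in the loop are illustrative assumptions.

```python
def required_load_resistance(v_swing: float, i_tail: float) -> float:
    """Load resistance (ohms) that yields v_swing (volts) at tail current i_tail (amperes)."""
    if i_tail <= 0.0:
        raise ValueError("tail current must be positive")
    return v_swing / i_tail


if __name__ == "__main__":
    v_swing = 0.15  # assumed output swing, volts
    for i_tail in (10e-12, 100e-12, 1e-9, 10e-9, 100e-9):
        r_load = required_load_resistance(v_swing, i_tail)
        print(f"I_SS = {i_tail:.0e} A -> R_L = {r_load:.2e} ohm")
```

At 1 nA the required resistance is already 150 MΩ, and it grows to 15 GΩ at 10 pA, which is why a passive or triode-region load is impractical in this regime.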
Fig. 2. (a) Conventional pMOS load device, (b) proposed load device, (c) I–V
characteristics of the conventional pMOS load (dotted) in comparison to the
proposed device (solid line), (d) measured I–V characteristics of the proposed
load device in comparison to the BSIM model (all data obtained using 0.18 µm
CMOS technology).
Fig. 3. Cross-section view of the proposed pMOS load device, showing the
parasitic components that contribute to operation in subthreshold regime.
In subthreshold operation, the tail bias current would be in
the range of a few nA or even less. Therefore, to obtain a reasonable
output voltage swing, the load resistance should be in the
range of hundreds of MΩ. Meanwhile, this resistance should be
controlled very accurately based on the tail bias current value. Hence, a well
controlled high resistivity load device with a very small area is
required. For this range of resistivity, conventional pMOS devices
biased in triode region cannot be utilized since the required
channel length of the transistor would be impractically
large [Fig. 2(a)]. Fig. 2(c) (dotted line) shows the I–V characteristics
of a pMOS device realized in 0.18 µm technology for different
source-gate voltage values, indicating that the configuration of Fig. 2(a)
results in a current source with almost infinite output impedance,
even for deep-submicron devices. Hence, the gain would not be
limited, neither would the amplitude. Fig. 2(b) shows the pro-
posed load device, where the drain of the pMOS device is con-
nected to its bulk. In this way, as illustrated in Fig. 2(c), the con-
figuration shown in Fig. 2(b) produces a finite and controllable
differential resistance, which, associated with the transconduc-
tance of the differential pair will provide a controlled, limited
gain and amplitude. Thus, it is possible to implement a very high
resistivity load device using a single minimum size pMOS de-
vice. The fact that each individual pMOS load device must be
confined in its own n-well also does not have a severe impact on
area as will be demonstrated later. The measured DC I–V characteristics
of the device are shown in Fig. 2(d). For $V_{BD} = 0$
(bulk tied to the drain), the device operates as a very high resistivity
element as expected. This plot also shows that the measurement
results are very close to the resistance values predicted
by simulations.
The cross section view of the proposed pMOS load device
can be seen in Fig. 3. Connecting the drain to the bulk of the
pMOS load device ties the cathode of the n-well-to-substrate
reverse-biased diode to the output node. However, since the de-
vices are minimum size, the parasitic capacitance associated
with this diode is very small and can usually be neglected (in this
design using 0.18 µm technology: less than 1 fF). The other important
parasitic element is the forward biased source-bulk diode.
Illustrated in Fig. 3, this diode can limit the possible voltage
swing at the drain of the device to 400–500 mV. However, as
the required voltage swing for subthreshold SCL gates is well
below this value, the source-bulk diode does not influence the
operation of the circuit.
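Using the EKV model, the I–V characteristics of the subthreshold
pMOS device can be expressed by [7], [8]
$$I_{SD} = I_0 \, e^{\frac{V_{BG} - V_{T0}}{n U_T}} \left( e^{-\frac{V_{BS}}{U_T}} - e^{-\frac{V_{BD}}{U_T}} \right) \qquad (3)$$
in which $I_0 = 2 n \mu C_{ox} (W/L) U_T^2$. In the proposed configuration
illustrated in Fig. 2(b), $V_{BD} = 0$, hence
$$I_{SD} = I_0 \, e^{\frac{V_{SG} - V_{T0}}{n U_T}} \left( e^{\frac{n-1}{n} \frac{V_{SD}}{U_T}} - e^{-\frac{V_{SD}}{n U_T}} \right) \qquad (4)$$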
Fig. 4. A very high value floating resistor composed of two back to back pMOS
devices: (a) circuit schematic, and (b) measured I–V characteristics of the con-
trolled floating resistor.




Thus, the resistance of the load device can be controlled through
its source-gate voltage. Because of the exponential dependence of the
output resistance on this voltage, it can be adjusted over
a very wide range. To avoid process-related deviations, a replica
bias generator is required for the gate bias voltage, as explained in the next section.
The wide tuning range of the load resistance means that the proposed
STSCL gate can be used in a very wide range of operating conditions
without the need for modifying the size of devices. Meanwhile,
as long as the matching requirements are respected, the
frequency of operation would be linearly proportional to the bias
current.
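The behaviour of the bulk-drain-shorted load can be explored numerically with a minimal sketch. It assumes the standard weak-inversion EKV current with the bulk tied to the drain, lumps the gate-controlled prefactor into a single hypothetical parameter `k` (in amperes), and uses illustrative values for the slope factor and thermal voltage; none of these numbers come from the measurements reported here.

```python
import math


def load_current(v_sd: float, k: float, n: float = 1.3, u_t: float = 0.0259) -> float:
    """Source-drain current of the bulk-drain-shorted pMOS load in weak inversion.

    k lumps the gate-controlled prefactor I0*exp((V_SG - V_T0)/(n*U_T)), in amperes.
    """
    return k * math.exp(-v_sd / (n * u_t)) * math.expm1(v_sd / u_t)


def load_resistance(v_sd: float, k: float, n: float = 1.3, u_t: float = 0.0259) -> float:
    """Differential resistance dV_SD/dI_SD of the same device (closed-form derivative)."""
    slope = (k / (n * u_t)) * math.exp(-v_sd / (n * u_t)) * (
        (n - 1.0) * math.exp(v_sd / u_t) + 1.0
    )
    return 1.0 / slope


def conventional_resistance(v_sd: float, k: float, u_t: float = 0.0259) -> float:
    """Bulk tied to source instead: I = k*(1 - exp(-V_SD/U_T)), which saturates."""
    return u_t * math.exp(v_sd / u_t) / k


if __name__ == "__main__":
    v_sd = 0.15  # assumed voltage across the load, volts
    for k in (1e-11, 1e-10, 1e-9):
        print(
            f"k = {k:.0e} A: proposed R = {load_resistance(v_sd, k):.2e} ohm, "
            f"conventional R = {conventional_resistance(v_sd, k):.2e} ohm"
        )
```

In this model the resistance scales inversely with `k`, so an exponential change in gate drive moves the resistance over decades, while the bulk-to-source connection gives a far larger, current-source-like output resistance at the same bias.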
Note that when $V_{SD}$ becomes negative, the current direction
is reversed and the device switches to the conventional configuration
in which the bulk is connected to the source. In this case, the
drain current will increase rapidly. This property can help implement
high-valued floating resistors with a very wide adjusting
range by connecting two pMOS transistors in series as shown in
Fig. 4. The measured I–V characteristics of this floating resistor
show moderate linearity in a wide voltage range, which can be
exploited in various analog circuit applications.
Fig. 5. Subthreshold SCL gate and the replica bias circuit used to control the output voltage swing.
C. STSCL Gates
The proposed pMOS load device can be utilized to imple-
ment an SCL gate biased in subthreshold. Fig. 5 shows the basic
structure of the proposed STSCL gate. A simplified circuit dia-
gram of the replica bias circuit used to control the output voltage
swing is also shown. In this schematic, all devices operate in
subthreshold regime and the tail bias current can be reduced
until it becomes comparable in magnitude to the leakage cur-
rents that exist in the circuit.
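Since the input differential pair transistors are operating in
subthreshold, it can be shown that the transconductance of the
input differential pair is
$$G_m = \frac{\partial I_{OUT}}{\partial V_{IN}} = \frac{I_{SS}}{2 n_n U_T} \cdot \frac{1}{\cosh^2\left( \frac{V_{IN}}{2 n_n U_T} \right)} \qquad (7)$$
in which $I_{OUT}$ is the differential output current, $V_{IN}$ indicates
the input differential voltage, and $n_n$ is the subthreshold slope factor
of nMOS devices. Based on (7), for $|V_{IN}| > 4 n_n U_T$ almost the
entire current will be switched to one of
the branches. Therefore, a voltage swing of more than $4 n_n U_T$
would be sufficient to make sure that the gain of the STSCL circuit
is enough to be used as a logic gate. Combining (7) with (6) results in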
(8)
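The full-switching argument can be illustrated numerically. The sketch below assumes an ideal exponential (weak-inversion) law for both transistors of the pair, so that the differential output current follows a hyperbolic tangent of the input; the function names and the default slope factor and thermal voltage are illustrative assumptions.

```python
import math


def branch_fraction(v_in: float, n: float = 1.5, u_t: float = 0.0259) -> float:
    """Fraction of the tail current carried by the 'on' branch for differential input v_in."""
    return 1.0 / (1.0 + math.exp(-v_in / (n * u_t)))


def differential_current(v_in: float, i_tail: float, n: float = 1.5, u_t: float = 0.0259) -> float:
    """Difference between the two branch currents, i_tail * tanh(v_in / (2*n*U_T))."""
    return i_tail * math.tanh(v_in / (2.0 * n * u_t))


if __name__ == "__main__":
    n, u_t = 1.5, 0.0259
    for multiple in (1, 2, 4, 6):
        v_in = multiple * n * u_t
        share = 100.0 * branch_fraction(v_in, n, u_t)
        print(f"V_IN = {multiple} n U_T = {1e3 * v_in:.0f} mV -> {share:.1f}% of I_SS in one branch")
```

With an input of four times n·U_T, a little over 98% of the tail current flows in one branch, which supports treating that swing as sufficient for logic operation.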
Fig. 6(a) illustrates the DC transfer characteristics of an
STSCL gate as well as the stage gain. The simulated DC gain of
3.2 at the cross-over point is very close to the value estimated
by (8). The measured input–output transfer characteristics
of an STSCL buffer stage are shown in Fig. 6(b). Since all
the devices are operating in subthreshold regime, the transfer
characteristics of the circuit is independent of the bias current.
In this plot, the deviation from the ideal DC characteristics is
mainly due to the leakage currents in the test circuit coming
from electrostatic discharge (ESD) protection circuitry. To
measure the DC characteristics, output voltage swing has been
adjusted manually.
Meanwhile, based on (5) it can be shown that the equivalent
output resistance of the pMOS load for $V_{SD} = 0$ V is finite and
equal to
(9)
which means the load devices are capable of pulling up the
output node completely to $V_{DD}$.
Concerning the area overhead associated with the pMOS load
devices, actual mask layout examples using 0.18 µm CMOS
technology design rules provide an accurate assessment. The
layout of a three-input XOR gate is shown in Fig. 7 where the
area required for the pMOS load devices is demonstrated to be
small compared to the remaining parts of the circuit.
D. Voltage Swing Control
A control circuit is necessary to keep the voltage swing at
the output of the SCL gates at the desired value. Fig. 5 shows
the simplified schematic of a replica bias (RB) circuit [15]. This
circuit should be well matched to the SCL gates to have very low
deviation in operating point. Meanwhile, the amplifier should
provide enough gain with a very low offset to have the desired
accuracy. In this work, a folded-cascode amplifier has been used
to provide a large swing at the output node and to be able to test
the SCL gates in a very wide range of bias current values.
Any mismatch in the bias current or devices of the SCL gates
and RB circuit will result in variation of the desired output
voltage swing and it can be shown that the sensitivity
of this circuit to the mismatches is
(10)
Fig. 6. (a) Simulated DC transfer characteristics of an STSCL gate biased in
the nA range and its DC gain. (b) Measured transfer characteristics of an STSCL
buffer stage for two different supply voltages ($V_{DD}$ = 0.6 V and 1.0 V) and
different bias currents ($I_{SS}$ = 1 nA, 10 nA, and 100 nA).
Fig. 7. Mask layout of the three-input XOR gate showing the area occupied
by the major components. Note that the pMOS load devices with their isolated
n-wells occupy a relatively small area compared to the nMOS logic network.
Monte Carlo simulations show that
for minimum size devices, the resulting deviation can be as high as 20–40 mV in
a typical 0.18 µm process considered in this work. To compensate
the influence of device mismatch, the output voltage swing should be selected
a little larger than the minimum value.
Meanwhile, it can be shown that the voltage gain from gate
to drain of transistor MPR in Fig. 5 is small.
Therefore, in spite of the exponential
relationship between the drain current and the gate voltage of this device, the gain
difficulty. Finally, please note that one single replica bias circuit
can be used for a large number of STSCL gates. Therefore, its
area overhead would be negligible in large scale applications.
III. PERFORMANCE ANALYSIS AND OBSERVATIONS
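Power-Delay Product: The power dissipation of the STSCL
gate is $P = V_{DD} I_{SS}$, where $I_{SS}$ is the tail bias current, and the
typical delay of the gate is
$$t_d \approx \ln 2 \cdot R_L C_L = \ln 2 \cdot \frac{V_{SW} C_L}{I_{SS}} \qquad (11)$$
Thus, the power–delay product (PDP) is found as
$$PDP = P \cdot t_d \approx \ln 2 \cdot V_{DD} V_{SW} C_L \qquad (12)$$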
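Meanwhile, the power-to-frequency ratio can be calculated as
$$\frac{P}{f_{op}} = \frac{V_{DD} I_{SS}}{f_{op}} \qquad (13)$$
where the operating frequency is defined as
$$f_{op} = \alpha f_{max} \qquad (14)$$
with $\alpha$ being the activity rate factor (duty rate) and $f_{max}$ being
the maximum possible operating frequency, which is set by the gate delay.
Thus, the ratio is
$$\frac{P}{f_{op}} = \frac{V_{DD} I_{SS}}{\alpha f_{max}} \qquad (15)$$
which provides a more practical measure for the power/frequency
tradeoff of any functional block.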
Observation 1: The delay (or the maximum operating frequency)
in an STSCL gate depends on the tail bias current $I_{SS}$,
but not on the supply voltage $V_{DD}$. Therefore, the delay of a logic block can be
controlled without influencing PDP, which is not possible in
conventional CMOS topologies. More importantly, the speed
and the operation (supply) voltage can be effectively decou-
pled in STSCL circuits. This point will be further elaborated
in Section IV-B.
Observation 2: To reduce the power-to-frequency ratio $P/f_{op}$, the activity rate $\alpha$ should be kept
as large as possible. This observation does not contradict
similar results for conventional CMOS, where
(16)
as shown in [6]. However, the influence of $V_{DD}$ on $P/f_{op}$ is
quite different in conventional CMOS, where an optimum $V_{DD}$
value to minimize $P/f_{op}$ can be found, especially for small
$\alpha$ values, due to the significant leakage in CMOS.
Fig. 8. Photomicrograph of the test circuits: (a) ring oscillator; (b) frequency
divider.
Observation 3: Assuming that the system clock frequency
is dictated by the longest delay path between two consecutive
register stages, and assuming that the activity rate depends inversely
on the maximum logic depth between two registers, it
is most beneficial to keep the logic depth as shallow as possible,
and thus, increase the activity rate $\alpha$. This calls for very short (one stage)
pipelining in STSCL systems, which is demonstrated with an
example in Section IV-D.
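The benefit of shallow logic depth can be made concrete with a small sketch. It assumes that every gate draws its tail current for the entire clock cycle whether or not it switches, that the clock period equals the logic depth times one gate delay, and that the overhead of pipeline latches is ignored. The function name and the example numbers (120 gates, 0.05 fJ per gate) are hypothetical.

```python
def energy_per_cycle(n_gates: int, logic_depth: int, pdp_per_gate: float) -> float:
    """Energy (joules) drawn by a block of n_gates during one clock cycle.

    Each gate dissipates its static power for the whole cycle, and the cycle
    lasts logic_depth gate delays, so the energy is n_gates * logic_depth * PDP.
    """
    if n_gates < 0 or logic_depth < 1:
        raise ValueError("need n_gates >= 0 and logic_depth >= 1")
    return n_gates * logic_depth * pdp_per_gate


if __name__ == "__main__":
    pdp = 0.05e-15  # assumed per-gate PDP, joules
    for depth in (1, 2, 4, 8, 16):
        e_cycle = energy_per_cycle(120, depth, pdp)
        print(f"logic depth {depth:2d}: {1e15 * e_cycle:.1f} fJ per cycle")
```

Under these assumptions the energy per cycle grows linearly with logic depth, because gates that have already settled keep drawing current while the rest of the path evaluates.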
The output load capacitance $C_L$ is partially due to the device
parasitic capacitances, such as the capacitance of the n-well to
p-substrate reverse biased diode, and partially due to the wiring capacitance
related to interconnections. Since the n-well to p-substrate
capacitance for small size pMOS devices is less than 1 fF,
it can be ignored in comparison to the wiring capacitance,
which can be much larger even for simple circuits.
Regarding (12), one can conclude that the achievable
power–delay product per unit capacitance would be $\ln 2 \cdot V_{DD} V_{SW}$,
where $V_{DD}$ is the supply voltage and $V_{SW}$ is the output voltage swing.
This means that for a supply voltage of 400 mV and
$V_{SW}$ = 150–200 mV, the minimum achievable PDP would
be 0.04–0.06 fJ/fF/Gate. Since the total parasitic capacitance
due to the STSCL gate itself (including the n-well capacitance) is less than 1 fF,
the minimum PDP that can be expected for an unloaded gate is
correspondingly below about 0.06 fJ/Gate. Notice that PDP also depends on
temperature through the thermal voltage $U_T$ and can be reduced by reducing the
temperature.
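The figures above can be cross-checked with a short sketch that sweeps the tail current and evaluates delay, power, and their product. The ln 2 factor in the delay is an assumption that reproduces the 0.04–0.06 fJ/fF range quoted above; the function names and the 1 fF load are illustrative.

```python
import math


def gate_delay(v_swing: float, c_load: float, i_tail: float) -> float:
    """Gate delay (seconds), assumed to be ln(2) * V_SW * C_L / I_SS."""
    return math.log(2.0) * v_swing * c_load / i_tail


def gate_power(v_dd: float, i_tail: float) -> float:
    """Static power (watts) of one gate: V_DD * I_SS."""
    return v_dd * i_tail


def power_delay_product(v_dd: float, v_swing: float, c_load: float) -> float:
    """PDP (joules) = ln(2) * V_DD * V_SW * C_L, independent of the tail current."""
    return math.log(2.0) * v_dd * v_swing * c_load


if __name__ == "__main__":
    v_dd, v_swing, c_load = 0.4, 0.15, 1e-15  # volts, volts, farads (assumed)
    for i_tail in (10e-12, 1e-9, 100e-9):
        t_d = gate_delay(v_swing, c_load, i_tail)
        p = gate_power(v_dd, i_tail)
        print(f"I_SS = {i_tail:.0e} A: delay = {t_d:.2e} s, power = {p:.2e} W, "
              f"PDP = {1e15 * p * t_d:.4f} fJ")
```

The sweep shows the delay and power each changing by four orders of magnitude while their product stays fixed, which is the decoupling described in Observation 1.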
IV. TEST STRUCTURES AND MEASUREMENT RESULTS
A. Ring Oscillator and Divider Operation
Fig. 9. Measured oscillation frequency versus power dissipation of the eight-stage
ring oscillator based on the proposed STSCL topology for $V_{DD}$ = 0.3 V,
0.4 V, and 1.0 V.
To measure the delay versus power consumption for the proposed
STSCL gates, a test chip has been designed and fabricated
in conventional 0.18 µm CMOS technology. The test structures
consist of eight-stage ring oscillator and frequency divider
(divide-by-8) circuits, both of which are implemented based on
a two-input multiplexer (MUX) STSCL gate. The micropho-
tographs of the test circuits are shown in Fig. 8. To control the
operation of the test circuits, the tail bias current of the SCL
gates can be adjusted externally. Internal current mirrors with
the ratio of 1/100 are used to simplify the measurement process.
The supply voltages of the test blocks are directly accessible
to measure the total power consumption of each block using
HP4156A Semiconductor Analyzer. An internal replica bias cir-
cuit has been applied to control the voltage swing at the output
of the gates, as described in Section II-D, ensuring a minimum
output swing of 100 mV. The die-to-die variation of the gate
bias voltage of the pMOS load devices (see Fig. 5) required to ensure a fixed voltage
swing of 150 mV at a given tail current was found to be less than
%, in conventional 0.18 µm CMOS technology.
Fig. 9 illustrates the measured oscillation frequency of an
eight-stage ring oscillator with differential STSCL NAND
gates (which are constructed based on two-input MUX) in
comparison to the simulation results. The conventional CMOS
oscillator used for comparison is built with two-input standard
NAND gates in the same 0.18 µm CMOS technology with
driving strength of 1. As depicted in this figure, the mea-
surement results of the STSCL oscillator are very close to the
simulation results, and consistent over a range of several orders
of magnitude. Meanwhile, PDP is very well predictable by (12).
This figure also shows the results for the CMOS ring oscillator,
operating in subthreshold regime with different supply voltage
values between 0.1 and 0.4 V.
The divide-by-8 circuit has been realized using the source-coupled
latch structure as shown in Fig. 10. Since all transistors
operate in weak inversion, the device dimensions can be
kept close to minimum size. The measured maximum operating
(input) frequency of the divider is plotted against power dissipation
in Fig. 11(a) for two supply voltages, comparing
the results with the performance of an optimized CMOS
frequency divider operating in subthreshold regime. While the
CMOS divider cannot sustain correct operation below 200 mV
supply voltage, the SCL divider with the bulk-drain connected
pMOS load continues its operation down to 10 pA/Gate of tail
current, and 3 kHz of input frequency. The resulting (measured)
PDP corresponds to less than 1 fJ/Gate.
Fig. 10. (a) STSCL latch circuit schematic, and (b) the topology of the divide-by-8
circuit used for measurement, consisting of three D-flip-flop (DFF) stages.
To compare the performance of the STSCL gates at scaled
technology nodes, the maximum operating frequency of a
divide-by-8 circuit has been simulated using technology pa-
rameters for 90 nm, 130 nm, and 180 nm CMOS processes
[Fig. 11(b)]. Here, it is assumed that the DFF gates are loaded
with the same amount of interconnect capacitance, and all
leakage components are taken into account. It can be seen
that the STSCL frequency divider exhibits very similar perfor-
mance in different technology nodes. It is possible to reduce
the tail bias current of the circuit down to 10 pA in a controlled
manner both in 130 nm and 90 nm technologies, whereas the
subthreshold leakage current would be very difficult to limit in
conventional CMOS logic circuits.
Considering the results presented in Figs. 9 and 11, it can
be observed that the STSCL solution can successfully extend
the range of operation by two orders of magnitude along the
power axis, and by about one order of magnitude along the
frequency axis, while allowing completely separate control of
voltage swing and power dissipation.
Fig. 11. (a) Measured maximum frequency of operation versus power dissipation
of the divide-by-8 frequency divider shown in Fig. 10 for two supply voltages,
including $V_{DD}$ = 1.0 V. (b) Simulated maximum operating frequency of the STSCL divider in
different technologies (CMOS 90 nm, 130 nm, and 180 nm).
B. Carry–Save Multiplier Using SCL Gates
To illustrate the use of the proposed circuit topology for more
complex functions, a second test chip containing an 8 × 8 bit
parallel carry–save multiplier has been designed and fabricated
using 0.18 µm CMOS technology (Fig. 12). Fig. 13 shows the
measured input-to-output delay of the STSCL-based multiplier
for different supply voltages (including $V_{DD}$ = 0.4 V and 1.0 V), in comparison to
the simulation results. It can be seen that the performance of
the STSCL multiplier is accurately predicted by the simulations.
The supply voltage can be reduced to 0.3 V while the circuit remains
operational over a very wide range of tail bias current.
The saturation behavior of the delay at higher bias currents is
mainly due to the limited swing of the replica bias circuit that is
used to produce the proper gate voltage for the pMOS load devices.
To illustrate the independent control of the delay and the
supply voltage, the PDP versus the delay of the STSCL multiplier
circuit is plotted in Fig. 14 for different bias current levels,
and compared with the variation of PDP of an equivalent CMOS
multiplier circuit, also operating in subthreshold regime. In this
example, the power supply voltage and the output voltage swing
of the STSCL circuit are kept at 0.35 V and 0.15 V, respectively,
resulting in nearly constant PDP of less than 1 pJ over the entire
operating range. The PDP of the CMOS circuit, on the other
hand, varies significantly with $V_{DD}$, due to the quadratic dependence
of PDP on $V_{DD}$, and increasing dominance of leakage at
low $V_{DD}$ values.
Fig. 12. Photomicrograph of the measured STSCL-based 8 × 8 bit carry–save
multiplier.
C. Compound SCL Gates to Improve Power–Delay
Performance
Using STSCL topology, the power consumption of a func-
tional block is directly proportional to the number of logic gates
to be biased with a tail current. Therefore, implementing more
complex logic functions in a single stage SCL gate can be ex-
pected to result in smaller number of gates and hence, reduced
power consumption. In this approach, since the time constant
at the common source nodes (set by the parasitic capacitance
at each common source node) is much smaller than the time
constant at the output node ($\tau$ as shown in (1)), the speed
degradation due to the stacking will be negligible as long as the
number of stacked stages in the nMOS switching network remains
below the limit given by
(17)
Fig. 13. Measured total propagation delay of the proposed STSCL multiplier
versus tail bias current $I_{SS}$ for different supply voltages and in comparison to
the simulation results.
Fig. 14. Comparing the power–delay product versus delay for two 8 × 8
bit carry–save multiplier circuits built with conventional CMOS and STSCL
components.
Fig. 15. (a) Unit block needed to implement a carry–save multiplier consists of a two-input AND gate and a full-adder (FA) [24]. (b) Possible implementation of
the unit block based on STSCL logic (only part of the circuit is shown) in which an AND gate is followed by an adder stage. The total current consumption
in this case would be $2 I_{SS}$, while the total delay of this block would be approximately twice that of a single STSCL gate. (c) Alternative implementation: Merging
the adder and AND functions in a single compound STSCL gate to improve the PDP. All switching nMOS transistors are minimum size devices.
Fig. 15(a) shows a unit cell which is required to implement
the carry–save multiplier [24]. This unit block consists of a
two-input AND gate and a full-adder (FA), and it can be implemented
by two separate SCL gates, as shown in Fig. 15(b).
Alternatively, Fig. 15(c) shows an STSCL gate implemented
by merging two logic functions of AND and XOR on a single
branch and realizing the compound logic operation in one stage.
Using the merged STSCL
gate topology [Fig. 15(c)] results in a significant improvement
of the power–frequency performance of the 8 × 8 multiplier, as
illustrated in Fig. 16(a). The multiplier in this example is built
out of 56 adders and 64 AND gates (total number of gates is
120), of which 49 can be merged with the corresponding adder
as described above. This modification alone results in approximately
40% power reduction. In the general case of an $N \times N$
multiplier, the total number of gates is $2N^2 - N$, and it is
possible to merge $(N-1)^2$ AND gates with adders, resulting
in almost 50% power reduction for higher $N$ values. In addition
to the obvious reduction of tail currents, the merging of
AND gates with adders also reduces the layout area of the unit
cell, and hence, lowers the parasitic capacitance due to wiring.
Finally, the operating frequency is further increased by reducing
the overall logic depth, resulting in about 80% total improvement
of speed at iso-power. While the results are difficult to
generalize for random logic topologies, the merging of complex
logic gates clearly presents a valuable opportunity that can be
exploited for improving the power–frequency performance in
STSCL circuits.
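The gate-count argument for the multiplier can be checked with a few lines of arithmetic. The general expressions used below (N² AND gates, N(N−1) adders, and (N−1)² mergeable AND gates) are inferred from the 8 × 8 figures quoted above and should be read as assumptions; the function names are illustrative. Since every gate draws one tail current, the fraction of gates removed equals the fraction of static power saved.

```python
def multiplier_gate_counts(n: int) -> tuple:
    """Return (and_gates, adders, mergeable_and_gates) for an n x n carry-save multiplier."""
    if n < 1:
        raise ValueError("n must be at least 1")
    and_gates = n * n
    adders = n * (n - 1)
    mergeable = (n - 1) ** 2
    return and_gates, adders, mergeable


def tail_current_saving(n: int) -> float:
    """Fraction of tail currents removed by merging AND gates into adders."""
    and_gates, adders, mergeable = multiplier_gate_counts(n)
    return mergeable / (and_gates + adders)


if __name__ == "__main__":
    for n in (4, 8, 16, 32, 64):
        and_gates, adders, mergeable = multiplier_gate_counts(n)
        print(f"{n:2d} x {n:<2d}: {and_gates + adders:5d} gates, "
              f"{mergeable:5d} merged, saving = {100 * tail_current_saving(n):.1f}%")
```

For N = 8 this gives 120 gates with 49 merged, about 41% saving, and the saving approaches 50% as N grows, consistent with the trend described in the text.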
Fig. 16. (a) Power–frequency improvement that can be achieved in the
8 × 8 carry–save multiplier circuit, by using compound gates as described
in Fig. 15. (b) Comparison of maximum operation frequency versus power
consumption for two identical 8 × 8 carry–save multipliers, implemented with
merged STSCL gates ($V_{DD}$ = 0.35 V) and with CMOS gates ($V_{DD}$ = 0.2 V–0.5 V).
Fig. 16(b) compares the maximum frequency of operation for
an 8 × 8 bit STSCL carry–save multiplier with merged components
(operating at $V_{DD}$ = 0.35 V) and that of a conventional
CMOS multiplier, operating at $V_{DD}$ = 0.2 V–0.5 V. It can be
seen that the power–frequency performance of the STSCL cir-
cuit is comparable to, and in many cases better than, the CMOS
equivalent, over a wide frequency range. The main drawback of
the merged-gate approach is a slight increase of the minimum
useable supply voltage, since compound gates with more levels
typically require a higher supply voltage. However, this is a rela-
tively minor limitation as long as the nMOS network transistors
are biased in subthreshold regime.
D. Shallow Pipelining to Improve Activity Rate
Fig. 17. (a) Section of the parallel multiplier where the signal flow is regulated
using two-phase micro-pipelining technique for improving the performance of
SCL gates. Note that every FA stage output is followed by a keeper/latch stage.
(b) Eye diagram of the output of the multiplier circuit. This plot shows the output
after SCL-to-CMOS level converter circuit. Input is a    pseudo-random bit
stream (PRBS). Here, the period of input data is     s, with bias currents of
10 nA (FA stages) and 100 pA (keeper stages), i.e., the keeper stages dissipate
only 1% of the power dissipated by the FA stages.
Fig. 18. Power–frequency improvement that can be achieved in the 8 × 8
carry–save multiplier circuit, by using shallow pipelining with keeper-latch
stages.
As already discussed in Section III, the power-to-frequency
ratio of STSCL circuits (i.e., the power efficiency to operate at
a given frequency) can be significantly improved by increasing
the activity rate using shallow pipelining and by reducing logic
depth as much as possible. One possibility is to implement
two-phase latch-based pipelining where the output of each gate
is latched during one clock phase, and passed on to the next
stage during the other clock phase—effectively reducing the
maximum logic depth to two consecutive gates. Instead of using
explicit latch stages, such two-phase pipelining can be achieved
by increasing (and reducing) the source (tail) current bias of al-
ternating stages, using the gate terminal of the tail current bias
transistor of each stage as the “clock” input. In this approach, il-
lustrated in Fig. 17 for the example of the carry–save multiplier
architecture, the current bias of odd stages is reduced to a low
(yet non-zero) level to retain (hold) their output while the cur-
rent bias of even stages is raised to the nominal operating value
to enable evaluation. Very simple cross-coupled “keeper” stages
connected to each gate output ensure that the output levels do not
degrade significantly during the “hold” phase. Fig. 17(a) shows
the circuit topology of an adder (sum generator) stage and the
output keeper stage, where the pulsed tail bias achieves a very
robust dynamic latching effect, augmented by the output keeper
with a tail bias current of 100 pA. In an 8 × 8 bit carry–save
multiplier circuit, taking into account the additional power
overhead of pipelining (which is only 1%), shallow pipelining using
keeper-latch stages will result in an overall power–frequency
improvement by a factor of 5 (Fig. 18).
The pipelining technique described above can certainly be ap-
plied in combination with the gate-merging approach discussed
in Section IV-C, to improve the power–frequency performance
of subthreshold SCL circuits considerably.
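The effect of the two-phase pipelining described above on the power-to-frequency ratio can be illustrated with a rough estimate. The Python sketch below assumes that every gate draws its bias current continuously, that the clock period is set by the logic depth times a single gate delay, and that pipelining reduces the logic depth to two gates at the cost of the keeper bias current. The 10 nA and 100 pA values are those appearing in the Fig. 17 caption; the gate count, gate delay, supply voltage and unpipelined logic depths are hypothetical, and the reduced bias of the holding stages is not modeled.

```python
def power_to_frequency_ratio(n_gates, logic_depth, i_ss, v_dd, t_gate, overhead=0.0):
    """Energy drawn per clock cycle [J] for a block of always-on STSCL gates."""
    power = n_gates * v_dd * i_ss * (1.0 + overhead)  # static bias power
    f_max = 1.0 / (logic_depth * t_gate)              # one cycle spans the logic depth
    return power / f_max

I_SS = 10e-9         # FA tail bias current [A] (value from the Fig. 17 caption)
I_KEEPER = 100e-12   # keeper tail bias current [A] (value from the Fig. 17 caption)
V_DD = 0.35          # hypothetical supply voltage [V]
T_GATE = 1e-6        # hypothetical single-gate delay [s]
N_GATES = 64         # hypothetical gate count

keeper_overhead = I_KEEPER / I_SS  # relative power overhead of the keeper stages

for depth in (4, 8, 16):           # hypothetical unpipelined logic depths
    flat = power_to_frequency_ratio(N_GATES, depth, I_SS, V_DD, T_GATE)
    piped = power_to_frequency_ratio(N_GATES, 2, I_SS, V_DD, T_GATE, keeper_overhead)
    print(f"depth {depth:2d}: improvement = {flat / piped:.2f}x")
```

Under these assumptions the improvement is simply half the original logic depth, reduced slightly by the keeper overhead. The sketch is meant to show the trend only and does not reproduce the measured result of Fig. 18.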
V. CONCLUSION
A new approach for implementing ultra-low-power source-
coupled logic circuits biased in subthreshold regime has been
demonstrated. The new topology uses compact high resistance
pMOS load devices to provide the required voltage swing at the
output for proper logic operation. Measurement results show
that the tail bias current of each logic gate can be reduced to
less than 10 pA, while the power–delay product of the gate
remains less than 1 fJ, using 0.18 μm CMOS technology. Robust
operation of ring oscillator and frequency divider circuits, as
well as more complex logic blocks (8 × 8 bit carry–save
multiplier), has been demonstrated over a very wide range of
frequencies. Among other advantages, the proposed approach
effectively decouples the circuit propagation delay from the
operating voltage, resulting in near-constant PDP versus frequency.
The bias current of the STSCL gate can be scaled over several
decades using the same device dimensions, which makes this
circuit topology very suitable for ultra-low-power configurable
digital systems.
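The scaling behavior summarized above can be illustrated with a short sweep. The Python sketch below is only a first-order illustration: it assumes a gate power of V_DD·I_SS and a gate delay of ln(2)·V_SW·C_L/I_SS, uses the 300 mV supply quoted in the abstract, and takes hypothetical values for the output swing and load capacitance.

```python
import math

V_DD = 0.3      # supply voltage [V] (value quoted in the abstract)
V_SW = 0.2      # hypothetical output voltage swing [V]
C_L = 10e-15    # hypothetical load capacitance [F]

print(f"{'I_SS [A]':>10} {'P [W]':>10} {'t_d [s]':>10} {'PDP [J]':>10}")
for exponent in range(-11, -7):                 # 10 pA to 10 nA, one point per decade
    i_ss = 10.0 ** exponent
    power = V_DD * i_ss                         # scales linearly with I_SS
    delay = math.log(2) * V_SW * C_L / i_ss     # scales as 1 / I_SS, independent of V_DD
    pdp = power * delay                         # I_SS cancels out
    print(f"{i_ss:10.1e} {power:10.1e} {delay:10.1e} {pdp:10.2e}")
```

In this model the tail current sets both power and speed, while the supply voltage enters only the power term, so the power–delay product stays constant across the sweep.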
ACKNOWLEDGMENT
The authors would like to thank M. Stanisavljevic, B. Rey,
M. Mercaldi, and S. Badel for their valuable contributions in
block design and layout, and S. Hauser for preparing the test
setup.
REFERENCES
[1] M. Horowitz et al., “Low-power digital design,” in Proc. IEEE Int.
Symp. Low Power Electronics and Design (ISLPED), 1994, pp. 8–11.
[2] D. Suvakovic and C. A. T. Salama, “A low Vt CMOS implementation
of an LPLV digital filter core for portable audio applications,” IEEE
Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 47, no. 11,
pp. 1297–1300, Nov. 2000.
[3] G. Gielen, “Ultra-low-power sensor networks in nanometer CMOS,”
in Int. Symp. Signals, Circuits and Systems (ISSCS), Jul. 2007, vol. 1,
pp. 1–2.
[4] B. A. Warneke and K. S. J. Pister, “An ultra-low energy microcontroller
for smart dust wireless sensor networks,” in IEEE Int. Solid-State Cir-
cuits Conf. (ISSCC) Dig. Tech. Papers, 2004, pp. 316–317.
[5] L. S. Wong et al., “A very low-power CMOS mixed-signal IC for im-
plantable pacemaker applications,” IEEE J. Solid-State Circuits, vol.
39, no. 12, pp. 2446–2456, Dec. 2004.
[6] E. Vittoz, “Weak inversion for ultimate low-power logic,” in
Low-Power Electronics Design, C. Piguet, Ed. Boca Raton, FL:
CRC Press, 2005.
[7] C. Enz and E. Vittoz, Charge-Based MOS Transistor Modeling: The
EKV Model for Low-Power and RF IC Design. New York: Wiley,
2006.
[8] C. Enz, F. Krummenacher, and E. Vittoz, “An analytical MOS tran-
sistor model valid in all regions of operation and dedicated to low-
voltage and low-current applications,” Analog Integr. Circuits Signal
Process. J., vol. 8, pp. 83–114, Jun. 1995.
[9] C. Enz, M. Punzenberger, and D. Python, “Low-voltage log-domain
signal processing in CMOS and BiCMOS,” IEEE Trans. Circuits Syst.
II, Analog Digit. Signal Process., vol. 46, no. 3, pp. 279–289, Mar.
1999.
[10] B. H. Calhoun, A. Wang, and A. Chandrakasan, “Modeling and sizing
for minimum energy operation in subthreshold circuits,” IEEE J. Solid-
State Circuits, vol. 40, no. 9, pp. 1778–1786, Sep. 2005.
[11] B. H. Calhoun and A. Chandrakasan, “Ultra-dynamic voltage scaling
(UDVS) using subthreshold operation and local voltage dithering,”
IEEE J. Solid-State Circuits, vol. 41, no. 1, pp. 238–245, Jan. 2006.
[12] R. Amirtharajah and A. Chandrakasan, “A micropower programmable
DSP using approximate signal processing based on distributed arith-
metic,” IEEE J. Solid-State Circuits, vol. 39, no. 2, pp. 337–347, Feb.
2004.
[13] H. Soeleman, K. Roy, and B. C. Paul, “Robust subthreshold logic
for ultra-low-power operation,” IEEE Trans. Very Large Scale Integr.
(VLSI) Syst., vol. 9, no. 1, pp. 90–99, Sep. 2001.
[14] A. Chandrakasan and R. Brodersen, “Minimizing power consumption
in digital CMOS circuits,” Proc. IEEE, vol. 83, no. 4, pp. 498–523,
Apr. 1995.
[15] J. M. Musicer and J. Rabaey, “MOS current mode logic for low power,
low noise CORDIC computation in mixed-signal environment,” in
Proc. IEEE Int. Symp. Low Power Electronics and Design (ISLPED),
2000, pp. 102–107.
[16] S. Badel and Y. Leblebici, “Breaking the power–delay tradeoff: Design
of low-power high-speed MOS current-mode logic circuits operating
with reduced supply voltage,” in Proc. IEEE Int. Symp. Circuits and
Systems (ISCAS), May 2007, pp. 1871–1874.
[17] M. Alioto and G. Palumbo, Model and Design of Bipolar and MOS
Current-Mode Logic (CML, ECL and SCL Digital Circuits). New
York: Springer, 2005.
[18] E. Brauer and Y. Leblebici, “Semiconductor based high-resistance
device and logic application,” European Patent Application No.
07104895.3-1235, Mar. 26, 2007.
[19] A. Tajalli, E. Vittoz, Y. Leblebici, and E. J. Brauer, “Ultra low
power subthreshold MOS current mode logic circuits using a novel
load device concept,” in Proc. European Solid-State Circuits Conf.
(ESSCIRC), Munich, Germany, Sep. 2007, pp. 281–284.
[20] F. Cannillo and C. Toumazou, “Nano-power subthreshold cur-
rent-mode logic in sub-100 nm technologies,” IEE Electron. Lett., vol.
41, no. 23, pp. 1268–1269, Nov. 2005.
[21] F. Cannillo, C. Toumazou, and T. S. Lande, “Bulk-drain connected load
for subthreshold MOS current-mode logic,” IEE Electron. Lett., vol. 43,
no. 12, pp. 662–664, Jun. 2007.
[22] P. R. Gray, P. J. Hurst, S. H. Lewis, and R. G. Meyer, Analysis and
Design of Analog Integrated Circuits, 4th ed. New York: Wiley, 2000.
[23] M. Alioto and G. Palumbo, “Power-aware design techniques for
nanometer MOS current-mode logic gates: A design framework,”
IEEE Circuits Syst. Mag., vol. 6, no. 4, pp. 40–59, 2006.
[24] J. M. Rabaey, A. Chandrakasan, and B. Nikolic, Digital Integrated Cir-
cuits: A Design Perspective. New York: Prentice-Hall, 2003.
Armin Tajalli (S’04) received the B.S. and M.S.
degrees (Hons.) in electrical engineering from Sharif
University of Technology, Tehran, Iran, and Tehran
Polytechnic University in 1997 and 1999, respec-
tively, and the Ph.D. degree from Sharif University
of Technology in 2006 (Hons.).
From 1998 to 2004, he was with Emad Semicon as
a Senior Analog Design Engineer. In 2006, he joined
Microelectronic Systems Laboratory (LSM) in
Ecole Polytechnique Fédérale de Lausanne (EPFL),
Switzerland, working on ultra-low-power circuit
design techniques.
Dr. Tajalli received the award of the Best Design Engineer from Emad
Semicon, 2001, the Kharazmi Award on Research and Development, 2002, and
the Presidential Award of the best Iranian researchers, 2003.
Elizabeth J. Brauer (M’94) received the Ph.D. de-
gree in electrical engineering from the University of
Illinois at Urbana-Champaign in 1994.
She is presently an Associate Professor of Elec-
trical Engineering at Northern Arizona University,
Flagstaff. She has worked for Motorola and Fairchild
Semiconductor, and taught at the University of Ken-
tucky. Her technical interests are in computer-aided
design, verification, and testing of integrated circuits,
microelectronics and biomimetic circuits.
Yusuf Leblebici (M’90–SM’98) received the B.S.
and M.S. degrees in electrical engineering from
Istanbul Technical University, Istanbul, Turkey, in
1984 and 1986, respectively, and the Ph.D. degree
in electrical and computer engineering from the
University of Illinois at Urbana-Champaign in 1990.
From 1991 to 1993, he was Visiting Assistant
Professor of electrical and computer engineering
at the University of Illinois at Urbana-Champaign.
From 1993 to 1998, he was on the faculty of Istanbul
Technical University as Associate Professor of
electrical engineering. He was Associate Professor of electrical and computer
engineering at Worcester Polytechnic Institute (WPI) in Massachusetts between
1998 and 2001, where he established and directed the VLSI Design Laboratory,
and served as Project Director at the New England Center for Analog and
Mixed-Signal IC Design. From 2000 to 2001, he also took the responsibility
of developing the microelectronics degree program at Sabanci University,
as the Microelectronics Program Coordinator. Since 2002, he has been a
Chair Professor at the Swiss Federal Institute of Technology in Lausanne
(EPFL), Switzerland, and Director of Microelectronic Systems Laboratory. His
research interests include design of high-speed CMOS digital and mixed-signal
integrated circuits, computer-aided design of VLSI systems, intelligent sensor
interfaces, modeling and simulation of semiconductor devices, and VLSI
reliability issues. He is the coauthor of two textbooks, Hot-Carrier Reliability
of MOS VLSI Circuits (Kluwer Academic Publishers, 1993) and CMOS Digital
Integrated Circuits: Analysis and Design (McGraw Hill, 1996, 1998, and
2002), as well as more than 150 scientific articles published in international
journals and conferences.
Dr. Leblebici has been on the organizing and steering committees of several
international conferences in microelectronics. He served as an Associate Editor
of IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II between 1998 and 2000,
and as an Associate Editor of IEEE TRANSACTIONS ON VLSI between 2001 and
2003. He received the Young Scientist Award of the Turkish Scientific and Tech-
nological Research Council in 1995, and the Joseph Samuel Satin Distinguished
Fellow Award of the Worcester Polytechnic Institute in 1999.
Eric Vittoz (M’72–SM’87–F’89–LF’04) received
the electrical engineering degree from Polytechnical
School University of Lausanne, Switzerland, in 1961
and the Ph.D. degree from EPFL (Swiss Institute of
Technology Lausanne) in 1969.
He joined the Watchmakers Electronic Center
(CEH) in 1962 as a member of the team that de-
veloped the first quartz watch. He became head of
the Advanced Circuit Department at CEH in 1967
and was appointed Vice Director and head of the
Applications Division in 1971. In 1984, he took the
responsibility of the Circuits and Systems Research Division of the newly
founded CSEM (Swiss Center for Electronics and Microtechnology), where he
was appointed Executive Vice-President in 1991, head of Integrated Circuits
and Systems, then head of Advanced Microelectronics after 1997. Since 2004,
he has been retired from CSEM after spending three years of partial retirement as
a Fellow researcher. Since 1975, he has been teaching analog circuit design,
and supervising undergraduate and graduate student projects at EPFL, where
he became Professor in 1982. He has authored or co-authored more than 140
papers and holds 26 patents in the fields of very low-power microelectronics,
compact transistor modeling, analog CMOS circuit design and biology-inspired
analog VLSI.
Dr. Vittoz has been involved in the formation of the IEEE Solid-State Circuits
Society and was a member of its AdCom from 1996 to 1999. He was a member
of the European Program Committee of ISSCC from 1977 to 1989, and, for
more than 25 years, a member of the Steering Committee of ESSCIRC, the
European Solid-State Circuits Conference. A Life Fellow of the IEEE, he is the
recipient of the 2004 IEEE Solid-State Circuits Award.
