Design and Analysis of 8 Bit Parallel Prefix Comparators Using Constant Delay Logic  by George, Amy Mariam & Chandran, G. Jyothish
 Procedia Technology  24 ( 2016 )  1178 – 1185 
2212-0173 © 2016 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license 
(http://creativecommons.org/licenses/by-nc-nd/4.0/).
Peer-review under responsibility of the organizing committee of ICETEST – 2015
doi: 10.1016/j.protcy.2016.05.074 
International Conference on Emerging Trends in Engineering, Science and Technology (ICETEST - 2015)
Design and Analysis of 8 Bit Parallel Prefix Comparators using 
Constant Delay Logic 
Amy Mariam George a,*, Jyothish Chandran G. b
a, b Dept. of Electronics and Communication, SAINTGITS College of Engineering, Kottayam, India, 686532 
Abstract 
This paper presents 8 bit radix 2 parallel prefix comparators built with Constant Delay (CD) logic. CD logic
predischarges its output to logic 0 and switches it to logic 1 through a clocked PMOS transistor in the critical path.
In its D-Q mode of operation, CD logic is much faster than a dynamic logic circuit. The comparator architecture has
two stages: the first stage uses pass transistor pre-encoding circuitry to achieve low power consumption, and the
second stage uses a high performance combination of dynamic, CD and static logic comparators. Design and simulation
were carried out in the Mentor Graphics ELDO simulator using 180 nm technology.
Keywords: Constant Delay Logic; Dynamic Logic; Parallel Prefix Comparators; Small Swing Dynamic Comparator; 
1. Introduction 
A binary comparator is a combinational logic circuit with a wide variety of applications in electronics. High speed
adders were once preferred for high performance comparison, at the cost of area and power consumption. Earlier
comparator designs used priority encoding algorithms [1] and bitwise competition logic [2], both of which show better
delay performance. Dynamic logic was commonly used to achieve high performance, although it is not suited to low
power operation. More recently, tree based comparators analogous to the carry merge tree of a parallel prefix adder
were proposed. This is one of the fastest architectures, since the delay for comparing two N bit numbers depends only
on the logarithm of N [3]. This paper presents a new and improved comparator realized using constant delay logic.
The comparator keeps power dissipation low by using pass transistor logic and achieves high performance by using a
tree structure. Constant delay logic is used in the timing-critical stages to reduce the overall delay without
compromising energy consumption.
* Amy Mariam George. Tel.: +0-9496-686821.
E-mail address: amymariam@gmail.com
A digitally tunable delay replica that controls the clock signal is added to ensure robust operation [4].
The proposed comparator design uses a small swing dynamic comparator, which reduces the output voltage swing.
The paper presents an energy efficiency and delay analysis of binary tree based comparators with and without clock
gating. Adding clock gating circuitry reduces the overall energy consumption without compromising performance.
The paper is organized into five sections. Section 2 covers the characteristics of constant delay logic.
Section 3 describes the parallel prefix comparators and the proposed design. Section 4 gives the simulation results
and their comparison. Section 5 concludes the paper.
2. Constant Delay Logic 
CMOS is a widely used technology for constructing ICs. Commonly used circuit families include ratioed logic,
dynamic logic and pass transistor logic. Recently, a high performance logic style known as FeedThrough Logic (FTL)
was proposed. However, FTL exhibits excess direct path current, reduced noise margin and excessive power
dissipation [5]. Furthermore, cascading multiple FTL stages to perform complex logic functions is not feasible. To
mitigate these shortcomings, the high performance Constant Delay (CD) logic was proposed. CD logic consists of a
Timing Block and a Logic Block. The Timing Block (TB) creates an adjustable window period, which reduces static
power dissipation by limiting the contention period. The Logic Block (LB) eliminates unwanted glitches and makes
cascading of CD logic feasible. During the predischarge phase, when the clock is high, CD logic discharges nodes X
and Y to logic 0. When the clock goes low, the logic enters the evaluation phase and operates in one of three modes:
contention mode, C-Q delay mode or D-Q delay mode [6].
i) Contention Mode
Contention mode occurs when the input is at logic 1 while the clock is low. X then rises to a nonzero voltage
level and the output experiences a short-term glitch. The duration of this glitch depends on the delay between CLK
and CLK_d, which is the local window width. If X is still low when CLK_d goes high, Y rises to logic 1 and turns M1
off. This ends the contention period and eliminates the temporary glitch.
ii) C-Q Delay Mode
C-Q delay mode occurs when the input falls from logic 1 to logic 0 before the clock goes low. When the clock goes
low, X rises to logic 1, so Y remains at logic 0 for the entire evaluation period.
iii) D-Q Delay Mode
D-Q delay mode uses the pre-estimation characteristic of CD logic to achieve high performance. Here the clock
falls from logic 1 to logic 0 before the input transitions, so X initially rises to a nonzero voltage level. If the
input then becomes logic 0 while Y is still low, X rises to logic 1.
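The three modes differ only in the order of the falling clock edge, the input transition and the end of the local window. A minimal behavioural sketch in Python is given below. It is not a circuit model: the function name `cd_mode` and its timing arguments are hypothetical, and an input that falls only after the window has closed is assumed here to be treated as contention.

```python
from typing import Optional

def cd_mode(t_clk_fall: float, t_in_fall: Optional[float], window: float) -> str:
    """Classify the evaluation-phase behaviour of a CD gate (illustrative only).

    t_clk_fall: time at which CLK goes low (start of evaluation).
    t_in_fall:  time at which the input falls from 1 to 0,
                or None if the input stays at logic 1.
    window:     delay between CLK and CLK_d (local window width).
    All three arguments are assumptions made for this sketch.
    """
    if t_in_fall is not None and t_in_fall < t_clk_fall:
        # Input settled to 0 before the clock edge.
        return "C-Q"
    if t_in_fall is not None and t_in_fall < t_clk_fall + window:
        # Clock fell first and the input arrived while the window was open.
        return "D-Q"
    # Input stayed at 1 (or fell too late): CLK_d closes the window.
    return "contention"

print(cd_mode(10.0, 8.0, 2.0))
print(cd_mode(10.0, 11.0, 2.0))
print(cd_mode(10.0, None, 2.0))
```

In this sketch, widening `window` enlarges the range of input arrival times that still fall in D-Q mode, at the cost of a longer possible contention period.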
The Timing Block of CD logic effectively reduces power dissipation during contention mode. The adjustable local
window technique lets designers tune the window width for different logic expressions, minimizing power dissipation
while maintaining performance. CD logic also eliminates the false logic evaluation that occurs in cascaded FTL.
In the parallel prefix comparator design, the CD logic is modified so that the transistor overhead of the
Timing Block (TB) is reduced by 20%. In addition, the predischarge transistor used in the CD logic is no longer
required, because the first stage dynamic comparator always charges to logic 1 during the precharge phase and
thereby pulls the internal node of the CD logic down to logic 0. The static inverted comparison circuit shown in
Figure 5.8 acts as the Logic Block of the CD logic comparator and suppresses any unwanted glitch at the output while
the final stage comparison is computed [4].
       
Fig. 1. Constant Delay Logic.
Fig. 2. Modified Constant Delay Logic.
3. 8 Bit Parallel Prefix Comparators 
The parallel prefix comparator is motivated by the fact that the generate (G) and propagate (P) signals can be 
defined for binary comparisons, similar to generate and propagate signals for binary additions. Hence, a parallel 
prefix comparator can be considered as a subset of the carry merge tree of a parallel prefix adder, where only the 
final carry out signal is necessary in determining the result.  
• Basic Design Principle
A comparison of two 2-bit binary numbers (A1A0 and B1B0) can be realized with
$A_{Big} = A_1\overline{B_1} + \overline{(A_1 \oplus B_1)}\,A_0\overline{B_0}$                                         (1)
$B_{Big} = \overline{A_1}B_1 + \overline{(A_1 \oplus B_1)}\,\overline{A_0}B_0$                                         (2)
$EQ = \overline{(A_1 \oplus B_1)}\;\overline{(A_0 \oplus B_0)}$                                         (3)
If B > A, then ($B_{Big}$, EQ) is (1,0). ($B_{Big}$, EQ) is (0,0) if A > B and (0,1) if A = B. Equation (1) is
analogous to carry generation in binary addition, which is given by
$C_{out} = AB + (A \oplus B)C_{in} = G + PC_{in}$                         (4)
where A and B are the binary inputs, $C_{in}$ and $C_{out}$ are the carry input and output, and G and P are the
generate and propagate signals, respectively. Comparing (2) and (4), we can define
$G_1 = \overline{A_1}B_1$, $EQ_1 = \overline{A_1 \oplus B_1}$ and $C_{in} = \overline{A_0}B_0$ for $B_{Big}$.
The encoding equations are given as
$G_{(i)} = \overline{A_{(i)}}B_{(i)}$ and $EQ_{(i)} = \overline{A_{(i)} \oplus B_{(i)}}$                        (5)
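The 2-bit relations (1) to (3) can be checked exhaustively with a short Python sketch. The function name `compare_2bit` is hypothetical, and EQ is evaluated as the AND of the two bitwise XNOR terms, which is what the (0,1) encoding for A = B requires.

```python
from typing import Tuple

def compare_2bit(a: int, b: int) -> Tuple[int, int, int]:
    """Return (A_Big, B_Big, EQ) for 2-bit inputs, following Eqs. (1)-(3)."""
    a1, a0 = (a >> 1) & 1, a & 1
    b1, b0 = (b >> 1) & 1, b & 1
    xnor1 = 1 - (a1 ^ b1)  # complement of A1 xor B1
    xnor0 = 1 - (a0 ^ b0)  # complement of A0 xor B0
    a_big = (a1 & (1 - b1)) | (xnor1 & a0 & (1 - b0))   # Eq. (1)
    b_big = ((1 - a1) & b1) | (xnor1 & (1 - a0) & b0)   # Eq. (2)
    eq = xnor1 & xnor0                                  # Eq. (3)
    return a_big, b_big, eq

# Check all 16 input pairs against ordinary integer comparison.
for a in range(4):
    for b in range(4):
        assert compare_2bit(a, b) == (int(a > b), int(b > a), int(a == b))
```

The loop passes silently when the three outputs agree with integer comparison for every 2-bit input pair.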
Fig. 3. 8 bit Tree Diagram of Comparator.
Fig. 4. Comparison Generation Circuit.
Fig. 3 shows the 8 bit tree diagram of the static parallel prefix comparator. The pre-encoding circuit (Fig. 5) is
optimized for energy efficiency and is intended to minimize the total transistor count. It therefore uses a pass
transistor logic style, which minimizes the number of transistors required. The comparison generation circuit
(Fig. 4) is optimized for power delay product, so static logic is used to achieve low power and good
performance.
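The merge performed at each node of the tree in Fig. 3 can be sketched in Python. This is an illustrative model, not the authors' circuit: the function names are hypothetical, and the merge rule G = G_hi + EQ_hi·G_lo, EQ = EQ_hi·EQ_lo is assumed by extending Eq. (2) from single bits to groups of bits, in the same way a carry merge tree combines generate and propagate signals.

```python
from typing import List, Tuple

Pair = Tuple[int, int]  # (G, EQ) for a group of bits

def pre_encode(a: int, b: int, width: int = 8) -> List[Pair]:
    """Per-bit (G, EQ) pairs following Eq. (5), most significant bit first."""
    pairs = []
    for i in reversed(range(width)):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        pairs.append(((1 - ai) & bi, 1 - (ai ^ bi)))
    return pairs

def merge(hi: Pair, lo: Pair) -> Pair:
    """Combine a more significant group with a less significant one (assumed rule)."""
    g_hi, eq_hi = hi
    g_lo, eq_lo = lo
    return g_hi | (eq_hi & g_lo), eq_hi & eq_lo

def tree_compare(a: int, b: int, width: int = 8) -> Tuple[int, int, int]:
    """Return (B_Big, EQ, tree depth); width is assumed to be a power of two."""
    level = pre_encode(a, b, width)
    depth = 0
    while len(level) > 1:
        # Merge adjacent groups pairwise, halving the number of groups.
        level = [merge(level[k], level[k + 1]) for k in range(0, len(level), 2)]
        depth += 1
    b_big, eq = level[0]
    return b_big, eq, depth

print(tree_compare(0b11111111, 0b01111111))  # A > B, so (B_Big, EQ) = (0, 0); depth 3
```

For 8 bit inputs the loop runs three times, which matches the logarithmic depth of the tree.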
 
Fig. 5. Static Pre-encoding Circuitry 
3.1. Radix 2 Parallel Prefix CD Comparator with clock gating 
The parallel prefix comparator structure is divided into two stages. The first stage comprises eight pass
transistor pre-encoding circuits in parallel (Fig. 6). The second stage consists of the 8 bit comparator itself and
is shown together with the clock generation circuit in Fig. 7.
The first stage produces a radix 2 output that is merged in a footed dynamic comparator. A CD logic comparator is
used in the second stage because of its domino compatibility, and it acts as a high performance interface between the
dynamic and the static logic comparators. The clock tree is arranged so that the CD comparator always operates in
the high performance D-Q delay mode. The static inverted comparison circuit acts as the Logic Block for the CD
comparator and computes the final stage comparison result.
 
Fig. 6. Stage 1 of Radix 2 Parallel Prefix CD Comparator with clock gating 
 
Fig. 7. Stage 2 of Radix 2 Parallel Prefix CD Comparator with clock gating 
The clock generation circuit includes a digitally tunable delay replica [7] and a clock gating circuit, which are
controlled by two EQ signals, EQ7 and EQ6. The delay replica ensures that the inputs to all the dynamic
comparators arrive before CLK0 and CLK1. In the parallel prefix comparator, one dynamic comparator and one CD
comparator are triggered in every clock cycle by CLK1 and CLKCD1. In contrast, the other three dynamic comparators
and the other CD comparator enter the evaluation period only if both EQ7 and EQ6 are at logic 1. The
chance that these three dynamic comparators and one CD logic comparator enter the evaluation period is just 40%, so
the clock gating strategy reduces the energy consumption. Compared to the static logic parallel prefix
comparator, this design reduces the delay by about 46% (from 2.05 ns to 1.11 ns in Table 1).
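The gating condition can be illustrated with a small Python sketch that counts how many comparators evaluate in a cycle. The function name `active_comparators` is hypothetical, and EQ7 and EQ6 are assumed to be the per-bit equality signals of Eq. (5) for the two most significant bits.

```python
from typing import Dict

def active_comparators(a: int, b: int) -> Dict[str, int]:
    """Count comparators that enter evaluation for 8-bit inputs a and b."""
    eq7 = 1 - (((a >> 7) & 1) ^ ((b >> 7) & 1))  # bit 7 of A equals bit 7 of B
    eq6 = 1 - (((a >> 6) & 1) ^ ((b >> 6) & 1))  # bit 6 of A equals bit 6 of B
    gate_open = eq7 & eq6
    return {
        "dynamic": 1 + 3 * gate_open,  # one always on, three gated
        "cd": 1 + gate_open,           # one always on, one gated
    }

print(active_comparators(0b11111111, 0b01111111))  # bit 7 differs, gated part stays idle
print(active_comparators(0b11111111, 0b11111110))  # top two bits match, all evaluate
```

When the two most significant bits already decide the comparison, the gated comparators never leave their idle state, which is where the energy saving comes from.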
 
3.2. Proposed Radix 2 Parallel Prefix CD Comparator with clock gating 
The power consumed by high performance circuits has increased to the point where it limits overall functionality
and performance, for example through heat removal and chip thermal management. Low voltage swing circuit
techniques have become an attractive way to reduce power in high performance circuits. In
small swing domino logic, the voltage swings at the internal nodes of the logic circuits are reduced. The dynamic
comparator can be replaced by a small swing version that reduces the signal amplitude by adding
PMOS and NMOS transistors. This circuit lowers the voltage level of logic 1 and raises the
voltage level of logic 0.
Fig. 8. Small Swing Dynamic Comparator used in Parallel Prefix Comparator 
The small swing dynamic comparator circuit shown in Fig. 8 contains additional transistors that reduce the swing
level by an amount that depends on the transistor sizes. During the precharge mode, when the input is 0, the output
node charges towards VDD. Because the substrate of PMOS transistor MP2 is biased by the output node, the rising
output tends to turn MP2 off, and once MP2 is off no current flows through it. As a result, the output voltage
remains lower than VDD. The circuit operates by the same mechanism in the evaluation mode [8].
4. Simulation and Results 
Design entry was performed in Mentor Graphics Pyxis Schematic using the TSMC 180 nm process
technology, with a supply voltage of 1.8 V at a temperature of 27 °C. The design was simulated and its functionality
verified using the Mentor Graphics ELDO simulator. A clock signal with a 20 ns pulse width and a 40 ns period was
used. After simulation, the waveforms were viewed in EZWave. Physical implementation was carried out by schematic
driven layout in IC Station using 180 nm technology.
4.1. Radix 2 Parallel Prefix Static Comparator 
The output for A > B, i.e. A = 11111111 and B = 01111111, is shown in Fig. 9. This circuit dissipates less power
than the Parallel Prefix CD comparators with and without clock gating.
 
Fig. 9. Simulation Waveform  
4.2. Radix 2 Parallel Prefix CD Comparator without clock gating 
The output for A > B, i.e. A = 11111111 and B = 01111111, is shown in Fig. 10. This circuit dissipates more power
than the Parallel Prefix CD Comparator with clock gating.
 
 
Fig. 10. Simulation Waveform  
4.3. Radix 2 Parallel Prefix CD Comparator with clock gating 
The output for A > B, i.e. A = 11111111 and B = 01111111, is shown in Fig. 11. This circuit dissipates less power
than the Parallel Prefix CD Comparator without clock gating.
Fig. 11. Simulation Waveform  
4.4. Proposed Radix 2 Parallel Prefix CD Comparator with clock gating 
The output of the proposed Radix 2 Parallel Prefix CD Comparator with clock gating for A > B, i.e. A = 11111111 and
B = 01111111, is shown in Fig. 12. This circuit dissipates less power than the Parallel Prefix CD Comparator
without clock gating. It shows better energy efficiency than all the other comparators.
 
 
Fig. 12. Simulation Waveform 
Table 1 compares the results of the parallel prefix comparators. It lists the power, delay, PDP and EDP
when the comparator inputs are A = 11111111 and B = 01111111, i.e. when A > B. Each analysis was run for a
duration of 100 ns using 180 nm technology.
The frequency of operation is 10 MHz for all the circuits and the supply voltage is 1.8 V.
Table 1. Performance Comparison.
Radix 2 Parallel Prefix Comparators    Delay (ns)    Power (nW)    PDP (aJ)    EDP (aJ·ns)
Static                                 2.05          90.05         184.60      378.43
CD without clock gating                1.23          180.80        222.384     273.53
CD with clock gating                   1.11          127.135       141.11      156.64
Proposed CD with clock gating          1.13          102.15        132.795     150.05
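The relative gains implied by Table 1 can be tabulated with a few lines of Python. This is only a convenience sketch: the values are copied from the table as published, the static comparator is taken as the baseline, and the names `rows` and `reduction_vs` are hypothetical.

```python
# (delay in ns, power in nW, PDP in aJ, EDP in aJ*ns), copied from Table 1
rows = {
    "Static": (2.05, 90.05, 184.60, 378.43),
    "CD without clock gating": (1.23, 180.80, 222.384, 273.53),
    "CD with clock gating": (1.11, 127.135, 141.11, 156.64),
    "Proposed CD with clock gating": (1.13, 102.15, 132.795, 150.05),
}

def reduction_vs(baseline: float, value: float) -> float:
    """Percentage reduction of value relative to baseline (negative means an increase)."""
    return 100.0 * (baseline - value) / baseline

base_delay, base_power, base_pdp, base_edp = rows["Static"]
for name, (delay, power, pdp, edp) in rows.items():
    print(
        f"{name}: delay {reduction_vs(base_delay, delay):.1f}%, "
        f"power {reduction_vs(base_power, power):.1f}%, "
        f"PDP {reduction_vs(base_pdp, pdp):.1f}%, "
        f"EDP {reduction_vs(base_edp, edp):.1f}%"
    )
```

A negative percentage in the output indicates that the design is worse than the static baseline on that metric, as is the case for the power of all three CD variants.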
5. Conclusion 
Radix 2 Parallel Prefix 8 bit Constant Delay comparators with and without clock gating have been designed
and simulated. The clock gating circuitry ensures that only the required comparators enter the evaluation
period, thereby reducing energy consumption. Furthermore, the proposed Radix 2 Parallel Prefix CD
comparator with clock gating shows better energy efficiency because it uses a small swing dynamic comparator
to reduce the output voltage swing. The proposed comparator architecture shows better PDP and EDP
than all the other designs. In the clock gated parallel prefix comparators, CD logic is used
exclusively in the timing-critical path, which gives an additional speed advantage over the same design in static
logic only, with comparable energy consumption.
References 
[1] C.-H. Huang and J.-S. Wang, “High-performance and power-efficient CMOS comparators,” IEEE J. Solid-State Circuits, vol. 38, no. 2, pp. 254–262, Feb. 2003.
[2] J.-Y. Kim and H.-J. Yoo, “Bitwise competition logic for compact digital comparator,” in Proc. IEEE Asian Solid-State Circuits Conf., pp. 59–62, 2007.
[3] P. Chuang, D. Li, and M. Sachdev, “A low-power high-performance single-cycle tree-based 64-bit binary comparator,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 59, no. 2, pp. 108–112, Feb. 2012.
[4] P.-J. Chuang, D. Li, M. Sachdev, and V. Gaudet, “A 167-ps 2.34-mW single-cycle 64-bit binary tree comparator with constant-delay logic in 65-nm CMOS,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 61, no. 1, Jan. 2014.
[5] V. Navarro-Botello, J. A. Montiel-Nelson, and S. Nooshabadi, “Analysis of high-performance fast feedthrough logic families in CMOS,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 54, no. 6, pp. 489–493, Jun. 2007.
[6] P. Chuang, D. Li, and M. Sachdev, “A constant delay logic style,” IEEE Trans. Very Large Scale Integr. Syst., vol. 21, no. 3, pp. 554–565, Mar. 2013.
[7] M. Maymandi-Nejad and M. Sachdev, “A monotonic digitally controlled delay element,” IEEE J. Solid-State Circuits, vol. 40, no. 11, pp. 2212–2219, Nov. 2005.
[8] S. Y. Ahn and K. Cho, “Small swing domino logic based on twist connected transistors,” Electronics Letters, vol. 50, no. 15, pp. 1054–1056, Jul. 2014.
