DUAL-RAIL GATE STRUCTURE FOR A COMPLEX DATA PATH by Nagasandeep, Viswanadham & Mahesh, Y
Viswanadham Nagasandeep* et al. 
  (IJITR) INTERNATIONAL JOURNAL OF INNOVATIVE TECHNOLOGY AND RESEARCH 
  Volume No.4, Issue No.6, October – November 2016, 4732-4734.  
2320 –5547 @ 2013-2016 http://www.ijitr.com All rights Reserved.  Page | 4732 
Dual-Rail Gate Structure For A Complex 
Data Path 
VISWANADHAM  NAGASANDEEP 
M.Tech Student, Dept of ECE 
SKR College of Engineering & Technology 
Nellore, Andhra Pradesh, India 
Y MAHESH
 
Assistant Professor, Dept of ECE 
SKR College of Engineering & Technology 
Nellore, Andhra Pradesh, India
Abstract: Dual-rail domino gates are limited to constructing a stable critical data path. Based on this 
critical data path, the handshake circuits are greatly simplified, which gives the pipeline high 
throughput as well as low power consumption. This paper presents a high-throughput and 
ultralow-power asynchronous domino logic pipeline design method targeting latch-free and very 
fine-grain (gate-level) design. The data paths are composed of a mixture of dual-rail and 
single-rail domino gates. The four-phase bundled-data protocol design most closely resembles the design of 
synchronous circuits. Furthermore, the stable critical data path enables the adoption of single-rail 
domino gates in the noncritical data paths, which saves considerable power by reducing the overhead of 
the logic circuits. An 8 × 8 array-style multiplier is used to evaluate the proposed pipeline method. 
Compared with a bundled-data asynchronous domino logic pipeline, the proposed pipeline saves 
energy in both the best case and the worst case when processing different data patterns. 
Keywords: Asynchronous Pipeline; Critical Data Path; Dual-Rail Domino Gate; Single-Rail Domino Gate 
I. INTRODUCTION 
With continued CMOS technology 
scaling, VLSI systems are becoming increasingly 
complex. Physical design issues, such as 
global clock tree synthesis and top-level timing 
optimization, are becoming serious problems. 
Asynchronous design is regarded as a 
promising solution to these problems, 
which relate to the global clock, since 
it uses local handshaking instead of an externally 
supplied global clock. In asynchronous design, the 
choice of handshake protocol affects the circuit 
implementation [1]. The four-phase bundled-data 
protocol and the four-phase dual-rail protocol 
are two popular protocols that are used 
in most practical asynchronous circuits. The 
four-phase bundled-data protocol design most closely 
resembles the design of synchronous circuits. 
Handshake circuits generate local clock pulses and 
use delay matching to indicate valid signals. This 
normally leads to the most efficient circuits because 
of the extensive use of timing assumptions. 
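
As a rough illustration of delay matching in the bundled-data protocol, the sketch below checks that the bundling (request) signal, after passing through a matched delay line, arrives no earlier than the slowest data bit. The function name and the delay values are hypothetical and are not taken from the paper.

```python
def matched_delay_ok(bit_delays_ps, matched_delay_ps):
    """Return True if the bundling signal arrives after every data bit.

    bit_delays_ps: worst-case logic delay of each data bit (hypothetical values).
    matched_delay_ps: delay of the matched delay line on the request wire.
    """
    return matched_delay_ps >= max(bit_delays_ps)


# Hypothetical per-bit delays of one function block, in picoseconds.
bit_delays_ps = [120, 95, 140, 110]

print(matched_delay_ok(bit_delays_ps, 150))  # delay line covers the slowest bit
print(matched_delay_ok(bit_delays_ps, 130))  # delay line is shorter than the slowest bit
```

The single comparison is the timing assumption: correctness depends on the designer sizing the delay line for the worst case.
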
In contrast, the four-phase dual-rail protocol design is 
implemented in a more elaborate way, in which 
the handshake signal is combined with the dual-rail 
encoding of data. Handshake circuits 
recognize the arrival of valid data by detecting 
the encoded handshake signal, which allows 
correct operation in the presence of arbitrary data 
path delays. This feature is very useful for 
dealing with data path delay variations in advanced 
VLSI systems, such as asynchronous 
field-programmable gate arrays and systems-on-chip. 
However, the dual-rail encoding and detection 
overheads cause low circuit efficiency and 
restrict the application area of the four-phase dual-rail 
protocol design. This paper presents a novel 
design method for asynchronous domino logic 
pipelines that focuses on improving 
circuit efficiency and making asynchronous 
domino logic pipeline design suitable for a wide range 
of applications. The design method combines 
the benefits of the four-phase dual-rail protocol and 
the four-phase bundled-data protocol, 
achieving an area-efficient and ultralow-power 
asynchronous domino logic pipeline. The latch-free 
feature gives the benefits of reduced critical 
delays, smaller silicon area, and lower 
power consumption. However, asynchronous 
domino logic pipelines share a common problem: 
dual-rail domino logic has to be used 
to compose the domino data path. Single-rail 
domino logic cannot be used, and dual-rail domino 
logic consumes a large amount of silicon 
area and power. This overhead 
almost cancels the area and power benefits provided 
by the latch-free feature. The detection overhead 
also grows with the width of the data 
paths, which impedes its use in the 
design of a large function block with a 
considerable data path width. By contrast, an 
asynchronous domino logic pipeline based on 
the four-phase bundled-data protocol avoids the 
detection overhead by using a single extra 
bundling signal, matched to the worst-case 
block delay, which serves as a completion signal 
[2]. In this paper, the proposed pipeline 
reduces both the dual-rail encoding overhead in the data 
paths and the detection overhead in the 
handshake control logic by designing around a 
constructed critical data path. A stable critical data path 
is built using redesigned dual-rail domino gates. By 
detecting the stable critical data path, a single-bit 
completion detector is sufficient to obtain the 
correct handshake signal regardless of the data path 
width. Such a design not only reduces the 
detection overhead but also partly 
retains the good properties of the four-phase 
dual-rail protocol design. 
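
To make the detection overhead concrete, the following sketch models four-phase dual-rail encoding and compares full-width completion detection with detection of a single critical bit. It is a behavioural illustration only: the return-to-zero spacer `(0, 0)`, the active-high `bit_done` polarity, the gate-count formula (one 1-bit detector per bit plus a tree of 2-input C-elements), and all names are assumptions made for this example, not the authors' circuit.

```python
from typing import Dict, List, Tuple

DualRail = Tuple[int, int]   # (true_rail, false_rail)
SPACER: DualRail = (0, 0)    # assumed return-to-zero "no data" code


def encode(bit: int) -> DualRail:
    """Encode one bit on two wires: 1 -> (1, 0), 0 -> (0, 1)."""
    return (1, 0) if bit else (0, 1)


def bit_done(d: DualRail) -> int:
    """1 when a dual-rail bit carries valid data (exactly one rail is high).

    A real NOR-based detector produces the inverted polarity; the
    active-high form is used here only for readability.
    """
    return d[0] | d[1]


def full_done(word: List[DualRail]) -> int:
    """Full-width detection: done only when every bit is valid."""
    return int(all(bit_done(d) for d in word))


def critical_done(word: List[DualRail], critical_index: int) -> int:
    """Critical-path detection: done as soon as the chosen bit is valid."""
    return bit_done(word[critical_index])


def detector_gates(width: int, critical_only: bool) -> Dict[str, int]:
    """Assumed gate count for a completion detector of the given width."""
    if critical_only:
        return {"bit_detectors": 1, "c_elements": 0}
    return {"bit_detectors": width, "c_elements": width - 1}


word = [encode(1), SPACER, encode(0)]   # bit 1 has not evaluated yet
print(full_done(word), critical_done(word, 0))
print(detector_gates(8, False), detector_gates(8, True))
```

In the example word, the single-bit detector reports completion while one bit is still a spacer, which is why the critical bit must be guaranteed to be the slowest one.
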
II. SYSTEM STUDY 
PS0 is a well-known implementation style of 
asynchronous domino logic pipeline based on the 
dual-rail protocol. It is an important foundation for 
many later proposed styles. Since the proposed 
pipeline is also based on PS0, we 
start by reviewing the PS0 pipeline style and then 
briefly introduce two other advanced styles: 1) a 
timing-robust style called the precharge half-buffer 
(PCHB) and 2) a high-throughput style called the 
lookahead pipeline (LP2/2). Finally, we summarize the delay 
assumptions of these pipelines and give the 
delay assumption of the proposed design. PS0 
is designed based on the four-phase dual-rail 
protocol. Four-phase dual-rail encoding encodes a 
request signal into the data signal using two wires. 
This protocol is very robust, since a sender 
and a receiver can communicate reliably regardless 
of the delays in the combinational logic block 
and the wires between them [3]. The dual-rail encoded data 
path is called the delay-insensitive data path. In 
PS0, each pipeline stage consists of a function 
block and a completion detector. Each 
function block is implemented using dual-rail 
domino logic. Each completion detector generates a 
local handshake signal to control the flow of 
data through the pipeline. A 2-input NOR gate 
serves as a 1-bit completion detector that 
generates a bit done signal by monitoring the outputs 
of a dual-rail domino gate. To build a 2-bit 
completion detector, a C-element is needed to 
combine the bit done signals. There are three 
evaluations, two completion detections, and 
one precharge in the complete cycle of 
a pipeline stage. The detection overhead is 
caused by the full completion detectors, which 
deal with data path delay variations by 
detecting the entire data paths. The dual-rail 
encoding overhead is caused by the dual-rail domino 
logic, which is used not only to implement the 
logic function but also to store data between 
pipeline stages. The additional dual-rail domino 
buffers consume considerable silicon area and power. In a 
4-bit ripple-carry adder, 18 dual-rail domino buffer 
gates are added, which almost offsets 
the benefit of removing explicit storage 
elements. PCHB is a timing-robust pipeline 
style that uses quasi-delay-insensitive control 
circuits. There are two completion detectors in a PCHB 
stage: one on the input side (Di) and one 
on the output side (Do). Although this 
design makes PCHB more timing robust, it results in 
twice the overhead in handshake control logic 
compared with PS0. Besides, PCHB has the same 
dual-rail encoding overhead as PS0. LP2/2 
improves the throughput of PS0 by optimizing the 
sequence of handshake events. The handshake 
is sped up by using asymmetric completion 
detectors placed ahead of the function blocks. 
However, these styles do not solve the overhead 
problems in the handshake control logic and 
function block logic. PCHB is a very robust 
pipeline that requires no delay assumptions or 
calculations by the designer. The proposed pipeline is 
based on PS0, but makes a different delay 
assumption from LP2/2. Relative to PS0, LP2/2 
makes two more aggressive delay assumptions. 
First, each pipeline stage evaluates no slower than 
its completion detection and the precharge of its 
successor. Second, each pipeline stage's 
completion detection plus its predecessor's precharge 
is no slower than the stage's evaluation plus its 




The proposed pipeline (APCDP) is designed based on a stable 
critical data path that is constructed using a special 
dual-rail logic. The critical data path transfers a 
data signal and an encoded handshake 
signal. Noncritical data paths, composed of 
single-rail logic, transfer only the data signal. A static 
NOR gate detects the dual-rail critical data path 
and generates a total done signal for each 
pipeline stage. The outputs of the NOR gates are 
connected to the precharge ports of the previous stages 
[4]. APCDP has the same protocol as PS0. The 
main difference is that the total done 
signal is generated by detecting only the critical 
data path instead of the entire data paths. 
This design method has two merits. First, the 
completion detector is simplified to a single NOR gate, 
and the detection overhead does not grow 
with the data path width. Second, the overhead of the 
function block logic is reduced by using 
single-rail logic in the noncritical data paths. 
Consequently, APCDP has a small overhead 
in both the handshake control logic and the function 
block logic, which greatly improves the throughput 
and power consumption. This paper introduces an 
efficient solution that uses synchronizing logic 
gates (SLGs) to construct the 
critical data path. The SLGs solve the gate-delay 
data-dependence problem by ensuring that an SLG cannot 
start evaluation until all valid data have arrived. 
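
The gate-delay data-dependence problem can be illustrated with a small timing model. The sketch below compares an ordinary dual-rail domino OR gate, whose true rail can fire as soon as any input arrives as 1, with an idealized SLG that waits for all inputs to be valid. The OR function, the arrival times, and the function names are assumptions chosen for the example; the model does not describe the transistor-level SLG circuit.

```python
def domino_or_eval_time(inputs, gate_delay):
    """Output time of an ordinary dual-rail domino OR gate (behavioural model).

    inputs: list of (value, arrival_time) pairs.
    The true rail fires on the earliest input that is 1; the false rail
    needs every input to have arrived as 0.
    """
    ones = [t for v, t in inputs if v == 1]
    if ones:
        return min(ones) + gate_delay
    return max(t for _, t in inputs) + gate_delay


def slg_or_eval_time(inputs, gate_delay):
    """Output time of an idealized SLG: evaluation starts only after all
    inputs are valid, so the delay does not depend on the data values."""
    return max(t for _, t in inputs) + gate_delay


early_one = [(1, 10), (0, 50)]   # a fast input already decides the OR result
all_zero = [(0, 10), (0, 50)]

for pattern in (early_one, all_zero):
    print(domino_or_eval_time(pattern, 5), slg_or_eval_time(pattern, 5))
```

Under these assumptions the ordinary gate finishes at different times for the two patterns, while the SLG finishes at the same time for both, which is what makes it usable as a stable critical path.
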
Consequently, the proposed design is highly 
area and power efficient. In the structure of APCDP, 
the solid arrow represents the constructed critical data path, 
the dotted arrow represents the noncritical data 
paths, and the dashed arrow represents the 
output of the single-rail to dual-rail encoding 
converters. In every pipeline stage, a static NOR gate 
is used as a 1-bit completion detector to generate a 
total done signal for the entire data paths by 
detecting the constructed critical data path. Driving 
buffers deliver each total done signal to the 
precharge/evaluation control port of the previous 
stage. The critical signal transition varies from one 
data path to another depending on the input data 
patterns. Since SLGs solve the gate-delay 
data-dependence problem, a stable critical data 
path can be easily constructed. In practice, making all gates 
evaluate simultaneously is difficult, especially 
without the help of intermediate latches or 
registers. Therefore, we make the SLG 
the last gate to start evaluation by linking 
each pipeline stage's SLG together. The logic 
overhead in the noncritical data paths 
can be reduced by using single-rail domino gates 
instead of dual-rail domino gates. However, 
single-rail and dual-rail domino gates use 
different encoding schemes, so there is an encoding 
compatibility problem when a single-rail 
domino gate connects to a dual-rail domino 
gate. An encoding converter must be designed to solve 
this problem. APCDP suffers a pipeline failure 
if a pipeline stage does not finish 
evaluating before its previous stage starts to precharge. 
In that case, the pipeline stage cannot properly 
finish evaluating because the precharge of its 
previous pipeline stage removes the valid data at 
its inputs. To avoid this pipeline failure, APCDP 
must satisfy the assumption that, in a pipeline stage, no 
other bit in the entire data paths is slower 
than the detected bit by more than the delay 
through a static NOR gate and the drive buffer 
chain following it. The pipeline structure of 
APCDP is very robust because the hold time 
provides sufficient time margins. In practice, the 
robustness of the constructed critical path is affected 
by delay variations [5]. As a matter of fact, this is a 
common problem in VLSI circuit design, like 
the robustness of a clock signal in synchronous 
design and of a matched delay line in bundled-data 
asynchronous design. 
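
The timing assumption stated above can be written as a simple check: for every bit in a stage, its evaluation time minus that of the detected (critical) bit must not exceed the delay of the static NOR gate plus the drive buffer chain. The sketch below is a minimal illustration; the function name, the time unit, and the numbers are hypothetical.

```python
def apcdp_stage_ok(bit_eval_times, detected_index, t_nor, t_buffers):
    """Check the APCDP delay assumption for one pipeline stage.

    bit_eval_times: evaluation completion time of every bit in the stage.
    detected_index: index of the bit on the constructed critical data path.
    t_nor, t_buffers: delays of the static NOR gate and the buffer chain.
    """
    t_detected = bit_eval_times[detected_index]
    margin = t_nor + t_buffers
    return all(t - t_detected <= margin for t in bit_eval_times)


# Hypothetical delays in picoseconds; bit 0 is the detected bit.
print(apcdp_stage_ok([100, 90, 108], 0, t_nor=5, t_buffers=10))  # 8 ps late, within 15 ps
print(apcdp_stage_ok([100, 90, 120], 0, t_nor=5, t_buffers=10))  # 20 ps late, exceeds 15 ps
```

A stage that fails this check risks having its inputs removed by the previous stage's precharge before its slowest bit has evaluated.
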
IV. CONCLUSION 
The design method greatly reduces the 
overhead of the handshake control logic as well as the 
function block logic, which not only increases the 
pipeline throughput but also decreases the 
power consumption. This paper introduced a 
novel design method for asynchronous 
domino logic pipelines. It is even comparable with 
a synchronous pipeline with sequential clock 
gating. The pipeline is realized based on a 
constructed critical data path. The evaluation results 
show that the proposed design has better 
performance than a bundled-data 
asynchronous domino logic pipeline (LP2/2-SR). 
V. REFERENCES 
[1]  H. S. Low, D. Shang, F. Xia, and A. 
Yakovlev, “Variation tolerant AFPGA 
architecture,” in Proc. ASYNC, 2011, pp. 
77–86.  
[2]  M. Singh, J. A. Tierno, A. Rylyakov, S. 
Rylov, and S. M. Nowick, “An adaptively 
pipelined mixed synchronous-asynchronous 
digital FIR filter chip operating at 1.3 
gigahertz,” IEEE Trans. Very Large Scale 
Integr. (VLSI) Syst., vol. 18, no. 7, pp. 
1043–1056, Jul. 2010.  
[3]  A. M. Lines, “Pipelined asynchronous 
circuits,” Dept. Comput. Sci., California 
Inst. Technol., Pasadena, CA, USA, Tech. 
Rep., 1998.  
[4]  Z. Xia, S. Ishihara, M. Hariyama, and M. 
Kameyama, “Dual-rail/single-rail hybrid 
logic design for high-performance 
asynchronous circuit,” in Proc. IEEE 
ISCAS, May 2012, pp. 3017–3020.  
[5]  Z. Xia, S. Ishihara, M. Hariyama, and M. 
Kameyama, “Synchronising logic gates for 
wave-pipelining design,” IEE Electron. 
Lett., vol. 46, no. 16, pp. 1116–1117, Aug. 
2010.  
AUTHOR'S PROFILE 
Viswanadham Nagasandeep 
completed his B.Tech at K.S.N 
Institute of Technology 
(Kovur) in 2014. He is now 
pursuing an M.Tech in Electronics 
& Communication 
Engineering at SKR College 
of Engineering & Technology, 
Manubolu. 
Y Mahesh received his 
M.Tech degree and is currently 
working as an Assistant 
Professor at SKR College of 
Engineering & Technology, 
Manubolu. 
 
