Missouri University of Science and Technology

Scholars' Mine
Electrical and Computer Engineering Faculty
Research & Creative Works

Electrical and Computer Engineering

01 Apr 2007

Teaching Asynchronous Digital Design in the Undergraduate
Computer Engineering Curriculum
Scott C. Smith
Missouri University of Science and Technology

Waleed K. Al-Assadi
Missouri University of Science and Technology, waleed@mst.edu

Follow this and additional works at: https://scholarsmine.mst.edu/ele_comeng_facwork
Part of the Electrical and Computer Engineering Commons

Recommended Citation
S. C. Smith and W. K. Al-Assadi, "Teaching Asynchronous Digital Design in the Undergraduate Computer
Engineering Curriculum," Proceedings of the IEEE Region 5 Technical Conference, 2007, Institute of
Electrical and Electronics Engineers (IEEE), Apr 2007.
The definitive version is available at https://doi.org/10.1109/TPSD.2007.4380336

This Article - Conference proceedings is brought to you for free and open access by Scholars' Mine. It has been
accepted for inclusion in Electrical and Computer Engineering Faculty Research & Creative Works by an authorized
administrator of Scholars' Mine. This work is protected by U. S. Copyright Law. Unauthorized use including
reproduction for redistribution requires the permission of the copyright holder. For more information, please
contact scholarsmine@mst.edu.


Teaching Asynchronous Digital Design in the
Undergraduate Computer Engineering Curriculum
Scott C. Smith¹, Senior Member, IEEE, and Waleed K. Al-Assadi², Senior Member, IEEE
University of Missouri - Rolla, Department of Electrical and Computer Engineering
Emerson Electric Co. Hall, 1870 Miner Circle, Rolla, MO 65409

¹Phone: (573) 341-4232, E-mail: smithsco@umr.edu; ²Phone: (573) 341-4836
Abstract-As demand continues for circuits with higher
performance, higher complexity, and decreased feature size,
asynchronous (clockless) paradigms will become more widely
used in the semiconductor industry, as evidenced by the
International Technology Roadmap for Semiconductors' (ITRS)
prediction of a likely shift from synchronous to asynchronous
design styles in order to increase circuit robustness, decrease
power, and alleviate many clock-related issues. ITRS predicts
that asynchronous circuits will account for 19% of chip area
within the next 5 years, and 30% of chip area within the next 10
years. To meet this growing industry need, students in Computer
Engineering should be introduced to asynchronous circuit design
to make them more marketable and more prepared for the
challenges faced by the digital design community for years to
come.

I. INTRODUCTION
The development of synchronous circuits currently
dominates the semiconductor design industry. However, there
are major limiting factors to the synchronous, clocked
approach, including the increasing difficulty of clock
distribution, increasing clock rates, decreasing feature size,
increasing power consumption, timing closure effort, and
difficulty with design reuse. Asynchronous circuits require
less power, generate less noise, produce less electro-magnetic
interference (EMI), and allow for easier reuse of components,
compared to their synchronous counterparts, without
compromising performance.
In most Computer Engineering curriculums students are
only taught the synchronous, clocked paradigm, and never
even touch on asynchronous digital design. Those curriculums
that do mention asynchronous design do so only in passing;
the students are not taught how to design asynchronous
circuits. The widespread introduction of asynchronous digital
design in the classroom is largely constrained by the lack of
introductory educational materials. This paper presents one
approach for integrating asynchronous circuit design into the
undergraduate Computer Engineering curriculum, focusing on
inclusion in two courses, one on Hardware Design Languages
(HDLs), such as VHDL, and the other on VLSI.
The paper is organized into 5 sections. Section II presents
an overview of asynchronous logic; Section III describes the
asynchronous materials developed for use in undergraduate
Computer Engineering courses; Section IV depicts the
original VHDL and VLSI course outlines and shows how
these courses have been augmented to include the
asynchronous materials; and Section V presents the outcome
of the first offerings of the VHDL and VLSI courses with the
asynchronous materials included, and provides conclusions
and directions for future work.

The authors gratefully acknowledge the support from the National Science
Foundation under CCLI grant DUE-0536343.

1-4244-1280-3/07/$25.00 ©2007 IEEE

II. OVERVIEW OF ASYNCHRONOUS PARADIGMS
Asynchronous circuits can be grouped into two main
categories: bounded-delay and delay-insensitive models.

Bounded-delay models, such as micropipelines [1], assume
that delays in both gates and wires are bounded. Delays are
added based on worst-case scenarios to avoid hazard
conditions. This leads to extensive timing analysis of
worst-case behavior to ensure correct circuit operation. On the
other hand, delay-insensitive circuits assume delays in both logic
elements and interconnects to be unbounded, although they
assume that wire forks within basic components, such as a full
adder, are isochronic [2, 3], meaning that the wire delays
within a component are much less than the logic element
delays within the component, which is a valid assumption
even in future nanometer technologies. Wires connecting
components do not have to adhere to the isochronic fork
assumption. This implies the ability to operate in the presence
of indefinite arrival times for the reception of inputs.
Completion detection of the output signals allows for
handshaking to control input wavefronts. Delay-insensitive
design styles therefore require very little, if any, timing
analysis to ensure correct operation (i.e., they are correct by
construction), and also yield average-case performance rather
than the worst-case performance of bounded-delay and
traditional synchronous paradigms.
A. Delay-Insensitive Paradigms
Most delay-insensitive (DI) methods combine C-elements
with Boolean gates for circuit construction. A C-element
behaves as follows: when all inputs assume the same value
then the output assumes this value, otherwise the output does
not change. Seitz's [4], DIMS [5], Anantharaman's [6],
Singh's [7], and David's [8] methods are examples of DI
paradigms that only use C-elements to achieve delay-
insensitivity. On the other hand, both Phased Logic [9] and

2007 IEEE Region 5 Technical Conference, April 20-21, Fayetteville, AR


NULL Convention Logic (NCL) [10] target a library of
multiple gates with hysteresis state-holding functionality.
Phased Logic converts a traditional synchronous gate-level
circuit into a DI circuit by replacing each conventional
synchronous gate with its corresponding Phased Logic gate,
and then augmenting the new network with additional signals.
NCL circuits are realized using 27 fundamental gates
implementing the set of all functions of four or fewer
variables, each with hysteresis state-holding functionality.
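
The C-element behavior described above can be illustrated with a small behavioral model. The following Python sketch is not taken from the authors' course materials; the class name `CElement` and the initial output value of 0 are assumptions made only for illustration.

```python
class CElement:
    """Behavioral model of a C-element (illustrative sketch only)."""

    def __init__(self, initial=0):
        # Assumed initial output; the text does not specify a reset value.
        self.output = initial

    def update(self, *inputs):
        # When all inputs assume the same value, the output assumes it.
        if all(value == inputs[0] for value in inputs):
            self.output = inputs[0]
        # Otherwise the output does not change (state-holding behavior).
        return self.output


c = CElement()
trace = [c.update(a, b) for a, b in [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]]
print(trace)  # [0, 0, 1, 1, 0]
```

The output rises only after both inputs are 1 and falls only after both return to 0, which is the state-holding property the DI methods above depend on.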
Seitz's method, Anantharaman's approach, and DIMS
require the generation of all minterms to implement a
function, where a minterm is defined as the logical AND, or
product, containing all input signals in either complemented or
non-complemented form. While Singh's and David's methods
do not require full minterm generation, they rely solely on
C-elements for speed-independence. NCL also does not
require full minterm generation and furthermore includes
27 fundamental state-holding gates for circuit design, rather
than only C-elements, thus yielding a greater potential for
optimization than other delay-insensitive paradigms [11].
Phased Logic also does not require full minterm generation
and does not rely solely on C-elements for speed-independence;
however, Phased Logic circuitry is derived directly from its
equivalent synchronous design, not created independently;
thus it does not have the same potential for optimization as
does NCL. Furthermore, the Phased Logic paradigm has been
developed mainly for easing the timing constraints of
synchronous designs, not for obtaining speed and power
benefits [9], whereas these are main concerns of other
asynchronous paradigms.
Self-timed circuits can also be designed at the transistor
level as demonstrated by Martin [12]. However, automation of
this method would be vastly different than that of the standard
synchronous approach, since it optimizes designs at the
transistor level instead of targeting a predefined set of gates,
as do the previously mentioned methods. Overall, NULL
Convention Logic offers the best opportunity for integrating
asynchronous digital design into the predominantly
synchronous semiconductor design industry for the following
reasons:
1) The framework for NCL systems consists of DI
combinational logic sandwiched between DI registers, as
shown in Fig. 1, which is very similar to synchronous
systems, such that the automated design of NCL circuits can
follow the same fundamental steps as synchronous circuit
design automation. This will enable the developed DI
design flow to be more easily incorporated into the chip
design industry, since the tools and design process will
already be familiar to designers, such that the learning curve
is relatively flat.
2) NCL systems are delay-insensitive, making the design
process much easier to automate than other non-DI
asynchronous paradigms, since minimal delay analysis is
necessary to ensure correct circuit operation.
3) NCL systems have power, noise, and EMI advantages
compared to synchronous circuits, performance and design
reuse advantages compared to synchronous and non-DI
asynchronous paradigms, area and performance advantages
compared to other DI paradigms, and have a number of
advantages for designing complex systems, like
Systems-on-Chip (SoCs), including substantially reduced
crosstalk between analog and digital circuits, ease of
integrating multi-rate circuits, and facilitation of component
reuse and technology migration.
B. NULL Convention Logic (NCL)
NCL is a delay-insensitive asynchronous paradigm, which
means that NCL circuits will operate correctly regardless of
when circuit inputs become available; therefore NCL circuits
are said to be correct-by-construction (i.e., no timing analysis
is necessary for correct operation). NCL circuits utilize
dual-rail or quad-rail logic to achieve delay-insensitivity. A
dual-rail signal, D, consists of two wires, D^0 and D^1, which may
assume any value from the set {DATA0, DATA1, NULL}.
The DATA0 state (D^0 = 1, D^1 = 0) corresponds to a Boolean
logic 0, the DATA1 state (D^0 = 0, D^1 = 1) corresponds to a
Boolean logic 1, and the NULL state (D^0 = 0, D^1 = 0)
corresponds to the empty set meaning that the value of D is
not yet available. The two rails are mutually exclusive, such
that both rails can never be asserted simultaneously; this state
is defined as an illegal state. A quad-rail signal, Q, consists of
four wires, Q^0, Q^1, Q^2, and Q^3, which may assume any value
from the set {DATA0, DATA1, DATA2, DATA3, NULL}.
The DATA0 state (Q^0 = 1, Q^1 = 0, Q^2 = 0, Q^3 = 0)
corresponds to two Boolean logic signals, X and Y, where
X = 0 and Y = 0. The DATA1 state (Q^0 = 0, Q^1 = 1, Q^2 = 0,
Q^3 = 0) corresponds to X = 0 and Y = 1. The DATA2 state
(Q^0 = 0, Q^1 = 0, Q^2 = 1, Q^3 = 0) corresponds to X = 1 and
Y = 0. The DATA3 state (Q^0 = 0, Q^1 = 0, Q^2 = 0, Q^3 = 1)
corresponds to X = 1 and Y = 1, and the NULL state
(Q^0 = 0, Q^1 = 0, Q^2 = 0, Q^3 = 0) corresponds to the empty
set meaning that the result is not yet available. The four rails
of a quad-rail NCL signal are mutually exclusive, such that no
two rails can ever be asserted simultaneously; these states are
defined as illegal states. Both dual-rail and quad-rail signals
are space optimal 1-hot delay-insensitive codes, requiring two
wires per bit.
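
The dual-rail and quad-rail encodings described above can be made concrete with a short sketch. The Python below is a minimal illustration, not part of the authors' VHDL library; the function names and the use of `None` to stand for NULL are assumptions. Rails are ordered by index, so a dual-rail value is (D^0, D^1) and a quad-rail value is (Q^0, Q^1, Q^2, Q^3).

```python
NULL = None  # assumed stand-in for the NULL (value not yet available) state


def encode_dual_rail(bit):
    """Return (D0, D1): DATA0 = (1, 0), DATA1 = (0, 1), NULL = (0, 0)."""
    if bit is NULL:
        return (0, 0)
    return (1, 0) if bit == 0 else (0, 1)


def decode_dual_rail(rails):
    d0, d1 = rails
    if d0 and d1:
        raise ValueError("illegal state: both rails asserted")
    if not d0 and not d1:
        return NULL
    return 1 if d1 else 0


def encode_quad_rail(x, y):
    """One-hot encode two Boolean signals: rail number 2*X + Y is asserted."""
    rails = [0, 0, 0, 0]
    rails[2 * x + y] = 1
    return tuple(rails)


def decode_quad_rail(rails):
    asserted = [index for index, rail in enumerate(rails) if rail]
    if len(asserted) > 1:
        raise ValueError("illegal state: more than one rail asserted")
    if not asserted:
        return NULL
    return divmod(asserted[0], 2)  # (X, Y)


print(encode_dual_rail(1))             # (0, 1), i.e., DATA1
print(decode_quad_rail((0, 0, 1, 0)))  # (1, 0), i.e., DATA2 means X = 1, Y = 0
```

Because exactly one rail is asserted for any DATA value and none for NULL, a receiver can tell valid data from "not yet available" without any timing reference.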
Fig. 1. NCL system framework: input wavefronts are controlled by local handshaking signals and Completion Detection instead of by a global clock
signal. Feedback requires at least three DI registers in the feedback loop to prevent deadlock.


NCL circuits are comprised of 27 fundamental gates, as
shown in Table I, which constitute the set of all functions
consisting of four or fewer variables. Since each rail of an
NCL signal is considered a separate variable, a four variable
function is not the same as a function of four literals, which
would normally consist of eight variables. The primary type of
threshold gate, shown in Fig. 2, is the THmn gate, where
1 ≤ m ≤ n. THmn gates have n inputs. At least m of the n
inputs must be asserted before the output will become
asserted. In a THmn gate, each of the n inputs is connected to
the rounded portion of the gate; the output emanates from the
pointed end of the gate; and the gate's threshold value, m, is
written inside of the gate.
TABLE I
27 FUNDAMENTAL NCL GATES
NCL Gate    Function
TH12        A + B
TH22        AB
TH13        A + B + C
TH23        AB + AC + BC
TH33        ABC
TH23w2      A + BC
TH33w2      AB + AC
TH14        A + B + C + D
TH24        AB + AC + AD + BC + BD + CD
TH34        ABC + ABD + ACD + BCD
TH44        ABCD
TH24w2      A + BC + BD + CD
TH34w2      AB + AC + AD + BCD
TH44w2      ABC + ABD + ACD
TH34w3      A + BCD
TH44w3      AB + AC + AD
TH24w22     A + B + CD
TH34w22     AB + AC + AD + BC + BD
TH44w22     AB + ACD + BCD
TH54w22     ABC + ABD
TH34w32     A + BC + BD
TH54w32     AB + ACD
TH44w322    AB + AC + AD + BC
TH54w322    AB + AC + BCD
THxor0      AB + CD
THand0      AB + BC + AD
TH24comp    AC + BC + AD + BD

Fig. 2. THmn threshold gate.

Another type of threshold gate is referred to as a weighted
threshold gate, denoted as THmnWw1w2...wR. Weighted
threshold gates have an integer value, m ≥ wR > 1, applied to
inputR. Here 1 ≤ R < n; where n is the number of inputs; m is
the gate's threshold; and w1, w2, ... wR, each > 1, are the
integer weights of input1, input2, ... inputR, respectively. For
example, consider the TH34W2 gate, whose n = 4 inputs are
labeled A, B, C, and D, shown in Fig. 3. The weight of input
A, W(A), is therefore 2. Since the gate's threshold, m, is 3, this
implies that in order for the output to be asserted, either inputs
B, C, and D must all be asserted, or input A must be asserted
along with any other input, B, C, or D. NCL threshold gates
are designed with hysteresis state-holding capability, such that
after the output is asserted, all inputs must be deasserted
before the output will be deasserted. Hysteresis ensures a
complete transition of inputs back to NULL before asserting
the output associated with the next wavefront of input data.
Therefore, a THnn gate is equivalent to an n-input C-element
(i.e., when all inputs are asserted the output is asserted; the
output then remains asserted until all inputs are deasserted, at
which time the output becomes deasserted); and a TH1n gate
is equivalent to an n-input OR gate. NCL threshold gates may
also include a reset input to initialize the output. Circuit
diagrams designate resettable gates by either a d or an n
appearing inside the gate, along with the gate's threshold.
d denotes the gate as being reset to logic 1; n, to logic 0. These
resettable gates are used in the design of DI registers.

Fig. 3. TH34w2 threshold gate: Z = AB + AC + AD + BCD.

NCL systems contain at least two DI registers, one at both
the input and at the output. Two adjacent register stages
interact through their request and acknowledge signals, Ki and
Ko, respectively, to prevent the current DATA wavefront from
overwriting the previous DATA wavefront, by ensuring that
the two DATA wavefronts are always separated by a NULL
wavefront. The acknowledge signals are combined in the
Completion Detection circuitry to produce the request
signal(s) to the previous register stage. NCL registration is
realized through cascaded arrangements of single-bit dual-rail
registers or single-signal quad-rail registers, depicted in
Figs. 4 and 5, respectively. These registers consist of TH22
gates that pass a DATA value at the input only when Ki is
request for data (rfd) (i.e., logic 1) and likewise pass NULL
only when Ki is request for null (rfn) (i.e., logic 0). They also
contain a NOR gate to generate Ko, which is rfn when the
register output is DATA and rfd when the register output is
NULL. The registers shown below are reset to NULL, since
all TH22 gates are reset to logic 0. However, either register
could be instead reset to a DATA value by replacing exactly
one of the TH22n gates with a TH22d gate.

Fig. 4. Single-bit dual-rail register.
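
The threshold-gate and registration behavior described above can be explored with a small behavioral model. The Python sketch below is only an illustration under stated assumptions: the class names `ThresholdGate` and `DualRailRegister`, the argument order, and the step-by-step `update` calls are hypothetical and are not the authors' VHDL library. It models a weighted threshold gate with hysteresis, checks the TH34w2 set condition against the equation given in Fig. 3, and builds a single-bit dual-rail register from two TH22n gates and a NOR gate.

```python
from itertools import product


class ThresholdGate:
    """Weighted threshold gate with hysteresis (illustrative sketch only)."""

    def __init__(self, threshold, weights, reset_value=0):
        self.threshold = threshold
        self.weights = weights
        # reset_value 0 models an "n" gate, 1 models a "d" gate.
        self.output = reset_value

    def update(self, *inputs):
        total = sum(w for w, x in zip(self.weights, inputs) if x)
        if total >= self.threshold:
            self.output = 1   # enough weighted inputs asserted: set
        elif not any(inputs):
            self.output = 0   # all inputs deasserted: reset
        return self.output    # otherwise hold the previous output


def th34w2_equation(a, b, c, d):
    """Set condition from Fig. 3: Z = AB + AC + AD + BCD."""
    return int((a and b) or (a and c) or (a and d) or (b and c and d))


# Starting from a deasserted output, the weighted sum with threshold 3 and
# weight 2 on input A should reproduce the Fig. 3 equation for all 16 inputs.
matches = all(
    ThresholdGate(3, (2, 1, 1, 1)).update(*v) == th34w2_equation(*v)
    for v in product((0, 1), repeat=4)
)
print(matches)


class DualRailRegister:
    """Single-bit dual-rail register: two TH22n gates plus a NOR for Ko."""

    def __init__(self):
        self.rail0 = ThresholdGate(2, (1, 1))
        self.rail1 = ThresholdGate(2, (1, 1))

    def update(self, d0, d1, ki):
        q0 = self.rail0.update(d0, ki)
        q1 = self.rail1.update(d1, ki)
        ko = int(not (q0 or q1))  # 1 = rfd when output is NULL, 0 = rfn when DATA
        return (q0, q1), ko


reg = DualRailRegister()
print(reg.update(0, 1, 1))  # DATA1 passes while Ki is rfd
print(reg.update(0, 0, 1))  # input is NULL but Ki is still rfd: DATA1 is held
print(reg.update(0, 0, 0))  # Ki is rfn: NULL passes and Ko returns to rfd
```

The second register step shows why hysteresis matters: the stored DATA value is not overwritten by NULL until the next stage requests it.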


Fig. 5. Single-signal quad-rail register.

An N-bit register stage, comprised of N single-bit dual-rail
NCL registers, requires N completion signals, one for each bit.
The NCL completion component, shown in Fig. 6, uses these
N Ko lines to detect complete DATA and NULL sets at the
output of every register stage and request the next NULL and
DATA set, respectively. In full-word completion, the
single-bit output of the completion component is connected to
all Ki lines of the previous register stage. Since the maximum
input threshold gate is the TH44 gate, the number of logic
levels in the completion component for an N-bit register is
given by ⌈log4 N⌉. Likewise, the completion component for an
N-bit quad-rail registration stage requires N/2 inputs, and can
be realized in a similar fashion using TH44 gates.

Fig. 6. N-bit completion component.

III. ASYNCHRONOUS COURSE MATERIALS
To effectively introduce asynchronous digital design into
the Computer Engineering curriculum, lecture notes, example
problems, group projects, and libraries of fundamental
asynchronous gates and components were developed. The
educational materials were developed as Modules, such that
portions of the materials could be easily integrated into a
variety of courses, as appropriate, to meet the needs of a
diverse set of courses with different learning objectives.

A. Educational Modules
The following is the list of the specific educational modules
that were developed:
1) Introduction to Asynchronous Logic: This includes a
discussion of both bounded-delay and delay-insensitive
asynchronous paradigms, highlighting the differences
between the two and comparing each to the synchronous,
clocked paradigm.

2) Introduction to NULL Convention Logic (NCL): This
includes a description of dual-rail and quad-rail signaling,
the 27 fundamental NCL gates, NCL registration,
combinational logic, and completion detection components,
and NCL DATA/NULL wavefront flow.
3) Transistor-level NCL Gate Design: This details the
process for designing both static and semi-static NCL gates.
4) Input-Completeness and Observability: This explains the
two criteria that must be followed when designing NCL
circuits to ensure delay-insensitivity.
5) Dual-Rail NCL Design: This details the process for
designing dual-rail NCL combinational circuits.
6) Quad-Rail NCL Design: This details the process for
designing quad-rail NCL combinational circuits.
7) NCL Throughput Optimization: This describes the NCL
throughput calculation, NCL pipelining, and the NULL
Cycle Reduction optimization.
8) Group Projects: This contains a number of comprehensive
group projects consisting of the implementation and testing
of various types of NCL arithmetic circuits, at various levels
of abstraction.
All of these course modules can be downloaded from the
authors' CCLI website:
http://web.umr.edu/~smithsco/CCLI_async.html. Module 1 is
similar to Sections II and II.A in this paper; and Module 2 is
similar to Section II.B in this paper. Modules 1 and 2 are
introductory and therefore do not contain any specific
example problems or exercises; they are also independent of
each other, such that a broad discussion of asynchronous logic
in general is not required before discussing NCL specifics.
Modules 3-7 all contain an explanation of the specific topic
along with a comprehensive example and exercise problems.
Modules 2 and 4 are prerequisites for all subsequent modules,
while Modules 3, 5, 6, and 7 are independent of each other.
The comprehensive group projects in Module 8 require
various other modules as prerequisites, depending on the
specific project requirements and objectives.

B. Asynchronous Libraries
In order to assist students with designing and testing NCL
circuits at various levels of abstraction, static NCL VHDL,
transistor-level, and physical-level libraries were created. The
transistor-level and physical-level libraries of the fundamental
NCL gates were implemented with the Mentor Graphics CAD
tools using the 0.18µm TSMC CMOS process. The VHDL
library consists of a package that defines the fundamental
NCL data types, a file containing the fundamental NCL gates,
with delays based on the simulated physical-level static NCL
gates, a file containing generic versions of standard NCL
registration and completion components, and a package
consisting of various functions to be used in testbenches. The
VHDL, transistor-level, and physical-level NCL libraries can
all be downloaded from the authors' CCLI website:
http://web.umr.edu/~smithsco/CCLI_async.html.

IV. COURSE INTEGRATION
The asynchronous modules and libraries were successfully
incorporated into two senior/graduate-level elective courses at
University of Missouri - Rolla (UMR), Digital System
Modeling with VHDL and Introduction to VLSI, in Spring
Semester 2006 and Fall Semester 2006, respectively. The
original schedule for the VHDL course is shown on the
left-hand side of Fig. 7. This provides the students with
approximately 13 weeks of topic lectures, leaving around
3 weeks for discussion of homework and project assignments
and their solutions, holidays, and the midterm exam. Note that
the final exam is given the week after the 16-week semester
concludes. This schedule has been vetted by the primary
author over the past five years and has been shown to work
well. It does require the students to do a sizable amount of
work; however, after successful completion of the course,
students are well versed in VHDL.
To integrate the asynchronous logic material into the
course, the last quarter of the original schedule was revised, as
shown on the right-hand side of Fig. 7. The floating-point
arithmetic and microprocessor architecture topics were
replaced with the asynchronous topics; HW#5 on generic
constants and generate statements was changed to instead
cover Text I/O; HW#6 on the design of an IEEE single
precision floating-point co-processor was switched to an
assignment on NCL; and Design Project #2 on implementing a
microcontroller in VHDL and Text I/O was replaced with the
design of a complex generic NCL arithmetic circuit. These
changes replace 3 weeks of topics with 2 2/3 weeks of
asynchronous logic topics, providing an extra 1/3 week for
additional explanation of the NCL assignments and solutions.
Furthermore, these changes do not eliminate any key VHDL
course materials, as both floating-point arithmetic and RISC
microcontroller architecture are covered in various other
Computer Engineering courses, and were only discussed in
the VHDL course so that they could be used as sample circuits
to be designed using VHDL. Furthermore, since asynchronous
circuits must be designed as structural models and cannot be
described as behavioral or dataflow models and synthesized
using current industry standard tools, the topic fits seamlessly
into the discussion of generate statements, which are

1) Introduction to Modeling with VHDL
2) Entity and Architecture Statements
3) Test Benches
4) Basic UNIX Commands and Mentor Graphics VHDL Compiler and Simulator
HW#1: design simple behavioral, dataflow, and structural models and testbench
5) Packages, Functions, and Procedures
HW#2: write a package including functions and procedures
6) Mealy and Moore Machines
HW#3: Mealy and Moore machines, including design, VHDL behavioral and
dataflow implementation, state minimization, and state assignment
7) Algorithmic State Machines (ASMs)
8) Mentor Graphics VHDL Synthesis Tool
HW#4: ASM throughput capability (TPC) calculation, TPC optimization, and VHDL
dataflow implementation and synthesis
Design Project #1: design complex chip, such as Run-Length Encoder, Huffman Decoder, etc.
Midterm Exam
9) Generic Constants and Generate Statements

Original schedule (left-hand side):
HW#5: design a generic Multiply and Accumulate unit (MAC)
10) File I/O
11) Floating-Point Arithmetic
HW#6: design an IEEE single precision floating-point co-processor
12) Simple RISC Microcontroller Architecture
Design Project #2: augment RISC architecture discussed in class
to include additional instructions, such as various branches, a NOP,
and a compare, and implement in VHDL, including Text I/O in testbench
13) Overview of Verilog Modeling Language

Revised schedule (right-hand side):
10) File I/O
HW#5: design a generic Multiply and Accumulate unit (MAC)
to read the inputs from a text file and store the outputs to a text file
11) Overview of Asynchronous Logic
12) Overview of NCL
13) Input-Completeness and Observability
14) Dual-Rail NCL Design
15) NCL Pipelining Optimization
HW#6: NCL assignment
Design Project #2: design a generic NCL arithmetic circuit,
such as a MAC, iterative divider, etc.

Final Exam

Fig. 7. VHDL course schedule and changes.

1) Introduction to VLSI Systems
Lab# 1: VHDL coding, synthesis, and simulation
2) CMOS Transistor Theory
3) Fabrication, Layout, and Design Rules
Lab#2: gate-level and transistor-level schematics and simulation
4) Analysis of Static Inverter
Lab#3: layout of static inverter and RC extraction
5) Design and Optimization of Static CMOS Gates
6) Introduction to NCL
7) Transistor-level design of NCL gates
8) Critical Path Delay Analysis and Transistor Sizing
9) Dynamic CMOS Circuit Design
10) Design of Flip-Flops, Latches, and Sequential Circuits
Lab#4: layout of basic static Boolean gates and static and semi-static NCL gates (NCL gates replaced flip-flops)
11) Static Timing Analysis for Sequential Circuits
12) Low Power Design
Lab#5: schematic driven layout
13) Datapath Design for Synchronous Circuits (e.g., comparators, adders, multipliers, registers, etc.)
14) Datapath Design for NCL Circuits (e.g., registration, completion, and DR and QR combinational circuits)
Lab#6: synchronous datapath design and simulation
15) Semiconductor Memories
16) Clock Distribution, PLL, Clock Skew, and Jitter
17) Floorplanning, Placement, and Routing
18) Control Unit Design
19) VLSI Testing and Design for Test
Design Project: design, layout, and simulate various NCL arithmetic circuits (e.g., quad-rail unsigned 24+8×8 MAC, dual-rail 2's complement
8×8 Booth2 multiplier, and dual-rail 2's complement 8×8 Baugh-Wooley multiplier)
20) Future Trends in VLSI Design
Fig. 8. VLSI course schedule and changes.

The schedule for the revised VLSI course is shown in
Fig. 8. This provides the students with approximately 14
weeks of topic lectures, leaving around 2 weeks for discussion
of laboratory assignments and their solutions, holidays, and
occasional quizzes. Note that the final exam is scheduled the
week after the 16-week semester concludes, and is utilized for
each group to present their semester project design. The class
requires a substantial amount of laboratory work; however,
after successful completion of the course, students are well
versed in VLSI design using the Mentor Graphics CAD tools.
The asynchronous logic topics have been incorporated into
the VLSI course by replacing previous miscellaneous lecture
topics, by replacing Lab#4's layout of a flip-flop with the
layout of a static and semi-static NCL gate, and by utilizing
NCL circuits for the semester's comprehensive design project.
The new semester design projects involve designing various
NCL arithmetic circuits, using one of the industry-standard
VLSI CAD tool suites, Mentor Graphics, throughout all steps
of the design flow (i.e., starting from the high level of
abstraction, behavioral modeling, down to the low level of
abstraction, physical layout), and proving the functional
equivalence with simulations throughout all levels of
abstraction.
V. CONCLUSIONS AND FUTURE WORK
A. Evaluation of Developed Materials
Modules 1, 2, 4, 5, and 7 and the VHDL library were
utilized in the VHDL class; and Modules 2-6 and the VHDL,
transistor-level, and physical-level libraries were utilized in
the VLSI course. Both courses also incorporated an
NCL-based final project, Module 8. According to the feedback
provided from UMR's end of semester student evaluation
form for both courses, the students found the asynchronous
logic topics very interesting, and would have liked to have
been able to spend more time on NCL. Many students also
stated that the libraries were easy to use and error-free.
Overall, the students performed quite well on the NCL-related
assignments. For the VHDL class, the average on the
asynchronous logic homework assignment was the second
highest of the six homeworks (i.e., 83% versus 86%, 76%,
73%, 64%, and 44%); and the asynchronous project's average
was approximately the same as the first project (i.e., 85%
versus 87%). However, this included one group of two
students who decided not to complete the project because they
were graduating, already had jobs, and already had enough
points to pass the course, and therefore received a 31% on the
partial submission and an overall grade of D in the class.
Excluding this outlier boosts the asynchronous project's
average to 91%. For the VLSI course, all students successfully
completed the NCL-related laboratory assignment
(i.e., Lab#4); and all three NCL-based semester projects
worked correctly, all resulting in a conference publication
with the students as first author [13-15].
The 8 educational modules were also evaluated externally
by Dr. Jia Di from the University of Arkansas; and he rated
them as excellent. In fact, he is currently working on the
design of an NCL 8051 microcontroller for a NASA Phase II
SBIR, and has required his graduate students working on the


project to download Modules 2-5, study them, and complete
the related exercise problems. Furthermore, he is utilizing the
authors' VHDL library for the NCL 8051 functional design,
although he is using Cadence for the transistor-level and
physical-level design. Dr. Di's main suggestion for
improvement was to implement the transistor-level and
physical-level libraries in Cadence as well, such that the
libraries are available for use with the three most prevalent
digital design tool suites (i.e., Mentor Graphics, Synopsys,
and Cadence), which are used in almost all U.S. universities.
Note that the VHDL library is platform independent and is
therefore already compatible with Synopsys.
B. Future Work

The authors are planning to expand upon this work through
the following:
1) Develop new educational modules focusing on additional
asynchronous circuit topics, such that asynchronous
circuit concepts can be incorporated into a larger variety of
Computer Engineering courses.
2) Develop semi-static VHDL, transistor-level, and
physical-level libraries of fundamental asynchronous
components, such that students can easily compare
asynchronous circuits designed using static vs. semi-static
gates, in terms of speed, area, and energy usage.
3) Complete the development of NCL design and
optimization CAD tools, which work with the Mentor
Graphics design tool suite, such that students can design
and test large NCL circuits and can study the operation of
the asynchronous CAD tools in the context of their
synchronous counterparts.
4) Port the static and semi-static libraries to Cadence, and
the NCL CAD tools to Synopsys, such that the libraries
and CAD tools are available for use with the three most
prevalent digital design tool suites (i.e., Mentor Graphics,
Synopsys, and Cadence), which are used in almost all U.S.
universities.
5) Develop an asynchronous FPGA, such that students can
implement and test their asynchronous circuit designs in
hardware.
6) Broadly disseminate the developed materials to faculty
members at other institutions, and integrate and
evaluate the materials through course offerings at
numerous institutions throughout the nation.
Overall, the developed materials provide an easy way to
integrate cutting-edge technology into standard educational
practices to provide a low-cost, innovative addition to the
Computer Engineering curriculum, in order to prepare
students for the challenges faced by the digital design
community for years to come.

1-4244-1280-3/07/$25.00 ©2007 IEEE

REFERENCES
[1] I. E. Sutherland, "Micropipelines," Communications of the ACM,
Vol. 32/6, pp. 720-738, 1989.
[2] A. J. Martin, "Programming in VLSI: From Communicating Processes to
Delay-Insensitive Circuits," in Developments in Concurrency and
Communication, UT Year of Programming Institute on Concurrent
Programming, Addison-Wesley, pp. 1-64, 1990.
[3] K. Van Berkel, "Beware the Isochronic Fork," Integration, the VLSI
Journal, Vol. 13/2, pp. 103-128, 1992.
[4] C. L. Seitz, "System Timing," in Introduction to VLSI Systems,
Addison-Wesley, pp. 218-262, 1980.
[5] J. Sparso, J. Staunstrup, and M. Dantzer-Sorensen, "Design of Delay
Insensitive Circuits using Multi-Ring Structures," European Design
Automation Conference, pp. 15-20, 1992.
[6] T. S. Anantharaman, "A Delay Insensitive Regular Expression
Recognizer," IEEE VLSI Technical Bulletin, Sept. 1986.
[7] N. P. Singh, "A Design Methodology for Self-Timed Systems," Master's
Thesis, MIT/LCS/TR-258, Laboratory for Computer Science, MIT,
1981.
[8] I. David, R. Ginosar, and M. Yoeli, "An Efficient Implementation of
Boolean Functions as Self-Timed Circuits," IEEE Transactions on
Computers, Vol. 41/1, pp. 2-10, 1992.
[9] D. H. Linder and J. H. Harden, "Phased Logic: Supporting the
Synchronous Design Paradigm with Delay-Insensitive Circuitry," IEEE
Transactions on Computers, Vol. 45/9, pp. 1031-1044, 1996.
[10] K. M. Fant and S. A. Brandt, "NULL Convention Logic: A Complete
and Consistent Logic for Asynchronous Digital Circuit Synthesis,"
International Conference on Application Specific Systems,
Architectures, and Processors, pp. 261-273, 1996.
[11] S. C. Smith, R. F. DeMara, J. S. Yuan, D. Ferguson, and D. Lamb,
"Optimization of NULL Convention Self-Timed Circuits," Integration,
the VLSI Journal, Vol. 37/3, pp. 135-165, August 2004.
[12] A. J. Martin, "Compiling Communicating Processes into
Delay-Insensitive VLSI Circuits," Distributed Computing, Vol. 1/4,
pp. 226-234, 1986.
[13] M. V. Joshi, S. Gosavi, V. Jagadeesan, A. Basu, S. Jaiswal,
W. K. Al-Assadi, and S. C. Smith, "NCL Implementation of Dual-Rail 2s
Complement 8x8 Booth2 Multiplier using Static and Semi-Static
Primitives," IEEE Region 5 Technical Conference, April 2007.
[14] S. R. Mallepalli, S. Kakarla, S. Burugapalli, S. Beerla, S. Kotla,
P. K. Sunkara, W. K. Al-Assadi, and S. C. Smith, "Implementation of
Static and Semi-Static Versions of a Quad-Rail NCL 24+8x8 Multiply
and Accumulate Unit," IEEE Region 5 Technical Conference, April 2007.
[15] R. S. P. Nair, F. Kacani, R. Bonam, S. M. Gandla, S. K. Chitneni,
V. Kadiyala, W. K. Al-Assadi, and S. C. Smith, "Implementation of
Static and Semi-Static Versions of a Bit-Wise Pipelined Dual-Rail NCL
2s Complement Multiplier," IEEE Region 5 Technical Conference, April
2007.


