Characterisation & optimisation of computational functional blocks for ATM switches GaAs MESFET technology by Chu, Eric
Characterisation & Optimisation of
Computational Functional Blocks for
ATM Switches in
GaAs MESFET Technology
Eric Chu B.Sc., B.E.(Hons)
A thesis submitted to the Faculty of Engineering for the
Degree of Master of Engineering Science
Centre for Gallium Arsenide VLSI Technology
Department of Electrical and Electronic Engineering





Chapter 1. Introduction
1.1 B-ISDN and ATM
1.2 Benes Network and Bit-Rate Conversion
1.3 Gallium Arsenide as a Semi-Conductor
1.4 Scope of This Thesis
Chapter 2. Choice of Technology & Optimisation
2.1 Logic Families in GaAs
2.1.1 Direct Coupled FET Logic (DCFL)
2.1.2 Source-follower Direct Coupled FET Logic (SDCFL)
2.1.3 Super Buffer FET Logic (SBFL)
2.1.4 Ultra Buffer FET Logic (UBFL)
2.2.1 Optimising DCFL
2.2.2 Optimising SDCFL
2.2.3 Optimising SBFL
2.2.4 Optimising UBFL
2.3 Performance Comparison
2.4 Logic Design Methodology
Chapter 3. System Specifications
3.1 The ATM Switch Block
3.2 Cell Format
3.3.1 Performance Requirement
3.4 Realising the Switch Fabric
3.4.1 The Buffer Chip
3.4.2 The Router Chip
3.4.4 The 2 x 2 Switch
3.4.5 Input Multiplexers
3.4.6 Output Demultiplexers
Chapter 4. Layout Style and Design Tools
4.1 Layout Style
Chapter 5. Primitives
5.1 Logic Gates
5.1.1 Inverters
5.1.2 2-Input Nor Gates
5.1.3 3-Input Nor Gates
5.5 D Flip-Flops
5.6 Multiplexers
Chapter 6. Implementing the Control Logic
6.1 The Input Control
6.1.2 The Block Requester
6.1.3 The Pulse Generation Module
6.1.4 The Memory Control Module
6.1.5 The 2-Bit Counter
6.2 The Buffer Manager
6.2.1 The Queue Size Register
6.2.2 The Buffer Full and Empty Decoders
6.2.3 The Input Block Requester
6.2.5 The Count Pulse Generator
6.2.6 The 5-Bit Up Counters
6.3 The Output Control
6.3.1 The Pseudo Random Counter
6.3.2 The Address Latch
6.3.3 The 2-Bit Counter
6.3.4 The Read Enable Generator
6.3.5 The Output Block Request Generator
6.3.6 The Output Convert Decoder
Chapter 7. Conclusion and Future Work
7.1 Conclusion
7.2 Future Work
Appendix A. VHDL Structural Description of the Input Control
Appendix B. VHDL Structural Description of the Buffer Manager
Appendix C. VHDL Structural Description of the Output Control
Appendix D. Published Paper








Broadband Integrated Services Digital Network (B-ISDN) is thought to be the next
step in the evolution of communication networks. Amongst all possible realisations,
Asynchronous Transfer Mode (ATM) has shown the most promise. Currently, many
telephone companies in the world are already providing ISDN services to their customers.
CCITT has proposed two data rates for ISDN: 155Mb/s and 622Mb/s. In a national
Danish project, an experimental ATM switch fabric of size 1024 x 1024 operating at up to
622Mb/s will be constructed using only three different types of chips in a commercial
Gallium Arsenide MESFET process. It uses a Benes type network architecture together
with a concept of bit-rate conversion to improve performance. This switch fabric is
designed to utilise only 2 x 2 switching elements (which have a maximum input data rate of
4.8Gb/s). The main objective in this thesis is to design the control logic of a buffer chip and
highlight some important issues on GaAs VLSI design in contrast to conventional Silicon.
The first step in logic design is to select and optimise the appropriate logic families.
A "mixed" logic approach based on the DCFL family is presented which is designed for
high speed, low power and high temperature operations.
The control logic of a buffer chip is described in this thesis and the entire chip is
currently being fabricated in the Thomson Composants Microondes (TCS) 0.8µm Self-Aligned
Gate (SAGA) E/D GaAs process. An isochronous clocking strategy is used due to
the large geometry of the chip. The input section runs at a maximum frequency of 600MHz
while other sections run at half that frequency. The control block is realised as three
modules: an input control, a buffer manager, and an output control module. The entire chip
contains over 70,000 transistors and has a total area of 26.1mm^2. Simulations at maximum














This work contains no material which has been accepted for the award of any other
degree or diploma in any university or other tertiary institution and, to the best of my
knowledge and belief, contains no material previously published or written by another
person, except where due reference has been made in the text.
I give consent to this copy of my thesis made, when deposited in the University




The author is indebted to his supervisor Assoc. Prof. Kamran Eshraghian for all
his invaluable and untiring guidance and support given during the entire research period as
well as the exchange opportunity to work in Denmark. Furthermore, the helpful discussions
given by Dr. Ken Sarkies and Mr. Michael Liebelt in the area of telecommunications and
digital logic design are very much appreciated. The author would also like to thank his
colleagues Mr. Andrew Beaumont-Smith and Mr. Warren Marwood for sharing their
expertise in design tools as well as all the useful advice and discussions during the long hours
in the laboratory when the deadline for chip submission was near, Mr. Jonathan Main for
his help in understanding many fundamental concepts in B-ISDN and ATM switching,
Mrs. Song Cui and Mr. Alireza Moini for their collaboration during the first three months
of the research period in completing a survey as well as the optimisation and comparison of
various logic families in GaAs. Mr. Tarmo Rohtla is to be acknowledged for his
assistance and support in computing facilities. Also, the assistance provided by many other
staff and postgraduates in this university is very much appreciated.
Regarding his stay in Denmark, the author would like to express his most sincere
thanks to Danish researcher Mr. Jens Jakobsen and the entire VLSI design team in Jydsk
Telefon, Århus for their support as well as providing a very nice and friendly working
environment despite the author's poor Danish. The author would also like to acknowledge
Mr. Preben Hvas and Mr. Poul Gødsvang for their advice, guidance and much useful
information regarding the specifications of this project and Mr. Finn Barrett for design
tools and computing facilities support.
Regarding the preparation of this thesis, the author is most thankful to Assoc. Prof.
Kamran Eshraghian, Dr. Ken Sarkies, Mr. Michael Liebelt, Dr. Neil Burgess, Mr.
Jens Jakobsen, Mr. Andrew Blanksby, Mr. Andrew Beaumont-Smith, Mrs. Song Cui,
Mr. Shannon Morton and Mr. Michael Mcgeever for proof-reading this thesis and for much
constructive feedback and many suggestions. Otherwise, the submission date would definitely
have been postponed.
Last but not least, the author would like to acknowledge his ex-girlfriend Miss
Catherine Law for her ever-lasting spiritual support. It was this support that drove him to



























B-ISDN   Broadband-Integrated Services Digital Network
CAD      Computer Aided Design
CCITT    International Consultative Committee on Telephone and Telegraph
DCFL     Direct Coupled FET Logic
DFET     Depletion-type Field Effect Transistor
EFET     Enhancement-type Field Effect Transistor





ISDN     Integrated Services Digital Network
kb/s     Kilo-bit per second
Mb/s     Mega-bit per second
MESFET   Metal Semi-conductor Field Effect Transistor
MOSFET   Metal Oxide Semi-conductor Field Effect Transistor
N-ISDN   Narrowband-Integrated Services Digital Network
SBFL     Super Buffer FET Logic
SDCFL    Source-follower Direct Coupled FET Logic
UBFL     Ultra Buffer FET Logic
VHDL     Very high speed integrated circuit Hardware Description Language
VLSI     Very Large Scale Integration
Chapter 1. Introduction
VLSI technology and communications are the two most rapidly growing areas in
Electrical Engineering. The combination of both enables an entirely new spectrum of
services to become available to everyday customers. Today, different services
use different and independent networks (eg. the telephone network for voice services, the
Internet for data communications and Cable TV for TV/Video services) which are expensive
to maintain, incompatible and lack the capability for sharing traffic between them. This leads to
the demand for a single integrated network which is flexible and efficient enough to replace
all current networks. This is what Broadband ISDN aims to realise.
For a network to support data rates in the order of multiple gigabits per second, the
technology will have a significant impact on the hardware realisation. Currently, the
semi-conductor industry is dominated by Silicon technology. However, as the speed of
operation keeps increasing, the limit of Silicon technology will soon be reached. Gallium
Arsenide (GaAs), on the other hand, has a superior power-delay product compared to Silicon
technology at high operating speed. A brief comparison between Silicon and GaAs as a
semi-conductor is also given in this chapter.
1.1 B-ISDN and ATM
ISDN is the mainstream in the next generation of communication networks. It is
designed to integrate all current networks into a single one. Currently, the detailed
specification of ISDN is still under investigation [I.121]. However, two draft specifications
expected [DEPRYCKER] are listed below:
• The cell loss probability for high priority cells should be comparable to that of optical fibre,
that is, in the order of 10^-12 and,
• Cell re-ordering is not allowed, in order to reduce cost.
ISDN comes in two flavours: Narrowband ISDN (N-ISDN) and Broadband ISDN
(B-ISDN) [STALLINGS], [DEPRYCKER]. N-ISDN can be thought of as the transition
between the current telephone network and B-ISDN. In N-ISDN, voice and data pass over
separate networks but appear to the customer as a single network with a common access
interface. It uses the same circuit switching technique as the conventional telephone network to
transmit voice and data at a single 64kb/s rate. For this reason, N-ISDN is also known as the
64kb/s ISDN. N-ISDN can also be thought of as several separate low speed data networks
switched by software and hence cannot be expanded to video or other high data rate
services apart from very primitive ones. B-ISDN, on the other hand, uses an entirely
different switching technique called Asynchronous Transfer Mode (ATM) to increase its
efficiency and flexibility so that other high bandwidth services can be supported. Currently,
two standard transmission rates are specified [I.121]: the 155Mb/s and the 622Mb/s B-ISDN.
Higher transmission rates can be expected in the near future. A comparison
between conventional circuit switched and ATM networks is shown in Table 1.1
[DEPRYCKER], [PARTRIDGE].
The main advantages of B-ISDN are discussed below:
• Flexible and Future Safe
An integrated network which supports different types of services will need to adapt
itself to different traffic behaviour (eg. bursty traffic for data services, smooth traffic
for voice services and intermediate traffic for video services).
• Efficient in Resource Use
Since all available resources can be shared between all different services, an optimal
statistical resource sharing can be achieved.
• Less Expensive
Since only one network is required to be designed, manufactured and maintained, the
overall designing, manufacturing, operating and maintenance costs will be lower.
Table 1.1: Comparison between circuit switching and ATM networks

Circuit switching: Time Division Multiplexing (TDM); time is divided into frames
ATM: Time Division Multiplexing (TDM); time is divided into cell slots

Transmission Rate
Circuit switching: Low speed (in the order of kb/s)
ATM: High to very high speed (155Mb/s or 622Mb/s)

ATM: All current services as well as future services which are not yet fully defined

ATM: May transmit in any available cell slot

Delay Contributions
Circuit switching: Propagation delay on transmission links; processing delay in the switches
ATM: Packetization and depacketization delay; switching delay

Advantages
Circuit switching: Good for fixed bandwidth constant bit-rate calls (eg. voice services)
ATM: Flexible and future safe; efficient use of bandwidth for both constant and variable
bit rate services; less expensive

Disadvantages
Circuit switching: Inflexible for calls of different bandwidth to basic rate; inefficient for
variable bit rate calls
ATM: Larger overheads
ATM can be thought of as a very general protocol which is capable of providing a
wide range of different services. Some characteristics of ATM networks are [DEPRYCKER]:
• Fully Integrated for Low Cost
Since only one network is required for all kinds of service, the operation and
maintenance costs can be reduced.
• Supports All Known Services
Due to the flexibility of ATM, all known services as well as the ones which have not
yet been fully defined can be supported.
• Connection Oriented
This refers to the fact that no cells can be sent from the user-terminal to the
network until a logical/virtual connection is established. This allows the network to
check whether there are statistically enough resources available to support the
connection and to reserve them. If there are insufficient resources available, the
connection is refused.
• High Speed
The high speed operation in ATM networks is achieved by hardware switching as
well as using small and fixed length packets which have the properties of:
• Reduced header functionality
The ATM header has a very limited function to enable fast processing in the
network. Its main function lies in the identification of the virtual connection
by an identifier which is selected at call set-up and guarantees a proper routing
of each cell in the network. In addition, it allows an easy multiplexing of
different virtual connections over a single link.
Due to the limited functionality in the header, the implementation of the header
processing in the ATM nodes is simple and can be done at very high speeds
(eg. 155Mb/s and 622Mb/s). This results in very low processing and queuing
delays.
• No error protection or flow control on a link-by-link basis
In contrast to packet switching, no re-transmission of lost cells takes place on
a link of the connection. This is due to the fact that very high quality links are
used in the network so a low bit error rate is achieved. Hence, no error control
is required.
Flow control will also not be supported in ATM networks. Proper resource
allocation and queue dimensioning in the network ensure a tolerable number
of queue overflows which cause packet loss. Values of packet loss probability
of 10^-8 down to 10^-12 are intended.
• Relatively small information field length
In order to reduce the internal buffers in the switching nodes, and to limit the
queuing delays in those buffers, the information field length is kept relatively
small. Indeed, small buffers guarantee a small delay and delay jitter as
required by real time services.
In this thesis, all designs are required to support both 155Mb/s and 622Mb/s
transmission rates for compatibility reasons.
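To give a feel for the time budget that the two rates leave for header processing, the duration of one cell slot can be estimated as below. This is a minimal sketch, assuming the standard 53-byte ATM cell (which this section does not state) and the nominal round-figure rates of 155Mb/s and 622Mb/s; the function name is illustrative only.

```python
# Sketch only: estimates how long one ATM cell occupies a link.
# Assumption: standard 53-byte ATM cell (5-byte header + 48-byte payload).
# Assumption: nominal line rates of 155 Mb/s and 622 Mb/s.
ATM_CELL_BYTES = 53


def cell_slot_ns(line_rate_mbps: float, cell_bytes: int = ATM_CELL_BYTES) -> float:
    """Return the duration of one cell slot in nanoseconds."""
    bits_per_cell = cell_bytes * 8
    # bits divided by Mb/s gives microseconds; convert to nanoseconds
    return bits_per_cell / line_rate_mbps * 1e3


for rate in (155.0, 622.0):
    print(f"{rate:.0f} Mb/s -> {cell_slot_ns(rate):.1f} ns per cell slot")
```

Under these assumptions the control logic has roughly four times less time per cell at the higher rate, which is why supporting both rates constrains the design.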
1.2 Benes Network and Bit-Rate Conversion
The Benes network was developed by [BENES] in the 1960's and is reported in
[TOBAGI] to be the smallest interconnection network for which all permutation patterns are
realisable. It is an architecture originally developed for circuit switched networks for building
a large non-blocking switch from smaller switching elements. A Benes network can be
constructed from two Banyan networks [GOKE&L] arranged in a mirror image configuration
which is symmetrical about the centre stage. To realise an N x N switch, it requires:
M x M switching elements.
Figure 1.1: An 8 x 8 switch constructed by 2 x 2 switching elements as a Benes network (Stage 1 to Stage 5)
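The size of such a network can be checked with a short calculation. The sketch below assumes the standard Benes construction, in which an N x N network built from M x M elements has 2 log_M(N) - 1 stages of N/M elements each, with N an exact power of M; the function name is hypothetical and not part of the thesis.

```python
def benes_dimensions(n: int, m: int = 2) -> tuple[int, int]:
    """Return (stages, elements_per_stage) for an n x n Benes network
    built from m x m switching elements.

    Assumption: standard Benes construction with n an exact power of m.
    """
    if m < 2 or n < m:
        raise ValueError("need m >= 2 and n >= m")
    levels, size = 0, 1
    while size < n:  # integer logarithm: levels = log_m(n)
        size *= m
        levels += 1
    if size != n:
        raise ValueError("n must be an exact power of m")
    return 2 * levels - 1, n // m


stages, per_stage = benes_dimensions(8)
print(stages, per_stage, stages * per_stage)
```

For the 8 x 8 case this gives the five stages labelled in the figure, with four 2 x 2 elements in each stage.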
An 8 x 8 switch can be constructed by 2 x 2 switching elements as shown in Figure
1.1. In circuit switching, the Benes network is known as a "Rearrangeably Non-blocking"
switch which means that the connections in the circuit switch can always be rearranged in a
way so that a desired connection can be made. However, there is no equivalent for ATM. In




















scattered over the centre stage to break up long term congestion in the routing stage due to
many cells converging onto one link. Cell loss can still occur but it is shared evenly by
different connections and is considerably less than that for a Banyan network. In Figure 1.2,
different paths to route an incoming cell from input 1 to output 8 are shown.
Figure 1.2: Different paths to route an incoming cell from input 1 to output 8 (Stage 1 to Stage 5)
In this project, a "call level randomising" routing strategy is used for the Benes
network in which each call is given a random path through the first half of the network and
routed to the desired output in the second half. Cell re-sequencing is not necessary since all
cells belonging to the same call take the same path through the network. However, call set-up
blocking still occurs but can be minimised by employing the bit rate conversion technique.
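The bookkeeping behind call level randomising can be illustrated as follows. This is only a sketch of the idea that the random choice is made once per call and then reused for every cell; the class, its methods and the notion of a single "centre port" index are hypothetical simplifications, not the routing logic of the actual switch fabric.

```python
import random


class CallLevelRandomiser:
    """Hypothetical helper: picks one random centre-stage port per call
    and reuses it for every cell of that call."""

    def __init__(self, num_centre_ports: int, seed=None):
        self._rng = random.Random(seed)
        self._num_centre_ports = num_centre_ports
        self._path_of_call = {}

    def set_up(self, call_id) -> int:
        """Choose the random first-half path once, at call set-up."""
        port = self._rng.randrange(self._num_centre_ports)
        self._path_of_call[call_id] = port
        return port

    def route(self, call_id, destination: int):
        """Return (centre_port, destination) for a cell of an established call."""
        return self._path_of_call[call_id], destination

    def tear_down(self, call_id) -> None:
        """Forget the path when the call is released."""
        self._path_of_call.pop(call_id, None)
```

Because `route` never draws a new random number, all cells of a call follow one path and arrive in order, which is why no re-sequencing is needed.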
The concept of bit rate conversion is to increase the data rate by a factor of k at the
inputs before entering the switching elements. It is then reduced back to its original speed
before leaving the output. This effectively reduces the switch size by a factor of k and hence
the number of switching elements and the delay. It can be shown that the power dissipation























1.3 Gallium Arsenide as a Semi-Conductor
Gallium Arsenide was first discovered in 1926. However, its potential as a high
speed semi-conductor was not realised until the 60's [ESHRAGHIAN]. The advantages of
GaAs over Si as a semi-conductor are discussed below: [BUSHEHRI], [ESHRAGHIAN],
[FIRSTENBERG], [HSU], [SIMONS], [SZE]
(i) At normal doping levels the saturated carrier drift velocities for GaAs and Silicon are
1.4 x 10^7 cm/s and 1 x 10^7 cm/s respectively. However, the drift velocity of GaAs reaches a
peak at an electric field strength of around 0.35V/µm. As the electric field strength increases
further, the drift velocity reduces and approaches the saturated velocity. On the other hand,
the drift velocity of Silicon increases with increasing electric field strength while the saturated
velocity is obtained at an electric field strength of about four times that of GaAs. As a result,
up to 70% reduction in power dissipation can be achieved in GaAs over the fastest Silicon
technology such as Emitter Coupled Logic (ECL).
(ii) The electron mobility of GaAs is about six to seven times higher than that of Silicon.
Consequently, GaAs MESFETs with typical gate lengths of 0.5µm to 1µm can achieve
transit times as short as 10ps to 15ps. This corresponds to a current gain-bandwidth product
in the range of 15GHz to 25GHz, which is a factor of three to five times higher than Silicon.
(iii) Due to the absence of a gate oxide layer to trap charges, GaAs devices are more
radiation resistant than Silicon ones. This has a significant impact on space applications
where radiation is a major concern.
(iv) A large bandgap offers GaAs semi-insulating properties. A high resistivity in the
range of 10^7-10^9 Ωcm at room temperature is another advantage for high performance
devices. It not only minimises the parasitic capacitances but also reduces the leakage
current between devices on the same substrate.
(v) A wider operating temperature range for GaAs devices is possible due to the larger
bandgap. Typical operating temperature varies over the range from -200°C to +200°C.
(vi) Schottky barriers can be realised on GaAs with a large variety of metals such as
Aluminium, Platinum and Titanium. This leads to high quality Schottky junctions with
excellent ideality (less than 1) and low reverse currents (< 1A/cm^2).
(vii) The direct bandgap of GaAs allows efficient radiative recombination of electrons and
holes. This means that forward biased p-n junctions can be used as light emitters. Thus, an
efficient integration of electrical and optical functions is possible.
A table summarising the electrical properties of GaAs and Silicon is shown in
Table 1.4 [GLOANEC].
Table 1.4: Comparison of electrical properties between GaAs and Si
Property                                   Si          GaAs
Electron mobility µn (cm^2/Vs)             800         5000
Maximum electron drift velocity (cm/s)     1 x 10^7    2 x 10^7
Maximum resistivity (Ωcm)                  10^5        10^9
Minority carrier life time (s)             10^-3       10^-8
Breakdown field (V/cm)                     3 x 10^5    4 x 10^5
Schottky barrier height (V)                0.4-0.6     0.7-0.8
Despite all the desirable properties that GaAs may possess, its disadvantages lie
mainly in the device physics [LONG&BUTNER], [NUNEZ] which limit the yield and result
in a high fabrication cost. The disadvantages of GaAs are discussed as follows:
(i) GaAs wafers have a large density of dislocations in the crystal lattice structure. This
coupled with the inadequacy and brittleness of the material together with the extra difficulty in
controlling the doping and threshold voltage over the wafer results in a lower yield than
Silicon. As a result, GaAs ICs must be smaller in area and have a smaller transistor count.
Furthermore, the fabrication cost for GaAs ICs is approximately two orders of magnitude
higher than Silicon of the same technology.
(ii) The lack of a gate oxide (which isolates the gate metal and the underlying conductive
channel) in GaAs makes the Schottky junction easily forward-biased. As a result, a large gate
current can flow through and limits the gate-to-source voltage to the range of 0.7-0.8V
depending on the type of gate metal used. Consequently, it is more difficult for GaAs devices
to match the operating conditions of existing Silicon ICs.
(iii) The reduction of drain current in a MESFET or
heterojunction field effect transistor (HFET) caused by the presence of nearby neighbouring
FETs is known as the backgating or sidegating effect. This effect is mainly caused by the
capacitive coupling of the channel of a MESFET to the floating substrate. As a result, the
substrate can modulate the drain current of the MESFET as a backgate, and adjacent devices
as sidegates. The remedy is to place transistors further away from each other. Unfortunately,
this conflicts with the need to put devices in close proximity for good matching as well
as for achieving a high packing density in a VLSI environment.
From the above discussions, it seems unlikely for GaAs to replace Silicon as a
semi-conductor. However, GaAs can be used as a fast front-end interface (eg. interfacing with
optical fibres) that processes high speed serial data to produce lower rate parallel substreams
which are further processed by Silicon subsystems.
1.4 Scope of This Thesis
This thesis is intended to address the design details regarding the control logic within
a buffer chip, which is part of a chip set for the realisation of a high bandwidth packet switch
fabric in GaAs MESFET technology. An outline of the following chapters in the thesis is
presented below:
Chapter 2 describes the choice of logic families and their optimisation. A "mixed"
logic design approach based on DCFL is used in the realisation of the control logic. This
work was published at the 11th NorChip seminar held in Trondheim, Norway in
November 1993. The actual paper is attached in Appendix D.
Chapter 3 describes the overall system specifications for the entire 1024 x 1024
switch fabric. Issues such as the internal cell format, the systems architecture and the
required performance are addressed.
The logic layout style and the design tools used throughout the project are described
in Chapter 4. The "ring notation" layout style is employed to minimise the coupling between
high speed signals and to achieve a high packing density. A design flow diagram is presented
as a guideline for designing and verifying high complexity VLSI circuits.
Chapter 5 describes the basic primitives used in the construction of the buffer chip.
This library is the same as the one used in the VHDL descriptions to achieve a one-to-one
correspondence between the structural descriptions and the transistor netlists generated from
the layout.
Chapter 6 describes the author's design of the control logic. It is realised as three
main modules: an input control, a buffer manager and an output control. The modules operate
at different speeds and communicate asynchronously. All circuit and timing diagrams,
layouts and simulation results are included for completeness. This chapter constitutes the
main content of this thesis. Other modules in the buffer chip were designed by Mr. Jens
Jakobsen at Jydsk Telefon, Denmark.
Finally, a conclusion about this research is presented in Chapter 7. A note about the
future work on this project is also given.
Chapter 2. Choice of Technology & Optimisation
This chapter describes the logic families used in the switch fabric design. Initially, a
survey of the logic families is carried out. Direct Coupled FET Logic (DCFL) and three
other static "normally-off' families (SDCFL, SBFL and UBFL) based on it are chosen
due to their simplicity, compatibility in signal levels, power consumption and speed
requirement. A "mixed" logic approach based on these four families is used to realise the chip
sets for the switch fabric. The work regarding the optimisation and comparison of these four
families has been published in [CHU&JAKOBSEN] and is included in Appendix D.
2.1 Logic Families in GaAs
There exists a large number of logic families in the GaAs MESFET technology. They
can be classified into two main categories: the "Normally-On" and the "Normally-Off" logic
families [ESHRAGHIAN].
"Normally-On" logic utilises depletion type MESFETs which are normally "ON"
devices and when used as switching elements they are required to be turned off. This class of
logic includes the following approaches:
• Buffered FET Logic (BFL) [VANTUYL1], [VANTUYL2];
• Capacitively Coupled Domino Logic (CCDL) [HOE&SALAMA];
• Capacitor Coupled FET Logic (CCFL) [MELLOR], [WELBOURN];
• Capacitor Diode FET Logic (CDFL) [EDEN1];
• Feed-Forward Static Logic (FFSL) [ESHRAGHIAN];
• Inverted Common Drain Logic (ICDL) [ABDEL,R&Y];
• Schottky Diode FET Logic (SDFL) [EDEN,L,L,W&Z], [EDEN2], [HELIX,J,C&S],
[LONG];
• Source Coupled FET Logic (SCFL) [KATSY], [VU&PECZALSKI] and
• Unbuffered FET Logic (UFL) [BARNA&L].
"Normally-On" logic requires two power supplies (both VDD and VSS) and extra
level shifting diodes. This implies a higher complexity and area consumption and hence it is not
suitable for VLSI implementation.
"Normally-Off" logic, on the other hand, utilises enhancement type MESFETs which
are normally "OFF" devices and when used as switching elements they are required to be
turned on. This class of logic includes the following approaches:
• Direct Coupled FET Logic (DCFL) [BOSCH], [CHU&JAKOBSEN], [ISHIKAWA],
[PECZALSKI], [SUYAMA];
• Feedback FET Logic (FBFL) [FULKERSON];
• FET FET Logic (FFL) [LARUE,W&C];
• Junction FET Logic (JFL) [ZULEEG,N&T];
• Pseudo Current Mode Logic (PCML) [DUNCAN,S&S];
• Quasi-FET Logic (QFL) [NUZILLAT];
• Super Buffer FET Logic (SBFL) [CHU&JAKOBSEN], [LONG&BUTNER],
[NAKAMURA];
• Source-follower Direct Coupled FET Logic (SDCFL) [CHU&JAKOBSEN],
[DAVENPORT], [ESHRAGHIAN92];
• Source Follower FET Logic (SFFL) [ESH,C,M&C], [ESH,B,S,L,B&B];
• Two-phase Dynamic FET Logic (TDFL) [NARY&LONG] and
• Ultra Buffer FET Logic (UBFL) [CHU&JAKOBSEN].
As well as the above logic classes, technology incorporating pass transistors has been
developed for use in digital logic design. "Differential Pass Transistor Logic" (DPTL) has
been reported in [PASTERNAK&S] to take advantage of the pass transistor structure without
sacrificing important design parameters such as noise margins by using buffers at the output.
However, it requires differential inputs and frequent buffering which makes its use limited.
While dynamic logic has the desirable properties of being simple and low power, it is highly
sensitive to process variations and operating conditions. This, coupled with the fact that it
has a minimum frequency of operation, makes its use limited. As a result, only static
"normally-off" logic families are considered here due to the speed requirements (the chip must
work over a frequency range from 20MHz to 600MHz). Amongst all candidates, DCFL has shown the
most promise. It has the desirable properties of being simple as well as having a constant and
low power dissipation. However, the main drawback lies in its low noise margins and weak
output driving capabilities. A mixed logic approach using pure NOR structures based on
DCFL, SDCFL, SBFL and UBFL has been developed in [CHU&JAKOBSEN] (see
Appendix D) to achieve high speed, low power, low noise and high packing density designs.
The four logic families are introduced in sections 2.1.1 to 2.1.4.
2.1.1 Direct Coupled FET Logic (DCFL)
DCFL is the simplest logic family in the GaAs MESFET technology. A DCFL
inverter is shown in Figure 2.1(a). Its structure closely resembles the nMOS structure in
Silicon design except for the presence of the Schottky diode at the gate of the EFETs which
clamps the steady state high voltage level to one diode drop (approximately 0.7V) instead of
the full swing to VDD. As a result, the pull-up DFET always stays in saturation. The
operation of a DCFL inverter can be described as follows: When the input is a logic "low", the
pull-down EFET is cut-off. Its equivalent circuit is shown in Figure 2.1(b). On the other
hand, when the input is a logic "high", the pull-down EFET operates in the linear region and




Figure 2.1(c). From its operation, it can be seen that DCFL gates have a large delay when the
output load is large. This is due to the limited current available to pull up the output load,
hence a large rise-time. Also, the noise margin in DCFL is low since the output low level
depends upon the on-resistance of the EFET. As a result, a high noise margin can only be
achieved by increasing the width of the EFET. However, as shown in Appendix D, this
reduces the speed of the gate. Consequently, there is a trade-off between the speed and the







Figure 2.1: (a) DCFL inverter
(b) Equivalent circuit when the input is low and
(c) Equivalent circuit when the input is high
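The dependence of the output low level on the EFET width can be illustrated with a first-order calculation. This is a rough sketch only, assuming that the saturated pull-up DFET behaves as a constant current source and that the pull-down EFET in the linear region behaves as a resistor inversely proportional to its width; the numerical values are placeholders and are not parameters of the process used in this thesis.

```python
def output_low_voltage(i_pullup_ma: float, r_on_ohm_um: float,
                       efet_width_um: float) -> float:
    """Return the steady-state output low voltage in volts.

    Assumptions (first-order model, not from the thesis):
      - the saturated pull-up DFET supplies a constant current i_pullup_ma
      - the pull-down EFET is a resistor of r_on_ohm_um / efet_width_um ohms
    """
    r_on = r_on_ohm_um / efet_width_um  # on-resistance in ohms
    return i_pullup_ma * 1e-3 * r_on


# Placeholder values: 0.2 mA pull-up current, 5000 ohm.um specific on-resistance
for width in (5.0, 10.0, 20.0):
    print(f"W = {width:4.1f} um -> V_OL = {output_low_voltage(0.2, 5000.0, width):.3f} V")
```

In this model a wider EFET lowers the output low level and so widens the low noise margin, at the cost of a larger input capacitance for the driving gate.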
Besides inverters, DCFL also supports NOR functions which can be realised by simply
adding extra parallel pull-down EFETs. The simplicity of DCFL together with the fact that it
has a constant and low power dissipation makes it an excellent candidate for VLSI
implementation since a high packing density, quiet power bus and low switching energies
can be achieved. Moreover, DCFL has been reported to have the smallest power-delay
product of all static logic families in GaAs [ESHRAGHIAN]. However, the main
drawback in DCFL lies in its low noise margins (which make it difficult to realise large
fan-in NOR gates and impose a stringent process requirement) and a weak driving capability.
Three other logic families presented in the following sections are designed to increase the









2.1.2 Source-follower Direct Coupled FET Logic (SDCFL)
As its name suggests, SDCFL improves the performance of DCFL by appending a
source follower at its output. A basic SDCFL inverter is shown in Figure 2.2(a). With
reference to Figure 2.2, the operation of an SDCFL inverter can be described as follows:
When the input is a logic "low", J2 is cut-off while J1 operates in the linear region
and hence the internal node is pulled high to 2 diode drops. However, the difference between
the gate-to-source voltage (Vgs) of J3 and its threshold voltage (≈200mV) is less than its
drain-to-source voltage (Vds). As a result, J3 saturates while J4 operates in the linear region.
The equivalent circuit for a logic "low" input is shown in Figure 2.2(b). On the other hand,
when the input is a logic "high", J1 saturates while J2 operates in the linear region and
behaves as a voltage controlled resistor (VCR). Consequently, the internal node is pulled
down and hence J3 is cut-off while J4 operates in the linear region. The equivalent circuit is
shown in Figure 2.2(c). From its operation, it can be seen that SDCFL gates will have better
noise margins than DCFL since the buffer stage improves the output low level. However, it
has a large fall-time when the output load is large. This is due to the fact that SDCFL has a
large pull-down RC time constant since the pull-down DFET has a large on-resistance. One
way to improve the fall-time in SDCFL is to tie a small negative voltage (eg. -200mV) instead
of ground to the pull-down DFET at the buffer stage so that it stays in saturation for a longer
period. It also gives a lower output low voltage which in turn increases the noise margin.
However, due to practical difficulties and complexity issues, it is not considered here.
In additional to inverters and nor gates, SDCFL also supports Or-And-Invert (OAI)
structures which is illustrated in Figure 2.3.ln OAI structures, the output of two SDCFL
gates are connected together to form an "ot'' operation. Noæ that the total width of the pull-
down DFET is halved in order to maintain proper output high level. OAI stn¡ctures enables














Figure 2.2: (a) SDCFL inverter
(b) Equivalent circuit when the input is low
(c) Equivalent circuit when the input is high
Figure 2.3: Or-And-Invert (OAI) structure in SDCFL
The realisation of nor gates with higher fan-ins than DCFL is possible since SDCFL
has higher noise margins. Moreover, the source follower also increases the output driving
capability. However, this is at the expense of a higher power dissipation and complexity.
Furthermore, the power dissipation of SDCFL gates depends on the output logic state as the
steady-state output high level is maintained by the pull-up EFET (J3) at the source follower













Figure 2.4: SBFL inverter and 2-input nor gate
An SBFL inverter and a 2-input nor gate are shown in Figure 2.4. Similar to SDCFL,
SBFL enhances the performance of DCFL by appending a push/pull super buffer at its
output. The operation of an SBFL inverter can be described as follows:
When the input is a logic "low", J1 operates in the linear region while J2 is cut-off
and the output of the first stage is pulled high to 2 diode drops. However, the drain-to-source
voltage (Vds) of J3 is greater than the difference between its gate-to-source voltage (Vgs) and
the threshold (≈200mV). As a result, J3 saturates while J4 is cut-off. The equivalent circuit
for a logic "low" input is shown in Figure 2.5(b). On the other hand, when the input is a
logic "high", J1 saturates while J2 operates in the linear region and behaves as a voltage
controlled resistor (VCR). Consequently, the internal node is pulled down and hence J3 is
cut-off while J4 operates in the linear region. The equivalent circuit is shown in Figure
2.5(c). From its operation, it can be seen that SBFL has both short rise-times (due to the
strong pull-up EFET J3) and short fall-times (due to the small RC time constant, as the resistance of
the pull-down EFET J4 is low). However, it has dynamic problems. Consider the case where
the input is initially a logic "low", so that J3 is turned on while J4 is cut-off. A positive transition of
the input will turn J4 on while J3 is still conducting. This in turn creates a DC conduction path












Figure 2.5: (a) SBFL inverter
(b) Equivalent circuit when the input is low
(c) Equivalent circuit when the input is high
SBFL has both higher driving capabilities and noise margins than SDCFL. However,
this is at the expense of higher power dissipation. Also, nor structures are considerably more
complex than in SDCFL as parallel pull-down EFETs have to be included in both the driving
and the buffering stages. The problem with the DC conduction path between the power buses
poses some extra design considerations. Similar to SDCFL, the power dissipation in SBFL
also depends on the output logic state as the steady state high level is maintained by the pull-








Figure 2.6: (a) UBFL inverter and (b) UBFL 2-input nor gate
A UBFL inverter and a 2-input nor gate are shown in Figure 2.6. UBFL is designed
to improve on SBFL by maintaining a constant power dissipation irrespective of the steady-state
logic states. Hence, the noise injection to the power buses is kept to a minimum. This is
achieved by positively feeding the output back to the input stage. With reference to the UBFL
inverter in Figure 2.6, its operation can be described as follows:
When the input is high, both E2 and E4 are off and the steady-state output low voltage is
maintained by the DCFL inverter (D2 and E3). As the input voltage starts to decrease, E1 and
E3 get cut-off while the internal node pulls up, which turns E4 on. Consequently, the output
load is pulled up by E4 and thus a short rise-time is achieved. As the output voltage starts to
increase, E2 gradually gets turned on and reduces the strength of E4. When the output
reaches the steady-state high level, E4 gets cut-off completely and the output is solely
maintained by D2. This is the mechanism which enables UBFL to have constant power
dissipation irrespective of the steady-state logic states.
Despite the fact that UBFL has a constant and low power dissipation, its main




essentially a 2-input DCFL nor gate followed by a DCFL inverter). Similar to SBFL, it is
complex to realise nor gates in UBFL.
2.2 Logic Family Optimisation
The optimisation process is started by noting the operating environment of the chip:
. Power Supply
DCFL can work with a supply voltage as low as 0.9V. However, due to speed and
noise injection considerations, the pull-up DFET should operate in saturation at all times. On
the other hand, SDCFL, SBFL and UBFL require a higher supply voltage (typically 2 diode
drops) for proper logic operation. As a result, a 1.5V supply is chosen to ensure all four
logic families will function properly and achieve a reasonable speed.
. Temperature Requirement
A die temperature of 125°C is chosen for the optimisation process to ensure the chip
set works under military specifications.
. Backgating Voltage
A backgating (substrate) voltage of 0.6V is chosen for simulation purposes according
to the foundry supplied design manual [CMP] as all logic families have positive signal levels, despite
the fact that DFETs are used as pass transistors in the D flip-flop designs (see Chapter 5) and
require a negative voltage to be turned off.
. Process Variations
All logics are optimised with their typical parameters to reflect their real-life




The optimisation process is started by using the Curtice Model [CURTICE] to
calculate the appropriate pull-up/pull-down ratio for DCFL inverters. The Curtice Model can
be written as:
$$I_{ds} = \beta \left(\frac{W}{L}\right) (V_{gs} - V_t)^2 (1 + \lambda V_{ds}) \tanh(\alpha V_{ds})$$
where $\beta$ denotes the transconductance parameter of the device
$I_{ds}$ denotes the drain-to-source current of the device
$W$ denotes the width of the device
$L$ denotes the gate length of the device
$V_{gs}$ denotes the gate-to-source voltage of the device
$V_t$ denotes the threshold voltage of the device
$\lambda$ denotes the channel-length modulation parameter
$V_{ds}$ denotes the drain-to-source voltage of the device
$\alpha$ denotes the saturation voltage parameter
Consider the case where the inverter voltage $V_{inv}$ is applied to the input of a DCFL
inverter such that the output has the same level: both the DFET and the EFET operate in
saturation. The currents flowing through the EFET and the DFET can be written as:
$$I_{dse} = \beta_e \left(\frac{W_e}{L_e}\right) (V_{inv} - V_{te})^2 (1 + \lambda_e V_o) \tanh(\alpha_e V_o)$$
and
$$I_{dsd} = \beta_d \left(\frac{W_d}{L_d}\right) (V_{td})^2 \left[1 + \lambda_d (V_{dd} - V_o)\right] \tanh\left[\alpha_d (V_{dd} - V_o)\right]$$
respectively, where $V_{dd}$ denotes the supply voltage and $V_o$ denotes the output voltage. Note that the subscript "e"
denotes parameters associated with the EFET while the subscript "d" denotes parameters
associated with the DFET.
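The balance between the two Curtice currents can be explored numerically. The following is only a minimal sketch: the function names and every parameter value in `EFET` and `DFET` are placeholders chosen for illustration, not the foundry parameters used in this work, and the cut-off branch is an added assumption.

```python
import math

def curtice_ids(beta, w, l, vgs, vt, lam, alpha, vds):
    """Drain-to-source current from the Curtice expression quoted above."""
    if vgs <= vt:
        return 0.0  # assumption: the device is treated as cut off below threshold
    return beta * (w / l) * (vgs - vt) ** 2 * (1 + lam * vds) * math.tanh(alpha * vds)

def width_ratio(efet, dfet, v_inv, v_dd):
    """Return W_d / W_e for which both currents match at V_in = V_out = v_inv."""
    # Currents per unit width (w = 1); the DFET gate is tied to its source (Vgs = 0).
    i_e = curtice_ids(efet["beta"], 1.0, efet["l"], v_inv, efet["vt"],
                      efet["lam"], efet["alpha"], v_inv)
    i_d = curtice_ids(dfet["beta"], 1.0, dfet["l"], 0.0, dfet["vt"],
                      dfet["lam"], dfet["alpha"], v_dd - v_inv)
    return i_e / i_d

# Placeholder parameters for illustration only, NOT foundry data.
EFET = {"beta": 1e-4, "l": 1.2, "vt": 0.2, "lam": 0.1, "alpha": 3.0}
DFET = {"beta": 1e-4, "l": 1.2, "vt": -0.8, "lam": 0.1, "alpha": 3.0}

if __name__ == "__main__":
    print(width_ratio(EFET, DFET, v_inv=0.4, v_dd=1.5))
```

With real process parameters substituted for the placeholders, the returned value plays the role of the DFET-to-EFET width ratio at equal gate lengths.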
By equating the two currents, the following expression can be obtained:
$$\frac{W_d}{W_e} = \frac{\frac{\beta_e}{L_e} (V_{inv} - V_{te})^2 (1 + \lambda_e V_o) \tanh(\alpha_e V_o)}{\frac{\beta_d}{L_d} (V_{td})^2 \left[1 + \lambda_d (V_{dd} - V_o)\right] \tanh\left[\alpha_d (V_{dd} - V_o)\right]}$$
which turns out to be around 1/18 by substituting the appropriate parameters for constant gate
lengths for both devices. This provides a starting point for the optimisation process. A
mathematical treatment of other design parameters of DCFL is given below
[CHU&JAKOBSEN]:
. Model Simplification
An abridged device model is required to simplify the mathematical expressions. In the
following analysis, second order effects such as voltage saturation (denoted by the voltage
saturation parameter α) and channel-length modulation (denoted by the channel-length




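$$\tanh(\alpha V_{ds}) = \begin{cases} 1 & \text{for } \alpha V_{ds} > 1 \text{ (saturation)} \\ \alpha V_{ds} & \text{for } \alpha V_{ds} < 1 \text{ (linear region)} \end{cases} \qquad \text{and} \qquad \lambda = 0$$
As a result, the drain-to-source current for a MESFET operating in the cut-off, saturation and
linear regions can be re-written as:
$$I_{ds} = \begin{cases} 0 & \text{for } V_{gs} < V_t \text{ (cut-off)} \\ \beta \left(\frac{W}{L}\right) (V_{gs} - V_t)^2 & \text{for } \alpha V_{ds} > 1 \text{ (saturation)} \\ \alpha \beta \left(\frac{W}{L}\right) (V_{gs} - V_t)^2 V_{ds} & \text{for } \alpha V_{ds} < 1 \text{ (linear region)} \end{cases}$$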
. Inverter Threshold
The inverter threshold voltage $V_{inv}$ is defined as the input voltage at which the inverter
produces the same voltage at its output. This occurs
when the saturation currents of the EFET and the DFET are equal, which can be written
as:
$$I_{dse} = \beta_e \left(\frac{W_e}{L_e}\right) (V_{inv} - V_{te})^2$$
and
$$I_{dsd} = \beta_d \left(\frac{W_d}{L_d}\right) (V_{td})^2$$
respectively. By equating the two currents, $V_{inv}$ can be written as:
$$V_{inv} = V_{te} - \frac{V_{td}}{\sqrt{x}}$$
where $x = \frac{\beta_e (W_e / L_e)}{\beta_d (W_d / L_d)}$ denotes the pull-up/pull-down ratio.
. Inverter Characteristics
Given the inverter threshold $V_{inv}$, the inverter transfer characteristics can be
determined as:
$$V_{out} = \begin{cases} V_{OH} & \text{for } V_{in} \ll V_{inv} \\ \dfrac{V_{td}^2}{x \, \alpha_e (V_{in} - V_{te})^2} & \text{for } V_{in} \gg V_{inv} \end{cases}$$
Note that this is only an approximate model of the inverter's behaviour and is not accurate
when the input voltage $V_{in}$ is in the vicinity of $V_{inv}$.
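The simplified threshold and output-low expressions lend themselves to a quick numerical check. The sketch below assumes that x is the ratio of the EFET term β(W/L) to the DFET term, that Vinv = Vte - Vtd/sqrt(x), and that the output low level is Vtd²/(x·αe·(Vin - Vte)²); the function names and the numbers in the usage lines are illustrative only.

```python
import math

def pull_ratio(beta_e, w_e, l_e, beta_d, w_d, l_d):
    """x: EFET strength beta*(W/L) divided by DFET strength beta*(W/L)."""
    return (beta_e * w_e / l_e) / (beta_d * w_d / l_d)

def inverter_threshold(v_te, v_td, x):
    """Vinv from equating the two saturation currents (v_td is negative for a DFET)."""
    return v_te - v_td / math.sqrt(x)

def output_low(v_in, v_te, v_td, x, alpha_e):
    """Approximate output voltage for v_in well above Vinv (EFET in the linear region)."""
    return v_td ** 2 / (x * alpha_e * (v_in - v_te) ** 2)

if __name__ == "__main__":
    x = 16.0                                    # illustrative ratio
    print(inverter_threshold(0.2, -0.8, x))     # threshold voltage in volts
    print(output_low(1.2, 0.2, -0.8, x, 4.0))   # output low level in volts
```

Because the model ignores channel-length modulation and the transition region, the values are only meaningful well away from the threshold itself.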
. Noise Margin
Noise margin is a measure of the noise tolerance of the circuit. Many different
definitions of noise margins have been reported in [HILL], [LONG&BUTNER],
[WESTE&E] and it is shown in [LOHS,S&DEG] that they are mathematically
equivalent. Amongst the various definitions, the "Intrinsic" method as described in
[LONG&BUTNER] is used here due to the ease of measurement. The test structure for the
measurement is shown in Figure 2.7.
Figure 2.7: Test circuits for determining (a) the intrinsic high and
(b) the intrinsic low noise margin
With reference to Figure 2.7(a), the intrinsic high noise margin is defined as the maximum
noise voltage VN1 which can be applied in the direction shown such that the output Vout1
will not change from the low to the high state. Thus it can be written as:
V¡,¡n = Vo*, - Vio" = Vor -v. + E.Æ
Similarly, with reference to Figure 2.7(b), the intrinsic low noise margin can be
defined as the maximum noise voltage VN2 which can be applied in the direction shown
such that the output Vout2 will not change from the high to the low state. Thus it can be written as:
V
x. ø"(vo" - v,"
2
Vr.,r- = V,o" - Vo, = V,. -
where Fin denotes the fan-in of the gate.
Note that VNIH and VNIL need not necessarily be the same. The intrinsic noise margins for an
inverter and multi-input nor gates are plotted against x in Figure 2.8. It can be seen that the
intrinsic low noise margin degrades rapidly as the fan-in increases and the value of x should




















Figure 2.8: Intrinsic noise margins of DCFL gates as a function of x at 125°C
. Inverter Delay
The term delay has been defined ambiguously [LONG&BUTNER]. It can refer
to the propagation delay, the average of the rise and fall times, or the ring-oscillator
delay. Here, the delay associated with a DCFL inverter is modelled as the average of
the rise and fall times for its mathematical simplicity. It can be written as:
$$\text{Delay} = \frac{t_r + t_f}{2} = \frac{C_L}{2} \left(\frac{\Delta V}{I_{dsd}} + R_{ON}\right)$$
where $t_r$ denotes the rise time, $t_f$ denotes the fall time, $C_L$ denotes the load capacitance, $\Delta V$ is
the voltage swing, $I_{dsd}$ denotes the saturated drain-to-source current associated with the
DFET and $R_{ON}$ is the "on" resistance of the EFET. If we assume $C_L$ to be directly
proportional to the width of the EFET then it follows that the fall-time is constant. The rise-time,
however, is proportional to $C_L / I_{dsd}$ to a first order approximation. Thus the rise-time is




The power-delay product is a measure of the effectiveness of the device scaling
within the gates [LONG&BUTNER] as well as an indication of the level of integration
achievable [ESHRAGHIAN]. Once x is determined, the size of the devices should be chosen
accordingly. The power-delay product for DCFL gates can be written as:
$$\text{Power-Delay Product} = \frac{V_{dd} C_L}{2} \left(\Delta V + R_{ON} I_{dsd}\right)$$
where $C_L$ is the load capacitance and is the sum of the driver, fan-out and interconnect
capacitances. It can be written as:
$$C_L = C_{gdd} + F_{out} C_{gse} + C_I$$
where $C_{gdd}$ denotes the gate-to-drain capacitance of the DFET, $F_{out}$ denotes the number of
fan-outs, $C_{gse}$ denotes the gate-to-source capacitance of the EFET and $C_I$ denotes the
interconnect capacitance. For wide pull-down EFETs, $C_{gdd}$ is small compared to $C_{gse}$. For
short EFETs, however, the gate length of the DFET has to be scaled in order to achieve a
weak pull-up. In such a case $C_{gdd}$ becomes comparable to, or even exceeds, $C_{gse}$. The power
delay product of a DCFL inverter is shown in Figure 2.9 at one fan-out and constant x. It can
be seen that the power delay product is minimal when We = 8µm. Minimum delay, however,
can be obtained by using larger devices.
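The delay and power-delay expressions can be tied together in a few lines. This is a sketch under stated assumptions: the rise time is taken as CL·ΔV/Idsd, the fall time as RON·CL, and the static power as Vdd·Idsd; the function names and all numbers in the usage lines are hypothetical and not simulation results from this work.

```python
def load_capacitance(c_gdd, fan_out, c_gse, c_int):
    """CL as the sum of driver, fan-out and interconnect capacitances (farads)."""
    return c_gdd + fan_out * c_gse + c_int

def inverter_delay(c_load, delta_v, i_dsd, r_on):
    """Average of the first-order rise and fall times (seconds)."""
    rise = c_load * delta_v / i_dsd   # DFET current source charging the load
    fall = r_on * c_load              # EFET on-resistance discharging the load
    return 0.5 * (rise + fall)

def power_delay_product(v_dd, c_load, delta_v, i_dsd, r_on):
    """Static power (Vdd * Idsd) multiplied by the average delay (joules)."""
    return 0.5 * v_dd * c_load * (delta_v + r_on * i_dsd)

if __name__ == "__main__":
    c_l = load_capacitance(2e-15, 1, 10e-15, 8e-15)   # hypothetical capacitances
    print(inverter_delay(c_l, 0.5, 100e-6, 1e3))
    print(power_delay_product(1.5, c_l, 0.5, 100e-6, 1e3))
```

Multiplying `inverter_delay` by Vdd·Idsd gives the same number as `power_delay_product`, which is a convenient consistency check when the device sizes are swept.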
Figure 2.9: Power delay product of DCFL inverters as a function of We







From the above analysis, it is found that using a pull-down EFET of width 8µm
and a 1.2µm long gate together with a pull-up DFET of width 2µm and a 4µm long gate
gives a good trade-off between speed, noise margins and power-delay product (which
translates to the level of integration achievable as reported in [ESHRAGHIAN]). The






Figure 2.10: Optimised DCFL inverter
2.2.2 Optimising SDCFL
The optimisation of SDCFL starts with the DCFL stage, taking into account that
the output high level is at 2 diode drops (≈1.2V) and that its output low level can be higher than
that of a conventional DCFL gate but lower than the threshold voltage of an EFET since it is
fed to a source follower. To maintain compatibility with DCFL, the input capacitance of
SDCFL is kept constant by using an EFET of the same size (8µm wide by 1.2µm long).
However, SDCFL is a two-stage logic and in order to achieve the same overall speed as
DCFL, the first stage is made twice as fast as a normal DCFL gate. This can be achieved by
using a2pmwide by Z. ¡tmlong pull-up DFET. The buffering stage is optimised in a way
such that it gives equal high and low noise margins as defined by the "Slope" method (which
is just another convenient way of measuring both noise margins) to maximise the worst-case

















Figure 2.12: Optimised SBFL inverter
Figure 2.11: Optimised SDCFL inverter
2.2.3 Optimising SBFL
The optimisation of SBFL is similar to that of SDCFL: The input stage is kept
unchanged to maintain compatibility with SDCFL whereas the EFETs in the buffering stage
are chosen to have the same width to achieve equal rise and fall times. However, the input
capacitance will not be the same as the input signal is fed to both pull-down EFETs at the
input and buffering stages. The actual width of the output EFETs is chosen so that the
optimised gate will have a delay of less than 200ps with a 200fF capacitive load at the output.
This ensures a sufficient output current to drive a wire length of up to 1mm. The sizing of an







The optimisation of UBFL is slightly more complicated. Initially, the DCFL stage
(D1 and E1) is chosen to be the same as for SDCFL and SBFL. The size of E2 is a design
parameter as it determines the strength available to E4, which translates directly to the speed
and noise margin of the entire gate. The transistor pair E3 and E4 should be chosen to be the
same as that for SBFL to maintain output strength compatibility. The size of the pull-up
DFET D2 should be the same as that for a DCFL inverter. An initially optimised UBFL
inverter is shown in Figure 2.13. However, due to the effect of backgating, the strength of
D1 must be reduced to preserve the logic levels. To compensate for this, the size of the transistor































The DC characteristics of the optimised gates are shown in Figure 2.15 where the
input voltage is plotted together with the output to help identify the inverter voltage (Vinv).
Their performance is compared on the basis of speed, noise margin, power consumption
and noise injection to the power bus at the operating condition (125°C and a substrate voltage
of 0.6V). As there are many ambiguous definitions for such parameters, the benchmarks used
for the measurements are presented below:
. Speed: All speed measurements were carried out using a 7-stage ring oscillator.
. Noise Margin: All noise margins were measured by the "maximum square" method [HILL].
. Power Consumption: The sum of the static and dynamic power consumption was measured
at 500MHz.
. Noise Injection: The noise injection to the power bus was measured by the static current
balance, which expresses the deviation in the current drawn in different
logic states as a percentage of its average value. It can be written as:
$$\text{Static Current Balance} = \frac{\text{Deviation of Current Drawn in Different Logic States}}{\text{Average Current Drawn}} \times 100\%$$
which can be re-written as:
$$\text{Static Current Balance} = \frac{|I_{Hi} - I_{Lo}|}{I_{Hi} + I_{Lo}} \times 100\%$$
where $I_{Hi}$ denotes the steady-state current drawn at the logic "Hi" state
$I_{Lo}$ denotes the steady-state current drawn at the logic "Lo" state
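The static current balance is simple enough to evaluate directly from two steady-state currents. The helper below is a minimal sketch; the function name and the example currents are illustrative, not measured values.

```python
def static_current_balance(i_hi, i_lo):
    """Static current balance in percent: |I_Hi - I_Lo| / (I_Hi + I_Lo) * 100."""
    return abs(i_hi - i_lo) / (i_hi + i_lo) * 100.0

if __name__ == "__main__":
    # Hypothetical steady-state supply currents in amperes.
    print(static_current_balance(1.2e-3, 0.8e-3))
    print(static_current_balance(1.0e-3, 1.0e-3))
```

A gate that draws the same current in both logic states scores 0%, which is the behaviour UBFL is designed to approach.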
The graphs showing the effect of fan-out, capacitive load and fan-in on the
performance of the logic are plotted in Figure 2.16(a) to Figure 2.16(d) and are summarised
in Table 2.1. Note that SBFL and UBFL are used only as buffers due to their complexity.
The current drawn from VDD and GND of various inverters is shown in Figure 2.17.









































Figure 2.15(a): DC characteristics of DCFL inverter at 1 fan-out




































































Figure 2.15(c): DC characteristics of SBFL inverter at 1 fan-out



















































Figure 2.15(d): DC characteristics of UBFL inverter at 1 fan-out
Key: OUT1 denotes the input to the inverters

























































(a) Effect of fan-out on speed
(b) Effect of capacitive loads on speed
(c) Effect of fan-in on speed at one fan-out
(d) Effect of fan-in on noise margin
Figure 2.16: Graph showing the performance of various logic families












Criterion                                        DCFL    SDCFL   SBFL    UBFL
Power-Delay Product at 1 Fan-Out (fJ)            10      35      40      31
Number of Transistors for an n-Input Nor Gate    n+1     n+3     2n+2    2n+4









































































Figure 2.17: HSpice simulations of the current drawn from VDD and GND of various
inverters running at 500MHz
Key: IN denotes the input to the inverters
I(V100) denotes the current drawn from VDD






2.4 Logic Design Methodology
After optimising and comparing the four logic families, a "mixed" logic design
methodology is proposed: In this design approach, DCFL is used extensively in local logic
operations where the amount of fan-in, fan-out and the interconnect length are small. This
improves the overall chip performance by reducing the chip area (due to its simplicity) and
the power consumption (due to its superb power-delay product). In critical paths and/or
higher fan-in applications, SDCFL should be used to achieve a higher speed and noise
margins. Due to the complexity of SBFL and UBFL in realising nor structures, they are used
strictly as buffers. SBFL should only be used for buffering global signals (such as clocks
and resets) due to its high power consumption and the amount of noise injection to the power
buses. UBFL is best suited for buffering parallel buses (such as address and data bus) since









Chapter 3. System Specifications
This chapter describes the overall system specifications of the switch fabric. Issues
such as the cell format and the architecture of the switch will be discussed. All designs
comply with the CCITT standard for interfacing outside the switch. However, modification
to the cell format is allowed within the switch to simplify the hardware implementation.
3.1 The ATM Switch Block
Figure 3.1 shows the typical structure of an ATM switch. It consists of input and
output line units, a higher level control block and the switch fabric. The line units interface
between the external transmission links (e.g. optical fibre) and the switch fabric. They
perform both physical and ATM layer functions such as bit synchronisation, cell delineation,
Header Error Control (HEC) checking and generation, policing and label translation.
The higher level control block monitors the traffic load on the switch fabric. This
information is used internally to divert cells away from congested switch nodes and
externally to minimise call set-up blocking by diverting cells away from congested switch
blocks.
The switch fabric uses routing information contained in the cell header to route
incoming cells to the appropriate output(s). Besides normal cell routing, the switch fabric
also supports cell broadcast (where an incoming cell is copied to all outputs of the switch)
and multicast (where the incoming cell is copied to a group of outputs). Multicast is treated

























Figure 3.1: Architecture of an ATM switch block
3.2 Cell Format
The format of an ATM cell is specified in [I.361] and is shown in Figure 3.2 at the
user network interface (UNI). It has a fixed length of 53 bytes and consists of a 48-byte
information field (payload) and a 5-byte header. The header consists of seven different fields:
. Generic Flow Control (GFC):
This field is 4 bits long and is used to monitor the amount of data entering the
network to ensure that known capacities are not exceeded.
. Virtual Path Identifier (VPI):
This field is 8 bits long and provides an explicit path identification for the cell.
. Virtual Channel Identifier (VCI):






















. Payload Type (PT):
This 2-bit long field defines the nature of the cell. It has a default value of "00" for
user information. However, the value for network control purposes is not yet defined.
. Reserved (RES) Bit:
Reserved for future use and is set by default to "0".
. Cell Loss Priority (CLP) Bit:
It indicates the cell discard loss priority. It is set to "0" for high priority cells and "1"
for low priority cells.
. Header Error Control (HEC) field:
This field is eight bits long. It contains the coefficient of a cyclic redundancy check















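The seven header fields can be unpacked from the 5-byte header with a few shifts. The sketch below is illustrative only: the 16-bit VCI width is taken from the standard UNI cell format rather than from the text above, the fields are assumed to be packed most significant bit first in the order listed, and the names `UNI_FIELDS` and `parse_uni_header` are made up for this example.

```python
# (name, width in bits), in transmission order; the widths sum to 40 bits.
UNI_FIELDS = [
    ("GFC", 4), ("VPI", 8), ("VCI", 16),   # VCI width assumed from the UNI format
    ("PT", 2), ("RES", 1), ("CLP", 1), ("HEC", 8),
]

def parse_uni_header(header: bytes) -> dict:
    """Split a 5-byte UNI cell header into its named fields."""
    if len(header) != 5:
        raise ValueError("an ATM cell header is exactly 5 bytes long")
    word = int.from_bytes(header, "big")
    fields = {}
    shift = 40
    for name, width in UNI_FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields

if __name__ == "__main__":
    # Example header: VPI = 1, VCI = 5, CLP = 1, all other fields zero.
    print(parse_uni_header(bytes.fromhex("0010005100")))
```

The HEC byte is returned as-is; checking it would require the CRC procedure defined in the standard, which is outside this sketch.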
The transmission rate for ATM is specified in [I.121] to be either 155Mb/s or
622Mb/s. However, this information can be transferred in a parallel format instead of bit-serially
to simplify synchronisation and to reduce the clock frequency. In this project, the
ATM cells are carried by eight parallel lines (as described in [JTSKS93], to maintain
compatibility with the BATMAN¹ project) and extend in time over 53 internal clock cycles.
Similar to BATMAN, an internal header line (IH) is appended to simplify the hardware
implementation. It contains all the necessary routing information about the cell as required by
the switch fabric and is discussed below:
. Start Bit (S):
This is the first bit in the internal header and is always set to "1" to indicate the
beginning of a valid cell. For idle cells, the entire internal header line is set to zero. This
simplifies the hardware implementation as no separate cell clock is required.
. Priority Bit (P):
This is simply a copy of the cell loss priority (CLP) bit in the cell header.
. Broadcast Bit (B):
Indicates whether a cell is to be broadcast to all outputs.
. Address Field (24 bits):
This field contains the routing information of the cell within the switch fabric.
. Reserved Field (26 bits):
Reserved for future use and is set to "0" here.
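The internal header line carries one bit per clock cycle, so its 53 bits can be laid out as a simple list. The sketch below is a hedged illustration: the field order follows the list above, but sending the address with A0 first is an assumption, and the function names are hypothetical.

```python
CELL_CYCLES = 53      # one internal header bit per clock cycle of the cell
ADDRESS_BITS = 24
RESERVED_BITS = 26

def internal_header(priority, broadcast, address):
    """Bits of the internal header line (IH) for a valid cell, first bit first."""
    if not 0 <= address < (1 << ADDRESS_BITS):
        raise ValueError("address must fit in 24 bits")
    bits = [1, int(bool(priority)), int(bool(broadcast))]        # S, P, B
    bits += [(address >> i) & 1 for i in range(ADDRESS_BITS)]    # A0 first (assumed)
    bits += [0] * RESERVED_BITS                                  # reserved, set to 0
    return bits

def idle_header():
    """An idle cell keeps the whole internal header line at zero."""
    return [0] * CELL_CYCLES

if __name__ == "__main__":
    header = internal_header(priority=1, broadcast=0, address=0b101)
    print(len(header), header[:6])
```

Since 1 + 1 + 1 + 24 + 26 = 53, the header line spans exactly the 53 clock cycles of one cell, and a receiver only needs to watch for the leading "1" to find the cell boundary.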
1 Broadband ATM Access Network. It is a 155Mb/s ATM switch technology demonstrator previously
developed at Jydsk Telefon, Denmark.
The format of an internal ATM cell is shown in Figure 3.3. Note that the switch












Figure 3.3: Cell format within the switch fabric
3.3 The Switch Fabric
The switch fabric is realised as an "MSD" switch, that is, it incorporates Multiplexers
at the input, a Space switch at the centre stage and Demultiplexers at the output. The space
switch is realised as a Benes network [JTSYB93], [BENES]. An N x N switch utilising the
above architecture is shown in Figure 3.4 for a line speed (Bt) of 622Mb/s. Notice that the
line speed is increased to effectively 2.4Gb/s² before entering the Benes network. This is
known as the bit rate conversion technique, reported in [JAKOBSEN93], which reduces the call
set-up blocking probability, the delay, the number of switching elements and hence the total
power consumption of the switch. To realise a 155Mb/s switch, the configuration shown
in Figure 3.5 should be used, where the input line speed is increased from 155Mb/s to
2.4Gb/s before entering the Benes network for the same reason.
2 The maximum data rate is determined by the technology used for the switch realisation. In this case, it is
2.4Gb/s since the ATM cell is carried by eight parallel lines each having a maximum frequency of 600MHz.
The issue of call set-up blocking probability in a Benes network is reported in
[JAKOBSEN93] and is given by the following expression:
loBN _l

















the call set-up blocking probability
the switch size
the size of the switching elements (M=2 in this case)
the number of calls placed at the external lines and is given by
the expression: LE = BE/BC
the maximum number of calls at any internal line and is given
by the expression: LI = BI/BC
the maximum total load on any external input or output line
the maximum total load on any intemal line
the maximum equivalent load of a call allowable





Figure 3.4: An N x N 622Mb/s ATM switch fabric realised by 2 x 2 switching elements in a
Benes network configuration















Figure 3.5: An N x N 155Mb/s ATM switch fabric realised by 2 x 2 switching elements in a






The performance desired [JTSKS93] for the switch fabric is summarised in Table
3.1. Note that this is not yet a strict requirement due to the experimental status of the project.
Performance Criteria     Performance Requirement
Maximum Size             1024 x 1024
Maximum Delay            250µs
Cell Loss Probability    Less than 10^-8
Call Set-Up Blocking     Less than 1% at all load conditions
Traffic Load             60% due to the experimental status of this project
                         (Could be increased to 80% by increasing the buffer size)
Average Bit Rate         Not to exceed 2% relative to 622Mb/s
Traffic Burstiness       Maximum burst bit rate of 10% relative to 622Mb/s
Power Consumption        Less than 10kW for a 1024 x 1024 622Mb/s switch
Table 3.1: Performance required for the switch fabric
3.4 Realising the Switch Fabric
The entire switch fabric is designed such that it requires only three different chip-sets
[JTSYB93]. They consist of a Buffer chip, a Router chip and a Multiplexer chip which are
described in sections 3.4.1 to 3.4.3. The same external data interfaces are used for the three












Figure 3.6: The data interface common to all three chips
3.4.1 The Buffer Chip
The buffer chip receives a single input cell stream and sends it out to the output at half
the input speed. Consequently, the input data rate (2 Bt or a maximum of 4.8Gb/s for a
622Mb/s switch) is twice that of the output (Bt or a maximum of 2.4Gb/s for a 622Mb/s





has a size of 32 cells³. The buffer size was chosen according to [TOBAGI] based on the
maximum tolerable cell loss probability of 10^-8 and a traffic load of 70% (60% at the input
multiplied by a speed up factor of 1.2). Figure 3.7 shows the interface of the buffer chip.
In realising this chip, an isochronous clocking strategy is used where the input
section uses the clock from the input data interface and the external clock is used for the other
synchronous parts. Signal communications between sections running at different speeds are
handled asynchronously. The main reason for using an isochronous clocking strategy lies in
the different operational speeds at the input and output. Furthermore, problems with clock
skew and clock distribution are major concerns for a large chip.
The anticipated power dissipation of the buffer chip is 1W in order to meet the power
requirement of 10kW for a 1024 x 1024 622Mb/s switch (see Appendix E for detailed
calculations). Consequently, the on-chip 32-cell internal buffer will have to be realised by









Figure 3.7: Interfacing the buffer chip
(Note that the external clock input has half the frequency of the input data interface)
3.4.2 The Router Chip
The router chip receives cells at the input and sends them out to either one or both
outputs depending on the routing information contained in the internal header (IH) and the
broadcast enable (BE) pin on the chip. Figure 3.8 shows the interface of the router chip.
Note that there is no cell loss in this chip and the maximum data rate for both the input and
3 The buffer size was calculated by Mr. Jens Jakobsen at Jydsk Telefon, Denmark.











Figure 3.8: Interfacing the router chip
(Note that the broadcast enable input determines whether the chip can broadcast cells)
Due to the fact that both the input and outputs run at the same clock speed,
isochronous clocking is not necessary and the clock is simply driven from the input data
interface.
The anticipated power dissipation of the router chip is 0.7W to meet the power
requirement of 10kW for a 1024 x 1024 622Mb/s switch (see Appendix E for detailed
calculations).
3.4.3 The Multiplexer Chip
The multiplexer chip receives two input cell streams and multiplexes them
into a single cell stream at the output at twice the input speed. Hence, the maximum data rates at
the input and output ports are 2.4Gb/s (eight lines at 300MHz each) and 4.8Gb/s (eight lines
at 600MHz each) respectively. It requires a  -cell internal buffer for temporary storage so that
















Figure 3.9: Interfacing the multiplexer chip
(Note that the external clock input runs at twice the frequency of the input data interface)
Similar to the buffer chip, an isochronous clocking strategy is used since the inputs and
output operate at different speeds. Moreover, it simplifies the problem with clock distribution
and clock skew.
The anticipated power dissipation of the multiplexer chip is 0.8W to meet the power
requirement of 10kW for a 1024 x 1024 622Mbls switch (see Appendix E for detailed
calculations).
After defining the basic elements, higher level building blocks can be constructed to
realise the switch fabric. Three basic building blocks are: the 2 x 2 switch, input multiplexers
and output de-multiplexers which are described in sections 3.4.4 to 3.4.6 respectively.
3.4.4 The 2 x 2 Switch
A 2 x 2 switch can be constructed from the three elements as shown in Figure 3.10. The
two incoming cell streams, each running at 2.4Gb/s, are initially multiplexed at double speed
(4.8Gb/s) by a multiplexer chip. The multiplexed cell stream is then routed to the appropriate
output by a router chip and slowed down by a factor of two by buffer chips at each output.













Figure 3.10: Realising a 2 x 2 output buffered switch by the three elements
3.4.5 Input Multiplexers
The input multiplexers for the 622Mb/s switch are 4-to-1. They can be realised by
three multiplexer chips as shown in Figure 3.11(a). For the 155Mb/s switch fabric, the input
multiplexers are 16-to-1 and can be realised by five 4-to-1 multiplexers as shown in Figure
3.11(b). The maximum bit rate on the outputs is 2.4Gb/s in both cases.
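The chip counts quoted above follow from building each multiplexer as a tree of smaller ones. The following Python sketch is only a rough check of that arithmetic; the function name is illustrative, and it assumes the fan-in is an exact power of the element fan-in.

```python
def elements_in_mux_tree(fan_in: int, element_fan_in: int = 2) -> int:
    """Count the element_fan_in-to-1 elements needed for a fan_in-to-1 tree.

    Assumption: fan_in is an exact power of element_fan_in.
    """
    elements = 0
    streams = fan_in
    while streams > 1:
        if streams % element_fan_in:
            raise ValueError("fan_in must be a power of element_fan_in")
        streams //= element_fan_in   # one tree level merges the streams
        elements += streams          # one element per merged stream
    return elements


# 4-to-1 from 2-to-1 multiplexer chips, and 16-to-1 from 4-to-1 multiplexers.
chips_622 = elements_in_mux_tree(4, 2)
muxes_155 = elements_in_mux_tree(16, 4)
print(chips_622, muxes_155, muxes_155 * chips_622)
```

Under these assumptions a 16-to-1 input multiplexer uses five 4-to-1 multiplexers of three chips each, that is fifteen multiplexer chips in total.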




















The output demultiplexers for the 622Mb/s switch are 1-to-4 and can be realised by
three router chips and six buffer chips as shown in Figure 3.12(a) whereas for the 155Mb/s
switch, they are 1-to-16 and can be realised by five 1-to-4 demultiplexers as shown in Figure
3.12(b). The maximum bit rate on the inputs is 2.4Gb/s in both cases.









Figure 3.12(a): Output demultiplexer for the 622Mb/s switch
















Chapter 4. Layout Style and Design Tools
This chapter describes the layout style used throughout the control logic within the
buffer chip. Various computer aided design (CAD) tools used in the design phase are also
introduced.
4.1 Layout Style
The layout style has a major impact on the performance of very high speed VLSI
circuits. Some key objectives in selecting a layout style are to:
. Minimise interconnect lengths to reduce parasitic capacitances, and hence the coupling between
high speed signals.
. Reduce the inductance and increase the capacitance associated with the power busses to
reduce voltage and current spikes.
. Achieve a high packing density.
The "Ring Notation" as reported in [ESH,S,C&N], [ESHRAGHIAN] and
[ESHRAGHIAN92] is used explicitly to achieve these goals. It is a generic term given to a
free form topological symbolic layout in which graphical symbols are placed relative to each
other rather than in an absolute manner. Figure 4.1 illustrates the "Ring Notation" layout
style for a DCFL inverter where the VDD and ground busses are placed in parallel and in
close proximity to one another to increase the decoupling capacitance. Transistors are placed
below the ground bus to reduce noise injection because the AC currents in the ground bus are
smaller, and to isolate the transistors from the effects of the larger AC currents in the power












Figure 4.1: (a) Circuit diagram
(b) Ring notation representation and
(c) Symbolic notation of a DCFL inverter
It is reported in [ESH,C,M&C] that the noise injection to the power busses can be
further reduced by decreasing the characteristic impedance of the lines. This can be achieved
by using a grid-like connection for the VDD and ground busses as shown in Figure 4.2. It is
shown in [ESH,C,M&C] that the noise amplitude can be reduced to one third of the original
after arriving at the first cross-point. From a circuit's point of view, this translates to reduced
inductance and resistance as well as increased capacitance associated with the power busses












Figure 4.2: Using net-like power lines to reduce the noise injected into the power buses
Note: Triangle denotes an inverter
Vertical lines denote global power lines
Horizontal lines denote local power lines
4.2 Design Methodology
The design phase can be divided into three main stages: the specification, the
realisation and the verification stage.
In the specification stage, a VHDL behavioural description is written for the design
to describe its functionality. It is then simulated in ASIMUT (a circuit simulation tool within
the ALLIANCE package) with a predefined set of input stimuli to verify its behaviour.















In the realisation stage, a VHDL structural description is first written for the design
based on standard library cells (see Chapter 5) which is then simulated and compared with
the behavioural description to achieve a one-to-one correspondence. The structural
description for the circuit is then laid-out in MAGIC5 with standard cells.
The verification stage is carried out in three phases:
. Netlist Verification
The laid-out circuit is first converted to a spice netlist by the program ext2sp which
is then converted into the VALID6 format by the program magic2valid. This netlist is then
compared by the program NETCOMPARE with a similar one generated from the VHDL
structural description to ensure both circuits match by containing the same types and numbers
of transistors and connections.
. Functional Verification
Two levels of simulation are used. In the first phase, the layout is extracted and
converted to a sim format by the program ext2sim. IRSIM, a switch level simulator within
MAGIC, is then used to verify its connectivity and functionality. The merit of using
IRSIM lies in its ability to check any undefined inputs and outputs which ASIMUT fails to
handle.
In the second phase, a full HSPICE simulation is carried out after the extracted
circuit is converted to spice format by the program ext2sp to verify the speed requirement as
well as to provide an estimate of the circuit's power consumption. The relative merits of
the various simulation tools are summarised in Table 4.1.
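The netlist verification step checks that layout and schematic contain the same devices and connections. The Python sketch below illustrates the idea with a name-independent signature; it is not the algorithm used by NETCOMPARE, the data format and function names are hypothetical, and a matching signature is only a necessary condition for equivalence.

```python
from collections import Counter


def netlist_signature(devices):
    """Summarise a netlist without relying on net names.

    devices: list of (device_type, (net, net, ...)) tuples (assumed format).
    Returns the device count per type and the sorted list of how many
    device terminals attach to each net.
    """
    type_counts = Counter(dev_type for dev_type, _ in devices)
    terminals_per_net = Counter(net for _, nets in devices for net in nets)
    return type_counts, sorted(terminals_per_net.values())


def netlists_plausibly_match(a, b):
    """True when both netlists have identical signatures."""
    return netlist_signature(a) == netlist_signature(b)


# DCFL inverter: depletion load (gate tied to source) plus enhancement driver.
layout = [("dfet", ("vdd", "n1", "n1")), ("efet", ("n1", "a", "gnd"))]
schematic = [("efet", ("out", "in", "vss")), ("dfet", ("vcc", "out", "out"))]
print(netlists_plausibly_match(layout, schematic))
```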
MAGIC is a full-custom layout tool by University of California at Berkeley.
VALID is a schematic layout and simulation tool by CADENCE Software.




































































Panels: estimated AR power density spectra against normalised frequency in cps
FIGURE 4.5 Comparison of AR PSDs between Yule Walker method and Recursive
method at a low frequency 0.1 cps with h = 0.6 * h-max and SNR = 20 dB after 1

















































































































































This chapter describes the basic primitives used throughout the buffer chip. It is also
intended to provide a complete library for the VHDL descriptions of the lowest level cells
[CHU1]. All sizes shown in this chapter are drawn sizes only; the actual gate lengths are
0.4µm shorter.
5.1 Logic Gates
All four logic families as described in Chapter 2 are used in the construction of the
buffer chip. Due to the complexity of SBFL and UBFL, they are used strictly as buffers.
Moreover, all primitives are given unique names which are consistent with the instance
names used in the VHDL structural descriptions.
5.1.1 Inverters
Inverters from all four logic families were used according to the output load, speed
and noise injection requirements (see Chapter 2). They were optimised for high temperature
operation (125°C) and a speed of approximately 100ps at one fan-out. The sizing of each











































































5.1.2 2-Input Nor Gates





























5.1.3 3-Input Nor Gates
As for the case of 2-input nor gates, only DCFL and SDCFL 3-input nor gates are










Figure 5.8: 3-Input DCFL nor gate "o3nd"


























5.1.4 4-Input Nor Gates









Figure 5.10: 4-Input SDCFL nor gate "o4ns"
5.1.5 Or Gates
"Or" gates are realised by appending an inverter to the output of nor gates. Two

























Four different types of RS flip-flops are used in the construction of the buffer chip
with their circuit diagrams shown in Figure 5.13 to Figure 5.16. They are chosen due to the






Figure 5.13: Circuit diagram for RS flip-flop "rrsds"
Figure 5.14: Circuit diagram for RS flip-flop "rrssd"





Figure 5.16: Circuit diagram for primitives "rssd"




. Q defaults "Lo" and Q* defaults "Hi" in VHDL behavioural description
. Both Q, Q* outputs can drive an extra fan-out or a 20fF
rsdd
fan-outs or a 40fF capacitive load
fan-out or a 20fF capacitive load
rrssd
. RS flip-flop with reset
. Q output cannot be used




Five different data latches with instance name "j" are used in the construction of the
buffer chip. Table 5.2 shows the module names in the VHDL structural descriptions together
with their differences in application regarding output buffering and the need for preset/clear.
Their individual circuit diagrams are shown in Figure 5.17 to Figure 5.21 respectively.




Figure 5.17: Circuit diagram for data latch "jdd"
. Data latch requires manual preset on start-up
. Q output can drive an extra fan-out or a 20fF capacitive load
. Q* output cannot be used
jpsd
clear. Q defaults "Lo" and Q* defaults "Hi" in VHDL behavioural description
. Q output can drive another 4 fan-outs or
. Q* output can drive an extra fan-out or a








. Q defaults "Hi", Q* defaults "Lo" in VHDL behavioural description
. Q output can drive an extra fan-out or a 20fF load






. Q output cannot be used
can drive an extra fan-out or a 20fF load
jcdd
. Data latch with no preset/clear
. Q defaults "Lo" and Q* defaults "Hi" in VHDL behavioural description
. Both Q, Q* outputs can drive an































Figure 5.21: Circuit diagram for data latch "jpsd"
5.4 Pass Transistors
Four different types of pass transistor arrangements are used in the construction of D
flip-flops within the buffer chip. More specifically, double depletion type transistors are
utilised to allow the use of a two-phase non-overlapping clock to simplify the D flip-flop
designs. All pass transistors in this section are given the instance name "p" in the VHDL
structural descriptions. Due to the different needs for clamping diodes at the inputs and the
strength of the input signals, four variations of the pass transistors are presented. Their
circuit diagrams are shown in Figure 5.22 to Figure 5.25 respectively. Note that all diodes
are constructed by connecting the source and the drain of the DFETs together.
Figure 5.22 shows the configuration for a double pass transistor with clamping
diodes at both inputs. The clamping diode is used to limit the input voltage swing from two
diode drops to a single one so that the output swing is also clamped. The reason for a
clamped output is to increase the speed of the logic at the following stage. This arrangement








Figure 5.22: Circuit diagram for double pass transistors "pcc" with clamping diodes at both
inputs
Figure 5.23 shows the configuration for a double pass transistor with a clamping
diode at the input i1. This arrangement is used when input i0 is already clamped or driven by















Figure 5.23: Circuit diagram for double pass transistors "pc" with one clamping diode
Figure 5.24 shows the configuration for a double pass transistor with no clamping





Figure 5.24: Circuit diagram for double pass transistors "p" with no clamping diode
Figure 5.25 shows the configuration for a double pass transistor with no clamping
diode at either input. Note that this arrangement is the same as the one in Figure 5.24 except
that it is only used when both inputs are driven by double sized UBFL gates. This ensures the











Seven different D flip-flops are used due to various output load, power and speed
requirements. They are given the instance name "l" for normal D flip-flops and "f" for
differential ones. Clamping diodes are included to limit the voltage swing to DCFL level in
order to increase speed and reduce noise injection to the power busses. All diodes are
constructed by connecting the source and the drain of the DFETs together as illustrated in
Section 5.4.
Figure 5.26 shows a D flip-flop realised by pure DCFL gates. This configuration
should only be used in situations where both Q and Q* outputs have less than three fan-outs












In situations where a stronger output driving capability is only required at the output
Q, the configuration as shown in Figure 5.27 should be used where Q is driven by an
SDCFL inverter. This increases the maximum fan-out at Q to four or an equivalent capacitive






Figure 5.27: Circuit diagram for D flip-flop "lddsd"
If a strong output driving capability at both outputs Q and Q* is required, the
configuration as shown in Figure 5.28 should be used where both outputs are driven by
SDCFL inverters. This increases the maximum fan-out for both outputs to four or an








Figure 5.28: Circuit diagram for D flip-flop "lddss"
In situations where there is a long interconnect line or a large fan-out (e.g. five) at the
output Q, the configuration as shown in Figure 5.29 can be used where Q is driven by a
UBFL inverter. This increases the maximum fan-out at Q to six or an equivalent capacitive
load of 100fF while the maximum load at output Q* remains that of a DCFL inverter.
Figure 5.29: Circuit diagram for D flip-flop "lsdud"
If there are long interconnect lines or large fan-outs at both outputs (Q and Q*), the
configuration as shown in Figure 5.30 can be used where both outputs are driven by UBFL















If speed and the skew in the outputs (Q and Q*) are of major concern, then the
configuration as shown in Figure 5.31 is best suited. It is a differential D flip-flop so there is
little or no skew between the two outputs, and they are driven by SBFL inverters. This is the fastest
arrangement available without sizing the gates to a larger ratio. Both outputs can drive over
seven fan-outs or an equivalent capacitive load of 160fF. However, it is associated with a
large power dissipation which limits its use. Nevertheless, it is useful in critical paths





Figure 5.31: Circuit diagram for differential D flip-flop "fddvv"
For synchronous signals that have an extremely long interconnect (such as the ones
between different functional blocks which could be as long as 1mm) and a large fan-out (e.g.
100), the output driving capability of normal sized logic gates will be insufficient. In this
case, triple sized SBFL inverters are used at the outputs (Q and Q*). Consequently, the pass
transistors will have to be triple sized to avoid leakage due to the effect of backgating. This
configuration is illustrated in Figure 5.32. Note that the first stage of the D flip-flop is
composed of two normal sized SBFL inverters to minimise the noise injection to VDD as



















Figure 5.32: Circuit diagram for differential D flip-flop "fvv3v3v"
To drive an exceptionally large output load (such as clocks and reset lines), the
buffering strategy as described in [ESHRAGHIAN] can be employed. This is illustrated in
Figure 5.33 where all SBFL inverters are driven by a half-sized one in the preceding stage.
The number on top of the inverter indicates the ratio to the previous one. Although it has been
shown in [ESHRAGHIAN] that a ratio of e provides the minimum delay, it does not













Four different types of multiplexers are described with instance name "mux":
Figure 5.34 shows the simplest configuration for a multiplexer whose truth table is
shown in Table 5.3. It utilises only 2-input DCFL nor gates and hence the output current




















Figure 5.34: Circuit diagram for multiplexer "mux"
If reset is required, the configuration as shown in Figure 5.35 can be used. The truth
table of the circuit is shown in Table 5.4. Note that the reset signal on the last DCFL nor gate
is active high. Since the output is driven by a DCFL 3-input nor gate, it can only support one









Figure 5.35: Circuit diagram for multiplexer "muxr"
Table 5.4: Truth table of the multiplexer "muxr"
In situations where a stronger output driving capability is required, the circuit as
shown in Figure 5.36 can be used. The only difference between this circuit and the one
presented in Figure 5.35 is that the output of this circuit is driven by an SDCFL 3-input nor


























If an active high preset is required, the circuit as shown in Figure 5.37 can be used.
The truth table of the circuit is shown in Table 5.5. Note that only DCFL gates are used in its
construction. Consequently the output can only drive two fan-outs or an equivalent capacitive
load of 20fF.
Figure 5.37: Circuit diagram for multiplexer "muxs"


















Chapter 6. Implementing the Control Logic
This chapter describes the design details of the control logic within the buffer chip
which can be divided into three main functional blocks: the input control, the buffer
manager and the output control. Other modules of the chip were designed by Mr. Jens
Jakobsen at Jydsk Telefon, Denmark. The floor plan of the entire chip is shown in Figure
6.1.
Figure 6.1: Floor plan of the buffer chip
Throughout the whole chip, two-phase non-overlapping clocks are used (which are
derived from the input single-phase clocks by level shifting and complementing) to simplify
the D flip-flop designs (see section 5.5). The two-phase clocks have 0V for logic high and
-1.5V for logic low to ensure the DFETs, which are used as pass transistors, can be switched
off properly. As discussed in section 3.4.1, an isochronous clocking strategy is used to
simplify the clock distribution and minimise the clock skew. More specifically, the input
serial-to-parallel (S/P) converters and the input control use the two-phase clock derived from
the input data interface while the buffer manager, the output control and the output parallel-to-
serial (P/S) converters use the two-phase clock derived from the external input clock (which




























the buffer manager as well as that between the buffer manager and the output control are done
asynchronously. That is, all incoming signals must first pass through two D flip-flops to
get synchronised with the local clock as described in [CMP] to avoid meta-stability.
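The two flip-flop synchroniser mentioned above can be sketched behaviourally as follows. This Python model is illustrative only (the class name is hypothetical): it shows the sampling delay seen by the receiving clock domain and does not model metastability itself.

```python
class TwoStageSynchroniser:
    """Two D flip-flops in series, both clocked by the local clock."""

    def __init__(self):
        self.ff1 = 0  # first stage, samples the asynchronous input
        self.ff2 = 0  # second stage, drives the local logic

    def clock(self, async_in: int) -> int:
        """Apply one local clock edge and return the synchronised output."""
        # Both stages update on the same edge, so ff2 takes the old ff1.
        self.ff2, self.ff1 = self.ff1, async_in
        return self.ff2


sync = TwoStageSynchroniser()
outputs = [sync.clock(bit) for bit in [0, 1, 1, 1, 0, 0]]
print(outputs)
```

In this model the output on a given local clock edge is the value that was sampled on the previous edge, which gives the first stage one full period to settle.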
As mentioned in Chapter 3, the input sections of the buffer chip have to run at
600MHz to match the input data rate of 4.8Gb/s. However, the output sections will only
need to run at 300MHz as the output data rate is only 2.4Gb/s. This implies the input control
must operate at 600MHz to generate proper control signals to the input serial-to-parallel
converters whereas the output control will only need to operate at 300MHz to control the
output parallel-to-serial converters. The buffer manager, on the other hand, can be designed
to operate at either 600MHz or 300MHz as it needs to interface with both the input and output
control modules. It is decided in [JTFEB93] that the buffer manager should operate at
300MHz to enable more flexibility in the hardware design process. The detailed designs of the
three modules are presented in sections 6.1, 6.2 and 6.3 respectively.
6.1 The Input Control
The input control detects the arrival of non-idle cells and sends out input block
request signals (IBR0 or IBR1) to the buffer manager according to their cell priorities
[CHU2]. Depending upon the decision of the buffer manager, the request will either be
granted (which will be accompanied by a 5-bit block address) or not granted. If the request is
granted, the input control will generate the full 7-bit write address (WA0..6) and a write
enable (WE) signal to the dynamic memory (DRAM) together with an input convert (IC)
signal to the input serial-to-parallel (S/P) converters at the appropriate moments (see later).
Otherwise, no action will be taken and the incoming cell is discarded. Figure 6.2 shows the
interface of the input control and Table 6.1 provides an indication of the signals' function.











Figure 6.2: Interfacing the input control
In order to achieve a lower cell loss probability for high priority cells, a two-level
priority scheme is used such that when the DRAM is at least three-quarters full
(which corresponds to 24 cells), any further incoming low priority cells will be discarded
whereas high priority cells are discarded only when the DRAM is full.
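The two-level priority rule can be written as a small admission function. The Python sketch below is a behavioural illustration only; the constant and function names are hypothetical, and it assumes that "full" means all cell locations of the 32-cell memory are occupied.

```python
CAPACITY_CELLS = 32       # assumed: 24 cells is three-quarters of the DRAM
LOW_PRIORITY_LIMIT = 24   # three-quarters full threshold from the text


def accept_cell(stored_cells: int, high_priority: bool) -> bool:
    """Return True if an incoming cell is stored, False if it is discarded."""
    if stored_cells >= CAPACITY_CELLS:
        return False                      # DRAM full: discard everything
    if high_priority:
        return True                       # high priority only loses when full
    return stored_cells < LOW_PRIORITY_LIMIT


for occupancy in (0, 23, 24, 31, 32):
    print(occupancy, accept_cell(occupancy, True), accept_cell(occupancy, False))
```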
Signal Name                      Description
Internal Header (IH)             Internal header of the incoming cell
Reset                            Resets the entire input control module
Input Block Request 0 (IBR0)     Request write permission for a high priority cell
                                 from the buffer manager
Input Block Request 1 (IBR1)     Request write permission for a low priority cell
                                 from the buffer manager
Input Block Granted (IBG)        Granting write permission for the incoming cell
Input Block Not Granted (IBNG)   Not granting write permission for the incoming cell
                                 The latched top 5-bit write address
Write Address (WA0..6)           The full 7-bit write address to memory
Input Convert (IC)               Instruct the input S/P converters to load their contents
                                 to the DRAM
Write Enable (WE)                Enabling memory write sequence
Table 6.1: Description of interfacing signals in the input control
In order to maximise memory access time and to preserve a reasonable chip
geometry, it is desirable to split the incoming cells into sub-cells. For a 32-cell memory, a
four sub-cell configuration of 14 bits each would result in a memory size of 126 bits by 128
words. Consequently, there are 56 DRAM cells assigned for each parallel line and since there
are only 53 bits of valid data available, the extra three DRAM cells are filled by storing some
bits twice. In this case, it is decided in [JTFEB93] that the first sub-cell (which corresponds
to the 1st to the 14th bit on one incoming line of the cell) contains no redundant bit to ensure
there is sufficient time to process the priority information while the other three sub-cells
(which correspond to the 14th to the 27th, the 27th to the 40th, and the 40th to the 53rd bits
respectively) each contain one extra bit from the previous sub-cell. Consequently, the 14th,
27th and the 40th bit in each parallel line are stored twice in the DRAM. Note that special
signalling pulses (Q11, Q26, Q39, and Q52) are required to indicate the end of sub-cells.
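The sub-cell boundaries described above can be reproduced with a few lines of Python. This is only a check of the bit arithmetic for one parallel line; the names are illustrative, and bits are numbered from 1 as in the text.

```python
BITS_PER_LINE = 53   # valid bits per parallel line of one cell
SUB_CELLS = 4
SUB_CELL_BITS = 14   # DRAM cells per sub-cell on one line


def sub_cell_ranges():
    """Return (first_bit, last_bit) for each sub-cell, 1-based and inclusive."""
    ranges = []
    start = 1
    for _ in range(SUB_CELLS):
        end = start + SUB_CELL_BITS - 1
        ranges.append((start, end))
        start = end   # the next sub-cell repeats the last bit of this one
    return ranges


ranges = sub_cell_ranges()
stored_twice = [end for _, end in ranges[:-1]]
dram_cells_per_line = SUB_CELLS * SUB_CELL_BITS
print(ranges, stored_twice, dram_cells_per_line)
```

The last sub-cell ends exactly on bit 53, and the three repeated bits account for the difference between the 56 DRAM cells and the 53 valid bits.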




Figure 6.3: Flow chart summarising the functionality of the input control
The input control can be realised in five main modules as shown in Figure 6.4. Table
6.2 describes the functionality of each module.
Sends out IC and WE signals





No action is taken and








































Figure 6.4: Block diagram representation of the input control




ite addresscount)z-Bit Counter i
input convert signal to the input S/P converters






Pulse Generation (icpg)
. Generates inputs for the 2-bit counter as well as appropriate
control signals for the input S/P converters and memory write
process




the end of sub-cells




6.1.1 The Cell Arrival Extraction Module
The core of the cell arrival extraction module is a 6-bit pseudo random counter
[JTMOB93] which is chosen for its simplicity and high speed operation. Pipelined decoders
are included to indicate the end of sub-cells which are used to control the input S/P converters
as well as the memory write process. The schematic of the module is shown in Figure 6.5
and the resulting state transition table is shown in Table 6.3. A timing diagram of the cell arrival
extraction module is shown in Figure 6.6. Note that the Start Pulse (SP) is only one clock


















































































































Table 6.3: State transition table for the pseudo random counter
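A pseudo random counter of this kind is commonly built as a linear feedback shift register (LFSR). The Python sketch below is a generic 6-bit example and not the circuit of Figure 6.5: the feedback polynomial x^6 + x + 1, the seed and the decoded positions are all assumptions chosen for illustration, and any pipelining offsets of the real decoders are not modelled.

```python
def lfsr6_states(seed: int = 0b000001):
    """Yield successive states of a 6-bit Fibonacci LFSR (assumed x^6 + x + 1)."""
    state = seed & 0x3F
    if state == 0:
        raise ValueError("the all-zero state locks up the LFSR")
    while True:
        yield state
        feedback = ((state >> 5) ^ (state >> 4)) & 1   # XOR of the two oldest bits
        state = ((state << 1) | feedback) & 0x3F


def decode_table(positions, length=53):
    """Map 0-based bit positions within a cell to the LFSR state to decode."""
    generator = lfsr6_states()
    states = [next(generator) for _ in range(length)]
    return {position: states[position] for position in positions}


# Illustrative positions: the last bit of each sub-cell (14th, 27th, 40th, 53rd).
for position, state in decode_table([13, 26, 39, 52]).items():
    print(position, format(state, "06b"))
```

A counter of this form needs only one XOR gate in the feedback path, which is consistent with the simplicity and speed argument above; the price is that the count sequence is not binary, so each timing pulse is obtained by decoding one specific state.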
6.1.2 The Block Requester
The block requester is realised as shown in Figure 6.7. It takes the internal header
(IH) together with the complement of the Start Pulse (SPb) as inputs to extract the cell
priority. The appropriate input block request is then sent out by an RS flip-flop which gets
reset on the arrival of either the input block granted (IBG) or the input block not granted
signal (IBNG) from the buffer manager. If the request is granted, the complement of the
block write signal (BWfb) is sent out from the 2-stage latch 11 clock cycles after the
arrival of the cell and remains low until the cell is fully written to the DRAM. Otherwise,
BWfb remains high. On the 15th clock cycle (Q15), the top 5-bit write address is latched
out to generate the current write address whether the input request is granted or not. A timing
diagram showing the operation of the block requester is shown in Figure 6.8.















































Figure 6.8: Timing diagram of the block requester
6.1.3 The Pulse Generation Module
The pulse generation module is simply a set of pipe-lined decoders. If an input block
request is granted, the module takes the four timing pulses (Q13, Q26, Q39 and Q52) from
the cell arrival extraction module together with the complement of the block write signal
(BWfb) from the block requester to generate four Write pulses to the memory control
module as well as three count-up pulses (Up) and the clear signal (Clear) to the 2-bit
counter to generate/update the lower 2-bit write address. On the other hand, if the input
request is not granted, the same inputs to the 2-bit counter will be generated but no Write
pulses are produced. The module can be implemented as shown in Figure 6.9 with the timing






















































6.1.4 The Memory Control Module
The memory control module receives a Write pulse one clock period long from
the pulse generation module and sends out an input convert signal (IC) to the S/P converters after one
clock period to load the parallel data out to the DRAM. It also generates a write enable signal
(WE) 11 clock cycles long, two clock periods after IC. It is realised as shown in Figure























Figure 6.12: Timing diagram of the memory control module
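The timing relationship between the Write pulse, IC and WE described in this section can be sketched cycle by cycle. The Python model below is a reading of the text, not of the circuit: it assumes IC is one clock period wide, and the function name is hypothetical.

```python
WE_LENGTH = 11   # clock cycles for which write enable stays asserted


def memory_control(write_pulses):
    """Return (ic, we) waveforms, one sample per clock, for a Write waveform.

    IC is asserted one clock after each Write pulse (assumed one clock wide);
    WE starts two clocks after IC and lasts WE_LENGTH clocks.
    """
    n = len(write_pulses)
    ic = [0] * n
    we = [0] * n
    for t, write in enumerate(write_pulses):
        if not write:
            continue
        if t + 1 < n:
            ic[t + 1] = 1
        for k in range(t + 3, min(t + 3 + WE_LENGTH, n)):
            we[k] = 1
    return ic, we


ic, we = memory_control([1] + [0] * 19)
print("IC", ic)
print("WE", we)
```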
6.1.5 The 2-Bit Counter
This counter is realised as a synchronous counter. However, due to the fact that it
will be cleared before getting to state 00, the logic can be simplified by mapping the next state
of 11 to XX. The circuit diagram of the counter is shown in Figure 6.13 and the resulting












Present State    Next State (Up=0)    Next State (Up=1)
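Treating the next state of 11 as a don't-care allows simpler next-state logic. The Python sketch below shows one possible simplification, in which the upper bit needs an OR gate rather than an XOR; it is not necessarily the logic of Figure 6.13, and a synchronous Clear input is assumed.

```python
def next_state(q1: int, q0: int, up: int, clear: int = 0):
    """Next state (d1, d0) of a 2-bit up counter with 11 -> XX when Up = 1."""
    if clear:
        return 0, 0                # assumed synchronous clear
    d1 = q1 | (up & q0)            # OR suffices because 11 + 1 is a don't-care
    d0 = q0 ^ up
    return d1, d0


for state in range(4):
    q1, q0 = state >> 1, state & 1
    print(f"{q1}{q0}", next_state(q1, q0, 0), next_state(q1, q0, 1))
```

With this choice the don't-care entry resolves to 10, which is harmless as long as the counter is always cleared after the third count-up pulse.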












Figure 6.14 shows the layout of the entire input control circuit in Magic to verify the
effectiveness of the layout style employed. In digital VLSI designs, it is customary to over-
design a circuit to operate at a higher speed to account for any unforeseen factors that might
cause the circuit to operate more slowly. In this case, a 50% over-design factor is chosen due to
the immaturity of the process employed. Hence, the entire input control is simulated at
900MHz. Furthermore, all output nodes are connected to a capacitive load of 1pF (which in
reality translates to a long interconnect line connected to other functional blocks)
except for the input convert signal (IC) which has a capacitive load of 200fF and a 30µm
wide EFET connected as external load (in reality this signal is directly connected to an
adjacent differential D flip-flop "fvv3v3v").
In the simulations, the first and the third cells are high priority cells whereas the
second cell has a low priority. It can be seen that the first two block requests are granted and
are each assigned a block address (IBA) and a block granted signal (IBG) by the buffer
manager whereas the last block request is not granted (IBNG asserted). The IRSIM and
HSpice simulation results are shown in Figure 6.15 and Figure 6.16 respectively.







































































































Figure 6.16(a): HSpice simulation of the input control at 900MHz
Note that all output nodes except IC are loaded with a 1pF capacitor
Graph 1 shows the system reset signal (SYSRST) and the two phase clock (PHI1,
PHI2)
Graph 2 shows the internal header (IH) and the start pulse (SP)
Graph 3 shows the four timing pulses Q11, Q26, Q39 and Q52 (Q39 and Q52 not
shown in the legend)
Graph 4 shows the two input block request signals IBR0 and IBR1 which are
denoted by BR0F and BR1F in the legend
Graph 5 shows the lowest 3-bit write address WA0, WA1 and WA2 (WA2 not
shown in the legend)
































































































Figure 6.16(b): Current drawn by the input control at 900MHz for the simulation in Figure
6.16(a)
It can be verified from Figure 6.16(b) that the logic families do in fact draw a fairly
constant current from VDD. The mean deviation of current drawn by the input circuit is
calculated to be 5.6%. This explains the fact that the low noise injected to the supply lines






6.2 The Buffer Manager
The buffer manager processes input block requests (IBR0, IBR1) from the input
control and output block requests (OBR) from the output control according to the status of
the DRAM [CHU3]. High priority input block requests (IBR0) are always granted unless
the DRAM is full. On the other hand, low priority input block requests (IBR1) are only
granted when the DRAM is less than three-quarters full (equivalent to 24 cells). This
ensures a lower cell loss probability is achieved for high priority cells. If an input request is
granted, both the Start Pointer (SP) and the Queue Size Register (QSR) (which are
responsible for generating the top 5-bit write address and keeping track of the number of
stored cells in the DRAM respectively) will increment. Otherwise, both SP and QSR remain
unchanged. Similarly, output block requests (OBR) are only granted if the DRAM is not
empty. If an output block request is granted, the End Pointer (EP) (which generates the top
5-bit read address) increments while the Queue Size Register (QSR) decrements to indicate
the DRAM has one more cell vacancy. A flow chart summarising the operation of the buffer
manager is shown in Figure 6.17. Note that the buffer manager is designed to handle
simultaneous arrival of both input and output block requests without getting into ambiguous
states.
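The bookkeeping described above can be summarised in a small behavioural model. The Python sketch below is an illustration under stated assumptions rather than the hardware design: requests are handled one at a time (simultaneous input and output requests are not modelled), the granted block address is taken to be the pointer value before it increments, and "full" is assumed to mean 32 stored cells. The class and method names are hypothetical.

```python
POINTER_MODULUS = 32   # 5-bit start and end pointers wrap around


class BufferManagerModel:
    """Behavioural model of the Start Pointer, End Pointer and queue size."""

    def __init__(self, capacity=32, low_priority_limit=24):
        self.capacity = capacity
        self.low_priority_limit = low_priority_limit
        self.start_pointer = 0   # top 5 bits of the write address
        self.end_pointer = 0     # top 5 bits of the read address
        self.queue_size = 0      # number of cells currently stored

    def input_request(self, high_priority: bool):
        """Return the granted block address, or None if not granted."""
        limit = self.capacity if high_priority else self.low_priority_limit
        if self.queue_size >= limit:
            return None
        address = self.start_pointer
        self.start_pointer = (self.start_pointer + 1) % POINTER_MODULUS
        self.queue_size += 1
        return address

    def output_request(self):
        """Return the block address to read, or None if the DRAM is empty."""
        if self.queue_size == 0:
            return None
        address = self.end_pointer
        self.end_pointer = (self.end_pointer + 1) % POINTER_MODULUS
        self.queue_size -= 1
        return address


manager = BufferManagerModel()
print(manager.input_request(high_priority=False), manager.output_request())
```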
Figure 6.18 shows an interface of the buffer manager and Table 6.5 indicates the
function of the interfacing signals. The buffer manager can be realised in eight main modules





































Check if the DRAM





Wait for the anival of
















Table 6.5: Description of interfacing signals in the buffer manager


































Module Name                           Functionality
Buffer Full & Empty Decoder (BFED)    Indicates whether the DRAM is full, 3/4 full or
                                      empty
Queue Size Register (QSR)             Keeps track of the status of the DRAM
Input Block Requester (IBR)           Processes input requests and sends out
                                      acknowledgments
Output Block Requester (OBR)          Processes output requests and sends out
                                      acknowledgments
Count Pulse Generator (CPG)           Generates count pulses for the QSR
5-Bit Up Counter (bcbmc5u)            and sends out the top 5-bit
                                      block address
Table 6.6: Module requirement of the buffer manager
The realisation of the six individual modules is discussed in the following sections.
Note that both the start and end pointers (SP, EP) are realised as 5-bit up counters.
6.2.1 The Queue Size Register
The core of the queue size register (QSR) is a synchronous 5-bit up/down counter
whose schematic and resulting state transition table are shown in Figure 6.20 and
Table 6.7 respectively. The reason for a 5-bit instead of a 6-bit implementation for the QSR
can be illustrated as follows: Assume there are 31 stored cells in the DRAM; a further granted
incoming input block request will attempt to overwrite the new cell on the location where the
first cell is stored. This is due to the first-in-first-out (FIFO) memory structure where the 5-
bit input block address (IBA0..4) can overflow and recycle. As a remedy, only a 5-bit
up/down counter is required to indicate the status of the DRAM.
Additional logic (module QSRUD) is included at the input of the counter to ensure
stability when both the input block count (IBC) and output block count (OBC) pulses are
high at the same time. This is done by not sending any increment (UPB) nor decrement




Figure 6.20(a): Schematic of the queue size register and































Figure 6.20(d): Schematic of the module bcbm5udb within the 5-bit up/down counter











































6.2.2 The Buffer Full and Empty Decoders
The buffer full and empty decoders (BFED) can simply be realised as a set of
decoders as shown in Figure 6.21 where the signals QS4, QS3, QS2, QS1, and QS0
denote the state vectors from the queue size register (QSR) while QS4b, QS3b, QS2b,


























































Table 6.8: Truth table of the buffer full and empty decoders (columns: Number of cells in DRAM; Equivalent QS Output)
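The decoding can be sketched in a few lines of Python. This is a hedged reading of the decoder gates listed in Appendix B, assuming the o3dd and o2dd cells are OR gates and o2ns is a NOR gate: "empty" is asserted when all five QS bits are 0, "full" when all five are 1, and "3/4 full" when QS4 and QS3 are both 1 (24 cells or more). The function name and dictionary keys are hypothetical.

```python
def decode_buffer_status(queue_size: int) -> dict:
    """Decode the 5-bit queue size into empty / 3/4 full / full flags."""
    if not 0 <= queue_size <= 31:
        raise ValueError("queue size must fit in 5 bits")
    qs = [(queue_size >> bit) & 1 for bit in range(5)]  # QS0..QS4
    return {
        "empty": not any(qs),                       # all bits low
        "three_quarter_full": bool(qs[3] and qs[4]),  # count >= 24
        "full": all(qs),                            # all bits high (31)
    }


for cells in (0, 23, 24, 31):
    print(cells, decode_buffer_status(cells))
```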
6.2.3 The Input Block Requester
The input block requester (IBR) module receives information about the status of the
DRAM (whether it is full, more than 3/4 full or less than 3/4 full) to process the input block
requests (IBR0 and IBR1). Figure 6.22 shows the schematic of the IBR module where the
module bcbmbr within is realised as shown in Figure 6.23. One bcbmbr module is
























Figure 6.24: Timing diagram of the module bcbmbr
6.2.4 The Output Block Requester
The functionality of the output block requester (OBR) module is similar to that of the
input block requester (IBR) module except only one bcbmbr module is required since all
output block requests have the same priority. The OBR module can be realised as shown in
Figure 6.25 and the corresponding timing diagram is shown in Figure 6.26.











Figure 6.26: Timing diagram of the output block requester module
6.2.5 The Count Pulse Generater
The count pulse generater (CPG) produces a negative pulse one clock period long
on every positive edge of the input signal (In). It can be realised as shown in Figure 6.27
and the corresponding timing diagram is shown in Figure 6.28.
Figure 6.27: Schematic of the count pulse generater module
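The edge-to-pulse behaviour described above can be sketched as follows. This is a behavioural illustration only: the function name is hypothetical, the input is treated as one sample per clock period, the stored previous sample is assumed to reset low, and any pipeline delay of the real circuit is ignored.

```python
def count_pulse_generator(samples):
    """Yield one output per clock: low for one period after each rising edge."""
    previous = False  # assumed reset value of the stored input sample
    for sample in samples:
        current = bool(sample)
        rising_edge = current and not previous
        yield not rising_edge  # active-low pulse, high otherwise
        previous = current


# Input held high for two clocks produces a single low pulse.
print(list(count_pulse_generator([0, 1, 1, 0, 1])))
```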








6.2.6 The 5-Bit Up Counters
The 5-bit up counters (bcbmc5u) in both the Start Pointer (SP) and the End Pointer
(EP) are realised as synchronous counters. The schematic is shown in Figure 6.29 and the
corresponding state transition table is shown in Table 6.9.
Figure 6.29(a): Schematic of the 5-bit up counter (bcbmc5u)








































Table 6.9: State transition table for the 5-bit up counter (columns: Present State; Next State for Up=0; Next State for Up=1)
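The state transitions of a 5-bit up counter can be generated with a short sketch. The function name `up_counter_next` is hypothetical; the model assumes the counter holds its state when Up=0 and increments modulo 32 when Up=1, which is what allows the start and end pointers to recycle through the block addresses.

```python
def up_counter_next(present: int, up: bool) -> int:
    """Return the next state of a 5-bit up counter."""
    if not 0 <= present <= 31:
        raise ValueError("state must fit in 5 bits")
    return (present + 1) % 32 if up else present


# Print a few rows in the form: present state, next (Up=0), next (Up=1).
for state in (0, 1, 30, 31):
    print(f"{state:05b}  {up_counter_next(state, False):05b}  "
          f"{up_counter_next(state, True):05b}")
```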
Figure 6.30 shows the layout of the entire buffer manager circuit in Magic, which verifies
the effectiveness of the layout style employed. The simulation result in IRSIM is shown in
Figure 6.31, where the DRAM is initially empty and is gradually filled up by incoming cells
while the output block request signal (OBR) requests a block once every 53 clock periods. It
can be verified that when the DRAM is at or above 3/4 full (from the BHF signal), only high
priority block requests (IBR0) are granted. When the DRAM is completely full, all input block
requests (IBR0, IBR1) are rejected. Furthermore, no output block request (OBR) will be
granted unless the DRAM holds at least one cell. Due to the long simulation time for
HSpice, only a 3-cell long simulation was carried out. Initially, an output block request
(OBR) arrives and, since the DRAM is empty, it is rejected. A high priority input block
request (IBR0) then arrives with an output block request (OBR). Since the DRAM is still
empty, OBR is rejected while IBR0 is granted. The DRAM then holds one cell
and the buffer empty signal (BE) is deasserted. A low priority input block request (IBR1) then
arrives and is granted. Finally, a high priority input block request (IBR0) arrives with an
output block request (OBR). As the DRAM is neither empty nor completely filled, both
requests are granted. By the same argument as Section 6.1, the simulation is performed at a
50% over-design factor (i.e. 450MHz) to ensure the targeted operational speed can be
achieved. The simulation result is shown in Figure 6.32 with all output nodes connected to a
capacitive load of 500fF to model the effect of the interconnect wire.
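The grant policy and the 3-cell scenario described above can be replayed with a small behavioural model. This is a sketch under stated assumptions rather than the buffer manager itself: the class and constant names are hypothetical, "full" is assumed to mean a queue size of 31, "3/4 full" a queue size of 24 or more, and each call to `step` stands for one request slot in which at most one input request line is raised.

```python
from dataclasses import dataclass

CAPACITY = 31         # assumed: largest value of the 5-bit queue size
THREE_QUARTERS = 24   # assumed: QS4 and QS3 both set


@dataclass
class BufferManagerModel:
    cells: int = 0  # DRAM initially empty

    def step(self, ibr0=False, ibr1=False, obr=False):
        """Process one request slot; return (input_granted, output_granted)."""
        empty = self.cells == 0
        full = self.cells == CAPACITY
        nearly_full = self.cells >= THREE_QUARTERS
        # High priority is refused only when full, low priority from 3/4 full.
        in_grant = (ibr0 and not full) or (ibr1 and not nearly_full)
        out_grant = obr and not empty
        if in_grant and not out_grant:
            self.cells += 1
        elif out_grant and not in_grant:
            self.cells -= 1
        return in_grant, out_grant


bm = BufferManagerModel()
print(bm.step(obr=True))              # OBR on an empty DRAM: rejected
print(bm.step(ibr0=True, obr=True))   # IBR0 granted, OBR still rejected
print(bm.step(ibr1=True))             # IBR1 granted
print(bm.step(ibr0=True, obr=True))   # both granted
```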


































































































Figure 6.31: Simulation of the buffer manager in IRSIM
[Plot header: SIMULATION SCRIPT FOR BCBM, 5-DEC-93 18:43:56]
vBG - 0 .6V





























































Figure 6.32(a): HSpice simulation of the buffer manager at 450MHz
Note that all output nodes are loaded with a 500fF capacitor
Key: Graph 1 shows the reset signal (RST) (not shown in legend) and the two phase clock (PHI1, PHI2)
Graph 2 shows the high priority input block request (IBR0), the high priority input block granted
(IBG0) and the high priority input block not granted (IBNG0) signals which are denoted by WBR0,
WS0 and WNS0 (not shown) in the legend
Graph 3 shows the low priority input block request (IBR1), the low priority input block granted
(IBG1) and the low priority input block not granted (IBNG1) signals which are denoted by WBR1,
WS1 and WNS1 (not shown) in the legend
Graph 4 shows the output block request (OBR), the output block granted (OBG) (denoted by RBG
in the legend) and the output block not granted (OBNG) (not shown in the legend) signals
Graph 5 shows the buffer empty (BE) and the buffer full (BF) signals













































Figure 6.32(b): Current drawn by the buffer manager at 450MHz for the simulation in Figure
6.32(a)
It can be verified from Figure 6.32(b) that the logic families do in fact draw a fairly
constant current from VDD. The mean deviation of current drawn by the buffer manager is
calculated to be 5.2%. This again verifies the fact that the choice of logic families does have a
significant impact on the amount of noise injected to the supply lines which in turn affects the
performance of the logic itself.
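The text does not state how the mean deviation was computed. One common definition, assumed here purely for illustration, is the mean absolute deviation of the sampled supply current expressed as a percentage of its mean. The function name and the sample values below are hypothetical and are not taken from the simulation.

```python
def mean_deviation_percent(samples):
    """Mean absolute deviation from the mean, as a percentage of the mean."""
    if not samples:
        raise ValueError("at least one sample is required")
    mean = sum(samples) / len(samples)
    mean_abs_dev = sum(abs(x - mean) for x in samples) / len(samples)
    return 100.0 * mean_abs_dev / mean


# Hypothetical supply-current samples in mA (illustrative values only).
print(mean_deviation_percent([90.0, 110.0, 100.0, 100.0]))
```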
6.3 The Output Control
The functionality of the output control closely resembles that of the input control except
that the former sends out an output block request (OBR) to the buffer manager once every 53
clock cycles whereas the latter sends out input block requests (IBR0 or IBR1) only when
cells arrive [CHU4]. Depending upon the decision of the buffer manager, the output block
request will either be granted (which will be accompanied by a 5-bit block address) or not
granted. If the request is granted, the output control will generate the full 7-bit read address
(RA0..6) and a read enable (RE) signal to the DRAM as well as an output convert (OC)
signal to the output parallel-to-serial (P/S) converters at the end of each subcell. Otherwise,
only OC signals are generated. A flow chart summarising the functionality of the output
control is shown in Figure 6.33. Note that the currently granted cell is read during the next 53
clock cycles. This is due to the fact that the queue size register (QSR) in the buffer manager
indicates a cell is already in the DRAM before it is completely stored and the output control
reads cells out from the DRAM at half the speed they were written in by the input control.
Consequently, granted output block requests should only start reading cells out from the
DRAM in the next 53 clock cycles to ensure they are complete.
An interface of the output control and the corresponding signal descriptions are
shown in Figure 6.34 and Table 6.10 respectively.
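How a 7-bit read address could be assembled from the 5-bit block address and a 2-bit subcell count is sketched below. The bit allocation is an assumption made by analogy with the write address in Appendix A, where the block address drives WA(6..2) and the 2-bit counter drives WA(1..0); the function names are hypothetical.

```python
BLOCK_BITS = 5     # width of the block address from the buffer manager
SUBCELL_BITS = 2   # width of the subcell counter (four subcells per cell)


def read_address(block_address: int, subcell: int) -> int:
    """Combine block address (upper bits) and subcell index (lower bits)."""
    if not 0 <= block_address < (1 << BLOCK_BITS):
        raise ValueError("block address must fit in 5 bits")
    if not 0 <= subcell < (1 << SUBCELL_BITS):
        raise ValueError("subcell index must fit in 2 bits")
    return (block_address << SUBCELL_BITS) | subcell


def read_addresses_for_cell(block_address: int) -> list:
    """Addresses visited while reading the four subcells of one cell."""
    return [read_address(block_address, s) for s in range(1 << SUBCELL_BITS)]


print([f"{a:07b}" for a in read_addresses_for_cell(0b10101)])
```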
The output control can be realised in six main modules as shown in Figure 6.35. A











Figure 6.33: Flow chart summarising the functionality of the output control











[Flow chart box labels: "Latch in 5-bit block address"; "Read 2nd subcell from the cell"; "Read 3rd subcell from the cell"; "Read last subcell from the cell"; "Sends out OC"]

















































Output Convert Decoder (OCD): Generates an output convert signal for the output
P/S converters











Table 6.11: Module requirement of the output control




6.3.1 The Pseudo Random Counter
The pseudo random counter (bcocprc) is similar to the one used in the input control
except that it counts at all times whereas the latter counts only when a new cell arrives. As a
result, the start extraction module can be removed. The resulting circuit diagram is shown in
Figure 6.36 and the corresponding state transition table is shown in Table 6.12.
where the decoders (bcocprcd) are realised as:





























































































Table 6.12: State transition table for the pseudo random counter (bcocprc)
6.3.2 The Address Latch
The address latch (AL) is realised as five lots of two-stage data latches. When an
output block request (OBR) is sent out, it latches out the current Output Block Address
(OBA) from the first stage. Five clock periods after the signalling pulse P3, this address is
latched out from the second stage to the DRAM. Note that this address is latched out
irrespective of the status of the output block granted (OBG) or the output block not granted
(OBNG) signals as the validity of the address is only determined by the read enable (RE)
signal. The schematic and the timing diagram of this module are shown in Figure 6.37 and
Figure 6.38 respectively.




D D D D D
OBAO OBAI OBA2 OBA3 OBA4











Figure 6.38: Timing diagram of the address latch (AL) module
6.3.3 The 2-Bit Counter
The 2-bit counter (bcocc) is identical to the one used in the input control. The circuit

















Table 6.13: State transition table for the 2-bit counter (columns: Present State; Next State (OCQ2=0); Next State (OCQ2=1))
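Since this counter is stated to be identical to the one in the input control, its behaviour can be sketched from the NOR network of count.vst in Appendix A. The model below is a hedged reading of that listing: QA is taken as the low-order bit and QB as the high-order bit, and the function names are hypothetical. Read this way, QA toggles on every count pulse while QB, once set, is held until the counter is cleared.

```python
def nor(*inputs) -> bool:
    """N-input NOR gate."""
    return not any(inputs)


def two_bit_counter_next(qa: bool, qb: bool, up: bool, clear: bool):
    """Return (QA, QB) after one clock, following the listed NOR network."""
    out1 = nor(up, qa)
    out2 = nor(not up, not qa)
    da = nor(out1, out2, clear)      # QA toggles when up is high
    out3 = nor(not up, not qa)
    out4 = nor(out3, qb)
    db = nor(out4, clear)            # QB is set by a carry from QA, then held
    return da, db


state = (False, False)
for _ in range(3):
    state = two_bit_counter_next(*state, up=True, clear=False)
    print(int(state[0]) + 2 * int(state[1]))
```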
6.3.4 The Read Enable Generater
The read enable generater module (REG) generates read enable signals (RE) to the
DRAM only if an output block request (OBR) is granted. It was realised as shown in Figure
6.40. The same 2-stage latch as in the address latch (AL) module is used in its construction
to latch in the output block granted (OBG) signal. A timing diagram is shown in Figure 6.41
where it can be seen that there are three clock cycles between adjacent read enables (RE) to















Figure 6.41: Timing diagram of the read enable generater
6.3.5 The Output Block Request Generater
The output block request generater (OBRG) sends an output block request (OBR) to
the buffer manager once every 53 clock periods and awaits its decision. It can be realised as
















Figure 6.43: Timing diagram of the output block request generater
6.3.6 The Output Convert Decoder
The output convert decoder (OCD) generates output convert (OC) signals as well as
delaying the signalling pulse P3 for other modules. It can be realised as shown in Figure






































































Figure 6.46 shows the layout of the entire output control circuit in Magic, which verifies
the effectiveness of the layout style employed. The simulation result in IRSIM is shown in
Figure 6.47, where the DRAM is initially empty and only alternate output block requests
(OBR) are granted. By the same argument as Section 6.1, the HSpice simulation is
performed at a 50% over-design factor (i.e. 450MHz) to ensure the targeted operational speed
of 300MHz can be achieved. The simulation result in HSpice is shown in Figure 6.48,
where all output nodes are connected to a capacitive load of 1pF (which in reality corresponds
to a long interconnect line connected to other functional blocks) except for the output
convert signal (OC), which has a capacitive load of 200fF and a 30µm wide EFET connected
as external load (in reality this signal is directly connected to an adjacent differential D
flip-flop "fvv3v3v").
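The over-design arithmetic can be written out as a short worked example. The figures of 300MHz, 50% and 53 clock cycles come from the text; the variable names are illustrative only.

```python
TARGET_CLOCK_HZ = 300e6      # targeted operational speed
OVER_DESIGN_FACTOR = 0.50    # 50% over-design
CLOCKS_PER_CELL_SLOT = 53    # one output block request every 53 clocks

simulation_clock_hz = TARGET_CLOCK_HZ * (1.0 + OVER_DESIGN_FACTOR)
simulation_period_ns = 1e9 / simulation_clock_hz
target_period_ns = 1e9 / TARGET_CLOCK_HZ
slot_time_ns = CLOCKS_PER_CELL_SLOT * target_period_ns

print(f"simulation clock: {simulation_clock_hz / 1e6:.0f} MHz")
print(f"simulation period: {simulation_period_ns:.2f} ns")
print(f"cell slot at target speed: {slot_time_ns:.1f} ns")
```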
It can be verified from the last panel in Figure 6.48 that the logic families do in fact
draw a fairly constant current from VDD. The mean deviation of the current drawn by the
output control is calculated to be 6.1%. This again confirms that the chosen logic
families inject minimal noise into the supply lines.

















































































































Figure 6.48: HSpice simulation of the output control at 450MHz
Note that all output nodes except OC are loaded with a 1pF capacitor
Key:
Graph 1 shows the system reset signal (SYSRST) (not shown in legend) and the
two phase clock (PHI1, PHI2)
Graph 2 shows the output block request (OBR), output block granted (OBG) and the
output block not granted (OBNG) (not shown in legend) signals
Graph 3 shows the four timing pulses P0, P1, P2 and P3 (P2 and P3 not shown in
legend)
Graph 4 shows the output convert (OC) and the read enable (RE) signals
Graph 5 shows the lowest 3-bit read address RA0, RA1 and RA2 (RA2 not shown
in legend)
Graph 6 shows the current drawn from the VDD bus
Chapter 7. Conclusion and Future Work
A conclusion of this research is presented in section 7.1 and some possible future
work on this project is discussed in section 7.2.
7.1 Conclusion
A survey of GaAs logic families was carried out. Amongst all candidates, Direct
Coupled FET Logic (DCFL), Source-follower Direct Coupled FET Logic (SDCFL), Super
Buffer FET Logic (SBFL) and Ultra Buffer FET Logic (UBFL) were chosen due to their
simplicity, compatibility in signal levels, power consumption and speed. They
are optimised at 125°C to operate over military specifications with a 1.5V power supply.
After a performance comparison, a "mixed" logic design approach using pure NOR
structures is presented in Chapter 2 and is used throughout the design of the
control logic within a buffer chip. DCFL is used extensively for local logic
operations where both the fan-ins and output loads are small, due to its simplicity and superb
power-delay product. SDCFL is used in critical paths and/or higher fan-in applications.
SBFL is used only for buffering global signals (such as clocks and resets) due to its high power
consumption and noise injection to the power busses. UBFL is best suited for buffering
parallel busses (such as address and data busses) since its power dissipation is low and the
injection of noise into the power busses is the major concern.
The control logic of the buffer chip is realised in three main modules: an input control
module, a buffer manager module and an output control module. The details of these
modules are discussed in Chapter 6. They are designed, laid out using the layout tool
"MAGIC" and simulated using different simulation tools such as IRSIM and HSpice (see
Chapter 4). A performance summary of the three modules is shown in Table 7.1.






Module Name                    Input Control    Buffer Manager    Output Control
Number of Transistors          1336             1232              1104
Dimensions (length x width)    882µm x 481µm    1069µm x 421µm    912µm x 416µm
Table 7.1: Performance summary of the three modules constituting the control circuit within
the buffer chip
In addition, a design methodology for high complexity VLSI circuits was presented.
It is based on different levels of simulation, ranging from switch level to full SPICE
simulations. The relative merits and disadvantages of various simulation tools used
throughout this project are discussed in Chapter 4.
The layout of the entire buffer chip is shown in Figure 7.1 where the input control
(denoted by bcic), the buffer manager (denoted by bcbm) and the output control (denoted by
bcoc) are the author's contribution to this chip design. Other components of the chip, that is,
the dynamic memory (DRAM) which constitutes some 46,000 transistors, the input serial-to-
parallel converters (S/P converters), the output parallel-to-serial converters (P/S converters),
the clock drivers and the pads were designed by Mr. Jens Jakobsen at Jydsk Telefon,
Denmark. The buffer chip has a total die area of 26.1mm2 and contains over 70,000
transistors. HSpice simulation shows this chip has a power dissipation of less than 1W at a
supply voltage of 1.5V while operating in a 125°C environment. This chip was sent for
fabrication in November 1993 and is expected to be delivered in April 1994.
Figure 7.1: Layout of the entire buffer chip
7.2 Future Work
. The most important future work would be to test the chip after delivery and compare the
simulated results with the actual measurements to verify the accuracy of the device models
and the design methodology.
. Finish the design of the multiplexer chip and construct a 2 x 2 switch so that further
properties of the switching elements can be characterised.
. Investigate the system level specifications such as printed circuit board (PCB) layout,
chip-to-chip communications and cooling mechanisms.
. Continue the search for better logic families to enable higher operational speed while
preserving low power consumption and low noise injection to the power busses.
. Explore the idea of a hybrid approach to the implementation of the ATM switch with GaAs
and CMOS technologies. For instance, GaAs technology may be used as an interface to
multiplex/demultiplex high data rate cell streams so that the internal operation speed can be
reduced to a frequency at which CMOS logic circuits can operate.
Appendix A. VHDL Structural Description of the
Input Control
The VHDL structural description of the entire input control is shown below:
-- VHDL structural description of module
-- File name: bcic.vst
-- Function: Buffer chip---Input Circuit
-- Author: Eric Chu















,Q11b,Q26,Q39,Q52: inout BIT);
IDIH: bit; -- Input data internal header
IBA: bit_vector (4 downto 0); -- Input block address
IBG: bit; -- Input block granted
IBNG: bit; -- Input Block Not Granted
IC: inout bit; -- Input Conversion
IBR: out bit_vector (1 downto 0); -- Input Block Request
WE: out bit; -- Write Enable




ARCHITECTURE structural_view OF bcic IS
COMPONENT icmc
port (vss,vdd,phi1,phi2,SysRst,Write: BIT; PL: inout BIT; WriteEnable: out BIT);
END COMPONENT;
COMPONENT icpg
port (vss,vdd,phi1,phi2,SysRst,Q13,Q26,Q39,Q52,BWfb: BIT; Write: out BIT; Up,








port (vss,vdd,phi1,phi2,Clear,up,upb: BIT; WA1,WA0: out BIT);
END COMPONENT;
COMPONENT ldddd
port (vss,vdd,phi1,phi2,D: BIT; Q,QB: out BIT);
END COMPONENT;
COMPONENT lddsd
port (vss,vdd,phi1,phi2,D: BIT; Q,QB: out BIT);
END COMPONENT;
COMPONENT lsdud
port (vss,vdd,phi1,phi2,D: BIT; Q,QB: out BIT);
END COMPONENT;
-- SIGNAL DECLARATION














mc: icmc PORT MAP (vss,vdd,phi1,phi2,SysRst,Write,IC,WE);
up2: count PORT MAP (vss,vdd,phi1,phi2,ClrCount,Up,Upb,WA(1),WA(0));
br: icbr PORT MAP (vss,vdd,phi1,phi2,SysRst,StartPulseb,IDIH,IBG,IBNG,Q11b,
IBA(4),IBA(3),IBA(2),IBA(1),IBA(0),IBR(0),IBR(1),BWfb,WA(6),WA(5),WA(4),
WA(3),WA(2));
dffd1: ldddd PORT MAP (vss,vdd,phi1,phi2,Q11,Q12,Q12b);
dffs1: lddsd PORT MAP (vss,vdd,phi1,phi2,Q12,Q13,Q13b);
dffd2: ldddd PORT MAP (vss,vdd,phi1,phi2,Rst,QRst,QRstb);
dffu1: lsdud PORT MAP (vss,vdd,phi1,phi2,QRst,SysRst,SysRstb);
end structural_view;
Structural Description of the Cell Arrival Extraction module within the input control:
-- VHDL structural description of module
-- File name: iccae.vst
-- Function: Buffer Chip---Input Control Cell Arrival Extraction
-- Author: Eric Chu




StartPulse,StartPulseb,Q11,Q11b,Q26,Q39,Q52: inout BIT);
END iccae;
-- Architecture Declaration
ARCHITECTURE structural_view OF iccae IS
COMPONENT nd
port (vss,vdd,i: BIT; o: out BIT);
END COMPONENT;
COMPONENT o2nd
port (vss,vdd,i0,i1: BIT; o: out BIT);
END COMPONENT;
COMPONENT o2ns










(vss,vdd,phil BIT; Q,QB: out BIT);
COMPONENT lddss
port (vss,vdd,phi1,phi2,D: BIT; Q,QB: out BIT);
END COMPONENT;
COMPONENT lsdud
port (vss,vdd,phi1,phi2,D: BIT; Q,QB: out BIT);
END COMPONENT;
COMPONENT lsduu





























S0,S1,S2,S3,S4,S5: BIT;

















































invd1: nd PORT MAP (vss,vdd,IH,IHb);
dffd0: ldddd PORT MAP (vss,vdd,phi1,phi2,IHb,DIHb,DIH);
nor2s1: o2ns PORT MAP (vss,vdd,DIHb,IH_DISABLE,START);
nor2d1: o2nd PORT MAP (vss,vdd,RESET,D1b,D1);
nor2d2: o2nd PORT MAP (vss,vdd,D1,START,D1b);
dffd1: ldddd PORT MAP (vss,vdd,phi1,phi2,D1,IH_DISABLE,IH_DISABLEb);
dffs1: lddss PORT MAP (vss,vdd,phi1,phi2,START,StartPulse,StartPulseb);
-- Pseudo Random Counter
-- 6th Bit
invx1: nd PORT MAP (vss,vdd,S0,S0bx);
invx2: nd PORT MAP (vss,vdd,S5,S5bx);
nor2d3: o2nd PORT MAP (vss,vdd,S0,S5bx,out1);
nor2d4: o2nd PORT MAP (vss,vdd,S5,S0bx,out2);
nor3d1: o3nd PORT MAP (vss,vdd,RESET,out1,out2,DS5);
dffu1: lsduu PORT MAP (vss,vdd,phi1,phi2,DS5,S5,S5b);
-- 5th Bit
nor2d5: o2nd PORT MAP (vss,vdd,RESET,S5b,DS4);
dffu2: lsduu PORT MAP (vss,vdd,phi1,phi2,DS4,S4,S4b);
-- 4th Bit
nor2d6: o2nd PORT MAP (vss,vdd,RESET,S4b,DS3);
dffu3: lsduu PORT MAP (vss,vdd,phi1,phi2,DS3,S3,S3b);
-- 3rd Bit
nor2d7: o2nd PORT MAP (vss,vdd,RESET,S3b,DS2);
dffu4: lsduu PORT MAP (vss,vdd,phi1,phi2,DS2,S2,S2b);
-- 2nd Bit
nor2d8: o2nd PORT MAP (vss,vdd,RESET,S2b,DS1);
dffu5: lsduu PORT MAP (vss,vdd,phi1,phi2,DS1,S1,S1b);
-- 1st Bit
nor2d9: o2nd PORT MAP (vss,vdd,RESET,S1b,out3);
nor2d10: o2nd PORT MAP (vss,vdd,START,out3,DS0);
dffu6: lsduu PORT MAP (vss,vdd,phi1,phi2,DS0,S0,S0b);
-- End of Pseudo Random Counter
-- End of Cell Decode
nor2d12: o2nd PORT MAP (vss,vdd,Q52,SysRst,out7);
invd2: nd PORT MAP (vss,vdd,out7,out7b);
dffu0: lsdud PORT MAP (vss,vdd,phi1,phi2,out7b,RESET,RESETb);
-- Decode 11 (Actually 7)
nor3d4: o3nd PORT MAP (vss,vdd,S0,S1b,S2b,out8);
dffd4: ldddd PORT MAP (vss,vdd,phi1,phi2,out8,Qout8,Qout8b);
nor3d5: o3nd PORT MAP (vss,vdd,S3b,S4b,S5b,out9);
dffd5: ldddd PORT MAP (vss,vdd,phi1,phi2,out9,Qout9,Qout9b);
nor2d13: o2nd PORT MAP (vss,vdd,Qout8b,Qout9b,out10);
dffs3: lddss PORT MAP (vss,vdd,phi1,phi2,out10,Q11,Q11b);
-- Decode 26 (Actually 22)
nor3d6: o3nd PORT MAP (vss,vdd,S0,S1b,S2,out11);
dffd6: ldddd PORT MAP (vss,vdd,phi1,phi2,out11,Qout11,Qout11b);
nor3d7: o3nd PORT MAP (vss,vdd,S3b,S4b,S5b,out12);
dffd7: ldddd PORT MAP (vss,vdd,phi1,phi2,out12,Qout12,Qout12b);
nor2d14: o2nd PORT MAP (vss,vdd,Qout11b,Qout12b,out13);
dffs4: lddss PORT MAP (vss,vdd,phi1,phi2,out13,Q26,Q26b);
-- Decode 39 (Actually 35)
nor3d8: o3nd PORT MAP (vss,vdd,S0,S1,S2,out14);
dffd8: ldddd PORT MAP (vss,vdd,phi1,phi2,out14,Qout14,Qout14b);
nor3d9: o3nd PORT MAP (vss,vdd,S3b,S4b,S5b,out15);
dffd9: 15b);
nor2d1 1Sb,out1
dffs5: lddss PORT MAP (vss,vdd,phi1,phi2,out16,Q39,Q39b);
-- Decode 52 (Actually 48)
invd3: nd PORT MAP (vss,vdd,S5b,S5by);
nor3d10: o3nd PORT MAP (vss,vdd,S0,S1b,S2,out17);
dffd10: ldddd PORT MAP (vss,vdd,phi1,phi2,out17,Qout17,Qout17b);
nor3d11: o3nd PORT MAP (vss,vdd,S3,S4b,S5by,out18);
dffd11: ldddd PORT MAP (vss,vdd,phi1,phi2,out18,Qout18,Qout18b);
nor2d16: o2nd PORT MAP (vss,vdd,Qout17b,Qout18b,out19);
dffs6: lddss PORT MAP (vss,vdd,phi1,phi2,out19,Q52,Q52b);
end structural_view;
Structural Description of the Block Requester module within the input control:
-- VHDL structural description of module
-- File name: icbr.vst
-- Function: Buffer chip---Input Control Block Requester
-- Author: Eric Chu
































port (vss,vdd,i0,i1: BIT; o: out BIT);
END COMPONENT;
COMPONENT o3nd
port (vss,vdd,i0,i1,i2: BIT; o: out BIT);
END COMPONENT;
COMPONENT jdd
port (vss,vdd,Ldb,D: BIT; Q,Qb: out BIT);
END COMPONENT;
COMPONENT jds
port (vss,vdd,Ldb,D: BIT; Q,Qb: out BIT);
END COMPONENT;
COMPONENT jpsd
port (vss,vdd,Set,Ldb,D: BIT; Q,Qb: inout BIT);
END COMPONENT;
COMPONENT ldddd
port (vss,vdd,phi1,phi2,D: BIT; Q,QB: out BIT);
END COMPONENT;
COMPONENT lddsd
port (vss,vdd,phi1,phi2,D: BIT; Q,QB: out BIT);
END COMPONENT;
COMPONENT lsdud





















SIGNAL Q12,Q12b,Q13,Q13b,Q14,Q14b,Q15,Q15b: BIT;
-- Structural Description
BEGIN
dffd1: ldddd PORT MAP (vss,vdd,phi1,phi2,IH,QIH,QIHb);
nor2d1: o2nd PORT MAP (vss,vdd,QIHb,SPb,D2);
nor2d2: o2nd PORT MAP (vss,vdd,QIH,SPb,D3);
dffd2: ldddd PORT MAP (vss,vdd,phi1,phi2,D2,Q2,Q2b);
dffd3: ldddd PORT MAP (vss,vdd,phi1,phi2,D3,Q3,Q3b);
nor2s1: o2ns PORT MAP (vss,vdd,br1,Q2,br1b);
invu1: nu PORT MAP (vss,vdd,br1b,BRlÐ;
nor2d3: o2nd PORT MAP (vss,vdd,br1b,out2,br1);
nor2s2: o2ns PORT MAP (vss,vdd,br0,Q3,br0b);
invu2: nu PORT MAP (vss,vdd,br0b,BR0Ð;
nor2d4: o2nd PORT MAP (vss,vdd,br0b,out2,br0);
nor3d1: o3nd PORT MAP (vss,vdd,QQBNG,SysRst,QQBG,out1);
invs1: ns PORT MAP (vss,vdd,out1,out2);
nor2s3: o2ns PORT MAP (vss,vdd,br0,br1,br);
-- Getting Q11b's
dffs1: lddsd PORT MAP (vss,vdd,phi1,phi2,Q11b,Q12b,Q12);
dffd4: ldddd PORT MAP (vss,vdd,phi1,phi2,Q12b,Q13b,Q13);
dffd5: ldddd PORT MAP (vss,vdd,phi1,phi2,Q13b,Q14b,Q14);
dffu1: lsdud PORT MAP (vss,vdd,phi1,phi2,Q14b,Q15b,Q15);
-- Extra Logics for Delaying Block Write Signal
latch1: jdd PORT MAP (vss,vdd,br,BG,BG1,BG1b);
sink1: sink PORT MAP (vss,vdd,BG1);
latch2: jpsd PORT MAP (vss,vdd,SysRst,Q12b,BG1b,BWfb,BWÐ;
sink2: sink PORT MAP (vss,vdd,BWfb);
--- Latching Addresses
latch3: jdd PORT MAP (vss,vdd,br,A2,QA2,QA2b);
latch4: jdd PORT MAP (vss,vdd,br,A3,QA3,QA3b);
latch5: jdd PORT MAP (vss,vdd,br,A4,QA4,QA4b);
latch6: jdd PORT MAP (vss,vdd,br,A5,QA5,QA5b);
latch7: jdd PORT MAP (vss,vdd,br,A6,QA6,QA6b);
sink3: sink PORT MAP (vss,vdd,QA2b);
sink4: sink PORT MAP (vss,vdd,QA3b);
sink5: sink PORT MAP (vss,vdd,QA4b);
sink6: sink PORT MAP (vss,vdd,QA5b);





































sink PORT MAP (vss,vdd,QQA3);
sink PORT MAP (vss,vdd,QQA4);
sink PORT MAP (vss,vdd,QQA5);
sink PORT MAP (vss,vdd,QQA6);
nu PORT MAP (vss,vdd,QQA2b,WA2);
nu PORT MAP (vss,vdd,QQA3b,WA3);
nu PORT MAP (vss,vdd,QQA4b,WA4);
nu PORT MAP (vss,vdd,QQA5b,WA5);
nu PORT MAP (vss,vdd,QQA6b,WA6);
--- Input Dffs for Meta-Stability
dffd6: ldddd PORT MAP (vss,vdd,phi1,phi2,BG,QBG,QBGb);
dffd7: ldddd PORT MAP
dffd8: ldddd PORT MAP
dffd9: ldddd PORT MAP Gb);
end structural_view;
Structural Description of the Pulse Generation module within the input control:
-- VHDL structural description of module
-- File name: icpg.vst
-- Function: Buffer chip---Input Control---Pulse Generation
-- Author: Eric Chu
-- Date : 11th October, 1993
-- Entity Declaration
ENTITY icpg IS
















port (vss,vdd,i0,i1: BIT; o: out BIT);
END COMPONENT;
COMPONENT o3ns
port (vss,vdd,i0,i1,i2: BIT; o: out BIT);
END COMPONENT;
COMPONENT o4ns
port (vss,vdd,i0,i1,i2,i3: BIT; o: out BIT);
END COMPONENT;
COMPONENT lddsd




SIGNAL out2,out2b,ClrCountb : BIT;
BEGIN
-- Generating Write
nor4s1: o4ns PORT MAP (vss,vdd,Q13,Q26,Q39,Q52,selectb);
nor2d1: o2nd PORT MAP (vss,vdd,selectb,BWfb,Write);
-- Generating Up & Up* for 2-bit counter
nor3s1: o3ns PORT MAP (vss,vdd,Q26,Q39,Q52,out1);
dffs1: lddsd PORT MAP (vss,vdd,phi1,phi2,out1,Upb,Up);
-- Generating Counter Clear
nor2d2: o2nd PORT MAP (vss,vdd,Q13,SysRst,out2);
invd1: nd PORT MAP (vss,vdd,out2,out2b);
dffs2: lddsd PORT MAP (vss,vdd,phi1,phi2,out2b,ClrCount,ClrCountb);
end structural_view;
Structural Description of the Memory Control module within the input control:
-- VHDL structural description of module
-- File name: icmc.vst
-- Function: Buffer Chip---Input Control Memory Control
-- Author: Eric Chu
-- Date : 11th October, 1993
-- Modified: Eric Chu 27th October, 1993
-- Comments: Increased driving capability of the signal IC
-- Entity Declaration
ENTITY icmc IS
PORT (vss,vdd,phi1,phi2,SysRst,Write: BIT; PL: inout BIT; WriteEnable: out BIT);
END icmc;
-- Architecture Declaration
ARCHITECTURE structural_view OF icmc IS
COMPONENT nu




























dff: lsdud PORT MAP (vss,vdd,phi1,phi2,Write,Pl,Plb);
dffd1: ldddd PORT MAP (vss,vdd,phi1,phi2,PL,QPL,QPLb);
dffd2: ldddd PORT MAP (vss,vdd,phi1,phi2,QPL,QQPL,QQPLb);
rrsds1: rrsds PORT MAP (vss,vdd,SysRst,Rst11,QQPL,W,Wb);
sink1: sink PORT MAP (vss,vdd,W);
invu1: nu PORT MAP (vss,vdd,Wb,WriteEnable);
dffd3: ldddd PORT MAP (vss,vdd,phi1,phi2,QQPL,Rst1,Rst1b);
dffd4: ldddd PORT MAP (vss,vdd,phi1,phi2,Rst1,Rst2,Rst2b);
dffd5: ldddd PORT MAP (vss,vdd,phi1,phi2,Rst2,Rst3,Rst3b);
dffd6: ldddd PORT MAP (vss,vdd,phi1,phi2,Rst3,Rst4,Rst4b);
dffd7: ldddd PORT MAP (vss,vdd,phi1,phi2,Rst4,Rst5,Rst5b);
dffd8: ldddd PORT MAP (vss,vdd,phi1,phi2,Rst5,Rst6,Rst6b);
dffd9: ldddd PORT MAP (vss,vdd,phi1,phi2,Rst6,Rst7,Rst7b);
dffd10: ldddd PORT MAP (vss,vdd,phi1,phi2,Rst7,Rst8,Rst8b);
dffd11: ldddd PORT MAP (vss,vdd,phi1,phi2,Rst8,Rst9,Rst9b);
dffd12: ldddd PORT MAP (vss,vdd,phi1,phi2,Rst9,Rst10,Rst10b);
dffd13: ldddd PORT MAP (vss,vdd,phi1,phi2,Rst10,Rst11,Rst11b);
end structural_view;
Structural Description of the 2-Bit Counter within the input control:
-- VHDL structural description of module
-- File name: count.vst
-- Function: Counter for Lower 2-Bit Write Address Generation
-- Author: Eric Chu
-- Date : 27th September, 1993
-- Entity Declaration
ENTITY count IS
PORT (vss,vdd,phi1,phi2,Clear,up,upb: BIT; WA1,WA0: out BIT);
END count;
-- Architecture Declaration













(vss,vdd,i0,i1,i2: BIT; o: out BIT);
COMPONENT;
COMPONENT lddss





SIGNAL out1,out2,out3,out4: BIT;
-- Structural Description
BEGIN
lddss1: lddss PORT MAP (vss,vdd,phi1,phi2,DA,QA,QAb);
lddss2: lddss PORT MAP (vss,vdd,phi1,phi2,DB,QB,QBb);
-- Input to Dff A:
nor2d1: o2nd PORT MAP (vss,vdd,Up,QA,out1);
nor2d2: o2nd PORT MAP (vss,vdd,Upb,QAb,out2);
nor3d1: o3nd PORT MAP (vss,vdd,out1,out2,Clear,DA);
-- Input to Dff B:
nor2d3: o2nd PORT MAP (vss,vdd,Upb,QAb,out3);
nor2d4: o2nd PORT MAP (vss,vdd,out3,QB,out4);
nor2d5: o2nd PORT MAP (vss,vdd,out4,Clear,DB);
-- Buffering Write Addresses
invu1: nu PORT MAP (vss,vdd,QBb,W





Appendix B. VHDL Structural Description of the
Buffer Manager
The VHDL structural description of the entire buffer manager is shown below:
-- VHDL structural description of module
-- File name: bcbm.vst
-- Function: Buffer chip buffer manager module
-- Author: Jens Jakobsen




vss,vdd: bit; -- Power supply
phi1, phi2, Rst: bit; -- Reset and clock
IBR: bit_vector (1 downto 0); -- Input block request
OBR: bit; -- Output block request
IBG: out bit; -- Input block granted
IBNG: out bit; -- Input block not granted
OBG: out bit; -- Output block granted
OBNG: out bit; -- Output block not granted
IBA: out bit_vector (4 downto 0); -- Input block address




ARCHITECTURE structural OF bcbm IS
COMPONENT bcbmc5u
PORT (
vss : in BIT; -- vss
vdd: in BIT; -- vdd
phi1 : in BIT; -- phi1
phi2: in BIT; -- phi2
Rst : in BIT; -- reset
upb : in BIT; -- Up





vss : in BIT;
vdd: in BIT;
phi1 : in BIT;






Rst: in BIT; -- reset
upb : in BIT; -- Up
dnb : in BIT; -- Down









size : in BIT;







-- Empty or full
-- Block granted or not granted
COMPONENT bcbmic
PORT (
vss : in BIT; -- vss
vdd: in BIT; -- vdd
phi1 : in BIT; -- phi1
phi2: in BIT; -- phi2
Rst : in BIT; -- reset
s,sb : in bit_vector (5 downto 0); -- Counter state
IBR: bit_vector (1 downto 0); -- Input block request
IBG: out bit; -- Input block granted
IBNG: out bit; -- Input block not granted





vss : in BIT; -- vss
vdd: in BIT; -- vdd
phi1 : in BIT; -- phi1
phi2: in BIT; -- phi2
Rst: in BIT; -- reset
s,sb : in bit_vector (5 downto 0); -- Counter state
OBR: bit; -- Output block request
OBG: out bit; -- Output block granted
OBNG: out bit; -- Output block not granted





vss : in BIT;
vdd : in BIT;
phi1 : in BIT;
phi2: in BIT;
D : in BIT;
Q: inout BIT;













vss : in BIT; -- vss
vdd: in BIT; -- vdd
phi1 : in BIT; -- phi1
phi2 : in BIT; -- phi2
D : in BIT; -- D
Q : inout BIT;





vss : in BIT;
vdd: in BIT;






vss : in BIT;
vdd: in BIT;






vss : in BIT;
vdd: in BIT;






vss : in BIT;
vdd: in BIT;
i0 : in BIT;






vss : in BIT;
vdd: in BIT;






























vss : in BIT;
vdd: in BIT;
i0 : in BIT;






vss : in BIT;
vdd: in BIT;
i0 : in BIT;
i1 : in BIT;






vss : in BIT;
vdd: in BIT;
i0 : in BIT;
i1 : in BIT;
























SIGNAL QueueSize,QueueSizeb: bit_vector (4 downto 0);
SIGNAL RstQ1,RstQ1b,RstQ2,RstQ2b : bit;
SIGNAL full: bit_vector (1 downto 0);
SIGNAL fulla,fullb : bit;









































-- Handle metastability at RST by using two flip flops
l0: ldddd PORT MAP (vss,vdd,phi1,phi2,Rst,RstQ1,RstQ1b);
l1: lsdud PORT MAP (vss,vdd,phi1,phi2,RstQ1,RstQ2,RstQ2b);
-- Input block request
bcbmibr0:bcbmbr PORT MAP
(vss,vdd,phi1,phi2,RstQ2,IBR(0),full(0),IBGx(0),IBNGx(0));
bcbmibr1:bcbmbr PORT MAP
(vss,vdd,phi1,phi2,RstQ2,IBR(1),full(1),IBGx(1),IBNGx(1));
o2n0: o2ns PORT MAP (vss,vdd,IBGx(0),IBGx(1),IBGb);
o2n1: o2ns PORT MAP (vss,vdd,IBNGx(0),IBNGx(1),IBNGb);
n0: nu PORT MAP (vss,vdd,IBGb,IBG);
n1: nu PORT MAP (vss,vdd,IBNGb,IBNG);
-- Output block request
bcbmobr:bcbmbr PORT MAP (vss,vdd,phi1,phi2,RstQ2,OBR,empty,OBGx,OBNGx);
n2: ns PORT MAP (vss,vdd,OBGx,OBGb);
n3: nu PORT MAP (vss,vdd,OBGb,OBG);
n4: ns PORT MAP (vss,vdd,OBNGx,OBNGb);
n5: nu PORT MAP (vss,vdd,OBNGb,OBNG);
-- Generate count up pulse
12: ldddd PORT MAP (vss, I ,IBGQIb,IBGQl);
n6: nd PORT MAP (vss,vdd,IB
o2n2: o2ns PORT MAP (vss,vdd,IBGQ1b,IBGy,IBC);
n7: nu PORT MAP (vss,vdd,IBC,IBCb);
-- Generate count down pulse
l3: ldddd PORT MAP (vss,vdd,phi1,phi2,OBGb,OBGQ1b,OBGQ1);
n8: nd PORT MAP (vss,vdd,OBGb,OBGy);
o2n3: o2ns PORT MAP (vss,vdd,OBGQ1b,OBGy,OBC);
n9: nu PORT MAP (vss,vdd,OBC,OBCb);
-- Input block address counter
bcbmc5ui: bcbmc5u PORT MAP (vss,vdd,phi1,phi2,RstQ2,IBCb,IBA);
-- Output block address counter
bcbmc5uo: bcbmc5u PORT MAP (vss,vdd,phi1,phi2,RstQ2,OBCb,OBA);
-- Buffer size counter
-- n10: nd PORT MAP (vss,vdd,IBC,IBCbx);
lxl: ldddd PORT MAP (vss, I l,IBCQlb);
o2n4: o2ns PORT MAP (vss, lb,oBcQ1,UP);
n11: nu PORT MAP (vss,vdd,UP,UPb);
-- n12: nd PORT MAP (vss,vdd,OBC,OBCbx);
lx2: ldddd PORT MAP (vss,vdd,phi1,phi2,OBC,OBCQ1,OBCQ1b);
o2n5: o2ns PORT MAP (vss,vdd,OBCQ1b,IBCQ1,DN);
n13: nu PORT MAP (vss,vdd,DN,DNb);
bcbmud: bcbmud PORT MAP
(vss,vdd,phi1,phi2,RstQ2,UPb,DNb,QueueSize,QueueSizeb);
-- Empty decoding
o30: o3dd PORT MAP (vss,vdd,QueueSize(0),QueueSize(1),QueueSize(2),emptya);
o20: o2dd PORT MAP (vss,vdd,QueueSize(3),QueueSize(4),emptyb);
o2n6: o2ns PORT MAP (vss,vdd,emptya,emptyb,empty);
-- Full decoding
o31: o3dd PORT MAP (vss,vdd,QueueSizeb(0),QueueSizeb(1),QueueSizeb(2),fulla);
o21: o2dd PORT MAP (vss,vdd,QueueSizeb(3),QueueSizeb(4),fullb);
o2n7: o2ns PORT MAP (vss,vdd,fulla,fullb,full(0));
o2n8: o2ns PORT MAP (vss,vdd,QueueSizeb(3),QueueSizeb(4),full(1));
END structural;
Structural Description of the 5-Bit Up/Down Counter within the buffer manager:
-- VHDL structural description of module
-- File name: bcbmud.vst
-- Function: Buffer chip buffer manager up/down counter
-- Author: Jens Jakobsen




vss,vdd: bit; -- Power supply
phi1, phi2: bit; -- Clock
Rst: in BIT; -- Reset
upb : in BIT; -- Up
dnb : in BIT; -- Down
















vss : in BIT;
vdd: in BIT;






vss : in BIT;
vdd: in BIT;
i0: in BIT;





















vss : in BIT;
vdd: in BIT;
i0 : in BIT;
i1 : in BIT;






vss : in BIT;
vdd: in BIT;
i0 : in BIT;






vss : in BIT;
vdd: in BIT;
i0 : in BIT;
i1 : in BIT;






















SIGNAL a,b,c : bit_vector (4 downto 0);
SIGNAL m3a,m3b : bit;
SIGNAL m4a,m4b,n4a,n4b : bit;
BEGIN
-- Bit 0
n0a: nd PORT MAP (vss,vdd,Upb,a(0));
n0b: nd PORT MAP (vss,vdd,dnb,b(0));
o2n0: o2nd PORT MAP (vss,vdd,a(0),b(0),c(0));
bcbmudb0: bcbmudb PORT MAP (vss,vdd,phi1,phi2,Rst,c(0),s(0),sb(0));
-- Bit 1
o2n1a: o2nd PORT MAP (vss,vdd,Upb,sb(0),a(1));
o2n1b: o2nd PORT MAP (vss,vdd,dnb,s(0),b(1));
o2n1c: o2nd PORT MAP (vss,vdd,a(1),b(1),c(1));
bcbmudb1: bcbmudb PORT MAP (vss,vdd,phi1,phi2,Rst,c(1),s(1),sb(1));
-- Bit 2
o3n2a: o3nd PORT MAP (vss,vdd,Upb,sb(0),sb(1),a(2));
o3n2b: o3nd PORT MAP (vss,vdd,dnb,s(0),s(1),b(2));
o2n2c: o2nd PORT MAP (vss,vdd,a(2),b(2),c(2));
bcbmudb2: bcbmudb PORT MAP (vss,vdd,phi1,phi2,Rst,c(2),s(2),sb(2));
-- Bit 3
o3m3a: o3dd PORT MAP (vss,vdd,sb(2),sb(1),sb(0),m3a);
o3m3b: o3dd PORT MAP (vss,vdd,s(2),s(1),s(0),m3b);
o2n3a: o2nd PORT MAP (vss,vdd,Upb,m3a,a(3));
o2n3b: o2nd PORT MAP (vss,vdd,dnb,m3b,b(3));
o2n3c: o2nd PORT MAP (vss,vdd,a(3),b(3),c(3));
bcbmudb3: bcbmudb PORT MAP (vss,vdd,phi1,phi2,Rst,c(3),s(3),sb(3));
-- Bit 4
o2m4a: o2dd PORT MAP (vss,vdd,sb(3),sb(2),m4a);
o2n4a: o2dd PORT MAP (vss,vdd,sb(1),sb(0),n4a);
o2m4b: o2dd PORT MAP (vss,vdd,s(3),s(2),m4b);
o2n4b: o2dd PORT MAP (vss,vdd,s(1),s(0),n4b);
o3n4a: o3nd PORT MAP (vss,vdd,Upb,m4a,n4a,a(4));
o3n4b: o3nd PORT MAP (vss,vdd,dnb,m4b,n4b,b(4));
o2n4c: o2nd PORT MAP (vss,vdd,a(4),b(4),c(4));
bcbmudb4: bcbmudb PORT MAP (vss,vdd,phi1,phi2,Rst,c(4),s(4),sb(4));
END structural;
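The gating in the listing above can be read as a toggle rule: bit k changes when counting up and all lower bits are 1, or when counting down and all lower bits are 0. The following Python sketch is a behavioural model of that rule only, not of the gate-level polarities; the function name `next_state` is hypothetical, and the model assumes up and down are never asserted in the same clock period.

```python
def next_state(state, up, down, width=5):
    """Return the next counter value after one clock period.

    Behavioural sketch: bit k toggles when counting up with all lower
    bits set, or counting down with all lower bits clear.
    Assumes up and down are not asserted together.
    """
    nxt = state
    for k in range(width):
        mask = (1 << k) - 1            # selects the bits below bit k
        lower = state & mask
        all_ones = lower == mask       # trivially true for bit 0
        all_zeros = lower == 0         # trivially true for bit 0
        if (up and all_ones) or (down and all_zeros):
            nxt ^= 1 << k              # toggle bit k
    return nxt


if __name__ == "__main__":
    s = 0
    for _ in range(3):
        s = next_state(s, up=True, down=False)
    print(s)
```

With neither input asserted the state is held, and the 5-bit value wraps around at both ends of its range.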
Structural Description of one bit-slice of the logic within the 5-bit up/down counter
in the buffer manager:
-- VHDL structural description of module
-- File name: bcbmudb.vst
-- Function: Buffer chip buffer manager counter dff and module XOR
-- Author: Jens Jakobsen


















ARCHITECTURE structural OF bcbmudb IS
COMPONENT nd
PORT (
vss : in BIT;
vdd : in BIT;






vss : in BIT; -- vss
vdd: in BIT; -- vdd
i0: in BIT; -- i0
i1 : in BIT; -- i1
s0: in BIT; -- s0
s1 : in BIT; -- s1
r : in BIT; -- r





vss : in BIT;
vdd: in BIT;
phi1 : in BIT;
phi2: in BIT;
D : in BIT;
Q: inout BIT;















SIGNAL D,ib : bit;
BEGIN
n0: nd PORT MAP (vss,vdd,i,ib);
muxr1: muxr PORT MAP (vss,vdd,Q,Qb,ib,i,Rst,D);
l0: lsduu PORT MAP (vss,vdd,phi1,phi2,D,Q,Qb);
END structural;
-- VHDL structural description of module
-- File name: bcbmbr.vst
-- Function: Buffer chip buffer manager counter dff and module XOR
-- Author: Jens Jakobsen
-- Date : October 3, 1993
-- Entity Declaration







size : in BIT;







-- Empty or full






ARCHITECTURE structural OF bcbmbr IS
COMPONENT ldddd
PORT (
vss : in BIT;
vdd: in BIT;
phi1 : in BIT;
phi2: in BIT;
D : in BIT;
Q: inout BIT;












vss : in BIT;
vdd: in BIT;







vss : in BIT;
vdd: in BIT;
i0 : in BIT;






vss : in BIT;
vdd: in BIT;
i0: in BIT;
i1 : in BIT;






vss : in BIT;
vdd: in BIT;
i0: in BIT;
i1 : in BIT;
s0: in BIT;
s1 : in BIT;






















SIGNAL BRQ1,BRQ1b,BRQ2,BRQ2b : bit;
SIGNAL BRQ3,BRQ3b : bit;
SIGNAL strobe,strobeb : bit;
SIGNAL sizeQ1,sizeQ1b,sizeD1 : bit;
BEGIN
-- Handle metastability by using two flip-flops at BR input
l0: ldddd PORT MAP (vss,vdd,phi1,phi2,BR,BRQ1,BRQ1b);
l1: ldddd PORT MAP (vss,vdd,phi1,phi2,BRQ1,BRQ2,BRQ2b);
-- Generate strobe at positive edge of BRQ2
l3: ldddd PORT MAP (vss,vdd,phi1,phi2,BRQ2,BRQ3,BRQ3b);
o2n0: o2nd PORT MAP (vss,vdd,BRQ2b,BRQ3,strobe);
-- Latch current value of size
n0: nd PORT MAP (vss,vdd,strobe,strobeb);
mux0: mux PORT MAP (vss,vdd,size,sizeQ1,strobeb,strobe,sizeD1);
l4: ldddd PORT MAP (vss,vdd,phi1,phi2,sizeD1,sizeQ1,sizeQ1b);
-- Generate output
o3n1: o3ns PORT MAP (vss,vdd,Rst,sizeQ1,BRQ3b,BGR);
o3n2: o3ns PORT MAP (vss,vdd,Rst,sizeQ1b,BRQ3b,BNGR);
END structural;
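The synchroniser and strobe generation in this module can be illustrated with a small behavioural model. The Python sketch below assumes that each two-phase latch pair behaves like a single edge-triggered D flip-flop and that o2nd is a 2-input nor gate, so that strobe = NOR(BRQ2b, BRQ3), which is high only while BRQ2 is 1 and BRQ3 is still 0. The function name `simulate` is hypothetical.

```python
def simulate(br_samples):
    """Clock BR through three flip-flops; return (BRQ2, strobe) per cycle.

    Sketch under stated assumptions: ideal edge-triggered flip-flops,
    all initially cleared, and a 2-input NOR forming the strobe.
    """
    q1 = q2 = q3 = 0
    trace = []
    for br in br_samples:
        # All three flip-flops update together on the clock edge.
        q1, q2, q3 = br, q1, q2
        brq2b = 1 - q2
        strobe = int(not (brq2b or q3))   # NOR(BRQ2b, BRQ3)
        trace.append((q2, strobe))
    return trace


if __name__ == "__main__":
    for brq2, strobe in simulate([0, 1, 1, 1, 0, 0, 1, 1]):
        print(brq2, strobe)
```

In this model the strobe lasts exactly one clock period for every rising edge of the synchronised request, which is the interval in which the multiplexer would admit a new value of size.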
Structural Description of the 5-bit up counters within the buffer manager:
-- VHDL structural description of module
-- File name: bcbmc5u.vst
-- Function: Buffer chip buffer manager counter dff and module XOR
-- Author: Jens Jakobsen




vss,vdd: bit; -- Power supply
phi1, phi2: bit; -- Clock
Rst: bit; -- Reset
upb : in BIT; -- Count up




ARCHITECTURE structural OF bcbmc5u IS
COMPONENT bcbmc5ub
PORT (
vss,vdd: bit; -- Power supply
phi1, phi2: bit; -- Clock
Rst: bit; -- Reset
i: bit; -- Input


















vss : in BIT;
vdd: in BIT;
i0: in BIT;






vss : in BIT;
vdd: in BIT;
i0: in BIT;
i1 : in BIT;






vss : in BIT;
vdd: in BIT;
i0: in BIT;






vss : in BIT;
vdd: in BIT;
i0 : in BIT;
i1 : in BIT;








































n0: nd PORT MAP (vss,vdd,Upb,i(0));
bcbmc5ub0: bcbmc5ub PORT MAP (vss,vdd,phi1,phi2,Rst,i(0),s(0),sb(0));
-- Bit 1
o2n1: o2nd PORT MAP (vss,vdd,Upb,sb(0),i(1));
bcbmc5ub1: bcbmc5ub PORT MAP (vss,vdd,phi1,phi2,Rst,i(1),s(1),sb(1));
-- Bit 2
o3n2: o3nd PORT MAP (vss,vdd,Upb,sb(0),sb(1),i(2));
bcbmc5ub2: bcbmc5ub PORT MAP (vss,vdd,phi1,phi2,Rst,i(2),s(2),sb(2));
-- Bit 3
o23a: o2dd PORT MAP (vss,vdd,sb(2),sb(1),m3a);
o23b: o2dd PORT MAP (vss,vdd,Upb,sb(0),m3b);
o2n3: o2nd PORT MAP (vss,vdd,m3a,m3b,i(3));
bcbmc5ub3: bcbmc5ub PORT MAP (vss,vdd,phi1,phi2,Rst,i(3),s(3),sb(3));
-- Bit 4
o24: o2dd PORT MAP (vss,vdd,sb(3),sb(2),m4a);
o34: o3dd PORT MAP (vss,vdd,Upb,sb(0),sb(1),m4b);
o2n4: o2nd PORT MAP (vss,vdd,m4a,m4b,i(4));








Structural Description of one bit-slice of the logics within the 5-bit up counter in the
buffer manager:
-- VHDL structural description of module
-- File name: bcbmc5ub.vst
-- Function: Buffer chip buffer manager counter dff and module XOR
-- Author: Jens Jakobsen













vss : in BIT;
vdd: in BIT;






vss : in BIT;
vdd: in BIT;






vss : in BIT; -- vss
vdd : in BIT; -- vdd
i0 : in BIT; -- i0
i1 : in BIT; -- i1
s0 : in BIT; -- s0






























vss : in BIT; -- vss
vdd : in BIT; -- vdd
phi1 : in BIT; -- phi1
phi2: in BIT; -- phi2
D : in BIT; -- D
Q : inout BIT; -- Q




SIGNAL D,Q,ib : bit;
BEGIN
n0: nd PORT MAP (vss,vdd,i,ib);
muxr1: muxr PORT MAP (vss,vdd,Q,Sb,i,ib,Rst,D);
l0: lsduu PORT MAP (vss,vdd,phi1,phi2,D,Q,Sb);
n1: nu PORT MAP (vss,vdd,Sb,S);
END structural;
Appendix C. VHDL Structural Description of the
Output Control
The VHDL structural description of the entire output control is shown below:
-- VHDL structural description of module
-- File name: bcoc.vst
-- Function: Buffer chip output control
-- Author: Jens Jakobsen
-- Date : September 29, 1993




vss,vdd: bit; -- Power supply
phi1, phi2: bit; -- Clock
Rst: bit; -- Reset
OBA: bit_vector (4 downto 0); -- Output block address
OBG: bit; -- Output block granted
OBNG: bit; -- Output block not granted
OC: inout bit; -- Output conversion
OBR: inout bit; -- Output block request
RE: out bit; -- Read Enable




ARCHITECTURE structural_view OF bcoc IS
COMPONENT bcocprc
PORT (
vss,vdd:bit; -- Power supply
phi1,phi2:bit; -- Clocks
Rst: BIT; -- Reset





















r0 : in BIT;
r1 : in BIT;
s : in BIT;
q : inout BIT;






r0 : in BIT;
r1 : in BIT;
s : in BIT;
q : inout BIT;












































port (vss,vdd,Clear,[.db,D: BIT; Q,Qb: inout BIT);
END COMPONENT;
COMPONENT ldddd
port (vss,vdd,phi1,phi2,D: BIT; Q,QB: out BIT);
END COMPONENT;
COMPONENT lddss
port (vss,vdd,phi1,phi2,D: BIT; Q,QB: out BIT);
END COMPONENT;
COMPONENT lddsd














port (vss,vdd,i: BIT; o: out BIT);
END COMPONENT;
COMPONENT o2nd
port (vss,vdd,i0,i1: BIT; o: out BIT);
END COMPONENT;
COMPONENT o3nd
port (vss,vdd,i0,i1,i2: BIT; o: out BIT);
END COMPONENT;
COMPONENT o2dd
(vss,vdd,i: BIT; o: out BIT);
COMPONENT;






-- Delayed versions of input signals
SIGNAL OBGD1,OBGD2,OBGD1b,OBGD2b : bit; -- Block
SIGNAL OBNGD1,OBNGD2,OBNGD1b,OBNGD2b : bit; -- B







































: bit; -- Combining P0 and Pl













-- Local copies of block address and granted signals
SIGNAL BlA0,BlA0b,BlA1,BlA1b: bit_vector (4 downto 0); -- Block address
SIGNAL OBGranted0,OBGranted0b,OBGranted1,OBGranted1b: bit; -- Block granted
-- Generation of Block request
SIGNAL OB,OBR1,OBR1b,OBRb : bit;
-- Read enable signals
SIGNAL RED1,RED1b,RED2 : bit;
-- Structural Description
BEGIN
-- Handle metastability by using two flip flops on asynchronous control inputs
m0: ldddd PORT MAP (vss,vdd,phi1,phi2,OBG,OBGD1,OBGD1b);
m1: ldddd PORT MAP (vss,vdd,phi1,phi2,OBGD1,OBGD2,OBGD2b);
m2: ldddd PORT MAP (vss,vdd,phi1,phi2,OBNG,OBNGD1,OBNGD1b);
m3: ldddd PORT MAP (vss,vdd,phi1,phi2,OBNGD1,OBNGD2,OBNGD2b);
m4: ldddd PORT MAP (vss,vdd,phi1,phi2,Rst,RstD1,RstD1b);
m5: lsdud PORT MAP (vss,vdd,phi1,phi2,RstD1,RstD2,RstD2b);
-- Generate Output block request
o20: o2dd PORT MAP (vss,vdd,OBGD2,OBNGD2,OB);
rs0: rrsds PORT MAP (vss,vdd,RstD2,OB,P0,OBR1,OBR1b);
sink1: sink PORT MAP (vss,vdd,OBR1);









-- Make local copy of OBG and Block address
n2: nu PORT MAP (vss,vdd,OBR,OBRb);
PORT MAP (vss,vdd,OBRb,OBA(0),BlA0(0),BlA0b(0));
PORT MAP (vss,vdd,OBRb,OBA(1),BlA0(1),BlA0b(1));
PORT MAP (vss,vdd,OBRb,OBA(2),BlA0(2),BlA0b(2));
PORT MAP (vss,vdd,OBRb,OBA(3),BlA0(3),BlA0b(3));
PORT MAP (vss,vdd,OBRb,OBA(4),BlA0(4),BlA0b(4));
PORT MAP (vss,vdd,OBRb,OBG,OBGranted0,OBGranted0b);
sink PORT MAP (vss,vdd,OBGranted0b);
sink10: sink PORT MAP (vss,vdd,BlA0b(0));
sink11: sink PORT MAP (vss,vdd,BlA0b(1));
sink12: sink PORT MAP (vss,vdd,BlA0b(2));
sink13: sink PORT MAP (vss,vdd,BlA0b(3));
sink14: sink PORT MAP (vss,vdd,BlA0b(4));
-- Counter
bcocprc: bcocprc PORT MAP (vss,vdd,phi1,phi2,RstD2,P0,P1,P2,P3);
-- Decoding Output Convert signals
o2n1: o2nd PORT MAP (vss,vdd,P0,P1,P01D1);
l0: ldddd PORT MAP (vss,vdd,phi1,phi2,P01D1,P01,P01b);
o2n2: o2nd PORT MAP (vss,vdd,P2,P3,P23D1);
l1: ldddd PORT MAP (vss,vdd,phi1,phi2,P23D1,P23,P23b);
o21: o2dd PORT MAP (vss,vdd,P01b,P23b,OCD1);
l2: lsdud PORT MAP (vss,vdd,phi1,phi2,OCD1,OC,OCb);
l3: ldddd PORT MAP (vss,vdd,phi1,phi2,OC,OCQ1,OCQ1b);
l4: lddsd PORT MAP (vss,vdd,phi1,phi2,OCQ1
l5: ldddd PORT MAP (vss,vdd,phi1




l6: ldddd PORT MAP
l7: ldddd PORT MAP







l9: ldddd PORT MAP (vss,vdd,phi1
l10: lsdud PORT MAP (vss,vdd,phi
l7a: ldddd PORT MAP (vss,vdd,phi1,phi2,P3Q1,P3Q2a,P3Q2ab);
l8a: ldddd PORT MAP (vss,vdd,phi1,phi2,P3Q2a,P3Q3a,P3Q3ab);
l9a: ldddd PORT MAP (vss,vdd,phi1,phi2,P3Q3a,P3Q4a,P3Q4ab);
l10a: ldddd PORT MAP (vss,vdd,phi1,phi2,P3Q4ab,P3Q5ab,P3Q5a);
-- Current copy of OBG
j11: jcdd PORT MAP (vss,vdd,RstD2,P3Q5ab,OBGranted0,OBGranted1,OBGranted1b);
sink4: sink PORT MAP (vss,vdd,OBGrantedl);
-- Generate read enable
o2n4: o2nd PORT MAP (vss,vdd,OCQ4b,OBGranted1b,RED2);
-- rs1: nssd PORT MAP (vss,vdd,OCQ2,RstD2,RED2,RED1,RED1b);
rs1: rrsds PORT MAP (vss,vdd,OCQ2,RstD2,RED2,RED1,RED1b);
sink0: sink PORT MAP (vss,vdd,RED1);
n3: nu PORT MAP (vss,vdd,RED1b,RE);
-- Generation of read address
j6: jds PORT MAP (vss,vdd,P3Q5b,BlA0(0),BlA1(0),BlA1b(0));
j7: jds PORT MAP (vss,vdd,P3Q5b,BlA0(1),BlA1(1),BlA1b(1));
j8: jds PORT MAP (vss,vdd,P3Q5b,BlA0(2),BlA1(2),BlA1b(2));
j9: jds PORT MAP (vss,vdd,P3Q5b,BlA0(3),BlA1(3),BlA1b(3));
j10: jds PORT MAP (vss,vdd,P3Q5b,BlA0(4),BlA1(4),BlA1b(4));
sink5: sink PORT MAP (vss,vdd,BlA1(0));
sink6: sink PORT MAP (vss,vdd,BlA1(1));
sink7: sink PORT MAP (vss,vdd,BlA1(2));
sink8: sink PORT MAP (vss,vdd,BlA1(3));
sink9: sink PORT MAP (vss,vdd,BlA1(4));
n4: nu PORT MAP (vss,vdd,BlA1b(0),RA(2));
n5: nu PORT MAP (vss,vdd,BlA1b(1),RA(3));
n6: nu PORT MAP (vss,vdd,BlA1b(2),RA(4));
n7: nu PORT MAP (vss,vdd,BlA1b(3),RA(5));
n8: nu PORT MAP (vss,vdd,BlA1b(4),RA(6));
bcocc: bcocc PORT MAP (vss,vdd,phi1,phi2,P3Q4a,OCQ2,OCQ2b,RA(1),RA(0));
end structural_view;
Structural Description of the Pseudo Random Counter module within the output
control:
-- VHDL structural description of module
-- File name: bcocprc.vst
-- Function: Buffer Chip output control pseudo random counter
-- Author: Jens Jakobsen
-- Date : September 29, 1993
-- Modified: Eric CHU on 12th October, 1993.



































(vss,vdd,i: BIT; o: out BIT);
COMPONENT;
COMPONENT o2nd





port (vss,vdd,phi1,phi2,D: BIT; Q,QB: out BIT);
COMPONENT lddsd
port (vss,vdd,phi1,phi2,D: BIT; Q,QB: out BIT);
END COMPONENT;
COMPONENT lsdud
port (vss,vdd,phi1,phi2,D: BIT; Q,QB: out BIT);
END COMPONENT;
COMPONENT lsduu




vss : in BIT; -- vss
vdd : in BIT; -- vdd
i0 : in BIT; -- i0
i1 : in BIT; -- i1
s0 : in BIT; -- s0






SIGNAL S,Sb,DS: bit_vector (5 downto 0);
SIGNAL RESET,RESETb,RESETD1,RESETD1b: BIT;
-- Decoder outputs
SIGNAL Q10b,Q25b,Q38b,Q51b : BIT;
SIGNAL S0b1,S5b1,s52,s53 : BIT;
-- Structural Description
BEGIN
-- Pseudo Random Counter
-- 6th Bit
invd1: nd PORT MAP (vss,vdd,S(0),S0b1);
invd2: nd PORT MAP (vss,vdd,S(5),S5b1);
muxr: muxr PORT MAP (vss,vdd,S(0),S0b1,S5b1,S(5),RESET,DS(5)); -- Xnor
dffu1: lsduu PORT MAP (vss,vdd,phi1,phi2,DS(5),S(5),Sb(5));
-- 5th Bit
nor2d5: o2nd PORT MAP (vss,vdd,RESET,Sb(5),DS(4));
dffu2: lsduu PORT MAP (vss,vdd,phi1,phi2,DS(4),S(4),Sb(4));
-- 4th Bit
nor2d6: o2nd PORT MAP (vss,vdd,RESET,Sb(4),DS(3));
dffu3: lsduu PORT MAP (vss,vdd,phi1,phi2,DS(3),S(3),Sb(3));
-- 3rd Bit
nor2d7: o2nd PORT MAP (vss,vdd,RESET,Sb(3),DS(2));
dffu4: lsduu PORT MAP (vss,vdd,phi1,phi2,DS(2),S(2),Sb(2));
-- 2nd Bit
nor2d8: o2nd PORT MAP (vss,vdd,RESET,Sb(2),DS(1));
dffu5: lsduu PORT MAP (vss,vdd,phi1,phi2,DS(1),S(1),Sb(1));
-- 1st Bit
nor2d9: o2nd PORT MAP (vss,vdd,RESET,Sb(1),DS(0));
dffu6: lsduu PORT MAP (vss,vdd,phi1,phi2,DS(0),S(0),Sb(0));
-- End of Pseudo Random Counter
bcocprcd10: bcocprcd -- Decode 10
PORT MAP (vss,vdd,phi1,phi2,sb(5),sb(4),s(3),s(2),sb(1),s(0),Q10,Q10b);
bcocprcd25: bcocprcd -- Decode 25
PORT MAP (vss,vdd,phi1,phi2,s52,sb(4),sb(3),s(2),sb(1),s(0),Q25,Q25b);
invd3: nd PORT MAP (vss,vdd,sb(5),s52);
bcocprcd38: bcocprcd -- Decode 38
PORT MAP (vss,vdd,phi1,phi2,s53,s(4),s(3),sb(2),s(1),sb(0),Q38,Q38b);
invd4: nd PORT MAP (vss,vdd,sb(5),s53);
bcocprcd51: bcocprcd -- Decode 51
PORT MAP (vss,vdd,phi1,phi2,sb(5),sb(4),s(3),s(2),sb(1),sb(0),Q51,Q51b);
-- Internal Header End Decode
o2nxx: o2nd PORT MAP (vss,vdd,Q51,Rst,RESETD1b);
nxx: ns PORT MAP (vss,vdd,RESETD1b,RESETD1);
lxx: lsdud PORT MAP (vss,vdd,phi1,phi2,RESETD1,RESET,RESETb);
end structural_view;
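The pseudo random counter above is a 6-bit shift register with feedback into the 6th bit. As a behavioural illustration, the Python sketch below assumes that each bit shifts towards S(0), that the feedback is the XNOR of S(0) and S(5) as the "Xnor" comment suggests, and that reset clears the register to zero. The function names are hypothetical, and the sketch ignores the RESET generated by the decode-51 stage, so it shows the free-running sequence only.

```python
def lfsr_step(state):
    """Advance the 6-bit register by one clock.

    Sketch: S(k) takes the old S(k+1), and S(5) takes XNOR(S(0), S(5)).
    """
    s0 = state & 1
    s5 = (state >> 5) & 1
    feedback = 1 ^ s0 ^ s5                 # XNOR of the two tap bits
    return (state >> 1) | (feedback << 5)


def sequence(length, start=0):
    """Return the first `length` states, beginning from `start`."""
    states, s = [], start
    for _ in range(length):
        states.append(s)
        s = lfsr_step(s)
    return states


if __name__ == "__main__":
    print(sequence(12))
```

Under these assumptions the register steps through 63 distinct states before returning to zero, and the all-ones state is a lock-up state that the sequence never enters from reset. A counter of this kind needs only one gate in the feedback path, which is the usual reason for preferring it to a binary counter when only a few counts have to be decoded.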
Structural Description of the Pseudo Random Counter Decoding module within the
output control:
-- VHDL structural description of module
-- File name: bcocprcd.vst
-- Function: Buffer Chip output control pseudo random counter decoder
-- Author: Jens Jakobsen















ARCHITECTURE structural_view OF bcocprcd IS
COMPONENT o2nd
port (vss,vdd,i0,i1: BIT; o: out BIT);
END COMPONENT;
COMPONENT o3nd
port (vss,vdd,i0,i1,i2: BIT; o: out BIT);
END COMPONENT;
COMPONENT ldddd
port (vss,vdd,phi1,phi2,D: BIT; Q,QB: out BIT);
END COMPONENT;
COMPONENT lddss
port (vss,vdd,phi1,phi2,D: BIT; Q,QB: out BIT);
END COMPONENT;
COMPONENT lsduu -- Currently not used













o3n0: o3nd PORT MAP (vss,vdd,i0,i1,i2,out0);
l0: ldddd PORT MAP (vss,vdd,phi1,phi2,out0,Qout0,Qout0b);
o3n1: o3nd PORT MAP (vss,vdd,i3,i4,i5,out1);
l1: ldddd PORT MAP (vss,vdd,phi1,phi2,out1,Qout1,Qout1b);
o2n0: o2nd PORT MAP (vss,vdd,Qout0b,Qout1b,out2);
l2: lddss PORT MAP (vss,vdd,phi1,phi2,out2,o,ob);
end structural_view;
Structural Description of the 2-Bit Counter within the output control:
-- VHDL structural description of module
-- File name: bcocc.vst
-- Function: Counter for Lower 2-Bit read Address Generation
-- Author: Eric Chu
-- Date : 27th September,1993
-- Revised: Jens Jakobsen, September 30, 1993
-- Entity Declaration
ENTITY bcocc IS
PORT (vss,vdd,phi1,phi2,Clear,up,upb : BIT; A1,A0: out BIT);
END bcocc;
-- Architecture Declaration
ARCHITECTURE structural_view OF bcocc IS
COMPONENT nu










port (vss,vdd,i0,i1,i2: BIT; o: out BIT);
END COMPONENT;
COMPONENT lddss








lddss1: lddss PORT MAP (vss,vdd,phi1,phi2,DA,QA,QAb);
lddss2: lddss PORT MAP (vss,vdd,phi1,phi2,DB,QB,QBb);
-- Input to Dff A:
nor2d1: o2nd PORT MAP (vss,vdd,Up,QA,out1);
nor2d2: o2nd PORT MAP (vss,vdd,Upb,QAb,out2);
nor3d1: o3nd PORT MAP (vss,vdd,out1,out2,Clear,DA);
-- Input to Dff B:
nor2d3: o2nd PORT MAP (vss,vdd,Upb,QAb,out3);
nor2d4: o2nd PORT MAP (vss,vdd,out3,QB,out4);
nor2d5: o2nd PORT MAP (vss,vdd,out4,Clear,DB);
-- Buffering Write Addresses
invu1: nu PORT MAP (vss,vdd,QBb,A1);
invu2: nu PORT MAP (vss,vdd,QAb,A0);
end structural_view;
Appendix D. Published Paper
The paper "Comparison of GaAs Static Logic Families" which was published in the
11th NorChip seminar held in Trondheim, Norway in November 1993 is included in this
appendix.
COMPARISON OF GaAs STATIC LOGIC FAMILIES
Eric Chu, Jens Jakobsen
Jydsk Telefon R&D
30 Sletvej, 8310 Tranbjerg J, Denmark
Phone: + 45 89 45 45 45
Fax: + 45 86 29 90 68
E-MAIL: eric@lab.jt.dk, jj@lab.jt.dk
ABSTRACT
Optimisation of logic families plays an important role in VLSI designs. In this paper we
analyse DCFL for speed, power and noise margins. Three other logic families based on
DCFL are discussed. SDCFL, SBFL, UBFL as well as DCFL are then optimised and
compared on the basis of speed, power, noise margins and the noise induced in the power
supply. It is concluded that DCFL has good performance at low fan-ins and fan-outs.
SDCFL can be used for larger fan-ins. For large fan-outs SDCFL, SBFL or UBFL have to
be used according to the noise margin and noise induction requirements.
1. INTRODUCTION
Optimisation of logic families plays an important role in VLSI designs as it determines the
size, speed and power consumption of the final product.
number of logic families for Gallium Arsenide
Logic (DCFL) regarding speed, power, and
simple and having good performance at low
margin degrades rapidly, and the driving
capability is poor at large fan-out. In these situations other logic families have to be used.
From DCFL a series of compatible logic families are derived. These include Source-follower
Direct Coupled FET Logic (SDCFL) [DAVENPORT], Super Buffer FET Logic (SBFL)
[NAKAMURA] and Ultra Buffer FET Logic (UBFL). The logic families are analysed and a
comparison is done on the basis of speed, power, noise margins and induced noise in the
power supplies.









Figure 1: (a) DCFL inverter and (b) DCFL 3-input nor gate
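Figure 1a shows a DCFL inverter. It consists of a pull-up depletion mode MESFET (DFET)
and a pull-down enhancement mode MESFET (EFET). The output high level is clamped by
the gate Schottky diode of the driven EFET. In this paper we assume the clamping voltage
of this Schottky diode to be 0.6V.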
The supply voltage should be chosen so that the DFET always operates in saturation. This
optimises speed and ensures that the current drawn from Vdd is constant. Thus minimal
current noise is induced in the power supply. A 3-input nor gate can be constructed by
connecting EFETs in parallel as shown in Figure 1b.
The operation of the MESFETs can be modelled by the Curtice model [CURTICE] which is
given by the expression below:
$$I_{ds} = \beta \left(\frac{W}{L}\right) (V_{gs} - V_t)^2 \, (1 + \lambda V_{ds}) \tanh(\alpha V_{ds})$$
In order to analyse the DCFL inverter some simplifications to the model are practical. In the
following analysis, we will assume that tanh(αVds) = 1 for αVds > 1 (saturation) and
tanh(αVds) = αVds for αVds < 1 (linear region). Furthermore we assume λ to be 0.
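The effect of these simplifications can be checked numerically. The Python sketch below evaluates the Curtice expression next to its piecewise approximation. It is an illustration only: the function names are hypothetical, the geometry factor is folded into a single effective gain parameter `beta_eff`, and the numerical values in the usage lines are placeholders rather than foundry data.

```python
import math


def curtice_ids(vgs, vds, beta_eff, vt, lam, alpha):
    """Drain current from the Curtice model; zero below threshold."""
    if vgs <= vt:
        return 0.0
    return beta_eff * (vgs - vt) ** 2 * (1 + lam * vds) * math.tanh(alpha * vds)


def simplified_ids(vgs, vds, beta_eff, vt, alpha):
    """Piecewise form: tanh(a*Vds) replaced by 1 or a*Vds, lambda set to 0."""
    if vgs <= vt:
        return 0.0
    shape = 1.0 if alpha * vds > 1 else alpha * vds
    return beta_eff * (vgs - vt) ** 2 * shape


if __name__ == "__main__":
    # Placeholder parameters chosen only to exercise both regions.
    for vds in (0.1, 0.3, 0.5, 1.0):
        full = curtice_ids(0.6, vds, 1e-4, 0.2, 0.0, 2.0)
        approx = simplified_ids(0.6, vds, 1e-4, 0.2, 2.0)
        print(vds, full, approx)
```

With λ = 0 the simplified current is never smaller than the full expression, and the gap is widest where αVds is close to 1, so the approximation is weakest around the boundary between the linear and saturation regions.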





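With these simplifications the saturated DFET, whose gate is tied to its source, supplies the current
$$I_{dsd} = \beta_d \left(\frac{W_d}{L_d}\right) V_{td}^2$$
where the subscript "d" denotes DFET. The EFET will operate in 3 different regions: cut off,
saturation and linear:
$$I_{dse} = \begin{cases} 0 & \text{for } V_{gs} < V_{te} \\ \beta_e \left(\frac{W_e}{L_e}\right) (V_{gs} - V_{te})^2 & \text{for } V_{gs} \ge V_{te},\ V_{ds} \ge \frac{1}{\alpha_e} \\ \beta_e \left(\frac{W_e}{L_e}\right) (V_{gs} - V_{te})^2 \, \alpha_e V_{ds} & \text{for } V_{gs} \ge V_{te},\ V_{ds} < \frac{1}{\alpha_e} \end{cases}$$
where the subscript "e" denotes EFET.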
2.1 Inverter Threshold
We define the inverter threshold Vinv as the input voltage of the inverter where the saturation
currents of the EFET and the DFET are equal. It depends on x, the pull-down/pull-up ratio.
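As a worked illustration of how the threshold depends on the pull-down/pull-up ratio, the following derivation assumes that Vinv is set by equal saturation currents in the two devices, with the DFET gate tied to its source, and that x is the ratio of the EFET to DFET gain factors. Both the definition of x used here and the symbol names are assumptions made for this sketch.

```latex
\begin{align*}
\beta_e \frac{W_e}{L_e}\,(V_{inv} - V_{te})^2 &= \beta_d \frac{W_d}{L_d}\,V_{td}^2 ,
\qquad
x \equiv \frac{\beta_e W_e / L_e}{\beta_d W_d / L_d} \\
(V_{inv} - V_{te})^2 &= \frac{V_{td}^2}{x} \\
V_{inv} &= V_{te} + \frac{|V_{td}|}{\sqrt{x}}
         = V_{te} - \frac{V_{td}}{\sqrt{x}} \qquad (V_{td} < 0)
\end{align*}
```

Under these assumptions a larger x, that is a stronger pull-down relative to the pull-up, moves Vinv towards the EFET threshold voltage.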
2.2 Inverter Characteristics
Given Vinv the inverter transfer characteristics can be determined as:
V tclnv
Voo, =




As can be seen, the above expression does not model the behaviour of the inverter very
well when Vin is in the vicinity of Vinv. The characteristics of a DCFL inverter simulated
with the foundry supplied model at 125°C are shown in Figure 2.
The input voltage is swept from 0V to 0.6V. Initially the EFET is turned off while the DFET







Figure 2: DC characteristics of DCFL inverter with 1 fan-out
When the input voltage exceeds the EFET threshold, the EFET enters the saturation region.
As long as the current in the EFET is less than that of the DFET, the output voltage stays
high. When the current in the EFET exceeds that of the DFET, the output voltage reduces
until the EFET enters the linear region. From this point onwards the output level decreases.
The slight rise in the output at 0.6V is due to gate-drain Schottky diode conduction.
2.3 Inverter Noise Margin
From the transfer characteristics we can calculate the intrinsic noise margins of a DCFL











Figure 3: Circuits for determining (a) the intrinsic high and (b) the intrinsic low noise margin
For a logic high level, the worst case is when only one input of the driven nor gate is high.
Thus the intrinsic high noise margin can be written as:
$$V_{NMH} = V_{OH} - V_{inv} = V_{OH} - V_{te} + \frac{V_{td}}{\sqrt{x}}$$
For a logic low level (VOL), the worst case is when only one input of the driving gate is high
while all inputs of the driven gate are low. Thus Vinv is modified by the fan-in (Fin) of the





The intrinsic noise margins are plotted against x in Figure 4. It can be seen that the low noise






















Figure 4: Intrinsic noise margins of DCFL gates as a function of x at 125°C
2.4 Inverter Delay
The delay of a DCFL inverter is assumed to be the average of the rise and fall times:
$$\text{Delay} = \frac{t_r + t_f}{2} = \frac{1}{2}\left(\frac{C_L \, \Delta V}{I_{dsd}} + R_{ON} C_L\right)$$
where CL is the load capacitance, ΔV is the voltage swing and RON is the "on" resistance of
the EFET.
If we assume CL is directly proportional to the width of the EFET then it follows that the
fall-time is constant. The rise-time, however, is proportional to CL/Idsd to a first order
approximation. Thus the rise-time is directly proportional to x.
The choice of x is therefore a trade off between the noise margin and the speed.
2.5 Power-Delay Product
Once x is determined the size of the devices should be chosen accordingly. The power delay
product of DCFL gates can be written as:
$$\text{Power-Delay Product} = \frac{V_{dd} \, C_L \left(\Delta V + R_{ON} I_{dsd}\right)}{2}$$
where CL is the sum of the driver, fan-out and the interconnect capacitances:
$$C_L = C_{gdd} + F_{out} C_{gse} + C_I$$
the DFET, Fout is the number of fan-outs,
a weak pull-up. In such a case Cgdd becomes
comparable to, or even exceeds Cgse. The power delay product of a DCFL inverter is shown
in Figure 5 with one fan-out and constant x. It can be seen that the power delay product is






Figure 5: Power delay product of DCFL inverters as a function of We.
Cgdd is minimum when We = 16µm.
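The delay and power-delay expressions of sections 2.4 and 2.5 are linked through the static power Vdd·Idsd, which the following Python sketch makes explicit. It reads the delay as the average of a rise-time CL·ΔV/Idsd and a fall-time RON·CL; that reading, the function names and all numerical values are assumptions chosen for illustration, not results from the simulations.

```python
def load_capacitance(c_gdd, fan_out, c_gse, c_int):
    """Driver capacitance plus fan-out gate capacitance plus interconnect."""
    return c_gdd + fan_out * c_gse + c_int


def dcfl_delay(c_load, delta_v, i_dsd, r_on):
    """Average of rise and fall times of a DCFL inverter (first order)."""
    rise = c_load * delta_v / i_dsd   # constant DFET current charges the load
    fall = r_on * c_load              # EFET "on" resistance discharges the load
    return 0.5 * (rise + fall)


def power_delay_product(v_dd, c_load, delta_v, i_dsd, r_on):
    """Static power Vdd*Idsd multiplied by the delay above."""
    return v_dd * c_load * (delta_v + r_on * i_dsd) / 2


if __name__ == "__main__":
    # Placeholder values: 2 fF driver, one 10 fF fan-out, 5 fF wiring.
    c = load_capacitance(2e-15, 1, 10e-15, 5e-15)
    print(dcfl_delay(c, 0.5, 1e-4, 1000.0))
    print(power_delay_product(1.5, c, 0.5, 1e-4, 1000.0))
```

Because the supply current is taken as constant, multiplying the delay by Vdd·Idsd reproduces the power-delay expression term by term, which is a quick consistency check on the two formulas.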
3. SOURCE-FOLLOWER DIRECT COUPLED FET LOGIC (SDCFL)
















Figure 6: (a) SDCFL inverter and (b) SDCFL O-A-I structure
Both the noise margin and the driving capability of DCFL can be improved by appending a
source follower at its output. An SDCFL inverter is shown in Figure 6(a).
The driving stage of an SDCFL inverter (E1, D1) operates identically to a DCFL inverter
except the output high level (VOH) is not clamped to 0.6V. The buffer stage (E2, D2) reduces
the swing of the DCFL output. Thus SDCFL gates can be designed with 0V for VOL. Similar
to DCFL gates, VOH of SDCFL is clamped to 0.6V by the following fan-out gates
which load the output. SDCFL gates have better noise margins due to the improved output
low level.
When the input is high, the driving stage pulls down Vint and E2 is turned off, whereas a
low input causes the DCFL stage to pull up Vint and turn on E2. As a result, there is a large
difference in the current drawn from Vdd between the output high and the output low states.
An additional advantage of SDCFL is that or-and-invert (O-A-I) functions can be
implemented as shown in Figure 6(b).
4. SUPER BUFFER FET LOGIC (SBFL)
Similar to SDCFL, SBFL improves the performance of DCFL by appending a buffer (E2,
E3) at the output as shown in Figure 7(a). When compared with SDCFL,













output low. Thus the pull-down EFET can be much stronger than that of SDCFL. The
disadvantage of SBFL






Figure 7: (a) SBFL inverter and (b) SBFL 2-input nor gate
As for SDCFL, SBFL has a large difference in the current drawn from Vdd at the output high
and output low states. Moreover when the input has a low to high transition, both E2 and E3
are turned on simultaneously until Vint is pulled down which in turn draws a large current
spike from the power supply [LONG&BUTNER].
A 2-input nor gate is shown in Figure 7(b). Parallel transistors are required in both the driver
and the buffer stages.














Figure 8: (a) UBFL and (b) input nor gate
The UBFL inverter as shown in Figure 8(a) is a further improvement to SBFL. The ultra
buffer has both an active pull-up EFET (E4) and a passive pull-up DFET (D2). E4 is only
active during the output low to high transition, otherwise it is turned off. This is achieved by
using a two input nor gate (formed by E1, E2 and D1) as the driver. The buffer output is fed
back to one of the driver inputs. When the input is high, both E1 and E3 are turned on while
both the buffer and the driver outputs are low. As a result, both E2 and E4 are turned off.
When the input goes low the output of the driver goes high. This turns E4 on which pulls the
output high. When the output is high E2 turns on which pulls the driver output low and turns
E4 off. The output is held high by D2.
It can be seen that UBFL does not have the current spike as SBFL does. Moreover the static
current drawn from Vdd in both the output high and the output low state is determined by D1
and D2. Thus UBFL has low power consumption and only minimal noise is induced in Vdd.
Furthermore, the current within the gate will always have a direct path to ground.














6. COMPARISON BETWEEN DCFL, SDCFL, SBFL AND UBFL
The above logic families have been optimised and compared in terms of speed, power, noise
margins and noise induced in the power supply. The optimisation is done for the TCS SAGA
0.8µm process with 1.5V Vdd at 125°C.
For SBFL and UBFL, only inverters are considered whereas for DCFL and SDCFL we have
also considered nor gates. All delay simulations are carried out using a 7-stage ring oscillator.
For capacitance and fan-out measurements, we include extra capacitance and DCFL inverters
respectively at the output of each gate in the ring oscillator. The noise margin is determined
by the Maximum Square Method [HILL]. Some of the simulated results are shown in Figure













































Key: "+" denotes DCFL, "*" denotes SDCFL, "o" denotes SBFL and "x" denotes UBFL
Figure 9: (a) Effect of fan-out on speed of inverters (b) Effect of capacitive loads on speed of
inverters (c) Effect of fan-in on speed at 1 fan-out (d) Effect of fan-in on noise margins











Number of transistors for an n-input nor gate: n+1 (DCFL), n+3 (SDCFL), 2n+2 (SBFL), 2n+4 (UBFL)
1049427Static Current Balance











Various GaAs static families have been analysed for speed, power dissipation, noise margins
and noise induced in the power supply.
Analytical results on DCFL show that the pull-down/pull-up ratio has to be optimised as a
compromise between speed and the noise margins. The device size can be optimised for
either maximum speed or minimum power delay product.
DCFL, SDCFL, SBFL and UBFL have been optimised and compared. It can be seen that
DCFL is only good for low fan-ins and fan-outs. It fails to operate correctly at a large fan-in
and is slow at large fan-out.
SDCFL has good noise margins, high driving capabilities and can operate with a larger
fan-in. But it has a larger power consumption and induces noise in the power supply. Thus
SDCFL is a good substitution for DCFL in critical paths where extra speed and/or fan-ins are
required.
SBFL has even better noise margins and driving capability than SDCFL. However it has a
large fan-in load and induces even more noise into the power supply than SDCFL. This
makes its use limited.
In cases like wide buses where a large number of signals shift at the same time, noisy gates
cannot be tolerated. UBFL should be used since it induces only a small amount of noise into
the power supply while preserving good driving capability. However, its noise margin is as
low as that of DCFL and the input load is as large as that of SBFL.
It can be seen that a mixed logic approach should be used. DCFL should be the most widely
used gates and SDCFL, SBFL and UBFL should be used in special circumstances to
enhance performance.
8. ACKNOWLEDGMENTS
The authors would like to thank Assoc. Prof. Kamran Eshraghian in the Centre for Gallium
Arsenide VLSI Technology, South Australia for his informative discussions.
9. REFERENCES
[CURTICE]: Curtice, W. R.: "A MESFET Model for Use in the Design of GaAs
Integrated Circuits," IEEE Trans. Microwave Theory and Tech., MTT-28, pp. 448-456,
May 1980.
[DAVENPORT]: Davenport, W. H.: "Macro Evaluation of a GaAs 3000 Gate Array,"
Proc. IEEE GaAs IC Symp., Grenelefe, Fla., pp. 19-22, October 1986.
[HILL]: Hill, C. F.: "Noise Margin and Noise Immunity in Logic Circuits,"
Microelectronics, vol. 1, pp. 16-22, April 1968.
[LONG&BUTNER]: Long, S. I. and Butner S. E.: "Gallium Arsenide Digital Integrated
Circuit Design", pp. 214-215, McGraw Hill, 1990.
[NAKAMURA]: Nakamura, H., et al.: "A 390ps 1000 Gate Array Super-Buffer FET
Logic." Dig. of Tech. Papers, 1985 Int. Solid-State Cir. Conf., pp. 204-205, February 1985.
Appendix E. Power Requirement Calculations
The anticipated power dissipation for a Buffer chip is lW whereas for the Router chip
and the Multiplexer chip, they are 0.7W and 0.8W respectivelyll. As a result, the otal power
dissipation of a2 x 2 switching element (which composes of one Multiplexer chip, one
Router chip and two Buffer chips) is 3.5W. Similarly, the power dissipation of a 4-to-1
multiplexer (which composes of three Multiplexer chips) is 2.4W whereas the power
dissipation for a 1-to-4 demultiplexer (which composes of six Buffer chips and three Router
chips) is 8.lW.
A 1024 x 1024 622Mb/s switch requires 1920 (128 rows by 15 columns) 2 x 2
switching elements, 256 4-to-1 multiplexers and 256 1-to-4 demultiplexers. Summing all
these together, the total power dissipation of the switch becomes:
1920 x 3.5W + 256 x 2.4W + 256 x 8.1W = 9408W.
Note that this calculation is based on the worst-case condition that all the elements in the
multiplexers and demultiplexers run at their maximum speed. In reality, different stages of
these components run at different speeds, so there is a slight power saving. However, it will
not be as significant as that for CMOS, since the power dissipation in GaAs MESFETs is
mainly static whereas it is mainly dynamic in CMOS.
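The worst-case total for the 622Mb/s switch can be checked with a few lines of arithmetic. The sketch below uses only the module counts and power figures quoted above; the variable names are illustrative.

```python
# Module power in watts, from the figures in the text.
SWITCH_2X2_W = 3.5
MUX_4TO1_W = 2.4
DEMUX_1TO4_W = 8.1

# Fabric dimensions and multiplexer/demultiplexer counts from the text.
rows, columns = 128, 15
n_switches = rows * columns
n_mux = 256
n_demux = 256

# Worst case: every element is assumed to run at its maximum speed.
total_w = (
    n_switches * SWITCH_2X2_W
    + n_mux * MUX_4TO1_W
    + n_demux * DEMUX_1TO4_W
)

print(f"{n_switches} switching elements, total {total_w:.0f} W")
```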
For comparison, a 1024 x 1024 155Mb/s switch requires 352 (32 rows by 11
columns) 2 x 2 switching elements, 64 16-to-1 multiplexers and 64 1-to-16 demultiplexers.
Summing all these together, the total power dissipation of the switch becomes:
352 x 3.5W + 64 x 12W + 64 x 40.5W = 4592W.
The power saving is achieved by the use of the bit-rate conversion technique, which increases
the incoming data rate so that the number of 2 x 2 switching elements is reduced.
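The element counts in both cases are consistent with a Benes fabric, which for N ports has N/2 rows and 2 log2(N) - 1 columns of 2 x 2 elements. The sketch below assumes that topology and assumes fabric sizes of 256 and 64 ports, inferred from the counts 1920 and 352; the function name is hypothetical. The per-unit figures of 12W and 40.5W are taken directly from the sum above and equal five times the 4-to-1 and 1-to-4 figures.

```python
def benes_elements(ports):
    """Number of 2 x 2 elements in a Benes network with `ports` inputs.

    Assumes `ports` is a power of two.
    """
    log2_ports = ports.bit_length() - 1
    rows = ports // 2
    columns = 2 * log2_ports - 1
    return rows * columns


SWITCH_2X2_W = 3.5  # watts per 2 x 2 switching element, from the text

# Assumed fabric sizes: 256 ports gives 1920 elements, 64 ports gives 352.
elements_622 = benes_elements(256)
elements_155 = benes_elements(64)

# 155Mb/s case: 64 multiplexers at 12 W and 64 demultiplexers at 40.5 W.
total_155_w = elements_155 * SWITCH_2X2_W + 64 * 12.0 + 64 * 40.5

print(elements_622, elements_155)
print(f"155Mb/s switch total: {total_155_w:.0f} W")
```

Under these assumptions, going from a 256-port to a 64-port fabric cuts the number of switching elements by more than a factor of five, which is where the reduction in the first term of the sum comes from.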
¹¹ All power dissipations are based on simulation results running at 50% higher than their designed









