GaAs VLSI for aerospace electronics by Larue, G. & Chan, P.
N94-71116
2nd NASA SERC Symposium on VLSI Design 1990 5.2.1
GaAs VLSI for Aerospace Electronics
G. LaRue and P. Chan
Boeing Aerospace and Electronics
High Technology Center
P.O. Box 3999 MS 7J-56
Seattle, WA 98124-2499
1 Introduction
Advanced aerospace electronics systems require high-speed, low-power, radiation-hard,
digital components for signal processing, control, and communication applications. GaAs
VLSI devices provide a number of advantages over silicon devices including higher carrier
velocities, ability to integrate with high performance optical devices, and high-resistivity
substrates that provide very short gate delays, good isolation, and tolerance to many
forms of radiation. However, III-V technologies also have disadvantages, such as lower
yield compared to silicon MOS technology.
Achieving very large scale integration (VLSI) is particularly important for fast complex
systems. At very short gate delays (less than 100 ps), chip-to-chip interconnects severely
degrade circuit clock rates. Complex systems, therefore, benefit greatly when as many
gates as possible are placed on a single chip. To fully exploit the advantages of GaAs
circuits, attention must be focused on achieving high integration levels by reducing power
dissipation, reducing the number of devices per logic function, and providing circuit designs
that are more tolerant to process and environmental variations. In addition, adequate noise
margin must be maintained to ensure a practical yield.
2 Applications
Specific applications of GaAs ICs are in fiber optic communications and digital signal
processing. The use of fiber optics on board aircraft and spacecraft provides significant
reductions in weight. GaAs electronics have achieved fiber optic data rates well beyond
1 gigabit per second. GaAs circuits can also benefit aerospace applications in high-speed
data processing by occupying a smaller volume and reducing power dissipation, thus
saving weight. Although ECL technology can come close to the speed of GaAs, its
power dissipation is much higher. CMOS technology can be used in some applications
by processing data in parallel at the expense of larger volume. Some applications require
low latency and must be performed at a high data rate, which rules out parallel solutions
entirely.
3 Floating Point Multiplier
We designed a 32-bit floating point multiplier to investigate the yield and performance of
GaAs VLSI for applications in digital signal processing. With over 10,000 equivalent gates,
the multiplier approaches the current complexity limits of GaAs. It also provides a good
example of a GaAs VLSI integrated circuit targeted for aerospace applications.
The multiplier accepts normalized 32-bit floating point numbers expressed in the IEEE
Standard 754, version 8.0 or 10.0 single precision format [1]. GaAs 1 micron E/D MESFET
technology was chosen because of the maturity of the fabrication process for LSI
production. Operation over the full military temperature range is required.
4 GaAs Logic Family Considerations
There are several logic families that are commonly used to design GaAs E/D MESFET
circuits. In choosing a logic family, we were most concerned about noise margin. At these
high integration levels, noise margin must be higher due to increased device variations,
power-bus noise and crosstalk on signal lines. This high noise margin must be sufficient
over the entire military temperature range to ensure adequate yield. We also wanted
single-supply operation.
Families that meet the above criteria are source-coupled FET logic (SCFL) [2], gain
FET logic (GFL) [3], and FET-FET logic (FFL) [4]. FFL was invented at Boeing and has a
better delay-power product than either GFL or SCFL. Although SCFL has a much higher
power dissipation, it can perform very complex logic functions, such as implementing
a full adder with two gates and providing the sum and carry outputs in only one gate delay
each.
Adder Type     Device   Sum delay   Carry delay   Adder power   Wallace tree   Wallace tree
               count    (ps)        (ps)          (mW)          delay (ns)     delay-power product
NOR FFL        91       480         290           9.6           1.9            18.2
Complex FFL    34       585         282           2.1           2.0            4.2
SCFL           44       460         450           2.4           1.8            4.3
Complex GFL    39       750         395           1.8           2.6            4.7
Table 1: Comparison of Different Adder Designs
5 Full Adder Design
The most important building block of the multiplier is the full adder. The design of the full
adder is a determining factor in the final speed and power dissipation of the chip. Figure
1 shows the schematic of an all-NOR implementation of a full adder. It requires 12 gates
to implement, and the carry and sum are generated in 2 and 3 gate delays, respectively.
This full adder design results in a high device count and high power dissipation.
Figure 1: NOR Implementation of Full Adder
A complex AND/NOR gate full adder was designed using FFL and GFL gates (Figure 2).
This full adder requires only 2 complex gates, and the carry and sum are generated in 1
and 2 gate delays, respectively. As a comparison, an SCFL full adder was designed (Figure 3).
The SCFL adder was designed to have comparable power dissipation to the complex FFL
and GFL adders.
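As an illustration of how two complex gates can form a full adder, the sketch below models one common AND/NOR arrangement. It is an assumption, not a transcription of Figure 2: the first gate produces the complemented carry in one gate delay, and the second gate reuses that signal to produce the complemented sum in two gate delays. The names `and_nor`, `complex_full_adder`, and `truth_table` are hypothetical.

```python
from itertools import product


def and_nor(*and_terms):
    """Model an AND/NOR complex gate: NOR of the AND of each input group."""
    return int(not any(all(term) for term in and_terms))


def complex_full_adder(a, b, c):
    """Two-gate full adder (assumed topology); returns active-low outputs."""
    # Gate 1: complemented carry, available after one gate delay.
    carry_n = and_nor((a, b), (b, c), (a, c))
    # Gate 2: complemented sum, uses carry_n, available after two gate delays.
    sum_n = and_nor((a, b, c), (a, carry_n), (b, carry_n), (c, carry_n))
    return carry_n, sum_n


def truth_table():
    """Return rows (a, b, c, carry, sum) with outputs converted to active-high."""
    rows = []
    for a, b, c in product((0, 1), repeat=3):
        carry_n, sum_n = complex_full_adder(a, b, c)
        rows.append((a, b, c, 1 - carry_n, 1 - sum_n))
    return rows
```

Because both outputs are complemented in this model, a real design would either invert them or alternate polarity between adder levels; the sketch does not address that choice.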
It takes 3 sum delays and 1 carry delay for the Wallace adder tree to reduce 13 partial
products to 3. Table 1 shows the device count, the sum and carry delays, the power dissipation,
and the Wallace tree delay for the different adder designs under nominal processing
conditions. The all-NOR implementation has a much higher device count and power
dissipation than the other designs.
Figure 2: Complex AND/NOR Full Adder
Figure 3: SCFL Adder
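The reduction of 13 partial products to 3 can be checked with a short sketch. It assumes each tree level groups the operands in threes and compresses every group to two with full adders, and it reads the delay count in the text as three sum delays plus one carry delay. The function names are hypothetical, and the delay figure is only a rough estimate that need not match the tabulated Wallace tree delays exactly.

```python
def wallace_levels(operands, target):
    """Count 3:2 full-adder levels needed to reduce `operands` rows to `target`."""
    if target < 2:
        raise ValueError("3:2 compression cannot reduce below two operands")
    levels = 0
    while operands > target:
        groups, leftover = divmod(operands, 3)
        operands = 2 * groups + leftover  # each group of 3 becomes sum + carry
        levels += 1
    return levels


def estimated_tree_delay_ps(sum_delay_ps, carry_delay_ps, levels):
    """Estimate tree delay as (levels - 1) sum delays plus one carry delay."""
    return (levels - 1) * sum_delay_ps + carry_delay_ps


# Sum and carry delays (ps) for the complex FFL adder, taken from Table 1.
LEVELS_13_TO_3 = wallace_levels(13, 3)
COMPLEX_FFL_ESTIMATE_PS = estimated_tree_delay_ps(585, 282, LEVELS_13_TO_3)
```

Under these assumptions the row count goes 13, 9, 6, 4, 3, which is four levels and agrees with the four adder delays quoted above.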
We chose FFL with the complex gate full adder approach to implement the multiplier.
FFL has the lowest delay-power product for the Wallace tree and the smallest device
count. GFL has 15% lower performance than FFL with comparable layout area. SCFL is
comparable in delay-power performance to FFL but requires a substantially larger layout
area. The area is larger mainly because SCFL is a differential logic family and requires
two interconnects between gates instead of one.
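The delay-power comparison can be reproduced from the Table 1 figures. The sketch below assumes the tabulated product is the Wallace tree delay (ns) multiplied by the adder power (mW), which matches the printed values to rounding; the dictionary layout and function names are hypothetical.

```python
# Adder power (mW) and Wallace tree delay (ns), copied from Table 1.
TABLE_1 = {
    "NOR FFL": {"adder_power_mw": 9.6, "tree_delay_ns": 1.9},
    "Complex FFL": {"adder_power_mw": 2.1, "tree_delay_ns": 2.0},
    "SCFL": {"adder_power_mw": 2.4, "tree_delay_ns": 1.8},
    "Complex GFL": {"adder_power_mw": 1.8, "tree_delay_ns": 2.6},
}


def delay_power_product(entry):
    """Delay-power product in ns x mW (numerically equal to picojoules)."""
    return entry["tree_delay_ns"] * entry["adder_power_mw"]


def rank_by_delay_power(table):
    """Return design names ordered from lowest to highest delay-power product."""
    return sorted(table, key=lambda name: delay_power_product(table[name]))


if __name__ == "__main__":
    for name in rank_by_delay_power(TABLE_1):
        print(f"{name}: {delay_power_product(TABLE_1[name]):.1f}")
```

This ranking captures only the delay-power criterion; device count and layout area, which also drove the choice, are not modeled.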
6 Multiplier Architecture
Figure 4 shows a simplified block diagram of the floating point multiplier. The chip has
a 4-stage pipeline architecture employing high-speed pass-transistor pipeline latches. The
32-bit inputs are screened for invalid inputs and the signs of the numbers are multiplied by
an exclusive-OR gate. The exponent adder performs addition of the two 8-bit exponents
and outputs the sum, as well as the sum incremented by 1 for the possible right shift of
the 24-bit mantissa result. The modified Booth Encoder produces a 69-bit code from the
multiplicand.
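To show why a 24-bit mantissa leads to thirteen partial products, the following sketch applies textbook radix-4 modified Booth recoding to an unsigned operand. It is an assumption about the general technique, not a model of the chip's encoder: it does not reproduce the 69-bit code format, and the function and parameter names are hypothetical.

```python
def booth_radix4_digits(recoded_operand, width=24):
    """Recode an unsigned `width`-bit operand into radix-4 digits in {-2..2}."""
    if not 0 <= recoded_operand < (1 << width):
        raise ValueError("operand does not fit in the given width")
    padded = recoded_operand << 1  # append the implicit 0 below the LSB
    digit_count = width // 2 + 1   # 13 digits for a 24-bit unsigned operand
    digits = []
    for i in range(digit_count):
        window = (padded >> (2 * i)) & 0b111  # three overlapping bits
        low, mid, high = window & 1, (window >> 1) & 1, (window >> 2) & 1
        digits.append(low + mid - 2 * high)
    return digits


def booth_partial_products(other_operand, recoded_operand, width=24):
    """Return the weighted partial products; their sum is the full product."""
    digits = booth_radix4_digits(recoded_operand, width)
    return [(digit * other_operand) << (2 * i) for i, digit in enumerate(digits)]
```

Each digit selects 0, plus or minus one, or plus or minus two times the other operand, so a partial product needs one more magnitude bit than the operand plus a sign, consistent with 26-bit partial products for 24-bit mantissas.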
Thirteen 26-bit partial products are generated by the partial product generator and
are reduced to three partial products by the first Wallace Tree. The second Wallace
Tree further reduces the three partial products to two and the look-ahead-carry generator
generates the carries for the final adder. Two rounding modes are available: round to the
nearest and round toward zero. The result from the final adder is rounded, checked for
overflow and underflow, and renormalized into the 23-bit mantissa product. The correct
8-bit exponent result is then chosen and the 32-bit (sign bit, 8-bit exponent, and 23-bit
mantissa) product is obtained.
Figure 4: Block Diagram of Floating Point Multiplier
(Blocks shown: input latches for the 32-bit inputs A and B; sign generator, exponent adder,
and Booth encoder; partial product generator and Wallace tree 1; Wallace tree 2 and
look-ahead carry generator; exponent output MUX; output latches; pipeline latches between
stages; round mode, error flags, and product signals.)
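The data flow described in this section can be summarized as a behavioral sketch. It is a software model under stated assumptions, not the chip's logic: it accepts only normalized inputs, implements round to nearest with ties to even and round toward zero, and reports overflow or underflow as flags instead of producing a result. The names `fp32_multiply`, `to_bits`, and `from_bits`, and the flag strings, are hypothetical.

```python
import struct

BIAS = 127  # IEEE 754 single-precision exponent bias


def to_bits(x):
    """Encode a Python float as IEEE 754 single-precision bits."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def from_bits(bits):
    """Decode IEEE 754 single-precision bits into a Python float."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def fp32_multiply(a_bits, b_bits, round_nearest=True):
    """Multiply two normalized singles; returns (product_bits or None, flags)."""
    flags = set()
    sign = (a_bits >> 31) ^ (b_bits >> 31)        # sign product is an XOR
    exp_a, exp_b = (a_bits >> 23) & 0xFF, (b_bits >> 23) & 0xFF
    if not (1 <= exp_a <= 254 and 1 <= exp_b <= 254):
        flags.add("invalid")                       # input screening
        return None, flags
    exp_sum = exp_a + exp_b - BIAS                 # exponent adder output
    exp_sum_plus_1 = exp_sum + 1                   # used if a right shift occurs
    mant_a = (a_bits & 0x7FFFFF) | (1 << 23)       # restore the hidden leading 1
    mant_b = (b_bits & 0x7FFFFF) | (1 << 23)
    product = mant_a * mant_b                      # 48-bit mantissa product
    shift = 24 if product >> 47 else 23            # right shift if product >= 2.0
    exponent = exp_sum_plus_1 if shift == 24 else exp_sum
    mantissa = product >> shift
    remainder = product & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    if round_nearest and (remainder > half or (remainder == half and mantissa & 1)):
        mantissa += 1                              # round to nearest, ties to even
    if mantissa >> 24:                             # rounding carried out
        mantissa >>= 1                             # renormalize
        exponent = exp_sum_plus_1
    if exponent >= 255:
        flags.add("overflow")
    elif exponent <= 0:
        flags.add("underflow")
    if flags:
        return None, flags
    return (sign << 31) | (exponent << 23) | (mantissa & 0x7FFFFF), flags
```

For example, `from_bits(fp32_multiply(to_bits(1.5), to_bits(2.5))[0])` evaluates to 3.75. In this model only the exponent sum and the sum plus one are ever selected, which mirrors the two exponent adder outputs described above.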
7 Simulated Results
Automatic placement and routing of FFL standard cells was used to lay out the circuit.
Interconnect capacitances were then extracted. The critical paths were resimulated and
found to be less than 3 ns between latches. Operation near 350 MFLOPS is expected
for TriQuint Semiconductor's 1 micron E/D MESFET process. Power dissipation will be
under 4.5 W. The die size is about 7.5 mm by 8 mm with about 40,000 devices.
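The relationship between these figures can be checked with simple arithmetic. The sketch below assumes one result per clock cycle when the pipeline is full, treats the 3 ns critical path and the 4.5 W power as upper bounds, and uses hypothetical function names.

```python
def clock_rate_mhz(critical_path_ns):
    """Clock rate (MHz) supported by a given latch-to-latch critical path."""
    return 1e3 / critical_path_ns


def energy_per_result_nj(power_w, rate_mflops):
    """Energy per floating point result (nJ) at a given power and throughput."""
    return power_w / rate_mflops * 1e3


if __name__ == "__main__":
    # A 3 ns critical path supports roughly 333 MHz; shorter paths run faster.
    print(f"{clock_rate_mhz(3.0):.0f} MHz")
    # Upper bound on energy per result at 4.5 W and 350 MFLOPS.
    print(f"{energy_per_result_nj(4.5, 350.0):.1f} nJ")
```

Since the critical paths are stated to be less than 3 ns, roughly 333 MHz is a lower bound on the clock rate, which is consistent with the expected operation near 350 MFLOPS.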
8 Conclusion
The design of a GaAs VLSI floating point multiplier was described. The chip is expected
to perform multiplication at data throughput rates of about 350 MHz when the pipeline
latches are enabled. With the pipeline latches disabled, the multiplier will operate at about
110 MHz.
The high speed, low power, and radiation hardness of the multiplier will demonstrate
the benefits of using GaAs VLSI for aerospace electronics.
References
[1] "IEEE Standard for Binary Floating Point Arithmetic," ANSI/IEEE Std 754-1985,
New York, The Institute of Electrical and Electronics Engineers, Inc., 1985.
[2] T. Takada et al., "A 2 Gb/s Throughput GaAs Digital Time Switch LSI Using LSCFL,"
IEEE Trans. on Electron Devices, Vol. ED-32, pp. 2 478-2753, 1985.
[3] G. M. Lee et al., "A High Performance, Low Power GaAs Gate Array Family," VLSI
Systems Design (USA), Vol. 8, No. 8, pp. 24-25, 28-30, July 1987.
[4] G. S. LaRue, T. J. Williams and P. Y. Chan, "FET FET Logic: A High Performance,
High Noise Margin E/D Logic Family," GaAs IC Symposium, Oct. 1990.
