A practical WSI experimental programme by Jalowiecki, IP et al.
A PRACTICAL WSI EXPERIMENTAL PROGRAMME 
Ian P. Jalowiecki¹, Stephen J. Hedge² and R. M. Lea³
Introduction 
At Brunel University, research has been underway for several years to assess the architectural, electrical
and physical benefits and constraints of the WASP, the wafer-scale Associative String Processor (ASP). This
is intended to implement a massively parallel processor entirely within the constraints of wafer-scale
integration (WSI).
WASP 1 and WASP 2 were the technology demonstrators of the UK funded Alvey programme (starting
1984), researching fundamental design methodologies for WSI. They are both examples of the Associative
String Processor (ASP) architecture, developed by Brunel University. Further demonstrators are currently
funded by a 3½-year US ONR IS&T programme (starting 1987), involving further technology
demonstration, applications research and fundamental packaging and manufacturing design issues.
Figure 1 ASP module (labelled blocks: Data Interfaces, Control Interface)
ASP Architecture 
A generic ASP module comprises communicating ASP substrings, each with an ASP Data Buffer (ADB) and
an ASP Control Unit (ACU), as shown in Figure 1.
Each ASP substring is an SIMD parallel processing structure, comprising a string of identical APEs
(Associative Processing Elements), as shown in Figure 2. Each APE incorporates a 32-bit data register,
a 5-bit activity register, and a (32+5)-bit parallel comparator. Each APE also includes a single-bit
full-adder, 4 status flags and logic for communicating with other APEs via an Inter-APE Communication
Network. All APEs in a substring share common bit-parallel Data, Activity and Control busses and a single
feedback line (Match Reply, MR). The ASP is based on content-matching; thus APEs are selected by
comparing their registers with the states of the corresponding Data and Activity busses. Data I/O is
supported by the Vector Data Buffer, a dual-port memory which has a bit-parallel interface to the ADB via
the Secondary Data Exchange (SDX) port, and which can perform bit-serial exchanges with all APE data
registers via the Primary Data Exchange (PDX) port.
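The content-matching selection described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `APE` class, the `search` function, the per-bit "don't care" masks and the treatment of Match Reply as the OR of all APE match results are assumptions made for illustration only. Only the register widths (32-bit data, 5-bit activity) come from the text.

```python
from dataclasses import dataclass

DATA_BITS = 32      # width of the APE data register (from the text)
ACTIVITY_BITS = 5   # width of the APE activity register (from the text)


@dataclass
class APE:
    """Hypothetical model of one Associative Processing Element."""
    data: int = 0      # 32-bit data register
    activity: int = 0  # 5-bit activity register


def matches(ape, data_bus, activity_bus, data_mask, activity_mask):
    """Model of the (32+5)-bit parallel compare.

    Bits cleared in a mask are ignored ("don't care"); masking is an
    assumption of this sketch, not something stated in the text.
    """
    data_ok = ((ape.data ^ data_bus) & data_mask) == 0
    activity_ok = ((ape.activity ^ activity_bus) & activity_mask) == 0
    return data_ok and activity_ok


def search(substring, data_bus, activity_bus,
           data_mask=(1 << DATA_BITS) - 1,
           activity_mask=(1 << ACTIVITY_BITS) - 1):
    """Broadcast the bus states to every APE and collect the results.

    Returns (tags, match_reply): one boolean per APE, plus a single
    feedback value modelled here as the OR of all tags (an assumption).
    """
    tags = [matches(ape, data_bus, activity_bus, data_mask, activity_mask)
            for ape in substring]
    return tags, any(tags)


substring = [APE(data=0xA5, activity=0b00001),
             APE(data=0xFF, activity=0b00001),
             APE(data=0xA5, activity=0b00010)]
tags, mr = search(substring, data_bus=0xA5, activity_bus=0b00001)
print(tags, mr)  # [True, False, False] True
```

In this model every APE sees the same bus values in the same step, which is the SIMD property the text describes; the list comprehension merely stands in for hardware that performs all comparisons in parallel.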
Figure 2 ASP substring schematic

¹ Brunel University, Uxbridge, Middlesex, UB8 3PH, UK
² Aspex Microsystems Ltd., Brunel University, Uxbridge, Middlesex, UB8 3PH, UK
³ Brunel University, Uxbridge, Middlesex, UB8 3PH, UK

Since the ASP comprises a long string of simply linked, small, identical content-addressable APEs, the
ASP structure is highly amenable to defect/fault-tolerance, by simply adding APEs to the end of the
required ASP substring length and bypassing faulty APE-blocks. Furthermore, the WASP module
inter-connection strategy offers hierarchical defect/fault-tolerance by selective by-passing of faulty
APE-blocks, faulty groups-of-blocks (i.e. a 'chip') or faulty substrings.
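The spare-and-bypass idea can be sketched in a few lines of Python. This is an illustrative model only: the function names, the fault-map representation and the example of five physical blocks per substring with one spare are assumptions, not details from the WASP designs. Only two of the hierarchical levels named above (APE-blocks and substrings) are modelled.

```python
def configure_string(block_ok, blocks_required):
    """Select working APE-blocks, in physical order, to form a logical string.

    block_ok        -- one bool per physical APE-block (True = passed test)
    blocks_required -- number of working blocks the substring must provide
    Returns the physical indices used, or None if too few blocks work.
    Faulty blocks, and any surplus working blocks, are bypassed.
    """
    used = [i for i, ok in enumerate(block_ok) if ok][:blocks_required]
    return used if len(used) == blocks_required else None


def configure_device(fault_maps, blocks_required, substrings_required):
    """Apply the same rule one level up: bypass unrepairable substrings.

    fault_maps -- one block_ok list per physical substring
    Returns {substring index: block indices used}, or None if the device
    cannot supply the required number of full-length substrings.
    """
    config = {}
    for s, block_ok in enumerate(fault_maps):
        used = configure_string(block_ok, blocks_required)
        if used is not None:
            config[s] = used
        if len(config) == substrings_required:
            return config
    return None


# Illustrative numbers: 4 blocks needed, 5 physical blocks (1 spare) per substring.
maps = [
    [True, False, True, True, True],   # one fault: repaired using the spare
    [True, False, False, True, True],  # two faults: whole substring bypassed
    [True, True, True, True, True],    # no faults: spare left unused
]
config = configure_device(maps, blocks_required=4, substrings_required=2)
print(config)  # {0: [0, 2, 3, 4], 2: [0, 1, 2, 3]}
```

Because the blocks are identical and linked in a simple string, the configuration step reduces to choosing which blocks to skip. No rerouting problem arises beyond the bypass itself.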
WASP Architecture 
As indicated in Figure 3, a WASP device is physically composed from 3 different VLSI sized blocks known
as Data Routers (DRs), ASP substrings and Control Routers (CRs). The DR and CR blocks incorporate
routing to connect ASP substring rows to a common Data Interface (DI) and a common Control Interface
(CI) respectively. Moreover, both these blocks incorporate LKL and LKR ports to effect row-to-row
extension of ASP substrings.
WASP 1 
A fundamental demonstration of this class of wafer-scale device was successfully made by WASP 1,
fabricated in 1988. This comprised individual ASP substrings, each with four ASP modules and a
dedicated CR/CI module. Manufacturing methods were based on standard fabrication technology,
involving the use of standard steppers and VLSI die masks. Indeed, the WASP 1 & 2 demonstrators
employ only one stepper reticle (i.e. maximum 13mm x 13mm), which is subdivided to achieve
cost-effectiveness by manufacturing all blocks through selective exposure of a single reticle. Wafer
fabrication is therefore based upon the selective exposure of shuttered portions of the reticle by the
Canon FPA-1550 stepper.
Experiments carried out on this demonstrator included:
1. zoned clock and signal distribution
2. selective power isolation of modules failing through short-circuits
3. selective ASP module isolation and bypass (inter-module fault-tolerance)
4. selective bypassing of APE blocks (intra-module fault-tolerance)
Figure 3 Generic WASP device floorplan 
After extensive testing, this method of DSW (Direct-Step-on-Wafer) reticle 'stitching' (to interconnect ASP
substring and CR/CI blocks) was fully proven. In addition, defect/fault-tolerance within and between
ASP substring blocks and selective power isolation were successfully demonstrated, as was the
wafer-scale clock and signal distribution across the ~4cm devices.
WASP 2 
Two WASP 2 variants have been fabricated, based on the successful ASP block from WASP 1. These are
described below and detailed in Table I.
1. WASP 2A
WASP 2A integrates 864 APEs in 6 substrings, each with four ASP blocks, a DR incorporating an ADB
buffer memory, and a new CR/CI design. This was implemented as a less than full-wafer device as a safe
intermediate step towards a whole wafer WASP. Four devices are on the wafer, with the remaining area
occupied by test chips.
2. WASP 2B
The WASP 2B composition is representative of a whole wafer WASP device, and comprises an array of 180
ASP devices on a 6 inch wafer. This device tests some of the fundamental issues of whole wafer
monolithic integration, especially signal and power distribution.
These variants are fabricated in relaxed 2-micron design rules at Plessey Roborough. WASP 2A
completed fabrication in 1Q90 whilst WASP 2B completed fabrication in 3Q90.

Table I Characteristics of the WASP 2A and WASP 2B Wafer Scale Integration demonstrators

                  WASP 2A          WASP 2B
#APEs             864              6480
area              3.9cm x 3.9cm    9.1cm x 9.8cm
transistors       1.26M            8.43M
power             5.2 - 8.0W       29.6 - 51.2W
external clock    6MHz             6MHz
internal clock    12MHz            12MHz

WASP development
As (monolithic) WSI technology demonstrators, WASP 1 and 2 have been highly successful. However,
as functional WASP prototypes, they leave much to be desired. Indeed, WASP 1 and 2 provide only a
partial implementation of ASP substrings. Moreover, budgetary restrictions constrained designs to
representative rather than realistic 'chip' blocks, suitable only for proof-of-principle testing.
Currently, a 3-phase prototype development of the 58mm x 56mm 15,360-APE WASP device is scheduled
for 1991 through 1992. This project, funded under a US SDIO-IST contract, involves the design,
fabrication and evaluation of:
WASP 3: constituting a bold step towards a full implementation of the ASP substring, with 320-APE
        ASP substring blocks, but with the DI and CI blocks designed only to facilitate
        testing and evaluation of ASP substring rows
WASP 4: consolidating ASP substring blocks and incorporating full CI and DI blocks
WASP 5: consolidating and debugging WASP 4 to deliver a definitive WASP.
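As a cross-check on the APE counts quoted for the WASP 2 variants and for the planned device, the short Python calculation below derives the implied block sizes. It rests on two assumptions that the text does not state outright: that the 180 devices in the WASP 2B array are ASP blocks of the same size as those in WASP 2A, and that the 15,360-APE device is built entirely from 320-APE substring blocks. The variable names are illustrative.

```python
# Figures quoted in the text and in Table I.
WASP2A_APES = 864
WASP2A_SUBSTRINGS = 6
WASP2A_BLOCKS_PER_SUBSTRING = 4
WASP2B_APES = 6480
WASP2B_BLOCKS = 180           # assumption: the 180 array devices are ASP blocks
PLANNED_APES = 15_360
PLANNED_APES_PER_BLOCK = 320  # assumption: the device uses only 320-APE blocks

# divmod exposes any remainder, so an inconsistent figure would show up.
apes_per_block_2a, rem_2a = divmod(
    WASP2A_APES, WASP2A_SUBSTRINGS * WASP2A_BLOCKS_PER_SUBSTRING)
apes_per_block_2b, rem_2b = divmod(WASP2B_APES, WASP2B_BLOCKS)
planned_blocks, rem_planned = divmod(PLANNED_APES, PLANNED_APES_PER_BLOCK)

print(apes_per_block_2a, rem_2a)    # 36 0
print(apes_per_block_2b, rem_2b)    # 36 0
print(planned_blocks, rem_planned)  # 48 0
```

Under these assumptions both WASP 2 variants work out to 36 APEs per ASP block, consistent with their being assembled from the same block design, and the planned device corresponds to 48 substring blocks of 320 APEs.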
The WASP 3 ASP substring block design is now well advanced, based entirely on the principles pioneered
in the early WASP devices. Special emphasis is being placed on the implementation of power and,
especially, signal distribution. Novel methods are being investigated to enhance the reliability of large
area bus structures, based on the experience of WASP 2.
Conclusions
WASP 1 experimental results have confirmed the feasibility of the basic architecture and design
methodology. Furthermore, the WASP 2 development represents an expansion of the scope of the original
programme, with the originally planned WASP 2A ULSI device being augmented by a full-wafer WASP 2B
variant, assembled from the same reticles. Testing of the WASP 2A has determined that it achieves most
of its major objectives. Continued development is underway on the latest demonstrator in the series,
WASP 3.
