Evaluation of a signal processing test bed. by Vrabel, George Thomas
Calhoun: The NPS Institutional Archive
Theses and Dissertations Thesis Collection
1978
Evaluation of a signal processing test bed.
Vrabel, George Thomas
























SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)
REPORT DOCUMENTATION PAGE (READ INSTRUCTIONS BEFORE COMPLETING FORM)
1. REPORT NUMBER
2. GOVT ACCESSION NO.
3. RECIPIENT'S CATALOG NUMBER
4. TITLE (and Subtitle)
Evaluation of a Signal Processing Test Bed
5. TYPE OF REPORT & PERIOD COVERED
Master's Thesis; December 1978
6. PERFORMING ORG. REPORT NUMBER
7. AUTHOR(s)
George Thomas Vrabel
8. CONTRACT OR GRANT NUMBER(s)
9. PERFORMING ORGANIZATION NAME AND ADDRESS
Naval Postgraduate School
Monterey, California 93940
10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS
13. NUMBER OF PAGES
107
14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office)
Naval Postgraduate School
Monterey, California 93940
16. DISTRIBUTION STATEMENT (of this Report)
Approved for public release; distribution unlimited
17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report)
18. SUPPLEMENTARY NOTES
19. KEY WORDS (Continue on reverse side if necessary and identify by block number)
Array processing
Signal processing
20. ABSTRACT (Continue on reverse side if necessary and identify by block number)
This thesis was undertaken to examine an acoustical
signal processing test bed, similar to the one installed
at the Naval Postgraduate School, to be used primarily for
experimental applications. The major components include two
PDP-11 series computers, at least one array processor, a
mass storage unit as well as assorted input and display
equipment. Of major interest were the computer selection,
DD FORM 1473 (1 JAN 73), EDITION OF 1 NOV 65 IS OBSOLETE
(Page 1) S/N 0102-014-6601 Unclassified
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)






Approved for public release; distribution unlimited




Lieutenant, United States Navy
Submitted in partial fulfillment of the
requirements for the degree of






This thesis was undertaken to examine an acoustical
signal processing test bed, similar to the one installed at
the Naval Postgraduate School, to be used primarily for
experimental applications. The major components include two
PDP-11 series computers, at least one array processor, a
mass storage unit as well as assorted input and display
equipment. Of major interest were the computer selection,














III. ARRAY PROCESSOR
IV. THE AP-120B ARRAY PROCESSOR

4. Table Memory
5. Data Pad X and Y
6. Main Data Memory
7. Program Source Module
8. Interface with PDP-11 Series
a. Front Panel
b. DMA Control

B. SOFTWARE

d. Testing Software
2. Programming Language
3. Page Select Option
4. Programmable I/O Processor
C. PROGRAMMING, OPERATION AND EXECUTION
V. MAP-300
A. CHARACTERISTICS AND HARDWARE
1. CSPU
2. Arithmetic Processor
a. APU
b. APS
3. Host Interface Scroll
4. Memory
B. SOFTWARE SUPPORT
1. Executive and Associated Routines
a. Assembler
b. Simulator
c. Loader

3. Programming Language
4. I/O Scrolls
a. Analog Data Acquisition Module
C. PROGRAMMING, OPERATION AND EXECUTION
VI. DISCUSSION OF FINDINGS
VII. CONCLUSIONS AND RECOMMENDATIONS
VIII. REFERENCES
INITIAL DISTRIBUTION LIST


ACKNOWLEDGEMENTS
The continual support and critical help provided by
thesis advisor Professor G. Rahe and systems programmers A.
Wong and W. Thomas is gratefully acknowledged.
The preliminary text of this thesis was prepared using
the software of the UNIX operating system, operating on a
PDP-11/50 of the Naval Postgraduate School Computer
Laboratory.

ABBREVIATIONS
A1 -- AP-120B Adder Register One
A2 -- AP-120B Adder Register Two
A/D -- Analog to Digital
ADAM -- Analog Data Acquisition Module
ALU -- Arithmetic Logical Unit
ANSI -- American National Standards Institute
AP -- MAP-300 Arithmetic Processor
APGET -- Get Data From the AP-120B
APPUT -- Put Data Into the AP-120B
AP1 -- MAP-300 Arithmetic Processor One
AP2 -- MAP-300 Arithmetic Processor Two
AP-120B -- Floating Point Systems Array Processor Model 120B
APAL -- AP-120B Cross-Assembler
APEX -- AP-120B Executive Routine
APDMA -- AP-120B AP Direct Memory Address Register
APLINK -- AP-120B Linker and Loader
APMATH -- AP-120B Math Library
APMAE -- AP-120B Memory Address Extension
APSIM -- AP-120B Simulator
APTEST -- AP-120B Path Tester Program
APS -- MAP-300 Addresser Processor Section
APU -- MAP-300 Arithmetic Processing Unit
CVMUL -- Complex Vector Multiply
CPU -- Central Processing Unit
CSPU -- MAP-300 Central System Processing Unit
CSW -- MAP-300 Control Status Register or C-State Word
CTL -- AP-120B Control Register
DCB -- MAP-300 Driver Control Block
DIO -- Direct Input/Output
DMA -- Direct Memory Access
DPA -- AP-120B Data Pad Address Register
DPX -- AP-120B Data Pad X
DPY -- AP-120B Data Pad Y
FA -- AP-120B Adder Result Register
FCB -- MAP-300 Function Control Block
FFT -- Fast Fourier Transform
FIFFT -- Forward/Inverse Fast Fourier Transform Test
FIFO -- First In First Out
FL -- AP-120B Adder Results Less Than Zero
FM -- AP-120B Multiplier Result Register
FMT -- AP-120B Format Register
FN -- AP-120B Function Register
FO -- AP-120B Adder Exponent Overflow
FU -- AP-120B Adder Exponent Underflow
FZ -- AP-120B Adder Results Equal Zero
HMA -- AP-120B Host Memory Access Register
HIC -- MAP-300 Host Interface Controller
HIM -- MAP-300 Host Interface Module
HIS -- MAP-300 Host Interface Scroll
IOS -- MAP-300 Input/Output Scroll
IQ -- MAP-300 Input Queue
LIFO -- Last In First Out
LITES -- AP-120B Lights Register
M1 -- AP-120B Multiplier Unit Number One
M2 -- AP-120B Multiplier Unit Number Two
MAP -- Macro Array Processor
MAP-300 -- CSPI Macro Array Processor Model 300
MD -- AP-120B Main Data Memory Output Buffer Register
MI -- AP-120B Main Data Memory Input Buffer
MOS -- Metal Oxide Semiconductor
MTBF -- Mean Time Between Failure
MTTR -- Mean Time To Repair
NOP -- No Operation
OQ -- MAP-300 Output Queue
P0-P3 -- MAP-300 Program Counters One Through Three
P -- MAP-300 Multiplier Results Register
PIOC -- AP-120B Programmable Input/Output Channel
PIOP -- AP-120B Programmable Input/Output Processor
R -- MAP-300 Adder Results Register
RAF -- MAP-300 Read Address FIFO
RAMP -- Reliability And Maintenance Program
RFFT -- Real to Complex FFT
RFFTSC -- Real FFT Scale and Format
ROM -- Read-Only Memory
S-Pad -- AP-120B Scratch Pad
SNAP-II -- MAP-300 Systematic Notation For Array Processing Version II
SPFN -- AP-120B S-Pad Output Buffer Register
SRA -- Subroutine Return Address
SWR -- AP-120B Switch Register
SYSFLG -- MAP-300 System Flag Register
TM -- AP-120B Table Memory
TMA -- AP-120B Table Memory Address Register
TMRAM -- AP-120B Random Access Table Memory
VAC -- Volts Alternating Current
VMUL -- Vector Multiply
WAF -- MAP-300 Write Address FIFO
WC -- AP-120B Word Count Register

I. INTRODUCTION
The purpose of this study is to begin evaluation of a
proposed signal-processing test bed similar to the test bed
being installed at the Naval Postgraduate School, Monterey,
California. The basic test bed consists of an analog
subsystem (fig 1), data-processing subsystem (fig 2),
signal-processing subsystem (fig 3) and display subsystem
(fig 4) to be used for general-purpose Naval research.
The analog subsystem of the test bed was designed for
signal reception and conditioning. This is basically
accomplished by a 128-line input into a programmed matrix
switch which emits 32 lines of output. These 32 lines
continue through a program-controlled filter issuing output
from the subsystem.
The signal-processing subsystem receives results from
the analog subsystem via an AM -5*^ A/D converter. This
information can then be stored in an Ampex Megastore unit to
be later processed by one MAP-300 array processor. A
PDP-11/34 computer controls the mass storage device, the
array processor and input functions. Output is directed to
the data-processing subsystem.
The data-processing subsystem receives the processed

















































































































































































































Display devices presently include a Ramtek 9500 Video
Display Unit (color and shades of gray), the Versatec 1600A
printer/plotter and an EPC 2300 Gram writer.
The goal of this study was to examine the major system
components, computers, array processors and major data paths
to determine feasibility for various uses and suggest
possible alternative methods, especially in the real-time
environment. The basic task of the test bed was assumed to
be general with no suggestion of specific tasks although it
was recognized that many uses and data rates may be
utilized.
Chapter II discusses specific computer manufacturers
and computer types. Chapters III, IV and V deal with the
two most popular general-purpose array processors on the
market, discussing the pros and cons of each. Chapter VI





For the test bed evaluation, choosing the proper
computer is important since a varying amount of
computational power is required for each subsystem. Also, a
gamut of functions and uses may be tried, necessitating a
system that can realistically emulate many speed, cost and
memory constraints. A common and popular system affords
better software support while still maintaining a low price.
The ability to rely on system support is an important issue
when considering long term use. A popular system tends to
develop newer, more efficient software packages earlier and
more frequently than do less used systems.
For large array processing applications with many
display devices the ideal situation would be for one
computer to initially load the array processor and then act
as a "whole system" monitor and statistician. It could also
perform the information gathering function while another
computer would act as the output processor for the array
processor and control the display devices. That situation
would be similar to that of a test bed, where flexibility
may be the key and being computer-bound would be highly
undesirable and could unjustly influence the evaluation
of the array processor. An ultimate goal might be to
choose the smallest computer capable of operating the array
processor and associated display devices in the desired
fashion while providing for product expansion. It is
realized that for test and research activities more
computing power may be necessary than would be needed for
normal production activities.
In October 1975, the Computer Family Architecture
Selection Committee was formed to evaluate computer
architecture candidates as a basis for a family of
software-compatible military computers. Ten Army and 17
Navy organizations were represented on the selection
committee [11]. The purpose was to select an architecture
which could be used as a standard, had a proven instruction
set and could be used in advanced technologies.
B. PDP-11 FAMILY
The committee voted that the PDP-11 had the best
architecture for use in the Military Computer Family.
However, it generally contained a small address space and
possible floating point instruction compatibility problems
with existing systems. The IBM System 370 was ranked second
with the Interdata 8/32 ranked third [12]. The Digital
Equipment Corporation PDP-11 series provided a popular
example of both the price and performance excellence in
available computer systems. Their popularity is evidenced by
the shipping of 10,000 PDP-11/04 and 10,000 PDP-11/34
computers as of 1975 and 1976 respectively [28]. Relevant
PDP-11 computers considered were the PDP-11/04, PDP-11/34,
PDP-11/45, PDP-11/55, PDP-11/60, and the PDP-11/70 (listed
from least powerful to most powerful). What follows is a
brief description of each system. Unless otherwise stated,
it will be assumed that the more powerful system will
contain all the features of systems less powerful. The
PDP-11/05 and the LSI-11 series were not considered due to
their not having the advantages of the UNIBUS [28].
1. PDP-11/04
The PDP-11/04 is the smallest computer of the PDP-11
series, containing the entire central processing unit on one
board, permitting room for drastic expansion due to unused
chassis area. The system contains self-test logic to
determine system operability every time the processor has
power applied, the console emulator is used or the bootstrap
routines are initiated. The console emulator allows the
operator to control the system from a terminal without
physically throwing switches or reading lights on the front
panel of the unit. The bootstrap loader automatically
restarts the system from various peripheral devices without
need of physical switch throwing. Memory size varies from
8K bytes to 56K bytes (8 bits = 1 byte) of either MOS
(metal oxide semiconductor) or core type with an average
access time of 500-nanoseconds and system cycle time of





The PDP-11/34 is the next size of the PDP-11 family
and is the lowest architecture to contain a memory
management routine to provide program protection so user
programs cannot access or change system memory space. (In
the 11/04 it is the programmer's responsibility to maintain
and protect this area.) Memory management also allows
virtual memory paging of up to 16 pages ranging in size from
64 bytes to 8K bytes for a total possible memory of 256K
bytes of which 128K is physical. (The highest 4K of address
space on the PDP-11/34/45/55/60/70 is used for registers
that store I/O data or status of individual peripheral
devices. This means that the 11/34 can physically address
124K bytes but virtually address 256K bytes.) The 11/34
allows both core memory and MOS memory to be used
concurrently.
The PDP-11/34 also contains a memory option called
cache memory which is a 2K high speed (300-nanosecond cycle
time) memory used to store a copy of the most recently
selected portions of main memory, affording faster access to
instructions and data. The "hit" ratio, or fraction of
accesses found resident in cache, is approximately 86 percent
for the 11/34. Time is saved by less area to access, therefore
less search time, and shorter, less complicated data
transmission. Since MOS memory is volatile (loses
information when power is removed), the 11/34 has a battery
back-up option which will retain information in the MOS
memory for approximately two hours. The PDP-11/34 can
operate in two modes, Kernel and User. This two mode
concept is important in security since the User mode is
prevented from executing certain instructions that could
cause modification of the Kernel program, halt the computer
or use memory space assigned to the Kernel or other users.
Monitoring and Supervisory routines are executed in the
Kernel mode. The Kernel/User concept is important since if
the Kernel can be made secure, the overall security of the
operating system from accidental harm is much easier to
achieve. Prices range from $11,040 to $53,800 [29].
3. PDP-11/45
The PDP-11/45 system is designed for speed. The
high-speed central processor allows program execution of
three million instructions per second and has either
300-nanosecond bipolar memory or 980-nanosecond core memory
available. MOS memory is also available as an "add-on"
option. Total memory space is the same as the 11/34. There
is an optional floating point processor to handle double
precision arithmetic. The system is especially good for
multiple-task applications; otherwise it is the same as the
11/34. The price is $41,300 [29].

4. PDP-11/55
The PDP-11/55 system improves on the 11/45 by
inserting a dual bus structure to allow intermixing core and
bipolar memory (up to 248K with memory management) to
optimize system performance. Two separate semiconductor
controllers allow simultaneous data transfer for increased
system throughput. Both the 11/45 and 11/55 hardware have
been optimized towards a multiprogramming environment by
installing a third mode, Supervisor, to control system
operation while properly handling multi-user operations
[30]. The price is $50,400 to $80,780 [29].
5. PDP-11/60
The PDP-11/60 system is the interface between the
mid-range mini and the more powerful mini. With the 11/60
we see the first capability to microprogram and four levels
of priority interrupts. The system was also designed with
the engineering trade-off between ease of maintenance and
reliability in mind. A system that is very difficult to
repair after failure may be less useful than an easily
repaired system that fails more often. The availability of the
system is measured as mean time between failure divided by
the quantity mean time between failure plus mean time to
repair [MTBF/(MTBF + MTTR)] [30]. Digital Equipment
Corporation has tried to allow for a more complex
architecture (probable higher failure rate) by providing a
Reliability and Maintenance Program (RAMP) software package
to help locate software and hardware errors, decreasing the
MTTR and thereby increasing availability. The price ranges from
$42,400 to over $200,000.
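The availability trade-off described above can be illustrated with a short calculation. The following is only a sketch: the MTBF and MTTR figures are hypothetical values chosen for illustration and are not taken from the text or from any manufacturer's data.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Availability as defined in the text: MTBF / (MTBF + MTTR)."""
    if mtbf_hours < 0 or mttr_hours < 0 or mtbf_hours + mttr_hours == 0:
        raise ValueError("MTBF and MTTR must be non-negative and not both zero")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures, for illustration only.
# System A fails rarely but is hard to repair.
# System B fails twice as often but is repaired ten times faster.
system_a = availability(mtbf_hours=1000.0, mttr_hours=50.0)
system_b = availability(mtbf_hours=500.0, mttr_hours=5.0)

print(f"System A availability: {system_a:.4f}")
print(f"System B availability: {system_b:.4f}")
```

Under these assumed numbers the system that fails more often is nevertheless available a larger fraction of the time, which is the point of reducing MTTR.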
6. PDP-11/70
The PDP-11/70 is the largest of the PDP-11 series
and gives the power of a large computer at the cost ($63,000
to over $140,000 [29]) of a minicomputer. It was designed to
operate in high-performance systems and is ideally suited
for real-time systems due to the high speed of execution and
the 80-95 percent "hit" ratio of cache memory. Addressing of
over four megabytes of physical memory is theoretically
possible with the 22-bit addresser, although 128K of this 4M
must be used for the UNIBUS referencing. (The UNIBUS can
only address 18 bits, therefore the memory management
routine must convert the 4 megabyte addresses as if they were
virtual locations.) At the present time however only 2M of
physical memory can actually be accommodated by the UNIBUS.
There is the option to use 64 bit floating point numbers in
calculations. With two megabytes of main memory there is
little concern for memory constraints during a multi-task
environment. The option of attaching high speed mass
storage devices to the central processing unit through
dedicated paths is available. The system has eight levels
of priority and a large amount of flexibility in its
programming, making it possible to run several levels of




An Array Processor is a unit capable of performing
floating point operations on large data arrays or data
streams. It usually operates as a peripheral device to a
"host" computer system and best performs the repetitious
reiterative operations requiring a large number of
summations and multiplications typically encountered in
matrix calculations such as correlations and fast Fourier
transforms. This system is special purpose and cannot
"think" for itself since it has no executive functions
except those necessary to control the mathematics required
to perform additions, multiplications and data movement
[18].
With an array processor, large transforms can be
achieved dependent only on memory capacity. These
transforms can be done faster than in the normal CPU since
the array processor performs only one function at a time
(here function is used in the broader sense as in
transposition) and there is no need for the normal overhead
control logic of a general purpose computer [?']. This is
more advantageous than a special purpose computer in that an
array processor can be programmed to execute various array
processing applications and can also act as a peripheral.
Ideally a system would be wanted that could handle any size
arrays including the possibility of very large arrays if the
situation warranted. This is theoretically possible by using
sequential processing and stringing a series of array
processors together, having each perform a specific
operation. That would only be good, however, for
applications not needing results of data processed in step N
to be used in step N-1. Using one array processor,
efficient and sufficient performance on large arrays is
possible due to the special architecture and memory of the
array processor.
Two general purpose array processors presently seem to
dominate the market. These are the CSP Inc. MAP-300 (Macro
Array Processor) and the Floating Point Systems AP-120B.
While the basic function of each is similar, the actual
operation is quite different.
The theoretical advantages and disadvantages of each
processor will be discussed in detail, comparing
architecture, operational characteristics, software support
and programmability. Chapter VII, Conclusions and
Recommendations, will discuss the actual problems
encountered with the installation of the MAP-300 system to



IV. THE AP-120B ARRAY PROCESSOR
The AP-120B Array Processor (fig 5) is manufactured by
Floating Point Systems Inc., Portland, Oregon. It operates
synchronously using a 167-nanosecond cycle time master clock
synchronized with a 50 percent safety margin every cycle for
worst-case temperature and voltage. The system uses
pre-conditioned medium-scale integrated circuitry, large-scale
integrated circuitry and transistor-to-transistor logic.
The AP-120B is capable of operating in temperatures from 10
to 40 degrees centigrade at up to 90 percent relative
humidity. This processor is also able to operate using one
of these various power options: 105/125 VAC at 120 amps,
160/228 VAC at 10 amps or 210/250 VAC at 10 amps with either
50/60 hertz or 50/400 hertz available [7].
The AP-120B employs a technique known as pipeline
processing to increase throughput. Pipeline processing
utilizes a combination of the elements of both sequential
processing and parallel processing. A single basic
processor, like an adder, is logically divided into integral
units that can each perform a specific and separable
function while another unit of the adder simultaneously
performs another function of the addition task. When one
task is completed, it will move on to the next step in the
sequence allowing the just vacated section of the adder to



































increased by insuring that the entire system is always full.
This technique works with both the adder and the multiplier
in the AP-120B. Pipelining is good for vector operations
since vectors are basically independent and a solution of
vector N is not needed before vector N+1 can be started.
However scalar operations are basically sequential
operations and cannot make use of pipelining [1]. By
carefully considering every operation, especially those in
loops, the programmer can squeeze more operations per time
interval by pipelining than would be possible using standard
sequential techniques. The time is generally limited by the
multiplication time [I'J].
The AP-120B instruction word is up to 64-bits long and
can perform a maximum of ten different operations in a
single cycle. As an example, an add, a multiply, a move to
and from each data pad (there are two) and an address
increment or decrement can all be performed in the same
cycle. Any one instruction or combination of the above can
be performed as long as the resource required is not being
used in another operation (some operations are multi-cycle
and "lock-out" the resource until they are complete). It is
the programmer's obligation to insure that all required
resources are available when they are requested or else they
will be lost [7]. As an example, a read from a data pad
takes at least two cycles. If cycle N wanted to read from
Data Pad X and cycle N-1 already initiated a read from Data
Pad X, the entire instruction word for cycle N would be

delayed one cycle waiting for the resource to become
available. This ability to perform more than one basic
operation per cycle allows a theoretical 30 million
instructions per second to be executed. Due to memory size
limitations and algorithms not needing ten operations per
instruction word for sustained periods this rate can never
be fully attained except possibly for short bursts [36].
Since some of these operations are housekeeping functions,
the maximum number of arithmetic operations per second
theoretically possible is twelve million for vectors and
five million for scalars (scalar speed is much lower since
it requires sequential processing and cannot take advantage
of pipelining) [1].
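The arithmetic rates quoted above can be checked against the 167-nanosecond cycle time. The sketch below assumes that the vector figure corresponds to one add and one multiply completing every cycle. The scalar estimate is only one possible reading, not something the text derives: it assumes a dependent multiply completes every three cycles and a dependent add every two cycles, the latencies given for the multiplier and adder elsewhere in this chapter.

```python
CYCLE_NS = 167.0  # master clock period in nanoseconds, from the text


def cycles_per_second(cycle_ns: float = CYCLE_NS) -> float:
    """Number of machine cycles in one second."""
    return 1e9 / cycle_ns


def sustained_rate(ops: int, cycles: int, cycle_ns: float = CYCLE_NS) -> float:
    """Operations per second when `ops` results complete every `cycles` cycles."""
    return ops * cycles_per_second(cycle_ns) / cycles


# Assumption: full pipelines deliver one add and one multiply per cycle.
vector_peak = sustained_rate(ops=2, cycles=1)

# Assumption: unpipelined scalar work, one multiply per 3 cycles plus
# one add per 2 cycles, with the two units running side by side.
scalar_estimate = sustained_rate(1, 3) + sustained_rate(1, 2)

print(f"vector peak     : {vector_peak / 1e6:.2f} million operations/s")
print(f"scalar estimate : {scalar_estimate / 1e6:.2f} million operations/s")
```

Under these assumptions the results round to the twelve million and five million figures quoted in the text.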
The AP-120B uses a 38-bit data word which Floating
Point Systems contends generates better accuracy than the
32-bit word commonly used by other systems [7]. This 38-bit
word consists of a ten-bit biased exponent and 28-bit twos
complement mantissa, thereby allowing numbers in a range of
3.7 * 10 ** -155 to 6.7 * 10 ** 153 to be represented. The
28-bit mantissa allows for extensive calculations without
significant truncation errors, with a maximum relative error of
approximately 7.5 * 10 ** -9 per arithmetic operation or
about 8 decimal digit accuracy. Floating Point Systems Inc.
also employs a technique known as convergent rounding which
they assert forces the roundoff error to approach zero.
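The range and precision figures above follow from the word layout. The sketch below reproduces them under stated assumptions that the text does not spell out: an exponent bias of 512, a normalized mantissa magnitude between 0.5 and 1, and one of the 28 mantissa bits serving as the sign, leaving 27 bits of magnitude.

```python
import math

EXPONENT_BITS = 10   # from the text
MANTISSA_BITS = 28   # from the text, twos complement (sign included)

# Assumptions, not stated in the text:
BIAS = 2 ** (EXPONENT_BITS - 1)      # 512
MAGNITUDE_BITS = MANTISSA_BITS - 1   # 27 bits after the sign

max_exponent = 2 ** EXPONENT_BITS - 1 - BIAS   # +511
min_exponent = -BIAS                           # -512

# Largest value: mantissa just under 1.0 at the largest exponent.
largest = (1.0 - 2.0 ** -MAGNITUDE_BITS) * 2.0 ** max_exponent
# Smallest normalized positive value: mantissa 0.5 at the smallest exponent.
smallest = 0.5 * 2.0 ** min_exponent
# Relative error bound of one unit in the last mantissa bit.
relative_error = 2.0 ** -MAGNITUDE_BITS
decimal_digits = MAGNITUDE_BITS * math.log10(2.0)

print(f"largest        ~ {largest:.2e}")
print(f"smallest       ~ {smallest:.2e}")
print(f"relative error ~ {relative_error:.2e}")
print(f"decimal digits ~ {decimal_digits:.1f}")
```

With these assumptions the computed values agree with the range, the relative error and the roughly 8 decimal digits quoted in the text.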

The AP-120B does not contain the normal bus structure
of other array processors but instead uses dedicated 38-bit
data paths for the movement of data. There are two paths
available to the adder (one for each input register), two
paths to the multiplier and three paths available to the
memory and data pads. This allows seven independent data
words to be transferred each cycle. (This, coupled with an
add, multiply and address increment/decrement, equals the
ten instructions per cycle possible.) These separate data
paths eliminate the need for a handshaking arrangement
between logic elements, although handshaking is required
when the AP-120B communicates with the host [7, 36].
The price of a unit which includes the AP-120B array
processor, interface with the PDP-11, 16K words of
533-nanosecond interleaved MOS memory, expansion chassis,
installation, 256 words of program source memory, 512 words
of Read Only Memory (ROM) table memory, a linker, loader,
simulator, debugger, algorithm library and executive is
$50,270.00 [10]. This includes a 90-day warranty with a
servicing agreement available at extra cost. The field test
mean time between failure is 3500 hours [3].
The following section explains the hardware of the
AP-120B in detail.




The Multiplier unit (fig 6) consists of two 38-bit
multiplier registers M1 and M2, three multiplication stages
and a 38-bit register to store the result (FM). To receive
a result after initiating the multiply, three cycles or
500-nanoseconds are required. Inputs to the M1 register can
come from Data Pad X (DPX), Data Pad Y (DPY), Table Memory
(TM) or the Multiplier result register (FM). Inputs to M2
are either from DPX, DPY, Adder result register (FA) or Main
Data Memory Output Buffer (MD). Results from the multiplier
can go to M1, the Adder input register (A1), Main Data
Memory input buffer (MI), DPX or DPY.
Stage one of the multiplier starts the product of
fractions by beginning the multiplication of the two 28-bit
mantissas. This multiplication is completed in stage two
resulting in a 56-bit mantissa. Stage three adds the
exponents as it normalizes and convergently rounds the
56-bit mantissa to 28 bits. This stage also detects
exponent overflow/underflow and if either exists will set the
FO or FU bit in the status register. The status register
can be read by the program to determine if conditions are
met from an arithmetic operation, to specify errors, or to
be used in branching logic. These bits are available for
testing one cycle after completion of the multiply.
This three stage multiply allows pipelining to be





























permits a multiplication result to be present at the result
register every 167-nanoseconds once the pipeline becomes
full (three cycles required to fill). Note that
500-nanoseconds are required if the result of the
multiplication is required in the next multiplication as is
the case with scalar arithmetic.
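The difference between pipelined and dependent multiplies can be made concrete by counting cycles. The following is a simplified model, assuming only the three-stage latency and 167-nanosecond cycle time given in the text; it ignores operand fetches and any other resource conflicts, and the function names are illustrative.

```python
CYCLE_NS = 167        # cycle time in nanoseconds, from the text
MULTIPLY_STAGES = 3   # multiplier pipeline depth, from the text


def pipelined_cycles(n_products: int, stages: int = MULTIPLY_STAGES) -> int:
    """Cycles for n independent multiplies: fill the pipe, then one result per cycle."""
    if n_products <= 0:
        return 0
    return stages + (n_products - 1)


def dependent_cycles(n_products: int, stages: int = MULTIPLY_STAGES) -> int:
    """Cycles when each multiply needs the previous result (scalar case)."""
    return stages * max(n_products, 0)


n = 100
vector_ns = pipelined_cycles(n) * CYCLE_NS
scalar_ns = dependent_cycles(n) * CYCLE_NS
print(f"{n} independent products: {pipelined_cycles(n)} cycles, {vector_ns} ns")
print(f"{n} dependent products  : {dependent_cycles(n)} cycles, {scalar_ns} ns")
```

For long vectors the pipelined cost approaches one cycle per product, while the dependent chain stays at three cycles per product.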
A readily apparent problem with the multiplier is
that M1 receives inputs from both the Table Memory (TM) and
the Multiplier Result register (FM) while M2 receives inputs
from neither. Therefore, if a constant from TM were to be
multiplied by the result of a just-completed multiplication,
it would require an extra two cycles since either FM or TM
would first have to be written into DPX or DPY and then
written into M2. This disadvantage is overshadowed by the
fact that even though dedicated data lines cause the above
problem, in most cases they present a distinct advantage by
allowing multiple data transfers in any given cycle [32].
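The routing restriction just described can be expressed as a small lookup. This sketch encodes only the multiplier input sources listed in the text; the function names are hypothetical, and the two-cycle detour through a data pad is taken directly from the paragraph above rather than from a detailed timing model.

```python
# Sources that can load each multiplier input register, as listed in the text.
INPUT_SOURCES = {
    "M1": {"DPX", "DPY", "TM", "FM"},
    "M2": {"DPX", "DPY", "FA", "MD"},
}


def has_direct_pairing(src_a: str, src_b: str) -> bool:
    """True if one operand can feed M1 while the other feeds M2 directly."""
    m1, m2 = INPUT_SOURCES["M1"], INPUT_SOURCES["M2"]
    return (src_a in m1 and src_b in m2) or (src_a in m2 and src_b in m1)


def extra_cycles(src_a: str, src_b: str, detour_cycles: int = 2) -> int:
    """Extra cycles needed when one operand must detour through DPX or DPY."""
    return 0 if has_direct_pairing(src_a, src_b) else detour_cycles


for pair in [("TM", "FM"), ("TM", "DPX"), ("FM", "FA")]:
    print(pair, "->", extra_cycles(*pair), "extra cycles")
```

The TM and FM pair is the only one of the three that incurs the detour, because both sources reach M1 and neither reaches M2.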
2. Adder
The operation of the adder (fig 7) is similar to
that of the multiplier and consists of two 38-bit adder
registers A1 and A2, two adder stages and an adder result
register (FA). The addition of two numbers requires
333-nanoseconds (two cycles). Inputs to A1 are from Table
Memory (TM), Multiplier Output register (FM), Data Pad X
(DPX), Data Pad Y (DPY) and the ZERO constant while inputs
to A2 are from the Adder Output register (FA), Data Pad X





























(DPX), Data Pad Y (DPY) and the ZERO constant. The results
from the adder can go to M2, A2, DPX, DPY or MI. Stage one
aligns the mantissas by shifting the smaller value, based on
the value of the exponent, to the right until both exponents
are equal, then adding or subtracting these mantissas. Stage
two normalizes and convergently rounds the mantissa and
adjusts the exponent. This stage also sets four bits in the
status register to denote results equal zero (FZ), results
less than zero (FL), exponent overflow (FO) or exponent
underflow (FU). These bits may be tested by other program
instructions one cycle after the addition is completed.
(Note that FO and FU are the same bits that are set by the
multiplier on exponent overflow or underflow.)
As with the multiplier, the two-stage adder allows
pipelining and a result can be generated every
167-nanoseconds. The adder does not have the disadvantage
of inputting Table Memory (TM) values at the same register
as FA but does have the multiplier result FM at the same
adder input register (A1) as TM values. There is therefore
not the ability to immediately add an FM value with a TM
value without first going through DPX or DPY [32].
For both the adder and the multiplier there would be
a two cycle time loss if FM was just loaded with a new value
from the multiplier when it was needed for the
addition/multiplication process (time N) and only a one
cycle loss if it was ready the cycle before needed (time N -
1 cycle). Otherwise there would be no loss of time since
steps could be taken to move the value in FM through the DPX
or DPY which would make it available at the
adder/multiplier input register when necessary.
(Presupposing of course that the data paths to or from
memory were not needed for other uses.)
3. S-Pad
The S-Pad (fig 8) (pseudonym for scratch pad)
consists of the S-Pad Memory, S-Pad Arithmetic Logical Unit
(ALU), Data Pad Address Register (DPA), Memory Address
Register (MA) and the Table Memory Address Register (TMA).
The sole purpose of the S-Pad is to compute addresses for
Table Memory, Main Data Memory and the Data Pads. The S-Pad
can operate concurrently with the memories, Multiplier and
Adder [7].
The S-Pad Memory is made up of 16 registers, each 16
bits wide, giving the ability to compute an effective address
of 64K. These registers may be assigned label names like
"pointer" by the use of pseudo-operators, to make programs
more readable, or may be directly addressed by number.
The S-Pad Arithmetic Logical Unit forms the operand
addresses and also automatically loop counts, shifts the
addresses left once (multiply by two), shifts the addresses
right once (divide by two) or right twice (divide by












[Fig 8: S-Pad, showing the Data Pad Address (DPA) Register,
Memory Address (MA) Register and Table Memory Address (TMA)
Register, each with a ±1 input]





reversal, to swap bits while accessing data in a scrambled
order after a Fast Fourier Transform. The results of the
S-Pad arithmetic logic unit, called SPFN, set bits in the
status register to indicate whether the results were less
than zero (N), zero (Z) or if there was a carry bit (C).
These bits are available for testing by program instructions
at the next instruction cycle.
TMA, DPA and MA store the computed address from the
S-Pad ALU. The contents of each can either be changed by
the value of SPFN or incremented by one. One cycle is
required to compute the address and load it into the proper
register [32].
4. Table Memory
Table Memory is a 512-word, 38 bits per word bipolar
read-only memory used to store important and much-used
constants. This memory has a 167-nanosecond cycle time but
requires two cycles to get the value from memory to the
output register TM [7]. Values in TM are available for use
by DPX, DPY, MD, f^U and A1. These values may be requested
every machine cycle, and a request is initiated by changing
the contents of the Table Memory Address Register (TMA) in
the S-Pad. The programmer must control the timing necessary
to ensure the correct constant is at TM when needed, due to
the two-cycle access time requirement.

In the Fast Fourier Transform mode, the address in
TMA is interpreted by the hardware to be the angle which
points to the appropriate root of unity for a particular
step in the FFT algorithm. Therefore, in a single quadrant
of cosines, a full table can be represented [32].
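The idea that one quadrant of cosines is enough to represent
the full table of roots of unity can be illustrated with a
short sketch. This is not the AP-120B hardware logic; the
transform length N, the table layout and the function names
are assumptions chosen only for illustration.

```python
import cmath
import math

N = 64        # assumed transform length (illustrative only)
Q = N // 4    # number of angle steps in one quadrant

# One quadrant of cosines: cos(2*pi*k/N) for k = 0 .. N/4.
QUARTER = [math.cos(2 * math.pi * k / N) for k in range(Q + 1)]


def cos_lookup(k):
    """Return cos(2*pi*k/N) using only the one-quadrant table."""
    k %= N
    if k <= Q:                      # first quadrant: direct lookup
        return QUARTER[k]
    if k <= 2 * Q:                  # second quadrant: cos(t) = -cos(pi - t)
        return -QUARTER[2 * Q - k]
    if k <= 3 * Q:                  # third quadrant: cos(t) = -cos(t - pi)
        return -QUARTER[k - 2 * Q]
    return QUARTER[N - k]           # fourth quadrant: cos(t) = cos(2*pi - t)


def root_of_unity(k):
    """Return exp(-2j*pi*k/N); sin(t) is read as cos(t - pi/2)."""
    return complex(cos_lookup(k), -cos_lookup(k - Q))


def max_table_error():
    """Largest deviation from directly computed roots of unity."""
    return max(abs(root_of_unity(k) - cmath.exp(-2j * math.pi * k / N))
               for k in range(N))
```

Under these assumptions the table holds N/4 + 1 values instead
of N complex values, at the price of a little index arithmetic
on each access.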
There is an optional Random Access Table Memory
(TMRAM) containing 1K of random access memory [8]. This
allows loading of special constants necessary for special
applications without the overhead of computing them every
time or using valuable data pad space to store them. The
price of this option is approximately $1850 [7].
5. Data Pad X and Y
The Data Pads (fig 9) consist of sixty-four 38-bit
accumulators, four of which are available from the 16
addressable each instruction cycle [7]. These 64
accumulators are divided into two 32-register blocks called
Data Pad X (DPX) and Data Pad Y (DPY). From each Data Pad,
one register can be read and another written during the same
cycle.
The restrictions are that the same register cannot
be read and written simultaneously and that a read and write
operation during the same cycle must occur on registers
whose addresses differ by no more than 7, due to
base-address-plus-offset addressing. (However, a register in
DPX may be written at the same time as a register in DPY even if

[Fig 9: Data Pad, with inputs INBS, VALUE, DPX, DPY, MD, SPFN
and TM, outputs M1, M2, A1, A2 and DPBS, and separate Write
Index and Read Index]





they both have the same address.) In the S-Pad, the Data Pad
Address Register (DPA) supplies the base address to be used
by the read/write instruction to locate the proper Data Pad
register. The DPA supplies both DPX and DPY concurrently.
The instruction uses this base address and an offset in the
form DPX(offset) or DPY(offset) and can address -4 to +3
offset from the base in each Data Pad to find the effective
address. Therefore if the DPA contains decimal value 20,
registers 16, 17, 18, 19, 20, 21, 22 and 23 can be addressed
in each data pad. The register addresses of both Data Pads
range from 0 to 37 (base 8) and are arranged in a circular
addressing scheme. Therefore 37 (base 8) + 1 = 0, and the
programmer need not be concerned about writing into a
non-existent location but must only be concerned with
overwriting previously written information.
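The base-plus-offset scheme with circular wrap-around described
above can be sketched as follows. The function and constant
names are hypothetical and are not AP-120B mnemonics; only the
offset range, the register count and the wrap-around rule come
from the text.

```python
DATA_PAD_REGISTERS = 32   # registers 0 .. 37 (base 8) in each Data Pad


def effective_address(dpa, offset):
    """Return the Data Pad register selected by base DPA plus offset.

    The offset must lie in -4 .. +3, and addresses wrap around
    circularly so that 37 (base 8) + 1 = 0.
    """
    if not -4 <= offset <= 3:
        raise ValueError("offset must be in the range -4 .. +3")
    return (dpa + offset) % DATA_PAD_REGISTERS


def addressable_window(dpa):
    """Return the eight registers reachable from the given base."""
    return [effective_address(dpa, offset) for offset in range(-4, 4)]
```

With the DPA set to decimal 20, addressable_window(20) gives
registers 16 through 23, matching the example in the text.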
DPX and DPY receive information from MD, FA, FM,
DPX, DPY, the output of the S-Pad arithmetic logical unit
(SPFN) and VALUE (an immediate value used by immediate
instructions arriving from the command buffer). DPX and DPY
supply values to M1, M2, A1, A2, DPX, DPY and MI [32].
6. Main Data Memory
Main Data Memory (fig 10) contains 64K 38-bit words
used primarily to store input data which will be operated
on by the program. This memory is available in two forms,
167-nanosecond hardware interleaved MOS with -^K word










































word segments. Both memories have a two-bit parity option
available [7] and a one-megaword page selection option [9].
With memory limited to 64K, the largest complex-to-complex
Fast Fourier Transform possible is 32K, which may not be
acceptable in some applications.
Main Data Memory receives input information into its
Memory Input Buffer (MI) from FA, FM, MD, DPX, DPY, TM, SPFN
and VALUE. It can output via the Memory Data Buffer to DPX,
DPY, A2 and M2.
Memory read or write may be requested every other
cycle by changing the value of the Memory Address Register
(MA) in the S-Pad. This yields an effective memory cycle
time of either 333 nanoseconds (167 nanoseconds plus one
machine cycle) or 500 nanoseconds (333 plus one machine
cycle), dependent on the type of memory installed [32]. By
special programming techniques and proper chip procurement,
this overhead can be reduced to the advertised memory speed,
with the restriction that memory accesses alternate between
chips or alternate between even and odd boundaries. If
effective speed is essential, it becomes the programmer's
responsibility to ensure data location is known to the
program at all times [8]. A read requires three cycles for
information to be present in MD if using 333-nanosecond
memory and two cycles if using 167-nanosecond memory. This
information will be available until a new value overwrites
it. If a write or read is initiated before two memory
cycles have elapsed (unless the special chips and techniques
above are used), the request will not be lost but the memory
will automatically provide a hardware lockout (wait until
memory is available for read/write) [14].
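The effective cycle times quoted above follow from adding one
machine cycle to the advertised memory speed. A minimal sketch
of that arithmetic, assuming a machine cycle of one sixth of a
microsecond (about 167 nanoseconds) and hypothetical names:

```python
MACHINE_CYCLE_NS = 1000 / 6   # about 167 ns; assumed exact value


def effective_cycle_ns(advertised_cycles):
    """Effective memory cycle time in nanoseconds.

    advertised_cycles is the advertised memory speed in machine
    cycles: 1 for 167-nanosecond memory, 2 for 333-nanosecond
    memory. One extra machine cycle is added because a request
    may only be issued every other cycle.
    """
    return (advertised_cycles + 1) * MACHINE_CYCLE_NS


def read_latency_cycles(advertised_cycles):
    """Machine cycles before requested data is present in MD."""
    return advertised_cycles + 1
```

Under these assumptions effective_cycle_ns(1) is about 333
nanoseconds and effective_cycle_ns(2) is 500 nanoseconds, and
the read latencies come out as two and three cycles.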
The value in the Memory Address Register (MA) points
to the desired location in main data memory. MA may be
either set to a specific value or incremented/decremented by
one in the S-Pad. Since there is a slight time lag between
when a value is requested to be placed in MD and when it
actually gets there, the programmer must always be aware of
what values are in MI and MD, to allow the proper "set up"
time to get these values to either the Adder, Multiplier or
correct DPX, DPY or MI address [32].
7. Program Source Module
The Program Source Module (fig 11) consists of the
Program Source Memory (PS), Program Source Address Register
(PSA), Control Buffer (CB) and the Subroutine Return Stack
(SRS) [12].
The PS is a high-speed, 50-nanosecond, bipolar
memory addressable to 7K 64-bit words and is available in
256-word increments [4]. The PSA contains the address of
the next instruction and is incremented by one after
instruction execution unless modified by either the Control
Buffer (new address as a result of a branch or jump
instruction) or the Subroutine Return Stack. The SRS saves
the current PSA when a Jump Subroutine instruction is
performed and increments the value of the Subroutine Return
Address (SRA). When a Return instruction is performed, the
SRA is decremented by one, making nested subroutines
possible. The Control Buffer decodes and executes the
instruction as the CPU would in a general purpose computer
[14].
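The interplay of PSA, SRS and SRA on subroutine call and
return can be sketched as a small model. The stack depth, the
class name and the method names are assumptions made for this
illustration; here the PSA is modeled as the address of the
instruction being executed, so the saved return address is
PSA + 1.

```python
class ProgramSequencer:
    """Toy model of the PSA with a Subroutine Return Stack."""

    def __init__(self, stack_depth=16):   # depth is an assumption
        self.psa = 0                      # Program Source Address
        self.srs = [0] * stack_depth      # Subroutine Return Stack
        self.sra = 0                      # Subroutine Return Address (stack index)

    def step(self):
        """Ordinary instruction: advance to the next address."""
        self.psa += 1

    def jump_subroutine(self, target):
        """Save the return address, bump SRA and branch."""
        if self.sra == len(self.srs):
            raise OverflowError("subroutine nesting too deep")
        self.srs[self.sra] = self.psa + 1
        self.sra += 1
        self.psa = target

    def return_from_subroutine(self):
        """Decrement SRA and resume at the saved address."""
        if self.sra == 0:
            raise IndexError("return without a matching call")
        self.sra -= 1
        self.psa = self.srs[self.sra]
```

Because each call pushes and each return pops, subroutines may
be nested up to the assumed stack depth.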
8. Interface with PDP-11 Series
The interface unit with the PDP-11 series contains
two major segments, the Front Panel and the DMA Controller
and Formatter. The Front Panel contains three registers and
is used mainly as a debugging aid, while the DMA Controller
and Formatter contains five registers and is used for
program and data entry or removal.
a. Front Panel
The Front Panel (fig 12) consists of three
16-bit registers, the Switch Register (SWR), the Lights
Register (LITES) and the Function Register (FN). The Front
Panel is used for bootstrapping and debugging of user
programs. These three registers can be examined by the host
and take the place of the toggle switches normally on the
front panel of the console [32]. With the use of the
Debugger program, these registers can effectively breakpoint
the AP-120B at a selected program location or data address.





















through its execution sequence [6,7].
The Switch Register is written by the host
computer but can be read by both the AP-120B and the host.
The SWR is used to enter data and addresses into the
AP-120B, primarily for debugging. Its contents can be fed
to the DPX, DPY, MD or the S-Pad.
The Lights Register simulates the front panel
lights of the console. This register is set by the AP-120B
and can only be read by the host. LITES is used to display
selected contents of the internal registers of the AP-120B.
The final register is the Function Register,
which provides front panel toggle-like controls to the
AP-120B. The FN can stop, start, step or reset the AP-120B.
It can also continue operation resuming at the current value
of the PSA, examine a register, examine a portion of a
register or memory contents of a selected area, deposit the
contents of SWR into a selected register or memory location
and then breakpoint according to the values of TMA, MA or
DPA. The FN can also increment the TMA, MA or DPA after
completion of an instruction to facilitate stepping through
memory locations [32].
The Front Panel is advertised to be invaluable





The DMA Control is the second half of the
interface and consists of three 16-bit registers, one 18-bit
register and one 38-bit register. DMA Control is
responsible for transferring programs and data between the
AP-120B and the host computer. This section of the Front
Panel will also do format conversion "on the fly", which
should effectively alleviate time lags [32]. Four types of
data transfer combinations are possible: host DMA to AP-120B
DMA, host DMA to AP-120B Programmed I/O, host Programmed I/O
to AP-120B Programmed I/O and host Programmed I/O to AP-120B
DMA, with a maximum theoretical burst transfer rate of three
megawords per second for all types of transfers [7].
The Format Register (FMT) is a 38-bit
double-buffered register used to perform all transfers of
floating-point numbers from the host to the AP-120B [12].
The FMT will convert 16-bit integer numbers to 38-bit
unnormalized floating-point numbers, 32-bit PDP-11 integers
to 32-bit AP-120B integers and 32-bit floating-point numbers
to 38-bit floating-point numbers. All these operations are
in reverse for the AP-120B to host direction [7]. Since the
PDP-11 is a 16-bit computer, it will access the Formatter in
16-bit half-words to be compatible. It must be noted that
for some applications, such as difference filtering, there
is a possibility of extreme accuracy loss due to 16-bit
integer to 38-bit floating-point conversion. The synthetic
precision generated by such a conversion can cause certain
coefficient combinations, such as +1 and -1, when
multiplied by mirrored arrays, to result in errors when
reconverted to 16-bit format. The programmer must be aware
of these possible losses and test for them before faith is
placed in the result.
The AP Direct Memory Address Register (APDMA)
points to consecutive locations in AP-120B Main Data Memory
during DMA transfers. This register can be automatically
incremented/decremented, allowing blocks of information to be
read into consecutive locations with minimal overhead.
The Host Memory Access Register (HMA) operates
similarly to the APDMA except it points to consecutive memory
locations in the host memory. In the PDP-11 this memory is
256K, so the HMA is 18 bits to allow for this addressing
capability.
The Word Count Register (WC) counts the number
of words transferred during a DMA operation. This register
must be preset to the required number of words and will stop
DMA transfer when the prescribed number of words is
transferred.
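How the APDMA, HMA and WC registers cooperate during a block
transfer can be sketched as a simple loop. This is only a
model of the register behavior described above, with
hypothetical function and parameter names; format conversion
and bus timing are ignored.

```python
def dma_host_to_ap(host_memory, ap_memory, hma, apdma, wc, ap_step=1):
    """Copy wc words from host memory into AP Main Data Memory.

    hma     -- starting host address (Host Memory Access Register)
    apdma   -- starting AP address (AP Direct Memory Address Register)
    wc      -- preset word count (Word Count Register)
    ap_step -- +1 or -1 for automatic increment or decrement
    Returns the final (hma, apdma, wc) register values.
    """
    while wc > 0:
        ap_memory[apdma] = host_memory[hma]
        hma += 1            # next consecutive host location
        apdma += ap_step    # next consecutive AP location
        wc -= 1             # transfer stops when the count is exhausted
    return hma, apdma, wc
```

For example, with hma = 2, apdma = 4 and wc = 3, three
consecutive host words land in AP locations 4, 5 and 6, and the
word count ends at zero.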
The final and most important register in the
interface is the Control Register (CTL). It controls the
direction and mode of transfer and the type of format
conversion, and provides certain status bits pertaining to
the transfer. This register, with the use of HMA and/or
APDMA, allows the host to execute other programs and be
interrupted when the DMA is completed. The CTL also allows
either the host or AP-120B to control the data transfer.
(The AP-120B must control transfer from a loaded program
since the executive alone is not powerful enough to control
data transfer [12].)
B. SOFTWARE
Various software support, executive and development
programs are available with the AP-120B.
1. Executive and Associated Routines
The AP-120B provides executive and housekeeping
routines to increase the effectiveness of operation and
enhance program development.
a. APMATH
APMATH is a series of approximately 150 [8]
library functions, vector and matrix subroutines and signal
processing algorithms [7] written in AP-120B assembly
language [11]. These routines are callable from either host
Fortran, host Assembly or AP assembly languages [36] with
the use of the AP Executive. These programs can reduce the
run time and decrease programming time by presenting some of
the most common array processing functions in
subroutine-callable form. These routines include data
transfer and control, basic vector arithmetic, matrix
operations and Fast Fourier Transform, all of which are able
to work with both real and complex data.
b. APEX
APEX is the AP Executive routine, which is
resident in the host computer and allows the AP-120B to
communicate with the host computer via Fortran or host
Assembly language calls. APEX decodes subroutine calls from
the host computer [36] and directs the AP-120B to perform
the specified action. Both APMATH routines and user-written
routines may be called by the AP-120B from the host computer
[32].
c. APAL
The AP Cross Assembler (APAL) is a two-pass
assembler written in Fortran IV which requires d.^\s memory
in the host computer to operate. APAL assembles source text
written in AP Assembly language into object code
understandable by the AP-120B. The assembler also
optionally produces an AP Assembly listing containing errors
in both passes, location counters, assembled data, the
symbol table and source statements.
APAL recognizes signed constants ranging from
-32768 to 32767 and unsigned constants from 0 to 65535, both
of which may be represented in binary, octal (default base),
decimal or hexadecimal. It allows free formatting but
recognizes the general source statement form: optional
label followed by a colon, multiple op codes separated by
semicolons (one to ten operations which total no more than
64 bits; sixty-four bits is the maximum dictated by seven
data transfers, one add, one multiply and one address
increment/decrement), and an optional comment statement
denoted with a leading double quote (").
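The general source statement form described above (label,
colon, op codes separated by semicolons, comment after a
double quote) can be illustrated with a small splitter. This
is not APAL itself; the function names are hypothetical and the
op codes in the usage example (OPA, OPB, OPC) are placeholders
rather than real AP-120B mnemonics.

```python
def split_statement(line):
    """Split one source line into (label, op_codes, comment).

    Assumed form: [label:] op [; op ...] ["comment]
    Raises ValueError unless the line holds one to ten op codes.
    """
    comment = None
    if '"' in line:                       # comment starts at the double quote
        line, comment = line.split('"', 1)
        comment = comment.strip()
    label = None
    if ":" in line:                       # optional label ends with a colon
        label, line = line.split(":", 1)
        label = label.strip()
    ops = [op.strip() for op in line.split(";") if op.strip()]
    if not 1 <= len(ops) <= 10:
        raise ValueError("a statement holds one to ten operations")
    return label, ops, comment


def constant_in_range(value):
    """True if value fits the signed or unsigned constant ranges."""
    return -32768 <= value <= 65535
```

For instance, split_statement('LOOP: OPA; OPB; OPC "next
element') separates the label LOOP, the three placeholder op
codes and the trailing comment.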
Once the modules are written, APAL can be
operated dynamically, allowing the programmer to build the
program at assembly time. APAL will question the operator
about the source file name, destination file name, etc., and
subsequently will prompt him concerning missing items. If
there are errors in the module, these can be changed
dynamically without reassembling the entire module [4].
d. APLINK
The AP linker (APLINK) is written in Fortran IV
and requires approximately 10K of memory in the host
computer. APLINK performs functions similar to those of
any other link editor, which include relocation and assigning
absolute addresses to the object module, correlation of
global entry symbols in one module with external symbols in
the other modules, loading the module from the program
library and production of the final load module. These
functions are performed interactively with dialogue between
APLINK and the user at the console.
Besides linking the modules, APLINK returns to
the console any symbols in a file which are undefined, will
output the symbol table and locations when requested and
returns the high address and starting address to be used
with the Debugger routine [5].
e. APSIM
APSIM is the AP-120B simulator and is designed
to be used for developing programs when use of the AP-120B
is impractical or impossible due to production schedules.
APSIM emulates all hardware and timing characteristics of
the AP-120B as well as performing the mathematical routines
as closely as possible to the way the AP-120B would perform




APDEBUG is the AP-120B interactive debugger
program to be used for dynamic debugging of AP-120B
applications programs at run time. Changes can be made when
the problem is identified, and APDEBUG will call the APLINK
and APAL routines to insert the new object module, then
continue with program development. APDEBUG can work in




There are three software modules available to
completely test the AP-120B hardware operations.
APTEST is the AP-120B path tester. This
software exercises the panel, DMA interface, internal
registers and memory to check for proper operation.
APPATH tests the internal data paths of the
AP-120B and returns diagnostics upon finding any errors.
Forward/Inverse Fast Fourier Transform Test
(FIFFT) verifies correct operation of the AP-120B's
arithmetic units by performing Fast Fourier Transforms and
their inverses, then comparing results with standard answers
[32].
These packages can be used to help ensure proper
operation of the AP-120B before development or actual
operation and also help with the hardware fault locating
effort during system maintenance.
2. Programming Language
The Math Library of AP functions can be called by
the host Assembly Language, Fortran or the AP Assembly
Language [36]. However, to write a custom library function,
AP Assembly Language must be used, and the cross-assembler
will translate it into an executable routine.
Investigating the programming language is not
important here except to say that it is similar in
characteristics to other assembly languages. There are
sufficient commands available to write a program to properly
control AP-120B execution in an efficient manner. Bit
testing, conditional branching, flag setting and arithmetic
instructions all are part of the instruction repertoire,
which allows varied applications programs to be written.
3. Page Select Option
The AP-120B can alternatively be equipped with a
Page Select Option. This provides the ability to address
one megaword of main memory in the AP-120B by using host
main memory and virtual memory techniques. Each page can be
up to 64K words long (full Main Data Memory size, but each
page must be at least 8K) and 16 pages are available. The
Page Select Option increases the ability of the AP-120B to
work on larger transforms, but due to paging overheads it
may not increase the throughput rate, because of increased
host involvement.
This option modifies the AP Direct Memory Address
Register (APDMA) located in the DMA Control section of the
interface by extending it from 16 to 20 bits, therefore
giving 2**20 addressing capability (approximately one
megaword). This virtual memory ability is called the AP
Memory Address Extension (APMAE), and new addresses can only
be loaded by the host. Since the host will control all paging
operations, the AP-120B commands will not change inasmuch as
it will only recognize 64K word locations [9].
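The relation between 16 pages of 64K words and a 20-bit
address can be checked with a short sketch. Placing the page
number in the upper four bits of the extended address is an
assumption made for illustration, as are the names used.

```python
PAGE_WORDS = 64 * 1024   # maximum page length, 64K words
PAGE_COUNT = 16          # pages available with the option


def extended_address(page, offset):
    """Combine a page number and a 16-bit offset into 20 bits.

    Assumes the page number occupies the upper four bits.
    """
    if not 0 <= page < PAGE_COUNT:
        raise ValueError("page must be in 0 .. 15")
    if not 0 <= offset < PAGE_WORDS:
        raise ValueError("offset must fit in 16 bits")
    return (page << 16) | offset


def split_address(address):
    """Return (page, offset) for a 20-bit extended address."""
    return address >> 16, address & 0xFFFF
```

Sixteen pages of 64K words give 2**20 = 1,048,576 locations,
which is the "approximately one megaword" quoted above, while
the 16-bit offset is all the AP-120B itself needs to see.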

4. Programmable I/O Processor
The Programmable I/O Processor (PIOP) is a
micro-codable micro-processor which acts like a high speed
channel program controlling an input/output port. It is
capable of transferring data at a six megahertz burst rate
or at a three megahertz sustained operation rate (assuming
167 nanosecond Main Data Memory). The PIOP can be used with
up to eight external devices (like A/D converters or mass
storage devices), thereby acting as an I/O bus controller.
The PIOP interfaces directly with the DMA Controller
in the interface unit. It has a 38-bit instruction word and a
20-bit arithmetic logical unit and is capable of addressing
up to one megaword of memory, making it compatible with the
Page Select Option. Communication with the AP-120B is
accomplished via one of eight flags and four interrupts.
The micro code supports subroutines and has the logic to
perform jumps within its own code.
The PIOP must handle all handshaking and timing
considerations with both the external devices and the host
program to ensure data integrity. This can be complicated at
times, so a Programmable I/O Channel (PIOC) is also available,
which decreases flexibility but eases the programming burden
[33].
Neither the PIOP nor PIOC provides a method of
connecting two AP-120B's together in series without host
intervention, which tends to limit some of the possible
applications of the AP-120B.
C. PROGRAMMING, OPERATION AND EXECUTION
The AP-120B can utilize the parallel operation
capability of the adder, multiplier and data transfers to
increase execution speed of the program and throughput on
large data arrays. These parallel operations must be
controlled so that optimum execution speed can be realized
without causing interlock or lockout. Lockout could
eventually lead to a program stoppage [1]. Since most
scientific data can best be structured into an array form,
the array processor is able to work on it quickly and
efficiently in its natural state, where a general purpose
computer must, in most cases, restructure it [36].
Before the AP-120B can work on data, the data must first
be transferred from its memory locations in the host to Main
Data Memory in the array processor (or moved to Main Data
Memory from an external device via the PIOP. That situation
will not be dealt with here since the PIOP is programmable
and therefore path and data options associated with it are
many.). The data is transferred via the interface with the
use of the APPUT(HOST,AP,N,TYPE) command (Put Data into the
AP-120B). As with arguments of other AP-120B CALL
statements, HOST, AP, N and TYPE need not be explicitly
stated but can be expressions, integers or variables.

The host and AP-120B must be synchronized in their
operations so that computations cannot go on while data is
still being transferred to memory. APWD (Wait on Data) causes
the host to wait until data transfer is completed before it
resumes executing the program. APWR (Wait on Running)
causes the host to wait until the AP-120B is completed with
one command before another is sent over. APWAIT is a
combination of APWD and APWR. One difficulty encountered
using these commands is that the host must monitor the
progress of the execution if polling is used to determine
APWD, APWR or APWAIT completion, or the AP-120B must wait if
priority interrupts are used, which increases the time
necessary to complete the program.
Some of the overhead of the host can be eliminated by
not using the AP Wait on Running (APWR), AP Wait on Data
(APWD) or AP Wait (APWAIT) commands. This technique may
speed up program execution but should only be used when it
is absolutely necessary and when there is no chance that the
results will be processed before they are actually present
in the AP-120B Main Data Memory. Floating Point Systems
suggests that the program first be written and executed with
the APWR, APWD and APWAIT commands present and the results
obtained. Then, removing a few of these instructions at a
time, the results can be checked to see if they match the
original results. This only works for specific applications
and does not conform to modern programming practices. It is




fluctuations due to temperature variations.
When processing is complete, the data can be transferred
back to the host via the APGET() command, which operates in
the same manner as the APPUT.
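The hazard of omitting the wait commands can be illustrated
with a toy model in which a transfer only becomes visible once
the host waits on it. This is not the vendor's APEX interface:
the class, its methods and their argument lists are
hypothetical stand-ins loosely named after APPUT, APWD and
APGET, and real transfers complete on their own schedule
rather than at the wait call.

```python
class ToyArrayProcessor:
    """Toy model of host-to-AP transfers with explicit waits."""

    def __init__(self):
        self.main_data = {}     # AP Main Data Memory (address -> value)
        self._in_flight = []    # transfers started but not yet complete

    def apput(self, host_values, ap_address):
        """Start a transfer; data is not yet guaranteed to be present."""
        self._in_flight.append((ap_address, list(host_values)))

    def apwd(self):
        """Wait on Data: all started transfers are complete on return."""
        for address, values in self._in_flight:
            for i, value in enumerate(values):
                self.main_data[address + i] = value
        self._in_flight.clear()

    def apget(self, ap_address, n):
        """Read n words back; missing words show up as None."""
        return [self.main_data.get(ap_address + i) for i in range(n)]
```

In this model, reading back immediately after apput returns
None values, while reading after apwd returns the transferred
data, which is the kind of mismatch the trial-and-error removal
of wait commands is meant to detect.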
The application program resides in the host memory and
the host executes this program. The host will determine
which routines must be passed to the AP-120B and if the data
necessary is present in the array processor. When a routine
is called, the host will jump to it and execute it, but if
the routine called is part of the math library (whether from
APMATH or a user-written math routine), the host first jumps
to APEX. APEX then loads the 64-bit instructions into the
AP-120B Program Source Memory, calculates the remaining
space available in the Program Source Memory, updates the PS
location table, loads the parameters and initiates the
execution. If the same routine is called again immediately,
it will not be reloaded since it is already present, but only
the new parameters will be loaded. If a different routine
is called, APEX will first check the PS location table to
see if there is enough unused space available to load it
without destroying any routines currently residing in
Program Storage. If not enough space is available, the
last-written program will be overwritten with the newly
called routine (Last In First Out (LIFO)).
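The bookkeeping described above can be sketched as a small
table that tracks resident routines and overwrites the most
recently loaded one when space runs out. This is an
illustrative model only; the class name, the sizes and the
decision to treat any resident routine (not just the
immediately preceding one) as reusable are assumptions.

```python
class PSLocationTable:
    """Toy model of APEX's Program Source location table."""

    def __init__(self, capacity_words):
        self.capacity = capacity_words
        self.routines = []            # (name, size) in load order

    def used(self):
        """Words of Program Source Memory currently occupied."""
        return sum(size for _, size in self.routines)

    def call(self, name, size):
        """Make a routine resident; return what had to be loaded."""
        if any(resident == name for resident, _ in self.routines):
            return "parameters only"  # routine already present
        if size > self.capacity:
            raise ValueError("routine larger than Program Source Memory")
        # Last In First Out: discard the most recently loaded
        # routines until the new one fits.
        while self.used() + size > self.capacity:
            self.routines.pop()
        self.routines.append((name, size))
        return "routine and parameters"
```

With a capacity of 100 words, loading A (60) and B (30) and
then calling C (30) overwrites B and leaves A resident, which
is why loading the most used routines first keeps them from
being overwritten.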
The overhead required for each math library routine
called is between 100 and 1000 microseconds. One hundred
microseconds is the minimum time required to check the table
and move parameters. This minimum time is required for
every call, even in looping operations. During this period,
the host must be available to the AP-120B, which would cause
unnecessary host overhead. While the AP-120B is executing
any specific routine, the host can be freed to do other
tasks and treat the AP-120B as a peripheral device. The
host can either be interrupted or can use polling techniques
to determine if the array processor requires assistance. In
either case, the programmer must be aware of when a break
occurs so he can ensure that the proper sequence of routines
is used to allow the host to perform other operations and
not be burdened by many AP-120B services.
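Because the minimum overhead is paid on every call, it adds up
quickly inside loops. A minimal sketch of the arithmetic,
using the 100 and 1000 microsecond bounds from the text and a
hypothetical function name:

```python
MIN_CALL_OVERHEAD_US = 100    # table check and parameter move
MAX_CALL_OVERHEAD_US = 1000   # upper bound quoted for a call


def host_overhead_us(calls, per_call_us=MIN_CALL_OVERHEAD_US):
    """Total host overhead in microseconds for a number of calls."""
    if not MIN_CALL_OVERHEAD_US <= per_call_us <= MAX_CALL_OVERHEAD_US:
        raise ValueError("per-call overhead outside the quoted range")
    return calls * per_call_us
```

For example, a loop that makes 10,000 library calls spends at
least 10,000 x 100 microseconds, or one full second, in call
overhead alone, before any array processing is counted.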
Several ways to increase available free time in the host
are to transfer more than one vector with each APPUT or
APGET command, use optimum AP-120B library calls to perform
given operations (it is the programmer's responsibility to
determine which AP routines are best for each situation) and
overlap host and AP-120B operations whenever possible.
Since every call of a routine requires host intervention,
several routines can be combined into one by writing a
special macro combining those routines, which will
effectively eliminate some host overhead by using only one
"call" statement. (But these macros must be small due to
limited AP-120B program memory.) Since host overhead varies
between 100 and 1000 microseconds, with the higher value
being due to the maximum amount of data and program
transfer, some overhead can be eliminated by loading the
most used routines first, since overwrite is accomplished by
LIFO. APEX must also be a part of the interrupt priority
scheme of the host (interrupt or polling); therefore, by
having the AP-120B at a high priority, the overall wait time




The MAP-300 (Macro Array Processor) (fig 13) is manufactured
by CSP Incorporated, Burlington, Massachusetts. The basic
structure consists of three independent busses, an executive
routine, two parallel arithmetic units, an addresser and an
input/output handler, each having its own clock and
operating in a parallel asynchronous fashion. The basic
logic units are the Central System Processor Unit (CSPU),
the Arithmetic Processor (AP) (consisting of the Arithmetic
Processing Unit (APU) and the Addresser Processor Section
(APS)), the Host Interface Scroll (HIS) and an optional
Input/Output Scroll (IOS). All except the CSPU use
micro-coded routines stored in their own small memories and
communicate with each other via flags set in registers.
(The CSPU stores its micro-coded routines in main MAP
memory.) The Host Interface Module (HIM) section of the HIS,
the IOS and the CSPU are built around a standard Intel 3002
bit-slice microprocessor.
The representation of MAP-300 numbers is usually a
32-bit floating-point format with a one-bit sign, a
seven-bit exponent (giving a range of 16 ** -64 to 16 ** 63,
biased by 64, therefore 0 to 127 are the actual numbers
stored) and a 24-bit mantissa, allowing a total range of
10 ** -77 to 10 ** 76. Sixteen-bit floating-point and 16-bit
fixed-point






































addressable in either 32-bit full-words or 16-bit half-words,
but eight-bit bytes can be accessed by packing pairs into a
16-bit half-word [18]. SNAP-II commands like VFIXb assume
this packing exists 15^13. The ability to address in
half-words and/or bytes is important as it may increase the
efficiency of the program and array processor, allowing
operations to be performed which may not have otherwise fit
in a word-only addressable memory.
Although the MAP-300 is asynchronous, the advertised
average CSPU cycle time is approximately 70 nanoseconds, with
about 500 nanoseconds required for a memory read/write
operation when using 500-nanosecond MOS memory
(125 nanoseconds using bipolar). Full-word operands and
results starting on an odd address boundary, however,
require about two 500-nanosecond memory cycles. A
pseudo-operation can be used to ensure even-boundary
locations exist [18].
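The cost of an odd address boundary can be sketched as
follows, assuming addresses are counted in 16-bit half-words
and 500-nanosecond MOS memory. The function names are
hypothetical, and align_even only mimics the effect of the
pseudo-operation mentioned above rather than reproducing it.

```python
MEMORY_CYCLE_NS = 500   # 500-nanosecond MOS memory


def full_word_access_ns(half_word_address):
    """Approximate time to access a 32-bit full-word.

    A full-word starting on an even half-word boundary takes one
    memory cycle; one starting on an odd boundary takes about two.
    """
    cycles = 1 if half_word_address % 2 == 0 else 2
    return cycles * MEMORY_CYCLE_NS


def align_even(half_word_address):
    """Round an address up to the next even boundary."""
    return half_word_address + (half_word_address % 2)
```

An operand placed at half-word address 7 would cost about 1000
nanoseconds per access, while moving it to address 8 brings the
cost back to 500 nanoseconds.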
The MAP-300 is capable of operating in temperatures
from 0 to 50 degrees centigrade at 10 to 90 percent
humidity. The power requirements are either 115 VAC or 230
VAC single phase, plus or minus ten percent, at 47 to 63
hertz. The weight is approximately 80 pounds.
The MAP relies heavily on internal parallel processing
to increase throughput and limit wait time. The MAP-300
stores the executive and array routines in its own memory
(as opposed to storing them in the host memory). With the
use of function lists and statements like "MPlrtHL" (MAP
version of the "DO WHILE"), the MAP can operate independently
of the host after initial loading of the program [14]. With
the three-bus structure, the MAP theoretically can
simultaneously input into one memory and output from the
second while doing computations on the third, and never
utilize the host except for initialization.
The MAP has a separate instruction set for the Central
System Processor Unit (CSPU), Arithmetic Processor Unit
(APU), Addresser Processor Section (APS), and Host Interface
Scroll (HIS). Inasmuch as these processors work
independently, the instruction sets are not as complicated
as may have been necessary if operation was controlled
totally from a central site. The total number of
instructions per second attainable by the MAP-300 is data
dependent. Whenever all steps necessary to perform the
operation are completed, as witnessed by properly setting
the correct flags in Pseudo-memory (to be discussed later),
the operation will perform to completion. While the
addition/multiplication operation is being carried out in the
APU, preparation for the next word (half-word) of
information can be conducted in the unaffected processors.
System flags are used to communicate between the processors.
These flags include General Purpose flags available to the
programmer for general system communication, Control flags
to control processor modes and operation sequencing, Status




The MAP-300 system installed for evaluation consisted
of: the MAP-300 processor, interface with the PDP-11
computer utilizing the RSX-11M operating system, 24K words
of 500-nanosecond MOS master memory (8K for each memory),
power panel, expansion chassis, installation, I/O driver,
SNAP-II algorithm library, cross assembler, simulator and
loader. The price of the system was $44,500 [21].
A. CHARACTERISTICS AND HARDWARE
1. CSPU
The Central Processor Unit (CSPU) (fig 14) is the
"Command Central" of the MAP-300 array processor. The CSPU
responds to commands from the host, transfers data to and
from the host, assists the APS in address calculations and
loads the program memories of the Arithmetic Processor and
Host Interface Module. The CSPU performs the functions of a
front-end microcomputer to control the actions of the
system.
The CSPU has a fast, fixed-point arithmetic unit for
address calculations, an instruction register, an
eight-register accumulator file and a priority interrupt
network. It has access to the three main memories via the memory
busses and supplies the other MAP processors with the




































































































subroutines and multi-level indirect addressing are
recognized by the CSPU. It has no I/O capability but
instead instructs the Host Interface Scroll (or I/O Scroll)
to perform input or output operations to or from the host
(or external devices). The CSPU will never halt but will
always be in the WAIT state after its instruction sequence
is completed.
An important register in the CSPU is the Control
Status Register or C-State Word (CSW). It is a 32-bit
register containing the status of prior operations, the
program counter as well as the source and destination
locations for block memory transfers. Fields of the
register can be combined to give hardware condition codes
for use in conditional operations, branches, jumps or
executes. The CSW also stipulates on which bus instructions
or data are present and controls the interrupt responses for
other units.
The CSPU is the only processor able to be
interrupted in the MAP (other processors can either Halt or
Wait) and contains a 64 level interrupt priority system with
one interrupt device per level and three lines per device
(192 possible combinations). The CSPU may only be
interrupted between instructions. It will also nest and
queue lower priority interrupts if a higher priority
interrupt is received during the servicing of a lower
priority interrupt. These interrupts are detected by
polling, and levels are polled only if they are above the
current interrupt level. Lower level interrupts will
continue to exist but will not be recognized until the
higher priority interrupts are serviced.
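The nesting and polling behavior described above can be
illustrated with a short sketch. It is not CSPI code: the
class and method names are hypothetical, and both the level
count and the convention that a larger number means a higher
priority are assumptions made only for illustration.

```python
from dataclasses import dataclass, field

NUM_LEVELS = 64  # assumed level count; a larger number is a higher priority here


@dataclass
class InterruptController:
    """Toy model: requests are latched and polled between instructions."""
    pending: set = field(default_factory=set)   # levels with a latched request
    active: list = field(default_factory=list)  # stack of levels being serviced
    log: list = field(default_factory=list)     # order in which service began

    def request(self, level: int) -> None:
        if not 0 <= level < NUM_LEVELS:
            raise ValueError("level out of range")
        self.pending.add(level)

    def current_level(self) -> int:
        # -1 means no interrupt is being serviced
        return self.active[-1] if self.active else -1

    def poll(self) -> bool:
        """Between instructions, examine only levels above the current one."""
        candidates = [lv for lv in self.pending if lv > self.current_level()]
        if not candidates:
            return False  # lower levels stay latched but are not recognized
        level = max(candidates)
        self.pending.discard(level)
        self.active.append(level)  # nest on top of the interrupted handler
        self.log.append(level)
        return True

    def finish(self) -> None:
        """The current handler completes; the interrupted one resumes."""
        self.active.pop()
```

In this model a level 2 request made while level 5 is being
serviced remains in the pending set and is taken up only after
level 5 and any higher levels have finished.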
The CSPU contains no memory but uses main memory to
store its instructions. When fetched, these instructions
are stored in the instruction register until execution. The
CSPU may also address a pseudo-memory location called the System
Flag Register (SYSFLG) which is the primary inter-processor
communication system. By testing the bits of SYSFLG, the
CSPU can sense the status of any of the other processors.
(Pseudo-memory refers to memory physically located within
the sub-processors but which appears on the bus as a memory
address, similar to the PDP-11/34/45/55/60/70.) [18].
2. Arithmetic Processor
The Arithmetic Processor consists of two components,
the Arithmetic Processor Unit (APU) and the Addresser
Processor Section (APS).
a. APU
The Arithmetic Processor Unit (APU) (fig 15) is
responsible for the computation required in array processing
and executes programs relatively independent of the other
MAP processors, operating under the general control of the
CSPU. The APU consists of two adders, two multipliers (the


















































MAP-200 is that the former contains two adders and
multipliers while the others contain only one each), 34
various registers and three First-In-First-Out (FIFO)
buffers for input and output storage. The two adders and
two multipliers permit parallel processing of data to
increase throughput. APU programs are stored in main MAP
memory and are sequentially block-transferred to the APU
program memory under control of the CSPU.
The main units of the APU are the arithmetic
processors (AP1 and AP2). Each arithmetic processor
consists of an adder and multiplier that may operate
simultaneously and independently of each other. Each adder
is fed by eight registers and each multiplier by four
multiplicand registers and four multiplier registers. The
results of the adder are routed to the result register R and
the multiplier loads the product register P. To transfer
data between the separate arithmetic processors, an exchange
register is provided.
APU memory consists of two 256-word 16-bit
side-by-side memories. The memory is initially loaded by
the CSPU from MAP memory and the APU is then put into the
run state. Instructions are sequentially decoded in the APU
to perform the specified algorithm. The instructions are
16 bits for each board (AP1 and AP2) and are executed in
parallel. They can perform addition, multiplication,
transfer of data and the setting of flags. These
instructions are decoded and the operation started as soon
as all necessary conditions are met. Immediately, the next
instruction is retrieved and decoded and attempts to be
executed. If either the P/R register is involved in a
multiplication/addition operation which has not yet been
completed, the Input Queue (IQ) is empty or the Output Queue
(OQ) is full, the APU will go into a "wait" state. It will
remain in this "wait" state until the
multiplication/addition instruction is completed or the
other conditions are satisfied. There is a problem that can
exist due to the side-by-side 16-bit memories used for
program storage. Since there is only one program counter
and the AP1 and AP2 processors work in parallel, the
side-by-side memory acts as two halves of a 32-bit instruction
register. Therefore if one board (AP1 or AP2) is forced to
wait, the other must also wait since the next instruction
may not be retrieved until the program counter can be
incremented.
The Input Queue is a four-deep FIFO buffer which
services both AP1 and AP2. To get the next input data
field, the IQ must be advanced before the data is
transferred. If both boards request data without advancing
the queue, they will receive the same data, which may be
good for certain applications. If they both simultaneously
try to advance the IQ, it will advance only once and give
AP1 priority, then advance the second time after the
transfer has been completed to give data to AP2.
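The read and advance rules of the Input Queue can be modeled
in a few lines. This is only a sketch of the behavior as
described; the class and method names are hypothetical, and
holding the word currently presented outside the four-deep
buffer is a simplifying assumption.

```python
from collections import deque

IQ_DEPTH = 4  # four-deep FIFO, per the description above


class InputQueue:
    """Toy model of the shared Input Queue feeding both boards."""

    def __init__(self):
        self._fifo = deque()
        self._current = None  # word currently presented at the queue output

    def push(self, word):
        if len(self._fifo) >= IQ_DEPTH:
            raise OverflowError("IQ full")
        self._fifo.append(word)

    def read(self):
        """Read without advancing: every reader sees the same word."""
        return self._current

    def advance_and_read(self):
        if not self._fifo:
            # the real APU would enter its wait state here
            raise LookupError("IQ empty")
        self._current = self._fifo.popleft()
        return self._current

    def simultaneous_advance(self):
        """Both boards advance in the same step: the first board is
        served first, then the queue advances again for the second."""
        first = self.advance_and_read()
        second = self.advance_and_read()
        return first, second
```

Reading twice without advancing returns the same word to both
boards, which is the property the alternate-processor
programming method relies upon.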
There are two Output Queues, each of which is a
four-deep FIFO buffer. These queues allow maximum capacity
of the adder and multiplier to be utilized, since it is less
likely that the processor will have to wait for either
buffer to have a vacancy due to a busy bus system. If both
processors try to act on any single OQ, processor AP1 will
be given the priority.
A typical multiplication takes approximately six
cycles (420 nanoseconds) and a typical add takes about three
cycles (210 nanoseconds). Therefore, to increase
throughput, "hiding" adds, moves, etc. behind multiplies
will accomplish operations in the time it takes to do the
multiply alone. The most efficient method to program the
MAP-300 is to treat successive sample sets in alternate
processors; this effectively produces a multiply every
210 nanoseconds. Since there is one input queue, this
method allows both to have access to the same information
(by not incrementing the queue) and also gives a greater
chance to use hiding effectively.
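The timing argument can be checked with simple arithmetic.
The sketch below assumes a 70-nanosecond cycle, which is what
three cycles in 210 nanoseconds implies and which makes six
cycles equal 420 nanoseconds. The function names are
hypothetical, and the model ignores queue waits and bus
contention.

```python
CYCLE_NS = 70        # assumed: 210 ns / 3 cycles
MULTIPLY_CYCLES = 6  # typical multiply, per the text
ADD_CYCLES = 3       # typical add, per the text
NUM_PROCESSORS = 2   # the two arithmetic processors


def serial_time_ns(n_mult, n_add):
    """No overlap: each operation waits for the previous one."""
    return (n_mult * MULTIPLY_CYCLES + n_add * ADD_CYCLES) * CYCLE_NS


def hidden_time_ns(n_mult, n_add):
    """Adds overlap multiplies; the longer stream sets the time."""
    return max(n_mult * MULTIPLY_CYCLES, n_add * ADD_CYCLES) * CYCLE_NS


def effective_multiply_interval_ns():
    """Alternating sample sets between two processors halves the
    interval between successive products."""
    return MULTIPLY_CYCLES * CYCLE_NS / NUM_PROCESSORS
```

Under these assumptions one multiply with one hidden add costs
420 nanoseconds instead of 630, and two processors working on
alternate sample sets deliver a product every 210 nanoseconds.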
The APU can usually operate in two modes. Mode
One, the normalized mode, can use either normalized or
unnormalized floating-point numbers as input with the
results being a normalized floating-point number. Using
unnormalized floating-point numbers as input can lead to
precision loss since the normalization process will shift
the mantissa to the left (values less than .1) or to the
right (values greater than 1.0). The vacancies created by
these shifts will be filled with zeros, which, after
computation, could possibly produce an unusual truncation.
The unnormalized mode will accept unnormalized numbers as
input and will return unnormalized numbers as output [16].
b. APS
The Addresser Processor Section (APS) (fig 16)
computes both the address in MAP memory for the location of
input data words to be processed by the APU and the MAP
memory addresses for the output from the APU. It operates
independently of other processors, within status and control
flag constraints of SYSFLG. The APS contains a 128-word
25-bit memory, four program counters (two for read and two
for write), eight address buffers (to be used as inputs to
the adder), four First-In-First-Out (FIFO) buffers, an
arithmetic logic unit (adder), and associated logic and
control units.
The APS programs are stored in MAP main memory
and are loaded by the CSPU. Certain absolute address
locations must be known to an APS program at run time which
are not available during program writing. The assembler
computes them at assembly time and the CSPU inserts them
into the proper location during this program transfer. The
CSPU then initiates APS operation by setting the proper
flags. The APS may be loaded with new information by the


























the APU to slow and wait for a value in the IQ or a space in
the OQ. Because the instructions in MAP memory are 32 bits
long and the APS instruction is only 25 bits long, the seven
bits left over are used to store the APS memory address for
that instruction. This allows the CSPU to increase
throughput by immediately installing the instruction into
the correct location in a pre-computed order.
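Seven address bits select one of 128 locations, which matches
the 128-word APS memory. The packing can be sketched as
follows. The bit layout (address in the upper seven bits,
instruction in the lower 25) and all names are assumptions;
the source states only the field widths.

```python
INSTR_BITS = 25  # APS instruction width, per the text
ADDR_BITS = 7    # 2**7 = 128 locations in the APS memory
INSTR_MASK = (1 << INSTR_BITS) - 1
ADDR_MASK = (1 << ADDR_BITS) - 1


def pack(aps_addr, instr):
    """Build one 32-bit word (assumed layout: address high, instruction low)."""
    if not 0 <= aps_addr <= ADDR_MASK:
        raise ValueError("APS address needs more than 7 bits")
    if not 0 <= instr <= INSTR_MASK:
        raise ValueError("instruction needs more than 25 bits")
    return (aps_addr << INSTR_BITS) | instr


def unpack(word):
    """Split a 32-bit word back into (APS address, instruction)."""
    return (word >> INSTR_BITS) & ADDR_MASK, word & INSTR_MASK


def load_aps(words):
    """Install each instruction at the address carried in its own word,
    so the words may arrive in any order."""
    memory = [0] * (1 << ADDR_BITS)
    for word in words:
        addr, instr = unpack(word)
        memory[addr] = instr
    return memory
```

Because every word carries its own destination, the loader
needs no separate address table and can place instructions as
they arrive.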
The adder computes addresses dependent on prior
computational results, literals or specified increments.
All address addition and subtraction is considered to be
modulo 2 ** 17 so that only positive addresses in that range
will be computed. Results are queued in either the Read
Address FIFO (RAF) or Write Address FIFO (WAF). Along with
the address is a code to delineate whether the address is
full-word, half-word or byte (pair of bytes in a 16-bit half
word address) and if it is an eight-bit fixed-point number,
16-bit fixed-point number, 16-bit floating-point number or a
32-bit floating-point number.
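Modular address arithmetic of this kind is easy to
demonstrate. The sketch below takes the modulus to be 2 ** 17;
the function names are hypothetical and the stride-based
address stream is only one example of a "specified increment".

```python
ADDRESS_SPACE = 2 ** 17  # modulus assumed from the description above


def next_address(base, increment):
    """Add a signed increment, wrapping so the result is never negative."""
    return (base + increment) % ADDRESS_SPACE


def address_stream(start, stride, count):
    """Generate `count` addresses beginning at `start`, `stride` apart."""
    addresses = []
    addr = start
    for _ in range(count):
        addresses.append(addr)
        addr = next_address(addr, stride)
    return addresses
```

A decrement from address 0 wraps to 131071 rather than
producing a negative address, and a stream that runs off the
top of the range continues from 0.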
The distinctive feature of the APS is that there
are four program counters (P0, P1, P2 and P3). These allow
four separate programs to be stored in the APS and executed
in an interleaved manner. Sequencing of these programs is
controlled by the status of the WAF and RAF in conjunction
with the APS instructions. These program counters also
provide a looping ability allowing the APS to work with the
Host Interface Scroll or I/O Scrolls to keep data flowing.
After one memory has been processed and reloaded, the APS
need not be reinitiated but can continue operation on the
new data by this looping feature [18].
3. Host Interface Scroll
The Host Interface Scroll (HIS) consists of two
subsections, the Host Interface Module (HIM) (fig 17) which
is located in the MAP-300 and the Host Interface Controller
(HIC) which is located in the host memory. The Host
Interface Module transfers MAP programs, unprocessed data,
host status and Host Interface Controller commands from the
host to the MAP. Processed data, MAP status and processing
commands are also transferred from the MAP to the host via
the HIM. A programmable scroll processor is provided for
computing MAP and host memory locations during a Direct
Memory Access (DMA) operation. Other pertinent devices
include a memory-bus interface, controllers for host memory,
format conversion hardware, status and control logic along
with interrupt logic.
The HIC controls the handshaking necessary between
the host and the MAP. The handshaking consists of interrupt
logic from MAP to host and logic necessary for controlling
the transfer of data with either the Direct Input/Output (DIO)
facility or DMA transfer [18].
The host generally interrupts the MAP to initiate

































































will initiate communication (interrupt) with the host for
further work. When the interrupt is acknowledged by the
host, more data or programs are sent to the MAP depending on
the flags. (If all MAP processors are in a loop operating
on data supplied from external devices and delivered to
external devices via I/O Scrolls, the host will not be
interrupted unless there is an error. This frees the host
to do any other unrelated processing necessary.) The
maximum response time to initiate an interrupt is 15
microseconds for the HIM and 250 microseconds for a user
CALL routine [35].
4. Memory
Main memory in the MAP-300 consists of three
independent busses each having the capability of 256K words
of 500-nanosecond MOS memory or 64K words of bipolar memory.
Memory types may not be intermixed on any given bus but each
bus may have a different type from another bus. Memory can
also be either master or slave, master memory being used to
control program execution, arbitrate and observe system
protocol while slave memory stores the data. Each memory
bus containing memory is required to have at least one
master memory module (available in either 4K or 8K blocks
for MOS or 1K, 2K, or 4K blocks for bipolar).
Access to each memory is via a common bus having 11
ports and two priority levels. Three ports are reserved to
be used with the absolute priority scheme, leaving eight
ports with a sequential round-robin (polite) priority
scheme. Absolute priority is the highest priority and is
intended to be used with high speed minimally-buffered
devices such as disc units or tape units where loss of data
may result. Sequential round-robin priority handling is
used for slower buffered devices and is a round-robin
(circular) queue which is checked each memory cycle. The
device first in the queue will get the next memory cycle.
Scanning for the next queued device will commence
immediately upon the previous device starting transfer. When
the next memory cycle occurs the new device will be known,
keeping overhead minimal. Of these 11 ports, the HIS and
CSPU each have one dedicated port and the AP has two
dedicated ports on each bus with seven ports remaining for
the IOS and other uses.
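The two-level port arbitration can be sketched as a small
model. This is an illustration only: the class name, the port
labels and the assignment of particular devices to the
absolute or round-robin class are assumptions, since the
source does not state which devices occupy which ports.

```python
from collections import deque


class BusArbiter:
    """Toy model of one memory bus: absolute ports always win;
    the remaining ports share cycles in circular order."""

    def __init__(self, absolute_ports, round_robin_ports):
        self.absolute = list(absolute_ports)  # highest priority first
        self.ring = deque(round_robin_ports)  # circular polling order

    def grant(self, requests):
        """Return the port granted the next memory cycle, or None."""
        for port in self.absolute:
            if port in requests:
                return port
        for _ in range(len(self.ring)):
            port = self.ring[0]
            self.ring.rotate(-1)  # scanning resumes after this port
            if port in requests:
                return port
        return None
```

Two round-robin ports that both request continuously are
granted alternate cycles, while a request on an absolute port
is served ahead of either of them.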
Pseudo-memory (alluded to earlier) is the upper 4K
words on Bus 1 containing addresses of certain registers
used for status and control. These registers are located in
the sub-processors but appear as addresses on the memory
bus. Any sub-processor may alter the contents of these
locations so it is important that the programmer not try to




As with the AP-120B, there are software routines to aid
in program development and execution.
1. Executive and Associated Routines
a. Assembler
The MAP-300 assembler, written in ANSI Fortran
IV, takes a source program written for either the CSPU, APU,
APS, HIS or IOS and creates an executable object module. A
listing file and errors file can also be created. Editing
and updating can be accomplished from the last source file
by changing and assembling only the incorrect line (or
lines) of code, thereby avoiding the reassembling of the
entire program [18]. The assembler will also allow change
of the HIM memory to enable it to handle necessary
buffering.
b. Simulator
The MAP Simulator Program simulates model 200
and model 300 processors by executing MAP object code. The
simulator permits the programmer to develop or debug
software off-line so as not to disturb production schedules.
The MAP Simulator Program has the capability of
simulating the operation of the APU, APS, CSPU, Memory and
the interrupt handler. It has not been updated to handle
certain new commands and flags (listed in the front of
ref [25]) nor does it have the ability to simulate the APU
test mode. Memory size and type can be specified either in
the initial loading of the simulator or while running to
tailor it for current or proposed configurations.
When used as a debugging aid, the MAP Simulator
Program allows the operator to: install breakpoints and
execute macro instructions at these breakpoints; detect
program errors and execute macro instructions after their
discovery; examine register contents; run programs from
different processors (APU, CSPU, etc.) independently; and
patch loaded programs. Input/output may be obtained from a
terminal, printer, tape (magnetic or paper), cards or
cassette. A batch mode is also available. Actual program
timing can be estimated by installing breakpoints and
individually timing small sections of code [25].
c. Loader
The MAP Loader is a Fortran program which
accepts object code produced by the Assembler and creates
blocks of binary code in MAP machine language. This code is
transmitted to the MAP memory via the MAP driver through the
Host Interface Scroll. Errors in transmission are
detectable since check-sum digits are transmitted to the MAP
along with the blocks of code. The Merge operation creates
and updates the tables and addresses necessary if the loaded




The MAP-300 diagnostic package is designed to
verify hardware operations and isolate any malfunction to a
specific card. One module is resident in the host while
another, which contains the test modules and test programs
necessary to determine proper system operation of the CSPU
and other sub-processors, is present in the MAP. This
software can run interactively or under batch processing
[18].
The MAP-300 LUCK program permits the programmer
to examine MAP memory (or pseudo-memory) from any computer
capable of operating under ANSI Fortran IV. This is also an
interactive routine and provides the ability to "patch"
coded program segments or enter entire machine language
programs. The programs or segments can then be stepped
through to examine the results closely [20].
2. SNAP-II
Systematic Notation for Array Processing Version II
or SNAP-II is a single-command high-level macro-type
language used to program the MAP-300 array processor. The
SNAP-II package consists of a Host Support Module, Host/MAP
driver module, SNAP-II Executive, SNAP-II Function Modules
and an Installation Test and Acceptance Test Module [18].
The SNAP-II executive permits the user to define
buffer size, and the structure and location of programs in
MAP memory. The executive also structures the routines to
operate at maximum speed by insuring that the maximum
possible parallelism exists between sub-processors (for CSPI
written functions), thereby accentuating "hiding". The
SNAP-II subroutines are written in ANSI Fortran and passed
to the MAP via Function Control Blocks (FCB). The MAP
Driver, which is located in the host, directs the loading
and operation of the programs. (In a loop or "Map While"
condition the driver need only load and initiate the
sequence then return control to the host operating system.)
SNAP-II allows the programmer to build his own
function lists with the Fortran type statement "Map Begin
Function List" (MPBFL()) which permits the host to remain as
free as possible from the operation of the MAP.
Two-dimensional arrays are demultiplexed by SNAP-II thereby
increasing speed of execution in the processor by not having
to compute two-dimensional address structures. SNAP-II
functions are callable from either ANSI Fortran or Host
assembly language programs and are able to operate on both
real and complex data [15].
3. Programming Language
If SNAP-II functions are not specific enough to
satisfy the programmer's needs or if they do not exist in
the SNAP-II library, new routines may be written in an
assembler type language. The CSPU, APU, APS and HIS each
have their own instructions to optimize each sub-processor's
capabilities.
The CSPU instructions are broken into 10 groups
which have the ability to perform all the functions that a
general purpose computer is normally visualized as
performing. They include: generic (performs interrupt
system coding and looping); single register; move; logical;
push and pop; hop and jump (a hop is within 256 half-word
locations and a jump can be to any new location); skip and
bit manipulation; compare; and maintenance and test console
instructions. The APU can perform: two-argument adder;
single argument adder (like approximate reciprocal
instructions); multiply; data transfer; jump and call; and
control operation instructions. The APS performs: load;
address increment; register arithmetic and control type
instructions. The HIS recognizes: single register; logical
register; arithmetic register; literal and control
instruction types [18].
Since each sub-processor is designed to perform a
special operation and can be programmed to optimize that
design, the overall performance of the system is increased.
All processors perform in parallel and stay in "sync" by the
use of flags. A sub-processor will wait until the proper
flag is set before continuing, thereby insuring integrity.
The waiting also relieves the programmer of "counting
cycles" with No Operation (NOP) instructions which could
possibly cause lost data. The drawback is that he does have
an increased complexity by insuring that proper flags are
set at the proper time [16]. Most of these encumbrances are
eliminated by the executive, however. Flags are available in
pseudo-memory and are easily tested. The complexity issue
is minimal since for most applications only APU and APS
routines need be written. Only under special circumstances
is a CSPU or HIS routine required.
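Flag-based synchronization, as opposed to cycle counting, can
be illustrated with two cooperating threads. This is an
analogy in host software, not MAP code: the Flag class stands
in for a single SYSFLG bit, and the two flag names and the
doubling operation are hypothetical.

```python
import threading


class Flag:
    """Stand-in for one flag bit: set by one processor, awaited by another."""

    def __init__(self):
        self._event = threading.Event()

    def set(self):
        self._event.set()

    def clear(self):
        self._event.clear()

    def wait(self):
        self._event.wait()


def run_pipeline(samples):
    """Pass each sample from an addresser thread to an arithmetic thread."""
    samples = list(samples)
    data_ready, data_taken = Flag(), Flag()
    slot = {}
    results = []

    def addresser():  # plays the APS role: supplies operands
        for sample in samples:
            slot["value"] = sample
            data_ready.set()
            data_taken.wait()  # no NOP padding, just wait for the flag
            data_taken.clear()

    def arithmetic():  # plays the APU role: consumes operands
        for _ in samples:
            data_ready.wait()
            data_ready.clear()
            results.append(slot["value"] * 2)
            data_taken.set()

    threads = [threading.Thread(target=addresser),
               threading.Thread(target=arithmetic)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Neither thread needs to know how long the other takes; each
simply waits for the flag, which is the property that removes
the need to count cycles.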
Pseudo-operations are also available to ease the
programming burden. They perform such tasks as naming
character strings, insuring that information is placed into
memory on a word boundary, generating constants and making a
test Control Status Word (CSW).
4. I/O Scrolls
The I/O Scrolls (IOS) control block-transfers to or
from external peripheral devices (including other MAPs)
without interfering with the MAP-300 processing cycle by
using a sub-processor which can be pre-programmed. The IOS
contains three functional elements: protocol logic necessary
to interface the external device directly to the MAP-300
memory busses, a programmable processor to compute MAP
addresses and issue control signals, and the transfer logic
necessary to interface with peripheral devices.
There are five basic IOS models. IOS1, also known
as the maintenance and test console, is capable of
transferring eight-bit single words to MAP bus number one at
a 5 KHZ rate. IOS2 has two transfer rate options and two
word size options available. Word size option one utilizes
the block-transfer of 8 or 16-bit words to any of the three
MAP busses while option two uses either 16 or 32-bit words.
Transfer rate option one conveys information at a 1 MHZ rate
as compared to the 2.5 MHZ rate of option two. Either
transfer rate option may be combined with either word size
option; however, only one combination is available at a time
since they are hard-wired. Under program control, IOS3 can
transfer either 16 or 32-bit words to any of the three
busses at a 750 KHZ sustained rate. IOS3 can also perform
format conversion, monitor data with a basic operation
similar to the HIM and support indirect addressing. IOS4 is
a high speed (up to ^0 MHZ) scroll, allowing block transfers
only, of 8, 16, 32 or 64-bit words to any bus (64-bit words
must be transferred simultaneously to bus 2 and bus 3).
IOS4 also allows packing and buffering of data [18]. IOS5
is a direct memory-to-memory bus-connect option for direct
data transfer between user devices and the MAP-300. The
module requires no software (and will not support software).
Its operation is controlled by hardware and three interrupt
request lines [21].
a. Analog Data Acquisition Module
The Analog Data Acquisition Module model 5120
(ADAM-5120) is a programmable analog interface capable of
accepting from 2 to 16 channels of analog information. This
information is then digitized to 12-bit resolution at a 2.1
KHZ throughput rate for the 16-channel case (125 KHZ for
single channel). As with the I/O Scrolls, the A/D operation
may take place simultaneously with the MAP-300 processing.
The ADAM is functionally equivalent to the lOSd with only
added analog-to-digital circuitry. This allows the ADAM to
be SNAP-II compatible.
The operation of the ADAM is carried out via a
set of up to 16 sample-and-hold units which then make their
signals available to a 16:1 multiplexer. Each channel of
the multiplexer is then consecutively sampled by the A/D
converter which outputs either a 16-bit sign-magnitude or
16-bit floating-point number. Performance accuracy is
specified at 0.2 percent of full-scale resolution [2].
C. PROGRAMMING, OPERATION AND EXECUTION
The MAP-300 can not only utilize parallel operations of
the adder and multiplier in the APU, but also the parallel
sub-processor operation of the APS, HIS, IOS, APU and CSPU
to increase total throughput. The programmer, by breaking
the problem into smaller independent programs of addressing,
arithmetic, I/O and management, can theoretically more
easily program the entire problem than by adhering to
internal communication protocol and flags [18]. The
respective programs should be easier to write with much of
the increase in overhead due to the added handshaking and
protocol requirements being assimilated by the executive
[16].
CSPI recommends that a modified top-down programming
technique be used, initially writing the APU routine first
to insure the optimum execution speed, then adding the other
necessary routines (generally just the APS routines) to
insure the information is present when the APU needs it.
The APU should be programmed to treat subsequent sample sets
in alternate adder/multiplier modules and arrange data so
that as many adds can be "hidden" as possible [18]. By
proper execution sequencing, total time can be shortened to
equal the time to multiply only, with all other operations
"hidden" under these multiplies. This "hiding" operation
becomes easier in the MAP-300 than in the AP-120B since
cycles need not be counted and NOPs need not be inserted
for unused cycles due to flags being set to signal the
availability of resources [16]. The programmer must be
aware that the timing is not absolute; therefore the
executive will tightly control synchronization by flags to
insure one adder/multiplier does not get ahead of the other.
The programs are initially loaded from the host to the
MAP via the operating system interface and driver. The
MAPOVR.MAC routine makes the standard interface through the
operating system and MPQRV.MAC makes the MAP appear as a
standard RSX-11M device to the computer. Initial
communication from the host to the MAP is done via a four
word Driver Control Block (DCB) [26]. When the Central
System Processing Unit is initialized by the host, it will
load the other sub-processor programs and commence program
execution.
Subsequent MAP commands are sent to the MAP from the
host via Function Control Blocks (FCB) which require host
intervention to send. (Function lists and the MPWHL macro
treat multiple FCBs as a single entity.) These FCBs
transmit host to MAP status, interrupts and functions to
perform and can be queued in the HIS buffer. When it is no
longer necessary for the host to send or receive an FCB, it
can perform other operations [35]. Therefore, with
efficient use of the IOS and the possibility of stringing
MAPs in series, the host can be free to either perform other









VI. DISCUSSION OF FINDINGS
In the test bed, the PDP-11/34 was chosen to perform
the front-end functions which consisted of buffering the
data, formatting it and then passing it to the array
processor or mass storage device (or from the mass storage
device to the array processor). This limited front-end
inputting function did not dictate that the computer be
large. The choice of the PDP-11/34 computer for this
application seems adequate. The PDP-11/04 would normally
contain enough speed to handle the necessary operations but
may be unsatisfactory since it does not have a resident
memory control and protection routine to ease the
programmer's burden and help insure system integrity, nor
does it contain the 2K cache memory to increase speed. A
computer larger than the PDP-11/34 may not increase the
efficiency of the system although it would increase the
cost.
The test bed utilized the PDP-11/70 for the output
computer. The output computer would be required to receive
information from the array processor, manipulate the data
and store it for future display on one or more devices. For
this application, the PDP-11/70 seems best for several
reasons. The system is much like the 11/34 except that the
current maximum memory is 2 megabytes to allow for better
utilization of information. There are dedicated paths to
high performance storage devices that would allow more
information to be processed per unit of time. To further
process arrays for output, there is a 32-bit or a 64-bit
floating-point arithmetic unit available. The PDP-11/70
gives large-computer performance and expansion capabilities
with the cost and space requirements of smaller units [51].
Using the same manufacturer for the output function as was
used for the input function reduces interface problems and
contributes to the proficiency of the programmers by
increasing overall knowledge of the architecture.
The proposed test bed uses of the 11/34 and 11/70 can
be greatly modified by the choice of the array processor.
The MAP-300 utilizing an Analog Data Acquisition Module
and/or I/O Scroll can eliminate the need for the input
functions (including 16 channel analog-to-digital
conversion), therefore permitting the 11/70 (or possibly a
less costly model) to perform input, output and monitor
functions in the test bed. In fact, the 11/70 will probably
be large enough and fast enough to facilitate combining all
subsystems, except the display subsystem, under one
computer. The 11/34 and 11/70 combination should provide
for the full range of computers necessary to properly
emulate and evaluate just how much computer capability will
actually be needed for any specific application.
The question arises as to which is the best array
processor for the application. The AP-120B is synchronous,
therefore some may say safer, has a 38-bit word which could
mean greater accuracy, more standard library functions (such
as vector log base 10 and vector log base e) and a 3500 hour
mean time before failure. The MAP-300 is a newer system
which, due to the minimal host involvement, three separate
busses, I/O Scrolls and the ADAM, can provide greater long
run throughput and more flexibility.
For the non real-time environment where simple
programming and host involvement can be tolerated, the
AP-120B may be a good choice. It can provide facilities to
tailor algorithms to specific needs; these facilities are
not yet too complex to tax the normal programmer. However,
new programs cannot be added directly to the AP math library
(AP.MATH) but must be linked and loaded for every usage as
would any application program. This creates an excessive
time overhead. Therefore, the AP-120B should be used only
where simplicity and ease of use are paramount and utility
can be sacrificed.
For applications requiring real-time computations
(which the test bed most likely will eventually demand),
innovative design, high throughput rates and generally
greater flexibility, the MAP-300 provides the answer. The
improved performance of both array-processing potential and
computer availability is offset by the increased cost of
program development if non-library routines must be written.
These routines however may be added to the library,
effectively reducing overhead. Reference [23] reports that
the MAP-300 also complies with MIL-E-lbaOO, MIL-L-5a00,
MIL-STD-abl A, MIL-STD-70UB and MIL-STD-1599.
During the installation of the MAP-300 at the Naval
Postgraduate School, it was noted that the installation
documentation was extremely poor. As of this writing, three
weeks were required to install the system. This was due
mainly to the poor documentation in the installation package
received with the unit. Not only was the package
incomplete, but changes had been made to the software that
were not reflected in the original documentation, nor was an
errata sheet provided.
It is realized that for many companies involved in the
manufacture of data processing equipment, documentation is
not a chief concern. However, CSPI seems to have far poorer
installation documentation than would reasonably be
expected. This situation made it impossible to do a good
test of the system operation and allowed only a cursory
review.
Even with the evident shortcomings of the documents,
theoretically the MAP-300 is far superior to the AP-120B.
If CSPI would upgrade their documentation and perform the
installation at the site, their sometimes negative public
image could be eliminated and confidence in their equipment
could be increased. It must be noted, however, that ref [I'^J
and the publication "Simple Notation For Array Processing,
Version II, Reference Manual" are excellently written.
Therefore, in the following discussion, the use of the
MAP-300 will be assumed. I will now look at each subsystem
closely and attempt to determine alternate designs.
The analog subsystem obtains data from one of four
sources: a time code reader/generator, a 14-track recorder
(Honeywell 96), a signal synthesizer (Rockland 5100) and/or
a noise generator (HP 3722A). Up to 128 channels of input
are amplified and sent through a programmable matrix switch,
resulting in 32 channels of output to a programmable
32-channel filter. These analog signals then leave the
analog subsystem to be input to the signal processing
subsystem.
The AN-5400 analog-to-digital converter performs a
Id^-bit A/D conversion, and the result is then loaded into
the Ampex Megastore mass storage device through the
PDP-11/34 computer. The output of the array processor will
then be sent to the data processing subsystem.
I suggest it may be easier, more flexible and cheaper
to input the 32 channels to the programmable filter as
before, but then to have the 32 channels handled by two
Analog Data Acquisition Modules feeding directly into the
MAP for processing or, via an I/O Scroll, Model 3, sent to
the PDP-11/70 storage devices for future use. This will
eliminate the expense of the A/D converter, Ampex Megastore
and the PDP-11/34 but, more important, it will be relatively
easy to perform calculations in real-time. Once the MAP-300
is started, it can perform without host intervention until
interrupted, and with an assumed input of 40 kHz the system
should not be taxed. The output of the MAP can then be sent
directly to the data-processing subsystem. The entire system
can also be less complex, affording easier system
development.
Assume that a fictional system with a 40 kHz input
requires an FFT and discrete digital filter to be done on
the information. A 1024 real to 512 complex Fourier
transform requires 3.0 milliseconds [21], and a 40 kHz input
rate would require 39.1 FFT's per second on the average
(40,000 / 1024). This would consume 117.3 milliseconds and,
assuming a 50 percent overhead, yield 175.95 milliseconds to
perform the Fourier transforms. Discrete filtering would
require another 39.1 * (1024 * (2 * 500 nanoseconds + 12 *
70 nanoseconds)) or 73.67 milliseconds. Again assuming 50
percent overhead, 110.51 milliseconds would be necessary for
the filtering. The total time consumed by the two functions
would be 286.5 milliseconds out of each second, leaving
713.5 milliseconds for other work. (Fifty percent overhead
is an over-estimation.) Loading data into the MAP-300 would
be hidden behind the FFT operation (except for the initial
case) and would not contribute to overall execution time.
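The arithmetic of this estimate can be checked with a short
script. This is only a sketch: the function and constant
names are illustrative and do not come from the thesis or
from any CSPI software, the two per-point filtering terms are
read as 2 * 500 ns and 12 * 70 ns (the reading that
reproduces the 73.67 millisecond figure), and the FFT rate is
rounded to 39.1 per second as in the text.

```python
# Timing budget for the hypothetical MAP-300 example.
# All names are illustrative; the figures are those quoted in the text.
SAMPLE_RATE_HZ = 40_000   # assumed input rate of the fictional system
FFT_SIZE = 1024           # real input points per FFT
FFT_TIME_MS = 3.0         # quoted time for one 1024-point real FFT
OVERHEAD = 0.50           # 50 percent overhead allowance
FIRST_TERM_NS = 2 * 500   # first per-point filtering term
SECOND_TERM_NS = 12 * 70  # second per-point filtering term


def map300_budget_ms():
    """Return (fft_ms, filter_ms, total_ms, remaining_ms) per second of input."""
    ffts_per_second = round(SAMPLE_RATE_HZ / FFT_SIZE, 1)   # 39.0625 -> 39.1
    fft_ms = ffts_per_second * FFT_TIME_MS * (1 + OVERHEAD)
    per_point_ns = FIRST_TERM_NS + SECOND_TERM_NS            # 1840 ns
    filter_ms = (ffts_per_second * FFT_SIZE * per_point_ns * 1e-6
                 * (1 + OVERHEAD))
    total_ms = fft_ms + filter_ms
    return fft_ms, filter_ms, total_ms, 1000.0 - total_ms


if __name__ == "__main__":
    fft_ms, filter_ms, total_ms, remaining_ms = map300_budget_ms()
    print(f"FFT: {fft_ms:.2f} ms  filter: {filter_ms:.2f} ms")
    print(f"total: {total_ms:.1f} ms  remaining: {remaining_ms:.1f} ms")
```

Changing SAMPLE_RATE_HZ or OVERHEAD shows how quickly the
margin shrinks as the assumed input rate grows.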
This would effectively eliminate the entire
signal-processing subsystem with the exception of the
MAP-300. The PDP-11/70 computer in the data processing
subsystem could control the MAP along with its other
intended function of controlling the display subsystem. Any
storage necessary for output or any taped input data could
be handled by the tapes and disks associated with the 11/70,
and execution could be performed on the MAP-300 along with
the above calculations. However, for expanded utilization
not specifically addressed here, the above use of only one
MAP and no PDP-11/34 may have to be modified to accommodate
the new requirements if these new requirements are
significantly larger.
If after extensive testing the MAP-300 proves to be too
costly due to unreliable software, the AP-120B can perform
the same functions, although at an increased hardware and
time cost.
For example, to perform the above real to complex FFT,
the AP-120B requires 5.08 milliseconds for the FFT, 0.8
microseconds to rescale and 1.7 microseconds to reformat
the result, for a total of 5.09 milliseconds per 1024 sample
FFT. To this must be added 100 to 1000 microseconds
overhead for each of the four call statements: get data
(APGET), put data (APPUT), real to complex FFT (RFFT) and
real FFT scale and format (RFFTSC). I will use the
arithmetic average of 550 microseconds per call for an added
2.2 milliseconds, resulting in a subtotal of 7.29
milliseconds per FFT. APPUT and APGET have no specific times
in ref [6], but according to Floating Point Systems the
PDP-11 interface transfer rate is 750 kHz. This would
therefore require approximately 2.67 milliseconds for each
1024 element transfer, giving a total of 9.96 milliseconds
each for 39.1 FFT's. This results in a 389.5 millisecond
execution and transfer time. Again allowing for a 50 percent
overhead safety margin, the total becomes 584.16
milliseconds per second. To perform the discrete filtering
would require an additional APGET, APPUT, RFFT and RFFTSC as
well as a vector multiply (VMUL) and a complex vector
multiply (CVMUL), bringing the time to compute one second's
worth of data to well over one second.
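The AP-120B figures can be tallied the same way. The sketch
below is illustrative only: the names are hypothetical and
are not FPS library calls, and the inputs are assumed to be
5.09 milliseconds per FFT including rescale and reformat,
four calls at the 550 microsecond average, 2.67 milliseconds
of transfer per FFT, and 39.1 FFT's per second. It covers
the FFT stage only, before any filtering is added.

```python
# Per-second cost of the FFT stage alone on the AP-120B.
# All names are illustrative; the figures are those quoted in the text.
FFTS_PER_SECOND = 39.1                 # 40 kHz input, 1024-point FFTs
FFT_AND_FORMAT_MS = 5.09               # FFT plus rescale and reformat
CALL_OVERHEAD_US = (100 + 1000) / 2    # arithmetic average, 550 us
CALLS_PER_FFT = 4                      # APGET, APPUT, RFFT, RFFTSC
TRANSFER_MS = 2.67                     # host transfer time per FFT
OVERHEAD = 0.50                        # 50 percent safety margin


def ap120b_fft_budget_ms():
    """Return (per_fft_ms, per_second_ms, with_margin_ms)."""
    call_ms = CALLS_PER_FFT * CALL_OVERHEAD_US / 1000.0      # 2.2 ms
    per_fft_ms = FFT_AND_FORMAT_MS + call_ms + TRANSFER_MS
    per_second_ms = per_fft_ms * FFTS_PER_SECOND
    return per_fft_ms, per_second_ms, per_second_ms * (1 + OVERHEAD)


if __name__ == "__main__":
    per_fft, per_second, with_margin = ap120b_fft_budget_ms()
    print(f"per FFT: {per_fft:.2f} ms")
    print(f"per second: {per_second:.1f} ms, with margin: {with_margin:.1f} ms")
```

With more than half of each second already consumed by the
FFT stage, a second pass of comparable cost for the
filtering would exceed the one-second budget, which is the
point the text makes.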
Therefore another AP-120B must be installed to ensure
that speed requirements are met. Also, since the host
computer must be interrupted many times, it may be necessary
to retain the PDP-11/34 in the signal processing subsystem.
There is also the consideration that if a math routine is
custom written, it cannot be loaded into the math library,
which will generate considerable overhead each time it is
called. (The amount of this overhead time is system
dependent.)

VII. CONCLUSIONS AND RECOMMENDATIONS
The test bed as proposed seems to be a workable design,
although for most applications a more efficient and
economical architecture may be constructed.
For many uses the PDP-11/34 computer and the AN-5400 A/D
converter seem unnecessary when the MAP-300 array processor
is used. The Ampex Megastore may be required for a few
applications but would not be suitable for the majority of
applications (including real-time), since a disk peripheral
attached to the PDP-11/70 would be cheaper and still perform
the same functions.
It is felt that the increase in complexity and possible
confusion in using the MAP-300 over the AP-120B can be
overshadowed by the reduction in equipment required by the
MAP-300. This increased proficiency should be felt even more
strongly (assuming a normal learning curve) with subsequent
installations. Also, with the time saving in execution,
extra calculations could be performed on the MAP in a
real-time environment, thereby increasing efficiency,
operability and spectrum.
It is recommended that further tests be conducted using
the actual applications, data types and speed requirements





1. A PIPELINED PARALLEL SYNCHRONOUS PROCESSOR FOR
HIGH-THROUGHPUT DATA REDUCTION, Floating Point Systems,
Inc., FPS-7524, February, 1977.
2. ADAM Model 5120 Analog Data Acquisition Module, CSP
Inc., Document number S-13, 1978.
3. AN INTRODUCTION TO THE MAP SERIES MODELS 100, 200, AND
300, CSP Inc., Document number S-02, December, 1975.
4. AP-120B APAL - ARRAY PROCESSOR ASSEMBLY LANGUAGE,
Floating Point Systems, Inc., FPS-7275-01, February, 1976.
5. AP-120B APLINK - ARRAY PROCESSOR LINKING LOADER,
Floating Point Systems, Inc., FPS-7276-01, February, 1976.
6. AP-120B DEBUG - ARRAY PROCESSOR DEBUGGER, Floating Point
Systems, Inc., FPS-7277-01, February, 1976.
7. AP-120B FLOATING POINT ARRAY PROCESSOR, Floating Point
Systems, Inc., Form 72^4, Revised January, 1977.
8. AP-120B MATH LIBRARY, Part One, Floating Point Systems,
Inc., FPS-7288-03, August, 1977.
9. AP-120B PAGE SELECT OPTION (PRELIMINARY), Floating Point
Systems, Inc., FPS-732o, February lu, 1977.
10. AP-120B PRICES, Floating Point Systems, Inc., Form 7293,
January 7o.
11. COMPUTER FAMILY ARCHITECTURE SELECTION COMMITTEE
FINAL REPORT, Vol I, Burr et al, 1 December, 1976.
12. COMPUTER FAMILY ARCHITECTURE SELECTION COMMITTEE
FINAL REPORT, Vol VIII, Clearwaters et al, 1 December, 1976.
13. DATA ACQUISITION UNIT FOR SATCOM SIGNAL ANALYZER,
Thesis by Marvin J. Langston, June 1978.
14. HOW TO PROGRAM THE AP-120B, Floating Point Systems,
Inc., FPS-7503, March, lQ7fc.
15. INTRODUCTION TO ARRAY PROCESSING WITH SNAP-II, CSP
Inc., Document number S-03, Revised March, 1978.
16. IT'S SIMPLE TO PROGRAM SNAP-II ARRAY FUNCTIONS, CSP
Inc., Document number S-04, Revised December, 1977.
17. LOOK PROGRAM USER'S MANUAL, CSP Inc., Document number
Dv-i800a-001-01, June, 1977.
18. MACRO ARITHMETIC PROCESSOR SYSTEMS (MAP-100/200/300),
PROGRAMMER'S REFERENCE MANUAL, CSP Inc., Document number
JBbOOU-001-03, May, 1977.
19. MAP APPLICATION NOTE 1, CSP Inc., Document number
MAN-0 1-01, February, 1976.
20. MAP CROSS ASSEMBLER MODEL 8001 - SOFTWARE INSTALLATION
BOOKLET, CSP Inc., Document number DW80 1 -0 1 -0 u, May, 1977.
21. MAP I/O Scroll Model 5 Interface Booklet, CSP Inc.,
Document number DE4080-0 00-00, May, 1977.
22. MAP Loader in Fortran, CSP Inc., August, 1978.
23. MAP - Macro Arithmetic Processor, Magnavox,
FiAiD 77- 1 59 1, Fort Wayne, Indiana.
24. MAP MODEL 3 I/O SCROLL (IOS3) INTERFACE MANUAL, CSP
Inc., Document number DESaO 3-0 0-0 1, August, 1977.
25. MAP Simulator Program Model 8002 Reference Manual, CSP
Inc., Document number Jrt8 002-00 1 -02, October, 1978.
26. MAP SOFTWARE INTERFACE DESCRIPTION FOR THE DEC RSX-11M
SYSTEM, CSP Inc., Document number 8901-000-00, July, 1977.
27. MAP STANDARD PRICE LIST, CSP Inc., January, 1977.
28. MINI AND MICRO COMPUTER SURVEY, Datamation, Knottek,
August, 1978.
29. PDP-11 END USER PRODUCT SUMMARY, Digital Equipment
Corporation, January, 1977.
30. PDP-11/04/34/45/55/60 PROCESSOR HANDBOOK, Digital
Equipment Corporation, 1978.
31. PDP-11/70 PROCESSOR HANDBOOK, Digital Equipment
Corporation, 1976.
32. PROCESSOR HANDBOOK, Floating Point Systems, Inc., Form
7259-02, May, 1976.
33. PROGRAMMABLE I/O PROCESSOR PIOP, Floating Point
Systems, Inc., FPS-7350, June, 1977.
34. SNAP-II Reference Card, CSP Inc., Document number S-11,
1978.
35. SOFTWARE FUNCTIONAL SPECIFICATION - SNAP-II HOST/MAP
DRIVER MODULES, CSP Inc., Document number DvMbUOO-OOb-O,
August, 1976.
36. THE AGE OF ARRAY PROCESSING IS HERE, Floating Point









2. Library, Code 0142
Naval Postgraduate School
Monterey, California 93940
3. Department Chairman, Code 52
Department of Computer Science
Naval Postgraduate School
Monterey, California 93940
4. Professor George Rahe, Code 52Ra
Department of Computer Science
Naval Postgraduate School
Monterey, California 93940
5. Philip Mylet, Code PME-124
NAVELEX
Washington, D.C. 20360
6. LT George T. Vrabel
86 Norseman Drive
Portsmouth, Rhode Island 02871
7. Professor John E. Ohlson, Code 62Ol














