University of Central Florida

STARS
Retrospective Theses and Dissertations
1980

An associative backend machine for data base management
Alireza Hurson
University of Central Florida

Find similar works at: https://stars.library.ucf.edu/rtd
University of Central Florida Libraries http://library.ucf.edu
This Doctoral Dissertation (Open Access) is brought to you for free and open access by STARS. It has been accepted
for inclusion in Retrospective Theses and Dissertations by an authorized administrator of STARS. For more
information, please contact STARS@ucf.edu.

STARS Citation
Hurson, Alireza, "An associative backend machine for data base management" (1980). Retrospective
Theses and Dissertations. 5114.
https://stars.library.ucf.edu/rtd/5114

AN ASSOCIATIVE BACKEND MACHINE FOR DATA BASE MANAGEMENT

by
Alireza Hurson

A dissertation submitted in partial fulfillment of the requirements
for the degree of Doctor of Philosophy in
the Department of Computer Science at
the University of Central Florida
Orlando, Florida
December 1980
Major Professor:

Dr. Amar Mukhopadhyay

Graduate Council
University of Central Florida
Orlando, Florida
CERTIFICATE OF APPROVAL

PH.D. DISSERTATION
This is to certify that the Ph.D. Dissertation of
ALIREZA HURSON
with a major in Computer Science has been
approved by the Examining Committee on November 25, 1980
as satisfactory for the dissertation requirement
for the Ph.D. degree.

Examining Committee:

Major Professor: Dr. Amar Mukhopadhyay
Member: Dr. Terry J. Frederick
Member: Dr. D. A. Workman
Member: Dr. Brian Petrasko


© 1980 Alireza Hurson
All Rights Reserved

ACKNOWLEDGEMENTS
I wish to acknowledge Professor Amar Mukhopadhyay for his cooperation, guidance, helpful suggestions and the extreme patience he exhibited throughout the past four years. Thanks also go to my committee members, particularly Dr. Terry Frederick, and the secretaries in the Computer Science Department at the University of Central Florida. I wish also to acknowledge Ms. Patti Martin for her excellent work in typing this dissertation. I would like to express my gratitude to all the good people in Iran, including my parents, Mr. and Mrs. Hurson. Finally, I would like to thank Simin for her forbearance and support throughout this project.

This work has been partially supported by the National Science Foundation Grants #MCS 76-04763 and #MCS 800 5096.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER

I. INTRODUCTION
  Introduction
  Approaches to DBMS
  Software Implementation
  Software Data Base
  Backend Approach
  Hardware Approaches to Data Base Management
  Technology Trends
  Objectives of This Dissertation

II. BACKGROUND AND PREVIOUS WORK
  Introduction
  Associative Processor
  The Relational Data Model
  Previous Works
  Fully Parallel Architecture
  Block Oriented Data Base Architecture
  Discussion

III. ASL - AN ASSOCIATIVE SEARCH LANGUAGE FOR DATA BASE MANAGEMENT
  Introduction
  ASL Language
  Definitions
  Informal Definition of ASL
  Formal Definition of ASL
  The ASL Interpreter
  Some Definitions
  ASL Implementation
  More About ASL Interpreter

IV. ASLM - AN ASSOCIATIVE SEARCH LANGUAGE MACHINE
  Introduction
  Overall View of the ASL Machine
  General View of the Hardware
  Overall Flow of Data of ASLM
  ASL Machine Architecture
  General Purpose Computer
  Associative Search Language Hardware
  Backend Storage
  The Operation of ASLH
  Index Processor
  Secondary Storage Interface
  The Operation of Cell
  Flow of Data in the Associative Stack Module
  Discussion

V. THE ASL PRIMITIVES
  Introduction
  Micro Instruction
  Index Processor
  Secondary Storage Interface
  Non-Numeric Processor
  Conclusion

VI. ASL TIMING SEQUENCE
  Introduction
  Computation of the ASL Macros
  Index Processor
  Secondary Storage Interface
  Non-Numeric Processor
  Evaluation of ASL
  Index Processor
  Secondary Storage Interface
  Non-Numeric Processor
  An Example of an ASL Program
  Discussion

VII. CONCLUSION
  Summary
  Further Development

APPENDIX I
APPENDIX II
APPENDIX III
LIST OF REFERENCES

LIST OF TABLES

TABLE
1.1   Dynamic Operation Frequencies (%)
1.2   Increase in Chip Capacity
1.3   Bubble Memory Objectives
1.4   Commercial CCD Memories
1.5   CCD-Memory Projections
1.6   Characteristics of Figure 1.15
3.1   ASL Alphabets
3.2   Set of Non-Terminal Symbols
3.3   Set of Terminal Symbols
3.4   ASL Productions
3.5   Set of Implemented ASL Productions in BNF Format
3.6   List of Reserve Words
3.7   List of Delimiters
3.8   Set of Non-Terminal Symbols and Associated Attributes
3.9   Set of Productions Which Intermediate Form Will Be Generated For Them
3.10  Intermediate Operations
5.1   The Variant of the MTCH Instruction of the Index Processor
5.2   The Variant of the SCTR
5.3   The Variant of the MVRR
5.4   The Variant of the TSBT
5.5   The Variant of the STBT
6.1   Instruction Time of ASL
6.2   The Execution Time of the ...
6.3   The Execution Time of an ASL Example

LIST OF FIGURES

FIGURE
1.1   Conventional Approach
1.2   The Access Gap Between Main Memory and Conventional Secondary Storage
1.3   Memory Cycle Time
1.4   Software Data Base Management
1.5   Data Base Management System
1.6   The Backend Concept
1.7   Software Backend Approach
1.8   A Software Backend DBMS
1.9   Memory Hierarchy Structure
1.10  Intelligent Controller Solution
1.11  Hardware Backend Solution
1.12  Effective Main Memory Cycle Time Declined, Capacity Increased
1.13  Charge Coupled Device and Magnetic Bubble Memory Have Filled the Access Gap
1.14  Charge Coupled Device and Magnetic Bubble Memory Have Filled the Storage Gap
1.15  A Memory Hierarchy
1.16  Capacity vs. Cost for Different Technology
2.1   An Associative Memory
2.2   An Example of Associative Memory
2.3   The Employee Relation
2.4   A Projection on the Employee Relation
2.5   A Restriction on the Employee Relation
2.6   The S Relation
2.7   A Join of Employee and Department Relations
2.8   General Structure of a Cellular Organization
3.1   A Grammar G
3.2   LR Parser of Grammar of Figure 3.1
3.3   LR Parser of Figure 3.2
3.4   Different Modules of Compiler
3.5   Different Modules of Interpreter
3.6   Relationship Between Different Modules of One-Pass Interpreter
3.7   Data Structure of the Symbol-Table
3.8   Sequence of Steps
3.9   Data Structure of Each Entry in the Intermediate Array
3.10  The Structure of Each Element in a Row of the "T"
4.1   Language Machine Relationships
4.2   General View of ASL Machine
4.3   General Flow of Data
4.4   ASLM Architecture
4.5   Retrieval
4.6   Deletion of Tuples
4.7   Modification of Attributes' Values
4.8   Extended Version of Figure 4.4
4.9   Encoding of a Relation
4.10  Encoding of a Tuple
4.11  Encoding of Figure 2.2
4.12  Controller
4.13  Encoding of Descriptor in the RAM Memory of the Index Processor
4.14  Index Processor
4.15  Secondary Storage Interface
4.16  An Associative Stack
4.17  An Automatic Rotate Register
4.18  Non-Numeric Processor
4.19  Cell
4.20  Domain Recognizer
4.21  Another Version of Domain Recognizer
4.22  Input Register
4.23  The Hardware of Each Unit
4.24  Each Row of the Test Logic Circuit
4.25  The Hardware Organization of the Test Logic Circuit
4.26  The General Organization of Associative Stack Module
4.27  Set of Registers Associated To Each Associative Stack
4.28  The Control Bits
4.29  Remove Blanks Circuit
4.30  Reformat Circuit
4.31  Read
4.32  Insert
4.33  Delete
4.34  Read
4.35  Sequence of Operations in the Domain Recognizer
4.36  Sequence of Operations in the Input Register
4.37  Sequence of Operations in the Test Logic Circuit
4.38  Sequence of Operations in Cell
4.39  Push
4.40  Pop
4.41  Union
4.42  Intersection Operation
4.43  Difference Operation
4.44  Inclusion
4.45  Set Equality
4.46  Cartesian Product
4.47  Selection
4.48  Projection
4.49  Join
5.1   General Format of the ASL Primitives
5.2   Format of the Set Mask Register
5.3   Format of the Operands of the Set Mask Register
5.4   Format of the Set Comparand Register
5.5   Format of the Match
5.6   Format of the Copy
5.7   Format of the Find
5.8   Format of the Set Register
5.9   Format of the Delete
5.10  Format of the Insert
5.11  Format of the Test Bit
5.12  Format of the Branch Relative Forward
5.13  Format of the Branch Relative Backward
5.14  Format of the Delete
5.15  Format of the Read
5.16  Format of the Set Content of Register
5.17  Format of the Move Register to Register
5.18  Format of the Move Output Register to Comparand Register
5.19  Format of the Set Mask Register
5.20  Format of the Operand of the SMKR
5.21  Format of the Gate Subtuple
5.22  Format of Copy
5.23  Format of the Match
5.24  Format of the Test Bit Register
5.25  Format of the Set Bit
5.26  Format of the Remove Blanks/Reformat
5.27  Format of the Branch Relative Forward
5.28  Format of the Branch Relative Backward
5.29  Format of the Increment/Decrement Top Register
5.30  Format of the Rotate
5.31  Format of the Write
6.1   Execution Time vs. Depth (Union)
6.2   Execution Time vs. Depth (Inclusion)
6.3   Execution Time vs. Depth (Projection)

AN ASSOCIATIVE BACKEND MACHINE FOR DATA BASE MANAGEMENT

by
Alireza Hurson

An Abstract
A dissertation submitted in partial fulfillment of the requirements
for the degree of Doctor of Philosophy in
the Department of Computer Science at
the University of Central Florida
Orlando, Florida
December 1980

Major Professor:

Dr. Amar Mukhopadhyay

xiv

It has long been recognized that computer systems containing large data bases expend an inordinate amount of time managing the resources (viz. central processing time, memory, etc.) rather than performing useful computation in response to the user's query. This is due to the adaptation of the classical machine architecture, the so-called von Neumann architecture, to a problem domain that needs a radically different machine architecture for an efficient solution. The characteristics that distinguish the computation for data base management systems are: massive amounts of data, simple repetitive non-numeric operations and the association of a name space with the information space at a high level. The current systems meet these requirements by memory management techniques, specially designed application programs and sophisticated address mapping methods. This accounts for a large software overhead and the resulting semantic gap between the high level language and the underlying machine architecture.

To overcome the difficulties of the von Neumann machines, Slotnick suggested the idea of hardware backend processing by distributing the processing capabilities outside of the CPU and among the read/write cells. These cells act as filters which improve the system performance by reducing the processing load on the CPU as well as the amount of data transported back and forth between secondary and main storage. The major contribution of this dissertation is the definition of a backend machine architecture ASLM (Associative Search Language Machine) and the development of a query language ASL (Associative Search Language) which is directly executed by the backend machine using built-in hardware algorithms for query processing and associative hardware for name-space resolution.

The language ASL is a high level data base language using associative principles for basic operations. The language has been defined based on the relational data model. ASL is relationally complete, and provides complete data independence. ASL provides facilities for query, insertion, deletion and update operations on tuples of variable sizes. Moreover, the structure of the statements in ASL is represented in arithmetic-expression-like entities called set expressions.

ASLM is designed based on cellular organization, a design similar to Slotnick's idea with an important exception. In the design of ASLM, the processing units (cells) are moved into the backend machine. The general strategy in ASLM is based on a pre-search through the data file and then the execution of the operations on the explicit subfiles which are stored in the associative memory. The generation of the subrelations explicitly eliminates the existence of the so-called mark bits in some of the previously designed data base machines. Moreover, it provides fast algorithms for interrelational operations such as "join." ASLM is also microprogrammable, which gives more flexibility to the system.

The design of the ASLM differs from the majority of the data base machines based on Slotnick's idea. First, the separation of the cells from the secondary storage will result in a cost effective system in comparison to the other machines. This also eliminates any restriction on the secondary devices. Second, since cells are independent of each other there is no need for an interconnection network between the cells. Third, ASLM is implemented by associative memory; the closeness between associative operations and data base operations reduces the existing semantic gap found in the conventional system. Fourth, ASLM is expandable to the MIMD class of machines.

Abstract approved:
Dr. Amar Mukhopadhyay

Professor of Computer Science

Date of Approval

xvii

CHAPTER I
Introduction
In recent years we have witnessed a tremendous growth in the use of digital systems in varied applications of human civilization. These digital systems, which evolved as a result of efforts over several centuries, have an important property: they are capable of performing reliable calculations at speeds of millions of operations per second, far exceeding the computing power of human beings and mechanical devices. The innovation of these new devices has changed our lives, and yet their real effect on our societies began three decades ago when the concept of the "stored program machine" was introduced: "A computer with a storage component which may contain both data to be manipulated and instructions to manipulate the data" (Adam and Haden 1973).

During the last three decades the environment in which digital computers are used has advanced from single application, single file programs to batch processing, to multiprogramming, to multiprocessing and to interactive time sharing systems. The architecture of the machine has evolved from a uniprocessor to multiprocessors, to parallel and distributed processors in closed loop or network configurations. The control unit of the machine has been developed from a simple hardwired circuit to a flexible and complex microprogrammable unit. The memory has evolved from a single low speed device to an interleaved parallel modular system. The speed gap and size gap between main memory and secondary storage have been reduced by the introduction of techniques like memory hierarchy and virtual storage. The input/output devices have evolved from simple and slow card readers and printers to movable head disks, to fixed head disks, to interactive devices, with a corresponding increase in the complexity of dedicated channel processors and interrupt handling capabilities.

These developments were necessary in order for computers to be able to cope with the increased demands on their information processing capabilities. These improvements were possible because of the advancement of technology. The proliferation of computers in all spheres of human civilization during the last decade has created a new computing environment which even the most sophisticated von Neumann machine (Knuth 1970) is incapable of handling. Information processing, or data base management, is an area in which the traditional von Neumann machines are inadequate and new ideas in computer architecture are needed.

In the past two decades computers have been used in different areas such as artificial intelligence, management information systems, military and corporate logistics and medical diagnosis. All these areas have developed based on the systems' capability to store, maintain and access data bases of varied size. Such systems are referred to as DBMS or Data Base Management Systems. As an example, consider the Sperry Univac system which processes 170,000 transactions per day on a data base of 300 million characters (Champine 1978).

These systems generally have been implemented on conventional computers which are based on the von Neumann design. In this design operations are performed on information in the memory, which is accessed by means of its address. Because of the typical size of the data bases and the cost of memory, it is impossible to hold all information in the main memory in order to perform a search through the data. This means that only the blocks of information relevant to the current computation are held in the memory at any given time. If a block resident in the secondary storage is needed, it has to replace some of the existing blocks in main memory. If this exchange becomes too frequent it creates a bottleneck problem at the channel processor. In addition, conventional systems have to transfer large sets of data from their mass storage to the CPU, where simple compare functions are performed in order to separate relevant data from irrelevant data. Figure 1.1 illustrates this fact, which is known as the 90% - 10% rule in the data base literature (Champine 1979). Among all the data which is transferred to the computer only 10% is relevant and 90% is irrelevant. Moreover, among the 10% of relevant data, only 10% is utilized to produce the result.
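The two 10% figures compound. The following is a minimal sketch of that arithmetic, assuming the shares quoted above are taken at face value; the function and parameter names are illustrative and do not come from the dissertation.

```python
def useful_fraction(relevant_share: float = 0.10,
                    utilized_share: float = 0.10) -> float:
    """Fraction of the transferred data that ends up in the result.

    Both default shares are the 10% values quoted in the text.
    """
    return relevant_share * utilized_share


def transferred_per_useful_unit(relevant_share: float = 0.10,
                                utilized_share: float = 0.10) -> float:
    """Units of data moved from mass storage per unit that reaches the result."""
    return 1.0 / useful_fraction(relevant_share, utilized_share)


if __name__ == "__main__":
    print(f"useful fraction of transferred data: {useful_fraction():.2%}")
    print(f"units moved per useful unit: {transferred_per_useful_unit():.0f}")
```

Under these assumptions about 1% of the transferred data contributes to the answer, which is the inefficiency that filtering data close to the storage device is meant to remove.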
Figure 1.1  Conventional approach
[Diagram: raw data held on on-line mass storage passes through the I/O controller to the host, which runs the data base management system, the operating system and the application programs. Annotations: "From 90% of raw data to produce 10% of relevant data" and "From 90% of relevant data to produce 10% of answer".]

The differences between conventional computers and the application requirements for information retrieval introduce inefficiencies in both the processor and storage. Moreover, the von Neumann machines are inefficient by an order of magnitude in executing operations on the non-numeric data typically used in data base computation. Numeric computations can be performed faster than non-numeric computations because of built-in arithmetic hardware. Today's supercomputers can execute between 50-250 million floating point operations per second, whereas the maximum pure string pattern matching rate of a carefully designed optimized SNOBOL compiler on an IBM 360/65 is 0.5 million characters per second (Gimpel 1976). Even for a fast machine such as the CRAY-1 the string pattern matching rate is about 10-12 million characters per second (Figure 1.3). The relatively slow performance of these machines can be attributed to the lack of built-in hardware that can handle non-numeric operations on variable length dynamic operands, as opposed to fixed size numeric entities. A simple pattern matching, searching, deleting or retrieval operation on character data, when encoded at the machine level, could look quite complicated compared to a reasonably complex arithmetic assignment statement. In most artificial intelligence work, a significant part of the computation is concerned with dynamically varying non-numeric information structures. There has been a similar upsurge of activity in non-numeric computation involving finite graphs and the development of efficient combinatorial algorithms on non-numeric data entities (Reingold et al. 1977; Deo 1974).
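To give a feel for what these matching rates mean at data base scale, the sketch below estimates the time for one exhaustive pass over a data base. It combines the 300 million character data base from the Sperry Univac example with the matching rates quoted above. That combination is an illustrative assumption and not a measurement reported in the text, and it ignores I/O time entirely.

```python
# Size of the example data base quoted earlier in this chapter (characters).
DATA_BASE_CHARS = 300_000_000

# String pattern matching rates quoted in the text (characters per second).
SNOBOL_360_65_RATE = 500_000
FAST_MACHINE_RATE_LOW = 10_000_000
FAST_MACHINE_RATE_HIGH = 12_000_000


def full_scan_seconds(chars: int, chars_per_second: int) -> float:
    """Time for one sequential pass over `chars` characters, CPU matching only."""
    return chars / chars_per_second


if __name__ == "__main__":
    for label, rate in [
        ("SNOBOL on IBM 360/65", SNOBOL_360_65_RATE),
        ("fast machine, low estimate", FAST_MACHINE_RATE_LOW),
        ("fast machine, high estimate", FAST_MACHINE_RATE_HIGH),
    ]:
        print(f"{label}: {full_scan_seconds(DATA_BASE_CHARS, rate):.0f} s")
```

Even at the higher rates a single unindexed pass takes tens of seconds of pure matching time under these assumptions, which is why the approaches discussed next try to avoid scanning the whole data file.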
The remainder of this chapter discusses different approaches to the data base management system which overcome some of the inefficiencies of the systems based on the von Neumann design. This discussion ranges from the simplest software solutions to the most advanced hardware solutions. Moreover, the advantages and the disadvantages of the different approaches will be discussed.
Approaches to DBMS


Software Implementation
Since the early 1960's, new designs have been proposed to improve the effectiveness of conventional systems for data base management. These designs are based on specialized hardware/software to support the basic data base management functions found in most contemporary data base management systems.

The first improvement was based on reducing the amount of data flow between main memory and secondary memory by means of sophisticated software systems and additional redundant data such as directories, which are based on partitioning of the data file. By these techniques, the address of the information can be obtained from a directory, and the search then proceeds through a part of the data file rather than all of it.

Although directories have partially solved the bottleneck problem, they nevertheless have created some additional problems. The directory should logically be kept in the main memory. A large data base naturally implies a large directory, and a large directory occupies a large portion of the main memory (Bird et al. 1977; Haskin 1978). The use of a directory also creates some complexity in the search, update and delete algorithms. These algorithms are not simple: first, the address or location of a record is obtained by a search through the directory, and then the data file is searched and manipulated. Moreover, most of the time, manipulation of the data file implies manipulation of the directories. These procedures not only increase the size of the algorithms (space) and the execution time, but also create overhead in time sharing and interactive systems. For example, where different users have access to the data file, sophisticated routines should be devised in order to enforce the integrity and security of the systems.
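The two-step access pattern described above (directory first, then one partition of the data file) and the directory maintenance that a delete brings with it can be sketched as follows. This is an illustrative model only: the class `PartitionedFile`, its block size and its first-key directory layout are assumptions made for the example and are not a design taken from the dissertation.

```python
from bisect import bisect_right


class PartitionedFile:
    """Key-ordered data file split into blocks, with a directory of first keys."""

    def __init__(self, records, block_size=4):
        ordered = sorted(records, key=lambda record: record[0])
        self.blocks = [ordered[i:i + block_size]
                       for i in range(0, len(ordered), block_size)]
        self._rebuild_directory()

    def _rebuild_directory(self):
        # Directory entry i holds the smallest key stored in block i.
        self.directory = [block[0][0] for block in self.blocks]

    def _locate_block(self, key):
        """Step 1: search the directory for the only block that can hold `key`."""
        if not self.blocks:
            return None
        return max(bisect_right(self.directory, key) - 1, 0)

    def search(self, key):
        """Step 2: scan a single block. Returns (value or None, records examined)."""
        index = self._locate_block(key)
        if index is None:
            return None, 0
        examined = 0
        for record_key, value in self.blocks[index]:
            examined += 1
            if record_key == key:
                return value, examined
        return None, examined

    def delete(self, key):
        """Deleting from the data file also forces maintenance of the directory."""
        index = self._locate_block(key)
        if index is None:
            return False
        kept = [record for record in self.blocks[index] if record[0] != key]
        if len(kept) == len(self.blocks[index]):
            return False
        if kept:
            self.blocks[index] = kept
        else:
            del self.blocks[index]
        self._rebuild_directory()
        return True


if __name__ == "__main__":
    data_file = PartitionedFile([(k, f"rec{k}") for k in range(1, 17)])
    print(data_file.directory)
    print(data_file.search(10))
    data_file.delete(13)
    print(data_file.directory)
```

The search touches one block instead of the whole file, but every structural change to the file has to be mirrored in the directory, which is the extra algorithmic complexity the text points out.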
Another drawback of the conventional system (von Neumann machine) is due to the accessibility of data by its address rather than by its value. In the von Neumann systems the main memory is a random access memory; therefore, information is accessed by address rather than by value. This creates complexity in the software systems due to the so-called "name-mapping resolution" problem. Before the execution of each instruction, the real values of the operands should be accessed based on the addresses specified in the instruction. As a result, the data transferred from the main memory to the CPU are mostly the addresses of operands. Moreover, most of the CPU time is consumed in calculating the addresses of the operands. Table 1.1 shows the frequencies of the operations (Kuck 1978) for conventional programming languages. Data access operations have the highest frequencies. Data access operations are those which need access to the main memory for obtaining data. If we remember that arithmetic operations also need access to the main memory, we can say that over 50% of operations on conventional computers are memory access operations. Similar studies on data base systems have confirmed the same result (Rosell 1976). Naturally, if information is accessed by its value, memory accesses can be reduced and, as a result, operations could be executed faster.
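The difference between access by address and access by value can be illustrated with a small software model of an associative (content-addressable) memory. The sketch below is an emulation under stated assumptions: the class `AssociativeMemory`, the 16-bit word layout (department code in the high byte, employee number in the low byte) and the sample words are hypothetical, and the loop over words stands in for a comparison that associative hardware would perform on all words at once.

```python
class AssociativeMemory:
    """Software emulation of a content-addressable memory of fixed-width words."""

    def __init__(self, words):
        self.words = list(words)

    def match(self, comparand, mask):
        """Return one response bit per word.

        A word responds when it agrees with the comparand on every bit
        position selected by the mask. Hardware would compare all words
        in parallel; this list comprehension does it sequentially.
        """
        return [((word ^ comparand) & mask) == 0 for word in self.words]

    def read_matching(self, comparand, mask):
        """Access by value: return the words that responded to the search."""
        responses = self.match(comparand, mask)
        return [word for word, hit in zip(self.words, responses) if hit]


def read_by_address(memory, address):
    """Access by address: the caller must already know where the word lives."""
    return memory.words[address]


if __name__ == "__main__":
    # Hypothetical words: high byte = department code, low byte = employee number.
    memory = AssociativeMemory([0x0321, 0x0119, 0x032A, 0x0230])
    # Find every word whose department code is 3, ignoring the low byte.
    print([hex(word) for word in memory.read_matching(0x0300, 0xFF00)])
    print(hex(read_by_address(memory, 1)))
```

With access by value no address has to be computed or looked up before the data can be reached, which is the saving the paragraph above refers to.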
TABLE 1.1
DYNAMIC OPERATION FREQUENCIES (%)

Instruction Type        Study I              Study II             Study III
                        (IBM 7090 Compile    (IBM 360 Compile     (IBM 360 Scientific
                        and Scientific)      and Scientific)      and COBOL)

Store/Fetch              31.2
Index                    18
Data Access Total        49.2                 50.4                 42.9

Branch                   16.6                 19.9                 29.6
Compare                   3.6                                      11.2
Shift/Boolean             6.0                                       4.1
Logic Total              26.2                 11.2                 44.9

Fixed Point +             6.1
Fixed Point x             0.6
Fixed Point /             0.2
Fixed Point Total         6.9                  7.8                  7.4

Floating Point +          6.9                 SP 4.5
Floating Point x          3.8                 DP 6.0
Floating Point /          1.5
Floating Point Total     12.2                 10.5                  2.6

Other                     5.3                  0.2                  2.2

TOTAL                   100.0                100.0                100.0

Accessing the data by its address will create another problem due to the access gap between the main memory cycle time (less than 10^1 microseconds) and the secondary storage access time (higher than 10^4 microseconds) on one side, and the main memory and processor cycle time on the other side, which forces the processor to be idle most of the time. Figure 1.2 shows the access gap between the main memory and the secondary storage; it also shows the access time of different technologies vs. the cost per bit. Figure 1.3 shows the evolution of the processor cycle time over the last 20 years for different machines. In spite of all the improvement in storage and processor cycle time, the access gap seems to be the fundamental parameter which limits the speed of the computation.
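The size of the access gap and its effect on processor idle time can be made concrete with a small calculation. The sketch below reads the two bounds in the text as a main memory cycle of at most 10 microseconds and a secondary storage access of at least 10^4 microseconds. The utilization model (the processor does a fixed number of memory cycles of work per block and then waits for the next block, with no overlap of I/O and computation) and the parameter `cycles_per_block` are simplifying assumptions introduced here, not figures from the dissertation.

```python
def access_gap_ratio(memory_cycle_us: float, secondary_access_us: float) -> float:
    """How many main memory cycles fit into one secondary storage access."""
    return secondary_access_us / memory_cycle_us


def processor_utilization(cycles_per_block: int,
                          memory_cycle_us: float,
                          secondary_access_us: float) -> float:
    """Busy fraction when each block fetch is followed by a fixed amount of work.

    Assumes the processor waits for the whole fetch (no overlap) and then
    spends `cycles_per_block` memory cycles working on the fetched block.
    """
    busy_us = cycles_per_block * memory_cycle_us
    return busy_us / (busy_us + secondary_access_us)


if __name__ == "__main__":
    print(f"gap ratio: {access_gap_ratio(10.0, 10_000.0):.0f}")
    print(f"utilization: {processor_utilization(100, 10.0, 10_000.0):.1%}")
```

With these bounds at least a thousand memory cycles pass during one secondary storage access, so unless a great deal of work is done per block the processor spends most of its time waiting.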

Figure 1.2  The access gap between main memory and conventional secondary storage
[Plot: cost per bit against access time in microseconds (10^-3 to 10^6) for core, bipolar, MOS, fixed-head disk/drum and moving-head disk memories, with the access gap marked.]

Figure 1.3  Memory cycle time
[Plot: cycle time against year (1965 to 1980), with points for the 360/65, 370/168, CDC STAR and CDC 7600.]

Software Data Base
From the user's point of view another major inefficiency of the conventional systems is due to the complexity of the algorithms. The second approach to the data base management system is based on the concept of a software data base system in order to lower the complexity of the application programs (Maryanski 1980). Figure 1.4 shows a data base software system and the communication paths between different modules of the system. As can be seen, all the data base functions will be handled by a software system. Figure 1.5 shows a data base management system. The major components of this system are:

1) The DBMS, which is responsible for handling the data base functions.
2) The application programs.
3) The operating system and its buffers.
4) The logical description of the data base (schema).
5) The different users' views of the data files (subschema).
6) The work areas, which are shared between application programs and DBMS.

Figure 1.4  Software data base management
[Diagram: within a general-purpose computer, a user request passes through the applications program and the operating system to the database management system, which accesses the databases and returns the answer.]

This solution decreases the complexity of the application programs from the user's point of view, and yet the other inefficiencies of the conventional system in performance, such as accessing data by address, the bottleneck problem and the inefficiency of handling non-numeric operations, remain as major drawbacks of the system.
Backend Approach
The next step in the development of the software data base system is the concept of the software backend machine. The backend approach is based on the divide and conquer principle, which states: "partition the problem into smaller parts, find solutions for the parts and then combine the solutions for the parts into a solution for the whole problem" (Aho et al. 1976). The backend processor can be thought of as a master-slave configuration; a general purpose host computer (master) is backed by a slave machine. The backend machine performs all of the data base functions on the data files (Maryanski 1980). The host machine is an interface between the users and the backend machine. Figure 1.6 shows this concept. A comparison of Figures 1.4 and 1.6 shows that the host computer has been freed from the data base functions. In general, the backend system improves the throughput of a data base system if the backend system is multiprogrammed, a high speed link is utilized between the main frame and the backend machine, and substantial demand for data base access exists (Heacox et al. 1975; Rosenthal 1977). On the other hand, the performance would be degraded because of the interface communication and the intercomputer transmission of data. Because of this, the execution of a single data base command requires more time in a backend design than in a single machine. Therefore, the amount of concurrency should be large enough to offset the overhead added to the execution time of a single command. In general, the backend machine is more efficient than the conventional computer with respect to cost, reliability, security and modularity.

[Figure 1.5. Data base management system (application programs a to m with their work areas, operating system, DBMS, buffers, schema, subschemas a to m)]

[Figure 1.6. The backend concept (host computer with application programs and operating system, interface, back-end computer performing the data base functions, database)]
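The trade-off between per-command overhead and concurrency can be illustrated with a small calculation. The following is only a sketch under stated assumptions: the timing values are arbitrary, the functions `single_machine_throughput` and `backend_throughput` are hypothetical names, and the model (all interface overhead charged to the host, commands fully independent) is an optimistic bound rather than a measurement of any real system.

```python
def single_machine_throughput(app_time, db_time):
    """Commands per time unit when one processor does all the work."""
    return 1.0 / (app_time + db_time)


def backend_throughput(app_time, db_time, overhead, concurrency):
    """Optimistic bound on commands per time unit for a host plus backend.

    Assumptions of this sketch: the host spends app_time + overhead per
    command, the backend spends db_time per command, and `concurrency`
    independent commands are in flight so the two machines can overlap.
    """
    latency = app_time + overhead + db_time      # one command, start to end
    bottleneck = max(app_time + overhead, db_time)
    return min(concurrency / latency, 1.0 / bottleneck)


# Assumed timings (arbitrary units): 4 for application work,
# 6 for data base work, 2 for interface and transmission overhead.
single = single_machine_throughput(4, 6)
for n in (1, 2, 4):
    ratio = backend_throughput(4, 6, 2, n) / single
    print(f"concurrency {n}: backend/single throughput ratio = {ratio:.2f}")
```

With these assumed numbers, a single command in flight makes the backend configuration slower than the single machine, while two or more concurrent commands make it faster, which is the qualitative behaviour described above.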
The concept of the backend machine can be classified into two classes: the software backend machine and the hardware backend machine. Figure 1.7 shows a software backend machine and Figure 1.8 shows a software backend DBMS. A comparison between Figures 1.5 and 1.8 shows the major differences between a data base system and a backend data base machine. The communication system must provide enough facilities for the transmission of data, instructions, and information between the host and backend machines. In the software backend concept, the data base functions which are performed by the backend machine are implemented by software routines. In other words, the backend machine is a general purpose minicomputer.

[Figure 1.7. Software backend approach]

[Figure 1.8. A software backend DBMS]

Hardware Approaches To Data Base Management
The recent growth of data base management technology in varied areas of application and the corresponding increase in users' requirements have increased the size and complexity of data base management software. This increase and the lack of a practical verification method for software systems create unreliable systems (Hoare 1969). Advances in technology have overcome some of the problems (i.e. cost, capacity and logic complexity) of the hardware design of data base systems. Moreover, the hardware design can be verified relatively easily. There are four approaches to the handling of data base systems by hardware:
Memory Hierarchy
Intelligent Peripheral Control Unit
Network Node
Backend Processor
Each of these approaches is independent of the others, and a system may include more than one of the above concepts in its design.
Memory Hierarchy. This approach directly addresses the concept of the access gap which was mentioned before (Lipovski and Doty 1978; Rege 1976; Denning 1970). The memory hierarchy reduces the ratio of access times between main memory and secondary storage on one side, and between the CPU and main memory on the other side, by defining intermediate storage between the CPU and main memory and between main memory and secondary storage. Two approaches have been taken to reduce this gap: Slave Memory and Distributed Memory. Figure 1.9 shows a hierarchy of N levels (a practical value of N is 4). Generally, the lower the number of the level, the faster its speed, the higher its cost and the lower its capacity. The hierarchy of memories is an elegant solution for the access gap problem which is based on the concept of the locality of reference (Liptay 1968). The real drawback of the memory hierarchy is that at each moment during the execution, there might be more than one copy of a part of the program or data in the memories. Each update should affect all the copies; moreover, the replacement algorithm governing the overall scheduling of the system should be chosen very carefully because of its effect on the performance of the system.

[Figure 1.9. Memory hierarchy structure (processor, primary memory, intermediate levels through MN-1 and MN, secondary memory)]
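As a rough illustration of how a hierarchy narrows the access gap when references are local, the sketch below computes an average access time for an N-level hierarchy. The access times and hit ratios are illustrative assumptions, not values taken from this dissertation, and `effective_access_time` is a hypothetical helper; the model also assumes, for simplicity, that every level probed contributes its full access time.

```python
def effective_access_time(access_times, hit_ratios):
    """Average access time of a memory hierarchy.

    access_times[i] is the access time of level i+1 (fastest level first).
    hit_ratios[i] is the assumed probability that a reference reaching
    level i+1 is satisfied there; the last level must have hit ratio 1.0.
    """
    if len(access_times) != len(hit_ratios):
        raise ValueError("one hit ratio per level is required")
    if hit_ratios[-1] != 1.0:
        raise ValueError("the last level must always satisfy the reference")
    total = 0.0
    reach = 1.0  # probability that a reference reaches the current level
    for time, hit in zip(access_times, hit_ratios):
        total += reach * time
        reach *= 1.0 - hit
    return total


# Assumed values, in microseconds: a two-level system (main memory, disk)
# versus a four-level hierarchy with a cache and an auxiliary memory.
two_level = effective_access_time([1.0, 30000.0], [0.99, 1.0])
four_level = effective_access_time(
    [0.1, 1.0, 100.0, 30000.0], [0.9, 0.9, 0.99, 1.0]
)
print(two_level, four_level)
```

Under these assumed hit ratios the four-level hierarchy has a much lower average access time than the two-level system, because few references travel all the way to the slowest level.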
Intelligent Peripheral Control Unit. In this approach the data base access facility is moved out to the mass storage (Slotnick 1970; Parker 1971). The basic functions of device scheduling, data recovery, head positioning, searching, sorting and error correction are implemented at this level. Figure 1.10 shows this concept. One aspect of the intelligent peripheral control unit is the elimination of the 90%-10% rule. The major drawback of this approach is the cost. This approach will be discussed in detail in Chapter II.
Network Node. In this approach, a general purpose computer is used which communicates with several other nodes in the system (Wah and Yao 1980). The benefit of this approach is that several nodes can access a single shared data base. Therefore, data base security and integrity have to be considered in this approach.
Backend Hardware Processor. The concept of the software backend machine was discussed before. The backend hardware approach is the same as the backend software approach except that the backend processor will be implemented by incorporating more hardware for handling data base functions (Maryanski 1980). In other words, the backend machine is a special purpose machine. Therefore, the inefficiencies of the backend software, such as in the handling of non-numeric as well as numeric operations and in accessing data by value, could be eliminated. Figure 1.11 shows a backend hardware data base machine implemented by intelligent control. Figure 1.11 also shows that by offloading the host, the 90%-10% rule is totally eliminated.

[Figure 1.10. Intelligent controller solution eliminating one application of the 90%-10% rule]

[Figure 1.11. Hardware backend solution: elimination of any application of the 90%-10% rule]

Technology Trends
This dissertation is concerned with the design of a hardware backend data base machine. The design and implementation of hardware data base machines have been motivated by the advances in technology over the past 20 years. These advances enable us to close the existing semantic gap (the measure of the differences between programming languages and the hardware architecture; see Chapter III), size gap and access gap of the conventional systems. In the rest of this chapter we will discuss the trends in some of these improvements:
First - The development of technology from vacuum tubes to large scale integration has caused processing time and cost to decrease while the size of memory has increased. Figure 1.12 shows the increase in memory capacity as well as the reduction in memory cycle time during the past 10 years. Table 1.2 shows the increase of the chip size in the near future. It shows that by the mid 80's the chip size will be increased by a factor of 16, while the cell size (representing a bit of memory) will be decreased by a factor of 6 to 7.
Second - Semiconductor technology has progressed toward the development of cost effective serial access secondary devices, such as charge coupled devices (CCD) or magnetic bubble memory (MBM). The impact of this technology on the design of future systems has recently been investigated (Slana 1977; Toombs 1978a; Toombs

[Figure 1.12. Effective main memory cycle time declined, capacity increased (horizontal axis: year, 1960 to 1980)]

TABLE 1.2
INCREASE IN CHIP CAPACITY

Year       Chip Size (bits)   Cell Size (µm²)
1978       16K                350 - 450
1982       64K                150 - 200
1985-87    256K               50 - 75

1978b; Juliussen 1977; Juliussen 1976; and Boyle and Smith 1971). Tables 1.3, 1.4 and 1.5 give a rough idea of the scope of current technology in CCDs and MBMs. A simple chip of a 256-Kbit CCD module for $3.00 will be available in 82-83, and quarter million bit devices containing 1024 blocks of 256 bits each, costing 30 to 50 millicents/bit, should be on the market soon (Slana 1977). It is forecasted that by the early 1980's CCD and MBM memories of multimegabit capacity will be implemented; Tables 1.4 and 1.5 show this fact. One advantage of CCDs over metal-oxide semiconductor (MOS) memories is their higher bit density and lower cost per bit, so they could supplement dynamic MOS RAMs on the main frame of computers as auxiliary memories, and as a result overall computer system performance could be improved at constant or lower total memory cost. Magnetic bubbles are particularly interesting because they offer non-volatile storage. Moreover, in comparing bubbles with conventional magnetic storage, disk systems have a relatively large initial overhead cost, which is also true for MBM. Thus today, MBM is a cost effective replacement for movable head disks at a capacity of 1M bits and for fixed-head disks at a capacity of 9M bits. It is forecasted that by the mid 80's MBM will be a cost effective replacement for moving-head disks at 4M bits and for fixed-head disks at 40M bits (Eichelberger et al. 1978).
CCD and MBM reduce the existing "access gap" between the conventional memories (< 1 µs access time) and the slower magnetic serial access memories (> 5 x 10^3 µs access time). Therefore, they will fill out the storage capacity between core memories and other storage devices. Figures 1.13 and 1.14 show how CCD and MBM will fill up

TABLE 1.3
BUBBLE MEMORY OBJECTIVES

                                    Objective
Characteristic        1977-1978             1980
Bubble size           5 µm                  (illegible) µm
Storage density       155 Kbits/cm²         620 Kbits/cm²
Capacity              100 Kbits             256 Kbits
Access time           2 - 4 ms              (illegible) ms
Data rate             50 Kbits/s            250 - 500 Kbits/s
Packaging             14-pin DIP            16-pin DIP
Standby power         0                     0
Power dissipation     < 1 watt              < 1 watt
System:
Data rate             > 50 Kbits/s          ~ 250 Kbits/s
Storage capacity      ~ 0.1 Mbit            ~ 0.25 Mbit
Packaging             PC board              PC board
Capacity/board        0.5 - 3 Mbits         2.5 - 10 Mbits
Controller            MPU based             MPU based

TABLE 1.4
COMMERCIAL CCD MEMORIES

Device               Number of Bits   Chip Size (mm)
Intel 2416           16384            3.6 x 6.0
Fairchild CCD 460    16384            5.6 x 5.1
Fairchild F 464      65536            4.4 x 5.8
TI TMS 3064          65536            5.0 x 5.5

TABLE 1.5
CCD-MEMORY PROJECTIONS

Capacity (bits)   Minimum Geometry (µm)   Cell Size (µm²)   Chip Size (mm²)   Year
256K              2.0 - 2.5               45 - 65           20 - 28           1979
1M                1.0 - 1.5               13 - 26           26 - 45           1981 - 82
4M                0.75 - 1.0              7 - 12            53 - 106          1985 - 86

[Figure 1.13. Charge coupled device and magnetic-bubble memory have filled the access gap (cost versus access time in microseconds for core, bipolar, electron beam addressable memories, bubble, fixed head disk/drum and moving head disk)]

[Figure 1.14. Charge-coupled device and magnetic-bubble memory have filled the storage gap (horizontal axis: storage capacity in megabits, 0.1 to 1000)]
the access and storage gap between main memory and conventional secondary storage. Figure 1.15 shows a possible memory hierarchy and Table 1.6 gives the cost and access time for this hierarchy.
Because of the above facts, it is forecasted that in the near future these two technologies will be used more in data base machines. However, for a large capacity on-line data base (i.e. more than 10 megabits), it appears that moving head disks will remain considerably less expensive than CCD and MBM. Figure 1.16 shows this fact.
Finally, the development of medium to large size associative memory will also have an impact on the future organization of the backend machines, since it uses a fundamentally different mechanism to access and manipulate information. The concept of the associative memory is as old as the second generation machines (Hayes 1978; Bird et al. 1977; Yau and Fung 1977; Finnila and Love 1977; Foster 1976; Lea 1976; Lewin 1976; Berra 1974; Linde et al. 1973). Since then a great deal of literature has been published about associative memory, its advantages and disadvantages (ref. 1.5 in (Yau and Fung 1977)). In general, associative memory is valuable because of the accessibility of data by its content, and the parallelism which it offers. Its implication for the architecture is that it can reduce the semantic gap between the higher level languages and the hardware organization which implements them. The instruction capabilities of associative memory are usually grouped into two categories, search instructions and arithmetic instructions. In an associative memory we can also perform simultaneous comparisons and mass arithmetic operations (Foster 1976). Unfortunately, because of their relatively high implementation cost, associative processors have so far been used in very restricted form in conjunction with conventional systems (Linde et al. 1973; Moulder 1973; Defiore and Berra 1973; Gaines and Lee 1965; Lee and Paull 1963). Because of rapid development in large scale integrated (LSI) circuit technology, the implementation cost of associative processors has been reduced, and it is anticipated that associative processors will be used more extensively for enhancing the performance of many current systems (ref. 1.5 in (Yau and Fung 1977)).

[Figure 1.15. A memory hierarchy (CPU, fast interface, RAM, fast auxiliary memory, mass storage)]

TABLE 1.6
CHARACTERISTICS OF FIGURE 1.15

Level 1   Cache memory in I²L, bipolar, or fast MOS costing 1.6 to 16 cents/byte, access time of 10 - 100 nsec
Level 2   Main memory in MOS, core or CCD costing .6 to 2 cents/byte, access time 400 - 1000 nsec
Level 3   Auxiliary memory in CCD, bubble or beam access costing .16 to .4 cents/byte, access time 10 - 50 nsec
Level 4   File memory in bubble, beam access or disk costing .04 to .16 cents/byte, access time 1 - 50 msec

[Figure 1.16. Capacity vs. cost for different technology (CCD, bubble, fixed head disk and moving head disk, 1978 and early 80's; horizontal axis: capacity in megabits)]
Objective of this Dissertation
The conventional general purpose computers can solve a variety of problems, but in the case of special tasks such as data base management these machines are not efficient. This inefficiency is due to the bottom-up approach which has been taken in the design of these systems. Two features of the von Neumann design which have direct effects on the inefficiency are:
i)   selection of the valid data among all the data is performed on the information which is in the main memory by means of its address. Since it is not possible to hold all information in the main memory, a great amount of information has to be transferred between main memory and secondary storage; in other words, swapping converts the search problem to a transportation problem.
ii)  lack of hardware orientation of the conventional system for handling non-numeric operations.

The inefficiency of the conventional systems calls for developing new architectures for future machines. This chapter described some of the solutions for handling this inefficiency. Because of the high cost of the hardware technology it was not possible to have a drastic change in the design of the conventional general purpose computers. The general trends of the technology show that today or in the near future, we are capable of incorporating more hardware in the design of new machines. In new systems:
i)   the transportation of data should be reduced;
ii)  non-numeric information should be manipulated efficiently;
iii) information should be accessed by value.

This dissertation addresses the design and implementation of a new high level data base language and its hardware implementation, ASLM (Associative Search Language Machine). ASLM is a cellular backend hardware capable of accessing information by value and handling non-numeric data as efficiently as numeric data. Therefore, it satisfies the above requirements.
The query language ASL (Associative Search Language) is capable of information retrieval and storage operations on variable length records, and is defined based on the relational data model. Among the features of ASL are:
i)   its completeness;
ii)  the simplicity of its productions; and,
iii) the similarity of its productions to arithmetic expressions.

Therefore, ASL can be used by users who do not possess a high degree of mathematical sophistication.
ASLM is a language oriented backend hardware machine capable of performing non-numeric operations on the data as well as numeric operations. One important aspect of ASLM is the accessibility of data by value.
In the design of ASLM, the existing semantic gap in the conventional system has been greatly reduced. This reduction is the result of the top-down approach which has been taken in the design of ASLM. Accessibility of the data by value implies the incorporation of the associative memory in the system.
ASLM is designed based on the cellular organization, a design similar to Slotnick's idea (Slotnick 1970) with an important exception. In the design of ASLM, the processing unit of each read/write head is a part of the backend machine. This exception gives a cost effective system. The general strategy behind the execution of the operations in ASLM is based on the 90-10% rule: first the data file will be presearched and then operations will be performed on the explicit subfiles which are stored in the associative memory. The generation of the explicit subfile has a significant effect on the organization of data on the secondary storage and on the execution time of interrelational operations such as the "join" operation.
ASLM can be classified among the SIMD (Single Instruction Stream Multiple Data Stream) class of machines, but it is possible to change its class from SIMD to MIMD (Multiple Instruction Stream Multiple Data Stream), and this will increase the power of ASLM.

In the rest of this dissertation the design and implementation of ASL and ASLM will be studied. Chapter II contains the background discussion related to the research problem, with an overview of the proposed computer architectures based on the relational data model. In Chapter III, the ASL language along with related topics such as the formal definition of the language and its implemented interpreter are discussed. The hardware design of ASLM, the flow of data, and the sequences of the operations are the contents of Chapter IV. This chapter addresses the important features of ASLM. Chapter V addresses the ASL primitive operations, their formats and functions. In Chapter VI the timing sequence of the ASL primitives and different data base operations based on the current technology will be developed, and finally Chapter VII will address the future development of ASL and ASLM in detail.


CHAPTER II
BACKGROUND AND PREVIOUS WORK
Introduction

The need for the implementation of a new hardware design capable of performing data base operations has been discussed in the previous chapter. Recently, several such special purpose architectures capable of performing DBMS functions have been proposed and implemented. In this chapter, the concepts "associative processor" and "relational data model" which have been used in our hardware design will be explained. This chapter will also review the literature of previous work in the areas related to our design.
Associative Processor
Access by address is the traditional method by which information is retrieved. The von Neumann machines translate variable names such as A or B to execute a statement like "add variable A to variable B." Since variables A and B are in RAM, the mapping of A and B into addresses in RAM is efficient if the size of RAM is small. But such an approach would be very time consuming and complex for a statement like "obtain all employee names working in the accounting department," since the information about the employees is stored in secondary storage, and information has to be read into the primary memory from the secondary memory. Unfortunately, conventional systems (von Neumann machines) present only a sequential view of storage and permit storage to be accessed only by location.
One proposed solution to this problem is to access storage by content, thereby avoiding address mapping algorithms. A content addressable memory or an associative memory is defined to be: "... a store whose registers are not identified by their name or position but by their content."
A content addressable memory has the following properties:
1)   access to the storage cells (e.g. words or records) occurs in parallel;
2)   information in the memory can be processed simultaneously in all the storage cells;
3)   searching and comparison operations are built into the logic of the memory; and,
4)   searching time is independent of the number of cells in the memory.

Definition: An associative processor can generally be described as a processor which has the following properties:
1)   stored data items (words or records) can be accessed in parallel using their content or part of the content, rather than their addresses; and,
2)   transformation of the data in each cell by arithmetic and logical operations can be performed over all the words with a single instruction.

The typical components of an associative memory are shown in Figure 2.1. These are:
1)   The memory array, which provides the registers for storing data;
2)   The comparand register, which contains the data to be compared against the contents of the memory array. This register is also used to communicate between a central control and the memory array, and typically has some shifting and other logical functions associated with it;
3)   The mask register, which is used to mask off portions of the data words from comparison and other operations;
4)   The match/mismatch indicator register (response register), which is used to indicate the success or failure of a search or comparison operation. For each word in the memory, there is a corresponding bit in the indicator register;
5)   The multiple match resolver, which is used to narrow down a specific location in the array in case more than one word satisfies the search or comparison criteria;
6)   Search logic, which is implemented in hardware and performs the search operations issued from a central control; and,
7)   Input/output buffer and logic for multiwrite operations, which allows for the parallel read operation as well as the capability to enter words in parallel into the memory array.
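The interplay of these components can be sketched in software. The model below is only an illustration under stated assumptions: the class `AssociativeMemory` and its method names are hypothetical, words are modelled as unsigned integers, and the sequential loops stand in for comparisons that the hardware performs on all words simultaneously.

```python
class AssociativeMemory:
    """Software model of the registers listed above (names are hypothetical)."""

    def __init__(self, words, width):
        self.array = list(words)              # memory array, one integer per word
        self.width = width                    # word length in bits
        self.comparand = 0                    # comparand register
        self.mask = (1 << width) - 1          # mask register, 1 = bit takes part
        self.response = [False] * len(self.array)  # one response bit per word

    def search_equal(self, comparand, mask):
        """Set the response bit of every word that equals the comparand
        in the bit positions selected by the mask."""
        self.comparand, self.mask = comparand, mask
        # The hardware compares all words at once; this loop only models it.
        self.response = [
            ((word ^ comparand) & mask) == 0 for word in self.array
        ]
        return self.response

    def resolve_first(self):
        """Multiple match resolver: index of the first responder, or None."""
        for index, hit in enumerate(self.response):
            if hit:
                return index
        return None

    def multiwrite(self, value, mask):
        """Write the masked bits of value into every responding word."""
        for index, hit in enumerate(self.response):
            if hit:
                self.array[index] = (self.array[index] & ~mask) | (value & mask)


# Assumed 8-bit words: high 4 bits hold a code, low 4 bits hold a value.
memory = AssociativeMemory([0x15, 0x25, 0x37, 0x24], width=8)
memory.search_equal(0x20, 0xF0)   # match on the high 4 bits only
first = memory.resolve_first()
memory.multiwrite(0x06, 0x0F)     # update the low 4 bits of all responders
```

The mask lets one search instruction address a field of the word rather than the whole word, and the response register carries the result of the search into the following multiwrite.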
The following example illustrates the way an associative memory operates. The data consists of each employee's number, name, salary, degree, and department and is loaded in a fixed length format. A query to determine the names of all employees who earn more than $20,000 is processed by setting the comparand and mask registers (as shown in Figure 2.2), and executing a "greater than" search. All the data in the array will be checked simultaneously, in bit serial word parallel fashion, against the stored value in the comparand register. Those words which satisfy the given criterion will be marked by setting the corresponding bit in the response register. It is possible to use the response register to provide greater flexibility and power in associative algorithms that need selection of the operand based on the previous computation.

[Figure 2.1. An associative memory (comparand register, mask register, memory cell array, response register, input/output buffer)]
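A "greater than" search in bit serial word parallel fashion can be sketched as follows. This is a software illustration only, not the hardware design: `greater_than_search` is a hypothetical name, the packing of the degree and salary fields into one integer word is an assumption made for the example, and the inner loop stands in for a bit slice that the hardware examines in all words at once. The employee numbers and salaries are those of Figure 2.2.

```python
def greater_than_search(words, comparand, mask, width):
    """Bit-serial, word-parallel 'greater than' search.

    Returns one response bit per word: True where the masked field of the
    word is greater than the masked field of the comparand.
    """
    greater = [False] * len(words)    # response register being built
    undecided = [True] * len(words)   # words equal to the comparand so far
    for bit in range(width - 1, -1, -1):       # most significant bit first
        if not (mask >> bit) & 1:
            continue                           # masked-off bit slice
        c = (comparand >> bit) & 1
        # One bit slice: the hardware examines this bit of every word at once.
        for i, word in enumerate(words):
            if not undecided[i]:
                continue
            w = (word >> bit) & 1
            if w != c:                         # first differing bit decides
                greater[i] = (w == 1)
                undecided[i] = False
    return greater


# (employee number, salary, degree) as in Figure 2.2.
employees = [
    ("M-125", 19000, 5), ("A-110", 18000, 5), ("N-090", 25000, 7),
    ("A-102", 21000, 4), ("M-011", 17000, 6), ("N-050", 19000, 5),
    ("A-111", 22000, 7), ("A-125", 20000, 6), ("N-021", 18500, 4),
]
SALARY_MASK = 0xFFFF                   # assumed layout: salary in the low 16 bits
words = [(degree << 16) | salary for _, salary, degree in employees]

response = greater_than_search(words, 20000, SALARY_MASK, width=20)
selected = [eno for (eno, _, _), hit in zip(employees, response) if hit]
print(selected)
```

The number of bit slices processed depends only on the width of the masked field, not on the number of words, which is why the search time of the hardware is independent of the size of the memory array.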
The architecture of associative processors can generally be classified as:
1)   Fully parallel systems - the basic associative operations are performed in parallel on each word or with respect to all the bits of the memory array. The fully parallel systems can be further grouped into a) word organized or b) distributed logic.
     a)   The major characteristic of a word organized associative memory is that the comparison capability is associated with each storage cell of all words in the memory. Therefore, the associative operation is performed in parallel by word and in parallel by bit. In other words, operations are performed simultaneously in all the storage cells. Although this type of associative memory is the fastest one compared to other types of associative memory, its hardware is also the most expensive and complicated.
     b)   The distributed logic associative memory is a fully parallel one dimensional associative memory where the comparison capability is associated with each storage cell or a group of storage cells denoting a character.
2)   Bit Serial Systems - the basic associative operations are performed in parallel on one bit of each word at a time (bit-slice). Since the number of words in the memory array is usually larger than the number of bits in each word, this idea seems to be a good compromise between fully parallel processing, in which the entire array is involved, and word serial processing, in which the words are processed in sequence. An extension of the "bit-slice" idea is "byte slice", in which the orthogonal parallel computation is performed over a byte or a subfield of the words.
3)   Word Serial Systems - if the memory array is not large and if the operations are not too complicated and time consuming, a reasonable design philosophy would be word serial associative processing. This approach is particularly useful if the storage medium is a rotating device or if the information is in a state of circulation. Word serial processing is actually a hardware implementation of a program loop in which the instruction need not be decoded every time around the loop.
4)   Block oriented Systems - for large information files residing in secondary rotating devices, this approach is cost effective, since the so-called 90-10% rule will be largely eliminated between secondary storage and main memory. This type of associative memory is a compromise between the high cost of the bit serial associative memory and the low speed of the word serial associative memory. The idea is to provide "logic per track" (Slotnick 1970) capability for each track of a rotating secondary storage.

Comparand register:                 20000
Mask register:                      11111

M-125   Smith, J.      19000   5   M
A-110   Johnson, C.    18000   5   A
N-090   Evans, S.      25000   7   N
A-102   Endress, B.    21000   4   A
M-011   Goets, L.      17000   6   M
N-050   Sheet, D.      19000   5   N
A-111   Bolden, B.     22000   7   A
A-125   Deas, H.       20000   6   A
N-021   Roof, J.       18500   4   N

Figure 2.2. An example of associative memory (one response register bit per word)
Associative processors have a substantial amount of logic in comparison to the conventional random access memories (Foster 1976). One consequence of this is the impression that associative processors must be extremely costly, which is not necessarily true. The reasons for this are: the economics of circuit technology are changing such that the design cost of a circuit is much more significant than the cost of producing it, provided that it is manufactured in large quantities. Moreover, in conventional systems, based on the 90-10% rule, most of the data transferred between secondary storage and main memory, and between main memory and the processor, are not relevant data. As will be seen later (Chapter IV), associative processors not only eliminate the 90-10% rule, but they also provide fast retrieval and update operations with considerably simplified software program management.
The concept of associative processing was first introduced by Slade and McMahon (1956), who described the design of a cryogenic memory system. Since then, associative memories have been implemented using different techniques. Moreover, the changes in the technology have increased the size of associative memories with a drastic reduction in the cost (Yau and Fung 1977). An associative memory board of 4K is now available. Lamb and Vanderslice (1978) describe a version of associative memory in which the words are 256 bytes long and the user is capable of defining records of one, two, three or more bytes. Multiple boards can be accessed in parallel; therefore, multiple boards (up to 8) can be linked as a single associative memory.
The associative processors show a great potential in the 80's for data base operations, because of their content addressability and the parallelism which they offer.
The Relational Data Model
In this section, some definitions related to the relational data model along with the common operations on the relational data model will be discussed.
Definition: R is said to be a relation on finite sets $S_1, S_2, \ldots, S_n$ (not necessarily distinct) if it is a set of n-tuples, each of which has its first element from $S_1$, its second element from $S_2$, ..., and its $n$th element from $S_n$. In other words, R is a subset of the Cartesian product $S_1 * S_2 * \cdots * S_n$. $S_j$ will be referred to as the $j$th domain of R.
A table will be used to represent a relation. In this representation the degree of the relation is the number of columns and its cardinality is the number of rows. An array which represents an n-ary relation R has the following properties:
1)   each row represents an n-tuple of R;
2)   the ordering of rows is immaterial;
3)   all rows are distinct; and,
4)   the ordering of columns is significant. It corresponds to the ordering $S_1, S_2, \ldots, S_n$ of the domains on which R is defined.

Figure 2.3 shows a 5-ary relation.
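The definition of a relation as a subset of a Cartesian product can be checked directly on a small example. The sketch below uses three assumed sample domains (employee number, salary, department) rather than the full relation of Figure 2.3, and the helper names `is_relation`, `degree` and `cardinality` are hypothetical.

```python
from itertools import product


def is_relation(rows, domains):
    """True if rows is a subset of the Cartesian product of the domains."""
    return set(rows) <= set(product(*domains))


def degree(domains):
    """Number of columns of a relation defined on the given domains."""
    return len(domains)


def cardinality(rows):
    """Number of distinct rows (tuples) of the relation."""
    return len(set(rows))


# Small assumed domains S1 (ENO), S2 (SALARY), S3 (DEPARTMENT).
S1 = {"M-125", "A-110", "N-090"}
S2 = {18000, 19000, 25000}
S3 = {"M", "A", "N"}
DOMAINS = [S1, S2, S3]

# A Python set models properties 2 and 3 (no row order, no duplicate rows);
# each tuple keeps the column order S1, S2, S3 (property 4).
R = {
    ("M-125", 19000, "M"),
    ("A-110", 18000, "A"),
    ("N-090", 25000, "N"),
}

print(is_relation(R, DOMAINS), degree(DOMAINS), cardinality(R))
```

Here R has degree 3 and cardinality 3; a tuple containing a value outside one of the domains would make `is_relation` return False.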
In a given relation, a primary key is a domain which uniquely identifies each element of that relation. For example, in Figure 2.3, "ENO" is a primary key. In a relation a domain is said to be an atomic domain if it is not decomposable, meaning that at every row and column position in the table there exists precisely one value. In contrast with an atomic domain we might also have a non-atomic domain. A relation is normalized if all domains are atomic. As will be seen later, in our system we are dealing with normalized relations.
The terms functional dependency and full functional dependency will now be defined.
Definition: Let A and B be two sets denoting attributes (an element of the set of domains) of a relation. Let Domain(A) be the domain of A and Domain(B) be the domain of B, and let, at a given time t, $f_t$ be a function such that:

$$f_t : \mathrm{Domain}(A) \rightarrow \mathrm{Domain}(B)$$

In fact, $f_t$ is not a function because it is allowed to change over time, in the sense that data base relations are allowed to change over time. That is, if there is a set of ordered pairs,

$$\{(a, b) \mid a \in \mathrm{Domain}(A) \text{ and } b \in \mathrm{Domain}(B)\}$$

and at every point of time, for a given value of a there will be at most one value b, then $f_t$ is called a functional dependency. If there is a functional dependency $f_t$, then B is said to be functionally dependent on A, and A is said to functionally determine B. If there exists a functional dependency between A and B then we can write $A \rightarrow B$. In contrast, $A \nrightarrow B$ means there is no functional dependency between A and B. For example, in Figure 2.3 the set $B \subset (\mathrm{Name} * \mathrm{Salary} * \mathrm{Degree})$ is functionally dependent on ENO; that is, given one value of ENO, there exists exactly one corresponding value of (Name, Salary, Degree) for it.

ENO     NAME          SALARY   DEGREE   DEPARTMENT
M-125   SMITH J.      19000    5        M
A-110   JOHNSON C.    18000    5        A
N-090   EVANS S.      25000    7        N
A-102   ENDRESS B.    21000    4        A
M-011   GOETS L.      17000    6        M
N-050   SHEET D.      19000    5        N
A-111   BOLDEN B.     22000    7        A
A-125   DEAS H.       20000    6        A
N-021   ROOF J.       18500    4        N

Figure 2.3. The employee relation
Definition: Domain B is fully functionally dependent on domain A if it is functionally dependent on A and not functionally dependent on any subset of A. In Figure 2.3, the domain DEPT is functionally dependent on the composite domain (ENO, NAME), but it is not fully functionally dependent on (ENO, NAME) since it is functionally dependent on ENO.
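The two definitions above can be illustrated with a minimal Python sketch. The helper names (`functionally_determines`, `fully_functionally_determines`) and the dictionary representation of tuples are assumptions made here for illustration; the rows are a subset of Figure 2.3. Note that such a check only tests the relation as it stands at one time t, whereas a functional dependency is required to hold at every point of time.

```python
from itertools import combinations

# A few tuples of the employee relation of Figure 2.3 (illustrative subset).
EMPLOYEE = [
    {"ENO": "M-125", "NAME": "SMITH J.", "SALARY": 19000, "DEGREE": 5, "DEPT": "M"},
    {"ENO": "A-110", "NAME": "JOHNSON C.", "SALARY": 18000, "DEGREE": 5, "DEPT": "A"},
    {"ENO": "N-090", "NAME": "EVANS S.", "SALARY": 25000, "DEGREE": 7, "DEPT": "N"},
    {"ENO": "A-102", "NAME": "ENDRESS B.", "SALARY": 21000, "DEGREE": 4, "DEPT": "A"},
]


def functionally_determines(rows, lhs, rhs):
    """True if, in this instance, each lhs value has at most one rhs value."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


def fully_functionally_determines(rows, lhs, rhs):
    """True if lhs -> rhs holds and no proper, non-empty subset of lhs does."""
    if not functionally_determines(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if functionally_determines(rows, subset, rhs):
                return False
    return True


print(functionally_determines(EMPLOYEE, ("ENO",), ("NAME", "SALARY", "DEGREE")))
print(fully_functionally_determines(EMPLOYEE, ("ENO", "NAME"), ("DEPT",)))
```

On this subset, ENO determines (NAME, SALARY, DEGREE), while DEPT is dependent but not fully dependent on (ENO, NAME), in line with the example in the text.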
The problems of maintaining the integrity of data bases have led to different canonical, or normal form, representations of relations. These are referred to as first, second, third and fourth normal forms. In our organization relations should be in the first normal form, which is defined below:
Definition: First Normal Form (1NF): a relation is in 1NF if all domains of each tuple are atomic.
Since relations are sets, all of the usual set operations (union, intersection, complement, join, division, restriction and projection) are applicable to them. Before giving the definition of the above operations, the concept of union compatibility should be defined.

Definition: Two atomic domains are union compatible if they are of the same data type.
Definition: Two non-atomic domains A and B are union compatible if they are of the same degree and, for every i (i = 1, ..., n), the $i$th atomic domains of A and B are union compatible.
Definition: The relations A and B are union compatible if the domains of A and B are element-wise union compatible.
If the relations A and B are union compatible, the product ($\otimes$), union ($\cup$), intersection ($\cap$) and difference ($-$) are defined in the same way as they are defined over sets. In the following, the other set operations (projection, join, division and restriction) will be discussed.
Projection: Suppose r is a tuple of an n-ary relation R. For i = 1, ..., n, r(i) designates the $i$th component of r. For the index set $I = (i_1, i_2, \ldots, i_k)$ (not necessarily distinct), with $1 \leq i_j \leq n$, we define $r(I) = (r(i_1), r(i_2), \ldots, r(i_k))$.
Let R be a relation of degree n, and I as defined above; then the projection of R on I is defined by:
$R(I) = \{ r(I) \mid r \in R \}$
In other words, the projection operation returns only the specified columns of the given relation, and eliminates duplicate rows from the result. For example, if I = (SALARY, DEGREE), and R is the relation depicted in Figure 2.3, R(I) will be as in Figure 2.4.
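As an illustration of the projection just defined, the following is a minimal Python sketch. The function name `project` and the tuple layout are assumptions made here; the data are the rows of Figure 2.3, and column positions are counted from 0 in the code, whereas the text counts them from 1.

```python
# Employee relation of Figure 2.3 as (ENO, NAME, SALARY, DEGREE, DEPT) tuples.
EMPLOYEE = [
    ("M-125", "SMITH J.", 19000, 5, "M"),
    ("A-110", "JOHNSON C.", 18000, 5, "A"),
    ("N-090", "EVANS S.", 25000, 7, "N"),
    ("A-102", "ENDRESS B.", 21000, 4, "A"),
    ("M-011", "GOETS L.", 17000, 6, "M"),
    ("N-050", "SHEET D.", 19000, 5, "N"),
    ("A-111", "BOLDEN B.", 22000, 7, "A"),
    ("A-125", "DEAS H.", 20000, 6, "A"),
    ("N-021", "ROOF J.", 18500, 4, "N"),
]


def project(relation, indices):
    """Return R(I) = { r(I) | r in R }, dropping duplicate rows.

    Rows are kept in first-seen order only for readability; the order of
    rows in a relation is immaterial.
    """
    seen = set()
    result = []
    for r in relation:
        projected = tuple(r[i] for i in indices)
        if projected not in seen:
            seen.add(projected)
            result.append(projected)
    return result


salary_degree = project(EMPLOYEE, (2, 3))  # I = (SALARY, DEGREE)
for row in salary_degree:
    print(row)
```

The pair (19000, 5) occurs twice in Figure 2.3, so the projection has eight rows rather than nine, as in Figure 2.4.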
Restriction: Let $\theta$ denote any one of the relations $=, \neq, <, \leq, >, \geq$, and let A and B be domains of R; then the $\theta$-restriction of R on domains A and B is defined by
$R[A \, \theta \, B] = \{ r \mid r \in R \wedge r(A) \; \theta \; r(B) \}$
In other words, a restriction operator selects those tuples of a relation which satisfy a given condition. A restriction with the condition SALARY > 20000 on the relation defined in Figure 2.3 will result in a new relation as shown in Figure 2.5.

ESALARY   EDEGREE
19000     5
18000     5
25000     7
21000     4
17000     6
22000     7
20000     6
18500     4

Figure 2.4. A projection on the employee relation
Join: Let $\theta$ be as defined above. The $\theta$-join of relation R on domain A with relation S on domain B is defined by
$R[A \, \theta \, B]S = \{ (r\,s) \mid r \in R \wedge s \in S \wedge r(A) \; \theta \; s(B) \}$
provided every element r(A) is $\theta$-compatible with every element of s(B). In other words, the join operator takes two relations R and S as arguments; a new relation is formed by concatenating a tuple of R with a tuple of S whenever a given condition holds between specified domains of R and S. As an example, if R is the same as in Figure 2.3 and S is as in Figure 2.6, then R [DEPT = DEPTCODE] S will be as in Figure 2.7.
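The restriction and join operations above can be sketched in Python as follows. The function names (`restrict`, `restrict_const`, `theta_join`) and the `THETA` table are illustrative assumptions; `restrict_const` compares a domain with a constant, as in the SALARY > 20000 example, which is a slight variation on the two-domain definition. The data come from Figures 2.3 and 2.6.

```python
import operator

# The comparison relations that theta may denote.
THETA = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}

# (ENO, NAME, SALARY, DEGREE, DEPT), from Figure 2.3.
EMPLOYEE = [
    ("M-125", "SMITH J.", 19000, 5, "M"),
    ("A-110", "JOHNSON C.", 18000, 5, "A"),
    ("N-090", "EVANS S.", 25000, 7, "N"),
    ("A-102", "ENDRESS B.", 21000, 4, "A"),
    ("M-011", "GOETS L.", 17000, 6, "M"),
    ("N-050", "SHEET D.", 19000, 5, "N"),
    ("A-111", "BOLDEN B.", 22000, 7, "A"),
    ("A-125", "DEAS H.", 20000, 6, "A"),
    ("N-021", "ROOF J.", 18500, 4, "N"),
]

# (DEPTCODE, DEPTNAME), from Figure 2.6.
DEPARTMENT = [
    ("A", "ACCOUNTING"), ("M", "MANAGEMENT"), ("N", "NUTRITION"),
    ("E", "ENGINEERING"), ("C", "COMPUTER"), ("I", "INVENTORY"),
]


def restrict(relation, a, theta, b):
    """R[A theta B]: keep tuples whose columns a and b satisfy theta."""
    op = THETA[theta]
    return [r for r in relation if op(r[a], r[b])]


def restrict_const(relation, a, theta, value):
    """Variant comparing column a with a constant (e.g. SALARY > 20000)."""
    op = THETA[theta]
    return [r for r in relation if op(r[a], value)]


def theta_join(r_rel, a, theta, s_rel, b):
    """R[A theta B]S: concatenate r and s whenever r(A) theta s(B) holds."""
    op = THETA[theta]
    return [r + s for r in r_rel for s in s_rel if op(r[a], s[b])]


high_paid = restrict_const(EMPLOYEE, 2, ">", 20000)
joined = theta_join(EMPLOYEE, 4, "=", DEPARTMENT, 0)
print(high_paid)
print(joined[0])
```

The nested loop in `theta_join` is the simplest possible formulation and examines every pair of tuples; it is meant only to mirror the set definition, not to suggest how a join should be executed.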
The relational data model describes a relation simply as a two dimensional table. Contrast this data model with the other logical data models, such as hierarchical or network, which require elaborate linkage and pointer techniques between different data items for successful access. In general, the relational data model is a more suitable data model in data base systems because:
1) the tabular representation of data, which is similar to the ordinary users' view of data, provides simplicity and data independence for the user.

ENO      NAME          SALARY   DEGREE   DEPARTMENT
N-090    EVANS S.      25000    7        N
A-102    ENDRESS B.    21000    4        A
A-111    BOLDEN B.     22000    7        A

Figure 2.5. A restriction on the employee relation

DEPARTMENT CODE   DEPARTMENT NAME
A                 ACCOUNTING
M                 MANAGEMENT
N                 NUTRITION
E                 ENGINEERING
C                 COMPUTER
I                 INVENTORY

Figure 2.6. The S relation

NUMBER   NAME          SALARY   DEGREE   DEPARTMENT CODE   DEPARTMENT
M-125    SMITH J.      19000    5        M                 Management
A-110    JOHNSON C.    18000    5        A                 Accounting
N-090    EVANS S.      25000    7        N                 Nutrition
A-102    ENDRESS B.    21000    4        A                 Accounting
M-011    GOETS L.      17000    6        M                 Management
N-050    SHEET D.      19000    5        N                 Nutrition
A-111    BOLDEN B.     22000    7        A                 Accounting
A-125    DEAS H.       20000    6        A                 Accounting
N-021    ROOF J.       18500    4        N                 Nutrition

Figure 2.7. A join of employee and department relations

2) the mathematical properties of relations are well defined, and the basic operations required to process the relations are limited and well known. This provides a powerful and complete sublanguage based on a relational data model (Su and Emam 1978).
So far the concepts of associative memory and relational data model have been discussed. Although the relational data model offers simplicity by its natural correspondence with tabular representation, in the conventional computers its representation has caused complexity in programs and redundancy in data. Since memory is viewed as a one dimensional array, there should be a software interface between logical and physical representation of data. These problems could be generally reduced by using associative memory to represent the relational data model.
First, there exists a one to one mapping between associative
memory and the tabular representation of the relational data model.
Moreover, in the relational data model all attributes are equally
available to be searched, as in the associative memory, and search
can be performed on any defined field.
Second, since in the associative memory data is accessed by
content, there is no intermediate address mapping translation.
Third, in the associative memory, insertion and deletion of
words can be done at any row, as in the relational data model.
The next section will review the literature on previous works in the design of DBMS architectures. The following section has been divided into two subsections. In the first subsection, the architectures which are based on backend machines will be discussed.
Previous Works
Fully Parallel Architecture
The first associative processor for handling non-numeric operations was designed by Lee and Paull (1963). This bit parallel word serial hardware machine is composed of an array of identical cells. Each cell is a small finite state machine which can communicate with its neighbors. The array is controlled via a set of programming commands broadcast among all the cells by a controller. Each cell includes a set of bistable devices called cell elements. Cell elements are divided into the cell state elements and the cell symbol elements. Cell symbol elements will hold a bit pattern corresponding to a character in the alphabet. There also exists a matching circuit which matches the bit pattern broadcast by the controller against the bit patterns stored in each cell. Data is organized as a single string of symbols divided into substrings of arbitrary lengths by delimiters. The system is suitable for text retrieval operations, but because of the hardware cost of the memory, the memory size could not be large enough for handling data bases. In order to overcome the propagation timing problems, Gains and Lee (1965) redesigned the logic circuitry. This new design was able to perform simultaneous operations of shifting and marking strings, but the cost was high for practical implementation.

In the early 70's some new hardware for handling data bases using associative processors was designed (Moulder 1973; Linde et al. 1973; Defiore and Berra 1973). In all of these systems, operations are performed in a "bit slice" fashion. That is, parallel operations will be performed on one bit of all words at a time. In each approach a critical assumption has to be made that the entire data file should fit into associative memory. The comparisons between an associative processor based architecture and its equivalent von Neumann architecture have shown the superiority of associative processors over the von Neumann design with respect to retrieval, update and storage operations (Berra 1974).
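The "bit slice" mode of operation can be illustrated with a small Python simulation. This is only a sketch under stated assumptions: the function name `bit_slice_equal_search` is hypothetical, words are modelled as integers of a fixed width, and the per-slice comparison that the hardware would perform on all words in parallel is simulated here with an ordinary loop.

```python
def bit_slice_equal_search(words, key, width):
    """Exact-match search processed one bit position of all words at a time.

    Returns one response flag per word: True where the word equals the key.
    """
    match = [True] * len(words)               # response store, one flag per word
    for bit in reversed(range(width)):        # most significant bit first
        key_bit = (key >> bit) & 1
        # One "bit slice": the same bit position taken from every word.
        bit_slice = [(w >> bit) & 1 for w in words]
        # Conceptually a single parallel step across all words.
        match = [m and (b == key_bit) for m, b in zip(match, bit_slice)]
    return match


responders = bit_slice_equal_search([5, 3, 5, 7], key=5, width=3)
print(responders)
```

The number of steps in this scheme depends on the word width rather than on the number of words, which is the property the bit slice organization exploits.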
The system implemented by Moulder (1973) is a typical design of a data base system based on an associative processor, although the system is restricted to a moderate size data base ($6 \times 10^7$ bits). This system is composed of a general purpose computer (Sigma 5) augmented by a 4-array STARAN (Rudolph 1972) computer. The data base is subdivided into fixed size sectors. A sector is read in and will be searched. Then the system reads in another sector and performs the same search, continuing in this fashion until the entire data base is searched. A general purpose computer, which is an interface between users and a STARAN computer, is used for general information handling applications. STARAN by itself is connected to a head per track disk. The disk is composed of 72 tracks, of which 64 tracks are tied to STARAN. The only surface of the disk is subdivided into 384 sectors. Each track has the capacity of 256 bits per sector.

The system designed by Linde et al. (1973) is an integrated associative computer. This system is composed of two associative processing units of size 2K 256 bit elements, which are linked to an IBM 370/145. A comparison between the performance of this system and a stand alone IBM 370/145 has been done. With respect to retrieval and update operations (unordered, ordered), the system has shown superiority over conventional systems.
All the systems discussed not only are capable of handling data base systems, but have also offered better response time than conventional computers. For example, Defiore and Berra (1973) have shown that associative processors need to use three to fifteen times less storage compared to a data base system using inverted list organization. Moreover, the response time is ten times faster.
It is worthwhile to mention that, besides the above special purpose designs based on associative processors, there are some general purpose associative processors. STARAN is an example of such a system; it is composed of an associative array processor (one to 32 modular associative processors) with an interface (custom interface unit) to the users. It also has a conventionally addressed control memory for program storage and data buffering. Each associative processor is a matrix of 256 words by 256 bits, with parallel access up to 256 bits at a time in either the word or bit direction. Control signals generated by the control logic unit are fed to the processing elements in parallel, and all processing elements execute the instruction simultaneously. The STARAN symbolic assembler language APPLE provides a flexible and convenient assembler for programming without the complex and costly indexing, nested loop and data manipulation constructions required in conventional systems. Instruction execution time is dependent upon the number of bits in the operations involved in the instruction.

In the rest of this section some of the hardware backend data
base machines, especially those which have considered the relational
data model will be discussed.

Block Oriented Data Base Architecture
In the late 60's, Slotnick (1970) proposed the "Logic per Track" concept. This idea, which later on was the guideline for a great number of proposed hardware designs, is simply based on assigning a read/write head to each track of a disk. Later on, the idea was implemented by adding more logic to each read/write head (cell), which brought the concept of cellular organization. The general idea behind all the proposed hardware designs based on Slotnick's idea is the selection of tuples on the secondary devices. These cellular organizations are composed of a set of identical cells supported by a general purpose computer. Each cell is acting as a small computer. A block of memory is assigned to each cell, and each cell is capable of performing some basic operations on the resident data on its associated block of memory. The data file will also be segmented into equal size blocks. As a result of this organization, only the relevant data selected on the secondary devices will be transferred to the front-end machine. This will reduce the channel loads. Moreover, since this system is composed of several identical cells which can perform operations independent of each other, parallelism will be achieved, and as a result response time will decrease. Figure 2.8 shows a general cellular organization, where N processors ($P_1, P_2, \ldots, P_N$) are controlled by a controller and to each processor $P_i$ a memory $M_i$ is assigned ($1 \leq i \leq N$).

Figure 2.8. General structure of a cellular organization (diagram labels: I/O Machine, General Purpose Computer, Controller)
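The cellular organization described above can be sketched as a small Python simulation. The names `segment` and `cellular_select` are hypothetical, and the cells, which would work concurrently in hardware, are simulated one after another; the point of the sketch is only that each cell filters its own block and that only the selected tuples are passed on to the front-end machine.

```python
def segment(data, n_cells):
    """Split the data file into n_cells blocks of (nearly) equal size."""
    size = -(-len(data) // n_cells)  # ceiling division
    return [data[i * size:(i + 1) * size] for i in range(n_cells)]


def cellular_select(blocks, predicate):
    """Apply the broadcast selection in every cell, then gather the results.

    Each inner list is what one cell P_i selects from its own memory M_i.
    """
    per_cell = [[t for t in block if predicate(t)] for block in blocks]
    # Only the selected tuples travel over the channel to the front end.
    return [t for selected in per_cell for t in selected]


blocks = segment(list(range(10)), n_cells=4)
even_values = cellular_select(blocks, lambda t: t % 2 == 0)
print(blocks)
print(even_values)
```

In this toy run ten tuples are stored but only five cross the channel, which is the reduction in channel load referred to in the text.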

A Logic Per Track Retrieval System is a fixed head disk with a logic chip attached directly to each read/write head (Parker 1971). Each head is capable of searching for a fixed key, garbage collection, insertion of a record, deletion of a record and deletion or insertion of a key. Since there is no communication path between heads, logically the maximum length of each record is restricted to the size of a track. Information on the tracks is divided into three groups: "holes, keys and data records." Each item is preceded by some bits showing the type and length of the item. Moreover, each record should be preceded by a set of mark bits for instructions which take more than one rotation.
RAPID (Rotating Associative Processing for Information Dissemination) is an array of identical cells controlled by a control unit (Parhami 1972). RAPID is based on Slotnick's idea and Lee's machine (Lee and Paull 1963). This combination will reduce the cost of memory (compared to Lee's memory) at the expense of execution time. Information on the tracks is in the bit parallel word serial format. This implies no conversion of data when it is transferred to the cells. The strings of data which are stored on the secondary memory are read into the cell storage one character at a time, processed and stored back. Since data is handled one character at a time, each character should have some control storage associated with it to store the temporary result, similar to that used in Lee's design.

CASSS (Context Addressed Segment Sequential Storage) is another non-numeric oriented system based on a cellular organization, augmented by a fixed-head disk (Healy et al. 1972). Data files are segmented into equal length, one segment per track. Data is organized in bit serial word serial fashion. Each record is of variable length, and is preceded with a set of control bits for storing the intermediate results. CASSS is also capable of performing non-numeric operations, which include automatic garbage collection.

CASSM (Context Addressed Segment Sequential Memory) is a data base machine capable of handling hierarchical data structures, as well as relational and network data structures, since they are special cases of hierarchical data structures (Su et al. 1979; Su 1979). Data is stored in a bit serial word serial fashion. One particular feature of CASSM is that each processing element can directly communicate with its adjacent neighbors. This system, which is very similar to CASSS, is a result of a top down design and development process; that is, the design of the system is done in the following stepwise fashion:
i) definition of a data model, ii) definition of a high level data sublanguage, iii) the design of the assembly language, iv) the definition of the machine primitives, and finally, v) the logic design and implementation of the hardware. Data consists of strings of words, each word 40 bits long. In each word 32 bits can be used to store either a delimiter, a column-value pair, a character string or pointers and instructions. The remaining bits are used as a tag to identify word content, mark bits and internal processing.

RAP (Relational Associative Processor) (Ozkarahan et al. 1974; Ozkarahan et al. 1975; Ozkarahan et al. 1976; Schuster et al. 1979) is a data base machine capable of handling relational data structures. Like CASSM it is composed of an array of processors, but without any communication path between adjacent cells. Data is stored in bit serial word serial fashion. RAP uses a fixed length representation for the tuples of a relation. This length can vary from relation to relation. Data on each segment will be preceded by a set of mark bits with the same concept as the other machines.
RARES (Rotating Associative Relational Store) uses a very different organization from CASSM and RAP (Lin et al. 1976). Tuples are laid out on the secondary storage device across the tracks, rather than along the tracks (bit serial byte parallel fashion). Each set of tracks used to store a relation in this fashion is called a band. The number of tracks in a band is a function of the size of tuples in the relation. In this design a cell will be assigned to each band, which can overcome the inefficiency of previously mentioned cellular organizations, at the expense of slower execution time and additional hardware for association of cells to bands of different sizes. Another special feature in this system is the elimination of mark bits on the secondary devices, since mark bits are implemented in the cells (search modules) rather than on the secondary storage.
DBC (Data Base Computer) (Banerjee et al. 1978; Banerjee et al. 1979) consists of several functionally specialized processors which perform database search, directory search, access control, query and update, and other database management functions. The system is designed to implement an attribute based data model. All the mapping and name resolution will be performed by hardware algorithms in different processors. Data is stored on a movable head disk which is clustered based on some prespecified keys. In fact the design is a hardware implementation of a cluster file organization.
Search Processor

All the hardware machines mentioned above (except DBC) are in the SIMD (Single Instruction Stream, Multiple Data Stream) category. There are some data base machines which are in the MISD (Multiple Instruction Stream, Single Data Stream) and MIMD (Multiple Instruction Stream, Multiple Data Stream) classes. The search processor for data base management systems belongs to the class of MISD machines (Leilich 1977). Data could be of variable length and is stored on secondary devices in bit parallel word serial fashion. This system is composed of a group of search modules (14 units), each capable of executing a user's query. Data will be read from secondary storage and will be broadcast among the search modules. Each module will check the data against its stored query independent of the other modules. Although the search time for each individual query is the same as sequential search, because of the concept of parallelism behind the system (more than one query at a time), the overall execution time will be reduced.
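The MISD mode of operation can be sketched as follows in Python. The function name `broadcast_search`, the record layout and the two sample queries are illustrative assumptions, not details of the search processor itself; the sketch only shows one pass over the data stream serving several independent queries.

```python
def broadcast_search(stream, queries):
    """Read the data stream once and let every search module test each record.

    `queries` maps a query name to a predicate, standing in for the query
    stored in one search module.
    """
    results = {name: [] for name in queries}
    for record in stream:                        # the data is read only once
        for name, predicate in queries.items():  # same record seen by every module
            if predicate(record):
                results[name].append(record)
    return results


records = [("A-110", 18000), ("N-090", 25000), ("A-111", 22000)]
answers = broadcast_search(records, {
    "high_salary": lambda r: r[1] > 20000,
    "dept_a": lambda r: r[0].startswith("A"),
})
print(answers)
```

Each query still sees every record, so an individual query is no faster than a sequential scan; the gain comes from answering several queries during the same pass.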
DIRECT (DeWitt 1979), which is in the MIMD (Multiple Instruction Stream, Multiple Data Stream) class, is another database architecture which supports a relational data model. The idea behind this design is the elimination of the processor idle time which occurs in CASSM or RAP if the size of the data file is not as large as the total size of the secondary storage. This will provide a multi-user environment. The concept of mark bits has also been eliminated, since sub-relations will be explicitly created during the operations.
DIALOG (Distributed Associative Logic Data Base Machine) (Wah and Yao 1980) is a network backend database machine. The system is capable of manipulating data which are stored on different storage devices. The system is composed of a host computer which is backed by a cluster of processing elements. Data in the system is stored in the data modules. Each data module consists of a storage device and an associative processor. Data modules are connected together by an interconnection network. The host is connected to the cluster of data modules via a controller.
Discussion
Parallelism is the common feature in all the backend machines discussed in this chapter. Based on this feature the backend systems can be classified into three groups: SIMD, MISD and MIMD. In this section the advantages and disadvantages of the machines in each of the SIMD and MISD classes will be discussed.
1) SIMD - the general pattern of operations is as follows: The micro operations will be distributed among several identical mini-processors, each capable of performing the basic operations on the segmented data associated with it. The secondary storage is a rotating device such as a disk (segmented data in this case is a track). The cells might or might not be independent of each other. Among the advantages of this approach we can say:
a) the independence of search time from the data base size;
b) performing the search on the secondary device rather than on the main memory;
c) providing non-numeric operations; and,
d) providing operations based on the variable length tuples.
Based on this idea, a search through the data file will take one disk rotation for marking the tuples, but these systems will show the following disadvantages.
A fixed head disk with about $10^9$ bytes capacity implies 8000 tracks (12000 bytes/track), which means that 8000 processors should be attached to the heads. The system is not going to be cost effective, especially since processors should be a part of the secondary storage. On the other hand, the size of a segment of memory is not large enough to hold a relation. This implies synchronization and control over 8000 processors by a controller, which means a complicated hardware implementation.
The cells could be dependent or independent of each other: if the cells are independent, the phenomenon known as internal fragmentation in the paging scheme will affect the efficiency of the system. If the cells are not independent, the hardware overhead which is the result of inter-communication paths between the cells will increase the cost of the system. Furthermore, such an addition to a fixed head disk technology is difficult due to the problem of synchronizing the concurrent read/write of all tracks and routing the output from a large number of tracks.
Since the processors are attached to the secondary devices,
efficiency would be achieved if all the processors were active
during the operations which means that efficiency of the system
varies depending on the size of the relations.

In other words,

this system is not efficient for small to moderate size relations.
Existence of control bits or mark bits for each record is essential, due to the fact that some operations will be executed in more than one disk rotation and subrelations are not created explicitly. The concept of control bits not only reduces user's independence of data, but also increases the execution time, since manipulation of control bits implies write operations on the secondary devices.
Creation of implicit subrelations on the secondary storage will drastically increase the execution time of interrelational operations such as "join." For example, the execution time of the master/slave procedure in the RAP will be unacceptable for joining two subrelations of size 10000 tuples, which takes about three minutes.

2) MISD - in this type of machine the data will be broadcast among several processing units. Each unit is capable of handling a query. Each query will be stored in the memory associated with a processing element; the processing elements are generally independent of each other. The only implemented machine in this class is the "search processor" (Leilich 1977), so the discussion will be restricted to this design. The general strategy is that data will be broadcast among identical units (search units), each capable of handling a query. Because of the concept of parallelism, this machine should naturally offer a better execution time compared to the conventional machine, but for an individual query the time will not differ, since the search time is a function of data file size. This system is capable of performing operations on variable length fields. Moreover, since each search unit is assigned to a query there is no need for internal communication paths between units (except in one case which will be discussed later). On the other hand, this system is so restricted that its efficiency is decreased. These restrictions can be seen in all the cases:
a) the secondary device is a rotating device (disk) of 9 surfaces, which means moderate data file size;
b) queries are restricted: each domain can be referred to only once in a query, unless more than one unit is assigned to a query. This implies existence of internal communication paths between search units, and overhead on the hardware, but since the number of units is not large this overhead is tolerable compared to the SIMD class; and,
c) since data is shared among different users or queries, queries should not be able to change the data. This implies:
i) each query should be executed in one disk rotation, unless control bits are implemented (one set for each search unit).
ii) update of data is impossible because of locking protocol.
In the following chapters, the design and implementation of the language ASL, which is based on the relational data model, will be discussed. ASL is a data base sublanguage which provides data independence, is designed for non-professional programmers, and can be directly implemented by the hardware.

CHAPTER III

ASL - AN ASSOCIATIVE SEARCH LANGUAGE FOR DATA BASE MANAGEMENT

Introduction
The concept of stored program computer is an elegant, practical and unifying idea that solved a number of problems. Although the concepts and conditions which produced this architecture have changed radically, we still identify the notion of "computer" with this thirty year old concept. One of these changes is the use of computers in specific areas such as data base management.
The design of a general purpose computer offers an overall optimization for different tasks, but in case of a specific task this design is not efficient. This inefficiency is the result of the bottom up approach used in the design of a general purpose computer. This approach has created a problem called "semantic gap" in the literature, which is the measure of differences between programming languages and the hardware architecture. Most of the current systems have a large semantic gap. This large semantic gap contributes to software unreliability, performance problems, excessive program sizes, compiler complexity and distortions of the programming languages. In other words, the semantic gap has been created because the programming languages are much more powerful than machine languages. For example, in a von Neumann design, memory is viewed as a one dimensional array, whereas data structures in programming languages are not generally viewed as a one dimensional array.
In the data base systems the major concern is to maintain and access large amounts of data, which happen to be mostly non-numeric information. Moreover, because of the influence of computers in today's society, the users generally are non-professional programmers. These facts have forced the designer to create programming languages easily understood and implemented by users having a minimal mathematical sophistication. These English-like self contained data languages are called query languages, which bear an important feature: "they provide users independence of any detail information about computer." By "users independence" we mean the independence of the user's model of data and computation from the data structure on the secondary storage and the hardware complexity of the machine. A programming language with this feature naturally should be close to the natural languages. This closeness increases the existing semantic gap if these languages are implemented on the conventional systems. In general, the closer the programming languages to the natural languages, the larger the semantic gap.
Several query languages, which range from the high level programming languages to the natural languages, have been designed and implemented on the conventional systems (Kim 1979). All these languages more or less support the following features:
a) they are non-procedural data languages which provide data manipulation, data definition, and data control facilities for non-technical users;
b) they provide users independence of data structure and sophisticated knowledge about computers;
c) they are close to natural languages, therefore, they are easy to use and learn; and,
d) they provide integrity and security of data.

The existing semantic gap for these languages has been filled by software systems which translate the high level features into the machine features. This translation not only reduces the efficiency of the system, but it also increases the complexity of the systems. This semantic gap can be reduced by taking a top-down approach to the data base systems. In other words, the gap can be reduced by designing a special purpose machine for handling data base systems.
This chapter addresses a new data base language, ASL. ASL is a high level data base language designed for information retrieval and storage operations on data bases. In the design of ASL, the semantic gap has been reduced because of two important approaches which have been taken:
1) A top down approach, which is the central motivation in the design of the ASL language (Mukhopadhyay and Hurson 1979).
2) The ASL machine, which is the hardware implementation of ASL.
In this approach we first define the data model. A relational data model was our choice, because it is simple, it provides a high degree of data independence, and it is based on well-known algebraic theory. The language ASL is then defined based on this model and on the requirements of a query language for handling large data base systems, and finally the hardware micro instructions are specified. These considerations enabled us to define a set of micro instructions which is very close to the set theoretic operations. In the conventional systems a large part of the data traffic between main memory and central processor carries information regarding memory names and data, as well as operations and data used to compute such names. In a query, the complexity of the data base software system is due to a great extent to the requirement for name mapping resolution, which converts query variables into memory addresses where the data named by the query can be found. In the next section the formal definition of ASL is given.

Moreover, the implementation of the ASL language by an interpreter on a conventional machine is presented in this chapter. This implementation has the following motivations:

1) It enables us to define the language precisely. This led to some changes in the formal definition of the ASL productions. As will be discussed later, by using an LR(1) parser all the conflict states have been resolved, and these resolutions had a direct effect on the ASL productions.

2) This implementation could be used as a tool for comparing the efficiency of the ASL hardware; and,

3) Since ASL is a query language, it could be used as a stand-alone query language for handling data base systems.

In the rest of this chapter the design and implementation of the ASL language, along with a brief review of some concepts of compilers and related topics, will be discussed.

ASL Language

Definitions

Definition: An alphabet or vocabulary is any finite set of symbols.

Definition: A sentence over an alphabet V is any string of finite length composed of symbols from the alphabet. If V is an alphabet then V* denotes the set of all sentences composed of symbols of V, including the empty sentence (i.e., a sentence consisting of no symbols).

Definition: A grammar is an ordered quadruple $(V_n, V_t, S, F)$ where $V_n$ and $V_t$ are disjoint sets of nonterminal and terminal symbols and $S \in V_n$ is called the start symbol. F is a set of ordered pairs (P, Q) such that $Q \in (V_n \cup V_t)^*$ and $P \in (V_n \cup V_t)^* V_n (V_n \cup V_t)^*$. Elements (P, Q) of F are called productions and are written P ➔ Q.

Based on the Chomsky hierarchy a grammar could be regular (type 3), context free (type 2), context sensitive (type 1) or phrase structure (type 0) (Hopcroft and Ullman 1969).

Definition: A language L(G) generated by grammar G is defined as:

$L(G) = \{X \mid X \in V_t^* \text{ and } S \stackrel{+}{\Rightarrow} X\}$

where $\stackrel{+}{\Rightarrow}$ means a derivation using one or more productions. Therefore, a language over an alphabet $V_t$ is simply the set of all sentences (words) derived from the start symbol by one or more applications of the productions. A language L(G) is of type "i", $0 \le i \le 3$, if G is of type "i".

Definition: A grammar is LR(k) if each sentence that it generates can be deterministically parsed in a single scan from left to right with at most k symbols of "look ahead." This means that each reduction needed for the parse must be detectable on the basis of the left context, the reducible phrase itself and the k terminal symbols to its right. In the following discussion we assume that k is equal to zero.

LR parsing is based on the fact that for each LR grammar only a finite number of states need to be distinguished to permit a successful parsing. Figure 3.1 depicts a context free grammar G which defines a set of arithmetic expressions.
Definition: A configuration or item of a grammar G is a production of G with a position marker (a dot) in its right hand side (e.g., A ➔ x.Yz).

Definition: The successor of a configuration A ➔ x.Yz (under Y) is A ➔ xY.z, where x and z are strings of terminals and nonterminals (possibly empty) and Y is a terminal or nonterminal symbol.

Definition: A state is defined as a set of items.

Definition: The initial state is defined by including the following items:

1) If S ➔ W is a production then S ➔ .W is in the initial state. W is the right hand side of the production:

$W \in \{(V_n \cup V_t)^* V_n (V_n \cup V_t)^*\} \cup \{(V_n \cup V_t)^* V_t (V_n \cup V_t)^*\}$

2) For all the items in the initial state the closure set of the items is also in the initial state. The procedure to find the closure set of an item is as follows:

Start Symbol (S): EXPRESSION

Nonterminal Symbols (Vn): {EXPRESSION, TERM, FACTOR}

Terminal Symbols (Vt): { + * ( ) N }

Productions (F):

<EXPRESSION> ➔ <EXPRESSION> + <TERM>
             | <TERM>
<TERM>       ➔ <TERM> * <FACTOR>
             | <FACTOR>
<FACTOR>     ➔ N
             | (<EXPRESSION>)

Figure 3.1. A GRAMMAR G
If I is a set of items (a state) then closure(I) is constructed by:

1) Every item in I is in closure(I); and,

2) If A ➔ x.Yz is in closure(I) and Y ➔ y is a production then Y ➔ .y will be in closure(I) (y can be defined as W above).

If A ➔ x.Yz is an item in the state P, a new state Q can be generated by the successor of A ➔ x.Yz (under Y) plus the closure of A ➔ xY.z.

Definition: A configuration is a reduce configuration if it is of the form A ➔ W. (the position marker is at the right end of the production). On the other hand, a shift configuration is of the form A ➔ xY.z with z non-empty.

In the LR parser each state receives a token; based on the token and current state, the parser will go to the next state. States are classified as shift states, reduce states and conflict states. A state consisting of only shift configurations is called a shift state. A state consisting of a single reduce configuration is called a reduce state. Other states are called conflict states. In a shift state an entry should be inserted into the parse stack, while in a reduce state the generated handle (right hand side of a production) on the top of the parse stack should be reduced. In a conflict state the parser cannot decide what action should be taken in order to continue to parse the sentence.
Figures 3.2 and 3.3 depict the LR parser of the grammar of Figure 3.1.

[Figure 3.2. LR PARSER of grammar of Figure 3.1 (item sets of states 1 through 12)]

[Figure 3.3. LR PARSER of Figure 3.2 (state transition diagram; reduce states: 4, 5, 11, 12; conflict states: 3, 10; shift states: 1, 2, 6, 7, 8, 9)]

Based on the above definitions, states 4, 5, 11, and 12 are reduce states, states 3 and 10 are conflict states, and the rest of the states are shift states.
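The closure and successor constructions defined above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the dissertation's PL/I implementation: the names `GRAMMAR`, `closure`, `successor_state`, `build_states` and `classify` are hypothetical, items are encoded as `(lhs, rhs, dot)` triples, and the numbering of the generated states need not match Figure 3.2. For the grammar of Figure 3.1 the sketch yields 12 states, of which 4 are reduce states, 2 are conflict states and 6 are shift states.

```python
from collections import deque

# Grammar of Figure 3.1: nonterminal -> list of right hand sides.
GRAMMAR = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("N",), ("(", "E", ")")],
}
START = "E"


def closure(items):
    """Add Y -> .y for every item whose marker stands before nonterminal Y."""
    result = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for prod in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], prod, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)


def successor_state(state, symbol):
    """Successors (under `symbol`) of all items in `state`, plus their closure."""
    moved = {(l, r, d + 1) for (l, r, d) in state if d < len(r) and r[d] == symbol}
    return closure(moved)


def build_states():
    """Generate every state reachable from the initial state."""
    initial = closure({(START, rhs, 0) for rhs in GRAMMAR[START]})
    states, seen, queue = [initial], {initial}, deque([initial])
    while queue:
        state = queue.popleft()
        for sym in sorted({r[d] for (_, r, d) in state if d < len(r)}):
            nxt = successor_state(state, sym)
            if nxt not in seen:
                seen.add(nxt)
                states.append(nxt)
                queue.append(nxt)
    return states


def classify(state):
    """Shift, reduce or conflict state, following the definitions in the text."""
    reduces = [item for item in state if item[2] == len(item[1])]
    if not reduces:
        return "shift"
    if len(state) == 1:
        return "reduce"
    return "conflict"


kinds = [classify(s) for s in build_states()]
print(len(kinds), kinds.count("shift"), kinds.count("reduce"), kinds.count("conflict"))
```

The two conflict states are the ones containing both a completed production and an item with the marker before `*`, which is why one symbol of look ahead is needed to resolve them.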

Informal Definition of ASL

The language ASL (an Associative Search Language) is a high level data base language designed for information retrieval and storage operations on data bases, using associative principles for basic operations. The language has been defined based on the relational data model of Codd (1970). The fundamental operation in ASL is a search on a data base with respect to search criteria (search arguments). The result is a relation yielding the retrieved information. This operation can be thought of as an assignment statement W = X, where W is a relation and X is composed of three parts,

HOW WHERE WHAT

"How" specifies the search arguments, "Where" specifies the relation name and "What" specifies the output domains of the relation. If R, C, and D stand for WHERE, HOW and WHAT respectively, we can say R is a binary transformation operation whose operands are C and D and can be expressed as:

C R D

C is a set of sets ($C = (C_1, \ldots, C_k)$) where each element in this set ($C_i$, $1 \le i \le k$) is an unordered list of search arguments. For i = 0, $C = \emptyset$ denotes an empty set of search arguments. D is also a set of sets ($D = (d_1, \ldots, d_k)$) where each element in this set ($d_i$, $1 \le i \le k$) is an unordered list of domains over R. When $d_i$ includes all the domains of R, it will be denoted by a. The result of operation R on C and D is a set of relations $(W_1, \ldots, W_k)$, where the domains of $W_i$ ($1 \le i \le k$) are the same as those of the $d_i$ ($1 \le i \le k$). In our implementation, we assume that C and D as well as W are singleton sets.
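The search operation C R D can be pictured with a small Python sketch for the singleton case. Everything below is an assumption made for illustration: the function name `search`, the encoding of tuples as dictionaries, and the sample tuples of two hypothetical relations `E` and `ED` are not taken from the dissertation.

```python
# Hypothetical sample relations; each tuple is a dict from domain name to value.
E = [
    {"ENO": 1, "ENAME": "ADAMS", "DEGREE": 4, "LOCATION": "L1"},
    {"ENO": 2, "ENAME": "BAKER", "DEGREE": 2, "LOCATION": "L1"},
    {"ENO": 3, "ENAME": "CLARK", "DEGREE": 5, "LOCATION": "L2"},
]
ED = [
    {"ENO": 1, "DNO": "D1", "HOUR": 20},
    {"ENO": 3, "DNO": "D1", "HOUR": 40},
    {"ENO": 2, "DNO": "D2", "HOUR": 40},
]


def search(how, where, what):
    """Evaluate C R D: keep the tuples of relation `where` (R) that satisfy the
    predicate `how` (C) and project them on the output domains `what` (D).
    how=None stands for the empty set of search arguments."""
    result = []
    for t in where:
        if how is None or how(t):
            row = {d: t[d] for d in what}
            if row not in result:  # a relation holds no duplicate tuples
                result.append(row)
    return result


# Qualified retrieval: employee numbers with LOCATION = L1 and DEGREE > 3.
print(search(lambda t: t["LOCATION"] == "L1" and t["DEGREE"] > 3, E, ["ENO"]))

# Nested retrieval: the result of one search becomes the argument of the next.
X = search(lambda t: t["DNO"] == "D1", ED, ["ENO"])
enos = {row["ENO"] for row in X}
W = search(lambda t: t["ENO"] in enos, E, ["ENAME"])
print(W)
```

The nested case shows why the result of a search is itself a relation: it can be fed directly into the next search as the search argument.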
The conceptual framework of ASL is very similar to that of SQUARE (Boyce et al. 1975), a data sublanguage which formed the basis for the development of SEQUEL. The similarities of these two languages are reflected in the definition of the basic operations, which brings forth explicitly the associative nature of the processing, and also in avoiding the use of quantifiers required by languages based on the relational calculus. Both ASL and SQUARE are relationally complete, provide facilities for query, insertion, deletion and update operations, and are meant for non-professional programmers who do not possess a high degree of mathematical sophistication. However, the originators of SQUARE did not emphasize the associative nature of the primitive operators, since the language was not considered for hardware implementation. Moreover, the statements in ASL are represented by arithmetic-expression-like entities called set expressions. This close resemblance enables us to implement a simple compiler or translator for the ASL statements in the same way a compiler treats arithmetic expressions. ASL is also based on variable size tuples with complete data independence.
Before giving the formal definition of the ASL language, we will show a few examples to illustrate ASL informally. The examples formulate the different queries in the ASL language with respect to the EMPLOYEE-DEPARTMENT data base as follows:

E (ENO, ENAME, DEGREE, LOCATION)
D (DNO, DNAME, HEAD, LOCATION)
ED (ENO, DNO, HOUR)

where ENO and DNO stand for employee and department number, ENAME and DNAME stand for employee and department name, and HOUR shows how many hours each employee works in each department. Each query will be expressed in the form of an assignment statement:

W := X
Retrieval Operations

a) Simple Retrieval

Example 1: Get department number of all departments:

W := ∅ Ⓓ DNO

b) Qualified Retrieval

Example 2: Get employee number for all employees in location "L1" and degree greater than three.

W := (LOCATION = L1 and DEGREE > 3) Ⓔ ENO

This example illustrates that "C" could be specified by a predicate. The output W is a relation whose tuples are some of the ENO values of the relation E.

c) Complex Retrieval

Example 3: Get employee name for employees who work in department D1.

W := ((DNO = D1) ⒺⒹ ENO) Ⓔ ENAME

The nested nature of operations has been illustrated by this example. First, the relation ED will be searched for all department numbers equal to D1, yielding a subrelation of ENO. This subrelation will then be used as an argument to obtain ENAME. In fact, this subrelation, which will be searched in associative fashion, will arrange the search argument for the second search. The operation can be seen as:

X = (DNO = D1) ⒺⒹ ENO
W = (ENO = X) Ⓔ ENAME

Example 4: Get employee number for employees who work in department managed by "SMITH."

W := ((HEAD = SMITH) Ⓓ DNO) ⒺⒹ ENO

Example 5: Get employee names for employees who work in department managed by "SMITH."

W := (((HEAD = SMITH) Ⓓ DNO) ⒺⒹ ENO) Ⓔ ENAME

Join Operation

Based on the definition thee join of a relation Ron domain Dr
with relation Son domain D is defined as:
s

R

[D

r

0

D ]

s

=

{Cr " s) I rcR /\ st:S /\ r(D r ) e s(D s )}

Where r(D) and s(D) are assumed to bee comparable.
r
s

Moreover,

in the ASL langauge we can consider retrievals that use several
constructional

operations · like

tuple-compatible

union

(U),

intersection (n), cartesian porduct (®), . . . , etc.
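The θ-join definition can be read directly as a nested loop over the two relations. The following Python sketch is illustrative only: the function name `theta_join`, the encoding of tuples as Python tuples with positional domains, and the sample data are assumptions, not part of ASL.

```python
import operator


def theta_join(r_rel, d_r, theta, d_s, s_rel):
    """R[D_r theta D_s]S: concatenate every pair (r, s) whose compared
    domain values satisfy the relation theta."""
    return [r + s for r in r_rel for s in s_rel if theta(r[d_r], s[d_s])]


# Hypothetical relations X(ENAME, ENO) and Y(ENO, HOUR).
X = [("ADAMS", 1), ("BAKER", 2)]
Y = [(1, 20), (2, 40), (1, 10)]

# Equi-join of X and Y on ENO (position 1 in X, position 0 in Y).
W = theta_join(X, 1, operator.eq, 0, Y)
print(W)
```

Passing `operator.lt`, `operator.ge` and so on in place of `operator.eq` gives the other comparison operators; θ-comparability corresponds here to the two compared values supporting that comparison.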

Storage Operations

Storage operations such as "update," "insert," and "delete" can be performed easily by accessing appropriate tuples from the appropriate relation followed by modification of information.

Formal Definition of ASL

Based on the Chomsky hierarchy, ASL is a context free language. For our implementation we have adopted the alphabet L as shown in Table 3.1. The alphabet symbols can be partitioned into three subsets: characters, digits and special characters. Tables 3.2 and 3.3 respectively show the sets of non-terminal and terminal symbols. Table 3.4 shows the complete set of the ASL productions in BNF form. In general, ASL productions can be partitioned into two classes, the "declaration part" and the "statement part," which are the description of the data which is manipulated and the description of the actions which should be performed on the data, respectively:

<PROGRAM> ::= <DECLARATION PART> <STATEMENT PART>

The actions are described in the STATEMENT PART, while the structure of the data, as well as the additional temporary locations, will be described in the DECLARATION PART. A query language differs from a conventional language in that: a) there exists a centralized data file on which all the operations are performed, and b) the declaration part of the language is short. These two points have been considered in the design of ASL.

Before getting into the discussion about the implementation of the ASL language, we would like to refer to our examples and present them formally in the ASL format.

Retrieval Operations

a) Simple Retrieval

TABLE 3.1.
ASL ALPHABETS

CHARACTER:
A, B, C, ..., X, Y, Z

DIGIT:
0, 1, 2, ..., 8, 9

SPECIAL
CHARACTER:
+

'

-

*

I,

(,),

U,=,,,.'

L

LI,&,<,>,

-

•

,' '

II

TABLE 3.2.
SET OF NON-TERMINAL SYMBOLS

Nonterminal Symbol      Nonterminal Symbol
Additive Op             Relational OP
Add Relation            Relation ID
Boolean                 Relation OP
Cartesian               Relation
CH List                 Restriction
CH Relation             RE List
DCL List                Search Set
Domain-Type List        Set Expression
Domain-Value List       Simple Expression
Domain Types            String
Domain Values           Term
Domains                 Type
D Relation              T Relation
Expression              Relation
Factor                  Multiplicative OP
Integer                 Output Set
Join                    Program
Length                  P Relation
Log-Set Op              Q Relation

TABLE 3.3.
SET OF TERMINAL SYMBOLS

Terminal Symbols (words):
Boolean, EQ, GE, GT, Integer, LE, LT, NE, String, EOF, True, False

Terminal Symbols (special characters):
$  =  #  %  <  +  *  /  (  )

TABLE 3.4.
ASL PRODUCTIONS
The Formal Set of the ASL Productions in the BNF Format:

<ADDITIVE OP> ::= +
                | -

<ADD RELATION> ::= <RELATION ID> U <DOMAIN-VALUE LIST>
                 | <RELATION ID> U <DOMAIN-TYPE LIST>

<CARTESIAN> ::= (<RELATION ID>) (<RELATION ID>)

<CH RELATION> ::= <SEARCH SET> <RELATION OP> <DOMAIN-VALUE LIST>
                | <SET EXPRESSION> <RELATION OP> <DOMAIN-VALUE LIST>

<DCL LIST> ::= empty
             | <DCL LIST> <RELATION ID>;

<DOMAIN-TYPE LIST> ::= <DOMAIN TYPE>
                     | <DOMAIN-TYPE LIST>, <DOMAIN TYPE>

<DOMAIN-VALUE LIST> ::= <DOMAIN VALUE>
                      | <DOMAIN-VALUE LIST>, <DOMAIN VALUE>

<DOMAIN TYPE> ::= <DOMAIN>, <LENGTH>, <TYPE>

<DOMAIN VALUE> ::= <DOMAIN> = <EXPRESSION>

<DOMAIN> ::= <CH LIST>

<D RELATION> ::= <RELATION ID> ¬ <SET EXPRESSION>
               | <RELATION ID> ¬

<EXPRESSION> ::= <SIMPLE EXPRESSION> <RELATIONAL OP> <SIMPLE EXPRESSION>
               | <SIMPLE EXPRESSION>

<FACTOR> ::= (<EXPRESSION>)
           | ¬ <FACTOR>
           | <INTEGER>
           | <BOOLEAN>
           | <STRING>
           | <DOMAIN>

<JOIN> ::= (<RELATION ID>) <DOMAIN> <RELATIONAL OP> <DOMAIN> (<RELATION ID>)

<LENGTH> ::= <INTEGER>

<LOG-SET OP> ::= &
               | |

<MULTIPLICATIVE OP> ::= *
                      | /

<OUTPUT SET> ::= <DOMAIN>
               | <OUTPUT SET>, <DOMAIN>

<PROGRAM> ::= <DCL LIST> <RE LIST>

<Q RELATION> ::= <SET EXPRESSION>
               | <CARTESIAN>
               | <JOIN>
               | <RESTRICTION>
               | <RELATION ID> <LOG-SET OP> <RELATION ID>

<RE LIST> ::= <RE LIST> <RELATION>;
            | <RELATION>;

<RELATIONAL OP> ::= LT
                  | GT
                  | LE
                  | GE
                  | NE
                  | EQ

<RELATION ID> ::= <CH LIST>

<RELATION OP> ::= [<RELATION ID>]

<RELATION> ::= <T RELATION>
             | <UP RELATION>
             | <Q RELATION>

<RESTRICTION> ::= <RELATION ID> <OUTPUT SET>

<SEARCH SET> ::= empty
               | <EXPRESSION>
               | <SEARCH SET> <LOG-SET OP> <EXPRESSION>

<SET EXPRESSION> ::= <SEARCH SET> <RELATION OP> <OUTPUT SET>
                   | <SET EXPRESSION> <RELATION OP> <OUTPUT SET>
                   | <RELATION ID>

<SIMPLE EXPRESSION> ::= <SIMPLE EXPRESSION> <ADDITIVE OP> <TERM>
                      | <TERM>

<TERM> ::= <TERM> <MULTIPLICATIVE OP> <FACTOR>
         | <FACTOR>

<TYPE> ::= <BOOLEAN>
         | <INTEGER>
         | <STRING>

<T RELATION> ::= <RELATION ID> = <Q RELATION>
               | <RELATION ID> = <UP RELATION>

<UP RELATION> ::= <CH RELATION>
                | <ADD RELATION>
                | <D RELATION>

Example 1: Get department number of all departments:

D;
W = [D] DNO;

"D" declares the name of the relation; the set of productions in Table 3.4 shows that each program will start with the declaration of all the relations used in the program. As can be seen, the declaration part of ASL is as simple as possible. This declaration part will be used by the compiler for symbol table creation.

b) Qualified Retrieval

Example 2: Get employee numbers for all employees in "location L1" with "degree > 3";

E;
W = Location EQ "L1" ∧ Degree GT "3" [E] ENO;

c) Complex Retrieval

Example 3: Get employee names for employees who work in department D1;

ED;
E;
W = (DNO EQ "D1" [ED] ENO) [E] ENAME;

The right hand side of an assignment statement will be called a set expression. A set expression, like an arithmetic expression, can be written in Polish notation as follows:

W DNO "D1" EQ ENO ED ENAME E =

which defines the same relation as W above. The Polish string representation avoids redundant use of parentheses, and also has advantages with respect to the compilation and the design of the hardware controller that implements the language.
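The translation of a set expression into the Polish string shown for Example 3 can be sketched as a post-order walk over a nested representation of the statement. The tuple encoding (`"cmp"`, `"set"`, `"assign"`) and the function name `to_polish` are hypothetical choices made for this illustration; only the resulting token order is taken from the example above.

```python
def to_polish(node):
    """Return the Polish (postfix) token list of a nested ASL statement."""
    kind = node[0]
    if kind == "cmp":      # ("cmp", domain, relational_op, value)
        _, domain, op, value = node
        return [domain, value, op]
    if kind == "set":      # ("set", search_set, relation, output_domain)
        _, search_set, relation, output = node
        # Search set first, then the output domain, then the relation operator.
        return to_polish(search_set) + [output, relation]
    if kind == "assign":   # ("assign", target, set_expression)
        _, target, expr = node
        return [target] + to_polish(expr) + ["="]
    raise ValueError(f"unknown node kind: {kind}")


# W = (DNO EQ "D1" [ED] ENO) [E] ENAME;
example3 = ("assign", "W",
            ("set",
             ("set", ("cmp", "DNO", "EQ", '"D1"'), "ED", "ENO"),
             "E", "ENAME"))

print(" ".join(to_polish(example3)))
```

Because every operator follows its operands, a controller can evaluate such a string with a single stack and no parentheses.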
Example 4: Get employee number for employees who work in department managed by "SMITH":

D;
ED;
W = ((HEAD EQ "SMITH") [D] DNO) [ED] ENO

Example 5: Get employee names for employees who work in department managed by "SMITH":

D;
ED;
E;
W = ((HEAD EQ "SMITH" [D] DNO) [ED] ENO) [E] ENAME

In this case the parsing and computation will proceed as:

A = HEAD EQ "SMITH" [D] DNO;
B = A [ED] ENO;
W = B [E] ENAME;

For the following examples, more than one statement will be used. Although our language is relationally complete, it does not satisfy the single statement requirement. The data sublanguages using relational calculus satisfy the single sentence constraint, but they use existential and universal quantifiers which do not seem to be easily understood by even reasonably sophisticated users. Moreover, our motivation also has roots in the hardware implementation. The multi-statement feature of ASL provides a simple interface design for translating queries into hardware control signals.

Example 6: Get employee names for employees who do not work in department D1,

ED;
E;
X = DNO NE "D1" [ED] DNO;
W = X [E] ENAME;

d) Join Operation

Example 7: Get a list of all employees with the amount of hours they work for each department:

E;
ED;
X = [E] ENAME, ENO;
Y = [ED] ENO, HOUR;
W = (X) ENO = ENO (Y);

The last statement is a join operation on X and Y.

Example 8: Get all employee and department name pairs, so that the indicated employee and department are located in the same place:

E;
D;
X = [E] ENAME, LOCATION;
Y = [D] DNAME, LOCATION;
Z = (X) LOCATION = LOCATION (Y);
W = [Z] ENAME, DNAME;

e) Storage Operations

Example 9: Change the location of department D1 to "New York":

D;
DNO EQ "D1" [D] LOCATION = "NEW YORK";

Example 10: Delete all the employees whose degrees are less than 4;

E;
E ¬ (DEGREE LE "4" [E]);

Example 11: Delete relation E

E;
E ¬

Example 12: Add new employee (ENO = 123, ENAME = JOE, DEGREE = 5, LOCATION = ROME) to the employee relation;

E;
E = E U ENO = "123", ENAME = "JOE", DEGREE = "5", LOCATION = "ROME";
The ASL Interpreter

Some Definitions

Definition: A translator is a program which translates a source program into an equivalent object program. If the source program is in a high level language and the object program is in machine language, the translator is called a compiler. In contrast, an interpreter accepts a source program as input and executes it. An interpreter will accomplish its job in two phases:

1) it analyzes the source program in much the same way as a compiler does, and translates it into an intermediate form; and,

2) it executes the intermediate form generated in 1).

The intermediate form is designed to minimize the time needed to "decode" or analyze each statement in order to execute it. In contrast to a compiler, an interpreter will analyze a source program statement each time it is to be executed in order to discover how to perform the execution. This is the disadvantage of the interpreter, while readability and transportability are its advantages over the compiler.

A compiler must perform an analysis of the source program and then a synthesis of the object program. First it should decompose the source program into its basic parts, then build the object program parts from them. Figures 3.4 and 3.5 show the different modules of a compiler and an interpreter. Translators can be grouped into two classes: one-pass translators and multi-pass translators. In a one-pass translator all the operations of the translator will be accomplished by scanning just once over the source program, while in a multi-pass translator the source program will be scanned more than once.

A compiler is composed of the following modules:

1) Scanner

2) Syntax analyzer

3) Semantic analyzer

4) Intermediate form generation

5) Code generation

[Figure 3.4. Different modules of compiler (source program, scanner, syntax and semantic analyzers, symbol table, intermediate form of source program, preparation for code generation, code generation, object program)]

[Figure 3.5. Different modules of interpreter (Phase 1: source program, scanner, syntax and semantic analyzers, symbol table, source in intermediate form; Phase 2: intermediate form execution on data, results)]

Before any discussion about the ASL implementation, each of the above terms along with its function will be explained briefly.

SYMBOL TABLE: During the translation of most programming languages it is necessary to associate each occurrence of an identifier with its collected attributes. This is fulfilled by means of a symbol table, a directory which holds relevant information about all active identifiers encountered in the source program.

SCANNER: The function of the scanner is twofold: scanning and screening. Scanning involves finding substrings of characters that constitute units called textual elements. The textual elements are classified as words, punctuation, comments and single and multi-character operators. In its simplest form a scanner finds substrings and classifies each substring into one of the above classes. Screening involves discarding some textual elements, such as spaces and comments, which are not a part of the formal definition of the language. Therefore, the scanner passes over the input program and skips over spaces and comments to recognize the reserved symbols, such as the key words and operators, used in the particular language being translated. The output of this process is usually called a token stream.
SYNTAX ANALYZER (PARSER): The parser has two functions. It determines that the tokens appearing in its input (the output of the scanner) occur in a pattern that is permitted by the specification of the source language. It also imposes on the tokens a tree-like structure that is used by the subsequent phases of the compiler. Figure 3.6 shows the intercommunication paths between the parser and other modules of the compiler in a one-pass compiler.

[Figure 3.6. Relationship between different modules of one-pass interpreter (the parser calls the scanner for tokens, the semantic analyzer and intermediate form construction, and the interpreter for execution; each returns to the parser)]

SEMANTIC ANALYZER: The purpose of the semantic analyzer is to derive an evaluation procedure from the structure of an expression and the attributes of its components. The semantic analyzer must deduce the attributes of the various components of a structure, ensure that they are compatible, and then select the proper evaluation procedure from those available. The input to the semantic analyzer consists of the structure-tree (syntax-tree) and the dictionary which provides attribute information.

INTERMEDIATE FORM: In many compilers the source code is translated into a language which is intermediate in complexity between the programming language and the machine code. In general, the intermediate form is easier to handle mechanically, and operators appear in the sequence in which they should be executed.

INTERPRETER: The interpreter reads the intermediate form and executes it on the data. The output of this module is the result of the program.

ASL Implementation

A one-pass interpreter for a subset of ASL (Table 3.5) has been implemented in PL/I; it is capable of executing the ASL queries on a conventional computer. The only restriction imposed by the one-pass interpreter is in the definition of a domain name, where each domain name should be preceded by its relation name. As shown in Figure 3.6, the parser is the procedure which controls all the modules in the program.

TABLE 3.5.
SET OF IMPLEMENTED ASL PRODUCTIONS IN BNF FORMAT

<PROGRAM> ::= <DEC LIST> <RE LIST>

<DEC LIST> ::= EMPTY
             | <DEC LIST> <RELATION ID>;

<RE LIST> ::= <RELATION>;
            | <RE LIST> <RELATION>;

<RELATION> ::= <T RELATION>

<T RELATION> ::= <RELATION ID> = <Q RELATION>
               | <RELATION ID> = <UP RELATION>

<UP RELATION> ::= <CH RELATION>
                | <ADD RELATION>
                | <D RELATION>

<Q RELATION> ::= <SET EXPRESSION>
               | <CARTESIAN>
               | <JOIN>
               | <RESTRICTION>
               | <RELATION ID> <LOG-SET OP> <RELATION ID>

<CH RELATION> ::= <SEARCH SET> <RELATION OP> <DOMAIN-VALUE LIST>
                | <<SET EXPRESSION>> <RELATION OP> <DOMAIN-VALUE LIST>

<ADD RELATION> ::= <RELATION ID> U <DOMAIN-TYPE LIST>
                 | <RELATION ID> U <DOMAIN-VALUE LIST>

<D RELATION> ::= <RELATION ID> ¬ <SET EXPRESSION>
               | <RELATION ID> ¬

<SET EXPRESSION> ::= <SEARCH SET> <RELATION OP> <OUTPUT SET>
                   | <<SET EXPRESSION>> <RELATION OP> <OUTPUT SET>

<CARTESIAN> ::= (<RELATION ID>) (<RELATION ID>)

<JOIN> ::= (<RELATION ID>) <DOMAIN> <RELATIONAL OP> <DOMAIN> (<RELATION ID>)

<RESTRICTION> ::= <RELATION ID> <OUTPUT SET>

<SEARCH SET> ::= empty
               | <EXPRESSION>
               | <<SEARCH SET>> <LOG-SET OP> <EXPRESSION>

<OUTPUT SET> ::= <DOMAIN>
               | <OUTPUT SET>, <DOMAIN>

<DOMAIN-VALUE LIST> ::= <DOMAIN VALUE>
                      | <DOMAIN-VALUE LIST>, <DOMAIN VALUE>

<DOMAIN-TYPE LIST> ::= <DOMAIN TYPE>
                     | <DOMAIN-TYPE LIST>, <DOMAIN TYPE>

<DOMAIN VALUE> ::= <DOMAIN> = <EXPRESSION>

<DOMAIN TYPE> ::= <DOMAIN>, <LENGTH>, <TYPE>

<DOMAIN> ::= <CH LIST> . <CH LIST>

<EXPRESSION> ::= <SIMPLE EXPRESSION> <RELATIONAL OP> <SIMPLE EXPRESSION>
               | <SIMPLE EXPRESSION>

<SIMPLE EXPRESSION> ::= <SIMPLE EXPRESSION> <ADDITIVE OP> <TERM>
                      | <TERM>

<TERM> ::= <TERM> <MULTIPLICATIVE OP> <FACTOR>
         | <FACTOR>

<FACTOR> ::= (<EXPRESSION>)
           | ¬ <FACTOR>
           | <INTEGER>
           | <BOOLEAN>
           | <STRING>
           | <DOMAIN>

<RELATION ID> ::= <CH LIST>

<RELATION OP> ::= [<RELATION ID>]

<TYPE> ::= <BOOLEAN>
         | <INTEGER>
         | <STRING>

<LENGTH> ::= INTEGER

<ADDITIVE OP> ::= +
                | -

<MULTIPLICATIVE OP> ::= *
                      | /

<LOG-SET OP> ::= &
               | |

<RELATIONAL OP> ::= LT
                  | GT
                  | LE
                  | GE
                  | NE
                  | EQ

a) Scanner: The scanner will be called by the parser whenever a new token is needed. The output of the scanner is a token with its associated internal code. Tokens are partitioned into five classes:

Class 1: Integers; any string of digits is a valid integer.

<INTEGER> ::= <INTEGER> <DIGIT>
            | <DIGIT>
<DIGIT> ::= 0 | 1 | ... | 9

Class 2: Identifiers, which are partitioned into variables and reserved words.

Variable: any string of characters and digits started with a character is a valid variable name.

<CH LIST> ::= <LETTER>
            | <CH LIST> <LETTER>
            | <CH LIST> <DIGIT>
<LETTER> ::= A | B | ... | Z

Reserved words: Table 3.6 shows the set of reserved words and their associated internal codes.

Class 3: Delimiters; a list of delimiters with their corresponding internal codes is shown in Table 3.7.

TABLE 3.6.
LIST OF RESERVE-WORDS

Reserved Word    Code
Boolean          1
EQ               2
GE               3
GT               4
INTEGER          5
LE               6
LT               7
NE               8
String           9
EOF              27
True             71
False            72

TABLE 3.7.
LIST OF DELIMITERS

Delimiter    Code
+            10
             11
*            12
/            13
(            14
)            15
$            16
=            17
[            18
             19
%            20
             22
&            23
<            26
>            30
¬            31
             32

Class 4: Strings; any string of the ASL alphabet (except ") surrounded by " " is a valid string.

Class 5: Comments; any string of characters embedded in "(*", "*)" will be treated as a comment.

b) Symbol Table: ASL has two distinguished features. First, users are not aware of the physical structure of relations, and there is no declaration part for identifiers. The users should know the names of the domains which they are authorized to manipulate. Second, the symbol table will be generated at the beginning of each query, for the relations which users are authorized to access.
In our implemented interpreter, a symbol table has the following structure. A one dimensional array of m elements is used; each relation has an entry in this array, and each element (relation) has the following information:

VALIDITY BIT: a tag bit

RELATION-NAME: the name of the relation

TYPE: a tag bit which shows whether a relation is a temporary (0) or permanent (1) relation

#FIELDS: number of domains in the relation

NMONIC: the internal name of the relation

LINK: for each relation this is a pointer to a linked list with the following structure:

    DOMAIN-NAME: name of a domain

    DOMAIN-LENGTH: maximum length of the domain

    DOMAIN-TYPE: an integer specifying String (2), Integer (1) or Boolean (0)

    SEQUENCE#: the sequence of the domain in the relation

    LINK: a pointer to the next element

Figure 3.7 shows the internal structure of the symbol table.
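The symbol table layout described above, an array of relation entries each pointing to a linked list of domain descriptors, can be sketched with Python dataclasses. The class and function names (`DomainNode`, `RelationEntry`, `make_entry`, `find_domain`), the sample domain lengths and the mnemonic value are hypothetical; only the field list follows the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DomainNode:
    name: str                     # DOMAIN-NAME
    length: int                   # DOMAIN-LENGTH (maximum length)
    dtype: int                    # DOMAIN-TYPE: 2 string, 1 integer, 0 boolean
    sequence: int                 # SEQUENCE#: position of the domain in the relation
    link: Optional["DomainNode"] = None   # LINK to the next domain


@dataclass
class RelationEntry:
    valid: bool                   # VALIDITY BIT
    name: str                     # RELATION-NAME
    permanent: bool               # TYPE: temporary (False) or permanent (True)
    n_fields: int                 # #FIELDS
    mnemonic: str                 # NMONIC: internal name of the relation
    link: Optional[DomainNode] = None     # LINK to the first domain


def make_entry(name: str, permanent: bool, mnemonic: str,
               domains: List[Tuple[str, int, int]]) -> RelationEntry:
    """Build one array element together with its linked list of domains."""
    head = None
    for seq in range(len(domains), 0, -1):      # build the list back to front
        dname, length, dtype = domains[seq - 1]
        head = DomainNode(dname, length, dtype, seq, head)
    return RelationEntry(True, name, permanent, len(domains), mnemonic, head)


def find_domain(table: List[RelationEntry], relation: str, domain: str):
    """Follow the LINK chain of `relation` until `domain` is found."""
    for entry in table:
        if entry.valid and entry.name == relation:
            node = entry.link
            while node is not None:
                if node.name == domain:
                    return node
                node = node.link
    return None


# Hypothetical entry for relation E (ENO, ENAME, DEGREE, LOCATION).
table = [make_entry("E", True, "R0001",
                    [("ENO", 5, 1), ("ENAME", 20, 2), ("DEGREE", 2, 1), ("LOCATION", 10, 2)])]
print(find_domain(table, "E", "ENAME"))
```

A lookup by relation name followed by a walk along the domain list mirrors the requirement that each domain name be preceded by its relation name.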
c)

Parser:

In

our

implementation

the

Parser

is the main

driver of the program which uses three subroutines SCANNER,
SHI FT-RT and REDUCE-RT.

As mentioned before scanner wi 11

be called whenever a token should be parsed.

Based on the

token and current state, parser will go to a new state.

If

the new state is a SHIFT-STATE or REDUCE-STATE one of the
SHIFT-RT or REDUCE-RT wll be called.
LR(l) PARSER:

In order to be able to check the cor-

rectness of the syntax of the user's query we use an LR(l)
parser.
1)

The

The LR parsing technique was chosen since;

existence

of

computationally

feasible

constructive

algorithms.

These algorithms can construct a parsing table

(transition

table)

for

any

given grammar and determine

whether or not it contains any inadequate states (conflict
states).
2)

Any grammar in the LR class is a deterministic contextfree, so ambiguous grammar can be rejected.

Figure 3-7   Data structure of the symbol-table
[Each array entry holds RELATION NAME (10), KIND (1), #FIELDS (3), NMONIC (5) and LINK; LINK leads to a chain of nodes, each holding DOMAIN NAME, DOMAIN LENGTH, DOMAIN TYPE and LINK.]

3)  Each LR parser can be guaranteed to correctly parse every
correct sentence in its language and to detect an error in
any incorrect sentence. The parser will detect errors at
the first possible point.
For more efficiency the parse table has been reorganized into three
tables as follows:
PRODUCTION TABLE:  For each production there exists an entry in
this table, which shows the left-hand side symbol and the number
of terminal and non-terminal symbols in the right-hand side.
STATE TABLE:  This is an encoding of the output of the LR(1) parser.
Progress of the parse can be achieved by this table. In each state,
based on the token, this table shows what kind of action should be
taken by the parser on the PARSE-STACK (e.g. SHIFT or POP). If a SHIFT
operation should be taken, the entry also shows what the next state
is. In case of REDUCTION (POP) it shows on which production the
reduction should be based.
The sequence of operations is as follows:
Based on the current state, the STATE-TABLE will be accessed, which
gives the boundaries of the current state in the PARSE-TABLE. Within
the defined boundaries the PARSE-TABLE will be searched, based on the
token. In case of a match, if the action is SHIFT, a new entry will be
inserted into the PARSE-STACK; if the action is REDUCE, the production
number will enable us to have access to the PRODUCTION-TABLE, which
dictates how many entries should be removed from the PARSE-STACK.
Figure 3.8 shows a flowchart of the sequence of operations.
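The table lookups described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the interpreter's actual tables: the toy grammar (E ::= E + n, E ::= n), the table contents, the tuple layout of the entries, and the use of GOTO and ACCEPT entries inside the PARSE-TABLE are all hypothetical.

```python
# PRODUCTION-TABLE: production number -> (left-hand side, length of right-hand side)
PRODUCTION_TABLE = {1: ("E", 3),    # E ::= E + n
                    2: ("E", 1)}    # E ::= n

# PARSE-TABLE: flat list of (symbol, action, argument) entries.
PARSE_TABLE = [
    ("n", "SHIFT", 2), ("E", "GOTO", 1),        # state 0
    ("+", "SHIFT", 3), ("$", "ACCEPT", 0),      # state 1
    ("+", "REDUCE", 2), ("$", "REDUCE", 2),     # state 2
    ("n", "SHIFT", 4),                          # state 3
    ("+", "REDUCE", 1), ("$", "REDUCE", 1),     # state 4
]

# STATE-TABLE: state -> boundaries (start, end) of that state in the PARSE-TABLE.
STATE_TABLE = {0: (0, 2), 1: (2, 4), 2: (4, 6), 3: (6, 7), 4: (7, 9)}


def lookup(state, symbol):
    """Search the PARSE-TABLE inside the boundaries given by the STATE-TABLE."""
    start, end = STATE_TABLE[state]
    for entry_symbol, action, argument in PARSE_TABLE[start:end]:
        if entry_symbol == symbol:
            return action, argument
    return "ERROR", None


def parse(tokens):
    """Return the list of production numbers used, in order of reduction."""
    tokens = list(tokens) + ["$"]        # "$" marks the end of the input
    stack, position, reductions = [0], 0, []
    while True:
        action, argument = lookup(stack[-1], tokens[position])
        if action == "SHIFT":
            stack.append(argument)       # push the new state
            position += 1
        elif action == "REDUCE":
            lhs, rhs_length = PRODUCTION_TABLE[argument]
            if rhs_length:
                del stack[-rhs_length:]  # pop one entry per right-hand-side symbol
            _, next_state = lookup(stack[-1], lhs)
            stack.append(next_state)
            reductions.append(argument)
        elif action == "ACCEPT":
            return reductions
        else:
            raise SyntaxError(f"unexpected token {tokens[position]!r}")
```

For the input `n + n`, the sketch reduces by production 2 and then by production 1; an input such as `n n` is rejected at the second token, the first point at which the error can be detected.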

Figure 3.8   Sequence of steps
[Flowchart boxes: call SCANNER; based on the current state access the STATE TABLE; search through the PARSE TABLE based on the token; ERROR; insert an entry to the PARSE STACK; access the PRODUCTION TABLE; pop the PARSE STACK based on the right-hand side.]

SHIFT-RT: in this routine a new entry is simply pushed onto the
parse-stack. This new entry is composed of the token, its
internal-code and the new state. After that, control returns to the
PARSER.
REDUCE-RT will be called whenever a reduction must be performed.
This means that a handle (i.e. the right-hand side of a production) has
been generated on the top of the PARSE-STACK, which should be reduced
to its left-hand side symbol. Dealing with a one-pass interpreter
forces us to perform the semantic analysis whenever a reduction is
performed. This procedure is by itself a group of procedures for
checking the semantics.
d)  Semantic Analyzer:  Semantic analysis will be accomplished
by defining a set of attributes for the non-terminal symbols and a
set of evaluation rules for each production. For each syntactically
correct program a derivation tree will be defined by the syntax
analyzer; during the parsing the evaluation rules associated with a
given production are applied for all instances of the production in
the derivation tree. Attributes can be of two kinds: the inherited
attributes, whose values are obtained from the immediate parent node,
and the synthesized attributes, whose values are obtained from the
immediate descendants in the tree. The inherited attributes of the
left side of a production and the synthesized attributes of the right
side represent values obtained from the surrounding nodes in the
derivation tree. The evaluation rules of a production specify the
computation of the attributes, that is, the inherited attributes of
the right hand side and the synthesized attributes of the left hand
side of the production. In our implementation attributes are all
synthesized, and since we are using a one-pass interpreter, semantic
routines (evaluation rules) will be applied whenever a reduction has
taken place during the parsing. Table 3.8 shows the set of attributes
for each non-terminal symbol and Appendix I shows the set of
evaluation rules for each production of Table 3.5.
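A small sketch may help to show how a synthesized attribute is computed at reduction time. The attribute names VALUE and TYPE follow Table 3.8; the rule itself, the attribute stack, and the function names are hypothetical and do not reproduce the evaluation rules of Appendix I.

```python
def reduce_additive(children):
    """Hypothetical rule for <SIMPLE EXPRESSION> <ADDITIVE OP> <TERM>."""
    left, op, right = children
    # Semantic check: both operands must carry the synthesized TYPE "INTEGER".
    if left["TYPE"] != "INTEGER" or right["TYPE"] != "INTEGER":
        raise TypeError("additive operator needs integer operands")
    if op["VALUE"] == "+":
        value = left["VALUE"] + right["VALUE"]
    else:
        value = left["VALUE"] - right["VALUE"]
    # The attributes of the left-hand side are synthesized from the children.
    return {"VALUE": value, "TYPE": "INTEGER"}


def on_reduce(attribute_stack, rhs_length, rule):
    """Apply an evaluation rule when the parser performs a reduction."""
    children = attribute_stack[-rhs_length:]
    del attribute_stack[-rhs_length:]
    attribute_stack.append(rule(children))


stack = [{"VALUE": 2, "TYPE": "INTEGER"},
         {"VALUE": "+", "TYPE": "OP"},
         {"VALUE": 3, "TYPE": "INTEGER"}]
on_reduce(stack, 3, reduce_additive)
```

Because every attribute is synthesized, a single stack that grows and shrinks with the parse stack is sufficient, which is what makes the one-pass organization possible.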
e)  Intermediate Form:  Because of the semantic gap between the
source program and the hardware, in many compilers the source program
is translated into an intermediate form which is intermediate in
complexity between a high level programming language and machine
code. This intermediate form can be translated into the machine
language easier, since it is hardware oriented. Each statement in
intermediate form involves at most one operation, which is similar to
the statements in a machine language. Intermediate forms are capable
of optimization, thus increasing the efficiency of the system. In the
intermediate form the source program will be reduced; for example, no
intermediate code will be generated for the declaration part of a
program. In the intermediate form the operations should appear in
the same sequence as they should be executed, and because of this
there is no need for parentheses in the arithmetic expressions.
Intermediate form also simplifies the design of the interpreter since
the sequence of the operations in the intermediate form is the same
as the execution steps. This enables the designers to have more
knowledge about the operations and as a result a better knowledge for
designing an efficient algorithm.


TABLE 3.8.
SET OF NON-TERMINAL SYMBOLS AND ASSOCIATED ATTRIBUTES

NONTERMINAL SYMBOL
<ADDITIVE OP>
<ADD RELATION>
<BOOLEAN>
<CARTESIAN>
<CH LIST>
<DEC LIST>
<DOMAIN-TYPE LIST>
<DOMAIN-VALUE LIST>
<DOMAIN TYPE>
<DOMAIN VALUE>
<DOMAIN>
<D RELATION>
<EXPRESSION>
<FACTOR>
<INTEGER>
<JOIN>
<LENGTH>
<LOG-SET OP>
<MULTIPLICATIVE OP>
<OUTPUT SET>
<PROGRAM>
<Q RELATION>

VALUE

TYPE

*
*

*

*
*
*
*
*

*
*
*

*
*
*
*
*

*


TABLE 3.8. -- Continued

NONTERMINAL SYMBOL

<RELATIONAL OP>
<RELATION OP>

VALUE

TYPE

*
*

<RELATION>
<RESTRICTION>
<RE LIST>
<SEARCH SET>
<SET EXPRESSION>

*
*
*
*
*
*

<SIMPLE EXPRESSION>
<STRING>
<TERM>
<TYPE>
<T RELATION>
<UP RELATION>

*

Among the different techniques which are used for generating
intermediate form, such as polish notation, trees, quadruples and
triples, we have chosen the polish notation format, because:
a)  it can be handled by a stack; and,
b)  the format of the ASL instructions is suitable for polish
notation.
Although the other forms such as quadruples and triples can be
optimized better than the polish notation, in case of a query language
optimization is not a big factor, since optimization is meaningful for
large programs.
Polish notation is suitable to represent arithmetic and logical
expressions. As mentioned before, there is a close similarity between
ASL instructions and arithmetic expressions. Therefore, polish
notation for an arithmetic expression can be implemented for ASL
instructions. The following examples show the polish format of an
arithmetic expression and the polish format of an ASL instruction:

ABCD/+*
the polish format for
A * (B + C/D)

CITY ORLANDO EQ EDUCATION 4 LE A SALARY NAME E
the polish format for
<CITY EQ "ORLANDO" > A EDUCATION LE "4" [E] SALARY, NAME:

In our interpreter, intermediate form is generated along the semantic
analysis, meaning that if some productions satisfy the
associated semantic rules, an intermediate form in the polish format

TABLE 3.9.
SET OF PRODUCTIONS FOR WHICH INTERMEDIATE FORM WILL BE GENERATED

<ADD RELATION>       ::= <RELATION ID> U <DOMAIN-TYPE LIST>
<ADD RELATION>       ::= <RELATION ID> U <DOMAIN-VALUE LIST>
<CARTESIAN>          ::= (<RELATION ID>) (<RELATION ID>)
<CH RELATION>        ::= <SEARCH SET> <RELATION OP> <DOMAIN-VALUE LIST>
<DOMAIN VALUE>       ::= <DOMAIN> = <EXPRESSION>
<DOMAIN>             ::= <CH LIST> . <CH LIST>
<D RELATION>         ::= <RELATION ID> -
<EXPRESSION>         ::= <SIMPLE EXPRESSION> <RELATIONAL OP> <SIMPLE EXPRESSION>
<FACTOR>             ::= ¬ <FACTOR>
<FACTOR>             ::= <INTEGER>
<FACTOR>             ::= <BOOLEAN>
<FACTOR>             ::= <STRING>
<JOIN>               ::= (<RELATION ID>) <DOMAIN> <RELATIONAL OP> <DOMAIN> (<RELATION ID>)
<RESTRICTION>        ::= <RELATION ID> <OUTPUT SET>
<SEARCH SET>         ::= <SEARCH SET> <LOG-SET OP> <EXPRESSION>
<SET EXPRESSION>     ::= <SEARCH SET> <RELATIONAL OP> <OUTPUT SET>
<SIMPLE EXPRESSION>  ::= <SIMPLE EXPRESSION> <ADDITIVE OP> <TERM>
<TERM>               ::= <TERM> <MULTIPLICATIVE OP> <FACTOR>


TABLE 3.10.
INTERMEDIATE OPERATIONS

OPERATION             CODE      OPERATION       CODE
AND                     1       MINUS            15
CARTESIAN               2       MULTIPLY         16
CHANGE                  3       NEGATION         17
DELETE                  4       NOT EQUAL        18
DIVIDE                  5       NOT              19
EQUAL                   6       OR               20
EQUALITY                7       PLUS             21
GREATER THAN EQUAL      8       RESTRICTION      22
GT                     11       SEARCH - T       23
JOIN                   12       SEARCH - P       24
LESS THAN EQUAL        13       UNION 1          25
LESS THAN              13       UNION 2          26

1)  CODE:  a digit which defines the type of the entry. Each entry
can be of type operand or operator. Each entry specified as an
operator will be preceded by a variable number of operands. The
number of operands for each operator is a function of the operator.
2)  VALUE1:  holds a digit.
3)  VALUE2:  holds a digit.
4)  VALUE3:  holds a character string.
The function of VALUE1, VALUE2 and VALUE3 will be clarified in
the discussion of each production. In general, for each entry these
domains carry appropriate information needed in later steps.
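A minimal sketch of such an entry is given below. The four fields follow the list above and the operation code for CARTESIAN is taken from Table 3.10; the encoding of the CODE digit (0 for operand, 1 for operator), the choice of which VALUE field carries which datum, and the relation names are assumptions made only for this illustration.

```python
from dataclasses import dataclass

OPERAND, OPERATOR = 0, 1      # assumed encoding of the CODE digit
CARTESIAN = 2                 # operation code from Table 3.10


@dataclass
class Entry:
    code: int                 # CODE: type of the entry
    value1: int = 0           # VALUE1: a digit
    value2: int = 0           # VALUE2: a digit
    value3: str = ""          # VALUE3: a character string


intermediate = []             # the array of intermediate form


def emit_operand(name):
    """Append an operand entry; the name is kept in VALUE3 (assumption)."""
    intermediate.append(Entry(OPERAND, value3=name))


def emit_operator(opcode):
    """Append an operator entry; the code is kept in VALUE1 (assumption)."""
    intermediate.append(Entry(OPERATOR, value1=opcode))


# Operands first, operator last, as required by the polish format.
emit_operand("EMPLOYEE")
emit_operand("DEPARTMENT")
emit_operator(CARTESIAN)
```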
According to the ASL productions, the number of elements in
"WHAT" and "HOW" varies. The specified domains or items in "WHAT" or
"HOW" should be reflected in the intermediate form. It was possible
to insert an entry to the array of intermediate form for each
specified domain in "WHAT" or for each specified term in "HOW."
Adoption of this strategy provides a variable number of operands for
each operator. Another solution which provides a fixed number of
operands for each operator has been considered and developed in our
implementation. A table "T" of "m" rows has been defined. Each row can
have a variable number of elements. During the parsing and semantic
checking, enough information will be provided from the context of the
ASL program or the symbol table. This information will be held in the
"T." As mentioned before, each row of the "T" has a variable number

Figure 3.9   Data structure of each entry in the intermediate array
[Fields: CODE, VALUE 1 (DIGIT), VALUE 2 (DIGIT), VALUE 3 (CHARACTER).]

Figure 3.10   The structure of each element in a row of the "T"
[Fields: SEQUENCE, VALUE, OP-SEQ, OP-LSEQ, TYPE, LENGTH.]

According to this approach, each row specifies the "WHAT" or "HOW"
part of an ASL production.
In our discussion in the rest of this section, we will refer to
two types of relations: permanent relations, which are a part of the
data base system, and temporary relations, which will be generated
during the execution of the ASL program. In the following, the
intermediate forms of the productions of Table 3.9 will be studied:
a)  <ADD RELATION> ::= <RELATION ID> U <DOMAIN-TYPE LIST>
    For this production two entries will be inserted in the
    array of intermediate form. The first entry has the type
    of operand, which declares the "relation name" and elements
    of DOMAIN-TYPE LIST (a row of the table "T"). The second
    entry is an operator which declares the operation (UNION).

b)  <ADD RELATION> ::= <RELATION ID> U <DOMAIN-VALUE LIST>
    The number of entries and their contents are the same as
    the previous production. It should be mentioned that the
    type of operation in a) and b) are not the same.

c)  <CARTESIAN> ::= (<RELATION ID>) (<RELATION ID>)
    Three entries, two as operands and one as an operator, will
    be inserted into the array of intermediate form. The
    operands define the two "relation names" specified by the
    user in the ASL program and the third entry defines the
    operation which is a "CARTESIAN PRODUCT."

d)  <CH RELATION> ::= <SEARCH SET> <RELATION OP> <DOMAIN-VALUE LIST>
    Based on the type of specified relation in the instruction,
    two different approaches will be taken:
    1)  For a temporary relation, no intermediate form will be
        generated.
    2)  For a permanent relation, two entries, one as operand
        and the other as operator, will be generated in the
        array of intermediate form. The operand will reveal
        the "relation name" and DOMAIN-VALUE LIST. The operator
        is a binary operator. Remember that in our parsing
        implementation the intermediate form for SEARCH
        SET has been inserted prior to this reduction.

e)  <DOMAIN VALUE> ::= <DOMAIN> = <EXPRESSION>
    For a temporary relation, this reduction means replacement.
    In other words, the content of DOMAIN should be replaced by
    the EXPRESSION. Therefore, an entry which shows this
    operation will be inserted into the array of intermediate
    form. It should be mentioned that entries for DOMAIN and
    EXPRESSION have been generated and are followed by this new
    entry in this reduction.

f)  <DOMAIN> ::= <CH LIST> . <CH LIST>
    For a temporary relation, an entry will be created in the
    array of intermediate form; otherwise, a new element will
    be created in the "T."

g)  <D RELATION> ::= <RELATION ID> --
    Two entries will be created. The first one reveals the
    "relation name" and the second one reveals the operation.

h)  <EXPRESSION> ::= <SIMPLE EXPRESSION> <RELATIONAL OP> <SIMPLE EXPRESSION>
    Two different approaches will be taken based on the type of
    relation. For a temporary relation, an entry for the
    RELATIONAL OP will be inserted into the array of the
    intermediate form. For a permanent relation, the table "T"
    will be updated by the RELATIONAL OP.

i)  <FACTOR> ::= ¬ <FACTOR>
    An entry for the unary operator "¬" will be inserted in the
    array of the intermediate form.

j)  <FACTOR> ::= <INTEGER>
    <FACTOR> ::= <BOOLEAN>
    <FACTOR> ::= <STRING>
    For the above productions, the same action will be taken
    based on the type of the relation specified in the context
    of the ASL instruction. For a temporary relation, an entry
    will be created in the array of the intermediate form while
    the table "T" would be updated for the permanent relation.

k)  <JOIN> ::= (<RELATION ID>) <DOMAIN> <RELATIONAL OP> <DOMAIN> (<RELATION ID>)
    Two entries which reveal the <RELATIONAL OP> and the JOIN
    operation will be inserted into the array of intermediate
    form.

l)  <LOG-SET OP> ::= &
    If the specified relation in the context of the ASL
    instruction is a permanent relation, the table "T" will be
    updated by "&" or "|".

m)  <Q RELATION> ::= <RELATION ID> <LOG-SET OP> <RELATION ID>
    In this reduction, the LOG-SET OP would be interpreted as a
    set operation such as "union." Therefore, three entries
    would be inserted in the array of intermediate form. Two
    entries are operands which reveal the "relation names" and
    the third entry reveals the LOG-SET OP.

n)  <RELATION OP> ::= [<RELATION ID>]
    If the specified relation is a permanent relation, an entry
    will be inserted in the array of the intermediate form
    which reveals a row of the table "T." This row holds
    appropriate information for the "HOW" part of the
    production.

o)  <RESTRICTION> ::= <RELATION ID> <OUTPUT SET>
    Two entries will be created in the array of intermediate
    form. One reveals the "relation name" and the other
    reveals the operation (restriction). Entry(ies) for the
    OUTPUT SET has been inserted to the array prior to this
    reduction.

p)  <SEARCH SET> ::= <SEARCH SET> <LOG-SET OP> <EXPRESSION>
    The table "T" would be updated according to the LOG-SET OP.

q)  <SET EXPRESSION> ::= <SEARCH SET> <RELATIONAL OP> <OUTPUT SET>
    Two entries will be inserted in the array of intermediate
    form. The first entry will specify the appropriate
    information for OUTPUT SET (a row number of the table T)
    along with the "relation name," and the other entry will
    specify the operation.

r)  <SIMPLE EXPRESSION> ::= <SIMPLE EXPRESSION> <ADDITIVE OP> <TERM>
    <TERM> ::= <TERM> <MULTIPLICATIVE OP> <FACTOR>
    For both productions, an entry which reveals the <ADDITIVE
    OP> or <MULTIPLICATIVE OP> will be inserted in the array of
    intermediate form.
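The polish format used for the intermediate form can be illustrated with the arithmetic example given earlier in this section, A * (B + C/D). The sketch below uses the standard operator-stack conversion and a stack evaluator; it is not the interpreter's own code, which emits the polish form during reductions, and the function names and sample operand values are hypothetical.

```python
import operator

PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}
APPLY = {"+": operator.add, "-": operator.sub,
         "*": operator.mul, "/": operator.truediv}


def to_polish(tokens):
    """Convert an infix token list to polish (postfix) order."""
    output, operators = [], []
    for token in tokens:
        if token == "(":
            operators.append(token)
        elif token == ")":
            while operators[-1] != "(":
                output.append(operators.pop())
            operators.pop()                      # discard the "("
        elif token in PRECEDENCE:
            # Operators of equal or higher precedence are emitted first.
            while (operators and operators[-1] != "("
                   and PRECEDENCE[operators[-1]] >= PRECEDENCE[token]):
                output.append(operators.pop())
            operators.append(token)
        else:
            output.append(token)                 # operand
    while operators:
        output.append(operators.pop())
    return output


def evaluate(polish, values):
    """Evaluate a polish string with a single operand stack."""
    stack = []
    for token in polish:
        if token in APPLY:
            right = stack.pop()
            left = stack.pop()
            stack.append(APPLY[token](left, right))
        else:
            stack.append(values[token])
    return stack.pop()


polish = to_polish(list("A*(B+C/D)"))
result = evaluate(polish, {"A": 2, "B": 3, "C": 8, "D": 4})
```

The conversion yields `ABCD/+*`, the string quoted in the text, and the evaluator needs only one stack and no parentheses, which is the first reason given for choosing this format.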
In this chapter the formal definition of ASL and its capabilities
for handling queries have been discussed. The types of examples
show the efficiency of the ASL language for handling data base
operations. Chapter III also has addressed the implementation of a
subset of the ASL productions on a conventional computer. Since ASL is
a query language, in this implementation we were concerned about those
features which are used in a query. Therefore some of the productions,
such as (add relation), have not been implemented. It is evident that
ASL will be fully practical only if it is implemented completely.
The next chapter will address the design and implementation of ASLM,
which is a hardware implementation of the ASL language.


CHAPTER IV
ASLM - AN ASSOCIATIVE SEARCH LANGUAGE MACHINE
Introduction
Chapter III presented the design and implementation of the ASL
language. This chapter will address the hardware implementation of
ASL. ASLM, an acronym for Associative Search Language Machine, is a
language oriented machine, which supports non-numeric operations for
a relational data model on an associative memory.
In the previous chapter the concept of "semantic gap" has been
explained. We noted that this gap can be reduced by taking a top
down approach in the design of the hardware. Figure 4.1 shows some
of the approaches to computer architecture to close the semantic gap.
The top path represents the traditional approach: a high level
language program is translated into a low level machine language; the
latter is then interpreted by the machine. The second line is the
first category of semantic gap closing architecture; the source
program is compiled into a higher-level machine language, which in
turn is interpreted by the machine (language oriented architecture).
In this approach the operations and data structures implemented by
the machine are more closely related to the operations and data
structures of one or more high level languages. In path 3 the
machine architecture is raised to such a level that the high level
language can be thought of as the assembly language. That is, there
is a one to one correspondence between statement types and operators
in the high level language and the instructions in the machine's
instruction set. In path 4 the high level language is also the
machine language. Paths 3 and 4 represent high-level language
machines. Based on the above categories, ASLM follows the second path,
but because of some features such as the concept of associative
operations and hardware implementation of non-numeric operations it
can be categorized to belong to the third path. Therefore, we can
claim that ASLM is a high-level language machine. On the other hand,
since ASLM is a hardware implementation of ASL and ASL is a query
language we can also categorize ASLM as an input/output architecture
(Myers 1978).

Figure 4.1   Language-machine relationships
[Legend: C = Compiler, A = Assembler, I = Interpreter.]

The ASLM is a backend data base hardware system supported by a
general purpose computer (GPC), capable of performing information
retrieval and storage operations on non-numeric data. Like any other
backend machine, ASLM is composed of two parts: a GPC and a backend
data base machine called ASLH (Associative Search Language Hardware)
capable of executing the user's query on data bases. The GPC is a
host machine which is an interface between the user and the backend
machine. The result of the execution by the ASLH will be sent to the
host machine. The design of ASLM is significantly different in
comparison to all the previously designed data base machines. In
this design the user's independence of data as well as the
associative nature of operations have been taken into account.
Moreover, since the result of the execution of each ASL instruction is
a subrelation, the use of mark bits (Chapter II) has been eliminated
and interrelational operations such as join can be performed faster
than in previously designed backend machines. It should be mentioned
that our design has some resemblance with some other proposed hardware
machines to be discussed at the end of this chapter. In the following
sections the overall hardware of the system along with the overall
flow of data in ASLM will be discussed. Also, the detailed
hardware of each module with the flow of data in each module will be
studied. Finally, at the end, there will be a discussion about the
classification of ASLM in the Flynn (1972) architecture classes.
Overall View of the ASL Machine
General View of the Hardware

The general hardware organization of the ASL machine is depicted
in Figure 4.2. A data base machine is supported by a general purpose
computer and backed by a fast secondary storage which contains data
bases being processed. Copies of all data bases in the system are
kept in a slower secondary storage device. The processing functions
required for any data base operations are performed in three steps:
1)  translating the user's query into a set of primitive
    operations described in the machine language;
2)  searching the data base and selecting records that satisfy
    the search criteria; and,
3)  processing the output (i.e. formatting the records,
    extracting the required information and other bookkeeping
    operations).

Figure 4.2   General view of ASL machine
[Blocks: User, General Purpose Computer, Data Base Machine, Secondary Storage.]


Among these three steps, the second step is the most difficult
and time consuming because of the large volume of information.
In our design steps 1 and 3 are handled by the GPC, while step 2
will be handled by DBM. After translation, the user's query written
in the ASL instructions will be executed by DBM, then the result of
execution will be routed to the user via GPC. Figure 4.3 shows the
sequence of operations. As can be seen there is a connection between
the backend machine and the backend storage; the backend storage is
used for operations such as update and deletion, which are explained
later in this chapter. Two of the important features of the backend
machine that enhance the throughput are parallelism and the use of
special purpose hardware algorithms.
Figure 4.4 is the detailed description of Figure 4.2. The backend
machine is composed of four modules:
1)  controller;
2)  index processor;
3)  secondary storage interface; and,
4)  non-numeric processor.
The index processor can be assumed to be a small minicomputer
which holds an entry for each relation name and its data organization.
The secondary storage interface is an interface between secondary
storage and the data base machine.

Figure 4.3   General flow of data
[Boxes: user's query (ASL program); GPC: translation to the ASL primitives; DBM: execution of ASL on the data file; storage; result to the GPC; user.]

Figure 4.4   ASLM architecture
[Blocks: User, General Purpose Computer, Controller, Index Processor, Non-numeric Processor, Secondary Storage Interface, Secondary Storage.]


Retrieval.  Retrieval is the operation of extracting the data
from the data base which is addressed by the query. Retrieval is
the basic function of any query which may be performed as an
operation (simple retrieval) or as a part of another operation
(complex retrieval). Since ASLM is a high level language machine, one
of the objectives of this machine is to support this operation
efficiently. When a retrieval query is initiated by the user, the ASL
compiler checks the query from both syntax and semantic points of
view. If the query is correct, the micro instructions will be created
and routed to the controller. The next step is the execution of the
translated program.
Figure 4.5 shows the sequence of the steps in a retrieval operation.
The execution will be started by initialization of the non-numeric
processor, which sets the parameters and arguments for the search to
the non-numeric processor. Then the specified data base in the query
will be read from secondary storage and routed to the non-numeric
processor via the secondary storage interface. Based on the stored
parameters in the non-numeric processor, each tuple will be
evaluated. If we are dealing with simple retrieval, the tuples
which satisfy the search criteria will be transferred to the user via
GPC. In case of complex retrieval, the selected tuples will be stored
in the temporary buffer or "associative stack" (see non-numeric
processor of the associative search language hardware). At the end
of the relation, in case of the simple retrieval, the operation will
be terminated, while for complex retrieval the rest of the operations
which are specified in the user's program will be executed on the
stored data in the non-numeric processor and then the results are
transferred to GPC.

Figure 4.5   Retrieval
[Flowchart boxes: set parameters to the non-numeric processor; transfer data to the non-numeric processor via SSI; check search criteria; store tuple in associative stack (temporary buffer); evaluate the complex retrieval; transfer the result to the GPC; stop.]
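The retrieval sequence described above can be paraphrased in software as follows. This is only a behavioural sketch: in ASLM the evaluation is performed by the non-numeric processor hardware, and the function name `retrieve`, the representation of tuples as dictionaries, and the sample relation are assumptions.

```python
def retrieve(relation, criteria, complex_operations=None):
    """Return the tuples that would be transferred to the GPC.

    relation            -- tuples streamed in via the secondary storage interface
    criteria            -- search parameters, modelled here as a predicate
    complex_operations  -- None for simple retrieval, otherwise the remaining
                           operations of the user's program
    """
    to_gpc = []
    associative_stack = []                     # temporary buffer
    for tup in relation:
        if not criteria(tup):
            continue
        if complex_operations is None:
            to_gpc.append(tup)                 # simple retrieval: send at once
        else:
            associative_stack.append(tup)      # complex retrieval: hold tuple
    if complex_operations is not None:
        result = associative_stack
        for operation in complex_operations:   # executed on the stored data
            result = operation(result)
        to_gpc = list(result)
    return to_gpc


employees = [
    {"NAME": "ADAMS", "CITY": "ORLANDO", "SALARY": 10},
    {"NAME": "BAKER", "CITY": "MIAMI", "SALARY": 20},
    {"NAME": "CLARK", "CITY": "ORLANDO", "SALARY": 30},
]


def in_orlando(tup):
    return tup["CITY"] == "ORLANDO"


def project_name(tuples):
    return [{"NAME": tup["NAME"]} for tup in tuples]


simple = retrieve(employees, in_orlando)
complex_result = retrieve(employees, in_orlando, [project_name])
```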

Update.  Update operation on the data base consists of one of
the following:
i)    deletion of tuples or a relation;
ii)   insertion of tuples or a relation; and,
iii)  modification of attribute values of a tuple.

i)  deletion operation
a)  deletion of tuples: is a sequence of two steps: search and
write. Tuples in the relation can be classified in two groups,
those which satisfy the search criteria and those which do not. The
tuples in the second group will be written on the backend storage,
in a pre-specified location, and those in the first group will be
written in another pre-specified location on the backend storage.
Therefore, at the end of this procedure the data file will be
partitioned into two groups, while the original relation is
unchanged. The operation will be terminated by putting all the data
(non-deleted tuples) on the backend storage in place of the original
relation on the secondary storage. Figure 4.6 shows the steps of the
operation. This sequence of operation provides two significant
features:
1)  since deletion is not in place, the user is capable of
    backup, meaning that in case of failure in the system or the
    operation the user is able to start the operation from the
    beginning; and,
2)  since the deleted tuples are stored in a pre-specified
    location on the backend storage, the user is able to provide
    enough information about the deleted tuples.

Figure 4.6   Deletion of tuples
[Flowchart boxes: set parameters to the non-numeric processor; transfer data to the non-numeric processor via SSI; check search criteria; storage; stop.]
b)  deletion of a relation: for each relation, two entries are
kept in the index processor and the secondary storage interface, as
explained in Section ASL machine architecture. Deletion of a relation
is equivalent to deletion of these entries from the index processor
and the secondary storage interface. It is clear that in this fashion
relations are not eliminated from the secondary storage physically,
and this operation and elimination of any space fragmentation on the
secondary storage should be performed separately.

ii)  insertion
a)  insertion of tuples: A relation can be expanded by insertion
of new tuple(s) to an old relation in two ways: the attributes of the
tuples are specified by the user, or the attributes of the tuples are
the result of a sequence of operations. In the first case, for each
tuple an entry will be created at the end of the relation directly,
the same way as we create an entry on an unordered file in the
conventional systems. In the second case, the results of operations
which are stored in the non-numeric processor will be written at the
end of the relation under the control of the controller.
b)  insertion of a new relation: As mentioned previously, for
each relation there are two entries in the index processor and
secondary storage interface. Insertion of a new relation is
equivalent to the creation of these two entries. After this, the
mass insertion of the tuples can be performed directly by a high
level language of which ASL is a sublanguage.
iii)  modification of attribute values of a tuple:
This operation modifies the content of some attributes of those
tuples which satisfy the search criteria as specified in the ASL
program. Figure 4.7 shows the sequence of operations. The operation
is a sequence of three distinct steps: search, modification and
write.
Search is the same as discussed in the "deletion" such that the
tuples will be validated by the non-numeric processor. The valid
tuples will be stored in the non-numeric processor, while the
invalid tuples will be written on the backend storage.
During the modification step, the stored data in the non-numeric
processor will be modified under the control of the controller. In
the WRITE step the modified tuples will be written at the end of the
relation on the backend storage. This sequence of operation has two
important features: Firstly, at the end of the operation, there will
be two copies of the relation, an old copy and a new copy. This
enables the user to have "back-up." Secondly, there is no need to
keep the address of modified tuples. In some of the earlier systems
(Oliver 1979) based on fully associative memory, each block of data
file has been copied in the associative memory and at the end of
modification, the modified and unmodified tuples in that block had to
be written back on the secondary storage. This is not the case in
our design. At the end of the operation, the data file is
partitioned into two groups of tuples: those which are not modified at
the beginning of the relation and the modified tuples at the end of
the relation. This strategy is applicable since we are dealing with
an unordered file and the address is not used in our design.

Figure 4.7   Modification of attributes' values
[Flowchart boxes: set parameters to the non-numeric processor; transfer data to the non-numeric processor via SSI; check search criteria; store tuple in non-numeric processor; store on backend storage; modify the valid tuples; backend storage; stop.]
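The deletion and modification sequences of this section can be paraphrased as the following sketch. It models only the partitioning behaviour: the two returned lists of `delete_tuples` stand for the two pre-specified locations on the backend storage, and all function names, the dictionary representation of tuples, and the sample data are assumptions rather than part of the ASLM design.

```python
def delete_tuples(relation, criteria):
    """Search and write: partition the relation without changing the original.

    Returns (kept, deleted). The 'kept' group is what finally replaces the
    original relation; the 'deleted' group remains available for reporting.
    """
    kept, deleted = [], []
    for tup in relation:
        (deleted if criteria(tup) else kept).append(tup)
    return kept, deleted


def modify_tuples(relation, criteria, change):
    """Search, modify and write.

    Tuples that fail the criteria are written first; the selected tuples are
    modified on copies and written at the end of the new relation, so no
    tuple address has to be remembered and the old copy stays intact.
    """
    unmodified, selected = [], []
    for tup in relation:
        (selected if criteria(tup) else unmodified).append(tup)
    modified = [change(dict(tup)) for tup in selected]   # work on copies
    return unmodified + modified


staff = [
    {"NAME": "ADAMS", "SALARY": 10},
    {"NAME": "BAKER", "SALARY": 20},
    {"NAME": "CLARK", "SALARY": 30},
]


def low_paid(tup):
    return tup["SALARY"] < 25


def give_raise(tup):
    tup["SALARY"] += 5
    return tup


kept, deleted = delete_tuples(staff, low_paid)
new_copy = modify_tuples(staff, low_paid, give_raise)
```

After `modify_tuples`, the unmodified tuple (CLARK) comes first and the modified tuples follow it, while `staff` itself still holds the old values, which corresponds to the "back-up" property noted in the text.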
ASL Machine Architecture
The overall hardware organization of the ASL machine has been
depicted in Figure 4.4. It consists of a GPC, a DBM (ASLH), and a
fast access backend storage. In the following discussion, the
detailed architectures of each of the components will be presented
along with their functions.
General Purpose Computer
GPC is an interface between users and ASLH. It receives the user's
queries (ASL programs) and transfers them to the controller of ASLH.
The ASL programs are validated by the ASL translator located at GPC.
Valid ASL programs are translated into the ASL micro operations (see
Chapter V). Moreover, the result of the query is routed to GPC from
the non-numeric processor, reformatted by GPC and then sent
to the user.
In general, GPC provides the following functions:
a)  Supports a data communication environment for users
    with proper I/O facilities;
b)  Compiles user's query expressed in the ASL language
    into the ASL primitives;
c)  Transfers compiled ASL primitives with associated
    parameters to the controller;
d)  Controls data base security and integrity;
e)  Inserts new entries for newly created relations in the
    secondary storage interface; and,
f)  Edits the results of the execution.
It should be mentioned that all the modules of the ASLH are
hidden from GPC (except the index processor). In other words, any
communication between GPC and the different modules in the ASLH can
be established only through the controller. The communication path
between GPC and index processor will be used during the compilation.
Moreover, GPC could insert an entry to the secondary storage
interface for each newly created relation in the system. Figure 4.8
is an expanded version of Figure 4.4.
Associative Search Language Hardware
ASLH is a computer which manipulates the data as specified by the user. This computer is enhanced by associative memory and circuits for handling non-numeric operations.
ASLH is composed of four independent modules:
a) Controller
b) Index processor
c) Secondary storage interface
d) Non-numeric processor

Figure 4.8. Expanded version of Figure 4.4
Before a detailed discussion about the modules of ASLH, we will
introduce the data structures used in ASLM.
Data structure. The concept of data independence is one of the key points in the design of ASL and ASLM. User's independence of data structure implies that data representation should not affect the user's view of data. In our design, relations are stored in the serial access memory as a stream of characters, which is the most natural representation. Two consecutive tuples are separated by a marker and the system is based on variable length tuples. Any pair of adjacent domains in each tuple are separated by another marker. Figure 4.9 shows the encoding of a relation.
Each relation will begin with a BOR (Beginning of Record) marker and will be terminated by an EOR (End of Record) marker. Variable length tuples are separated by "¢". Adaptation to a variable length field allows flexibilities in the format of information. Domains of a tuple are separated from each other by a "$". Figure 4.10 shows the encoding of each tuple.
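The encoding just described can be illustrated with a minimal sketch. The spellings chosen for the BOR and EOR markers are hypothetical (the text does not say how the markers are represented), and the sample rows are illustrative values only; the separators "¢" and "$" are the ones named above, placed between tuples and between domains respectively.

```python
# Hypothetical spellings for the relation markers; only their roles come from the text.
BOR, EOR = "<BOR>", "<EOR>"
TUPLE_SEP, DOMAIN_SEP = "¢", "$"


def encode_relation(tuples):
    """Encode a relation (a list of tuples of strings) as one character stream."""
    body = TUPLE_SEP.join(DOMAIN_SEP.join(domains) for domains in tuples)
    return BOR + body + EOR


def decode_relation(stream):
    """Recover the variable length tuples from an encoded stream."""
    if not (stream.startswith(BOR) and stream.endswith(EOR)):
        raise ValueError("missing BOR/EOR marker")
    body = stream[len(BOR):len(stream) - len(EOR)]
    if not body:
        return []
    return [tuple(t.split(DOMAIN_SEP)) for t in body.split(TUPLE_SEP)]


# Illustrative rows only.
employees = [("A-102", "Endress B.", "21000", "4", "A"),
             ("N-021", "Roof J.", "18500", "4", "N")]
stream = encode_relation(employees)
```

Because no length or address information is stored with the data, the stream can only be interpreted by scanning for separators, which is the property the cells of the non-numeric processor rely on.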
In some data base machines proposed earlier (Ozkarahan et al. 1974; Ozkarahan 1975), provisions have been made to include control fields that contain partial results of search operations on the data. Moreover, in some designs, the data structure of each relation would also be kept on the secondary storage as a part of data. Since such information necessarily increases the complexity of the software to manipulate the data base, our decision was to design ASLM in such a way as to remove the same information from the data structure.

Figure 4.9. Encoding of a relation
BOR | 1st Tuple | ¢ | 2nd Tuple | ¢ | ... | nth Tuple | EOR

Figure 4.10. Encoding of a tuple
1st Domain | $ | 2nd Domain | $ | ... | $ | mth Domain

Figure 4.11 shows the encoding of the Employee relation of Figure 2.2 on the storage. It should be mentioned that there is no restriction on the way data should be stored on the secondary storage (i.e. bit serial word serial, bit serial word parallel, bit parallel word serial, or bit parallel word parallel). Since operations in the non-numeric processor are performed on the bit slice (bit serial word parallel fashion), an interface may be necessary to re-format the data in word serial byte parallel form.
Controller. The translated ASL program is stored into the memory of the controller. This stored program is interpreted by the controller and appropriate steps are taken by the different modules of ASLH as directed by the controller. The simplicity of the controller is primarily due to the use of the associative hardware for the execution of the ASL primitives. In particular:
1) There is no need for address generation, since in the associative processor data is accessed by its value rather than by its address. This has a drastic effect on the instruction format and its execution time. In the instruction there is no indexing such as immediate, direct, indirect and indexed operands. The operation code and values of operands are all that is needed. Elimination of the operand address will improve the execution time of each individual instruction, since there is no address generation and no additional memory accesses.

Figure 4.11. Encoding of Figure 2.2
2) There is no need for a loop for searching stored information in the memory. In the associative memory, by its very nature, the whole memory is searched simultaneously as opposed to conventional programming languages where each search through the memory involves a loop. In other words, programs in a language based on associative processors are sequential except for subroutine calls. In the case of query languages where there are no subroutines, queries are strictly sequential (there are no control flow instructions such as DO WHILE, DO LOOP, etc. in ASL). Sequential programs increase the efficiency of execution time, since the next instructions can be fetched by the program counter. Therefore, the concepts such as instruction prefetch and pipelining can be applied as discussed later in Chapter VII.
The controller then is simply a microprogrammable control unit with a programmable read only memory (PROM). The content of the PROM will be determined by the user's program (ASL primitives).
More specifically the controller should therefore:
1) store the ASL micro instructions generated by the ASL compiler;
2) decode the ASL micro instructions; and,
3) propagate the control sequences to the appropriate modules in ASLH.

The above tasks imply communication paths between the controller and the other parts of ASLH, as follows:
A) a path to the index processor, which will transfer appropriate instructions and data to the index processor. This information will be used: i) to create a new entry in the index processor, ii) or to delete an entry from it.
B) a path to the secondary storage interface, which will be used to: i) initiate a read of data out of the secondary storage, ii) or delete an entry from the secondary storage interface.
C) two paths to the non-numeric processor which will: i) initialize the non-numeric processor, ii) transfer appropriate instructions and data to the non-numeric processor in order to manipulate the stored data.
Figure 4.12 shows the controller. As will be discussed later, each module of ASLH, except a part of the non-numeric processor, is autonomous once initiated by the controller. This means that each module can perform its task without any direct supervision by the controller. This approach will bear the following advantages:
1) The communication between the controller and the module would be simple, and this simplicity will reduce the size of the program in the control memory.
2) The controller would be able to communicate with other modules, while some modules are executing their tasks. This increases the overlapping of operations of the system.
3) Implementation of 2) as discussed later in Chapter VII enables the controller to control more than one ASLH, a feature which increases the parallelism in the ASLM from SIMD to MIMD.

Figure 4.12. Controller

Index Processor. The index processor serves the function of isolating the end users from the internal data structures used in ASLM. Each relation in the index processor is specified by its unique name.
The index processor, as a part of ASLH, is composed of an associative memory (AM), and a small processing unit consisting of a random access memory (RAM), some processing capabilities such as addition and a hard wired control unit. The processing unit has a register called address register (AR) which holds the beginning address of available space of the RAM memory.
The AM part is used for name resolution, while the RAM part holds a descriptor for each relation. The descriptor defines the data structure of the components of each relation. The encoding of the descriptor of the index processor has been depicted in Figure 4.13.

A) delete bit: This bit is used to separate valid data from invalid data. This bit can be used for garbage collection by software.
B) internal name: This field is used for security purpose. Although in this dissertation we are not concerned about the security, this field could be used for this purpose in further development of the design.
C) number of domains: For each relation this field shows the number of domains in each tuple.
D) domain values: For each domain three values in fixed length format will be kept in the RAM as follows:

Figure 4.13. Encoding of descriptor in the RAM memory of the index processor
D BIT | INTERNAL NAME | # OF DOMAINS | 1ST DOMAIN VALUE (NAME, TYPE, MAX. LENGTH) | ... | nTH DOMAIN VALUE (NAME, TYPE, MAX. LENGTH)

1) domain name: A unique name for each domain in a relation.
2) domain type: Defines the type of each domain. For example, a domain can be of type "digit" or "character."
3) maximum length of domain: As mentioned before, in ASL and ASLM we are dealing with variable length fields, a feature which increases the flexibility of the system. As will be explained later, in case the length of the domain is less than this maximum value, certain fields of tuples have to be padded with 0's in order to perform associative operations. This field is used to compute the number of 0's for padding.
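As an illustration of the descriptor fields listed above, the following sketch models a descriptor and the padding computation. The class and function names, and the sample domain name and maximum length, are hypothetical; only the fields themselves (delete bit, internal name, number of domains, and the name, type and maximum length of each domain) come from the description.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DomainEntry:
    name: str         # unique within the relation
    type: str         # e.g. "digit" or "character"
    max_length: int   # maximum length of the domain


@dataclass
class Descriptor:
    delete_bit: bool          # separates valid from invalid data
    internal_name: str        # reserved for security purposes
    domains: List[DomainEntry]

    @property
    def number_of_domains(self) -> int:
        return len(self.domains)


def zeros_needed(entry: DomainEntry, value: str) -> int:
    """Number of 0's required to bring a domain value up to its fixed length."""
    if len(value) > entry.max_length:
        raise ValueError("value longer than the declared maximum length")
    return entry.max_length - len(value)


# Hypothetical domain: a character field of at most 12 characters.
name_domain = DomainEntry("NAME", "character", 12)
padding = zeros_needed(name_domain, "Roof J.")
```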
The concept of name resolution and descriptor in the index processor is the same as the concept of the inverted file, where an inverted list is searched for the address of a record which holds the specified key. In our organization the key is the relation name and the search is performed in associative fashion. Figure 4.14 shows the hardware organization of the index processor.
Associative memory holds an entry for each relation. This entry is composed of five components:
A) The name of the relation. Any search through the AM part will be fulfilled by this unique name, which is assigned to each relation.
B) The size of the descriptor in the RAM part of the index processor. The size of the descriptor varies from one relation to another. This field eliminates the need for a separator between each pair of descriptors.
C) The beginning address of the descriptor in the RAM part of the index processor.
D) The size of the descriptor in the RAM memory of the secondary storage interface. This field will be discussed later.
E) The beginning address of the descriptor in the RAM memory of the secondary storage interface.

Figure 4.14. Index processor
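The interplay between the AM entries, the RAM and the address register (AR) can be sketched as follows. This is only a software model under stated assumptions: the class and method names are hypothetical, descriptors are treated as opaque lists of RAM words, and the scan over the entries stands in for what the associative memory does in a single parallel comparison.

```python
class IndexProcessor:
    """Software model of name resolution in the index processor (a sketch)."""

    def __init__(self, ram_words):
        self.am = []                   # associative memory: one entry per relation
        self.ram = [None] * ram_words  # descriptors stored back to back
        self.ar = 0                    # address register: start of free RAM space

    def lookup(self, name):
        # The hardware compares every entry against the name at once.
        matches = [entry for entry in self.am if entry["name"] == name]
        return matches[0] if matches else None

    def create(self, name, descriptor_words, ssi_size=0, ssi_address=0):
        if self.lookup(name) is not None:
            raise ValueError("relation name must be unique")
        size = len(descriptor_words)
        if self.ar + size > len(self.ram):
            raise MemoryError("no room left in the RAM")
        self.ram[self.ar:self.ar + size] = descriptor_words
        self.am.append({"name": name, "size": size, "address": self.ar,
                        "ssi_size": ssi_size, "ssi_address": ssi_address})
        self.ar += size                # the addition mentioned in the text

    def descriptor(self, name):
        entry = self.lookup(name)
        if entry is None:
            raise KeyError(name)
        # Size and address together delimit the descriptor; no separator is stored.
        return self.ram[entry["address"]:entry["address"] + entry["size"]]
```

Keeping the size in the AM entry is what allows descriptors of different lengths to sit next to each other in the RAM without any separator between them.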
Secondary Storage Interface. The secondary storage interface provides access to the relations stored in the secondary storage and routes the raw data to the non-numeric processor. In this way secondary storage will be made invisible to ASLH. The secondary storage interface (SSI) is composed of three parts:
a) A control processor
b) A channel processor
c) A distributor box

The control processor is very similar to the control processor of the index processor. For each relation a descriptor will be kept in the random access memory (RAM) of this module. The data structure of the descriptor is a function of different technologies and different vendors. It will suffice to say that each descriptor holds enough information to provide access to the secondary storage.
The channel processor gets the content of the descriptor, and routes the stored data on the secondary storage to the "distributor box."

The distributor box receives the data transferred from secondary storage by the channel processor. This data will be distributed to the different parts of the non-numeric processor in tuple-wise fashion. This box is composed of three different subsystems:
1) Queue
2) Priority network
3) Decoder
The description of the subsystems follows:
Queue: Data from the secondary storage will be stored in this unit. The Queue is an assembly of registers used to retain information until it can be delivered to the non-numeric processor. Information will be delivered to the non-numeric processor based on a first-in first-out (FIFO) policy. The depth of the Queue is a function of the number of basic time intervals required for processing tuples in the non-numeric processor and the relative bandwidth between the secondary storage and ASLH.
Priority Network: This unit determines which one of the cells of the non-numeric processor receives data in the Queue and then gates the tuples to the selected cell.
Decoder: This unit decodes the unit number determined by the priority network and selects the cell of the non-numeric processor in order to process a tuple.
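The cooperation of the three subsystems can be sketched in software as below. The names are hypothetical, and the fixed priority rule (the lowest-numbered ready cell wins) is an assumption made for the example; the text does not specify how the priority network resolves competing requests.

```python
from collections import deque


class DistributorBox:
    """Sketch of the Queue, priority network and decoder of the distributor box."""

    def __init__(self, n_cells, depth):
        self.queue = deque()            # FIFO of tuples awaiting a cell
        self.depth = depth              # number of registers in the Queue
        self.ready = [True] * n_cells   # request signals from the cells

    def enqueue(self, tup):
        """Accept a tuple from the channel processor unless the Queue is full."""
        if len(self.queue) >= self.depth:
            return False
        self.queue.append(tup)
        return True

    def priority_network(self):
        """Pick a ready cell (assumed rule: lowest cell number first)."""
        for number, is_ready in enumerate(self.ready):
            if is_ready:
                return number
        return None

    def decoder(self, cell_number):
        """Turn the cell number into one-hot select lines."""
        return [i == cell_number for i in range(len(self.ready))]

    def dispatch(self):
        """Gate the oldest queued tuple to a ready cell, if there is one."""
        if not self.queue:
            return None
        cell = self.priority_network()
        if cell is None:
            return None                 # every cell is in its busy cycle
        self.ready[cell] = False
        return self.decoder(cell), self.queue.popleft()

    def release(self, cell_number):
        """A cell finished its tuple and requests another one."""
        self.ready[cell_number] = True
```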
Figure 4.15 shows the hardware of the secondary storage interface. It should be mentioned that data in the non-numeric processor is handled in the bit serial, word serial fashion. Since data on the secondary storage might not be stored in the bit serial, word serial fashion, an interface is needed to perform this transformation.

Figure 4.15. Secondary storage interface

Although the organization of the distributor box seems similar to that of the stunt box (Thornton 1970) in the CDC 6600, there is an important difference between these two concepts. In the stunt box an address would be sent to a module of the memory without any knowledge about the status of the module (busy or available). In case the module is busy the address is circulated into a set of registers forming a closed loop. This strategy is adopted since the access to the main memory is slower than the generation of the addresses, which occurs independent of the status of the memory modules. In our design the situation is reversed. Assume each tuple to be an address and each processing unit of the non-numeric processor to be a memory module. Since the computation of a tuple is faster than the transmission of the tuple from the secondary storage to a processing unit, a tuple will be sent to a processing unit if that unit is available.
The above structure of the secondary storage interface has the following advantages:
1) It provides user's independence of defining access path to the secondary devices; in other words, the user is not responsible for defining information known as "job control language" common in the conventional systems.
2) Increases the flexibility of ASLH for handling different secondary storage technologies.
Non-Numeric Processor. The non-numeric processor is the heart of ASLH. The data as well as the ASL primitives will be transferred to this unit. The ASL primitives manipulate the data, and the output is transferred to the user via the GPC or to the backend storage. Data will be viewed as a stream of bits and an operation will be performed in associative bit slice fashion.
In our hardware we introduce two new subsystems: the associative stack and the automatic rotate register.
The associative stack consists of an associative memory with comparand, mask and response register and the appropriate logic for associative operations. This associative memory is augmented by a top of stack register which always points to the top most element in the stack. A PUSH operation will consist of two steps: firstly, a parallel associative search, and secondly, if the response register indicates a success, then nothing will happen. Otherwise, the comparand will be written at the one location above the location pointed by the top of stack register. A POP operation has the usual meaning. Figure 4.16 shows an associative stack with six entries. Notice that the top of the stack is pointing to the top most element in the memory.
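The two-step PUSH described above can be sketched as follows. The sketch makes several assumptions that are not fixed by the text: words are modelled as tuples of fields, the mask is a tuple of booleans selecting the fields that take part in the comparison, and a top of stack value of zero denotes an empty stack. The list comprehension stands in for the parallel search of the hardware.

```python
class AssociativeStack:
    """Sketch of an associative memory augmented by a top of stack register."""

    def __init__(self, size):
        self.words = [None] * size
        self.top = 0                      # number of occupied words (0 = empty)

    def search(self, comparand, mask=None):
        """Return the response register: one bit per occupied word."""
        if mask is None:
            mask = (True,) * len(comparand)
        return [all(word[i] == comparand[i] for i, m in enumerate(mask) if m)
                for word in self.words[:self.top]]

    def push(self, comparand):
        """Write the comparand above the top only if the search finds no match."""
        if any(self.search(comparand)):
            return False                  # success in the response register: do nothing
        if self.top == len(self.words):
            raise OverflowError("associative stack is full")
        self.words[self.top] = comparand
        self.top += 1
        return True

    def pop(self):
        if self.top == 0:
            raise IndexError("associative stack is empty")
        self.top -= 1
        return self.words[self.top]
```

A consequence of this PUSH rule is that the stack never holds duplicate words, so a subrelation built in it is a set rather than a list.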
The automatic rotate register is simply a circular shift register augmented by a piece of circuit capable of performing automatic circular shift to the right. As will be discussed later, several automatic rotate registers have been used in the design of the non-numeric processor. The size of the circular shift register is implementation dependent. This value is equal to the maximum number of domains which can be specified in the "what" or "how" part of the ASL instructions. The circular shift register can be seen as a group of locations, each capable of holding a number equal to the sequence number of a domain in a tuple of a relation.

Figure 4.16. An associative stack

The contents of the circular shift register are filled by these sequence numbers for the domains which are specified in the "what" or "how" part of the ASL instructions. The entire contents of the circular shift register can be rotated one location to the right whenever the rotation is necessary. During the operation, the contents of the circular shift register should sometimes be circulated so that its configuration is the same as that of the beginning of the operation. Therefore, to each circular shift register a special circuit is attached. This hardware consists of a counter and a comparator circuit. The counter holds a positive number equal to the difference between the size of the shift register and the number of elements in it. The comparator compares the content of the counter with zero; if it is greater than zero, the circular shift register will be circulated one location and the content of the counter will be decremented by one. The operation will be iterated until the content of the counter is equal to zero. Figure 4.17 shows the automatic rotate register. In this figure each box of the circular shift register is one location as mentioned above.
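The restoring behaviour of the automatic rotate register can be sketched as below. The sketch assumes that the register is rotated once for each of its stored sequence numbers while a tuple is processed, so that a further (size minus number of elements) rotations complete a full circle; this reading of the counter's role, and all names, are assumptions made for illustration.

```python
class AutomaticRotateRegister:
    """Sketch of a circular shift register with counter and zero comparator."""

    def __init__(self, size, sequence_numbers):
        if len(sequence_numbers) > size:
            raise ValueError("more sequence numbers than locations")
        self.size = size
        self.count = len(sequence_numbers)
        # Unused locations are modelled as None.
        self.cells = list(sequence_numbers) + [None] * (size - self.count)

    def rotate_right(self):
        """Rotate the entire contents one location to the right."""
        self.cells = [self.cells[-1]] + self.cells[:-1]

    def restore(self):
        """Automatic rotation driven by the counter, comparator and decrementer."""
        counter = self.size - self.count   # loaded at the start of the operation
        while counter > 0:                 # zero comparator enables the shift
            self.rotate_right()
            counter -= 1                   # decrementer


register = AutomaticRotateRegister(5, [1, 3, 4])
initial = list(register.cells)
for _ in range(register.count):            # one rotation per stored element
    register.rotate_right()
register.restore()
```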
Because of the practical limitation on the size of the stacks, it may not be possible to store all the available data in the stacks in order to perform an operation on them. Due to this restriction, all the systems proposed previously based on fully associative memory are not capable of holding large data bases. In contrast, the design of ASLM overcomes this problem through consideration of two points: firstly, queries almost always refer to a subset of domains in the tuples. In other words, the user has only access to some of the domains in a tuple; secondly, in each query a small subset of tuples will satisfy the search criteria. In our design those tuples which satisfy the search criteria will be selected and among the domains of the selected tuples, those which have been referred to by the query would be transferred to the associative stacks.

Figure 4.17. An automatic rotate register
The above considerations lead to the general structure of the non-numeric processor as shown in Figure 4.18. It consists of two separate units: cells and associative stack module.
Before the execution of the ASL primitives on the data, the cells will be initialized to contain the search parameters. Each cell receives tuples from the secondary storage interface. Each tuple will be evaluated against the search parameters. If the tuple is valid, it would be stored in the associative stacks or would be routed to the user; otherwise, it would be ignored.
Each active cell can be in one of the following cycles: busy cycle or ready cycle. A cell is in the busy cycle if it is performing an operation on a tuple assigned to it by the priority network of the secondary storage interface. This cycle starts from the time the decoder in SSI opens its gate to the cell. Each busy cycle is followed by a ready cycle, in which the cell would be idle waiting for its request to receive another tuple. It is clear that the system is more efficient if the idle times are minimized.
Cells. Cells will be used as a selector. The specified domains of each tuple will be selected. Each generated subtuple will be sent to the associative stack module if the tuple satisfies the search condition(s).

Figure 4.18. Non-numeric processor

Cells are an array of small identical processors capable of performing some search operations on the tuples. Each cell can be viewed as performing the task of the processor attached to the read/write head of the Slotnick's design (Slotnick 1970). In our design, however, these processors are not part of the secondary devices. Cells perform operations independent of each other; when an operation is started in the cells by the controller, there is no need for any control by the controller. The independence of the cells from each other has resulted in a highly modular system. The system provides a degree of fault tolerance, in the sense that, if there are enough cells the faulty cells could be by-passed.
Each cell is composed of four interconnected parts as follows:
Domain recognizer
Input register
Test Logic circuit
Buffer
Figure 4.19 shows a cell and the interconnection between its components.
Domain recognizer. Recall that the ASL instructions are of the form:
HOW WHERE WHAT
Based on this format the domains of each tuple can be classified as follows:
1) Those which are the members of the search criteria (HOW);
2) Those which are the members of the output set (WHAT); and,
3) Those which are not members of either 1) or 2).
Among these three classes, domains in 1) and 2) are of our concern in a query. The domain recognizer selects members of 1) and 2). This selection is simply based on the counting of the domain separator in each tuple. Among the selected domains, those which are members of 1) will be gated to the test logic circuit and those which are members of 2) will be transferred to the input register.

Figure 4.19. Cell
The domain recognizer consists of two automatic rotate registers, four match circuits and a counter. The automatic rotate registers hold two sequences of numbers. One shows the domains in the output set and the other one shows the domains in the search set. Two match circuits simply match the routed string of characters (tuple) against the domain separator and tuple separator markers. If a domain separator is sensed, the content of the counter will be incremented by one. At each cycle of the operation the content of the counter will be matched against the contents of the automatic rotate registers. In case of a match, the content of the domain will be gated to the input register or test logic circuit. Figure 4.20 depicts the hardware organization of the domain recognizer.

Figure 4.20. Domain recognizer

The hardware of the domain recognizer can be simplified in two ways:
A) deletion of the match circuit associated with the tuple separator of Figure 4.20 and broadcasting the tuple separator signal from the SSI.
B) combining the automatic rotate registers into one automatic rotate register and associating a code with each of the domains which specifies whether the domain belongs to the output set or the search criteria.
By this simplification the hardware cost of the cells will be decreased. Figure 4.21 shows the domain recognizer after the above modifications.
In our design we have chosen the first design of the domain recognizer because of its simplicity and speed, at the price of hardware cost.
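The counting scheme of the domain recognizer can be sketched in software as follows. The sketch assumes that the stream has already been stripped of its BOR and EOR markers, that the counter starts at 1 for the first domain of each tuple, and that the two sets of sequence numbers play the role of the two automatic rotate registers; the function name and the dictionaries used to collect the gated domains are hypothetical.

```python
DOMAIN_SEP, TUPLE_SEP = "$", "¢"


def recognize(stream, search_set, output_set):
    """Route domains by sequence number.

    Returns, for each tuple, a pair (to_test, to_input): the domains gated to
    the test logic circuit and the domains gated to the input register.
    """
    results = []
    counter, chars = 1, []
    to_test, to_input = {}, {}
    for ch in stream + TUPLE_SEP:          # end of stream closes the last tuple
        if ch not in (DOMAIN_SEP, TUPLE_SEP):
            chars.append(ch)
            continue
        domain = "".join(chars)
        chars = []
        if counter in search_set:          # match against the search-set register
            to_test[counter] = domain
        if counter in output_set:          # match against the output-set register
            to_input[counter] = domain
        if ch == DOMAIN_SEP:
            counter += 1                   # domain separator sensed
        else:
            results.append((to_test, to_input))
            counter, to_test, to_input = 1, {}, {}   # tuple separator resets
    return results


# Illustrative stream: search on domain 3, output domains 1 and 2.
sample = "A-102$Endress B.$21000$4$A¢N-021$Roof J.$18500$4$N"
routed = recognize(sample, search_set={3}, output_set={1, 2})
```

Domains whose sequence number is in neither set are simply skipped, which corresponds to class 3) of the classification.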
Input Register. The members of the output set "what" will be retained in a shift register of the input register. In ASL we are dealing with variable length domains. Performing an operation in the associative memory in bit slice fashion forces the domains to be of fixed size. Therefore, the members of the output set should be changed into the fixed format before they are gated to the associative stack module or the user via GPC. This operation will be performed in the input register by appending special "zero" characters "0," if necessary, to make the length of the domains equal to the maximum allowable length.
The input register consists of an automatic rotate register, a shift register, a "zero" generator and some additional hardware such as a subtractor and a comparator as shown in Figure 4.22. The shift register, the subtractor and the comparator work jointly as an automatic rotate register with three simple modifications: firstly, the above combination is not a circular shift register; secondly, the number of positions which should be shifted is not fixed and is a function of the maximum length of the domain and the actual length of the domain in the tuple; and, thirdly, the shift register while being shifted right will be padded with 0's coming from the zero generator. The automatic rotate register as shown in Figure 4.22 holds the maximum length of the domains in the output set. The "zero" generator is simply a latch holding the special character "0."

Figure 4.21. Another version of the domain recognizer circuit

Figure 4.22. Input register

Test Logic Circuit. The test logic circuit consists of two dimensional arrays of identical units. Each unit is composed of three components: two registers (R1, R2) and a comparison circuit. R1 holds a character of the search argument, and R2 holds a character of the search variable. The comparison circuit is able to check the content of R1 against the content of R2 with respect to the <, =, >, ≤, ≠, and ≥ operations. Figure 4.23 shows a unit and its components. Each unit, or in general, the system is similar to the Lee machine (Lee and Paul 1963) with the following differences:
1) identical units are arranged in 2 dimensional arrays; and,
2) control lines which are used to transfer commands and data have been eliminated, since each unit stores the search argument and search variable and the comparison operation. In fact, the system can be seen as a fully associative memory.
Operations in the test logic circuit are performed in four steps:
1) Load search argument (Load R1)
2) Load comparison operator
3) Load search variable (Load R2)
4) Execute (compare R1 and R2 according to the operator).

Figure 4.23. The hardware of each unit

Steps 1) and 2) will be performed once before the execution of the query and thereafter for each tuple steps 3) and 4) will be repeated. Each step should be executed by each unit in the two dimensional arrays. Execution of the above steps in the test logic circuit is as follows:
Step 1) is strictly sequential, from left to right, top to bottom.
Step 2) the same as Step 1.
Step 3) is row-wise sequential, but column-wise parallel. Therefore, all the units in a column will store the same search variables from the input data.
Step 4) the same as Step 3. This means that all the units in a column will compare the contents of R2 and R1 simultaneously, while the operations will be executed in sequential fashion in each row. Figure 4.24 shows the connection of the units in each row and Figure 4.25 shows the schematic of the 2-dimensional test logic circuit.

Figure 4.24. Each row of the test logic circuit

Figure 4.25. The hardware organization of the test logic circuit

In each ASL instruction the search criteria is of the following form:

\[ Q_1 \lor Q_2 \lor \cdots \lor Q_n \]

where each \(Q_i\) (\(1 \le i \le n\)) is of the form:

\[ \bigwedge_{j} P_{ij}, \quad 1 \le i \le n, \; 1 \le j \le m \]

where each \(P_{ij}\) is of the form:

<DOMAIN> <RELATIONAL OP> <VALUE>

In the test logic circuit row i will be assigned to \(Q_i\); moreover, \(\bigwedge_{j} P_{ij}\) will be changed to a fixed canonic format as:

\[ Q_i = P_{i1} \land P_{i2} \land \cdots \land P_{im} \]

where m is equal to the number of distinct domains in the search criteria. As an example consider the following query:

(CITY = "ORLANDO" ∧ SALARY < "20000") ∨ (STATE = "IOWA" ∧ SEX = "1")

where n is equal to 2, m is equal to 4, so,

Q1 = (CITY = "ORLANDO" ∧ SALARY < "20000")

and

Q2 = (STATE = "IOWA" ∧ SEX = "1").

This query will be translated to:

(CITY = "ORLANDO" ∧ SALARY < "20000" ∧ STATE = "aaa" ∧ SEX = "a") ∨ (CITY = "aaaaa" ∧ SALARY = "aaaa" ∧ STATE = "IOWA" ∧ SEX = "1")

where a is a don't care symbol.
Based on the design of the test logic circuit, the number of terms (Q's) in search criteria is dependent on hardware implementation (number of rows in the test logic circuit). The same argument is valid about the number of distinct domains in a search criteria.
Buffer: Buffer is simply a shift register which retains the content of each tuple routed to the cell. The content of this buffer is transferred to the backend storage in modify and delete operations.
Associative stack module: In our design, tuples in the secondary storage are not preceded by any control bits; this is the result of the subrelation which is explicitly generated in the associative stack module. The elements of the subrelations are transferred from the cells via a dedicated bus to the associative stack module.
The associative stack module is a set of associative stacks. An associative stack is an associative memory augmented by a register, the "top register," which points to the topmost element in the stack. A stack is empty if the content of the top register (TR) is zero. A stack is full if the content of TR is equal to the maximum size of the stack. The value of this register is initially zero. A search through the memory will be against all the elements in the memory up to the position which is pointed to by the TR.
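A search bounded by the top register can be sketched as follows. This is an illustrative model only: the class name `AssociativeStack`, the character-wise mask string and the sequential loop are assumptions, and a real associative memory would compare all rows in parallel.

```python
class AssociativeStack:
    """Toy model: a fixed-depth memory plus a top register (TR)."""

    def __init__(self, depth):
        self.rows = [None] * depth  # memory words
        self.tr = 0                 # top register; zero means the stack is empty

    def is_empty(self):
        return self.tr == 0

    def is_full(self):
        return self.tr == len(self.rows)

    def search(self, comparand, mask=None):
        """Return the 1-based row numbers, up to TR, that match the comparand.

        A '1' in the mask means the character position takes part in the
        comparison; a '0' means it is ignored.
        """
        if mask is None:
            mask = "1" * len(comparand)
        hits = []
        for row in range(1, self.tr + 1):  # rows above TR are never examined
            word = self.rows[row - 1]
            if all(m == "0" or w == c for w, c, m in zip(word, comparand, mask)):
                hits.append(row)
        return hits
```

Lowering TR hides rows from the search without erasing them, which is the property the set operations rely on when they save and restore TR.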
For this design, there are two main parameters, dependent on the technology:

1) The size of the associative stacks; and,

2) The number of associative stacks.

If we accept the fact that during the query we are concerned with a small portion of the data file (90-10% rule), the first parameter is solvable, since current technology will provide associative memory of the suitable depth (Chapter II) capable of holding the tuples selected by the test logic circuit. Moreover, since the ASL primitives are binary operations, at most three associative stacks are sufficient. Therefore, the solution for the second parameter is a trade-off between speed and cost. The situation is comparable to that which exists in virtual memory, where a program is executed faster if the number of pages assigned to it is larger. The concept of page in, page out during the page fault and replacement algorithms can be used in this design in order to assign an associative stack to a relation when a stack fault occurs.
A sequence of registers (control bits) is assigned to each associative stack. These registers could be referred to by the ASL primitives, and their contents could be interrogated during the operations. The contents of these registers will also be set and manipulated by some ASL primitives. The concept of the control bits is similar to that used in the B-1700 series (Baron et al. 1977; Salisbury 1976), where a group of registers is attached to a 24-bit functional box and their contents are manipulated by this box. The ASL operations on the elements of the stacks similarly change the contents of their associated bits.
Associative stacks can transfer information among each other. This is essential for some operations such as "UNION." A common data bus establishes this communication between different associative stack modules. Figure 4.26 shows the general configuration of the associative stack modules. The modularity and independence of the associative stacks from each other has resulted in an expandable system with a degree of fault tolerance. In general, the major components of the associative stack module are:

A) associative stacks

B) common data bus

C) shifter

D) blank remover circuit

E) reformat circuit

A) Associative stacks. To each associative stack a set of registers has been assigned.

Top register (TR): This register always points to the topmost element of the stack. Any search through the associative stack would be through all the elements in the stack from the 1st row up to the row pointed to by the TR. The number of bits in TR is a function of the maximum depth of stack (e.g. a stack with maximum depth 16K implies a TR of 14 bits).

Pointer register (PR): This register points to that element of the stack which is currently being manipulated by the hardware. The size of this register is equal to that of the TR.

Reserve register (RR): For each stack, RR is a buffer which holds the content of the top register. As will be discussed later, in some operations we might need to temporarily change the content of the TR. This register will be used to save the content of TR temporarily. The size of RR is the same as that of TR.

Figure 4.26  The general organization of associative stacks module (C: Comparand register; M: Mask register; O: Output register)

Figure 4.27  Set of registers associated to each associative stack (Pointer Register, Top Register, Reserve Register)

Figure 4.28  The control bits (Stack Empty, Stack Full, Match Bit)

To each TR a control register of three bits is assigned. These control bits are stack empty, stack full and match bit.

Stack empty indicates whether the stack is empty or not. If the content of TR is equal to zero the content of this bit is set to 1; otherwise, it is set to 0.

Stack full indicates whether the stack is full or not. If the stack is full this bit will be set to 1; otherwise, it would be set to 0.

Match bit holds an overall result of a search through the associative stack. If there is at least one match after a search the content of this bit is set to 1; otherwise, it would be set to zero.

The contents of the stack empty and stack full control bits could be set automatically by the hardware whenever an element is pushed into the stack or popped from the stack.

Figures 4.27 and 4.28 show the above-referenced registers. In these figures each row is assigned to an associative stack.

To each top register an incrementer/decrementer circuit is assigned, which manipulates the content of the top register. All the registers discussed above are private to each associative stack; therefore they are not connected to the common data bus.
B) Common data bus. As can be seen from Figure 4.26 the common data bus connects all the associative stacks with each other (through comparand registers and output registers). Moreover, it connects the "shifter," "blank remover circuit" and "reformat circuit" to the associative stacks.
C) Shifter. This is simply a circular shift register capable of rotating the contents of the register to the left or right. This register is composed of two halves. The programmer can address the register as a whole or any half of the register. The size of the shifter is equal to that of the comparand or output register.

D) Blank remover circuit. In some operations such as "PROJECTION," as will be seen later, some intermediate blanks might be created during the operation. Tuples of this type should be packed. The blank remover circuit is simply a device for this purpose. It is composed of two "shift registers" and a "comparison circuit." Figure 4.29 shows this part of the system. The first shift register receives data from the common data bus (unpacked data). The content of the first shift register will be routed to the comparison circuit character by character; the comparison circuit simply checks each character against the blank, and if a character is not a blank character it would be gated to the second shift register; otherwise, it would be ignored. At the end of the operation on a tuple the content of the second shift register will be gated to the common data bus. The sizes of the 1st and 2nd registers are equal to that of the comparand or output register.
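The packing step can be sketched in a few lines. The function name `remove_blanks` is hypothetical, and the sketch assumes that a blank never occurs as legitimate data inside a field.

```python
def remove_blanks(unpacked, blank=" "):
    """Pack a tuple by dropping every blank character."""
    shift_register_2 = []
    for ch in unpacked:              # shift register 1 emits one character at a time
        if ch != blank:              # comparison circuit: test against the blank
            shift_register_2.append(ch)
    return "".join(shift_register_2)  # gated back to the common data bus
```

For example, `remove_blanks("AB  CD ")` yields the packed string `"ABCD"`.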
Figure 4.29  Remove blanks circuit

E) Reformat circuit. This is a hardware circuit which changes the format of each tuple in the associative stack so as to correspond to that of the tuples on the secondary storage. Recall that a pair of adjacent fields is separated by a domain separator and a pair of adjacent tuples is separated by a tuple separator. In fact this is a hardware implementation of the format procedure in the conventional systems. The circuit of this component is similar to that of the input register of the cells.
The circuit is composed of:

Input register: simply a right to left shift register which receives the tuples from the common data bus.

Maximum register: a circular shift register which holds the maximum lengths of all of the domains in the tuple. The content of this register will be set up by the controller.

Counter and match circuit: for each character which will be transferred from the input register to the output register, the content of the counter will be matched against the content of the maximum register; if it matches, a domain separator or a tuple separator will be inserted into the output register and meanwhile the maximum register will be circulated one position. For all trailing zeros the counter will be incremented but the character (0) will not be gated to the output register.

Output register: stores the data received from the input register or the "¢" or "$" generators. It is a right to left shift register which is directly gated to the backend storage.

Figure 4.30 shows the hardware organization of this circuit. The direction of the shift registers will specify the operations more specifically.
Figure 4.30  Reformat circuit

Backend Storage
As mentioned before, our system is capable of performing storage operations as well as retrieval operations. The storage operations are possible because of the backend storage. This storage is a fast secondary storage which will be used to retain tuples of the relation during the deletion or modification operations. The backend storage gets its data from the cells or the associative stack module (reformat circuit). The speed of the backend storage should be fast enough to compete with the speed of the hardware operations in the non-numeric processor.
The Operation of ASLH
In the previous sections the overall flow of data in ASLH has been discussed. In this section the operations of the different modules of ASLH and the overall flow of operation in ASLH will be discussed.
Index Processor
There are three types of operations which are performed by the index processor: READ, INSERT and DELETE.

Read. This operation will be performed by the compiler during the compilation. Figure 4.31 shows the sequence of the steps. The name of a relation will be searched through the AM (Associative Memory) and then the contents of the descriptor will be accessed for generation of the symbol table.

Operation will be started by setting the mask register (to all 1's) and comparand register (relation name). The content of the comparand register will be matched against all valid rows (valid rows are those for which the validity bits are set). If the search is successful the corresponding descriptor on the RAM will be accessed and passed to the compiler. This has been shown in the flowchart by:

MAR ← (Pointer)
MBR ← (Descriptor)

If the relation name does not exist an error signal will be sent to the compiler.

Figure 4.31  READ
Insert. This operation inserts new entries in the AM and RAM parts of the index processor. Figure 4.32 shows the operation. The operation is started by a search through the contents of the associative memory in the same way as described in the READ operation. If there is a match an error signal is generated, because duplicate relation names are not allowed in our system; otherwise the relation name, the contents of the AR registers of the index processor and secondary storage interface, and the sizes of the descriptors in the index processor and SSI will be inserted into the first available row of the AM, and this row will be marked by setting its validity bit. After this the contents of the descriptor in the RAM part will be inserted in the available space shown by the content of the AR register of the index processor. This has been shown by:

MAR ← (AR)
MBR ← information

At the end the contents of the AR registers will be updated by the sizes of the descriptors in the index processor and SSI (AR's ← (AR)'s + Sizes).

Figure 4.32  Insert

Delete. Relations might be deleted from the system. In our system erasing a relation from the system is equivalent to erasing its entry from the index processor and secondary storage interface. Deletion of an entry from the index processor, as shown in Figure 4.33, is performed in two steps. First, the entry will be erased from the AM part and then the descriptor will be eliminated from the RAM part.

The operation will be started by a search through the contents of the rows in the AM part in the same way as discussed in the READ operation. If the search is not successful an error signal will be sent to the controller; otherwise the entries will be eliminated from the AM by setting the validity bit of the selected row to "0" and from the RAM part by setting the first bit (delete bit) of the descriptor to "1." Access to the descriptor is obtained by setting the MAR to the address of the descriptor which is held in the AM part.

Since deletion of a relation from a data base system occurs rarely, there is no automatic garbage collection in RAM. Whenever space is needed a software garbage collection should be called to collect the garbage. By garbage we mean all the descriptors which have been marked in the RAM part.
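The three index processor operations can be summarized in a small software model. This is a sketch under simplifying assumptions, not the actual hardware: the class `IndexProcessor` and its field names are hypothetical, every descriptor occupies exactly one RAM slot (so AR advances by one rather than by the descriptor size), and the SSI side is left out.

```python
class IndexProcessor:
    """Toy model of the AM part (names) and RAM part (descriptors)."""

    def __init__(self, am_rows):
        self.am = [{"valid": 0, "name": None, "pointer": None}
                   for _ in range(am_rows)]
        self.ram = []  # each entry is [delete_bit, descriptor]
        self.ar = 0    # AR register: next free RAM address

    def _match(self, name):
        """Match on equality against the valid AM rows only."""
        for row in self.am:
            if row["valid"] and row["name"] == name:
                return row
        return None

    def read(self, name):
        row = self._match(name)
        if row is None:
            raise KeyError("invalid relation")  # error signal to the compiler
        return self.ram[row["pointer"]][1]      # MAR <- pointer, MBR <- descriptor

    def insert(self, name, descriptor):
        if self._match(name) is not None:
            raise ValueError("duplicate relation name")
        # First available AM row; raises StopIteration if the AM is full.
        row = next(r for r in self.am if not r["valid"])
        row.update(valid=1, name=name, pointer=self.ar)
        self.ram.append([0, descriptor])        # MAR <- (AR), MBR <- information
        self.ar += 1                            # AR <- (AR) + size, size assumed 1

    def delete(self, name):
        row = self._match(name)
        if row is None:
            raise KeyError("invalid relation")
        row["valid"] = 0                        # erase the entry from the AM part
        self.ram[row["pointer"]][0] = 1         # mark the descriptor as garbage
```

Note that `delete` leaves the descriptor in RAM with its delete bit set, so the space is reclaimed only by a separate garbage collection pass, as the text describes.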
Secondary Storage Interface
The set of the operations in the SSI can be partitioned into two groups: those which just manipulate the data in the SSI memory (INSERT, DELETE operations) and those which route data to the non-numeric processor (READ operation).
Delete. This operation is the same as a part of the delete operation in the index processor, where the first bit of the descriptor in the RAM will be set to 1. The address of the descriptor will be obtained from the AM part of the index processor.

Figure 4.33  Delete
Insert. Since access to secondary storage varies from system to system, as well as from technology to technology, this operation will be performed directly by the GPC. For each new relation an entry will be inserted in the RAM part of the secondary storage interface.
Read. This instruction will access information on the secondary storage and route it to the non-numeric processor. The sequence of steps has been depicted in Figure 4.34. The content of the descriptor will be accessed from the RAM memory of the SSI; this has been shown as:

MAR ← address of the descriptor
MBR ← content of the descriptor

The content of the descriptor will be transferred to the channel processor, in order to have access to the stored data on the secondary storage (Channel processor ← (MBR)). After this step the data on the secondary storage is routed to the distributor box. The tuples are stored in the queue and then transferred to the cells based on FIFO policy. Each cell will initiate a signal to the priority network. The priority network will select a candidate cell among the ready cells. Based on the information from the priority network the decoder gates the topmost tuple in the queue to the candidate cell. Meanwhile, the priority network selects the next candidate cell. All these steps have been shown by the "Route tuple to the cell" in the flowchart. These steps will be repeated for all the tuples of the relation. At the end of the relation, cells will be released and the controller starts to perform the next operation.

Figure 4.34  READ

The Operation of Cell
As mentioned before, cells are part of the non-numeric processor, which receives tuples from the secondary storage interface. Based on the search criteria each tuple will be evaluated in each cell. If a tuple satisfies the search criteria it would either be stored in the associative stack module or routed to the user; otherwise it will be ignored. In this section we discuss the data flow and the sequence of the operations within the cells. Note that at the beginning of execution of the query, before any tuple can be transferred to the cells, the contents of all the automatic rotate registers of the cells are set by the controller.
Flow of Data in the Domain Recognizer: Figure 4.35 shows the sequence of steps. Recall that the domain recognizer is composed of two automatic rotate registers, a counter and four comparison circuits. For each tuple routed to the cell, the domain recognizer compares the characters against "¢" and "$." If the input character is "¢" the operation in the domain recognizer will be terminated. This will be done by setting the automatic rotate registers so as to restore the initial configuration. If the input character is "$" the content of the counter will be incremented by one and the content of the counter is matched against the contents of the automatic rotate registers. If there is no match, the domain right after "$" is not a part of the search criteria (HOW) or output set (WHAT); otherwise the content of the domain is gated to the test logic circuit or input register until the end of the domain (until the next domain separator is sensed) and the corresponding automatic rotate register(s) will be circulated.

Figure 4.35  Sequence of operations in the domain recognizer
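The character-driven behaviour of the domain recognizer can be sketched in software. Several points here are assumptions rather than facts from the design: every domain is taken to be preceded by "$", the tuple is taken to end with "¢", and the two automatic rotate registers are replaced by plain sets of domain positions. The function name `recognize` is hypothetical.

```python
def recognize(tuple_chars, search_positions, output_positions):
    """Split one incoming tuple into the domains needed downstream.

    search_positions: domain numbers used in the search criteria (HOW).
    output_positions: domain numbers in the output set (WHAT).
    Returns (domains for the test logic circuit, domains for the input register).
    """
    counter = 0
    to_test_logic, to_input_register = {}, {}
    gate_test = gate_out = False
    for ch in tuple_chars:
        if ch == "¢":                 # tuple separator: terminate
            break
        if ch == "$":                 # domain separator: count and compare
            counter += 1
            gate_test = counter in search_positions
            gate_out = counter in output_positions
            if gate_test:
                to_test_logic[counter] = ""
            if gate_out:
                to_input_register[counter] = ""
            continue
        if gate_test:                 # gate character to the test logic circuit
            to_test_logic[counter] += ch
        if gate_out:                  # gate character to the input register
            to_input_register[counter] += ch
    return to_test_logic, to_input_register
```

With the tuple `"$ORLANDO$20000$SMITH¢"`, search positions `{1, 2}` and output positions `{3}`, the first two domains go to the test logic circuit and the third to the input register.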
Flow of Data in the Input Register: Figure 4.36 shows the sequence of operations in the input register. For each tuple the content of each domain in the output set of the ASL instruction will be stored into the shift register of the input register. If the size of the domain is less than the maximum length of that domain, it will be padded with the appropriate number of zeros to its left. Meanwhile the circular shift register in this part of the cell will be circulated one location. At the end, when the contents of all the specified domains in the output set have been stored in the shift register, the automatic rotate register will be rotated so as to restore its initial configuration.
Flow of Data in the Test Logic Circuit: The contents of the domains specified in the search set are stored in the arrays of the elements of this unit. Figure 4.37 shows the sequence of steps. Before the operation, the contents of the elements as mentioned before should be set by the controller. This fact has been shown by "set array" in the flow chart. For each domain in the search set its content is gated to this part; at the end of each tuple, the search will be started. The output signal from the test logic circuit will determine the validity of the tuple. For a valid tuple the content of the input register will be sent either to the user (in case of simple retrieval) or the associative stack module (complex retrieval). After this step the cell goes to the ready cycle by sending a ready signal to the secondary storage interface.

Figure 4.36  Sequence of operations in the input register

Figure 4.37  Sequence of operations in the test logic circuit

Sequence of the Operations in a Cell: Figure 4.38 shows the sequence of the operations in a cell. Before data is read from the secondary storage the contents of all the automatic rotate registers in the domain recognizer, input register and the elements of the test logic circuit should be set by the controller. After receiving a tuple from the secondary storage interface the contents of the domains specified in the output set or search set are gated respectively to the input register or test logic circuit. At the end of the tuple, the tuple is evaluated against the search criteria. For valid tuples the content of the input register is gated to the GPC or associative stack module; otherwise the tuple would be ignored. Then the contents of all the automatic rotate registers are set to their initial configuration and the cell goes to the ready cycle.
Figure 4.38  Sequence of operation in a cell (gate 1: connection line to the input register; gate 2: connection line to the test logic circuit)

Flow of Data in the Associative Stack Module
The associative stack module performs the underlying set-theoretic operations on which the data base functions are defined in the relational model. The set of operations for the associative stack module is classified into three subsets:

1) Stack operations:
POP and PUSH

2) Set operations:
SELECTION, PROJECTION, JOIN, DIVISION, UNION, INTERSECTION, DIFFERENCE, CARTESIAN PRODUCT, SET INCLUSION and SET EQUALITY.

3) The usual associative operations on the memory:
MATCH, SELECT FIRST MATCH, ARITHMETIC OPERATION on MASS of DATA, READ and WRITE.

The operations in class 1 and class 2 will be discussed in this section.
Push. PUSH is the insertion of a new element (content of the comparand register) into the associative stack. Figure 4.39 shows the sequence of the steps. Since in our organization a duplicate tuple is not permissible, the content of the comparand register will be matched against all the valid elements in the associative stack. If there is no match, the top register will be incremented by one; then the content of the comparand register will be inserted into the location which is pointed to by the top register (copy). This operation might change the content of the test bits (automatically by the hardware).

Pop. The topmost element of the stack will be popped into the output register, then the content of the top register will be decremented by one. Figure 4.40 shows the steps of the operation. This operation might change the content of the test bits.
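PUSH and POP, together with the three control bits, can be sketched as follows. The class and attribute names are hypothetical, and the overflow and underflow checks are an assumption about what the first decision box of each flowchart tests.

```python
class AssociativeStack:
    """Toy model of one associative stack with its control bits."""

    def __init__(self, depth):
        self.rows = [None] * depth
        self.tr = 0          # top register
        self.match_bit = 0   # overall result of the last search

    @property
    def stack_empty(self):
        return int(self.tr == 0)

    @property
    def stack_full(self):
        return int(self.tr == len(self.rows))

    def match(self, comparand):
        """Match on equality against the valid rows (1 .. TR)."""
        hit = comparand in self.rows[:self.tr]
        self.match_bit = int(hit)
        return hit

    def push(self, comparand):
        if self.match(comparand):      # duplicate tuples are not permissible
            return False
        if self.stack_full:
            raise OverflowError("stack full")
        self.tr += 1                   # TR <- TR + 1
        self.rows[self.tr - 1] = comparand  # copy into the row TR points to
        return True

    def pop(self):
        if self.stack_empty:
            raise IndexError("stack empty")
        out = self.rows[self.tr - 1]   # READ the topmost element
        self.tr -= 1                   # TR <- TR - 1
        return out
```

A popped row is not erased in this model; only TR moves. That is what allows an operation to recover the whole stack later by restoring a saved copy of TR.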
Figure 4.39  PUSH

Figure 4.40  POP

Union. If A and B are two union compatible relations, then

$$A \cup B = \{c \mid c \in A \text{ or } c \in B\}.$$

In this operation two associative stacks are involved where the contents of the second stack will be inserted into the first stack. Figure 4.41 shows the sequence of the steps. In this figure, or in general in all the operations in the following discussion, whenever more than one associative stack contributes in the operation, indices will be attached to the components of each operation in order to define on which stack the operation should be performed.

The operation will be started by setting the mask register of both associative stacks to all 1's. Then the operator initiates a sequence of pops from the second stack and pushes into the first stack for all the elements of the second stack.

At the end of the operation the contents of the second associative stack will not be lost, as a result of saving the content of the top register of the second stack (reserve register 2 ← top register 2) and restoring it at the end of the operation (top register 2 ← reserve register 2). Note that in Figure 4.41, $C_1 \leftarrow O_2$ means a transmission of the content of the output register of the second stack to the comparand register of the first stack.

Figure 4.41  UNION
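The save, pop, push and restore pattern of UNION can be sketched with a toy stack. The `Stack` class below is a hypothetical minimal model (a Python list plus top and reserve registers), not the hardware.

```python
class Stack:
    """Toy associative stack: rows[0:tr] are the valid elements."""

    def __init__(self, items=()):
        self.rows, self.tr, self.rr = [], 0, 0
        for item in items:
            self.push(item)

    def match(self, item):
        return item in self.rows[:self.tr]

    def push(self, item):
        if not self.match(item):               # duplicates are rejected
            self.rows[self.tr:self.tr + 1] = [item]
            self.tr += 1

    def pop(self):
        self.tr -= 1                           # the row stays in memory
        return self.rows[self.tr]


def union(s1, s2):
    """Insert every element of s2 into s1, leaving s2 intact."""
    s2.rr = s2.tr             # reserve register 2 <- top register 2
    while s2.tr > 0:
        s1.push(s2.pop())     # POP 2, C1 <- O2, PUSH 1
    s2.tr = s2.rr             # top register 2 <- reserve register 2


a, b = Stack(["x", "y"]), Stack(["y", "z"])
union(a, b)
```

After the call, `a` holds x, y and z, while `b` still exposes both of its elements because its top register was restored.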
Intersection. If A and B are two union compatible relations then:

$$A \cap B = \{c \mid c \in A \text{ and } c \in B\}$$

The common elements of the two associative stacks will be selected into a third stack. Figure 4.42 depicts the steps of the operation. The operation starts by setting the content of the mask registers of all stacks to all 1's. A sequence of pops from the second stack and matching the popped element against the contents of the first stack performs the operation. Each element of the second stack will be popped to the comparand registers of the first and third stacks:

$$C_1 \leftarrow O_2, \qquad C_3 \leftarrow O_2$$

A match on equality will check the content of the comparand register of the first stack against its contents. In case of a match, the content of the comparand register of the third stack is pushed into the stack (PUSH 3); otherwise it will be ignored. At the end of the operation no information will be lost because of the same reason as mentioned for the union operation.

Figure 4.42  Intersection operation
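A minimal sketch of the intersection steps follows, using a hypothetical toy `Stack` class (a list plus top and reserve registers) in place of the associative hardware.

```python
class Stack:
    """Toy associative stack: rows[0:tr] are the valid elements."""

    def __init__(self, items=()):
        self.rows, self.tr, self.rr = [], 0, 0
        for item in items:
            self.push(item)

    def match(self, item):
        return item in self.rows[:self.tr]

    def push(self, item):
        if not self.match(item):               # duplicates are rejected
            self.rows[self.tr:self.tr + 1] = [item]
            self.tr += 1

    def pop(self):
        self.tr -= 1                           # the row stays in memory
        return self.rows[self.tr]


def intersection(s1, s2):
    """Return a third stack holding the elements common to s1 and s2."""
    s3 = Stack()              # top register 3 <- 0
    s2.rr = s2.tr             # reserve register 2 <- top register 2
    while s2.tr > 0:
        item = s2.pop()       # C1 <- O2 and C3 <- O2
        if s1.match(item):    # match stack 1 on equality
            s3.push(item)     # PUSH 3
    s2.tr = s2.rr             # top register 2 <- reserve register 2
    return s3


a, b = Stack(["x", "y", "z"]), Stack(["y", "z", "w"])
c = intersection(a, b)
```

Both input stacks are unchanged afterwards; only the second stack's top register moved during the loop, and it is restored at the end.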
Difference. If A and B are two union compatible relations then:

$$A - B = \{c \mid c \in A \text{ and } c \notin B\}$$

The sequence of the steps is the same as in the intersection operation, except in this case the search in the first stack will be on not equality. Figure 4.43 shows the sequence of steps.

Inclusion. If A and B are two union compatible relations then:

$$A \subset B \iff \forall a,\ a \in A \rightarrow a \in B$$

The steps of this operation are the same as union, except no elements will be inserted into the first stack. Figure 4.44 shows the operation. It should be mentioned that the contents of A and B will be unchanged at the end of the operation, because of saving the content of the top register of the second associative stack and restoring it at the end of the operation. It should be noted that in the flowchart of Figure 4.44 stack 2 stands for relation A and stack 1 stands for relation B in the above formula.

Figure 4.43  Difference operation

Figure 4.44  Inclusion
Set equality. If A and B are two union compatible relations then:

$$A = B \iff (\forall a,\ a \in A \rightarrow a \in B) \text{ and } (\forall b,\ b \in B \rightarrow b \in A)$$

The sequence of the operations is the same as inclusion, where the inclusion should be checked for both A and B (i.e., $A \subset B$ and $B \subset A$). In our design, the equality will be checked as depicted in Figure 4.45. As can be seen, the inclusion will be checked for both relations one after the other. The operation will be performed by a sequence of pops from both stacks followed by a match on equality for both stacks. At the end of the operation the contents of both top registers are restored, therefore no information will be lost.
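Inclusion and set equality can be sketched as two passes of the same loop. The toy `Stack` class and the function names are hypothetical, and checking the two inclusions one after the other is a simplification of the interleaved flowchart.

```python
class Stack:
    """Toy associative stack: rows[0:tr] are the valid elements."""

    def __init__(self, items=()):
        self.rows, self.tr, self.rr = [], 0, 0
        for item in items:
            self.push(item)

    def match(self, item):
        return item in self.rows[:self.tr]

    def push(self, item):
        if not self.match(item):               # duplicates are rejected
            self.rows[self.tr:self.tr + 1] = [item]
            self.tr += 1

    def pop(self):
        self.tr -= 1                           # the row stays in memory
        return self.rows[self.tr]


def included(s_a, s_b):
    """True if every element of s_a also occurs in s_b."""
    s_a.rr = s_a.tr                   # save the top register
    inclusion_bit = 1                 # set inclusion bit
    while s_a.tr > 0 and inclusion_bit:
        if not s_b.match(s_a.pop()):  # match the other stack on equality
            inclusion_bit = 0         # reset inclusion bit
    s_a.tr = s_a.rr                   # restore the top register
    return bool(inclusion_bit)


def set_equal(s1, s2):
    """Equality as inclusion in both directions."""
    return included(s1, s2) and included(s2, s1)


a, b = Stack(["x", "y"]), Stack(["y", "x", "z"])
```

Here `included(a, b)` holds while `included(b, a)` does not, so `set_equal(a, b)` is false.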
Figure 4.45  Set equality

Cartesian product. If A and B are two sets the Cartesian product of A and B will be defined as:

$$A \times B = \{ab \mid a \in A \text{ and } b \in B\}$$

Therefore the Cartesian product of A and B is simply a relation whose elements are formed by concatenation of elements in A and B. Figure 4.46 shows the sequence of steps. The operation is started by setting the contents of all mask registers to all 1's. This will be used to read out the elements of the 1st and 2nd stacks. As can be seen, each element of the first stack will be popped into the shift register (shift 1) of the associative stack module. Then all the elements of the second stack will be popped and concatenated to the content of "shift 1" one at a time, and after concatenation, the contents of the "shift" will be pushed into a third stack. At the end of the operation the contents of the first and second stacks will be addressable, since the contents of the top registers of the above stacks will be saved at the beginning of the operation and restored at the end of the operation.

Figure 4.46  Cartesian product
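The two nested loops of the Cartesian product can be sketched as follows. The toy `Stack` class is a hypothetical minimal model, and string concatenation stands in for the shifter.

```python
class Stack:
    """Toy associative stack: rows[0:tr] are the valid elements."""

    def __init__(self, items=()):
        self.rows, self.tr, self.rr = [], 0, 0
        for item in items:
            self.push(item)

    def match(self, item):
        return item in self.rows[:self.tr]

    def push(self, item):
        if not self.match(item):               # duplicates are rejected
            self.rows[self.tr:self.tr + 1] = [item]
            self.tr += 1

    def pop(self):
        self.tr -= 1                           # the row stays in memory
        return self.rows[self.tr]


def cartesian(s1, s2):
    """Return a third stack holding every concatenation ab."""
    s3 = Stack()                      # top register 3 <- 0
    s1.rr = s1.tr                     # save TR of the first stack
    while s1.tr > 0:
        shift_1 = s1.pop()            # first half of the shifter
        s2.rr = s2.tr                 # save TR of the second stack
        while s2.tr > 0:
            s3.push(shift_1 + s2.pop())  # concatenate, then push
        s2.tr = s2.rr                 # second stack readable again
    s1.tr = s1.rr                     # restore the first stack
    return s3


a, b = Stack(["a", "b"]), Stack(["1", "2"])
c = cartesian(a, b)
```

The second stack's top register has to be restored after every inner pass, otherwise only the first element of the first stack would be paired.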
Selection. As mentioned in Chapter II, from a mathematical point of view, the selection is simply equivalent to generation of a subrelation of a relation. The sequence of steps is shown in Figure 4.47. A search through the contents of an associative stack will take place with respect to certain domain(s), then all the tagged rows (those which satisfy the condition) will be selected and inserted into a new associative stack. At the end of the operation the contents of the first stack will be unchanged because of the same reason as mentioned in the union operation.
Projection.

Recall from Chapter II, this operation generates a

new relation out of the fields of another relation.
the old relation will remain unchanged.

After operation

The steps of the operation

are shown in Figure 4.48.

In this operation, the mask register of

the

set according to the selected fields,

first

stack should be

therefore, after popping a row from the stack, there might be gaps
between the fields which should be removed.
be used in order to close the gaps.

The blank remover will

The output register of the first

stack will be gated into this circuit.

After elimination of gaps the

output of the blank remover circuit will be transferred to the comparand register of the second stack. Then a push on the second stack

[Figure 4.47. Selection (flowchart)]

[Figure 4.48. Projection (flowchart)]

inserts the content of the comparand register into the second stack. The sequence of steps will be performed for all the elements of the first stack.
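The projection steps can be sketched in software as follows. This is only an illustrative model, not the hardware design: the stack is assumed to be a Python list with the top at the end, the mask is assumed to be one boolean per field, and the names `remove_blanks` and `project` are hypothetical.

```python
def remove_blanks(row, mask):
    """Model of the blank remover: keep the masked fields and close the gaps."""
    return tuple(value for value, keep in zip(row, mask) if keep)


def project(first_stack, mask):
    """Project every row of first_stack onto the fields selected by mask.

    The first stack is left unchanged: only a copy of its top pointer moves,
    mirroring the save/restore of the top register via the reserve register.
    """
    reserve = len(first_stack)      # reserve register 1 <- top register 1
    top = reserve
    second_stack = []
    while top > 0:                  # repeat for all elements of the first stack
        top -= 1                    # POP 1 (moves the top pointer only)
        output = first_stack[top]   # output register of the first stack
        comparand = remove_blanks(output, mask)
        second_stack.append(comparand)  # PUSH 2
    top = reserve                   # top register 1 <- reserve register 1
    return second_stack


rows = [("smith", "nyc", 30), ("jones", "la", 41)]
projected = project(rows, (True, False, True))
```

Rows arrive on the second stack in pop order, so the order is reversed relative to the first stack; duplicate elimination is not modelled here.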
Join. From Chapter II, if two relations in the stacks have a domain in common, then they may be joined over that domain. The result of the join is a new relation, in which each row is formed by joining together two rows, one from each of the original relations. As can be seen in Figure 4.49, the sequence of steps involves two loops. At the end of the operation the contents of the first and second stacks will remain unchanged.
The operation is performed as follows: Each element of the first stack will be popped into the "shifter" and via the "shifter" into the comparand register of the second stack. Then the content of the comparand register of the second stack will be matched against the contents of the second stack. All the matched elements of the second stack will be popped into the "shifter," one element at a time, and then the whole content of the "shifter" will be pushed into the third stack.
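The two loops of the join can be illustrated with a small software model. It is a sketch under assumptions: stacks are Python lists with the top at the end, rows are tuples, the common domain is identified by a field position in each relation (`key1`, `key2`, both hypothetical parameters), and the shifter is modelled as simple tuple concatenation.

```python
def join(first_stack, second_stack, key1, key2):
    """Join two relations over a common domain using two nested loops.

    Neither input stack is modified, matching the statement that the first
    and second stacks remain unchanged at the end of the operation.
    """
    third_stack = []
    for row1 in reversed(first_stack):       # outer loop: pop each element of stack 1
        comparand = row1[key1]               # value matched against stack 2
        for row2 in reversed(second_stack):  # inner loop: visit the matched elements
            if row2[key2] == comparand:      # match on equality
                shifter = row1 + row2        # both rows side by side in the shifter
                third_stack.append(shifter)  # push the shifter onto stack 3
    return third_stack


employees = [("smith", "d1"), ("jones", "d2")]
departments = [("d1", "sales"), ("d2", "ops"), ("d1", "hq")]
joined = join(employees, departments, key1=1, key2=0)
```

In this model the common domain appears twice in each result row; a later projection could drop the duplicate field.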
Discussion
Chapter II addressed some proposed and implemented data base machines in detail. At the end of that chapter the advantages and disadvantages of the different classes (SIMD, MISD) were discussed. In general, the proposed hardware in the SIMD class is not cost effective and the only machine in the MISD class is so restricted.

[Figure 4.49. Join (flowchart)]

In Chapters III and IV the ASL language and the ASL machine were studied. In this section, based on our knowledge from the previous chapters, the ASLM will be evaluated and compared with the machines discussed in Chapter II.
In general ASLM is classified as a SIMD machine, because of the cells and their functions and the existence of the associative memory. Moreover, the concept of a backend machine and the independence of the different modules of ASLH increase its multiprocessing capabilities.
ASLM differs from all the discussed machines because of two major features:
Firstly, separation of the processing units from the secondary devices and incorporation of these processing units in the design of the data base machine. This provides a cost effective system compared to the system based on Slotnick's idea (Slotnick 1970), since:
1) There is no need for the number of processing units to equal the number of read/write heads, or in general there is no need for any specially designed secondary storage. Therefore, ASLM is capable of handling any kind of conventional secondary storage.
2) Independence of the cells will also eliminate the need for any interconnection network between the cells and as a result will reduce the cost.
3) Since the number of cells is not necessarily equal to the number of heads, by enabling and disabling the cells the system could be adjusted for handling data bases of different sizes more efficiently.
Secondly, the concept of presearching through the data file and selecting those tuples which satisfy the search criteria eliminates the need for a very large associative memory capable of storing the whole data file or part of it at a time. Generation of the subrelation which is the result of this presearch eliminates the need for any "control bits" associated with the tuples on the secondary storage. Moreover, the explicit subrelations enable us to perform interrelational operations such as "join" faster than systems such as RAP or CASSM.
Incorporation of the associative memory in the design of ASLM, in addition to the advantages which have been discussed in Chapter II, brings two important features to the design of ASLM, which are missing in a large number of the previously discussed systems:
1) The resemblance between associative operations and data base operations, especially for the relational data model (Chapter II), allows data base operations to be implemented efficiently in hardware.
2) Sort can be performed easily (Foster 1976).

The concept of cells creates a general resemblance between the ASLM
and RAP or CASSM, if the number of cells is equal to the number of
the tracks in a fixed head disk and each track is assigned to a cell.
Independence of each unit in the non-numeric processor results in a modular system which is fault tolerant to some degree. This means that all the malfunctioning cells or associative stacks can be disabled by the controller. Because of this modularity the system could be extended to a multiprogramming system. Designing a network such as the one in DIRECT (Dewitt 1979), capable of partitioning cells and associative stacks among different users (application programs), provides a multiprogrammable system, at the cost of an additional network and slower operation time for each individual program. This idea will change the class of ASLM to MIMD.

CHAPTER V
THE ASL PRIMITIVES
Introduction
The design of ASLM and ASLH has been discussed in the previous chapters. As we have noted in Chapter IV, ASLH is composed of four subsystems or modules which are under the control of the controller. Micro instructions (ASL primitives) which are stored in the memory of the controller are interpreted by the controller and then appropriate steps are taken by one of the ASLH modules. This chapter describes the ASL primitives, their formats and their functions. ASL primitives are composed of fixed length word(s) of 16 bits. Instructions might be of one, two or more words. The general format of the ASL primitives is depicted in Figure 5.1. It is composed of three separate fields: device code, operation code and operand(s).
DC (device code). Specifies one of the ASLH modules (index processor, secondary storage interface and non-numeric processor). It is a 2-bit field as follows:
00          index processor
01          secondary storage interface
10 and 11   non-numeric processor
According to this, all the ASL primitives can be grouped into three groups, where each group of instructions will be executed by one of the above modules.

| Device Code (2 bits) | Operation Code (4 bits) | Operands (10 bits) |

Figure 5.1. General format of the ASL primitives
Op-code. This defines the type of the operation. The DC and op-code define a specified operation which should be executed by one of the specified modules. Op-code is a 4-bit field. Therefore, the index processor and the secondary storage interface can each have a total of 16 instructions and the non-numeric processor can have a total of 32 instructions.
Operand. The operand part of each ASL primitive is a function of the DC and op-code. There exist ASL primitives of zero, one or more than one operands. Because of the use of the associative memory, there are no address generation schemes in our system. Therefore, there are no addressing schemes such as direct, indirect, index, etc. in the micro operations. The micro operations can be classified into two groups:
a) those which carry a number which specifies the number of words immediately following the instruction as operand(s).
b) those which refer to a hardware unit(s) of the non-numeric processor such as an associative stack for operands.
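The 16-bit layout described above (2-bit device code, 4-bit op-code, 10-bit operand field) can be illustrated by a small packing and unpacking sketch. Two points are assumptions made only for illustration: the device code is taken to occupy the most significant bits, and the numeric op-code values are hypothetical, since the text gives mnemonics only.

```python
DEVICE_NAMES = {
    0b00: "index processor",
    0b01: "secondary storage interface",
    0b10: "non-numeric processor",
    0b11: "non-numeric processor",
}


def encode(dc, opcode, operand):
    """Pack DC (2 bits), op-code (4 bits) and operand (10 bits) into one word."""
    if not (0 <= dc < 4 and 0 <= opcode < 16 and 0 <= operand < 1024):
        raise ValueError("field out of range")
    # Assumed bit order: DC in the top two bits, operand in the low ten bits.
    return (dc << 14) | (opcode << 10) | operand


def decode(word):
    """Split a 16-bit word back into (dc, opcode, operand)."""
    return (word >> 14) & 0b11, (word >> 10) & 0b1111, word & 0x3FF


word = encode(0b01, 0b0011, 5)   # op-code value 0011 is a made-up example
fields = decode(word)
```

Because DC values 10 and 11 both select the non-numeric processor, that module effectively has a 5-bit op-code space, which is where the total of 32 instructions comes from.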
Micro Instructions
In this section the ASL micro instructions will be discussed in detail. This section is divided into three subsections. Each subsection will address the set of micro instructions in one of the ASLH modules.
Index Processor
Recall that the index processor holds a unique entry for each relation. The information in each entry will be used i) by the compiler for generation of the symbol table, ii) by the secondary storage interface to access the secondary storage. Entries are inserted into or deleted from the index processor; therefore, the DELETE and INSERT operations are performed by this module.

Set MasK Register. This instruction sets a sequence of ones in the mask register. Each instruction might be followed by zero or more words. Each word specifies a field in the mask register which should be set to one. The instruction is depicted in Figure 5.2.
DC:       00 specifies the index processor
Op-code:  SMKR
Operand:  #FIELDS defines the number of words which follow the instruction as operand(s).
In case the mask register should be set to all ones, the number of fields would be equal to zero, and this instruction will not be followed by any word as operand. Figure 5.3 shows the format of the words which might follow this instruction as operands. Each field in the mask register is defined by two values: the address, a field of 8-bit length which is a displacement from the leftmost bit of the mask register, and the length of the field (8 bits). Since in normal applications users are concerned with exact match rather than partial match, this instruction is not followed by any word(s) as operand. This feature might be used for further development, where partial match is permissible. Therefore, the number of fields (#FIELDS) is equal to zero.
It should be mentioned that the DC part of all the micro instructions executed by the index processor is equal to (00), therefore,

| 00 | SMKR | #FIELDS (10 bits) |

Figure 5.2. Format of the set mask register

| Beginning Address (8 bits) | Length (8 bits) |

Figure 5.3. Format of the operands of the set mask register

this field will not be repeated in our discussion in the rest of this subsection.
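The effect of SMKR on the mask register can be sketched as below. The register width and the representation of the mask as a list of bits (index 0 being the leftmost bit) are assumptions for illustration, and `set_mask_register` is a hypothetical name.

```python
def set_mask_register(width, fields):
    """Return a mask of `width` bits built from (address, length) operand words.

    With no operand words (#FIELDS = 0) the mask is set to all ones, which is
    the exact-match case described in the text.
    """
    if not fields:
        return [1] * width
    mask = [0] * width
    for address, length in fields:
        if address < 0 or length < 0 or address + length > width:
            raise ValueError("field does not fit in the mask register")
        for bit in range(address, address + length):
            mask[bit] = 1   # address is a displacement from the leftmost bit
    return mask


exact = set_mask_register(8, [])
partial = set_mask_register(8, [(1, 2), (5, 3)])
```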
Set ComParand Register. Sets the content of the comparand register according to the value which is the relation name specified in the words following the instruction. The relation name is always left adjusted in the comparand register. Figure 5.4 shows the format of the instruction.
Op-code:  SCPR
Operand:  SIZE, defines the number of words which follow the instruction and contain the relation name. In case of the index processor, this value is fixed and is a function of the implementation.
MaTCH. Checks the content of the comparand register (with respect to the fields specified by the mask register) against the stored information in the associative memory. The content of the comparand register will be checked against all the valid words in the memory. Figure 5.5 shows the format of the instruction.
Op-code:  MTCH
Operand:  Variant, a 3-bit field which defines what type of comparison should take place. Table 5.1 defines the variant. In case of the index processor, only the variants (000) or (011) are used. The comparand and mask registers are extended over to the validity bit, which means that the comparand register will be

| 00 | SCPR | SIZE (10 bits) |

Figure 5.4. Format of the set comparand register

| 00 | MTCH | VARIANT (3 bits) | UNUSED |

Figure 5.5. Format of the match

TABLE 5.1.
THE VARIANT OF THE MTCH INSTRUCTION OF THE INDEX PROCESSOR

CODE   DESCRIPTION
000    Match on equality
001    Match on greater than
010    Match on less than
011    Match on not equality
100    Match on greater than or equal
101    Match on less than or equal
checked against all the valid words in the associative
memory.
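A software analogue of the MTCH variants of Table 5.1 is sketched below. It is illustrative only: words are modelled as integers, the mask as an integer bit mask, each memory entry as a (validity bit, word) pair, and the comparison direction (stored word compared with the comparand) is an assumption not stated in the text.

```python
import operator

# Variant codes from Table 5.1 mapped onto Python comparison functions.
VARIANTS = {
    0b000: operator.eq,
    0b001: operator.gt,
    0b010: operator.lt,
    0b011: operator.ne,
    0b100: operator.ge,
    0b101: operator.le,
}


def match(memory, comparand, mask, variant):
    """Return one tag per memory word; invalid words never match."""
    compare = VARIANTS[variant]
    tags = []
    for valid, word in memory:
        # Only the bit positions selected by the mask take part in the comparison.
        tags.append(bool(valid) and compare(word & mask, comparand & mask))
    return tags


memory = [(1, 0b1010), (1, 0b0110), (0, 0b1010)]
equal_tags = match(memory, 0b1011, 0b1110, 0b000)
not_equal_tags = match(memory, 0b1011, 0b1110, 0b011)
```

The third word holds the same value as the first but its validity bit is 0, so it is tagged by neither variant.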

COPY. Copy transfers a value to or out of the associative stack. In the first case the whole contents of the comparand register (Figure 4.14) will be copied into the associative memory and in the second case the whole contents of a selected word will be copied from the associative memory. Figure 5.6 shows the format of the instruction.
Op-code:  COPY
Operand:  Tag, a 1-bit register which specifies the type of the operation. "1" means the contents of the comparand register should be copied into the associative memory and "0" means the contents of a selected row should be copied into the output register.
FIND. Locates the selected row which is the result of the previous operation. This instruction is naturally preceded by a MTCH instruction. Figure 5.7 shows the format of the instruction. It should be mentioned that the instruction does not carry any information as operand.
Op-code:  FIND

SET Register. Sets the content of a specified register to all "0" or all "1." The format of the instruction is depicted in Figure 5.8.
Op-code:  SETR
Operand:  is composed of two parts:

| 00 | COPY | UNUSED | TAG (1 bit) |

Figure 5.6. Format of the copy

| 00 | FIND | UNUSED |

Figure 5.7. Format of the find

VARIANT:  A 2-bit field as follows:
          00 comparand register
          01 mask register
          10 all the validity bits of the associative words
          11 error bit
TAG:      a 1-bit register which is set to 0 or 1, corresponding to the operations of setting the register to all 0's or all 1's respectively.

DELeTe. Erases the entry corresponding to a relation name from the associative memory and the random access memory of the index processor. The instruction preceding this instruction is a MTCH instruction which marks a row corresponding to the relation name. This instruction is a sequence of steps as follows:
i) the memory address register gets the beginning address of the descriptor in the random access memory of the index processor. This address will be kept as a field along with the relation name in the associative memory.
ii) the first bit of the descriptor will be set to 1.
iii) the validity bit associated with the word corresponding to the relation name is set to 0.
Figure 5.9 shows this instruction.
Op-code:  DELT
Operand:  There is no operand in this instruction.

INSerT. Creates a new entry corresponding to a relation name in the associative memory and the random access memory of the index

| 00 | SETR | VARIANT (2 bits) | UNUSED | TAG (1 bit) |

Figure 5.8. Format of the set register

| 00 | DELT | UNUSED |

Figure 5.9. Format of the delete

processor. This instruction, like the DELT instruction, is preceded by a MTCH instruction. The instruction consists of the following sequence of steps:
i) insertion of the contents of the comparand register into the selected word of the associative memory.
ii) setting the validity bit of the selected row in the associative memory.
iii) the memory address register gets the content of the address register of the index processor.
iv) the address register of the index processor is incremented by the size of the descriptor of the index processor (SIZE1).
v) the address register of the secondary storage interface will be incremented by the size of the descriptor of the SSI (SIZE2).
Figure 5.10 shows the format of the instruction:
Op-code:  INST
Operand:  is composed of two fields:
SIZE1:    defines the size of the descriptor stored in the index processor. In fact, this field specifies how many words follow this instruction as operand. These words hold the content of the descriptor, which should be moved to the MBR register of the random access memory of the index processor.
SIZE2:    defines the size of the descriptor stored in the RAM memory of the secondary storage interface.
As can be seen from Figure 5.10, 6 bits have been assigned to SIZE1, which implies a descriptor of at most 64 words.

TeST Bit. Is a conditional branch instruction which tests the content of the match bit. This instruction is preceded by a MTCH instruction. If a match is successful, the control transfers (relative to the next in-line instruction) by the number of words specified in the instruction. Otherwise, the next in-line instruction would be executed. Figure 5.11 shows the format of the instruction.
Op-code:  TSBT
Operand:  DISPLACEMENT, for a successful match specifies by how many words the control should be transferred.
As can be seen from Figure 5.11, a displacement of 10 bits allows a control transfer of 1K words.
BRanch Relative Forward. Is an unconditional branch which transfers the control (relative to the next in-line instruction) by the number of words specified in the instruction. Figure 5.12 shows the format of the instruction.
Op-code:  BRRF
Operand:  DISPLACEMENT, a positive number which specifies how many words (at most 1K) the control is transferred forward.
BRanch Relative Backward. Is the same as branch relative forward, except it transfers the control backward with respect to the next in-line instruction. Figure 5.13 shows the format of the instruction.
Op-code:  BRRB
Operand:  DISPLACEMENT
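The three control-transfer primitives of the index processor can be summarized by a small next-address function. This is a sketch under assumptions: each of these instructions is taken to occupy one word, addresses are word indices, and TSBT is assumed to transfer forward on a successful match (the text does not state the direction). The function name `next_address` is hypothetical.

```python
def next_address(pc, mnemonic, displacement, match_bit=False):
    """Return the address of the instruction executed after the one at `pc`."""
    if not 0 <= displacement < 1024:      # the displacement field is 10 bits
        raise ValueError("displacement must fit in 10 bits")
    next_inline = pc + 1                  # transfers are relative to this address
    if mnemonic == "BRRF":
        return next_inline + displacement
    if mnemonic == "BRRB":
        return next_inline - displacement
    if mnemonic == "TSBT":
        # Conditional: branch only when the preceding MTCH set the match bit.
        return next_inline + displacement if match_bit else next_inline
    raise ValueError("not a branch primitive: " + mnemonic)


forward = next_address(10, "BRRF", 5)
backward = next_address(10, "BRRB", 5)
taken = next_address(10, "TSBT", 5, match_bit=True)
not_taken = next_address(10, "TSBT", 5, match_bit=False)
```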

| 00 | INST | SIZE 1 (6 bits) | SIZE 2 (4 bits) |

Figure 5.10. Format of the insert

| 00 | TSBT | DISPLACEMENT (10 bits) |

Figure 5.11. Format of the test bit

| 00 | BRRF | DISPLACEMENT (10 bits) |

Figure 5.12. Format of the branch relative forward

| 00 | BRRB | DISPLACEMENT (10 bits) |

Figure 5.13. Format of the branch relative backward

Secondary Storage Interface
Recall that the secondary storage interface holds an entry for each relation in order to enable ASL to have access to the secondary storage. Therefore, DELETE and READ are the operations which will be performed by accessing the content of the descriptor in the RAM memory of the secondary storage interface. The beginning address of this descriptor is kept in the index processor.
It should be mentioned that the device code for all the micro operations of the secondary storage interface is "01."
DELeTe. Eliminates a descriptor from the RAM memory of the secondary storage interface. This instruction will be preceded by a delete operation in the index processor. The operation will set the delete bit of the descriptor to 1. The format of the instruction is depicted in Figure 5.14.
DC:       01
Op-code:  DELT
Operand:  the beginning address of the descriptor in the RAM memory.
READ. Transfers the content of the descriptor for the relation stored in the secondary device to the channel processor, then initiates the reading of the data from the secondary storage. Remember that the content of the descriptor provides enough information for the channel processor to have access to the data file stored on the secondary storage. Figure 5.15 shows the format of the instruction:

| 01 | DELT | ADDRESS (10 bits) |

Figure 5.14. Format of the delete

| 01 | READ | ADDRESS (7 bits) | SIZE 2 (3 bits) |

Figure 5.15. Format of the read

DC:       01
Op-code:  READ
Operand:  Is composed of two parts:
ADDRESS:  Which specifies the beginning address of the descriptor in the RAM memory of the SSI.
SIZE:     Which specifies the size of the descriptor in the RAM memory of the SSI.
As can be seen, 3 bits are assigned to specify the size of the descriptor (at most 8 words). If 8 words are not enough to hold the descriptor, the instruction could be extended to two words, where the second word specifies the size of the descriptor.
Non-Numeric Processor
Recall that the cells select the valid tuples (with respect to the search criteria) and route them to the associative stack module. From the ASL productions (Table 3.9) we know that the search conditions may be a membership with respect to a subrelation stored in an associative stack from a previous computation or a sequence of disjunctions of conjunctions of elementary search conditions. In the first case the cells act as intermediate buffers which transfer tuples or parts of tuples to the associative stack module, and the validity of the tuples will be checked in the associative stack module rather than in the cells. In the second case the test logic circuits of the cells act as selectors which select the valid tuples. A bit in the cell specifies which of the above two types of operations is to be executed.
In the rest of this section we describe the micro instructions of the cells and the associative stack module.
All the tuples or subtuples transferred from the cells to the associative stack module will be deposited in the associative stacks, where all the non-numeric operations as well as numeric operations will be executed on them. At the end of the operations, the result will be sent to the user (via the general purpose computer) or the backend storage.
The set of the primitives should cover all the operations specified in Chapter IV. Since the controller is programmable, programmers can define operations which have not been included in the initial definition of ASL. This increases the flexibility and power of the system. Appendix II shows the microprograms of all the operations discussed in the previous chapter.
In the design of the associative stack module, 16 associative stacks have been used in the system; therefore, a field of 4 bits would be enough in order to address a stack.
Set ConTent of Register. Sets the content of a specified register (see Table 5.2) by the contents of the words specified in the operand field which follow this instruction. The format of the instruction is depicted in Figure 5.16.
DC:       10
Op-code:  SCTR
Operand:  is composed of two parts:
VARIANT:  A 4-bit field which refers to a specific

| 10 | SCTR | #WORDS/VALUE (6 bits) | VARIANT (4 bits) |

Figure 5.16. Format of the set content of register
register in the cells. Table 5.2 defines the variant.
#WORDS/VALUE:  Carries two different concepts based on the content of the variant. For 0000 ≤ variant ≤ 0111 this field gives the number of words which follow this instruction and form the operand. The contents of these words are transferred to the register specified in the instruction. For 1000 ≤ variant ≤ 1111, this field contains a value which is transferred to the register.
It should be mentioned that in case the search set (HOW) is empty, for variant = 0001 the content of #WORDS/VALUE will be equal to zero, meaning that the circular shift register of the domain recognizer associated with the test logic circuit should be set to all 1's. In this case the content of the R1 registers (variant = 0101) will be set to "don't care" characters and the content of the comparison circuit of the units in the test logic circuit (variant = 0100) will be set appropriately. In case the output set (WHAT) is empty, the content of #WORDS/VALUE for variant = 0000 would be set to zero, indicating that the circular shift register of the domain recognizer associated with the input register should be set to all 1's.
MOVE. This instruction transfers the content of one register to another register. One register would be the sink and the other would be the source. At the end of the operation the content of the sink will

TABLE 5.2.
THE VARIANT OF THE SCTR

CODE   DESCRIPTION
0000   Circular shift register of the domain recognizer associated to the input register
0001   Circular shift register of the domain recognizer associated to the test logic circuit
0010   Circular shift register of the input register
0011   Circular shift register of the reformat circuit
0100   Comparison circuit of the units in the test logic circuit
0101   R1 register of the units in the test logic circuit
1000   The bit register which defines the type of the operation performed by the cell
1001   The register of the shifter register in the domain recognizer associated to the input register
1010   The register of the shifter register in the domain recognizer associated to the test circuit
1011   The register of the shifter register in the input register
1100   The register of the counter of the domain recognizer
be lost and replaced by the content of the source. There are two types
of instructions in this group.
MoVe Register to Register. Transfers:
i) the content of the output register of an associative stack to a specified register
ii) the content of a specified register to the comparand register of an associative stack.
Figure 5.17 shows the format of the instruction.
DC:       11
Op-code:  MVRR
Operand:  is composed of three parts:
STACK#:   Specifies which stack is involved in the operation.
TAG:      Shows one of the operations. "0" means the first operation (i) and "1" means the second operation (ii) as stated above.
VARIANT:  a 3-bit field which defines a pair of registers. Table 5.3 describes the variant.

MoVe Output register to Comparand register. Transfers data between two different stacks. The content of the output register (source) of a specified stack will be transferred to the comparand register (sink) of another stack. The instruction format is as shown in Figure 5.18.

| 11 | MVRR | UNUSED | VARIANT (3 bits) | TAG (1 bit) | STACK# |

Figure 5.17. Format of the move register to register

| 11 | MVOC | UNUSED | STACK #2 (4 bits) | STACK #1 (4 bits) |

Figure 5.18. Format of the move output register to comparand register

TABLE 5.3.
THE VARIANTS OF THE MVRR

CODE   DESCRIPTION
000    Top register, reserve register
001    Pointer register, top register
010    Output register, shift register; in this case tag specifies a half of the shift register
011    Output register, shift register (tag is unused)
100    Shift register, comparand register (tag is unused)
101    Output register, remove blanks
110    Remove blanks, comparand register
111    Output register, reformat circuit
DC:       11
Op-code:  MVOC
Operand:  Composed of two parts:
STACK1:   Defines the source (output register)
STACK2:   Defines the sink (comparand register)

Set MasK Register. Sets a sequence of 1's at specified location(s) of the mask register of a stack. Each location is determined by its beginning address and its length. The instruction is followed by a sequence of words, each defining a field in the mask register. Figure 5.19 shows the format of the instruction and Figure 5.20 shows the format of the words which follow the instruction as operand.
DC:       11
Op-code:  SMKR
Operand:  Composed of two parts:
STACK#:   Specifies the mask register of a specific stack
#WORDS:   Defines how many words will follow the instruction as operands. As can be seen, at most 64 words can follow the instruction as operands.
Gate SubTuPle. Gates the subtuple or tuple from a cell to a specific stack or to the general purpose computer. The format of the instruction is depicted in Figure 5.21.
DC:       11
Op-code:  GSTP
Operand:  Composed of two parts:

| 11 | SMKR | #WORDS (6 bits) | STACK# (4 bits) |

Figure 5.19. Format of the set mask register

| BEGINNING ADDRESS (8 bits) | LENGTH (8 bits) |

Figure 5.20. Format of the operand of the SMKR
STACK#:   Specifies the comparand register of a stack which should be set by the data coming from the cell.
TAG:      Defines the type of the operation. "0" means the data should be transferred to the GPC; in this case stack# is unused. "1" means the data should be transferred to the associative stack module.
COPY. This instruction transfers a value to or out of a specific associative stack. In the first case the content of the comparand register will be pushed into the stack and in the second case the stack will be popped and the result goes to the output register. In both cases, operations could be performed with respect to or without the content of the mask register. The format of the instruction is depicted in Figure 5.22.
DC:       11
Op-code:  COPY
Operand:  Composed of two parts:
STACK#:   Specifies one of the associative stacks.
CODE:     A 2-bit field. The first bit shows the direction of the operation (e.g., "0" means push operation and "1" means pop operation). The second bit defines whether the operation should be performed with respect to the mask register (0) or not (1).
MaTCH. Compares the comparand register against the stored data in a specified associative stack with respect to the fields given in the mask register. Figure 5.23 shows the format of the instruction.

| 11 | GSTP | UNUSED | TAG (1 bit) | STACK# (4 bits) |

Figure 5.21. Format of the gate subtuple

| 11 | COPY | UNUSED | CODE (2 bits) | STACK# (4 bits) |

Figure 5.22. Format of copy

DC:       11
Op-code:  MTCH
Operand:  Composed of two parts:
STACK#:   Specifies an associative stack.
VARIANT:  A 3-bit field which specifies the type of operation. Table 5.1 defines the variant.
Test BiT register. Tests any of the test bits assigned to each associative stack. If a test is successful, the control transfers (relative to the next in-line instruction) by the number of words which is specified in the word following the instruction, otherwise the next instruction will be executed. In fact, this instruction is similar to a conditional jump instruction. Figure 5.24 shows the format of the instruction.
DC:       11
Op-code:  TSBT
Operand:  Composed of two fields:
STACK#:   Specifies one of the associative stacks.
VARIANT:  A 3-bit field which specifies a register. Table 5.4 defines the variant.
SeT BiT. Sets a specified register to one or zero. The format of the instruction is depicted in Figure 5.25.
DC:       11
Op-code:  STBT
Operand:  Composed of three parts:

| 11 | MTCH | UNUSED | VARIANT (3 bits) | STACK# (4 bits) |

Figure 5.23. Format of the match

| 11 | TBTR | UNUSED | VARIANT (3 bits) | STACK# (4 bits) |

Figure 5.24. Format of the test bit register

TABLE 5.4.
THE VARIANT OF THE TSBT

CODE   DESCRIPTION
000    Stack full
001    Match bit
010    Stack empty
011    Inclusion bit
100    Equality bit
STACK#:   Specifies an associative stack.
VARIANT:  A 3-bit field as defined in Table 5.5.
TAG:      A 1-bit field which defines the content of the register.
Using this instruction before a MTCH operation enables us to perform an "AND" or "OR" operation.

Remove Blank/ReFormat. As mentioned in Chapter IV, two pieces of hardware are attached to the associative stack module. One is used to eliminate blank symbols in the fields of a record in some operations such as "projection." The other one is used to change the fixed format tuples of a relation which is stored in an associative stack to a format like that used for the secondary storage. The output register of an associative stack is the input to these devices. The format of the instruction is depicted in Figure 5.26.
DC:       11
Op-code:  RBRF
Operand:  Is composed of one part.
TAG:      Defines one of the hardware devices. "0" stands for "remove blank" and "1" stands for "reformat."

BRanch Relative Forward. Is an unconditional branch, which transfers the control (relative to the next in-line instruction) to another part of the program. The instruction is similar to the instruction which has been discussed for the index processor, except that the device code differs. Figure 5.27 depicts the format of the instruction.

| 11 | STBT | UNUSED | VARIANT (3 bits) | TAG (1 bit) | STACK# (4 bits) |

Figure 5.25. Format of the set bit

| 11 | RBRF | UNUSED | TAG (1 bit) |

Figure 5.26. Format of the remove blank/reformat

TABLE 5.5.
THE VARIANT OF THE STBT

CODE   DESCRIPTION
000    Top register
001    Additional bit of comparand register
010    Additional bit of mask register
011    Inclusion bit
100    Equality bit
101    Validity bit of words
Op-code:  BRRF
Operand:  Displacement, a positive number which specifies how many words the control should transfer forward.
BRanch Relative Backward. Is the same as the previous instruction except the control transfers backward with respect to the next in-line instruction. The format of the instruction is depicted in Figure 5.28.
DC:       11
Op-code:  BRRB
Operand:  Displacement

INcrement/DeCrement. Increments or decrements the content of the top register by 1. Recall from Chapter IV that to each associative stack an incrementer/decrementer circuit is attached which can change the content of the top register by one. Therefore, the input and output for the incrementer/decrementer is the top register. Figure 5.29 shows the format of the instruction.
DC:       11
Op-code:  INDC
Operand:  Is composed of two parts:
STACK#:   Specifies a top register.
TAG:      Defines the type of operation (e.g., "0" stands for increment and "1" stands for decrement).
RoTATe. Rotates the content of the shift register of the associative stack module to the right or left by the specified value in

270

11

BRRF

DISPLACEMENT

1 0 bits

Format of the branch relative forward

Figure 5·27

11

DISPLACEMENT

BRRB

-

Figure 5·28

~

1 0 bits

Format of the branch relative backward

--

271
the
11

instruction.

join 11 operation.

This

instruction will

be

used for implementing

The format of the instruction is depicted in Figure

5.30.
DC:

11

Op-code:

RTAT

Operand:

Is composed of two parts:

TAG:

Specifies

that

the

contents

of

shift

register

should be rotated to the right of left, where
stands for rotate right and

11

11

0 11

1 11 stands for rotate

1 eft.

#BITS:

Defines how many bits the shift register should be
rotated.

WRiTE. This instruction transfers a tuple from the associative stack module to the general purpose computer or to the backend storage. In the case of the general purpose computer the content of an output register will be transferred, and in the case of backend storage the tuple will be transferred from the reformat circuit. Figure 5.31 shows the format of the instruction.

DC: 11

Op-code: WRTE

Operand: Is composed of two parts:

STACK#: Defines an output register.

TAG: Specifies the general purpose computer (e.g., "0") or backend storage (e.g., "1"). In the case of backend storage the content of STACK# is irrelevant.
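The instruction formats described above can be illustrated by packing the fields into a 16-bit word. The following is only a sketch under several assumptions that the text does not confirm: the op-code is taken to be 4 bits wide (16 bits minus the 2-bit device code and the 10-bit displacement), the fields are packed left to right in the order listed, unused bits are zero, and the numeric op-code values in `HYPOTHETICAL_OPCODES` are invented purely for illustration.

```python
WORD_BITS = 16     # instruction word length used by the ASLM
DC_BITS = 2        # device code "11" occupies two bits
OPCODE_BITS = 4    # ASSUMPTION: 16 - 2 (DC) - 10 (displacement) = 4

# HYPOTHETICAL op-code values; the real encodings are not given in this section.
HYPOTHETICAL_OPCODES = {"BRRF": 0b0001, "RTAT": 0b0010, "WRTE": 0b0011}


def pack(dc, opcode, fields):
    """Pack DC, op-code and (value, width) fields MSB-first; unused low bits stay 0."""
    parts = [(dc, DC_BITS), (opcode, OPCODE_BITS)] + list(fields)
    used = sum(width for _, width in parts)
    if used > WORD_BITS:
        raise ValueError("fields exceed the 16-bit word")
    word = 0
    for value, width in parts:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word = (word << width) | value
    return word << (WORD_BITS - used)


def brrf(displacement):
    """Branch relative forward: 10-bit positive displacement."""
    return pack(0b11, HYPOTHETICAL_OPCODES["BRRF"], [(displacement, 10)])


def rtat(rotate_left, nbits):
    """Rotate: TAG (0 = right, 1 = left) followed by a 9-bit #BITS field."""
    return pack(0b11, HYPOTHETICAL_OPCODES["RTAT"], [(int(rotate_left), 1), (nbits, 9)])


def wrte(to_backend, stack):
    """Write: TAG (0 = general purpose computer, 1 = backend storage), 4-bit STACK#."""
    return pack(0b11, HYPOTHETICAL_OPCODES["WRTE"], [(int(to_backend), 1), (stack, 4)])


print(f"{brrf(5):016b}")
print(f"{rtat(True, 72):016b}")
print(f"{wrte(False, 3):016b}")
```

The 9-bit #BITS field is consistent with the limit of 512 bits per rotation mentioned in Chapter VI, which is one reason for reading the field widths this way.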

Figure 5.29. Format of the increment/decrement top register instruction. Fields: DC (11), op-code (INDC), TAG (1 bit), STACK# (4 bits), UNUSED.

Figure 5.30. Format of the rotate instruction. Fields: DC (11), op-code (RTAT), TAG (1 bit), #BITS (9 bits).

Figure 5.31. Format of the write instruction. Fields: DC (11), op-code (WRTE), TAG (1 bit), STACK# (4 bits), UNUSED.
Conclusion
In this chapter the set of the micro instructions of the ASLH has been presented. Appendix II shows the micro programs of all the operations described in the previous chapter. As can be seen, the formats of the instructions are more or less the same.
The set of the ASL micro operations and the implementation of the ASLM have a direct effect on the ASL programs and their execution time. Firstly, in conventional systems a search through a data file is performed by a search through the records one by one, which means an iteration of a sequence of operations; in contrast, in the associative memory a search through a data file is performed simultaneously through all the records. Since ASLM is implemented by the associative memory, this means a simplification in the ASL programs and efficiency in their execution time. Secondly, the set of the ASL primitives shows that all the operands in the ASL primitives are immediate; therefore, there is no need for addressing modes such as direct, indirect, index, etc. The immediate addressing mode, which is the only mode used in ASLM, reduces the execution time. Thirdly, as can be seen from the ASL primitives, the ASL programs are strictly sequential, except for branch forward or backward. Sequential operations can be handled faster than those in a program which has nested loops or frequent jump operations, since the address of the next instruction is always available in the program counter. These features not only reduce the execution time, but they also simplify the hardware organization of the controller.
Chapter III addresses a sequence of typical query examples.

For example, consider the following:
Example 7: Get a list of all the employees with the amount of hours they work for each department.
In ASL the above query is translated into:
1) E;
2) ED;
3) X = [E] ENAME, ENO;
4) Y = [ED] ENO, HOUR;
5) W = (X) ENO = ENO (Y);
As mentioned before (Chapter III), no code is generated for the first two lines. A close look at Appendix II shows that the third, fourth and fifth lines are translated into 10, 10 and 25 primitive operations, respectively. Therefore, the above program is translated into at most 60 words in the memory of the controller. This example shows two points:
1) The effect of the above features on the program size and therefore the execution time; and,
2) For the above example a controller with 1K-bit size is adequate.
In real life situations, where we deal with more complex relations, the above figure is more or less valid because of the nature of the queries. In the case of a very complicated query it seems a PROM memory of 16K-word is adequate for the controller.
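The size estimate for Example 7 can be checked with a few lines of arithmetic. The sketch below assumes the 16-bit instruction word stated in Chapter VI and treats the 60-word figure as an upper bound that leaves room for multi-word instructions; the variable names are illustrative only.

```python
# Primitive operations generated per ASL line of Example 7 (from the text).
# Lines 1 and 2 generate no code.
ops_per_line = {3: 10, 4: 10, 5: 25}

WORD_BITS = 16           # instruction word length (assumption taken from Chapter VI)
WORD_BOUND = 60          # upper bound on program size quoted in the text
CONTROLLER_BITS = 1024   # a 1K-bit controller memory

total_ops = sum(ops_per_line.values())
bound_bits = WORD_BOUND * WORD_BITS

print(total_ops)                      # number of primitive operations
print(bound_bits)                     # bits needed for 60 words
print(bound_bits <= CONTROLLER_BITS)  # whether the bound fits in 1K bits
```

With 45 primitive operations and a 60-word bound, the program occupies at most 960 bits, which is why a 1K-bit controller is sufficient for this query.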

CHAPTER VI
ASL TIMING SEQUENCE
Introduction
Performance evaluation of a computer is an art rather than a science in its current state. It seems the best way to evaluate a system is to build the system, run it on data, and then analyze the results. As systems get more complex and expensive, this procedure seems less and less valid.
Computer systems have evolved into a wide variety of complex structures, and are made of several components. As computer systems get more complex, so do the methods for their evaluation. The evaluation of a new computer design involves many parameters which should be taken into account. For example, in a computer system the central processor is one of the important components which should be evaluated, but the evaluation of this unit is not equivalent to the evaluation of the system. In other words, in the evaluation of a machine there is not a single parameter which can give an accurate evaluation of the machine, even if we are using that system for a single application. Even in the evaluation of the central processor there is not a single parameter capable of defining the behavior of the central processor. For example, the cycle time of the main memory has been used to indicate the power of a machine, but if we consider just this parameter, then the PDP 11/05 with a memory cycle time of 0.9 microseconds would be more powerful than the IBM 370/155 with a memory cycle time of 2.1 microseconds, since in this comparison an important performance parameter such as memory interleaving was not considered.
Another parameter which might be important in the evaluation of the computer system is the number of instructions per second executed by the processor (MIPS, millions of instructions per second). This parameter is meaningful in SISD (Single Instruction Stream Single Data Stream) systems, but it is a misleading parameter in a vector machine or in general in SIMD machines, where one instruction execution needs many operands. An alternative parameter to MIPS is MOPS (millions of operands per second), but in this parameter notions such as word length and size of the main memory are missing. We can also talk about some other parameters such as i) secondary storage access time, that is, the time between a request for a record from the central processor until it receives an interrupt saying that the record has been transferred into the primary memory, or ii) throughput, which is used to measure how well the capability of the system is being used.
Measurements of the actual performance of computer systems show marked discrepancies with respect to what can be projected based on the performance of individual components or what can be predicted based on analytical or simulation models. For example, the IBM 360/91 is capable of initiating a new instruction every 60 nanoseconds and hence has a potential of $16.7 \times 10^{6}$ instructions per second. However, measurement of the IBM 360/91 indicates it rarely exceeds 3 or 4 MIPS. But it is useful to compare the result of an analytical or simulation model with the performance of the real system in order to find out more tools for performance evaluation of the systems.
The above brief introduction on performance evaluation serves to point out some of the most difficult problems that one has to confront in dealing with the evaluation of any new system. There is a large body of literature (Stone 1975) that treats this challenging field of computer science in great detail. In this chapter we will present some estimates of the performance of the backend data base machine ASLM based on some simplifying assumptions as follows:
1) We are not considering the effect of the front-end (host) machine. For our backend data base machine, the evaluation of the system is equivalent to the effect of the software programs, general purpose computer and the backend processor (ASLH). Since the effect of the software systems and general purpose computer for compiling the query program and all bookkeeping operations for the query are common to any system, in our evaluation we are concerned with the performance evaluation of the backend machine.
2) For query systems the response time, which is the time interval beginning with a request for service until the request is completed, is a major parameter in the performance evaluation of the systems. ASLM is a query system and therefore, we believe the response time is a reasonable parameter for performance evaluation of the ASLM. Thus, we restrict ourselves to the evaluation of different ASL operations.
3) Since ASLM is a data base machine we consider the evaluation of the data base and non-numeric operations proposed earlier, leaving the arithmetic and other conventional control operations aside.
4) In this evaluation we are going to ignore the control overhead of the backend processor. Therefore, we evaluate the macro functions which have been discussed in Chapter IV.
5) In our system, as mentioned before (Chapter V), instructions are word(s) of 16 bits. Furthermore, the data bus which connects the different units of the non-numeric processor has a width of 1024 bits.
We will adopt the following general procedure for evaluation of the ASL macros: i) writing up the micro program of the ASL function (Appendix II), and ii) adding up the execution times of the ASL micro operations along with the time for iterations, if any, of each individual micro operation. In order to be able to calculate the execution time of each micro operation we estimated the execution times based on currently available machine times. We chose three different computer systems: VAX 11/780, which is a fast minicomputer based on the current technology; UNIVAC, which is a large scale machine; and STARAN, which is a general purpose machine implemented by associative memory. Appendix III shows the reported timing of micro operations of the above mentioned systems. In general, the execution time of an instruction is the sum of the fetch time, the operands access time, the execution time and the time to store the result to the destination. For unary operations the execution time of the instruction is the same as above except that one operand will be fetched for each instruction. Notice that branch instructions can be treated as unary instructions where no operand is going to be fetched. In our system, since there is no addressing mode and all the operands are immediate, there is no time involved in the operands access time; this results in a simple calculation for instruction execution time.
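The composition of instruction time described above can be sketched as a small function. This is an illustration only: the component times below are hypothetical placeholders, not values from Appendix III, and charging the access time once per operand is an assumption made here to distinguish binary, unary and branch instructions.

```python
def instruction_time(fetch, operand_access, execute, store, operands):
    """Total time (usec) = fetch + per-operand access + execute + store."""
    if operands < 0:
        raise ValueError("operand count cannot be negative")
    return fetch + operands * operand_access + execute + store


# HYPOTHETICAL component times, chosen only to show the effect of each term.
FETCH, ACCESS, EXECUTE, STORE = 0.10, 0.20, 0.30, 0.10

binary = instruction_time(FETCH, ACCESS, EXECUTE, STORE, operands=2)
unary = instruction_time(FETCH, ACCESS, EXECUTE, STORE, operands=1)
branch = instruction_time(FETCH, ACCESS, EXECUTE, STORE, operands=0)

# ASLM case: all operands are immediate, so the access term vanishes.
aslm = instruction_time(FETCH, 0.0, EXECUTE, STORE, operands=2)

print(binary, unary, branch, aslm)
```

Under these placeholder values an ASLM instruction with immediate operands costs the same as a branch, which is the simplification the text relies on.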

Table 6.1 shows the timing of the ASL micro operations based on Appendix III. For micro instructions which have been found in more than one machine we computed the average. For others (for example, the associative operations found in STARAN) we entered the reported execution times. The set of micro operations defined in the previous chapter is the lowest level of operation in the ASLH, where each ASL function or macro is a combination of these micro operations.
In the following, the execution time of the ASL macros will be derived. These expressions could then be used to calculate the execution time of the ASL functions.
Computation of the ASL Macros
In this section the execution time of each ASL macro will be calculated based on Tables 6.1(a, b and c). For each macro first a semi ASL program will be presented, then the execution time will be calculated. This section is divided into three subsections, each of which addresses one of the ASLH modules (index processor, secondary storage interface and non-numeric processor).

Some of the expressions which are derived in this section depend on some parameters, viz. depth of stack, relation size, etc. In the next section, these expressions will be evaluated with appropriate assumptions of the values for these parameters.

TABLE 6.1(a)
INSTRUCTION TIME OF ASLM (INDEX PROCESSOR)

OPERATION    DESCRIPTION                                                  EXECUTION TIME (µsec)
ADD                                                                       .4
BRRB         branch relative backward                                     .7
COPY0        store content of comparand register in associative memory    .87 - 2.87
COPY1        load output register from associative memory                 .38 - 1.05
DELT         delete (execution time will be calculated)
FIND         find first equality                                          .2 - .6
INST         insert (execution time will be calculated)
INC or DEC   increment or decrement                                       .4
LI           load immediate                                               .28
LOAD                                                                      .28
MTCH         match                                                        .2n* + 1.08
SET          set validity bits of the rows in the associative memory      .17
SMKR         generate mask                                                .36
SCPR         set comparand register                                       .28
TSBT         test bit on equality                                         .35 - 1.4
WRITE        store content of register in the memory                      1.5

*n: length of each word of the associative memory (in bits)

TABLE 6.1(b)
INSTRUCTION TIME OF ASLM (SECONDARY STORAGE INTERFACE)

OPERATION    DESCRIPTION                                                  EXECUTION TIME (µsec)
EN           energize (start channel processor)                           .13
LI           load immediate                                               .28
LOAD                                                                      .28
READ         store memory in register                                     1.5
WRITE        store register in memory                                     1.5

TABLE 6.1(c)
INSTRUCTION TIME OF ASLM (NON-NUMERIC PROCESSOR)

OPERATION    DESCRIPTION                                                  EXECUTION TIME (µsec)
BRRB         branch relative backward                                     .7
BRRF         branch relative forward                                      .7
COPY0        store content of comparand register in the associative memory  .87 - 2.87
COPY1        load output register from associative memory                 .38 - 1.05
INC or DEC   increment or decrement                                       .4
FIND         find first equality                                          .2 - .6
GSTP
MOVE         will be calculated later
MTCH         match                                                        .2n* + 1.08
RBRF         will be calculated later
ROTATE       rotate right or left U places, 1 ≤ U ≤ 72                    .35
RTAT         will be calculated later
SCTR         will be calculated later
SMKR         generate mask                                                .36
STBT         set bit                                                      .17
TSBT         will be calculated later
WRTE         store register in memory                                     1.5

*n: number of bits in each associative word
Index Processor
Delt. This macro will delete an entry from the associative memory and RAM memory of the index processor. In fact, this macro eliminates a relation from the data base system. The execution time for this operation will be calculated in detail. By a similar procedure we can calculate the execution time of the rest of the operations.

FIND     find first equality
LOAD     load MAR by the address of the descriptor
RSET     reset the validity bit of selected row
LI       load 1 in the MBR
WRITE    write content of MBR on the memory

The execution time is:

$T_{DELT(min)} = 2.43$ µsec
$T_{DELT(max)} = 2.83$ µsec

Inst. This macro will insert a new entry to the associative memory and RAM memory of the index processor.

LOAD     load content of the AR register of the index processor to the AR1 part of comparand
LOAD     load content of the AR register of the SSI to the AR2 part of comparand
LOAD     load SIZE1 part of instruction to SIZE1 part of comparand
LOAD     load SIZE2 part of instruction to SIZE2 part of comparand
COPY
SET      set tag bit of the selected row
LOAD     load MAR from AR register
ADD      add SIZE1 part of instruction to AR
TSBT     if SIZE1 part of comparand is equal to zero branch +5
DEC      decrement SIZE1 part of comparand
LOAD     load MBR
WRITE
INC      increment MAR
BRRB     branch backward 6

Execution time is:

$T_{INST} = 5T_{LOAD} + T_{COPY} + T_{RSET} + T_{ADD} + SIZE1 \times [T_{TSBT} + T_{DEC} + T_{LOAD} + T_{WRITE} + T_{INC} + T_{BRANCH}] + T_{TSBT}$

$T_{INST(min)} = 4.24 + SIZE1 \times [3.63]$ µsec
$T_{INST(max)} = 6.24 + SIZE1 \times [3.63]$ µsec
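The DELT and INST totals for the index processor can be reproduced by summing the entries of Table 6.1(a). The sketch below makes three assumptions that are inferred from the published totals rather than stated in the text: RSET costs the same as SET (.17 µsec), COPY refers to COPY0, and the two ends of the TSBT range correspond to a test that falls through (.35 µsec) and a test that branches (1.4 µsec).

```python
MIN, MAX = 0, 1

# Execution times in usec from Table 6.1(a), stored as (min, max).
T = {
    "FIND": (0.2, 0.6), "LOAD": (0.28, 0.28), "LI": (0.28, 0.28),
    "RSET": (0.17, 0.17),    # ASSUMPTION: same cost as SET
    "COPY0": (0.87, 2.87),   # ASSUMPTION: the COPY step of INST is COPY0
    "WRITE": (1.5, 1.5), "ADD": (0.4, 0.4), "DEC": (0.4, 0.4),
    "INC": (0.4, 0.4), "BRRB": (0.7, 0.7),
}
TSBT_FALL_THROUGH = 0.35   # ASSUMPTION: test fails, no branch taken
TSBT_BRANCH = 1.4          # ASSUMPTION: test succeeds, branch taken


def total(ops, bound):
    """Sum the chosen bound (MIN or MAX) over a list of micro operations."""
    return sum(T[op][bound] for op in ops)


def t_delt(bound):
    return total(["FIND", "LOAD", "RSET", "LI", "WRITE"], bound)


def t_inst(size1, bound):
    setup = total(["LOAD"] * 5 + ["COPY0", "RSET", "ADD"], bound)
    loop = TSBT_FALL_THROUGH + total(["DEC", "LOAD", "WRITE", "INC", "BRRB"], bound)
    # The final TSBT finds SIZE1 equal to zero and branches out of the loop.
    return setup + size1 * loop + TSBT_BRANCH


print(round(t_delt(MIN), 2), round(t_delt(MAX), 2))
print(round(t_inst(0, MIN), 2), round(t_inst(0, MAX), 2))
```

Under these assumptions the constants 2.43, 2.83, 4.24 and 6.24 and the per-iteration cost of 3.63 µsec all follow directly from the table.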
Delete. This function deletes a relation from the system by erasing the associated entries of the relation in the associative memory and RAM memory of the index processor:

SMKR     generate mask register
SCPR     set comparand register
MTCH     match on equality
TSBT     test match bit if true branch forward 2
STBT     set error bit
BRRF     branch forward 1
DELT     delete macro

For a successful operation:

$T_{DELETE} = T_{SMKR} + T_{SCPR} + T_{MTCH} + T_{TSBT} + T_{DELT}$

$T_{DELETE(min)} = 12.23$ µsec
$T_{DELETE(max)} = 12.63$ µsec

For an unsuccessful operation:

$T_{DELETE} = T_{SMKR} + T_{SCPR} + T_{MTCH} + T_{TSBT} + T_{STBT} + T_{BRRF} = 9.62$ µsec

In the above operation we assumed a relation name of length 32 bits.
Insert. Insert creates a new relation in the data base system. The relation will be created by inserting an entry to the associative memory and the RAM memory of the index processor.

SMKR     generate mask register
SCPR     set comparand register
MTCH     match on not equality
TSBT     test match bit if true branch forward 2
STBT     set error bit
BRRF     branch forward n + 1
INST     insert macro

where n specifies the length of the descriptor.
For a successful operation we have:

$T_{INSERT} = T_{SMKR} + T_{SCPR} + T_{MTCH} + T_{TSBT} + T_{INST}$

$T_{INSERT(min)} = 14.04 + SIZE1 \times [3.63]$ µsec
$T_{INSERT(max)} = 16.04 + SIZE1 \times [3.63]$ µsec

For an unsuccessful operation:

$T_{INSERT} = T_{SMKR} + T_{SCPR} + T_{MTCH} + T_{TSBT} + T_{STBT} + T_{BRRF} = 9.62$ µsec
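The Delete and Insert totals for the index processor can be checked in the same way. The sketch below reproduces the published figures for a 32-bit relation name, but only under an assumption that is inferred here and not stated in the text: setting the comparand register (SCPR) is charged as one .28 µsec load per 16-bit word, i.e. .56 µsec for 32 bits. The TSBT values (1.4 µsec when the branch is taken, .35 µsec otherwise) are taken from the discussion of TSBT later in this chapter.

```python
WORD_BITS = 16
T_SMKR, T_LOAD, T_STBT, T_BRRF = 0.36, 0.28, 0.17, 0.70
TSBT_BRANCH, TSBT_FALL_THROUGH = 1.4, 0.35


def t_scpr(name_bits):
    # ASSUMPTION: one 0.28 usec load per 16-bit word of the relation name.
    words = -(-name_bits // WORD_BITS)   # ceiling division
    return words * T_LOAD


def t_mtch(name_bits):
    # Table 6.1(a): .2n + 1.08, with n the word length in bits.
    return 0.2 * name_bits + 1.08


def lookup(name_bits, t_tsbt):
    """Common prefix of Delete and Insert: SMKR, SCPR, MTCH, TSBT."""
    return T_SMKR + t_scpr(name_bits) + t_mtch(name_bits) + t_tsbt


def t_success(name_bits, t_macro):
    """Successful Delete or Insert; t_macro is T_DELT or the constant part of T_INST."""
    return lookup(name_bits, TSBT_BRANCH) + t_macro


def t_failure(name_bits):
    """Unsuccessful operation: set the error bit and branch past the macro."""
    return lookup(name_bits, TSBT_FALL_THROUGH) + T_STBT + T_BRRF


print(round(t_success(32, 2.43), 2), round(t_success(32, 2.83), 2))  # Delete min, max
print(round(t_success(32, 4.24), 2), round(t_success(32, 6.24), 2))  # Insert constants
print(round(t_failure(32), 2))
```

For Insert, the `SIZE1 * 3.63` term of the INST macro is added on top of the constants computed here.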

Secondary Storage Interface
Delt. This macro erases the associated entry in the relation from the RAM memory of the SSI.

LI       load 1 in the MBR
WRITE    write content of MBR on the memory

$T_{DELT} = T_{LOAD} + T_{WRITE}$
$T_{DELT} = 1.78$ µsec

Read. This macro will extract information for each relation in order that the channel program be able to access the data file on the secondary storage.

LOAD     load the beginning address of the descriptor to MAR
READ     initiate read operation
EN       start channel program

$T_{READ} = T_{LOAD} + T_{READ} + T_{EN}$
$T_{READ} = 1.91$ µsec

Non-Numeric Processor
Move. As mentioned before (Chapter V), this instruction is classified into two groups:
i) for MVRR where the variant is equal to (000) or (001):

$T_{MVRR} = T_{LOAD} = .28$ µsec

and for variant equal to (010), (011), (101) and (110):

$T_{MVRR} = n/1024 \times T_{LOAD} = n/1024 \times .28$ µsec

ii) for MVOC we have:

$T_{MVOC} = n/1024 \times T_{LOAD} = n/1024 \times .28$ µsec

where n is the length of the comparand register.
RBRF. Recall from Chapter V that this is in fact two instructions: i) remove blanks, which eliminates blank characters in a record. The execution time would be a function of the length of the record. Since the input to this circuit is the output register of an associative stack, the execution time will be proportional to the length of the output register. The comparison of a character with the blank character will take about .4 µsec:

$T_{RBRF} = n \times .4$ µsec

where n is the length of the output register (in bytes); ii) reformat circuit: recall from Chapter V that the execution time of this unit is in proportion to the length of the record. Since the input to this circuit is one of the output registers of the associative stacks, the execution time will be the same as in the case of remove blanks.

RTAT. Recall from Chapter V that this instruction rotates the content of the "shifter" of the associative stack module by the specified number of bits to the right or left (at most 512 bits). Therefore, the execution time is equal to:

$T_{RTAT} = \lceil n/72 \rceil \times T_{ROTATE} = \lceil n/72 \rceil \times .35$ µsec

The shifting circuit is similar to that of the UNIVAC 1100/80 system, where, by assigning a high speed shift matrix, the execution time for a number of bits up to 72 is fixed and equal to .35 µsec.

SCTR. Recall from Chapter V that for 0000 ≤ variant ≤ 0111 the content of the #WORDS/VALUE specifies how many words follow this instruction as operands. Therefore, the execution time will be:

$T_{SCTR} = \#WORDS/VALUE \times T_{LOAD} = \#WORDS/VALUE \times .28$ µsec

and for 1000 ≤ variant ≤ 1111 the content of the #WORDS/VALUE defines the content of the register; therefore, the execution time will be:

$T_{SCTR} = T_{LOAD} = .28$ µsec

TSBT. Recall from Chapter V that this is a conditional branch where the content of a bit will be tested and control will transfer to a specified location if the test is true. Therefore for a successful test we have:

$T_{TSBT} = 1.4$ µsec

and for an unsuccessful test:

$T_{TSBT} = .35$ µsec
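The timing rules for MOVE, RBRF, RTAT and SCTR given above can be collected into a few small functions. This is a minimal sketch of the stated formulas; the function and constant names are illustrative, and the plain ratio n/1024 is kept exactly as written in the text rather than rounded up to whole bus transfers.

```python
import math

T_LOAD = 0.28            # usec, register load
T_ROTATE = 0.35          # usec, one pass of the shift matrix (up to 72 bits)
T_BLANK_COMPARE = 0.4    # usec, comparison of one character with blank
BUS_BITS = 1024          # width of the non-numeric processor data bus
SHIFT_MATRIX_BITS = 72   # bits handled by one rotate step


def t_mvoc(n_bits):
    """MVOC (and the wide MVRR variants): n/1024 * T_LOAD, as written in the text."""
    return n_bits / BUS_BITS * T_LOAD


def t_rbrf(n_bytes):
    """Remove blanks or reformat: one comparison per byte of the output register."""
    return n_bytes * T_BLANK_COMPARE


def t_rtat(n_bits):
    """Rotate by n bits using ceil(n/72) passes of the shift matrix."""
    return math.ceil(n_bits / SHIFT_MATRIX_BITS) * T_ROTATE


def t_sctr(variant, words_or_value):
    """SCTR: operand words follow for variants 0000-0111, immediate value otherwise."""
    if variant <= 0b0111:
        return words_or_value * T_LOAD
    return T_LOAD


print(t_rtat(72), t_rtat(73), t_rtat(512))
print(t_sctr(0b0011, 4), t_sctr(0b1000, 4))
```

A rotation of the maximum 512 bits therefore needs eight passes of the shift matrix under this model.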
PUSH. This macro inserts a new element into a specified associative stack. The new element resides in the location above the row pointed to by the top of the stack pointer.

TSBT     if stack is full branch to +n
MTCH     match on equality
TSBT     test match bit if true branch to +2
INC      increment stack pointer by 1
COPY     write the content of the comparand register in the associative memory

In this program n determines the displacement. Based on the above program three cases might occur:

i) stack is full:

$T_{PUSH} = T_{TSBT}$
$T_{PUSH} = 1.4$ µsec

ii) stack is not full, but there exists an entry equal to the contents of the comparand register in the associative memory:

$T_{PUSH} = T_{TSBT} + T_{MTCH} + T_{TSBT}$
$T_{PUSH} = 2.83 + .2n$ µsec

iii) stack is not full and match is not successful:

$T_{PUSH} = 2T_{TSBT} + T_{MTCH} + T_{INC} + T_{COPY}$
$T_{PUSH(min)} = 3.05 + .2n$ µsec
$T_{PUSH(max)} = 5.05 + .2n$ µsec

POP. This macro pops the entry pointed to by the top of stack register from a specified associative stack.

TSBT     if stack is empty go to +n
COPY     write the content of the row pointed to by the stack pointer in the output register
DEC      decrement the stack pointer

Therefore, in an unsuccessful operation, when the stack is empty, we have:

$T_{POP} = T_{TSBT}$
$T_{POP} = 1.4$ µsec

and for a successful operation we have:

$T_{POP} = T_{TSBT} + T_{COPY} + T_{DEC}$
$T_{POP(min)} = 1.13$ µsec
$T_{POP(max)} = 1.80$ µsec

Union. Elements of two sets (contents of two associative stacks) are merged in one of the stacks.

SMKR     sets mask register of first stack
SMKR     sets mask register of the second stack
MVRR     save content of the top register of the second stack
POP      pop second stack
MVOC     transfers the content of the output register of 2nd stack to the comparand register of the 1st stack
PUSH     push the 1st stack
BRRB     branch to the POP operation
MVRR     set the top register of the 2nd stack

If V is the depth of the 2nd stack, then the execution time of the operation would be:

$T_{UNION} = 2T_{SMKR} + T_{MVRR} + V[T_{POP} + T_{MVOC} + T_{PUSH} + T_{BRRB}] + T_{POP'} + T_{MVRR}$

where $T_{POP'}$ is the time to perform a pop from an empty stack.

$T_{UNION(min)} = 2.68 + V[4.88 + n(.28/1024 + .2)]$ µsec
$T_{UNION(max)} = 2.68 + V[7.55 + n(.28/1024 + .2)]$ µsec

where n is the number of bits in each associative word and 1024 is the width of the bus line.
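The PUSH, POP and UNION expressions above can be assembled from the micro operation times of Table 6.1(c). The sketch below is one way to do this; it assumes, as the published constants imply, that every push inside the union loop is case iii (stack not full, no duplicate found), that COPY in PUSH is COPY0 and COPY in POP is COPY1, and that n is the associative word length in bits. The names are illustrative.

```python
BUS_BITS = 1024
T_SMKR, T_LOAD, T_INC, T_DEC, T_BRRB = 0.36, 0.28, 0.4, 0.4, 0.7
T_COPY0 = {"min": 0.87, "max": 2.87}   # comparand register -> associative memory
T_COPY1 = {"min": 0.38, "max": 1.05}   # associative memory -> output register
TSBT_BRANCH, TSBT_FALL_THROUGH = 1.4, 0.35
T_POP_EMPTY = TSBT_BRANCH              # POP': the empty-stack test branches at once


def t_mtch(n):
    return 0.2 * n + 1.08


def t_push(n, bound):
    """Case iii: two failed tests, a match, an increment and a copy."""
    return 2 * TSBT_FALL_THROUGH + t_mtch(n) + T_INC + T_COPY0[bound]


def t_pop(bound):
    """Successful pop: failed empty test, copy to output register, decrement."""
    return TSBT_FALL_THROUGH + T_COPY1[bound] + T_DEC


def t_union(v, n, bound):
    t_mvrr = T_LOAD
    t_mvoc = n / BUS_BITS * T_LOAD
    per_element = t_pop(bound) + t_mvoc + t_push(n, bound) + T_BRRB
    return 2 * T_SMKR + t_mvrr + v * per_element + T_POP_EMPTY + t_mvrr


# Example: second stack of depth 10, 32-bit associative words.
print(round(t_union(10, 32, "min"), 4), round(t_union(10, 32, "max"), 4))
```

Setting `v = 0` recovers the constant 2.68 µsec, and the per-element cost reduces to 4.88 or 7.55 µsec plus the word-length term.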
Intersection. The common elements of two stacks are inserted into a third stack:

SMKR     sets the mask register of the 1st stack
SMKR     sets the mask register of the 2nd stack
SMKR     sets the mask register of the 3rd stack
STBT     resets the top register of the 3rd stack
STBT     resets the validity bits of all the rows of the 3rd stack
MVRR     saves the content of the top register of the 2nd stack
POP      pops 2nd stack
MVOC     transfers the content of the output register of the 2nd stack to the comparand register of 1st stack
MVOC     transfers the content of the output register of the 2nd stack to the comparand register of 3rd stack
MTCH     match 1st stack on equality
TSBT     if match is successful go to +1
PUSH'    push 3rd stack
BRRB     go to POP
MVRR     set the top register of the 2nd stack

Recall from our previous discussion that the "PUSH" operation is preceded by a match on equality. In our system duplication is not allowed in the stacks; therefore, the push macro can be performed without the match on equality. This has been shown by the (PUSH') in the above program. If the depth of the 2nd stack is V and the number of common elements in the 1st and 2nd stacks is V', then the execution time will be:

$T_{INTERSECTION} = 3T_{SMKR} + 2T_{STBT} + 2T_{MVRR} + V(T_{POP} + 2T_{MVOC} + T_{MTCH} + T_{BRRB} + T_{TSBT}) + V'(T_{PUSH'} + T_{TSBT(successful)} - T_{TSBT(unsuccessful)}) + T_{POP'}$

$T_{INTERSECTION(min)} = 3.38 + V(3.26 + n(.56/1024 + .2)) + (2.67)V'$ µsec
$T_{INTERSECTION(max)} = 3.38 + V(3.93 + n(.56/1024 + .2)) + (4.67)V'$ µsec

where POP' is the same as discussed in the UNION operation.
Difference. From Appendix II it can be seen that the sequence of operations for this operation is the same as for the intersection, except that the match would be on not equality. A close investigation shows that the execution times of both operations are equal. Therefore:

$T_{DIFFERENCE(min)} = 3.38 + V(3.26 + n(.56/1024 + .2)) + (2.67)V'$ µsec
$T_{DIFFERENCE(max)} = 3.38 + V(3.93 + n(.56/1024 + .2)) + (4.67)V'$ µsec

where V and V' are the same as discussed in the intersection.
Inclusion. Recall from Chapter IV that this function determines the inclusion of a set in another set. The sequence of operations is as follows:

SMKR     sets mask register of the 1st stack
SMKR     sets mask register of the 2nd stack
STBT     resets the inclusion bit
MVRR     save content of top register of the 2nd stack
POP      pop 2nd stack
MVOC     transfer the contents of the output register of the 2nd stack to the comparand register of the 1st stack
MTCH     match 1st stack on equality
TSBT     if match is successful go to POP
BRRF     go to +1
STBT     set inclusion bit
MVRR     set the top register of the 2nd stack

The operation is successful if the 2nd stack is included in the 1st stack. Execution time is a function of the depth of the 2nd stack (V). Therefore,

i) for a successful operation:

$T_{INCLUSION} = 2T_{SMKR} + 2T_{STBT} + 2T_{MVRR} + T_{POP'} + V[T_{POP} + T_{MVOC} + T_{MTCH} + T_{TSBT}]$

$T_{INCLUSION(min)} = (3.02) + V[3.61 + n(.28/1024 + .2)]$ µsec
$T_{INCLUSION(max)} = (3.02) + V[4.28 + n(.28/1024 + .2)]$ µsec

ii) for an unsuccessful operation:

$T_{INCLUSION} = 2T_{SMKR} + 2T_{MVRR} + T_{STBT} + (V'+1)[T_{POP} + T_{MVOC} + T_{MTCH} + T_{TSBT}] - 1.05$

$T_{INCLUSION(min)} = .4 + (V'+1)[3.61 + n(.28/1024 + .2)]$ µsec
$T_{INCLUSION(max)} = .4 + (V'+1)[4.28 + n(.28/1024 + .2)]$ µsec

where 0 ≤ V' ≤ V and V is the depth of the 2nd stack.
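The inclusion timings can be reproduced from the micro operation times as sketched below. Two points are interpretations made here rather than statements from the text: the leading pop term in the successful case is taken as POP' (a pop from the empty stack, 1.4 µsec), and the 1.05 µsec correction in the unsuccessful case is read as the difference between a branching and a non-branching TSBT on the final iteration. The function names are illustrative.

```python
BUS_BITS = 1024
T_SMKR, T_STBT, T_MVRR, T_DEC = 0.36, 0.17, 0.28, 0.4
T_COPY1 = {"min": 0.38, "max": 1.05}
TSBT_BRANCH, TSBT_FALL_THROUGH = 1.4, 0.35
T_POP_EMPTY = TSBT_BRANCH   # ASSUMPTION: POP' is a single branching test


def t_pop(bound):
    return TSBT_FALL_THROUGH + T_COPY1[bound] + T_DEC


def t_mvoc(n):
    return n / BUS_BITS * 0.28


def t_mtch(n):
    return 0.2 * n + 1.08


def step(n, bound):
    """One loop pass: pop, move to comparand, match, test and branch back."""
    return t_pop(bound) + t_mvoc(n) + t_mtch(n) + TSBT_BRANCH


def inclusion_success(v, n, bound):
    setup = 2 * T_SMKR + 2 * T_STBT + 2 * T_MVRR + T_POP_EMPTY
    return setup + v * step(n, bound)


def inclusion_failure(v_prime, n, bound):
    # ASSUMPTION: the last test does not branch, hence the 1.05 usec correction.
    correction = TSBT_BRANCH - TSBT_FALL_THROUGH
    setup = 2 * T_SMKR + 2 * T_MVRR + T_STBT
    return setup + (v_prime + 1) * step(n, bound) - correction


print(round(inclusion_success(10, 32, "min"), 4))
print(round(inclusion_failure(3, 32, "max"), 4))
```

With `v = 0` the successful case reduces to the constant 3.02 µsec, and the unsuccessful setup minus the correction gives the constant .4 µsec.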

Set Equality. This function determines whether two sets (contents of two associative stacks) are the same or not, regardless of the order of the elements.

SMKR     set mask register of 1st stack
SMKR     set mask register of 2nd stack
MVRR     save contents of the top register of 1st stack
MVRR     save contents of the top register of 2nd stack
STBT     set the equality bit to zero
POP      pop 1st stack
POP      pop 2nd stack
MVOC     transfer the content of the output register of 2nd stack to the comparand register of the 1st stack
MVOC     transfer the content of the output register of the 1st stack to the comparand register of the 2nd stack
MTCH     match 1st stack on equality
TSBT     if match is successful branch +1
BRRF     go to the end of macro
MTCH     match 2nd stack on equality
TSBT     if match is successful branch to the 1st POP
BRRF     go to the end of macro
TSBT     if second stack is empty go to +1
BRRF     go to the end of macro
STBT     set equality bit to 1
MVRR     set the top register of the 1st stack
MVRR     set the top register of the 2nd stack

In case the two sets are equal the execution time is:

$T_{EQUALITY} = 2T_{SMKR} + 4T_{MVRR} + 2T_{STBT} + T_{TSBT} + T_{POP'} + 2V[T_{POP} + T_{MVOC} + T_{TSBT} + T_{MTCH}]$

$T_{EQUALITY(min)} = 4.98 + 2V[3.61 + n(.28/1024 + .2)]$ µsec
$T_{EQUALITY(max)} = 4.98 + 2V[4.28 + n(.28/1024 + .2)]$ µsec

In case the sets are not equal, the operation is terminated in one of three different cases:

i) termination occurs in the 1st stack:

$T_{EQUALITY} = 2T_{SMKR} + 4T_{MVRR} + T_{STBT} + T_{TSBT} + T_{BRRF} + 2(V'+1)(T_{POP} + T_{MVOC}) + (2V'+1)T_{MTCH} + 2V'T_{TSBT}$

$T_{EQUALITY(min)} = (3.06) + 2(V'+1)(1.13 + (.28/1024)n) + (2V'+1)(.2n + 1.08) + 2V'(1.4)$ µsec
$T_{EQUALITY(max)} = (3.06) + 2(V'+1)(1.8 + (.28/1024)n) + (2V'+1)(.2n + 1.08) + 2V'(1.4)$ µsec

where 1 ≤ V' ≤ V and V is the depth of the 1st stack.

ii) termination occurs in the 2nd stack:

$T_{EQUALITY} = 2T_{SMKR} + 4T_{MVRR} + T_{STBT} + T_{TSBT} + T_{BRRF} + 2(V'+1)(T_{POP} + T_{MVOC} + T_{MTCH}) + (2V'+1)(T_{TSBT})$

$T_{EQUALITY(min)} = (3.06) + 2(V'+1)(2.21 + n(.28/1024 + .2)) + (2V'+1)(1.4)$ µsec
$T_{EQUALITY(max)} = (3.06) + 2(V'+1)(2.88 + n(.28/1024 + .2)) + (2V'+1)(1.4)$ µsec

where 1 ≤ V' ≤ V and V is the depth of the 2nd stack.

iii) termination occurs at the end of the 1st stack, meaning the 1st stack is a subset of the 2nd stack:

$T_{EQUALITY} = 2T_{SMKR} + 4T_{MVRR} + T_{STBT} + 2V[T_{POP} + T_{MVOC} + T_{MTCH} + T_{TSBT}] + T_{POP'} + T_{TSBT} + T_{BRRF}$

$T_{EQUALITY(min)} = (4.46) + 2V(3.61 + n(.28/1024 + .2))$ µsec
$T_{EQUALITY(max)} = (4.46) + 2V(4.28 + n(.28/1024 + .2))$ µsec

where V is the depth of the first stack.
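The set equality timings for the equal case and for the first two termination cases can be checked against the micro operation times. The sketch below is illustrative only. The choice of which TSBT value (1.4 µsec branching, .35 µsec falling through) enters each constant is inferred from the published constants 4.98 and 3.06, and the leading pop in the equal case is taken as POP' (1.4 µsec); none of this is spelled out in the text.

```python
BUS_BITS = 1024
T_SMKR, T_STBT, T_MVRR, T_BRRF, T_DEC = 0.36, 0.17, 0.28, 0.7, 0.4
T_COPY1 = {"min": 0.38, "max": 1.05}
TSBT_BRANCH, TSBT_FALL_THROUGH = 1.4, 0.35
T_POP_EMPTY = TSBT_BRANCH   # ASSUMPTION: POP' is a single branching test


def t_pop(bound):
    return TSBT_FALL_THROUGH + T_COPY1[bound] + T_DEC


def t_mvoc(n):
    return n / BUS_BITS * 0.28


def t_mtch(n):
    return 0.2 * n + 1.08


def equal_sets(v, n, bound):
    """Both stacks hold the same elements; the loop body runs 2V times."""
    setup = 2 * T_SMKR + 4 * T_MVRR + 2 * T_STBT + TSBT_BRANCH + T_POP_EMPTY
    body = t_pop(bound) + t_mvoc(n) + TSBT_BRANCH + t_mtch(n)
    return setup + 2 * v * body


def unequal_setup():
    """Constant part shared by termination cases i and ii (3.06 usec)."""
    return 2 * T_SMKR + 4 * T_MVRR + T_STBT + TSBT_FALL_THROUGH + T_BRRF


def stop_in_first(v_prime, n, bound):
    """Case i: the mismatch is detected by the match on the 1st stack."""
    return (unequal_setup()
            + 2 * (v_prime + 1) * (t_pop(bound) + t_mvoc(n))
            + (2 * v_prime + 1) * t_mtch(n)
            + 2 * v_prime * TSBT_BRANCH)


def stop_in_second(v_prime, n, bound):
    """Case ii: the mismatch is detected by the match on the 2nd stack."""
    return (unequal_setup()
            + 2 * (v_prime + 1) * (t_pop(bound) + t_mvoc(n) + t_mtch(n))
            + (2 * v_prime + 1) * TSBT_BRANCH)


print(round(equal_sets(10, 32, "min"), 4))
print(round(stop_in_first(1, 32, "min"), 4), round(stop_in_second(1, 32, "min"), 4))
```

Case i is always cheaper than case ii for the same V', since it performs one match fewer and one test fewer before terminating.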
Cartesian Product. Recall from Chapter IV that in this operation each element of the 1st set will be concatenated with the elements of the 2nd set in a 3rd stack. The sequence of operations is as follows:

SMKR     sets mask register of the first stack
SMKR     sets mask register of the second stack
SMKR     sets mask register of the third stack
MVRR     save contents of top register of first stack
STBT     sets content of top register of the third stack
POP      pops first stack
MVRR     transfers the contents of the output register of the first stack to the "shift1"
MVRR     save contents of top register of second stack
POP      pops second stack
MVRR     transfers contents of output register of second stack to "shift2"
MVRR     transfers contents of "shifter" to the comparand register of third stack
PUSH     push third stack
BRRB     go to the "pop second stack"
MVRR     sets contents of top register of second stack
BRRB     go to the "pop first stack"
MVRR     sets contents of top register of first stack

$T_{CARTESIAN} = 3T_{SMKR} + 2T_{MVRR} + T_{STBT} + T_{POP'} + V[T_{POP} + T_{POP'} + 2T_{MVRR} + T_{MVRR} + T_{BRRB} + V'[T_{POP} + T_{MVRR} + T_{PUSH'} + T_{BRRB}]]$

where V and V' are the depths of the first and second stacks, and PUSH' and POP' are the same as discussed before.

$T_{CARTESIAN(min)} = 3.21 + V[3.79 + (.28/1024)n + V'[3.8 + (.28/1024)(n+n')]]$ µsec
$T_{CARTESIAN(max)} = 3.21 + V[4.46 + (.28/1024)n + V'[6.47 + (.28/1024)(n+n')]]$ µsec

Selection. This instruction creates a new relation out of the selected elements of a previously generated relation:

SMKR     sets mask register of the first stack
SMKR     sets mask register of the second stack
MVRR     saves contents of top register of first stack
STBT     sets the contents of top register of second stack
TSBT     if match bit of the first stack is true go to +1
BRRF     go to the end of the macro
FIND     find first match and reset it
COPY1    write the content of the row pointed to by the top register to the output register of the first stack
MVOC     transfer the contents of the output register of the first stack to the comparand register of the second stack
PUSH     push second stack
BRRB     go to the "TSBT"
MVRR     sets contents of top register of first stack

$T_{SELECTION} = 2T_{SMKR} + 2T_{MVRR} + T_{BRRF} + T_{TSBT} + V'[T_{TSBT} + T_{FIND} + T_{COPY1} + T_{MVOC} + T_{PUSH'} + T_{BRRB}]$

where V' is the number of iterations and PUSH' is the same as discussed before:

$T_{SELECTION(min)} = 2.15 + V'[4.65 + n/1024 \times .28]$ µsec
$T_{SELECTION(max)} = 2.15 + V'[7.72 + n/1024 \times .28]$ µsec

Projection. Recall from Chapter IV that this instruction will project the elements of a set on prespecified domains.

SMKR     sets mask register of the first stack based on defined format
SMKR     sets mask register of the second stack
STBT     sets top register of second stack
MVRR     saves content of top register of the first stack
POP      pop first stack
MVRR     transfers contents of the output register of the first stack to remove blank circuit
RBRF     remove blanks
MVRR     transfers contents of remove blanks circuit to comparand register of second stack
PUSH     push second stack
BRRB     go to POP
MVRR     sets the top register of first stack

$T_{PROJECTION} = T_{SMKR} + T_{SMKR} + T_{STBT} + 2T_{MVRR} + T_{POP} + V[T_{POP} + 2T_{MVRR} + T_{RBRF} + T_{PUSH'} + T_{BRRB}]$

where V is the depth of the first stack and "PUSH'" is the same as discussed before.

$T_{PROJECTION(min)} = .36(\#WORDS + 1) + .87 + V[3.1 + n(.56/1024 + .4/8)]$ µsec
$T_{PROJECTION(max)} = .36(\#WORDS + 1) + .87 + V[4.67 + n(.56/1024 + .4/8)]$ µsec

Join.

Recall from Chapter II, that join is an interrelational

function between two sets.
SMKR    sets mask register of first stack to all 1
SMKR    sets mask register of second stack based on
        defined format(s)
SMKR    sets mask register of third stack
STBT    set top register of third stack
MVRR    saves contents of top register of first stack
MVRR    saves contents of top register of second stack
POP     pops first stack
MVRR    transfers content of output register of first
        stack to "shift"
RTAT    circulate the "shift"
MVRR    transfers content of "shift" to comparand register
        of second stack
RTAT    circulate the "shift"
MVRR    saves contents of top register of second stack
MTCH    match
TSBT    if match bit is true go to +1
BRRB    go to POP
FIND    find first match and reset
MVRR    transfers content of the pointer register to
        the top register of the second stack
COPY    write the content of the pointed row of the
        second stack in the output register
MVRR    transfers the content of the output register
        of the second stack to the "shift"
MVRR    transfers content of "shift" to comparand register
        of third stack
PUSH    push third stack
BRRB    go to TSBT
MVRR    sets top register of first stack
MVRR    sets top register of second stack
$T_{JOIN} = 3T_{SMKR} + T_{STBT} + 4T_{MVRR} + T_{POP'} + V[T_{POP} + T_{MVRR} + 2T_{RTAT} + 2T_{MVRR} + T_{MTCH} + T_{TSBT} + T_{BRRB} + V'_{a}[T_{TSBT} + T_{FIND} + T_{MVRR} + T_{COPY} + 2T_{MVRR} + T_{PUSH'} + T_{BRRB}]]$

where V is the depth of the first stack and $V'_{a}$ is the average number
of elements in the first stack which satisfy the match in each
iteration, $0 \le V'_{a} \le V'$, where V' is the depth of the second stack.

$T_{JOIN(min)} = 3.77 + V[3.54 + n(.56/1024) + (.2)n' + .7\lceil n''/72 \rceil + V'_{a}[4.33 + (.56/1024)n]]$ µsec

$T_{JOIN(max)} = 3.77 + V[4.21 + n(.56/1024) + (.2)n' + .7\lceil n''/72 \rceil + V'_{a}[7.30 + (.56/1024)n]]$ µsec

In the above formulas n is the width of the associative stacks,
n' is the number of bits of the field which participates in the join,
and n'' is the number of bits by which "Shifter" is going to be rotated.

Evaluation of ASLM
In the previous section the execution times of the ASLH micro
operations and ASLH macros have been calculated based on Table 6.1.
As can be seen, most of the equations are functions of one or more
variables such as the depth of the associative stacks, the length of
each row of the associative stacks and so on. The allowable maximum
size of these variables depends on current technology; each parameter
(variable) will be specified and each function will be calculated.

Recall from Chapter II that an associative memory board of size 4K
bytes is available, and we can provide associative memory of size 32K
bytes by attaching 8 boards together. Current technology is capable
of providing associative memory of size about $10^{6}$ bytes easily. This
means associative memories of depth $10^{4}$ words and width 256 bytes are
available. Based on the 90 - 10% rule (Chapter I) and the fact that
in each query just a small portion of the data file can satisfy the
search criteria, we can claim that an associative memory of the above
depth can handle data files of $10^{6}$ records. Moreover, from Chapter I
we know that in data base systems users can have access to a subset
of domains in each record (at most 10%). Therefore, an associative
memory of width 256 bytes is capable of handling records of size $10^{4}$
bits. Therefore, our system is capable of handling data files of
size $10^{10}$ - $10^{11}$ bits, which is a sufficient size for most of the data
bases used today.

This section is divided into three subsections which correspond
to the three ASLH modules. In each subsection the execution time for
each operation of that module will be calculated. For the non-numeric
operations, calculations are based on associative memories
of depths $2^{7}$, $2^{8}$, $2^{9}$ and $2^{10}$ words and widths of $2^{7}$ and $2^{8}$ bytes. It
would be useful to calculate the execution time of the ASL functions
and compare the results with the execution time of the same functions
on the current conventional systems and other data base machines.
This will be addressed later in Chapter VII.
Index Processors
Table 6.2(a) gives the execution time for the ASLH operations
and macros which are going to be executed by this module. Each entry
in the table has two values, the maximum and the minimum execution
time; moreover, for the "insert" and "delete" macros the execution
times for unsuccessful operations have also been calculated.

Recall from the previous section that the only variables in the
equations of this module are the size of the relation names and the
size of the descriptors in the random access memory of the index
processor. A close investigation of the "INST" instruction (Chapter
V) shows that the maximum size of the descriptor would be equal to
$2^{6}$ = 64 words; moreover, the size of the relation names is fixed and
equal to 32 bits.
Secondary Storage Interface
The only parameter involved in the operations is the size of
the descriptor in the RAM memory. The maximum size for the descriptor
is equal to $2^{4}$ = 16 words. Table 6.2(b) defines the execution time
for this module.
Non-Numeric Processor
In this section the timing equations of those operations of the
previous section which are dependent on some parameter(s) will be
investigated. At the end of the section appropriate tables give the
execution time of different operations in the non-numeric processor.

MOVE. Recall that:

$T_{MVRR} = (n/1024) \times .28$ and $T_{MVOC} = (n/1024) \times .28$

where n is the number of bits which should be transferred. Therefore,
for n equal to 1024

$T_{MVOC} = T_{MVRR} = .28$ µsec

and for n equal to 2048

$T_{MVOC} = T_{MVRR} = .56$ µsec
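As a quick check of the two MOVE values, the formula can be evaluated directly. This is a minimal sketch; the function name is an illustrative assumption and not an ASLM identifier.

```python
def move_time_usec(n_bits):
    """T_MVRR (equal to T_MVOC) in microseconds for a transfer of n_bits bits."""
    return n_bits / 1024 * 0.28


for n in (1024, 2048):
    print(f"n = {n:4d} bits: {move_time_usec(n):.2f} usec")
```

The transfer time is proportional to the number of bits moved, which is why the same n/1024 term reappears inside the macro equations.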

TABLE 6.2(a)
THE EXECUTION TIME OF THE INDEX PROCESSOR

                            EXECUTION TIME (µsec)
OPERATION                   MINIMUM      MAXIMUM

INST                        236.56       238.56
INSERT (successful)         246.36       248.36
INSERT (unsuccessful)         9.62         9.62
DELT                          2.43         2.83
DELETE (successful)          12.23        12.63
DELETE (unsuccessful)         9.62         9.62

TABLE 6.2(b)
THE EXECUTION TIME OF THE SECONDARY STORAGE INTERFACE

                EXECUTION TIME (µsec)
OPERATION       MINIMUM      MAXIMUM

DELT              1.78         1.78
READ             24.41        24.41
RBRF. The execution time of this operation in both cases
(Chapter V) is:

$T_{RBRF} = n \times .2$ µsec

where n is the width of the associative stacks in bytes. Therefore,
for a width of 1024 bits

$T_{RBRF} = 1024/8 \times .2 = 25.6$ µsec

and for a width of 2048 bits

$T_{RBRF} = 2048/8 \times .2 = 51.2$ µsec

RTAT. The execution time is defined by:

$T_{RTAT} = \lceil n/72 \rceil \times .35$ µsec

where n is the number of bits by which "SHIFT" should be rotated to the
right or left. Recall from Chapter V that n varies from 1 up to 512.
Therefore:

$T_{RTAT(min)} = \lceil 1/72 \rceil \times .35 = .35$ µsec

$T_{RTAT(max)} = \lceil 512/72 \rceil \times .35 = 8 \times .35 = 2.8$ µsec

SCTR. Recall that,

$T_{SCTR} = \lceil \#WORDS/VALUE \rceil \times .28$ µsec

The above formula shows that the execution time is a function of
the user's query and the user's accessibility to the fields of the
relation. The 90% - 10% rule and the accessibility of each user to a
part of the fields in a tuple will help us to make a reasonable
assumption for the value of "#WORDS/VALUE." For a tuple of $2^{6}$ fields
and an accessibility percent of 20 we have:

$T_{SCTR} = \lceil 1/2 \times 10/100 \times 64 \rceil \times .28 = 1.12$ µsec
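The three micro operation timings above (RBRF, RTAT and SCTR) can be reproduced with a short Python sketch. The function names are illustrative assumptions; the constants and the ceiling operation are taken from the formulas, and the SCTR argument simply repeats the factors of the worked example.

```python
import math


def rbrf_time_usec(width_bits):
    """Remove-blanks time: .2 usec per byte of stack width."""
    return width_bits / 8 * 0.2


def rtat_time_usec(rotate_bits):
    """Rotate time: ceil(n/72) steps of .35 usec each."""
    return math.ceil(rotate_bits / 72) * 0.35


def sctr_time_usec(words_per_value):
    """SCTR time: ceil(#WORDS/VALUE) steps of .28 usec each."""
    return math.ceil(words_per_value) * 0.28


print(round(rbrf_time_usec(1024), 2), round(rbrf_time_usec(2048), 2))
print(round(rtat_time_usec(1), 2), round(rtat_time_usec(512), 2))
# Factors copied from the worked SCTR example: 1/2 * 10/100 * 64.
print(round(sctr_time_usec(0.5 * 10 / 100 * 64), 2))
```

Because of the ceiling, RTAT and SCTR change in steps: any rotation of 1 to 72 bits costs the same .35 µsec.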

PUSH. Recall that:

i) if the stack is not full, and the search through the stored data in
the associative stack is successful, then:

$T_{PUSH} = 2.83 + .2n$ µsec

where n is the width of the associative stacks. Therefore, for n
equal to 1024:

$T_{PUSH} = 207.63$ µsec

and for n equal to 2048:

$T_{PUSH} = 412.43$ µsec

ii) if the stack is not full and the search through the contents of the
associative stack is not successful, then:

$T_{PUSH(min)} = 3.05 + .2n$ µsec and
$T_{PUSH(max)} = 5.05 + .2n$ µsec

where n is the width of the associative memory. Therefore, for n
equal to 1024

$T_{PUSH(min)} = 207.85$ µsec
$T_{PUSH(max)} = 209.85$ µsec

and for n equal to 2048

$T_{PUSH(min)} = 412.65$ µsec
$T_{PUSH(max)} = 414.65$ µsec

An execution time of 207.85 or 412.65 µsec seems unrealistic for
an operation in comparison with even complex arithmetic operations,
but if we remember that the above times are equal to searching
through data files of $10^{6}$ - $10^{7}$ bits, then we realize the power of
the associative operations.
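The PUSH figures quoted above follow directly from the two cases. The sketch below is only an illustration: the function name and the boolean flag are assumptions, while the constants 2.83, 3.05, 5.05 and .2 come from the equations.

```python
def push_time_usec(width_bits, search_successful):
    """Return (min, max) PUSH time in microseconds for a stack that is not full."""
    scan = 0.2 * width_bits                   # the .2n term
    if search_successful:
        t = 2.83 + scan                       # case i): a single value
        return t, t
    return 3.05 + scan, 5.05 + scan           # case ii): min and max


for n in (1024, 2048):
    hit = push_time_usec(n, True)[0]
    miss_min, miss_max = push_time_usec(n, False)
    print(f"n = {n}: hit {hit:.2f}, miss {miss_min:.2f} .. {miss_max:.2f} usec")
```

In every case the .2n term dominates, so the cost of a PUSH is essentially the cost of the associative search over the stack width.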
Union. Recall that the developed equation is:

$T_{UNION(min)} = 2.68 + V[4.88 + n(.28/1024 + .2)]$ µsec and

$T_{UNION(max)} = 2.68 + V[7.55 + n(.28/1024 + .2)]$ µsec

where n and V are respectively the width and depth of the associative
stacks. Table 6.2(c) gives the execution time of the union operation
for different values of n and V.
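The union equation can be tabulated for the widths and depths used in this section with a few lines of Python. This is a sketch under the assumption that n is expressed in bits (1024 or 2048); the function name is illustrative. The maxima computed this way agree with Table 6.2(c) to about 0.001 msec, while some tabulated minima differ from the formula by up to roughly 0.01 msec.

```python
def union_time_msec(n, v, worst_case):
    """Union execution time in milliseconds for width n (bits) and depth v (words)."""
    per_row = (7.55 if worst_case else 4.88) + n * (0.28 / 1024 + 0.2)
    return (2.68 + v * per_row) / 1000.0      # microseconds -> milliseconds


print("   n      V   minimum   maximum")
for n in (1024, 2048):
    for v in (128, 256, 512, 1024):
        t_min = union_time_msec(n, v, worst_case=False)
        t_max = union_time_msec(n, v, worst_case=True)
        print(f"{n:4d}  {v:5d}  {t_min:8.3f}  {t_max:8.3f}")
```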
Inclusion. From the previous section we know that for a successful
operation, where the second set is included in the first set, the
execution time is a function of the width of the stacks (n) and the
depth of the second stack (V). Table 6.2(d) shows the execution time
of the operation for different values of n and V. It should be
emphasized that for an unsuccessful operation, when the second set is
not a subset of the first set, the execution time is bounded by the
time of the successful operation.
Projection. The execution time can be defined by:

$T_{PROJECTION(min)} = .36(\#WORDS + 1) + .87 + V[3.1 + n(.56/1024 + .4/8)]$ µsec

$T_{PROJECTION(max)} = .36(\#WORDS + 1) + .87 + V[4.67 + n(.56/1024 + .4/8)]$ µsec

For #WORDS equal to 1 and different values of n and V
the execution times are given in Table 6.2(e).
As can be seen from Tables 6.2(c, d and e), the execution time is
directly a function of the width and depth of the stacks. This fact
is depicted more precisely in Figures 6.1 - 6.3.
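The projection entries of Table 6.2(e) can be regenerated from the two equations. The sketch below assumes n is given in bits and #WORDS is 1, as stated in the text; the function and parameter names are illustrative.

```python
def projection_time_msec(n, v, words=1, worst_case=False):
    """Projection execution time in milliseconds for width n (bits) and depth v."""
    setup = 0.36 * (words + 1) + 0.87
    per_row = (4.67 if worst_case else 3.1) + n * (0.56 / 1024 + 0.4 / 8)
    return (setup + v * per_row) / 1000.0     # microseconds -> milliseconds


print("   n      V   minimum   maximum")
for n in (1024, 2048):
    for v in (128, 256, 512, 1024):
        t_min = projection_time_msec(n, v)
        t_max = projection_time_msec(n, v, worst_case=True)
        print(f"{n:4d}  {v:5d}  {t_min:8.3f}  {t_max:8.3f}")
```

The setup term is under 2 µsec, so for any realistic depth the result is dominated by the per-row cost multiplied by V.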
Join. The execution time is defined by:

$T_{JOIN(min)} = 3.77 + V[3.54 + n(.56/1024) + (.2)n' + .7\lceil n''/72 \rceil + V'_{a}[4.33 + (.56/1024)n]]$ µsec

$T_{JOIN(max)} = 3.77 + V[4.21 + n(.56/1024) + (.2)n' + .7\lceil n''/72 \rceil + V'_{a}[7.30 + (.56/1024)n]]$ µsec

As can be seen, this interrelational operation is a function of four
parameters: n, the width of the associative memories; n', the
length of the field which participates in the operation; V, the depth
of the second associative stack which is specified in the operation;
and $V'_{a}$, the average number of elements in the first stack which
satisfy the match. The value of V' is not fixed and varies
for each iteration; therefore, in the above formula we are using an
average value for V' during the operation. The value of $V'_{a}$
is unpredictable and depends on the data file. In Table 6.2(f)
the execution time for the join operation is calculated for $V'_{a}$, n' and n''
equal to 20, 50 and 256 respectively.
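The join equation, with the stated parameter values of 20, 50 and 256, can be evaluated as follows. This is a sketch: the function and parameter names are illustrative assumptions, n is taken in bits, and the rotate term is assumed to use the same ceiling as the RTAT formula. With these assumptions the computed values agree with the entries of Table 6.2(f).

```python
import math


def join_time_msec(n, v, v_avg=20, field_bits=50, rotate_bits=256, worst_case=False):
    """Join execution time in milliseconds.

    n           -- width of the associative stacks in bits
    v           -- depth of the stack that is scanned
    v_avg       -- average number of matching elements per iteration
    field_bits  -- n', length of the join field in bits
    rotate_bits -- n'', number of bit positions the shifter is rotated
    """
    outer, inner = (4.21, 7.30) if worst_case else (3.54, 4.33)
    move = 0.56 / 1024 * n
    per_row = (outer + move + 0.2 * field_bits
               + 0.7 * math.ceil(rotate_bits / 72)
               + v_avg * (inner + move))
    return (3.77 + v * per_row) / 1000.0      # microseconds -> milliseconds


print("   n      V   minimum   maximum")
for n in (1024, 2048):
    for v in (128, 256, 512, 1024):
        print(f"{n:4d}  {v:5d}  {join_time_msec(n, v):8.3f}"
              f"  {join_time_msec(n, v, worst_case=True):8.3f}")
```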
An Example of an ASL Program
In this part an ASL program will be examined and its execution time
will be calculated. We made the following assumptions: i) The
relations are the same as those which have been used in Chapter III,
where the maximum lengths for ENAME, ENO, LOCATION, ONO and HOUR
are respectively 30, 8, 8, 3 and 3 bytes. ii) The tuples are
distributed in the relations randomly; this is a valid assumption since
in our system data are not presorted on any specified field(s). iii)
At most 1% of the records satisfy the search criteria; this is a valid
assumption because of the so called 90% - 10% rule discussed in
Chapter I. iv) The data flow rate from the cells to the associative
stack module is bounded by the insertion of an entry to the associative
stack. In other words, there always exists data available to

TABLE 6.2(c)
THE EXECUTION TIME OF THE UNION OPERATION

                  Execution Time (msec)
   n       V      Minimum      Maximum

1024     128       26.876       27.219
1024     256       53.750       54.436
1024     512      107.497      108.869
1024    1024      214.991      217.736
2048     128       53.128       53.469
2048     256      106.253      106.936
2048     512      212.503      213.870
2048    1024      425.004      427.738

TABLE 6.2(d)
THE EXECUTION TIME OF THE INCLUSION OPERATION

                  Execution Time (msec)
   n       V      Minimum      Maximum

1024     128       26.715       26.801
1024     256       53.428       53.599
1024     512      106.852      107.195
1024    1024      213.701      214.388
2048     128       52.930       53.015
2048     256      105.856      106.028
2048     512      211.710      212.053
2048    1024      423.417      424.103

TABLE 6.2(e)
THE EXECUTION TIME OF THE PROJECTION OPERATION

                  Execution Time (msec)
   n       V      Minimum      Maximum

1024     128        7.024        7.225
1024     256       14.046       14.448
1024     512       28.090       28.894
1024    1024       56.178       57.786
2048     128       13.649       13.850
2048     256       27.296       27.698
2048     512       54.591       55.395
2048    1024      109.180      110.788

Figure 6.1  Execution time vs. depth (union). T (msec) is plotted
against V for n = 512, 1024, 2048 and 4096; dashed horizontal lines
mark memory sizes of $2^{17}$, $2^{18}$, $2^{19}$ and $2^{20}$ bits.

Figure 6.2  Execution time vs. depth (inclusion). Same axes and
curves as Figure 6.1.

Figure 6.3  Execution time vs. depth (projection). Same axes and
curves as Figure 6.1.

TABLE 6.2(f)
THE EXECUTION TIME OF THE JOIN OPERATION

                  EXECUTION TIME (msec)
   n       V      Minimum      Maximum

1024     128       14.685       22.374
1024     256       29.367       44.745
1024     512       58.730       89.486
1024    1024      117.457      178.968
2048     128       16.191       23.880
2048     256       32.378       47.755
2048     512       64.751       95.507
2048    1024      129.499      191.010
the associative stack module in order to be pushed into the selected
stack. This assumption simplifies the discussion, since it ignores
any discussion about the number of cells and their ownership on the
data bus. v) There is a flow of data to the cells; in other words,
there exists a stream of data from the secondary storage to the cells
in order to satisfy the above assumption. vi) The execution time is
calculated for relations of $10^{5}$, $10^{4}$, and $10^{3}$ tuples.

Example: Get a list of names and the amount of hour for all the
employees in "New York" which are not working in the department "D1".

In ASL the above query is translated as:

E;
ED;
X = LOCATION EQ "NEW YORK" [E] ENO, ENAME;
Y = ONO NEQ "D1" [ED] ENO, HOUR;
W = (X) ENO = ENO (Y);
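To make the intended result of the three ASL statements concrete, they can be mimicked in plain Python over a few tuples. This is only a sketch of the relational semantics, not of the associative hardware: the sample tuples are invented, the helper name is hypothetical, and the comparison constant is assumed to be the department code written D1 later in this section.

```python
# Hypothetical sample tuples; attribute names follow the text.
E = [
    {"ENAME": "SMITH", "ENO": "E1", "LOCATION": "NEW YORK"},
    {"ENAME": "JONES", "ENO": "E2", "LOCATION": "ORLANDO"},
    {"ENAME": "BROWN", "ENO": "E3", "LOCATION": "NEW YORK"},
]
ED = [
    {"ENO": "E1", "ONO": "D1", "HOUR": 10},
    {"ENO": "E1", "ONO": "D2", "HOUR": 30},
    {"ENO": "E3", "ONO": "D1", "HOUR": 40},
    {"ENO": "E2", "ONO": "D3", "HOUR": 20},
]


def select_project(relation, predicate, domains):
    """Keep the tuples satisfying predicate, projected on domains, without duplicates."""
    result = []
    for row in relation:
        if predicate(row):
            sub = tuple(row[d] for d in domains)
            if sub not in result:             # a relation holds no duplicate tuples
                result.append(sub)
    return result


# X = LOCATION EQ "NEW YORK" [E] ENO, ENAME;
X = select_project(E, lambda r: r["LOCATION"] == "NEW YORK", ("ENO", "ENAME"))
# Y = ONO NEQ "D1" [ED] ENO, HOUR;
Y = select_project(ED, lambda r: r["ONO"] != "D1", ("ENO", "HOUR"))
# W = (X) ENO = ENO (Y);   equi-join of X and Y on ENO
W = [(eno, ename, hour) for (eno, ename) in X for (eno_y, hour) in Y if eno == eno_y]

print(W)
```

In ASLM the first two statements are carried out by the cells while the data streams from secondary storage, and only the final join works on the subrelations held in the associative stacks.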
As mentioned in Chapter III, no code is generated for the first
two lines of the above program. The third and fourth lines are
translated into a sequence of setting the registers of the cells
followed by initiation of a read from the secondary storage, and the
fifth line is a join operation, which is translated to a sequence of
the micro instructions as defined in the Appendix III. Since the
maximum lengths of ENO and ENAME are respectively 8 and 30 characters,
an associative stack of 512 bits width is sufficient for handling
the selected subtuple in the third line of the above program. By the
same discussion, associative stacks of 128 bits and 512 bits width
would be suitable for the fourth and fifth lines. The semi ASL
program is as follows:
1)  SCTR 1
    Sets the content of a 1-bit register in the cells to one; this
    bit specifies the type of the operation which is going to be
    performed by the cells. In other words, it specifies whether the
    cells are going to act as selectors or not (Chapter IV).

2)  SCTR 1
    Initializes the contents of the counters in each domain
    recognizer to one.

3)  SCTR 1 "value"
    Sets the contents of the automatic rotate register of the domain
    recognizer which is associated to the test logic circuit by the
    content of the following word "value." In our example the
    following word holds "4", which is equivalent to the "LOCATION"
    field in the E relation.

4)  SCTR "value"
    Sets the content of the register in the automatic rotate register
    associated to the test logic circuit. This specifies how many
    positions the automatic rotate register should be circulated at
    the end of each tuple, before a cell goes to the ready cycle.

5)  SCTR 1 "value"
    Sets the contents of the automatic rotate register of the domain
    recognizer associated to the input register by the content of the
    following word "value." In our example the value holds "2" and
    "1", which correspond with the ENAME and ENO fields in the E
    relation.

6)  SCTR "value"
    Sets the content of the register in the automatic rotate register
    assigned to the input register. This register specifies how many
    positions the automatic rotate register should be circulated at
    the end of each tuple, before a cell goes to the ready cycle.

7)  SCTR 1 "value"
    Sets the contents of the automatic rotate register. "Value"
    specifies the maximum length of the domains specified in the
    output set. In our example the content of "value" would be 240
    and 64.

8)  SCTR "value"
    Sets the content of the register in the automatic rotate register
    of the input register. This register has the same concept as the
    register discussed in the 6th instruction of this program.

9)  SCTR 1 "value"
    Sets the content of the comparison circuit of the units in the
    test logic circuit by the content of "value." In our example
    "value" shows the equality.

10) SCTR 4 "value1" "value2" "value3" "value4"
    Sets the content of the R1 register of the units in the test
    logic circuit by the content of value1, value2, value3, and
    value4.

11) READ
    Read macro; initiates reads from the E relation.

12) GSTP
    Transfers relevant data from the cells to the associative stack.

13) PUSH'
    Transfers the content of the comparand register to the
    associative stack module. This macro will be repeated until all
    the tuples in the E relation are validated by the cells.

14) SCTR 1
    Same as instruction 1).

15) SCTR 1
    Same as instruction 2).

16) SCTR 1 "value"
    Same as instruction 3). The content of "value" is equal to 2,
    which represents the ONO field in the ED relation.

17) SCTR "value"
    Same as instruction 4).

18) SCTR 1 "value"
    Same as instruction 5). The contents of "value" are equal to 1
    and 3, which specify the ENO and HOUR in the ED relation.

19) SCTR "value"
    Same as instruction 6).

20) SCTR 1 "value"
    Same as instruction 7). The contents of "value" are equal to 64
    and 24.

21) SCTR "value"
    Same as instruction 8).

22) SCTR 1 "value"
    Same as instruction 9). The content of "value" reveals the
    non-equality.

23) SCTR 1 "value"
    Same as instruction 10). The content of "value" is equal to "D1".

24) READ
    Initiates read from the ED relation.

25) GSTP
    Transfers relevant data from the cells to the associative stack.

26) PUSH'
    Transfers selected tuples from the cells into the associative
    memory. This macro is repeated for all the valid tuples of the
    ED relation.

27) JOIN
    Join macro.

Since duplicate records do not exist in our system, in the above
program we use PUSH' macros. PUSH' has been defined in Appendix II.

$T_{PROGRAM} = T_{E} + T_{ED} + T_{JOIN}$

where $T_{E}$ and $T_{ED}$ are the execution times of the selection of the
relevant data from the E and ED relations respectively. $T_{JOIN}$ has
been defined before; recall that $T_{JOIN}$ is a function of three
parameters. In our example V is equal to 1000, 100, and 10, n is
equal to 64 and V' is equal to 1. Table 6.3 specifies the execution
time of the program for relations of different sizes.

The entries in the Table show that for the above ASL program
the execution time depends on the size of the data file. Moreover,
for a join program the execution time for a relation of size $10^{5}$
tuples is less than .5 sec.
Discussion
The chapter addressed the execution time of the ASLH micro operations
based on the current technology; Appendix II also shows the
micro programs for each of the ASL macros. Table 6.1 (a, b and c) and a
simple calculation enabled us to estimate the execution time of the
ASL macros. This was the subject of the second section of this

TABLE 6.3
THE EXECUTION TIME OF AN ASL EXAMPLE

SIZE OF RELATIONS      EXECUTION TIME (msec)
(tuples)               Minimum      Maximum

$10^{5}$                  405.725      409.265
$10^{4}$                   40.604       40.958
$10^{3}$                    4.091        4.127
chapter. As can be seen, these functions are generally dependent on
some parameters, where each parameter is hardware dependent and is a
function of the implementation. Based on some assumptions about
these parameters, the execution times of different macros have been
calculated in the third section, for different values of the
parameters. These assumptions are based on our discussion in Chapter I.

Tables 6.2(c, d, e and f) reveal the execution times of four ASL
macros using associative memories of different depth and width. The
execution times of union, inclusion and projection are functions of
two variables, and can be calculated in a straightforward manner.
For these macros the execution time depends directly on the number
of bits in the associative memory. For example, if T is the
execution time of the operation on an associative memory of depth V
and width n, then approximately 2T would be the execution time on an
associative memory of depth 2V and width n, or on an associative
memory of depth V and width 2n. For join the calculated function
depends on four parameters, where one of these parameters is not fixed
and varies during the operation; for our calculation an average value
equal to 20 has been considered. Table 6.2(f) gives the execution
time of the join operation for associative memories of different depth
and width and a fixed number of 20 for $V'_{a}$. Based on the entries of
Tables 6.2(c, d, and e), Figures 6.1, 6.2, and 6.3 have been developed.

The relationship between execution time and size of the associative
memory can be used to find out the execution time of the operation for
associative memories of different sizes; the dashed lines in
Figures 6.1, 6.2, and 6.3 show this fact. Since the execution time
is a function of the size of the associative memory, we can find out
the execution time of the operation based on the size of the
associative memory regardless of its depth or width. The horizontal
lines in each Figure show this idea. For example, the execution time
of the union operation for an associative memory of size $2^{19}$ bits is
about 106 msec.
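The proportionality argument can be checked numerically with the union equation of this chapter. The sketch below is an illustration only: the function name is an assumption, n is taken in bits, and the maximum-time constants are used.

```python
def union_max_msec(n, v):
    """Maximum union time in milliseconds for width n (bits) and depth v (words)."""
    return (2.68 + v * (7.55 + n * (0.28 / 1024 + 0.2))) / 1000.0


base = union_max_msec(1024, 128)
ratio_depth = union_max_msec(1024, 256) / base    # depth doubled, width fixed
ratio_width = union_max_msec(2048, 128) / base    # width doubled, depth fixed
print(f"2V: {ratio_depth:.3f} x T,  2n: {ratio_width:.3f} x T")

# Two memories of 2**19 bits with different shapes.
for n, v in ((1024, 512), (2048, 256)):
    assert n * v == 2 ** 19
    print(f"n = {n}, V = {v}: {union_max_msec(n, v):.1f} msec")
```

Doubling the depth gives almost exactly 2T, while doubling the width gives slightly less than 2T because the constant per-row overhead does not grow with n; both shapes of a $2^{19}$ bit memory land within a few msec of the quoted 106 msec.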


CHAPTER VII
CONCLUSION
Summary
In the previous chapters a top down approach to the design of a
backend data base machine based on the relational data model has been
discussed. The design is an attempt at handling the inefficiencies of
the conventional von Neumann machines and the data base machines which
were subjects of the first and second chapters.

The inefficiencies of the conventional systems have resulted in a
large gap between high level languages and the hardware design of the
computer systems. This gap is the consequence of i) name mapping
resolution, which is directly related to accessing data by address,
ii) the large amount of data which is transferred between secondary
storage and main memory on one side and between main memory and
processor on the other side, and iii) the lack of built-in hardware
for handling non-numeric operations as efficiently as numeric
operations. The high cost of the hardware in the past was more or
less responsible for the above inefficiency, since this high cost did
not allow us to incorporate more hardware in the design of the
machines. On the other hand, some of the systems which have been
implemented based on Slotnick's idea (Slotnick 1970) i) are not cost
effective, ii) are not efficient for handling interrelational
operations due to the implicit subrelations which are generated on
the secondary device, and iii) are restricted to a specific secondary
device.

In order to overcome the above inefficiencies, ASLM is designed
i) as a backend data base machine implemented by associative memory,
capable of handling non-numeric operations, ii) so that it is not
restricted to any specific secondary storage, and iii) so that, since
subrelations are generated explicitly, the interrelational operations
can be executed faster than on previously designed data base machines.

The main idea in the design of our machine is the closing of the
semantic gap; therefore we took a top down approach in the design of
ASLM. Because of this approach we also defined a high level
programming language, ASL. The query language ASL is a non-procedural
data language which provides data manipulation and user's independence
of data structure. Some of the important features of ASL are as
follows: first, it is based on the relational data model. Because of
the similarity between the relational representation of the data and
the user's view of the data file, the name mapping resolution would be
minimized, especially because of the one to one mapping between the
tabular representation of data and associative memory; second,
because of the resemblance between ASL statements and arithmetic
expressions, the ASL statements can be implemented in the same way as
we implement arithmetic expressions; third, the set of the ASL
statements is relationally complete; and fourth, ASL is applicable to
variable length records.

The formal definition of the ASL language is the topic of Chapter
III. ASL has been implemented by a one pass compiler which uses an LR
parsing technique. Chapter III shows the effectiveness of ASL and its
ability to handle different queries.

The description of the backend machine ASLM and the sequences of
the operations in ASLM are given in Chapter IV. The major difference
between ASLM and other data base machines is based on the
removal of the processing capability of each read/write head from the
secondary storage and attachment of this capability to the backend
machine. This creates a cost effective system. It should be
mentioned that the concept of backend machine minimizes the data rate
flow between the secondary storage and the host machine.

The design of the ASLM is based on pre-search through the data file.
This pre-search enables us to pick up, select and store that part of
the data file which satisfies the search criteria. The selected data
is stored in the associative memory for further operations. The
ability to generate the subrelations explicitly increases the
speed of operations, especially the interrelational operations such as
join. Moreover, it eliminates the existence of control bits which
should be attached to each record. Since ASLH is implemented by
associative memory the concept of sorting is naturally embedded in the
operation. It is worthwhile to mention that ASLH is
microprogrammable; this means the set of the ASL functions is not
restricted to the set of the functions discussed in this dissertation,
and this set can be expanded easily. The sequence of the operations
for the ASL functions has also been discussed in Chapter IV.

Another feature of ASLM is its modularity and its toleration of
faults to some degree. Because of the modularity of the system and
the independence of modules from each other, the level of parallelism
among different modules is quite high.
Chapter V addresses the micro operations of the ASLH. The micro
operations are of fixed length words and are able to cover all the ASL
macros and functions. Dealing with associative memory eliminates the
accessing of data by address; hence there are no addressing schemes
such as direct, indirect or indexed in the micro operations. This
reduces the execution time and simplifies the design of the controller
of the ASLH.

The topic of Chapter VI is the evaluation of the ASLM. As mentioned
before, a lot of parameters should be considered in the evaluation of
a system. For a query system it seems that the evaluation of the
response time for a paper design machine gives a reasonable value
about the performance of the system. This means the execution time
of the ASL functions. In our discussion we have simplified this
execution time to the execution time of the ASL macros. Therefore,
based on three different systems the execution time of each ASL macro
has been calculated.
Further Development
At this stage ASLM is a paper machine. Naturally, for its
implementation some of its features should be investigated more
precisely and its performance should be evaluated deeply. This system
architecture is novel and is capable of expansion in different aspects.
Consider the ASL language: although ASL is relationally complete, it
could be expanded in order to satisfy user's requirements more
precisely. For example, as can be seen from Table 3.5, in ASL numbers
are assumed to be integers, whereas in daily operations, at least, the
language should be extended to cover real numbers; likewise, ASL does
not provide any functions such as Max, Min, Count, etc., which should
be added to the ASL productions (the hardware capable of performing
these functions is already in ASLH). As mentioned before (Chapter
III), an interpreter for the ASL language has been implemented; this
program is capable of checking an ASL program from the syntax and
semantics points of view and translates a correct ASL program to an
intermediate form. The ASL interpreter should be implemented in order
to be practical on the conventional systems and on the ASLM. This
expansion generates a query language applicable on conventional
systems. Moreover, it would be a good tool to evaluate the
performance of the ASLM. The hardware of the ASLM should also be
investigated deeply. The ASL functions should be calculated in the
same way as the ASL macros have been calculated. These analytical
values should be compared with the execution time of the same
functions on the previously designed data base systems. A fair
comparison would be obtained if the execution times of the functions
on different machines were calculated by the same methods as seen in
Chapter VI. The last step before the hardware implementation of ASLM
would be its simulation and evaluation of the ASL functions on this
simulated machine.
At its current design ASLM is classified among the SIMD machines.
Independence of the ASLH modules from each other, independence of the
cells from each other and independence of the associative stacks from
each other enable us to improve the class of ASLM to MIMD at the
expense

of

slower

operations

and more

complicated controller,

by

distributing variable number of cells and associative stacks to different users in order to response more than one user at a time.

As a

333

result the parallelism will be achieved among different queries and in
other words controller controls and responds different queries (users)
at a time.
Recall from Chapter III and Appendix II that the ASL programs are
mostly sequential; therefore, the concept of instruction pre-fetch
or instruction look-ahead, which has not been addressed in this
dissertation, can be incorporated into this system.
Based on our discussion in Chapter VI and current technology,
ASLM is capable of handling data bases of 10^10 through 10^11 bits.
By a concept similar to memory allocation in conventional systems,
we can expand the capacity of ASLM. In the future, because of the
cost and capacity trade-off of hardware (Chapter I), ASLM will still
be capable of handling future data base systems.
As a result of our discussion, we can say that ASL and ASLM should be
investigated in depth, with a view to their hardware implementation, in
order to be practical.

APPENDICES
APPENDIX I
SEMANTIC ROUTINES OF ASL
Recall that the semantic analysis will be accomplished by defining a
set of attributes for the non-terminal symbols and a set of evaluation
rules for each production. In the following table the set of evaluation
rules of Table 3.5 will be defined. "type" and "value" are the set of
attributes that we have assigned to the non-terminal symbol (Table 3.8).
In our discussion "v" stands for "value" and "T" stands for the "type."
(v) means all the information associated with v; for example, if v is a
domain name, (v) means its type, name and maximum length. Remember that
the T-Table is as defined in Chapter III and "env" is an area which
holds appropriate information related to <OUTPUT SET> or
<DOMAIN-TYPE LIST>. This information will be used to update the symbol
table. The indices of the attributes are used to distinguish the
attributes of the non-terminal symbols in a production. For some of the
productions no evaluation rules have been defined; this fact has been
shown by "nothing." In this appendix the symbol table is treated as a
set.

TABLE A.1.
THE SET OF THE SEMANTIC ROUTINES OF ASL

PRODUCTION
    EVALUATION RULE

<PROGRAM> ::= <DEC LIST> <RE LIST>
    nothing

<DEC LIST> ::= empty
    Symbol table is set to be empty

<DEC LIST> ::= <DEC LIST> <RELATION ID>v2
    Check v2 against the contents of the index processor
    {symbol table} ← {symbol table} U (v2)

<RE LIST> ::= <RELATION>v2 ;
    Check v2 against the contents of symbol table

<RE LIST> ::= <RE LIST> <RELATION>v2 ;
    Check v2 against the contents of symbol table

<RELATION>v1 ::= <T RELATION>v2
    v1 ← v2

<T RELATION>v1 ::= <RELATION>v2 = <Q RELATION>v3
    v1 ← v2
    Symbol table will be updated by v2

<T RELATION>v1 ::= <RELATION ID>v2 = <UP RELATION>v3
    v1 ← v2
    Symbol table will be updated by v2

<UP RELATION>v1 ::= <CH RELATION>v2
    v1 ← v2

<UP RELATION>v1 ::= <ADD RELATION>v2
    v1 ← v2

<UP RELATION>v1 ::= <D RELATION>v2
    v1 ← v2

<Q RELATION>v1 ::= <SET EXPRESSION>v2
    v1 ← v2

<Q RELATION>v1 ::= <CARTESIAN>v2
    v1 ← v2

<Q RELATION>v1 ::= <JOIN>v2
    v1 ← v2

<Q RELATION>v1 ::= <RESTRICTION>v2
    v1 ← v2

<Q RELATION>v1 ::= <RELATION ID>v2 <LOG-SET OP>v3 <RELATION ID>v4
    v1 is set by the compiler

<CH RELATION>v1 ::= <SEARCH SET>v2 <RELATION OP>v3 <DOMAIN-VALUE LIST>v4
    Check v2 and v4 against v3
    v1 ← v3

<CH RELATION>v1 ::= <<SET EXPRESSION>>v2 <RELATION OP>v3 <DOMAIN-VALUE LIST>v4
    Check v2 and v4 against v3
    v1 ← v3

<ADD RELATION>v1 ::= <RELATION ID>v2 U <DOMAIN-VALUE LIST>v3
    Check v3 against v2
    v1 ← v2

<ADD RELATION>v1 ::= <RELATION ID>v2 U <DOMAIN-VALUE LIST>v3
    Check v3 against v2
    v1 ← v2
    {Symbol table} ← {Symbol table} U env

<D RELATION>v1 ::= <RELATION ID>v2 - <SET EXPRESSION>v3
    Check v3 against v2
    v1 ← v2

<D RELATION>v1 ::= <RELATION ID>v2
    v1 ← v2
    {Symbol table} ← {Symbol table} - (v2)

<SET EXPRESSION>v1 ::= <SEARCH SET>v2 <RELATION OP>v3 <OUTPUT SET>v4
    Check v2 and v4 against v3
    v1 ← v3
    {Symbol table} ← {Symbol table} U env

<SET EXPRESSION>v1 ::= <<SET EXPRESSION>>v2 <RELATION OP>v3 <OUTPUT SET>v4
    Check v2 and v4 against v3
    v1 ← v3
    {Symbol table} ← {Symbol table} U env

<CARTESIAN> ::= (<RELATION ID>v2)(<RELATION ID>v3)
    {Symbol table} ← {Symbol table} U (v2) A (v3)

<JOIN>v1 ::= (<RELATION ID>v2) <DOMAIN>v3 <RELATIONAL OP>v4 <DOMAIN>v5 (<RELATION ID>)v6
    Check v3 against v2
    Check v5 against v6
    {Symbol table} ← {Symbol table} U (v2) (v6)

<RESTRICTION>v1 ::= <RELATION ID>v2 <OUTPUT SET>v3
    {Symbol table} ← {Symbol table} U env
    Check v3 against v2
    v1 ← v2

<SEARCH SET>v1 ::= empty
    Update T-table

<SEARCH SET>v1 ::= <EXPRESSION>v2
    v1 ← v2
    Update T-table by (v2)

<SEARCH SET>v1 ::= <<SEARCH SET>>v2 <LOG SET OP>v3 <EXPRESSION>v4
    Check v4 against v2
    v1 ← v2
    Update T-table by v3 and (v4)

<OUTPUT SET>v1 ::= <DOMAIN>v2
    v1 ← v2
    Update T-table for permanent relations by (v2)

<OUTPUT SET>v1 ::= <OUTPUT SET>v2 , <DOMAIN>v3
    Check v3 against v2
    v1 ← v2
    Update T-table for permanent relation by (v3)

<DOMAIN-VALUE LIST>v1 ::= <DOMAIN VALUE>v2
    v1 ← v2
    Update T-table by (v2)

<DOMAIN-VALUE LIST>v1 ::= <DOMAIN-VALUE LIST>v2 <DOMAIN-VALUE>v3
    Check v1 against v2
    v1 ← v2
    Update T-table by (v2)

<DOMAIN-TYPE LIST>v1 ::= <DOMAIN TYPE>v2
    v1 ← v2
    env ← (v2)

<DOMAIN-TYPE LIST>v1 ::= <DOMAIN-TYPE LIST>v2 , <DOMAIN TYPE>v3
    Check v3 against v2
    v1 ← v2
    env ← env U (v3)

<DOMAIN VALUE>v1 ::= <DOMAIN>v2 = <EXPRESSION>v3
    v1 ← v2

<DOMAIN TYPE>v1 ::= <DOMAIN>v2 , <LENGTH>v3 , <TYPE>T
    v1 ← (v2, v3, T)

<SIMPLE EXPRESSION> t1,v1 ::= <TERM> t2,v2
    t1 ← t2
    v1 ← v2

<TERM> v1,t1 ::= <TERM> v2,t2 <MULTIPLICATIVE OP> v4 <FACTOR> v3,t3
    Check v3 against v2
    v1 ← v2
    Check t3 against t2
    t1 ← t2

<TERM> v1,t1 ::= <FACTOR> v2,t2
    v1 ← v2
    t1 ← t2

<FACTOR> v1,t1 ::= (<EXPRESSION> v2,t2)
    v1 ← v2
    t1 ← t2

<FACTOR> v1,t1 ::= <FACTOR> v2,t2
    v1 ← v2
    t1 ← t2

<DOMAIN> v1 ::= <CH LIST>v2 . <CH LIST>v3
    v1 ← v2 v3
    Check v2 and v3 against the symbol table
    T-table is updated by (v3)

<EXPRESSION> v1,t1 ::= <SIMPLE EXPRESSION>v2,t2 <RELATIONAL OP>v4 <SIMPLE EXPRESSION>v3,t3
    Check v2 against v3
    v1 ← v2
    Check t2 against t3
    t1 ← t3

<EXPRESSION> v1,t1 ::= <SIMPLE EXPRESSION>v2,t2
    v1 ← v2
    t1 ← t2

<SIMPLE EXPRESSION>v1,t1 ::= <SIMPLE EXPRESSION> v2,t2 <ADDITIVE OP>v4 <TERM> v3,t3
    Check v2 against v4
    v1 ← v2
    Check t2 against t3
    t1 ← t3

<FACTOR> v1,t1 ::= <INTEGER> v2,t2
    v1 ← v2
    t1 ← t2

<FACTOR> v1,t1 ::= <BOOLEAN> v2,t2
    v1 ← v2
    t1 ← t2
    T-table is updated by (v2)

<FACTOR> v1,t1 ::= <STRING> v2,t2
    v1 ← v2
    t1 ← t2
    T-table is updated by (v2)

<FACTOR>t1 ::= <DOMAIN>v2,t2
    t1 ← t2
    T-table is updated by (v2)

<LENGTH>v1 ::= integer
    v1 ← integer

<RELATION ID>v1 ::= <CH LIST>v2
    v1 ← v2

<RELATION OP>v1 ::= [<RELATION ID>v2]
    v1 ← v2
    Update (T-table)

<TYPE>t1 ::= BOOLEAN
    t1 ← 0

<TYPE>t1 ::= INTEGER
    t1 ← 1

<TYPE>t1 ::= STRING
    t1 ← 2

<ADDITIVE OP>v1 ::= +
    v1 ← +

<ADDITIVE OP>v1 ::= -
    v1 ← -

<MULTIPLICATIVE OP>v1 ::= *
    v1 ← *

<MULTIPLICATIVE OP>v1 ::= /
    v1 ← /

<LOG-SET OP>v1 ::= &
    v1 ← &

<LOG-SET OP>v1 ::=
    v1 ←

<LOG-SET OP>v1 ::= -
    v1 ← -

<RELATIONAL OP>v1 ::= LT
    v1 ← LT

<RELATIONAL OP>v1 ::= GT
    v1 ← GT

<RELATIONAL OP>v1 ::= LE
    v1 ← LE

<RELATIONAL OP>v1 ::= GE
    v1 ← GE

<RELATIONAL OP>v1 ::= NE
    v1 ← NE

<RELATIONAL OP>v1 ::= EQ
    v1 ← EQ
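
As a rough illustration of the attribute scheme used in this table, the following minimal Python sketch mimics the rules for <FACTOR> and <TERM>. It is not part of the ASL interpreter: the names Attr, factor_from_literal, term_from_factor and term_times_factor are hypothetical, and "Check t3 against t2" is assumed here to be a simple type-equality test. The type codes 0, 1 and 2 follow the <TYPE> rules of the table.

```python
from dataclasses import dataclass

# Type codes as assigned by the <TYPE> rules of Table A.1.
BOOLEAN, INTEGER, STRING = 0, 1, 2


@dataclass
class Attr:
    """Synthesized attributes of a non-terminal: value v and type t."""
    v: object
    t: int


def factor_from_literal(value):
    """<FACTOR> ::= <INTEGER> | <BOOLEAN> | <STRING>: v1 <- v2, t1 <- t2."""
    if isinstance(value, bool):          # bool must be tested before int
        return Attr(value, BOOLEAN)
    if isinstance(value, int):
        return Attr(value, INTEGER)
    if isinstance(value, str):
        return Attr(value, STRING)
    raise TypeError(f"unsupported literal: {value!r}")


def term_from_factor(factor):
    """<TERM> ::= <FACTOR>: v1 <- v2, t1 <- t2."""
    return Attr(factor.v, factor.t)


def term_times_factor(term, op, factor):
    """<TERM> ::= <TERM> <MULTIPLICATIVE OP> <FACTOR>.

    The type check is modeled as an equality test (an assumption); on
    success the left operand's attributes are copied: v1 <- v2, t1 <- t2.
    """
    if op not in ("*", "/"):
        raise ValueError(f"not a multiplicative operator: {op!r}")
    if term.t != factor.t:
        raise TypeError("semantic error: operand types differ")
    return Attr(term.v, term.t)
```

Note that, as in the table, the rule only propagates attributes; it does not compute the product or quotient.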


APPENDIX II
MICRO PROGRAMS OF THE ASL MACROS
In this part the micro code of the operations discussed in
Chapter IV will be introduced. This appendix is divided into three
subsections, where the operations of a specific module will be
presented based on the ASL micro operations.
Index Processor
Delete. Erases an entry from the index processor:

     1)  SMKR               to all one
     2)  SCPR 3
     3)  "relation name"
     4)  "relation name"
     5)  "relation name"
     6)  MTCH               on equality
     7)  TSBT 2
     8)  SETR               error bit to 1
     9)  BRRF 6             transfer to 12
    10)  DELT               (this macro has been defined in Chapter VI)
    11)  DELT               beginning address of the descriptor of the
                            secondary storage interface
    12)

Two points should be mentioned in the above program: first, the
relation name is assumed to be of length six characters; and, second,
the second DELT is used to delete an entry from the secondary storage
interface.
Insert. Creates an entry in the index processor:

     1)  SMKR               to all one
     2)  SCPR 3
     3)  "relation name"
     4)  "relation name"
     5)  "relation name"
     6)  MTCH               on not equality
     7)  TSBT 2
     8)  SETR               error bit to 1
     9)  BRRF (n + 1)
    10)  INST               (this macro has been defined in Chapter VI)
         n words
  10 + n)

n is equal to size1 + size2. In fact, the n words which follow the INST
carry the contents of the descriptor in the index processor and
secondary storage interface.
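
The Delete and Insert programs above can be paraphrased in software as follows. This is only a rough model under stated assumptions: the index processor is represented by a Python dictionary keyed by relation name, the associative match (MTCH) by a membership test, and the error bit by a boolean attribute. It is also assumed that Insert raises the error bit when the name is already present and Delete raises it when the name is absent. The class and method names are hypothetical.

```python
class IndexProcessor:
    """Toy model of the index processor: relation name -> descriptor."""

    def __init__(self):
        self.entries = {}
        self.error = False          # models the error bit set by SETR

    def insert(self, name, descriptor):
        """Create an entry; assumed to fail if the name already exists."""
        self.error = name in self.entries       # MTCH on the relation name
        if not self.error:
            self.entries[name] = descriptor     # INST plus the descriptor words
        return not self.error

    def delete(self, name):
        """Erase an entry; assumed to fail if the name is not present."""
        self.error = name not in self.entries   # MTCH on the relation name
        if not self.error:
            del self.entries[name]              # DELT
        return not self.error
```

The second DELT of the Delete program, which removes the descriptor from the secondary storage interface, is not modeled here.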


Secondary Storage Interface
In this module of the backend processor, there is no micro
program involved, since the operations are just "delete" and "read,"
which have been discussed in Chapter V.
Non Numeric Processor
Recall from Chapter IV that cells can be used as a selector or
as a buffer. In the first case, the contents of the registers in the
cells should be set according to the context of the ASL program, while
in the second case the contents of the registers in the cells should
be set by the values which reveal the nature of the cells. In both
cases the operations would be a sequence of "moves." In the following
discussion the micro programs corresponding to the set of operations
which are executed on data in the associative processor will be
discussed.
Push. Inserts the content of the comparand register of a specific
stack into one location above the location pointed to by the top
of stack pointer. In the program the specified stack is designated
by "A."

     1)  TBTR (A)           Is stack full
     2)  6                  Number of words which control should transfer
                            if the above test is successful (transfer to 8)
     3)  MTCH (A)           On equality
     4)  TBTR (A)           Match bit
     5)  2                  Number of words which control should transfer
                            if the above test is successful (transfer to 8)
     6)  INDC (A)           Increment the content of the top register
     7)  COPY (A)           Push the content of the comparand register
                            into the associative stack
     8)

In the above program the "6" and "2" will transfer the control
to the end of the operation; these values are not fixed, and as will
be seen later they can be varied.
Pop. The content of the location pointed to by the top of stack
pointer will be popped into the output register, and then the content
of the top of stack pointer is decremented by 1:

     1)  TBTR (A)           Is stack empty
     2)  3                  Number of words which control is transferred
                            if the above test is successful (transfer to 5)
     3)  COPY (A)           Pop the stack into the output register
     4)  DECR (A)           Decrement the content of the top register
     5)
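
The Push and Pop programs can be read as the following minimal Python sketch. It is an interpretation under stated assumptions, not the hardware behavior itself: the class name AssociativeStack and its attributes are hypothetical, the top register is modeled by the length of a list, and the match test in Push is taken to mean that a value already present in the stack is not pushed again.

```python
class AssociativeStack:
    """Toy model of one associative stack with a fixed capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.rows = []            # len(rows) plays the role of the top register
        self.comparand = None     # comparand register
        self.output = None        # output register

    def push(self):
        """Push the comparand unless the stack is full or the value is present."""
        if len(self.rows) >= self.capacity:    # step 1: is stack full
            return False
        if self.comparand in self.rows:        # steps 3-5: match on equality
            return False
        self.rows.append(self.comparand)       # steps 6-7: increment top, copy
        return True

    def pop(self):
        """Pop the top row into the output register; False if the stack is empty."""
        if not self.rows:                      # step 1: is stack empty
            return False
        self.output = self.rows.pop()          # steps 3-4: copy, decrement top
        return True
```

A short usage example: with capacity 2, pushing "r1" twice stores it once, and a third distinct value is rejected because the stack is full.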

Union. Merges the contents of two stacks into one of the stacks.
In the program the contents of stack B are merged with the contents
of stack A, and the result remains in A.

     1)  SMKR (A)           To all one
     2)  SMKR (B)           To all one
     3)  MVRR (B)           Save content of the top register
     4)  POP (B)            In case stack is empty control will transfer
                            to the "8"
     5)  MVOC (A,B)         Move the content of the output register of B
                            to the comparand register of A
     6)  PUSH (A)           If stack is full control transfers to the "8"
     7)  BRRB 4             Transfer to "4"
     8)  MVRR (B)           Restore the content of the top register
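
The Union loop can be sketched in Python as below. The stacks are modeled as plain lists whose last element is the top; the function name union and the optional capacity parameter are hypothetical, and PUSH is assumed to test for a full stack first and then to skip values already present, as in the Push program.

```python
def union(a, b, capacity=None):
    """Merge the rows of stack b into stack a; b itself is left unchanged."""
    top = len(b)                    # step 3: B's top register is saved, so b
                                    # is only read through a local copy of it
    while top > 0:                  # step 4: POP (B); leave the loop when empty
        top -= 1
        comparand = b[top]          # step 5: output of B -> comparand of A
        if capacity is not None and len(a) >= capacity:
            break                   # step 6: stack A is full, go to step 8
        if comparand in a:          # PUSH skips a value that already matches
            continue
        a.append(comparand)
    return a                        # step 8: B's top register is restored
```

For example, union([1, 2], [2, 3]) leaves the first list as [1, 2, 3] and the second list untouched.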

Intersection. The common elements of the two stacks "A" and "B"
will be pushed into a third stack.

     1)  SMKR (A)           To all one
     2)  SMKR (B)           To all one
     3)  SMKR (C)           To all one
     4)  STBT (C)           Sets the content of the top register to zero
     5)  MVRR (B)           Save content of the top register
     6)  POP (B)            In case stack is empty control transfers
                            to the "15"
     7)  MVOC (A,B)         Move the content of the output register of B
                            to the comparand register of A
     8)  MVOC (C,B)         Move the content of the output register of B
                            to the comparand register of C
     9)  MTCH (A)           On equality
    10)  TBTR (A)           Match bit
    11)  1                  Number of words which control transfers to
                            (transfers to 13)
    12)  BRRB 10            Control transfers to "6"
    13)  PUSH (C)
    14)  BRRB 18            Control transfers to "6"
    15)  MVRR (B)           Restore the content of the top register

Difference. Those elements in the second stack "B" which are
not members of the first stack "A" are inserted to "C."

     1)  SMKR (A)           To all one
     2)  SMKR (B)           To all one
     3)  SMKR (C)           To all one
     4)  STBT (C)           Sets the content of the top register to zero
     5)  MVRR (B)           Save content of the top register
     6)  POP (B)            In case stack is empty control transfers
                            to "15"
     7)  MVOC (A,B)         Move the content of the output register of B
                            to the comparand register of A
     8)  MVOC (C,B)         Move the content of the output register of B
                            to the comparand register of C
     9)  MTCH (A)           On not equality
    10)  TBTR               Match bit
    11)  1                  Number of words which control transfers to
                            (transfers to 13)
    12)  BRRB 10            Control transfers to "6"
    13)  PUSH (C)
    14)  BRRB 18            Control transfers to "6"
    15)  MVRR               Restore the content of the top register

Inclusion. Checks whether or not a set (B) is a subset of
another set (A).
     1)  SMKR (A)           To all one
     2)  SMKR (B)           To all one
     3)  MVRR (B)           Save content of the top register
     4)  STBT               Sets the inclusion bit to zero
     5)  POP (B)            If stack is empty control transfers to "11"
     6)  MVOC (A,B)         Transfers the content of the output register
                            of B to the comparand register of A
     7)  MTCH (A)           On not equality
     8)  TBTR               Match bit
     9)  2
    10)  BRRB 9             Control transfers to "5"
    11)  STBT               Sets the inclusion bit to one
    12)  MVRR (B)           Restore the content of the top register

Set equality. Checks whether or not the contents of two sets
(associative stacks) are equal, regardless of the order of the
elements.

     1)  SMKR (A)           To all one
     2)  SMKR (B)           To all one
     3)  STBT               Sets the equality bit to zero
     4)  MVRR (A)           Saves the content of the top register
     5)  MVRR (B)           Saves the content of the top register
     6)  POP (A)            If stack is empty control transfers to "18"
     7)  POP (B)            If stack is empty control transfers to "21"
     8)  MVOC (A,B)         Transfers content of the output register of
                            B to the comparand register of A
     9)  MVOC (B,A)         Transfers content of the output register of
                            A to the comparand register of B
    10)  MTCH (A)           On equality
    11)  TBTR (A)           Tests the match bit
    12)  1                  Transfers control to the "14"
    13)  BRRF 7             Transfers control to the "21"
    14)  MTCH (B)           On not equality
    15)  TBTR (B)           Tests the match bit
    16)  5                  Transfers control to the "21"
    17)  BRRB 18            Transfers control to the "6"
    18)  TBTR (B)           Is stack empty
    19)  1                  Transfer control to "21"
    20)  STBT               Sets the equality bit to one
    21)  MVRR (A)           Restores the content of the top register
    22)  MVRR (B)           Restores the content of the top register
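
The four set-oriented programs above (Intersection, Difference, Inclusion and Set equality) share one pattern: pop a row from one stack, use it as a comparand against the other stack, and act on the match bit. The sketch below restates that pattern in Python under simplifying assumptions: stacks are plain lists whose last element is the top, the associative match is a membership test, PUSH does not store a value twice, and the stacks hold no duplicate rows. All function names are hypothetical.

```python
def intersection(a, b):
    """Rows of b that also occur in a, pushed into a new stack c."""
    c = []
    for row in reversed(b):               # POP (B) walks B from the top down
        if row in a and row not in c:     # MTCH (A) on equality, then PUSH (C)
            c.append(row)
    return c


def difference(a, b):
    """Rows of b that do not occur in a, pushed into a new stack c."""
    c = []
    for row in reversed(b):
        if row not in a and row not in c:
            c.append(row)
    return c


def inclusion(a, b):
    """True when every row of b occurs in a (B is a subset of A)."""
    for row in reversed(b):
        if row not in a:
            return False                  # inclusion bit stays zero
    return True                           # B exhausted: inclusion bit is one


def set_equal(a, b):
    """True when a and b hold the same rows, regardless of their order."""
    top_a, top_b = len(a), len(b)
    while top_a > 0:                      # POP (A)
        if top_b == 0:                    # POP (B) failed: B ran out first
            return False
        top_a -= 1
        top_b -= 1
        if b[top_b] not in a or a[top_a] not in b:
            return False
    return top_b == 0                     # A is empty: B must be empty too
```

Because the saved top registers are restored at the end of each micro program, the sketch never modifies its input lists.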

Selection. Selects the tagged rows in one of the associative
stacks (A), and pushes them to a second stack (B).

     1)  SMKR (A)           To all one
     2)  SMKR (B)           To all one
     3)  STBT (B)           Set the top register to zero
     4)  MVRR (A)           Saves the content of the top register
     5)  TBTR (A)           Tests the match bit
     6)  1                  Transfers control to the "8"
     7)  BRRF               Transfers control to the "14"
     8)  FIND (A)           Finds the first match and resets it
     9)  MVRR (A)           Moves content of the pointer register to the
                            top register
    10)  COPY (A)           Moves the content of the selected row of the
                            A to the output register
    11)  MVOC (B,A)         Transfers the content of the output register
                            of A to the comparand register of B
    12)  PUSH (B)
    13)  BRRB 12            Transfers control to the "5"
    14)  MVRR               Restores the content of the top register

Projection. Selects specified fields of each element of an associative
stack (A) and pushes them into another stack (B).
     1)  SMKR (B)           To all one
     2)  SMKR (A)n          n defines how many words follow this
                            instruction as operands
         n words
   n+3)  STBT (B)           Sets the top register to zero
   n+4)  MVRR (A)           Saves the content of the top register
   n+5)  POP (A)            If stack is empty control transfers to "n+12"
   n+7)  MVRR (A)           Transfers the content of the output register
                            to the remove blanks circuit
   n+8)  RBRF               Removes created blanks
   n+9)  MVRR (B)           Transfers the content of the remove blanks
                            circuit to the comparand register of B
  n+10)  PUSH (B)
  n+11)  BRRB 12            Transfers control to the "n+5"
  n+12)  MVRR (A)           Restores the content of the top register
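
Selection and Projection can be sketched in the same list-based style. The following Python functions are illustrative only: the names are hypothetical, the match bits of stack A are passed in as a list of booleans, "first match" is assumed to mean the lowest row index, rows are tuples of fields, and the mask plus remove-blanks step of Projection is modeled by keeping the listed field positions.

```python
def selection(a, tags):
    """Copy the tagged rows of stack a into a new stack b."""
    b = []
    pending = list(tags)                  # match bits, one per row of a
    while any(pending):                   # step 5: test the match bit
        i = pending.index(True)           # step 8: FIND the first match ...
        pending[i] = False                # ... and reset it
        if a[i] not in b:                 # PUSH (B) skips rows already present
            b.append(a[i])
    return b


def projection(a, fields):
    """Keep only the listed field positions of every row of stack a."""
    b = []
    for row in reversed(a):               # POP (A) walks A from the top down
        reduced = tuple(row[i] for i in fields)   # mask, then remove blanks
        if reduced not in b:              # PUSH (B) skips duplicates
            b.append(reduced)
    return b
```

Since the reduced rows pass through PUSH, projecting onto non-key fields removes duplicate rows, which matches the usual relational meaning of projection.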

Join.
     1)  SMKR (A)           To all one
     2)  SMKR (B)n          n defines the select fields on which the join
                            should be performed; in our discussion we
                            assume "n = 1"
     3)  "FIELD"            Address of specified field
     4)  SMKR (C)           To all one
     5)  STBT (C)           Sets the top register to zero
     6)  MVRR (A)           Saves the content of the top register
     7)  MVRR (B)           Saves the content of the top register
     8)  POP (A)            If stack is empty control transfers to "25"
     9)  MVRR (A)           Transfers the content of the output register
                            to shift 1
    10)  RTAT               Circulates the shift
    11)  MVRR (B)           Transfers the content of the shift to the
                            comparand register of the B
    12)  RTAT               Circulates the shift
    13)  MVRR (B)           Saves the content of the top register
    14)  MTCH (B)
    15)  TBTR (B)           Tests the match bit
    16)  1                  Transfers control to "18" if match bit is true
    17)  BRRB 13            Transfers control to the "8"
    18)  FIND (B)           Find first match and reset
    19)  MVRR (B)           Sets the content of the pointer register
    20)  COPY (B)           Writes the content of the pointed row to the
                            output register
    21)  MVRR (B)           Transfers the content of the output register
                            to shift 2
    22)  MVRR (C)           Transfers the content of the shift to the
                            comparand register of the C
    23)  PUSH (C)
    24)  BRRB               Transfers control to the "15"
    25)  MVRR (A)           Restores the content of the top register
    26)  MVRR (B)           Restores the content of the top register
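
The overall shape of the Join program (pop a row of A, align its join field, match it against B, and push one result row per match into C) can be sketched as follows. This is a simplified reading, not the exact register-level behavior: the function name and parameters are hypothetical, the comparison is assumed to be equality on a single field (n = 1), and the result row is assumed to be the concatenation of the two matching rows.

```python
def join(a, b, field_a, field_b):
    """Join stacks a and b on equality of one field; rows are tuples."""
    c = []
    for row_a in reversed(a):                         # step 8: POP (A)
        key = row_a[field_a]                          # steps 9-12: align the field
        matches = [row_b for row_b in b               # step 14: MTCH (B)
                   if row_b[field_b] == key]
        for row_b in matches:                         # steps 18-20: FIND, COPY
            joined = row_a + row_b                    # steps 21-22: via the shift
            if joined not in c:
                c.append(joined)                      # step 23: PUSH (C)
    return c
```

The inner loop over the matches corresponds to the backward branch to step 15, which keeps extracting matched rows of B until the match bit is no longer set.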


APPENDIX III
THE EXECUTION TIMES OF MICRO OPERATIONS
In this part of the dissertation the execution times of the primitive
operations of three different computer systems are presented.
Table A.3(a) lists a subset of the primitive operations of
the VAX 11/780.
Table A.3(b) is the same as Table A.3(a) except that it is for the
UNIVAC system.
Table A.3(c) shows the execution times of a subset of the primitive
operations of the STARAN computer.

TABLE A.3(a)
THE INSTRUCTION TIME OF VAX 11/780

OPERATION       DESCRIPTION                                  EXECUTION TIME (µsec)
ADDB R,R        Add byte (8 bits)                            .4
ADDW R,R        Add word (16 bits)                           .4
ADDL R,R        Add long word (32 bits)                      .4
ASHL R,R        Arithmetic shift left (10 bit positions)     2.0
BRW             Branch (word displacement)                   .8
BBS             Branch on bit set                            1.4
CLRL REG        Clear long word (32 bits)                    .6
CLRQ REG        Clear quad word (64 bits)                    1.2

CMPC            Compare character (10 characters)            14.4
CMPL R,R & BLSS Compare long word and branch on < 0          1.2
CMPL R,R & BLEQ Compare long word and branch on <= 0         1.8
MOVL R,R        Move long word (32 bits)                     .4
ROTL 10 R,R     Rotate left 10 bit positions                 1.2
TSTL & BLEQ     Test long word & branch on <= 0              1.0

TABLE A.3(b)
THE INSTRUCTION TIME OF UNIVAC

OPERATION   DESCRIPTION                                             EXECUTION TIME (µsec)
DEC         Decrease by 1 and test if u ≠ 0 skip next instruction   .4/.6
INC         Increase by 1 and test if u ≠ 0 skip next instruction   .4/.6
J           Jump to u                                               .3
JP          Jump to u if positive or else next instruction          .2/.3
JN          Jump to u if negative or else next instruction          .2/.3
SZ          Store zero                                              .2
SP1         Store positive 1                                        .2
TZ          Test zero, skip next instruction if 0                   .35/.55

TNZ         Test non zero, skip next instruction if ≠ 0             .35/.55
TE          Test equal, skip next instruction if equal              .35/.55
TNE         Test not equal, skip next instruction if not equal      .35/.55
TLE         Test ≤, skip next instruction if ≤                      .45/.65
TG          Test greater than, skip next instruction if >           .45/.65
DSC         Double shift circular right (u places)                  .4
LDSC        Double shift circular left (u places)                   .4
DSL         Double shift logical right (u places)                   .3
LDSL        Double shift logical left (u places)                    .3

SSC         Single shift circular right (u places)                  .35
LSSC        Single shift circular left (u places)                   .35
SSL         Single shift logical right (u places)                   .2
LSSL        Single shift logical left (u places)                    .2

TABLE A.3(c)
INSTRUCTION TIME OF STARAN

OPERATION     DESCRIPTION                                                       EXECUTION TIME (µsec)
B             Unconditional branch                                              .5
BNR           Branch if no response                                             .5
BRS           Branch if response                                                .5
BZ            Branch if zero                                                    .5
CLEAR         Clear control bits associated to the rows in associative memory   .17
CLEAR COMP.   Clear comparand register                                          .33
DECR          Decrement                                                         .2/.5
INCR          Increment                                                         .2/.5

ENR           Energize                                                          .13
EQC           Equal to comparand search                                         .2n* + 1.08
FIND          Find first equality                                               .2/.6
GM            Generate mask                                                     .36
LM            Load comparand register                                           .2
LR            Load register from memory                                         1.5
LRR           Load register from register                                       .28
SCW           Store comparand in associative memory                             .87 - 2.87

* n (number of bits)

SET (X)       Set control bits of rows in associative memory                    .17
SET (M)       Set comparand register                                            .33
LI            Load immediate                                                    .28
RESV FST      Step to first match and reset others                              .93
SR            Store register in memory                                          1.5
LC            Load comparand from associative memory                            .38/1.05
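
As an illustration of how such a table can be used, the short Python sketch below estimates the time of a straight-line sequence of primitive operations by summing their table entries. This additive model is a simplifying assumption made here (no overlap between operations, no branches taken or loops), and the names STARAN_USEC and estimate_usec are hypothetical. Only single-valued entries of Table A.3(c) are included.

```python
# A few single-valued entries copied from Table A.3(c), in microseconds.
STARAN_USEC = {
    "B": 0.5,
    "BNR": 0.5,
    "BRS": 0.5,
    "BZ": 0.5,
    "CLEAR": 0.17,
    "SET (X)": 0.17,
    "SET (M)": 0.33,
    "LI": 0.28,
    "RESV FST": 0.93,
    "SR": 1.5,
}


def estimate_usec(sequence, table=STARAN_USEC):
    """Sum the table times of a straight-line sequence of operations."""
    return sum(table[op] for op in sequence)
```

For instance, the sequence LI, SET (M), RESV FST, BNR adds up to .28 + .33 + .93 + .5 = 2.04 µsec under this model.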

w

O'l
O'l

367

LIST OF REFERENCES
Adam, J. J., and Haden, D. H. Computers Appreciation, Applications,
Implementations. New York: John Wiley and Sons, 1973.
Aho, A. V., and Johnson, S. C.
(June 1974): 99-124.

11

LR Parsing. 11 Computing Surveys

____ , and Ullman, J. D. Principles of Compiler Design.
MA: Addison Wesley, 1977.

6

Reading,

Hopcroft, J. E.; and Ullman, J. D. The Design and Analysis
of Computer Algorithms.
Reading, MA:
Addison Wesley, 1974.
Awad,

E. M. Automatic Data Processing, Principles and Procedures.
Englewood Cliffs, NJ: Prentice Hall, 1973.

Banerjee, J. ; Hsiao, D. K. ; and Baum, R. I. "Concepts and Capabil it i es of a Data Base Computer. 11 ACM Transactions on Data Base
Systems 3 (December 1978): 347-384.
_ _ _ ; and Kannan, K. 11 DBC - A Database Computer for Very Large
Data Bases. 11 IEEE Transactions on Computers
28 (June 1979):
414-429
Baron, J. R.; O'Donnell, J.; Riley, D; and Gyllstrom, P. "Introducing to the B-1726 Computer." Laboratory Report.
Iowa City,
IA: University of Iowa, Department of Computer Science 1977.
Barrett, W. A., and Couch, J. D.
and Practice .
Chicago, IL:
1979.

Compiler Construction: Theory
Science Research Associates,

Bartlett, J.; Mudge, J.; and Springer, J. 11 Associative Memory Chips:
Fast, Versatile and here. 11 Electronics, August 1970, pp. 96-100.
Bayer, R.; Graham, R. M.; and Seegmul ler, G. Lecture Notes in Computer Science. New York: Springer-Verlag, 1978.
Berra, P. B. "Some Problems in Associative Processor Applications
to Data Base Management." In AFIPS Conference Proceedings, Volume 43, 1974 National Computer Conference and Exposition
Chicago, May 6-10, 1974, pp. 1-5. Montvale, NJ: AFIPS, 1974.
Bhandarkar, D. P.; Barton, J. B.; and Tasch, A. F., Jr. "Charge Coupled Device - Memories: A Perspective." Computer 12 (January
1979): 16-23.

368
Bird,

R. M. ;
Tu, J. C.; and Worthy, R. M.
"Associative/Parallel
Processors for Searching Very Large Textual Data Bases. 11 Proc.
Third Non-Numeric Workshop, May 1977, pp. 1-9.

Bloch, Erich, and Galage, Dom.
"Component Progress:
High Speed Computer Architecture and Machine
Computer 11 (April 1978): 64-75.

Its Effect on
Organization. 11

Boyce, R. F.; Chamberlin, D. D.; King, W. F., III; and Hammer M. M.
"Specifying Queries as Relational Expression:
The SQUARE Data
Sub language. 11 Communications of the ACM
18
(November 1975):
621-628.
Boyle, W. S., and Smith, G. E.
"Change-Coupled Devices:
A New
Approach to MIS Device Structures. 11 IEEE Spectrum 8 (July 1971):
18-27.
Bray,

01 in, and Thurber, K. J.
"What's Happening With Data Base
Processors?" Datamation, January 1979, pp. 146-156.

Bremer, J. W.
"Hardware Technology in the Year 2001.
(December 1976): 32-36.

11

Computer 9

Chamberlin, D. D.
"Relational Data Base Management Systems.
puting Surveys 8 (March 1976): 43-59.

11

Com-

_ _ _ _ ; Gray, J. N.; and Traiger, I. L.
"Views, Authorization and
Loe king in a Rel at i ona 1 Data Base Sys tern. 11 In AF I PS Conf ere nee
Proceedings, Volume 44, 1975 National Computer Conference,
_A_na_h_e_i_m~,_CA___._,_M_a~y_19_-_2_2, pp. 425-430.
Montvale, NJ:
AFIPS,
1975.
Champine, G. A.
(May 1979):

11

Current Trends in Data Base Systems.
27-41.

11

Computer 12

11

Datamation,

Chu,

Y.
Computer Organization and Micro Programming.
Cliffs NJ:
Prentice Hall, 1970.

Englewood

Codd,

11
E. F.
Extending the Data Base Re 1at i ona l Model to Capture
More Meaning. 11
San Jose, CA:
IBM Research Laboratory, 1979.

11

Four Approaches to a Data Base Computer.
December 1978, pp. 101-106.

----•

"Relational Completeness of Data Base Sublanguages.
Courant Computer Symposia 6 (1971): 65-98.

11

A Rel at i ona l Model of Data for Large Shared Data Banks."
Communications of the ACM 13 (June 1970): 377-387.
11

Date,

C.
J.
An Introduction to Database Sys terns.
Addison Wesley, 1977.

Reading,

MA:

369

Defiore, C. R. , and Berra, P. B. 11 A Data Management System Utilizing
an Associative Memory." In AFIPS Conference Proceedings, Volume
42, 1973 National Computer Conference and Exposition, New York,
June 4-8, 1973, pp. 181-185. Montvale, NJ: AFIPS, 1973.
Denning, P. J. "Virtual
1970): 153-189.
Deo,

Memory. 11

Computing Surveys

2

(September

N.
Graph Theory With Applications to Engineering and Computer
Science. Englewood Cliffs, NJ: Prentice Hall, 1974.

Dewitt, D. J . "DIRECT - A Multiprocessor Organization for Supporting
Relational Database Management Systems." IEEE Transactions on
Computers 28 (June 1979): 395-406.
Eichelberger, E. B.; Caswell, H. L.; Holton, W. C.; Petschauer, R.
J.; Brown, G. A.; Krishnaswamy, S.; Stopper, H.; Carter,
D. H.; Lattin, W.; Sullivan, R. M.; and Losleben, P.
"Basic
Technology." Computer 11 (September 1978): 10-19.
Faggin, F.
''How VLSI Impacts Computer Architecture." IEEE Spectrum
15 (May 1978): 28-32.
Finnila, C. A., and Love, H. H., Jr. "The Associative Linear Array
Processor." IEEE Transactions on Computers 26 (February 1977):
112-124.
Flynn, M. J. "Some Computer Organization and Their Effectiveness."
IEEE Transactions on Computers 21 (September 1972):
125-140.
Foster, C. C.
Content Addressable Parallel Processor.
New York: Van Nostrand Reinhold, 1976.

2nd

ed.

Gaines, R. S., and Lee, C. Y. 11 An Improved Cell Memory. 11 IEEE Transactions on Electron Computers 14 (February 1965): 72-75.
Gi mp e 1 ,

J.

F.

Algorithms in SNOBOL 4.

New York:

Wiley,

1976.

Ginsburg, S. Algebraic and Automata - Theoretic Properties of
Formal Languages. Amsterdam: North-Holland, 1975.
Gries, David. Compiler Construction for Digital Computers.
John Wiley and Sons, 1971.
Harrison,
MA:

M. A.
Introduction to Formal Language Theory.
Addison Wesley, 1978.

New York:
Reading

Haskin, Roger.
"Hardware for Searching Very Large Text Databases."
Urbana, IL:
University of Illinois, Department of Computer
Science, 1978.
Hayes, J. P. Computer Architecture Organization.
Hill, 1978.

New York:

McGraw-

370
Heacox, H. C.; Cosloy, E. S.; and Cohen, J. B. "An Experiment in
Dedicated Data Management. 11 In Very Large Data Bases, Volume l,
pp. 511-513.
Edited by D.S. Kerr. New York: Association for
Computing Machinery, 1975.
Healy, L. D.; Lipovski, G. J.; and Doty, K. L. 11 The Architecture of
a Context Addressed Segment-Sequential Storage. 11 In AFIPS Conference Proceedings, Volume 41, Part II, 1972 Fall Joint Computer Conference, Anaheim ,CA, December 5-7, 1972,
pp.
691-701.
Montvale, NJ: AFIPS, 1972.
Held, G. D.; Stonebraker, M. R.; and Wong, E. 11 INGRES - A Relational
Data Base System. 11 In AFIPS Conference Proceedings, Volume 44,
1975, National Computer Conference, Anaheim, CA, May 19-22,
1975, pp. 409-416. Montvale, NJ: AFIPS, 1975 .
Hellerman, H. Digital Computer System Principles. New York:
Hill, 1973.

McGraw-

Hi 1 burn, J. L. , and Juli ch, P. M. Microcomputers/Mi crorpocessors:
Hardware, and Applications.
Englewood Cliffs, NJ:
Prentice
Hall, 1976.
Hoare, C. A. R.
"An Axiomatic Basic for Computer Programming."
Communications of the ACM 12 (October 1969): 576-583.
Hopcroft, J. E., and Ullman, J. D. Formal Languages and Their
Relation to Automata.
Reading, MA:
Addison Wesley,
11 Bubble Memory as Small Mass Storage. 11
Juliussen, J. E.
TX: Texas Instruments, Inc .• February 1977.

1969.
Dallas,

11 Magnetic Bubble Systems - Approach, \, Practical
Computer Design 15 (October 1976): 81-91.

Use. 11

Kerr,

11 Data Base Machines with Large Content Addressable
D. S.
B1 ocks and Structura 1 Information Processors. 11
Computer
12 (March 1979): 64-81.

Kim,

11 Relational Database
Won.
(September 1979): 185-211.

Systems. 11

Computing Surveys

11

11 Von Neumann's First Computer Program. 11
Knuth, D. E.
Surveys 2 (December 1970): 247-260.

Computing

Kuck, D. J. The Structure of Computers and Computations.
John Wiley and Sons, 1978.

New York:

Lamb,

S.
"An ADD-in Recognition Memory For S-100 Bus Micro Computers.11 Computer Design 17 (September 1978): 162-168.

371

and Vanderslice, R. "Recognition Memory: Low Cost Content Addressable Parallel Processor for Speech Data Manipula11
tion.
Paper presented at the joint meeting of the Acoustical
Society of America and Acoustical Society of Japan, Honolulu,
November 29, 1978.
Langdon, G. G., Jr. 11 A Note on Associative Processors for Data
Management."
ACM Transactions on Data Base System
3
1978): 148-158.
"Database Machines: An Introduction."
on Computers 28 (June 1979): 381-383.
Lea,

(June

IEEE Transactions

R.
M.
Associative Processing of Non-Numerical Information.
Dordrecht, Holland: D. Reidel Publishing Company, 1976.

Lee, C. Y., and Paull, M. C. "A Content Addressable Distributed Logic
Memory With Applications to Information Retrieval." Proceedings
of the IEEE 51 (June 1963): 924-932.
Lee, S. Y., and Chang, H.
"Associative Search Bubble Devices for
Content Address ab 1e Memory and Array Logic. " IEEE Transact ions
on Computers 28 (September 1979): 627-636.
Leilich, H. 0.; Stiege, G.; and Zeidler, H. C. "A Search Processor
for Data Base Management Systems."
Braunchweig, Germany:
Technical University of Braunchweig, December 1977.
Lewin, Douglas.
Introduction to Associative Processors.
Holland: D. Reidel Publishing Company, 1976
Lewis P. M., II; Rosendrantz, D. J.; and Stearns, R. E.
Design Theory. Reading, MA: Addison Wesley, 1978.

Dordrecht,
Compiler

Lin, C. S.; Smith, Diane; and Smith, J. M. "The Design of a Rotating
As soc i at i ve Memory for Rel at i ona 1 Database App 1 i cat ions." ACM
Transactions on Database Systems 1 (March 1976): 53-65.
Linde, R. R.; Gates, R.; and Peng, T. "Associative Processor Applications to Real-Time Data Management." In AFIPS Conference
Proceedings, Volume 42, 1973 National Computer Conference and
Exposition, New York, June 4-8, 1973, pp. 187-195. Montvale,
NJ: AFIPS 1973.
Lipovski, J. G., and Doty, K. L.
"Developments and Directions in
Computer Architecture."
Computer 11 (August 1978):
54-67.
Liptay, J. S. "Structural Aspects of the System 360 Model 85."
IBM Systems Journal 7 (1968): 15-21.
Maller, V. A. "The Content Addressable File Store - CAFS." ICL
Technical Journal 2 (November 1979): 265-279.


Martin, R. R., and Frankel, H. D. "Electronic Disks in the 1980's."
Computer 8 (February 1975): 24-30.
Maryanski, F. J. "Backend Database Systems." ACM Computing Surveys
11 (March 1980): 3-26.
Moulder, Richard. "An Implementation of a Data Management System
On An Associative Processor." In AFIPS Conference Proceedings,
Volume 42, 1973, National Computer Conference and Exposition,
New York, June 4-8, 1973, pp. 171-176. Montvale, NJ: AFIPS,
1973.
Mukhopadhyay, A. "Hardware Algorithms for Non-Numeric Computation."
IEEE Transactions on Computers 28 (June 1979): 384-394.
____ , and Hurson, Alireza. "An Associative Search Language - ASL
For Data Management."
In AFIPS Conference Proceedings, Volume 48, 1979, National Computer Conference, New York, June 4-7,
1979, pp. 727-732. Montvale, NJ: AFIPS, 1979.
Myers, G. J. Advances in Computer Architecture. New York: Wiley,
1978.

Oliver, E. J. "RELACS, An Associative Computer Architecture to
Support a Relational Data Model." Ph.D. dissertation, Syracuse
University, 1979.
Ozkarahan, E. A.; Schuster, S. A.; and Smith, K. C. "A Data Base
Processor." Technical Report. Toronto, Ontario, Canada:
University of Toronto, November 1974.
____. "A High Level Machine-Oriented Assembler Language for a
Database Machine." Technical Report. Toronto: University of
Toronto, October 1976.
____. "RAP - An Associative Processor for Data Base Management."
In AFIPS Conference Proceedings, Volume 44, 1975, National
Computer Conference, Anaheim, CA, May 19-22, 1975,
pp. 379-387. Montvale, NJ: AFIPS, 1975.
Parhami, Behrooz. "A Highly Parallel Computing System for Information
Retrieval." In AFIPS Conference Proceedings, Volume 41, Part
II, 1972, Fall Joint Computer Conference, Anaheim, CA, December
5-7, 1972, pp. 681-690. Montvale, NJ: AFIPS, 1972.
Parker, J. L. "A Logic Per Track Retrieval System." Technical
Report. Vancouver, British Columbia, Canada: University of
British Columbia, 1971.
Panigrahi, G. "The Implications of Electronic Serial Memories."
Computer 10 (July 1977): 18-25.

Prothro, V. C. Information Management Systems. New York: Van
Nostrand Reinhold, 1976.
Rege, S. L. "Cost Performance and Size Trade Offs For Different
Levels in a Memory Hierarchy." Computer 9 (April 1976): 43-50.

Reingold, E. M. Combinational Algorithms. Englewood Cliffs, NJ:
Prentice Hall, 1977.
Rosell, Juan Rodriguez. "Empirical Data Reference Behavior in Data
Base Systems." Computer 9 (November 1976): 9-13.
Rudolph, J. A. "A Production Implementation of an Associative Array
Processor - STARAN." In AFIPS Conference Proceedings, Volume 41,
Part II, 1972, Fall Joint Computer Conference, Anaheim, CA,
December 5-7, 1972, pp. 229-241. Montvale, NJ: AFIPS, 1972.

Salisbury, A. B. Microprogrammable Computer Architectures. New York:
Elsevier, 1976.
Salomaa, Arto. Formal Languages. New York: Academic Press, 1973.
Salzer, J. M. "Bubble Memories, Where Do We Stand?" Computer 9
(March 1976): 36-41.

Schuster, S. A.; Nguyen, H. B.; Ozkarahan, E. A.; and Smith, K. C.
"RAP2 - An Associative Processor for Databases and Its
Application." IEEE Transactions on Computers 28 (June 1979):
446-458.
Slade, A. E., and McMahon, H. O. "A Cryotron Catalog Memory System."
In Proceedings of the Eastern Joint Computer Conference, New
York, December 10-12, 1956, pp. 115-120. New York: American
Institute of Electrical Engineers, 1956.
Slana, M. F. "Workshop Report: A Computer Element Technology
Update." Computer 10 (July 1977): 37-39.
____. "Workshop Report: Computer Elements for the 80's."
Computer 12 (April 1979): 98-102.
Slotnick, D. L. "Logic Per Track Devices." In Advances in Computers,
Volume 10, pp. 291-296. Edited by Franz L. Alt and Morris
Rubinoff. New York: Academic Press, 1970.
Smith, Diane, and Smith, J. M. "Relational Database Machines."
Computer 12 (March 1979): 28-38.
Stone, H. S., Ed. Introduction to Computer Architecture. Chicago:
Science Research Association, 1975.
Su, Stanley Y. W. "Cellular Logic Devices: Concepts and
Application." Computer 12 (March 1979): 11-25.

____, and Emam, Ahmed. "CASDAL: CASSM's Data Language." ACM
Transactions on Database Systems 3 (March 1978): 57-91.
____; Nguyen, H. B.; Emam, Ahmed; and Lipovski, G. J. "The
Architectural Features and Implementation Techniques of the
Multi Cell CASSM." IEEE Transactions on Computers 28 (June
1979): 430-445.

Thornton, J. E. Design of a Computer. Glenview, IL: Scott
Foresman, 1970.
____. "Parallel Operation in the Control Data 6600." In AFIPS
Conference Proceedings, Volume 26, Part II, 1964 Fall Joint
Computer Conference, San Francisco, pp. 33-40. New York: AFIPS,
1964.

Thurber, K. J.; Jensen, D. E.; Jack, L. A.; Kinney, L. L.; Patton,
P. C.; and Anderson, L. C. "A Systematic Approach to the Design
of Digital Bussing Structures." In AFIPS Conference Proceedings,
Volume 41, Part II, 1972, Fall Joint Computer Conference,
Anaheim, CA, December 5-7, 1972, pp. 719-740. Montvale, NJ:
AFIPS, 1972.
Toombs, Dean. "An Update: CCD and Bubble Memories." IEEE Spectrum
15 (April 1978a): 22-30.
____. "CCD and Bubble Memories: System Implications." IEEE
Spectrum 15 (May 1978b): 36-39.

Tsichritzis, D. C., and Lochovsky, F. H. Data Base Management Systems. New York: Academic Press, 1977.
Ullman, J. D. Principles of Database Systems. Rockville, MD:
Computer Science Press, 1980.

Wah, B. W., and Yao, B. S. "DIALOG - A Distributed Processor
Organization for Database Machine." In AFIPS Conference Proceedings,
Volume 49, 1980, National Computer Conference, Anaheim, CA, May
19-22, 1980, pp. 243-253. Arlington, VA: AFIPS, 1980.
Wilner, Wayne T. Problem - Language Oriented Architecture. Dordrecht,
Holland: D. Reidel Publishing Company, 1976.
Yau, S. S., and Fung, H. S. "Associative Processor Architecture -
A Survey." Computing Surveys 9 (March 1977): 3-27.

