University of Wollongong

Research Online
Department of Computing Science Working
Paper Series

Faculty of Engineering and Information
Sciences

1983

A screen oriented simulator for a DEC PDP-8 computer
Neil Gray
University of Wollongong, nabg@uow.edu.au

Follow this and additional works at: https://ro.uow.edu.au/compsciwp

Recommended Citation
Gray, Neil, A screen oriented simulator for a DEC PDP-8 computer, Department of Computing Science,
University of Wollongong, Working Paper 83-2, 1983, 65p.
https://ro.uow.edu.au/compsciwp/69

Research Online is the open access institutional repository for the University of Wollongong. For further information
contact the UOW Library: research-pubs@uow.edu.au

THE UNIVERSITY OF WOLLONGONG

DEPARTMENT OF COMPUTING SCIENCE

A SCREEN ORIENTED SIMULATOR FOR A DEC PDP-8 COMPUTER


N.A.B. Gray

Department of Computing Science
University of Wollongong

Preprint No 83-2

January 25, 1983

P.O. Box 1144, WOLLONGONG, N.S.W., AUSTRALIA
telephone (042)-282-981
telex AA29022

A Screen Oriented Simulator for a DEC PDP-8 Computer.
N.A.B. Gray.

Department of Computing Science, University of Wollongong, PO Box 1144,
Wollongong NSW 2500, Australia.

ABSTRACT

This note describes a simulator for the DEC PDP-8 computer. The
simulator is intended as an aid for students starting to learn assembly
language programming. It utilises the simple graphics capabilities of
the terminals in the department's laboratories to present, on the
terminal screen, a view of the operations of the simulated computer.
The complete system comprises two versions of the program for
simulating a PDP-8 computer and a simplified "assembler" for preparing
students' programs for execution. There are also a number of
example PDP-8 programs illustrating particular aspects of that computer.
The first version of the simulator is intended to help illustrate a
conventional computer's fetch-decode-execute cycle. In this version,
there is a three part display. The three parts represent (i) the central
processing unit ("cpu"), (ii) the communications path joining the cpu
and main memory ("bus") and (iii) a window into main memory. These
displays allow a detailed presentation of how a program is actually
executed on a computer. Data, both instructions and program data, can be
seen being read out from memory and being transferred over the bus to
registers in the cpu. There instructions are decoded and program data
manipulated. Results from computations can be seen being transferred
back out of the cpu registers, over the bus, to memory where they
overwrite previous values. The simulation may be run continuously, at a
user selectable speed, or may be set to pause, and await user
response, between each of the stages of the machine's instruction
cycle.
A second, more elaborate version of the simulator program provides
an environment for introducing basic concepts of assembly
language "debugging". This version also provides displays of the
simulated machine's cpu and memory. The level of detail of these displays is
user selectable. It incorporates a fairly conventional "debugging"
function that allows users to run their programs in a controlled manner.
Users may, for example, specify "breakpoints" in the PDP-8 programs
that they have prepared for simulated execution. On reaching such a
breakpoint, execution of a program is suspended temporarily to allow
inspection of the contents of the cpu registers and of the memory of the
simulated machine.

This more advanced version of the simulator program provides a
reasonably realistic model of how input and output ("I/O") are performed
on small mini- and micro-computers. The simulated machine is
equipped with some standard peripherals such as a clock and
analog-digital converter. These simulated peripherals can be operated using
either "flag-driven" I/O or through an "interrupt" mechanism.

1. Introduction
In their second year of Computing Science, students at the University of Wollongong
are required to take a course introducing machine organization and assembly
language programming. Prior to taking this course, students' sole computing experience
is a one year introductory course on programming, using the PASCAL language
on a time-shared system. For students continuing in computing science, the
"assembly language" course provides a grounding for subsequent studies on compilers,
operating systems and more advanced courses on micro-computers ("micros"). The
major benefit for general students is a wider perspective on computers. A basic
understanding of how computers work, and how they may communicate with peripheral
devices, is particularly advantageous to students in the physical sciences who may later
need to interface computers with their experiments.
Such an introductory course on machine organization and assembly language
involves first some general overview of the organization of conventional computers and
then some revision of various number representations and logical operations (so that
students will not experience any difficulties due to unfamiliarity with binary, octal or
hexadecimal representations of data in a computer). After completing this introductory
part of the course, students will proceed to programming in the "assembly language"
of some particular computer. This involves learning new programming constructs. The
high-level PASCAL constructs for procedure calls and conditional statement execution
are considerably removed from the detailed operations that can be performed directly
by a computer. Consequently, students must change their programming styles from
PASCAL to the regime of the new machine's assembly language. Furthermore,
students must learn totally different methods for localizing and identifying errors in their
programs.
Such an introductory course on assembly language programming could be
developed around the use of small micro-computers by the students. It is quite
possible to use a time-shared computer to prepare programs for micros, and to transfer
these programs to connected micros for trial execution. This is indeed the scheme
employed in the more advanced third year "Microcomputers" course as taught in the
department. Although such "hands-on" experience might well be advantageous to
students, departmental resources preclude this approach being used with the large
second year classes. Instead, the introductory course in machine organization and
assembly programming must rely on the resources of the department's time-shared
Perkin-Elmer computers running under the UNIX operating system.
In previous years, the course has emphasized the use of the Perkin-Elmer
machines and has, in large part, been an exposition on how to write large assembly
language programs for a complex machine with a sophisticated operating system.
There are a number of disadvantages relating to the use of the Perkin-Elmer
machines when first introducing assembly language programming. These machines
are complex, (baroque?), in their architecture. Students are immediately confronted
with a plethora of instructions, data formats and programming conventions. The UNIX
system does provide some aids for debugging assembly language programs,
specifically the adb interactive debugging program, but these aids are themselves complex
and difficult to learn. Of course, because the students' programs must be run under
the general time-sharing regime, all input and output must be performed through
"magical" calls to the operating system. If all I/O tasks are thus delegated to the
operating system then it becomes more difficult for students to gain any appreciation
of how input and output are actually performed; such an appreciation is an essential
prerequisite to subsequent studies of operating systems.
In 1982, an attempt was made to find a simpler environment in which the basic
concepts of assembly language programming might be introduced. Mr. R. Nealon, the
software Professional Officer in the department, had previously developed a simple
screen-oriented simulator for a hypothetical machine, the "r80". This simulator runs
on the department's time-sharing system and utilizes the limited graphics capabilities
of the standard terminals. The simulator system allows for programs to be written in
the assembly language of the r80 and then "visibly executed". The r80 has a relatively
small memory and only a couple of registers in its cpu. All elements of the machine,
both cpu registers and memory, can be simultaneously displayed on the terminal
screen. This display can show how the execution of each instruction changes the
state of the simulated machine. The r80 simulator allows programs to be executed
normally or, in "single-step" mode, one instruction at a time. This r80 simulator was
adopted and used in both the lecture course and in the first two assembly language
assignments. Most students found the display of the r80 executing their programs to
be of considerable assistance in obtaining some understanding of how a computer
operates. Only after these initial assignments had been completed did students move
on to the greater complexities of assembly language programming for the
Perkin-Elmer machines.
The r80 is a "hybrid" combining features present in many current micros; its
mechanisms for addressing memory, for calling subroutines and for manipulating
different sized data elements are, however, somewhat unconventional. The r80 design
does not attempt to represent any realistic I/O mechanism. Apart from the
single-stepping facility, the simulator does not provide any debugging aids akin to those that
students must employ in more advanced exercises. Although of considerable value in
initial exercises, the overall applications for the r80 were limited by these features and
by the small size of memory available for programs. (The r80 has only four hundred
bytes of memory and each instruction requires four bytes).
It was decided to try to devise a more elaborate computer simulator starting from
the concepts Nealon had developed and expressed in the r80. The objectives of the
new simulator included (i) provision of more memory to allow for larger programs, (ii)
incorporation of some conventional debugging mechanisms that could be used in
association with visual displays of the simulated machine, (iii) implementation of some
scheme for illustrating program execution at varying levels of detail down to the
individual micro-program steps of the computer's instruction cycle, and (iv) realistic
simulation of I/O. Hopefully, this more elaborate simulator will allow students to learn more
aspects of assembly language programming within a controlled and helpful
environment and may possibly allow students to write simple I/O programs without in any way
adversely impacting the real computer's time-share system.

Given the decision to implement a simulator, one can then of course choose any
real or hypothetical computer to simulate. The new simulator that has been developed
is based closely upon the Digital Equipment Corporation's (DEC's) PDP-8 computer,
possibly the first widely used mini-computer. This almost archaic machine is of
course extremely limited in capacity and unrepresentative of modern mini- and
micro-computers. However, the limitations of the PDP-8 are of little consequence for
this initial teaching application.
The PDP-8's instruction set is very sparse; frequently many PDP-8 instructions
are necessary to realize the same effect as can be obtained by a single instruction on
a more sophisticated machine. The available instructions are however quite sufficient
for the assignments attempted by students. Larger instruction repertoires tend to
confuse by offering many alternative mechanisms for attaining the same objective.
On the PDP-8 all data elements, program data and instructions, are constant in
size. Much greater flexibility can be attained on more modern machines with bit, byte,
half-word, full-word and double-word data elements; however, this very flexibility
entails artificial problems of data-alignment that are frequently difficult to comprehend
when first beginning assembly programming. The subroutine call mechanism, and
interrupt handling mechanism, on the PDP-8 are both relatively simple. This very
simplicity is inconvenient in sophisticated applications but, until students are familiar with
the simple approaches (and their limitations), more elaborate mechanisms frequently
seem both arbitrary and over complex.
The Digital Equipment Corporation manufactured several variants on the basic
PDP-8 machine. These differed not only in their actual hardware realization but also,
to minor degrees, in their instruction repertoire. Later models tended to have rather
more instructions and some had an extra register in their cpus. The simulator does not
attempt to capture any specific model of the PDP-8. It is closest to one of the earlier
variants.
Both the simulator programs and the assembler are written in standard PASCAL.
The sources for these programs are available to students, and components of the
code are used in illustrative examples during the course.
The rest of this document consists of notes for students. Topics covered include
the following:

a)  An overview of the PDP-8. (This is a somewhat cursory review of conventional
    computer organizations, using the PDP-8 as a specific example. This material
    should have been covered, in both greater breadth and detail, in lectures prior to
    students starting to use the simulators).

b)  An introduction to the basic simulator. (This section explains the form of the
    display in that version of the simulator used to illustrate the
    "fetch-decode-execute" cycle of the machine. The concept of an "object" file containing a
    machine compatible representation of a program is introduced, as are
    assumptions about where programs are placed in the memory of the machine).

c)  Preparation of programs for the simulator. (A simple assembler program is
    introduced. This is a standard two-pass assembler producing absolute code. The
    basic instruction set of the PDP-8 is presented and a simple example program,
    such as students might be expected to write, is given).

d)  Addressing mechanisms of the PDP-8 and subroutine calls. The "paged"
    addressing mechanism of the PDP-8 is covered in a little more detail; the early
    program examples avoid the addressing problems by using only page 0 and page
    1. The subroutine call mechanism is illustrated.

e)  Debugging assembly language programs --- the advanced simulator. (The
    advanced version of the simulator is introduced together with its "debugging"
    functions. These functions implement another variant on DDT, adb or other
    similar debugging systems. The various display options of the advanced simulator are
    covered).

f)  Flag-driven I/O. Both the sections on I/O are considerably abbreviated versions of
    material covered in the lecture course. Input and output are presented here first
    in terms of "flag-driven" I/O methods. The advanced simulator has a
    "pseudo-keyboard" and a "pseudo-teletype" which are in reality standard files on the UNIX
    system. Data can be read from and written to these pseudo-devices by standard
    I/O instructions of the PDP-8. The simulation is realistic save that, for obvious
    reasons, the speed of these slow peripheral devices has been increased by
    somewhat more than three orders of magnitude.

g)  Interrupts. Interrupt driven I/O is introduced using an example program that
    "acquires and processes" data from a pseudo-analog/digital converter. The data
    acquisition rate is constant, being clock driven; the processing time necessary for
    each data element varies (data are random numbers, the processing really
    consists of counting the number of binary 1s etc). In the long run the acquisition and
    processing rates are approximately balanced but there are short term fluctuations
    making it necessary to buffer data between acquisition and processing.

h)  Limitations of the PDP-8 architecture. A few of the limitations of this computer are
    briefly noted.

2. The DEC PDP-8 Computer.

The PDP-8, circa 1965, is an early model laboratory computer. Typically, it is
used for tasks such as monitoring simple laboratory experiments or running
remote-job-entry stations for larger computers. The processor is still manufactured and used
as the basis of certain rather restricted word-processing systems; it is also used in
one of DEC's less sophisticated "personal" computers.
A considerably simplified representation of the machine is shown below. We can
describe the machine in terms of four main components.
First, there is the computer's main memory where both program and data are
stored.

Second, there is the central processing unit. The cpu contains three subsystems
--- a control unit which effects the execution of the program, an arithmetic logic unit
which modifies data and a set of registers. The cpu registers hold data currently being
used such as, for example, the last character read in from a terminal or the current
temporary result of some computation.
The third main component, the "bus", joins the cpu to memory (and to peripheral
devices). The bus can be viewed as a communications highway consisting of many
signal wires. Some of these signal wires carry control signals, others convey
individual bits of data and still others specify the destination of the data being transferred
on the bus.

[Figure: block diagram of the machine. MAIN MEMORY and the CENTRAL PROCESSING
UNIT (control unit, arithmetic-logic unit, registers) are joined by the "bus" (data
highway). Three controllers on the bus serve peripheral devices "A", "B" (e.g. an
analog-digital converter) and "C" (e.g. a "teletype", i.e. printing terminal).]

The fourth and final component of our system is comprised of the various
peripheral devices and their controllers. Data can be sent to or received from peripheral
devices. For each such device there will be a controller. A controller mediates between
the device itself and the computer's bus; each controller will have specially designed
circuitry to convert data from their external form (e.g. a slow sequence of electrical
pulses from something like a keyboard or a particular transient voltage on an analog
to digital converter ("a/d")) into the conventional signals used within the computer.

It is necessary to know something of the internal structure of main memory and
the cpu; details of the bus and device controllers are less important.

2.1. The Main Memory.
The main memory of a PDP-8 computer comprises 4096 "words". (Our simulated
machine has less; real PDP-8s could be extended to have more memory through
various hardware "kludges"). It is convenient to regard memory as being a vector, i.e. a
one-dimensional array, so a PASCAL data structure declaration for the memory would
read something like "const coresize = 4095; var store : array[0..coresize] of
word". The index number of each word in this memory is referred to as its "address"
or "location".
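The array view of memory can be sketched in a few lines of Python. This is only an illustration of the idea, not the simulator's PASCAL source: the names read_word and write_word are hypothetical, and the full 4096 words of a real PDP-8 are assumed rather than the smaller memory of the simulated machine.

```python
CORESIZE = 4095          # highest address, as in the PASCAL declaration
WORD_MASK = 0o7777       # twelve bits per word

# Memory as a one-dimensional array indexed by address.
store = [0] * (CORESIZE + 1)

def write_word(address, value):
    """Store a value at an address, keeping only the low twelve bits."""
    if not 0 <= address <= CORESIZE:
        raise IndexError("address outside memory")
    store[address] = value & WORD_MASK

def read_word(address):
    """Return the twelve-bit word held at an address."""
    if not 0 <= address <= CORESIZE:
        raise IndexError("address outside memory")
    return store[address]

write_word(0o20, 0o1020)
print(read_word(0o20))
```

Masking on every write is one simple way to guarantee that no location ever holds more than twelve bits.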

    000000000000     (address 0)
    110110110110
    010101010101
    101010101010

                     Memory: 4096 "words" of 12 bits each

    000111000111
    101010011101
    001010011100     (address 4095)

    MAIN MEMORY

On the PDP-8, memories were usually constructed out of magnetic cores;
modern variants of the machine use semi-conductor memories. It takes a finite time to
write data into a word of memory or to read data out of a word. Typically, the "memory
cycle time" is about one micro-second (i.e. one one-millionth of a second) with
semi-conductor memories being somewhat faster. This memory cycle time was the
major factor in determining the speed of execution of programs. As will be explained in
more detail later, the execution of any individual instruction on a PDP-8 involved from
one to three memory accesses. Consequently, with a one micro-second store, the
machine could execute something around 300,000-500,000 instructions per second.

Each word in a PDP-8's memory held 12 bits. A word could be regarded as
holding an instruction, an unsigned number in the range 0-4095, an address (which was of
course just an unsigned number in the range 0-4095), a signed number in the range
-2048..+2047, one eight bit character (with four bits unused) or two six bit characters
packed together. The data in any word are of course just some particular patterns of
twelve 1s or 0s. The interpretation of these binary patterns depends solely on how they
are used by the program. The following are examples of different binary patterns and
their alternative interpretations:

2.1.1. Instructions

    bit:    0  1  2 | 3  4  5  6  7  8  9 10 11
            opcode  | interpretation of rest of word
                    | depends on opcode

e.g.

            0  0  1 | 0  0  0  0  1  0  0  0  0
            opcode  | addressing mode and location

    binary value 001000010000
    octal value 1 0 2 0
    interpretation as an instruction: TAD 20
    meaning add the contents of location 20 to
    the accumulator register.

            1  1  0 | 0  0  0  0  1  1 | 1  1  0
            opcode  | device identification | function

    binary value 110000011110
    octal value 6 0 3 6
    interpretation as an instruction: KRB
    meaning read the contents of the keyboard
    buffer into the accumulator and
    clear keyboard flag.


2.1.2. Characters

    bit:    0  1  2  3  4  5 | 6  7  8  9 10 11
            1st six bit character | 2nd six bit character

e.g.

            0  0  1  0  0  0 | 0  1  0  0  0  0

    binary value 001000010000
    octal value 1 0 2 0
    interpretation as characters: HP

            1  1  0  0  0  0 | 0  1  1  1  1  0

    binary value 110000011110
    octal value 6 0 3 6
    interpretation as characters: 0^
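The packing of two six bit characters into one word can be sketched as follows. The character code assumed here is the usual "stripped ASCII" six-bit code (codes 00-37 octal stand for ASCII 100-137, codes 40-77 stand for themselves); that choice is an assumption which happens to reproduce the "HP" example, and the function name unpack_sixbit is hypothetical.

```python
def sixbit_to_char(code):
    """Map a six-bit code to a character, assuming stripped ASCII."""
    # Codes 0o00-0o37 are the letters and brackets from ASCII 0o100-0o137;
    # codes 0o40-0o77 are space, punctuation and digits, unchanged.
    return chr(code + 0o100) if code < 0o40 else chr(code)

def unpack_sixbit(word):
    """Split a twelve-bit word into its two six-bit characters."""
    first = (word >> 6) & 0o77    # bits 0-5 (the high-order half)
    second = word & 0o77          # bits 6-11 (the low-order half)
    return sixbit_to_char(first) + sixbit_to_char(second)

print(unpack_sixbit(0o1020))
print(unpack_sixbit(0o6036))
```

The same word that decodes as the instruction TAD 20 decodes here as a pair of characters; nothing in the bit pattern itself says which reading is intended.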


2.1.3. Unsigned Numbers

e.g.

    bit:    0  1  2  3  4  5  6  7  8  9 10 11
            0  0  1  0  0  0  0  1  0  0  0  0

    binary value 001000010000
    octal value 1 0 2 0
    interpretation as a number: octal 1020
        1 x 512 + 0 x 64 + 2 x 8 + 0 x 1
        = 512 + 16
        = 528 decimal

            1  1  0  0  0  0  0  1  1  1  1  0

    binary value 110000011110
    octal value 6 0 3 6
    interpretation as a number: octal 6036
        6 x 512 + 0 x 64 + 3 x 8 + 6 x 1
        = 3072 + 24 + 6
        = 3102 decimal

2.1.4. Two's complement signed numbers
(As will be discussed in lectures, there are several different conventions regarding
how negative numbers should be represented in a computer. The PDP-8 uses two's
complement notation as do most, but by no means all, other modern computers).

e.g.

    bit:    0  1  2  3  4  5  6  7  8  9 10 11
            0  0  1  0  0  0  0  1  0  0  0  0

    binary value 001000010000
    octal value 1 0 2 0
    interpretation as a number: 528 decimal

            1  1  0  0  0  0  0  1  1  1  1  0

    binary value 110000011110
    octal value 6 0 3 6
    interpretation as a number: 6036 octal
    is 2's complement of 1742 octal
        i.e. -(1 x 512 + 7 x 64 + 4 x 8 + 2 x 1)
        = -(512 + 448 + 32 + 2)
        = -(994)
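The unsigned and two's complement readings of a word differ only in how the leading bit (bit 0) is treated. A minimal sketch, with hypothetical function names, that reproduces the worked values above:

```python
WORD_MASK = 0o7777   # twelve bits
SIGN_BIT = 0o4000    # bit 0, the most significant bit

def as_unsigned(word):
    """Read a twelve-bit pattern as a number in the range 0..4095."""
    return word & WORD_MASK

def as_signed(word):
    """Read a twelve-bit pattern as two's complement, -2048..+2047."""
    word &= WORD_MASK
    # With the sign bit set, the pattern stands for (word - 4096).
    return word - 0o10000 if word & SIGN_BIT else word

for pattern in (0o1020, 0o6036):
    print(oct(pattern), as_unsigned(pattern), as_signed(pattern))
```

For octal 6036 the two readings are 3102 and 3102 - 4096 = -994, in agreement with taking the two's complement of octal 1742.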


2.2. The Central Processing Unit (CPU).

As noted earlier, there are three subsystems within the central processing unit:
the control unit, the arithmetic logic unit and the
registers. The simplest of these subsystems is the "registers" subunit.

CENTRAL PROCESSING UNIT

    control unit:
        pc     000010000011
        (flags       000000)
        ir     001010001000
        mar    000010001000
        mdr    000000000011
    arithmetic logic unit
    registers:
        acc    000000000001
        link   0    (really part of "flags")

2.2.1. The registers.
Data being manipulated by a program are held in the registers. Some, or all, of
the inputs to and outputs from the arithmetic logic unit must be routed via registers. A
computer's "registers" constitute a kind of smaller but much faster version of main
memory. The time needed to access data in a register will typically be less than a
tenth of the time needed to access data in main memory. Most machines have several
registers, anything from four to sixteen. Registers are used to hold those data that are
of greatest import at any one stage in a program. Typically, they are used to hold
counters for "for-loops", temporary results from computations, indices (subscripts) for
arrays.
Some machines build restrictions into register usage so that, for example,
particular registers can only be used as array indices. The task of the assembly language
programmer, or of the compiler processing a high level language program, is to
produce code such that efficient use is made of the available registers. Life is simpler on
the PDP-8; there is only one data register --- the accumulator (acc). (The "link" can
be regarded variously as an extra 1-bit register, or as an extension of the accumulator
or as one bit of the "flags" register in the control unit (see below)).

The PDP-8's Registers:

[Figure: the twelve-bit accumulator "acc", bits numbered 0 to 11, together with the
one-bit link "L".]

The PDP-8's one twelve-bit "acc" register has to be used for all aspects of a
computation. For example, in a simple program loop summing the elements of an
array the "acc" is employed as follows. First, the loop index is loaded into the acc and
"compared" with the loop limit. (Since there is no "compare" instruction, this operation
in fact entails a sequence of instructions to compute the difference between current
and limit values for the loop index. This difference is left in the acc; the difference will
be negative if the loop has not terminated). If the loop has not terminated (as
determined by testing for a positive or negative value in the acc), then the loop index is
again loaded into the acc and combined with the address of the start of the array to
derive the address of the particular element required. This computed address is then
stored in some temporary location in memory. Then, making use of the address just
calculated and stored, the required data element is loaded into the accumulator. The
running total is added in and the result stored back into memory. Finally, the program
would have a loop back to the point where the acc was again loaded with the loop
index for the termination test. Obviously, when only a single register is available to
perform each and every one of these tasks, much of the program will comprise code to
load and store values from that register.
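The traffic through the single accumulator in such a loop can be made visible with a short Python sketch. It follows the steps described above but is not PDP-8 code: the scratch locations INDEX, LIMIT, TEMP and TOTAL are hypothetical, the increment of the loop index is an assumed extra step, and the counter simply tallies each load into or store out of the acc.

```python
WORD_MASK = 0o7777
INDEX, LIMIT, TEMP, TOTAL = 0, 1, 2, 3    # hypothetical scratch addresses

def sum_array(memory, base, count):
    """Sum count words starting at address base, using one accumulator."""
    memory[INDEX], memory[LIMIT], memory[TOTAL] = 0, count, 0
    transfers = 0                                  # loads and stores of the acc
    while True:
        acc = memory[INDEX]; transfers += 1        # load the loop index
        acc = acc - memory[LIMIT]                  # "compare": index - limit
        if acc >= 0:                               # not negative: loop finished
            break
        acc = memory[INDEX]; transfers += 1        # reload the loop index
        acc = acc + base                           # address of wanted element
        memory[TEMP] = acc; transfers += 1         # park the computed address
        acc = memory[memory[TEMP]]; transfers += 1 # fetch element via that address
        acc = (acc + memory[TOTAL]) & WORD_MASK    # add in the running total
        memory[TOTAL] = acc; transfers += 1        # store the new total
        acc = memory[INDEX]; transfers += 1        # (assumed) step the index
        acc = acc + 1
        memory[INDEX] = acc; transfers += 1
    return memory[TOTAL], transfers

memory = [0] * 64
memory[16:20] = [5, 7, 11, 13]
total, transfers = sum_array(memory, 16, 4)
print(total, transfers)
```

Seven accumulator transfers per element, for one addition of real work, is the overhead that the paragraph above describes.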

2.2.2. The Arithmetic Logic Unit (ALU).

Data are only manipulated in the arithmetic logic unit. This unit contains many
specialized circuits. There will for example be an "adder" circuit that can add together
two binary numbers; other circuits will perform boolean operations such as "anding" or
"oring" together two binary patterns. More costly computers incorporate more circuits
in the arithmetic logic unit; there could be a circuit for multiplying integers or even
circuits capable of processing "real" numbers.
One of the features by which computers can be differentiated is the flexibility of
the mechanisms for feeding data into the arithmetic logic unit and for directing the
results into storage. The Data General NOVA computer is an example of a machine
with a simple but restrictive mechanism. On the NOVA, only data in high speed
registers can be passed to the arithmetic logic unit and the results must be returned in one
of the input registers. More typically, machines allow also data fetched from memory
to be combined with data in a register with the result being placed in that register.
Some machines are still more flexible.
The PDP-8 has the simplest and most restrictive form of arithmetic logic unit.
There are really only two circuits --- an adder, and a circuit for performing a boolean
"and" operation. These circuits combine data in the acc with data fetched from
memory leaving the results in the acc.
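The two circuits can be modelled as two small functions on twelve-bit values. This is a sketch under stated assumptions: the names alu_add and alu_and are hypothetical, and returning the carry out of the adder as a separate value is an illustrative choice (how the carry relates to the link is not covered in this section).

```python
WORD_MASK = 0o7777

def alu_add(acc, operand):
    """Twelve-bit adder: return (result, carry_out)."""
    total = (acc & WORD_MASK) + (operand & WORD_MASK)
    return total & WORD_MASK, total >> 12

def alu_and(acc, operand):
    """Bit-by-bit boolean "and" of two twelve-bit patterns."""
    return acc & operand & WORD_MASK

print(alu_add(0o1020, 0o6036))
print(alu_add(0o7777, 0o0001))
print(oct(alu_and(0o1020, 0o6036)))
```

In both cases one input is the acc and the other a word fetched from memory, and the result replaces the contents of the acc.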

2.2.3. The Control Unit.
The final component of the cpu is the control unit. This comprises a number of
registers containing information that define the state of execution of a program along
with circuitry that identifies what instruction must be performed next and how to
perform that instruction.

2.2.3.1. The Control Unit's Registers.

    pc     000010000011
    (flags       000000)
    ir     001010001000
    mar    000010001000
    mdr    000000000011

Memory Address Register ("mar") and Memory Data Register ("mdr").
The control unit registers with the simplest applications are the memory address
register, "mar", and the memory data (buffer) register, "mdr" ("mbr"). The role for
these registers is to interface between the cpu circuits and the bus. If the cpu wants
some datum, either a program datum or an instruction, then it loads the address of the
memory location containing the required datum into the mar and puts a "read
memory" signal onto the control lines of the bus. The memory unit responds by going
to the location defined by the value in the mar register, fetching a copy of the contents
of that location and returning this copy, over the bus, to the mdr. Once the datum has
thus been obtained, the cpu circuits route it internally either to the instruction register,
"ir", if it is to be interpreted as an instruction, or via the arithmetic logic unit through to
the accumulator if the datum represents something that the program is to manipulate.
Data being written to main memory also pass via the mdr and thence onto the bus.
Similar conventions pertain when transferring data between the cpu and peripheral
device controllers.
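The read and write sequences through the mar and mdr can be sketched as a small Python class. The class and method names are hypothetical, and the bus is reduced to a direct array access; the point is only the order in which the two registers are loaded.

```python
WORD_MASK = 0o7777

class MemoryInterface:
    """The mar/mdr pair standing between the cpu circuits and memory."""

    def __init__(self, memory):
        self.memory = memory
        self.mar = 0    # memory address register
        self.mdr = 0    # memory data register

    def read(self, address):
        self.mar = address                   # cpu loads the address into mar
        self.mdr = self.memory[self.mar]     # memory returns a copy into mdr
        return self.mdr                      # cpu then routes it to ir or acc

    def write(self, address, value):
        self.mar = address                   # where the datum is to go
        self.mdr = value & WORD_MASK         # datum passes via the mdr
        self.memory[self.mar] = self.mdr     # and thence to memory

interface = MemoryInterface([0] * 4096)
interface.write(0o20, 0o1020)
print(oct(interface.read(0o20)), oct(interface.mar), oct(interface.mdr))
```

After either operation the mar and mdr still hold the last address and datum transferred, which is what a display of these registers would show.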


Instruction Register ("ir").
The instruction register holds the bit pattern representing the instruction currently
being executed. Through their programming experience in PASCAL, students are
familiar with a sort of macroscopic "instructions" such as the additions/multiplications etc
involved in expression evaluation, assignment "instructions" and procedure call
"instructions". When programming at assembly language level, students have to learn
to define the corresponding operations at a more microscopic level where, for
example, even a simple PASCAL assignment statement/"instruction", e.g. a:=b;, expands
into three machine-level instructions (viz clear the acc, add contents of memory
location "b" to the acc, store the contents of acc to memory location "a"). Execution of a
program involves i) fetching each successive instruction into the ir register, ii)
decoding the fetched instruction to determine exactly what it specifies, iii) execution
of the specified sequence of data manipulations.
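The three-instruction expansion of a:=b; and the fetch, decode, execute steps can be put together in a short sketch. Several things here are assumptions rather than statements from this section: the encodings 7200 (clear the acc) and 3xxx (store the acc) are the standard PDP-8 ones, the standard store instruction also clears the acc afterwards, only page-0 addresses with both addressing-mode bits zero are handled, and the locations chosen for a, b and the program are arbitrary.

```python
WORD_MASK = 0o7777

def run(memory, pc, steps):
    """Execute `steps` instructions starting at `pc`; return (acc, pc)."""
    acc = 0
    for _ in range(steps):
        ir = memory[pc]                  # i) fetch into the instruction register
        pc = (pc + 1) & WORD_MASK        #    straight-line code: next word
        opcode = ir >> 9                 # ii) decode: the top three bits
        address = ir & 0o177             #    page-0 offset (mode bits assumed 0)
        if ir == 0o7200:                 # iii) execute: clear the acc
            acc = 0
        elif opcode == 1:                #     add a memory word to the acc
            acc = (acc + memory[address]) & WORD_MASK
        elif opcode == 3:                #     store the acc, then clear it
            memory[address] = acc
            acc = 0
        else:
            raise ValueError("instruction not handled in this sketch")
    return acc, pc

memory = [0] * 4096
B, A = 0o20, 0o21                        # hypothetical locations of b and a
memory[B] = 42
memory[0o200:0o203] = [0o7200, 0o1000 | B, 0o3000 | A]   # a := b
acc, pc = run(memory, 0o200, 3)
print(memory[A], oct(pc), acc)
```

Each pass round the loop corresponds to one instruction cycle; the simulator's displays show the same steps at a finer grain.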


Program Counter ("pc").
Another concept, already developed through the programming of PASCAL loops
and conditional statements, is that of the "locus" of control. Students are familiar with
the idea of something --- "the computer" --- stepping through a sequence of PASCAL
statements comprising a program. Physically, this locus is realized in the form of the
program counter "pc". The contents of the "pc" register is the address in memory
containing the next instruction to be executed. For straight-line code the program counter
can simply be increased after each instruction is fetched so that it contains the next
instruction address. (On the PDP-8, all instructions occupy exactly one word of
memory each, so that for straight-line code the pc can be incremented by 1 each
time).
Transfers of control, when encoding loops or for jumping around code that is only
conditionally executed (as in "if ... then begin ... ; ... ; ... end" etc), are a bit more
complex. Basically, the address of the next instruction to be executed has to be
calculated, or retrieved. This computed address is loaded into the pc.
Subroutine calls, (procedure calls), are even more tiresome. When calling a
subroutine it is not sufficient merely to transfer control to that bit of code (as achieved by
loading the pc with the address of the subroutine); somehow a mechanism must be
provided to get back to the main calling procedure, resuming execution at that
instruction immediately following the subroutine call. The mechanisms that are provided for
adjusting the pc across subroutine calls constitute another of the more obvious ways
for differentiating between various designs of computers. The PDP-8 adopts a
peculiarly crude approach adequate only for the simplest applications. This subroutine call
mechanism is detailed later.
Flags register.
Another typical constituent register of the control unit in a cpu is the "flags"
register. This comprises a set of one-bit flags each indicating various status settings.
Some of these flags might record the results of previously performed comparison
operations on machines with explicit compare instructions. Others might detail
information regarding the status of the bus and its use by peripheral devices. The PDP-8,
at least its early variants, does not really possess such a "flags" register.
2.2.3.2. Instructions, their format and decoding.
On the PDP-8, the formats of instructions are simple. Three bits of an instruction
word, bits 0-1-2, identify the actual instruction to be performed. With three bits it is
possible to represent eight different binary patterns, viz 000, 001, ..., 111 (or 0-7 octal).
Correspondingly, the machine has eight basic instructions. The different binary
patterns, interpreted as representing instructions, are referred to as "opcodes".
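As a small illustration, the opcode can be separated from the rest of an instruction word by shifting and masking. The sketch below is in Python rather than PASCAL and is not part of the simulator; the function names are invented for this example. It assumes the usual PDP-8 convention that bit 0 is the leftmost bit of the word. The mnemonics "isz" and "jms" are the standard PDP-8 names for the two opcodes that the text has not yet introduced.

```python
# Mnemonics indexed by opcode. "and", "tad", "dca", "jmp", "iot" and "opr"
# are named in the text; "isz" and "jms" are the standard PDP-8 names for
# opcodes 2 and 4.
MNEMONICS = ["and", "tad", "isz", "dca", "jms", "jmp", "iot", "opr"]


def opcode(word):
    """Return bits 0-1-2 (the three leftmost bits) of a 12-bit word."""
    return (word >> 9) & 0o7


def remaining_bits(word):
    """Return bits 3-11 (the nine rightmost bits) of a 12-bit word."""
    return word & 0o777


for word in (0o1207, 0o3211, 0o7402):
    print(f"{word:04o}: opcode {opcode(word)} ({MNEMONICS[opcode(word)]}), "
          f"remaining bits {remaining_bits(word):03o}")
```

Because each octal digit stands for three bits, the opcode is simply the first octal digit of the instruction word.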
One of these eight basic instructions, "iot", is used to specify control signals for
peripheral devices. Consideration of iot instructions is deferred until later.
Six of the remaining seven instructions are basically similar in form. These are
the "memory reference" instructions. They use the nine remaining bits of the
twelve-bit instruction word to identify a memory location. This location might constitute the
source from which data are to be fetched when performing an addition or an "and"
operation, or might represent the destination into which the current contents of the
accumulator register are to be copied. In a "jmp" (i.e. goto) instruction, the address
bits of the instruction word will specify the memory location containing the next
instruction.
The final "instruction", "opr", really comprises two whole families of instructions
for manipulating data in the acc and link registers. Some of these "operate"
instructions involve clearing (i.e. setting to zero) the acc and/or link registers, complementing
the bits in these registers (all binary 1s become 0s and vice versa) and rotating the bit
patterns around.
Other operate instructions implement a rather restricted form of conditional
branch instruction. These "skip" instructions are limited in that they allow the program
to branch around only one instruction! A typical skip instruction, "sna" (skip on
non-zero accumulator), involves testing to see if the acc is non-zero; if so the program
counter will be incremented causing the immediately succeeding instruction to be
skipped over. In the same group as the skip instructions there is the "hlt" (halt)
instruction; this stops the computer at the end of a program.
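The effect of a skip instruction on the program counter can be sketched in a few lines. This is an illustrative Python fragment, not code from the simulator; the function name is invented. It assumes that the pc has already been incremented during the instruction fetch, so that it addresses the instruction that would normally be executed next.

```python
def sna(acc, pc):
    """Skip on non-zero accumulator: return the pc to use next.

    `pc` is assumed to have been incremented by the fetch step already,
    so one further increment steps over exactly one instruction.
    """
    if acc != 0:
        pc = (pc + 1) % 4096   # 12-bit program counter wraps round
    return pc


print(f"{sna(5, 0o203):04o}")   # acc non-zero: instruction at 0203 is skipped
print(f"{sna(0, 0o203):04o}")   # acc zero: execution continues at 0203
```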

2.2.3.3. The Fetch-Decode-Execute Cycle.
The circuits both for identifying the instruction to be performed and for performing
instructions can best be conceived in terms of a stored "program". (In fact, that is
quite often how the circuits are implemented.) This "program" defines a sequence of
data transfers between specified registers. This control program can be envisaged as
being something of the form:
    repeat
        fetchinstruction;
        decodeinstruction;
        executeinstruction;
    until halted;

The "halted" flag gets set when the machine executes a "halt" instruction.
The individual procedures, fetchinstruction, decodeinstruction, and
executeinstruction, will comprise code that specifies how data is to be transferred between
various registers of the cpu and locations in memory. Thus, fetchinstruction could be
something like:
    procedure fetchinstruction;
    begin
        { Copy program counter to mar                      }
        mar:=pc;
        { Send request to memory, via bus, for contents    }
        { of location specified by mar                     }
        frommemory;
        { Copy retrieved datum from mdr to ir              }
        ir:=mdr;
        { Increment pc so that it's pointing at next       }
        { instruction,                                     }
        { i.e. pass 1 & contents of pc to adding circuit   }
        {      truncate the result to 12-bits              }
        {      store truncated result back in pc           }
        pc:=add(pc,1) mod 4096;
    end;
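For readers who would like to experiment, the same sequence of register transfers can be tried out in a few lines of Python. This is only a sketch: the class `Machine` and its method names are invented here and are not taken from the simulator, and the memory size of 512 words is an assumption based on the four 128-word pages of the basic simulator.

```python
class Machine:
    """A toy model holding just the registers used during a fetch."""

    def __init__(self, size=512):
        self.memory = [0] * size   # assumed size: four pages of 128 words
        self.pc = 0o200            # program counter
        self.mar = 0               # memory address register
        self.mdr = 0               # memory data register
        self.ir = 0                # instruction register

    def from_memory(self):
        # Memory read: the word addressed by mar arrives in mdr.
        self.mdr = self.memory[self.mar]

    def fetch_instruction(self):
        self.mar = self.pc                # copy program counter to mar
        self.from_memory()                # read the addressed location
        self.ir = self.mdr                # copy retrieved datum to ir
        self.pc = (self.pc + 1) % 4096    # increment pc, truncated to 12 bits


m = Machine()
m.memory[0o200] = 0o1206
m.fetch_instruction()
print(f"ir={m.ir:04o} pc={m.pc:04o} mar={m.mar:04o}")
```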

The procedure for decoding an instruction consists, primarily, of a big "case"
statement. The three bits, 0-1-2, containing the opcode must be abstracted from the ir
register. This opcode, 0-7 octal, defines the branch of the case statement appropriate
for the particular instruction to be executed. Thus, opcode 0 signifies that an "and"
instruction is needed, while opcode 7 specifies an "opr" operate instruction.
The interpretation of the remaining bits of the instruction word depends on the
particular instruction being executed. For instructions like "and", "tad" (i.e. add), "jmp"
(i.e. jump or goto) and "dca" (i.e. deposit contents of acc in memory and then clear
acc) the remaining bits in the ir register will specify, directly or indirectly, the address
of the memory location to be used. For these instructions, the rest of the decoding
process consists of resolving exactly what memory address is being referenced and
getting this address into the mar register. In an "iot" instruction, the remaining bits
will specify which device controller is to receive a command signal and, also, will
identify the particular command signal that must be sent. The remaining nine bits in
an operate instruction specify the particular bit-manipulations or skip-tests that must
be performed on the contents of the acc and link.
Finally, once the instruction has been fully decoded it must be executed.
Execution can again be described in terms of a program specifying transfers between
registers. A simple example is provided by the PDP-8's "dca" instruction. The effect of this
instruction is to store the current contents of the acc in a specified memory location
and to clear the acc. The instruction decoding process will have identified the
instruction as being "dca" (from its particular opcode 011-binary 3-octal), and will have
decoded the address bits to derive the required address which will be held in the mar
register. The actual execution of the "dca" instruction can be defined in terms of the
following micro-program for manipulating the contents of cpu registers and store:
    { Execute dca instruction                              }
    { (instruction decoding stage has already set the      }
    { address of the memory location in the mar register)  }
    { First, copy contents of acc into mdr register        }
    mdr:=acc;
    { Now send signal over bus to memory telling memory    }
    { unit to store the contents of mdr register in        }
    { the location specified by the address in mar         }
    tomemory;
    { Now clear the acc                                    }
    acc:=0;
    { Finished execution of dca                            }

The control unit will have such micro-programs corresponding to each possible
instruction of the computer.
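The complete cycle can be pulled together in a short sketch. The Python fragment below is a much reduced illustration and not the simulator itself: the function name `run` is invented, only "and", "tad", "dca", "jmp" (with direct addressing) and "hlt" are handled, and the addressing rule used (one bit of the address field selecting between page 0 and the page holding the instruction, with a 7-bit offset) is the standard PDP-8 convention, assumed here ahead of its discussion later in this note.

```python
def run(memory, pc=0o200):
    """Fetch, decode and execute until a hlt instruction is met."""
    acc, link, halted = 0, 0, False
    while not halted:
        # Fetch: read the instruction and step the program counter.
        ir = memory[pc]
        pc = (pc + 1) % 4096

        # Decode: isolate the opcode and, if needed, the operand address.
        op = (ir >> 9) & 0o7
        mar = None
        if op in (0, 1, 3, 5):
            if ir & 0o400:
                raise NotImplementedError("indirect addressing not sketched")
            offset = ir & 0o177
            # Page bit set: same page as the instruction; clear: page 0.
            mar = (((pc - 1) & 0o7600) | offset) if ir & 0o200 else offset

        # Execute: one small "micro-program" per instruction.
        if op == 0:                      # and
            acc &= memory[mar]
        elif op == 1:                    # tad
            total = acc + memory[mar]
            if total > 0o7777:           # carry out complements the link
                link ^= 1
            acc = total & 0o7777
        elif op == 3:                    # dca
            memory[mar] = acc
            acc = 0
        elif op == 5:                    # jmp
            pc = mar
        elif ir == 0o7402:               # hlt
            halted = True
        else:
            raise NotImplementedError(f"instruction {ir:04o} not sketched")
    return acc, link, pc


memory = [0] * 512
memory[0o200:0o212] = [0o1206, 0o1207, 0o1210, 0o3211, 0o1211, 0o7402,
                       0o0001, 0o0002, 0o0003, 0o0000]
acc, link, pc = run(memory)
print(f"acc={acc:04o} link={link} pc={pc:04o} sum={memory[0o211]:04o}")
```

The words loaded into memory form a small program that adds three constants, stores the sum and halts.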


3. The Basic Simulator.

3.1. The simulator and its display.
At this point, it is best to look at the simulator program. Details of how to run this
program under UNIX are given elsewhere. Essentially, the simulator expects to read in
from a file, the "object" file, a previously prepared and encoded definition of the
sequence of PDP-8 instructions that comprise the program to be run. These data are
read in; the user is required to specify a display speed appropriate for the terminal in
use and to indicate whether the program is to run continuously or is to pause between
each stage in the instruction fetch-decode-execute cycle. Once appropriate control
parameters have thus been specified, an initial display of the state of the simulated
machine is presented. This display is continuously updated as the simulated machine
executes the program provided.
The general form of the display is shown below. Bold type has been used to
indicate those fields that are highlighted on the screen (through the "inverse-video"
display capability). Fields containing asterisks are filled with specific octal data in real
displays. The fields <translated instruction>, <Major stage of instruction> and <minor
stage of instruction execution> contain text defining exactly what actions are currently
being performed. The field <single step prompt> contains a prompt when the machine
is being run in single-step mode and is waiting for a user-response before continuing
operations.

+--------------------------------------------------------------
|C.P.U.
|    acc [****]   link [*]        pc [****]   ir [****]
|    mar [****]   mdr  [****]     <translated instruction>
|
|    <Major stage of instruction>
|    <minor stage of instruction execution>
|
|BUS:    control ****    address ****    data ****
|
|MEMORY
|    Address  Contents
|     ****     ****
|     ****     ****
|     ****     ****     ****
|     ****     ****
|     ****     ****
|
|    <single step prompt>
+--------------------------------------------------------------

There are three components in the display. The top portion of the screen shows
data defining the state of the cpu along with the textual descriptions of current activity.
The central portion of the screen shows the status of the bus. This portion of the
display defines the last data transfer between the cpu and memory, specifying whether
a memory READ or WRITE operation was performed, the address of the referenced
memory location and the value for the data transferred.
Finally, in the third component of the display, there is a "window" into memory.
Even in the basic simulator, the simulated PDP-8 possesses more than five hundred
"words" of memory. It is obviously impractical to simultaneously display the contents of
all memory locations. Instead, the "window" into memory shows that location to which
access, for reading or writing, has most recently been made. The two preceding and
two subsequent memory locations are also shown. If the most recent memory access
was a READ for an instruction fetch, then the memory window will show the sequence
of instructions currently being executed. If the last memory access had been a WRITE
to a location representing some element of a vector being processed in some loop
then the five memory locations shown would represent the array element, the two
elements previously processed and the elements still to be processed in the loop. The
memory window changes at each memory access.
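The choice of addresses for the memory window amounts to a small calculation, sketched below in Python. The function name is invented for this illustration. The memory size of 512 words follows from the four 128-word pages of the basic simulator; the behaviour at the two ends of memory is purely an assumption, since this note does not say what the simulator does there.

```python
MEMORY_SIZE = 512   # four pages of 128 words


def memory_window(address, size=MEMORY_SIZE):
    """Return the five addresses shown for an access to `address`.

    Normally these are the referenced location, the two before it and the
    two after it. Assumption: near either end of memory the window is
    shifted so that it still shows five valid addresses.
    """
    first = min(max(address - 2, 0), size - 5)
    return list(range(first, first + 5))


print([f"{a:04o}" for a in memory_window(0o207)])
print([f"{a:04o}" for a in memory_window(0o202)])
```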
A typical instantaneous display is shown below. This example represents the state
of the machine when it is part way through executing a particular add instruction. This
add instruction was at memory location 0201 (all values given are in octal); the
program counter has already been incremented and identifies 0202 as the memory
location containing the next instruction to be executed. The instruction register contains
the value 1207; just below the pc/ir registers the display shows the translation, i.e. tad
0207 (meaning add the contents of memory location 0207 to the current contents of
the accumulator). The acc currently holds the octal value 1; the link is zero.

+--------------------------------------------------------------
|C.P.U.
|    acc [0001]   link [0]        pc [0202]   ir [1207]
|    mar [0207]   mdr  [0002]     tad 0207
|
|    Executing Instruction
|    Data Fetch
|
|BUS:    control READ    address 0207    data 0002
|
|MEMORY
|    Address  Contents
|     0205     7402
|     0206     0001
|     0207     0002     0002
|     0210     0003
|     0211     0000
|
|    To continue press RETURN
+--------------------------------------------------------------

Execution of the instruction has reached the point where the required datum, i.e.
the contents of memory location 0207, has been fetched from memory. The text
descriptions identify the major stage as being "Executing Instruction" with the minor
stage as "Data Fetch". The mar register identifies location 0207 as the last memory
element referenced. The bus display shows that the most recent transfer was a READ
from memory, at address 0207. The memory window shows locations 0205-0211 octal.
The referenced address is highlighted, and the value read from that address repeated
to the right hand side of the column showing the contents of the displayed memory
locations. This value, 0002, is also indicated as being the last data element transferred
over the bus, from where it has been copied into the mdr. As values are transferred
into, or out from, registers, the corresponding display fields are briefly highlighted.
Since the program was being run in single-step mode, execution has been suspended
at this point awaiting a response from the user.
On receiving a response from the user the program would continue. The contents
of the mdr and acc registers (i.e. 0002 and 0001) would be (conceptually) passed to
the adding circuit of the ALU of the simulated machine (the ALU is not incorporated in
the display). From the ALU, the result 0003 of this addition would be routed into the
acc. Execution of this add instruction would then be complete. The simulator would
proceed then to the fetch cycle for the next instruction.
The display would change appropriately. The major cycle would be identified as
instruction fetch. The contents of the pc, 0202, would be seen to be copied to the mar
and a memory read access performed on the appropriate location. The memory
display would change to show locations 0200-0204 and the appropriate datum, i.e. the
bit pattern representing the next instruction, would be moved from memory to the mdr.
Once again, if in single step mode, the simulation would pause for user response.
3.2. An example program, and conventions for arranging programs in the
memory of a PDP-8

The program that the machine was executing, when this display recording was
made, was:

    location    instruction

    0200        add the contents of memory location 0206
                to the current contents of the acc
    0201        add the contents of memory location 0207
                to the current contents of the acc
    0202        add the contents of memory location 0210
                to the current contents of the acc
    0203        store the resulting sum in memory location
                0211, and clear the acc
    0204        reload sum from location 0211
    0205        halt
    0206        the constant, 1
    0207        the constant, 2
    0210        the constant, 3
    0211        zero, (for the sum)

This program fragment introduces some assumptions. Why, for example, the start at
location 0200? What did the acc contain before we added in the contents of memory
location 0206?
In the general case, deciding where a program is to be executed in memory is a
rather onerous task. Typically, a computer will be time-shared; programs for many
different users will be in various parts of the machine's main memory with cpu-attention
being swapped around between them. If you specify particular memory locations, like
0206 as in the example above, then obviously your program will only execute correctly
if your data is loaded into the real memory location 0206. If this location happens to be
occupied by some other program then you must wait before you can run your program.
Since generally you would want to have programs in all parts of memory you would
have to establish some arrangement that particular programs go at particular
addresses. Such an arrangement is hopelessly inconvenient. Consequently, it is
normal to try to defer the decision as to where a program is to be run for as long as
possible. Usually, one pretends that the program starts at some fixed location such as 0 or
40000000-octal or whatever; all memory addresses are then described relative to this
presumed starting point. Then, just before the program is executed (or possibly during
execution), these relative addresses are converted to real machine addresses
appropriate for wherever it is that the program has actually been stored in memory.
This adjustment of addresses in the text of the program can be effected by hardware,
or by software or by a combination of both.
Life of course is easier on the PDP-8. Usually such a simple machine is
dedicated to a specific application and then it is possible to choose appropriate addresses
and incorporate these in the program. There are certain conventions about how
memory is utilized. For reasons that hopefully will become clearer later, the first 200
octal (128 decimal) locations, with addresses 0-177, are essentially the place where
one stores one's global variables. The next 200 locations, addresses 0200-0377, are
where the main program goes along with possibly one or two small subroutines.
Restrictions on how the PDP-8 can access its memory make it appropriate to
think in terms of 200-octal (128-decimal) word "pages". Page 0 is for the globals, page
1 for the main program; subsequent pages are for subroutines or data arrays. The
basic simulator has four such pages of memory. The simulator has a built in
presumption that the main program will commence at location octal-200, i.e. the first
word of page 1, and appropriately initializes the pc. It also zeros the acc and link
registers prior to executing the PDP-8 program read from file. On a real PDP-8
computer, there were small switches on the "operator's console" that allowed the user to
clear registers and appropriately initialize the program counter.
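Since a page holds 200-octal words, the page number and the position within the page can be read directly from an address: the page is the address divided by 200-octal, and the offset is the remainder. A minimal Python sketch, with an invented function name, is:

```python
PAGE_SIZE = 0o200   # 128 decimal words per page


def page_and_offset(address):
    """Split an address into (page number, offset within that page)."""
    return address // PAGE_SIZE, address % PAGE_SIZE


for address in (0o177, 0o200, 0o211, 0o400):
    page, offset = page_and_offset(address)
    print(f"address {address:04o}: page {page}, offset {offset:03o}")
```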

    +--------------+
    | 010101010101 |
    | 000100000000 |    Page 0, addresses 0..177, globals
    +--------------+
    | 111011000000 |
    | 001000010000 |    Page 1, addresses 200..377, main
    | 000000001010 |    program: start address 200.
    +--------------+
    | 000000000000 |
    | 110000011010 |    Page 2, addresses 400..577,
    | 000000000000 |    subroutines or data
    +--------------+


3.3. Representing a program to be executed.

The program that is to be executed by the simulator is not, of course, represented
in terms of textual definitions of instructions like "0200 add contents of memory location
0206 to acc". Instead, the "object" file read in by the simulator contains, as a set of
octal numbers, definitions of what memory locations are being used and the
instructions, or data constants, that are to go in these memory locations. The object file for
the example program given earlier reads as follows:
*0200
 1206
 1207
 1210
 3211
 1211
 7402
 0001
 0002
 0003
 0000
$

o
Lines beginning with an asterisk define the first address for a subsequent sequence of
instructions; there may be more than one such origin directive in an object file as
would, for example, be the case if there were some program text at 0200 and a large
pre-initialized data array starting at 0400. Lines beginning with a single space are
interpreted by the simulator as specifying data, instructions/constants/whatever, that
are to be placed in the next available word of memory. The dollar sign terminates this
section of the "object" file. The simulator contains a couple of routines to read in an
appropriately formatted file of octal numbers and store them in the array that
represents the memory of the simulated PDP-8.
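The reading of such a file is easily sketched. The Python fragment below is an illustration only; it is not the simulator's PASCAL code, the function name is invented, and it handles just the three kinds of line described above (origin lines, data lines and the terminating dollar sign). The memory size of 512 words is an assumption based on the four pages of the basic simulator.

```python
def load_object(text, size=512):
    """Return a memory array filled in from the text of an object file."""
    memory = [0] * size
    location = 0
    for line in text.splitlines():
        if line.startswith("$"):          # end of this section of the file
            break
        if line.startswith("*"):          # origin: set the next address
            location = int(line[1:], 8)
        elif line.startswith(" ") and line.strip():
            memory[location] = int(line.strip(), 8)   # octal word
            location += 1
    return memory


OBJECT_TEXT = "\n".join([
    "*0200",
    " 1206", " 1207", " 1210", " 3211", " 1211", " 7402",
    " 0001", " 0002", " 0003", " 0000",
    "$",
])

memory = load_object(OBJECT_TEXT)
print(" ".join(f"{memory[a]:04o}" for a in range(0o200, 0o212)))
```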
With a real computer, things tend to be a bit more complex. Firstly, a much more
compact encoding would be used for the object file. Here we use a sequence of
characters '1','2','0','7' to represent the 12-bit binary pattern '001010000111'. But each
such ordinary character takes up eight bits in itself so, really, we have a 32-bit
sequence conveying only 12 bits of information. Many more efficient encoding
schemes are possible. In other respects, our object file is reasonably realistic; any
real object file must convey the same information concerning addresses and their
contents.
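To give some idea of what a more compact encoding might look like, the sketch below packs two 12-bit words into three 8-bit bytes, so that no bits are wasted. This is just one of many possible schemes, chosen here for simplicity; it is not the format used by any particular PDP-8 loader, and the function names are invented.

```python
def pack_pair(first, second):
    """Pack two 12-bit words into three 8-bit bytes."""
    combined = (first << 12) | second            # 24 bits in all
    return bytes([(combined >> 16) & 0xFF,
                  (combined >> 8) & 0xFF,
                  combined & 0xFF])


def unpack_pair(data):
    """Recover the two 12-bit words from three packed bytes."""
    combined = (data[0] << 16) | (data[1] << 8) | data[2]
    return (combined >> 12) & 0o7777, combined & 0o7777


packed = pack_pair(0o1207, 0o7402)
print(len(packed) * 8, "bits carry two words;",
      2 * 4 * 8, "bits are needed as octal characters")
print([f"{w:04o}" for w in unpack_pair(packed)])
```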
Of course, unlike our simulator, a real computer can't have built in PASCAL
functions for reading in, "loading", the contents of an object file. However, it is possible on
almost any machine to write a short "loader" program that can read in object files from
some standard device and can fill out the appropriate memory locations with the data
representing instructions and constants. Rarely would such an "absolute" loader
require more than a score or so of instructions. It is a common expedient to utilize the
last few words of available memory to hold this loader program.


                        +--------------+
    <-- address 0       |              |
                        | USER PROGRAM |
                        |   AND DATA   |
                        |     HERE     |
                        |              |
                        | 000000000000 |  <-- last location available to user
                        +--------------+
    <-- address 7740    | 111011000000 |  System LOADER program for loading
                        | 011111111111 |  user programs from some standard
                        | 000010000000 |  input device.
                        +--------------+
                          MAIN MEMORY

Such an "absolute" loader represents about the simplest form of "systems" software on
a computer; as we elaborate this "operating system" we find that a rapidly decreasing
fraction of the memory of the computer is left for users' own programs.

4. Preparing Programs for the Simulator.
4.1. Assembly language programming with the "smap" assembler.
A program represented as a table of octal numbers may be quite appropriate for
a computer but it is barely comprehensible to a human. Of course one can learn to
associate a particular binary pattern, e.g. 000-binary 0-octal, with a particular
operation that the computer can perform (in this case an "and" instruction). But it is a lot
easier if the text of the program has some totally equivalent but more readily
comprehensible alternative representation, such as for example the word "and" if it's
an and instruction that we want.
It is in fact quite exceptional for anyone ever to have to write out a program for a
computer in terms of the binary patterns representing instructions (or their equivalent
octal or hexadecimal representations). Instead, one writes in "assembly" language.
There are no general assembly languages; each such language is unique to a
particular machine. However, most take essentially the same simple form. Each line of
the assembly language program represents information that will eventually correspond
to one instruction that can be executed by the computer. Each line is divided up into
"fields".
First, there is a label field (familiar to those with some exposure to FORTRAN (with
statement "labels"), or even BASIC (with line numbers)). A label can identify a
particular program statement and allow reference to be made to it elsewhere as, for example,
in a jump (goto) instruction. Labels are also used to identify those memory locations
employed for holding constants or program variables. The label field will frequently be
left blank (more like FORTRAN than BASIC).
For an instruction, the next "field" will hold, not the opcode itself, but instead
some name, or "mnemonic", for the opcode required in the current instruction.
Mnemonics are supposedly chosen to remind programmers of exactly what operations
a particular instruction entails. "and" is a fairly obvious mnemonic for a logical AND
instruction; the PDP-8's use of "tad" to designate the addition instruction seems odd
until one knows that the programmers who thought it up liked to be reminded that they
were using two's complement arithmetic (hence, two's complement addition).
What comes after the opcode's mnemonic depends on the particular instruction.
If the operation involves, for example, transfer of data between a cpu register and
memory then the next field(s) will contain some specification of the address of the
appropriate memory location.
Differences in assembly languages are fairly obvious and closely related to
specific machine characteristics. For example, in many ways the NOVA is like a PDP-8
but the NOVA has two accumulators, "A" and "B", where the PDP-8 has but one. On
the PDP-8, data can only be stored from the acc register; but on the NOVA data can
come from either "A" or "B" accumulators so, in any store instruction, one must
specify which register is being referenced. This information may be incorporated in
the opcode, e.g. have distinct STOREA and STOREB instructions, or may be specified
through some additional instruction field, e.g. STORE (A|B). Machines differ
significantly in the range of ways in which the required memory location may be specified,
so another area of difference in assembly languages is in those fields wherein the
address is defined.
Comments can be included in assembly programs. These are solely for the
benefit of the programmer and are thrown away before the program is ever prepared
for machine execution. However, given the intrinsic difficulty of reading an assembly
language program, comments are essential. (Maybe more essential than the code?
The code itself will soon be obsolete; good comments will at least tell the next guy
about a program that was once thought worth writing and then maybe he can rewrite
it.) Usually, comments are introduced by some special character (typically this
character might be '/', ';' or '''''); all text on a line following this comment marker is
considered to be comment. Some assembly languages make special provision for
comments on each line of code (everything after the 35th character, or other arbitrary
limit, gets considered as being a comment).
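The division of a source line into fields can be sketched quite simply. The Python fragment below is an illustration and is not taken from any real assembler; the function name is invented. It assumes the PDP-8 style conventions followed by the examples in this note: a label is terminated by a comma and a comment is introduced by '/'. The second field is called the "operation" here because it may hold either a mnemonic or, on a line that defines a constant, a number.

```python
def split_line(line):
    """Split one source line into (label, operation, operand, comment).

    Any field that is absent from the line is returned as None.
    """
    code, _, comment = line.partition("/")    # comment runs to end of line
    label = None
    if "," in code:                           # a comma terminates a label
        label, _, code = code.partition(",")
        label = label.strip()
    words = code.split()
    operation = words[0] if words else None
    operand = words[1] if len(words) > 1 else None
    return label, operation, operand, comment.strip() or None


print(split_line("loop,   tad consta    / add the first constant"))
print(split_line("        hlt"))
print(split_line("consta, 1"))
```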
A program written in assembly language has of course still to be converted into a
form that can be processed by a computer. This conversion is the task of an
"assembler". An assembler reads the source text of the program and generates from this an
object file, similar to those we have already considered, and a listing that details
exactly the instructions generated and the storage locations assigned both for those
instructions and data. The object file is for input to the loader of the computer that is to
execute the code. The listing is used as a reference by the programmer when
checking out the actual execution of the program.
The assembly language that one uses is really defined by the assembler
program. Different assemblers, all devised to produce code for the same machine, will
vary greatly in the facilities that they provide their users. One assembler might allow
users to include string constants in their programs, e.g. "Hello World", whereas
another assembler might require that the programmers specify each byte of such a
string as an octal or hexadecimal constant (i.e. 110; 145; 154; 154; ...). Such
differences in the assembler are manifest in the assembly language that is defined.
An assembly language program for our "add three numbers" example is as follows:
/ Add the values of the three constants in consta, constb,
/ and constc. Store the sum in "sum". Stop with the sum
/ in the acc.
*200
        tad consta
        tad constb
        tad constc
        dca sum
        tad sum
        hlt
consta, 1
constb, 2
constc, 3
sum,    0
$

It is more useful to provide comments that attempt to explain the purpose of the
subsequent section of code than to append a comment to each individual instruction. Quite
often, programmers first working in assembly language will, on being told to include
lots of comments, produce a program that reads:

*200
        tad a           / add a to accumulator
        tad b           / add b to accumulator
        tad c           / add c to accumulator
        dca d           / store accumulator in d and clear acc
        tad d           / add d to accumulator
        hlt             / halt
a,      1               / a = 1
b,      2               / b = 2
c,      3               / c = 3
d,      0               / d
$

While there are indeed many comments in the second version they convey little
information.
This fragment of assembly language is written in the format used by the "smap"
assembler that has been devised for processing students' PDP-8 programs. For smap,
comments should be introduced by '/'. smap accords no particular significance to the
column used for instructions; standard tab positions are quite convenient. Following
standard PDP-8 assembler notations, label names (which must begin with letters and
comprise six or fewer characters) are terminated by a comma ','.
Apart from the lines containing the six instructions, the program has three
constants and one variable. For smap, the values of the constants must be given as
(unsigned) octal numbers; the variable should be initialized to zero. In more
sophisticated assemblers, there are "pseudo-ops" or "assembler directives" that will, for
example, allow one to define a decimal constant or create a text string that is to be
stored as a sequence of characters in several successive memory locations. smap
does not implement any such "data definition" pseudo-ops.
In fact, smap only has two assembler directives. One is represented by the '$'
sign in the example. This is used to mark the end of the input so that smap knows that
it has read in the complete program. The other assembler directive implemented in
smap is an "origin" directive. An origin directive allows the programmer to specify
where a particular bit of code is to go (presuming of course that one can specify
absolute addresses). Here, we want the code to start at 0200. Following standard
PDP-8 assembler conventions, an asterisk '*' is used to identify an origin directive.
"*200" specifies that the next sequence of instructions go into locations starting at
memory location 0200.
The example also illustrates a minor difference in layout between PASCAL
programs and typical assembly language programs. In PASCAL, all variables are
declared before the code. Usually, though not invariably, assembly language programs
have a set of instructions followed by "declarations" of local variables. Some
programmers like to gather all variables together and place them subsequent to all code
sections, while others will keep each subroutine and its local variables grouped together.
Restrictions on how a program may address variables may preclude one or other of
these options on a particular machine. On the PDP-8, it will usually be more
appropriate to declare a subroutine's local variables immediately after its code segment.
4.1.1. The assembly process.
The operations of an assembler program will be considered in more detail in the
lecture course (some details may also be available in an appendix to this document).
Essentially, an assembler has to read through the source text of the user's program,
take cognizance of origin directives, find instructions and generate appropriate code.
We can consider, briefly, how an assembler might process the example program.
First, it can obviously ignore all comment lines (those beginning with '/'). On finding
an origin directive, e.g. *200, it must update whatever record it keeps of where in
memory code is to be placed. The first real instruction in the program being
assembled reads "tad consta".
The word "tad" can obviously be easily abstracted from this line. An assembler
has an internal table, its "symbol table", wherein there are definitions of, in effect,
reserved words. The assembler can look up "tad" in this table and confirm that it is a
valid instruction mnemonic, that it's a memory reference instruction and so should be
followed on the same line by some address specification, and that the opcode that
should be written to the object file is "1".
The assembler could then continue to process the same line and would isolate
the word "consta". Of course, this is the first time that this word has been encountered.
There is no definition for it in the symbol table. From the context, the assembler might
infer that it's intended to be the name to be accorded to some memory location but it
has no way of determining the appropriate address.
There are various ways of resolving problems of such "forward references"
(references to variables or program labels whose definitions have yet to be
encountered). The simplest solution is to read through the program text twice. On the first
pass, the assembler program just finds all variables, and program labels, and
determines their appropriate addresses. These data are inserted into the symbol table.
Then on the second pass, code can be generated because the assembler program
will by then possess the required addresses.
smap is an example of such a two pass assembler. On its first pass, smap will
note the origin directive *200. The next line, "tad consta", is recognizable as an
instruction; smap can ignore details and simply increment its location counter to 0201.
Each additional instruction is processed similarly and results just in an increase in the
location counter. When the line "consta, 1" is encountered the smap assembler will
identify a label (that comma makes it very easy to write a label detection routine). The
current value of the location counter is 0206; smap can add the information
"name=consta", "symboltype=label", "value=0206" to its symbol table. Similar
processing defines the labels "constb", "constc" and "sum" with values 0207, 0210 and 0211.
On its second pass, smap can now process an instruction like "tad consta". The
"tad" is recognized as a valid instruction mnemonic; the first three bits of the
instruction word being assembled can thus be set to 1-octal 001-binary (the corresponding
opcode). "consta" is recognized as a valid operand in an instruction's address field,
because it has been defined as a user label. Its value, 0206, can be used to fill out the
nine remaining address bits of the instruction word. Thus, the full instruction "1206"
can be generated.
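The two passes can be illustrated with a very small assembler written in Python. This sketch is not smap: the function names and the tables of mnemonics are invented for the illustration, only a handful of mnemonics are known to it, and no errors are checked. It also relies on a simplification that happens to hold for the example, namely that for a program on page 1 referring to locations on page 0 or page 1, the nine address bits are just the low nine bits of the operand's address.

```python
MEMORY_REFERENCE = {"and": 0o0, "tad": 0o1, "dca": 0o3, "jmp": 0o5}
OPERATE = {"hlt": 0o7402}


def statements(source):
    """Yield (label, body) for each line; comments and blanks are dropped."""
    for line in source.splitlines():
        line = line.split("/")[0].strip()
        if not line:
            continue
        if line == "$":                  # end of input
            return
        label = None
        if "," in line:                  # a comma terminates a label
            label, line = (part.strip() for part in line.split(",", 1))
        yield label, line


def assemble(source):
    # Pass 1: find every label and note the address it stands for.
    symbols, location = {}, 0
    for label, body in statements(source):
        if body.startswith("*"):
            location = int(body[1:], 8)
            continue
        if label:
            symbols[label] = location
        if body:
            location += 1

    # Pass 2: generate one word per instruction or constant.
    words, location = {}, 0
    for label, body in statements(source):
        if body.startswith("*"):
            location = int(body[1:], 8)
            continue
        if not body:
            continue
        fields = body.split()
        if fields[0] in MEMORY_REFERENCE:
            address = symbols[fields[1]]
            words[location] = (MEMORY_REFERENCE[fields[0]] << 9) | (address & 0o777)
        elif fields[0] in OPERATE:
            words[location] = OPERATE[fields[0]]
        else:
            words[location] = int(fields[0], 8)   # an octal constant
        location += 1
    return symbols, words


SOURCE = "\n".join([
    "/ Add the values of three constants.",
    "*200",
    "        tad consta",
    "        tad constb",
    "        tad constc",
    "        dca sum",
    "        tad sum",
    "        hlt",
    "consta, 1",
    "constb, 2",
    "constc, 3",
    "sum,    0",
    "$",
])

symbols, words = assemble(SOURCE)
for address in sorted(words):
    print(f"{address:04o} {words[address]:04o}")
```

Reading the source text twice is wasteful but keeps each pass very simple; that trade-off is the point of the two pass design.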
Like any other language, an assembly language has a few rules of syntax. It's not
right to say something like "tad tad" (because one shouldn't be using "tad" to name a
memory location). Similarly, it's wrong to have an instruction such as "and fred" if the
label "fred" isn't defined anywhere in the program. It's also wrong to use the same
label name twice; it may be tempting to use the label "loop" at every point where a
loop back is necessary but such usage will just confuse the assembler. One other
syntax restriction is the limitation of labels to start with a letter and comprise six, or fewer,
characters; another restriction in smap limits the size of numbers (which should not be
greater than 4095 decimal).
There appears to be a general agreement among authors of assemDler programs
that reports ot syntax errors in users' code should be as unhelpful as possible. On
detecting an error. the usual response of an assembler program is to print a one
character message of complaint and to then stop.

smap attempts to be more helpful in that it identifies, reasonably precisely, the
point at which the error was detected and the perceived nature of the error. It may well
still be necessary to run the smap assembler several times before all errors are eliminated. For example, if smap discovers, during the course of its first pass, that you
have defined the label "loop" as being in two different places then it reports this "doubly defined symbol" error (identifying "loop" as the offending symbol along with the
point of second occurrence) and then stops. It may also be the case that you have
referenced another label, e.g. "loop1" as in "jmp loop1", which you have never defined.
smap does not detect such "undefined label" errors until it has been able to successfully complete its first pass and is working on the second pass through your program's
source text.

4.2. The Instruction Set.
It is at this stage appropriate to consider the main instructions, their opcodes and
mnemonics, that are available to you when programming the PDP-8.
The Memory Reference Instructions
There are six instructions for referencing memory, these being and, tad, dca,
isz, jms and jmp. Three of these are used either to transfer data from the accumulator to the main memory (dca) or to combine data from memory with the current contents of the accumulator (tad, and). The jmp and jms instructions provide for unconditional transfers of control and for subroutine calls.

The isz instruction has rather a lot of related uses. What it actually does is first to
take the contents of a specified memory location, increment this value by 1, and store
the result back into the same memory location; then, the instruction causes the cpu to
check whether the addition gave a zero result, i.e. the old value had been -1. If so, the
program counter is incremented, causing the next instruction to be skipped.
There are three common ways in which this instruction is used. It can control simple
"for loops"; you initialize some memory location to minus the value of the loop limit and
then, at the end of the body of the loop, just preceding the jmp instruction that takes
you back to the beginning of the loop sequence, you "isz" this location. When you've been
around the loop enough times you will get to skip the jump back.
     pseudo-PASCAL (with labels)             code

     a:=limit;                                     tad limit
     a:=-a;                                        cia
     x:=a;                                         dca x

100:                                         L100,
     (loop body)                                   (loop body)

     x:=x+1;                    }
     if x=0 then goto 101;      } ---->            isz x
     goto 100;                                     jmp L100
101:                                         L101, ...

The other uses both take isz as just a convenient method of incrementing a value,
and ignore the bit about skipping on zero. An isz instruction might, for example, be a
convenient way of updating some counter expected to lie only in the range, say, 1..1000: a single isz instruction can update the value whereas at least three instructions
would be necessary if the current value of the counter had to be loaded into the acc,
incremented by 1 and the result stored back. The other, rather similar, use employs the
isz instruction to increment an "address pointer". Pointers will be discussed further later
on: basically they allow reference to some memory location whose address has been
derived through some calculation, as for example in code for accessing an array element. If an array is being processed in sequence then quite commonly the isz instruction will be used to increment the address pointer to reference the next element.
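The loop-control use of isz can be mimicked in a few lines of Python. This is only a sketch: the dictionary standing in for memory, the address 0o20 chosen for "x" and the function names are assumptions for illustration.

```python
MASK = 0o7777                     # PDP-8 words are 12 bits wide

def isz(memory, addr):
    """Increment memory[addr]; return True if the next instruction is skipped."""
    memory[addr] = (memory[addr] + 1) & MASK
    return memory[addr] == 0

def run_loop(limit):
    """Count passes through a loop controlled by 'isz x / jmp L100'."""
    x = 0o20                      # hypothetical address of the counter "x"
    memory = {x: (-limit) & MASK} # tad limit / cia / dca x
    passes = 0
    while True:                   # L100,
        passes += 1               #   (loop body)
        if isz(memory, x):        #   isz x
            break                 #   zero result: the jmp is skipped
        #                             jmp L100
    return passes

print(run_loop(5))
```

Because the counter starts at minus the limit, it reaches zero (and so causes the skip) on exactly the limit-th pass.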

Summary of Memory Reference Instructions.

[Figure: the 12-bit memory reference instruction word, bits numbered 0 to 11. Bits 0-2 hold the opcode; bits 3-11 hold the address mode and location.]

OPCODE                  MNEMONIC
000-binary      0       AND
001-binary      1       TAD
010-binary      2       ISZ
011-binary      3       DCA
100-binary      4       JMS
101-binary      5       JMP

0)  and: logical AND. The and instruction causes a bit-by-bit Boolean AND operation between the contents of the accumulator and the data word specified by the
    instruction. The result is left in the accumulator.

1)  tad: Two's Complement Addition. tad performs addition between the specified
    data word and the contents of the accumulator, leaving the result of the addition in
    the accumulator. If a carry out of the most significant bit of the accumulator
    should occur then the link bit is complemented.

2)  isz: Increment and Skip if Zero. The isz instruction adds a 1 to the referenced
    data word and then examines the result of the addition. If a zero result occurs, the
    instruction following the isz is skipped. If the result is not zero, the instruction following the isz is performed. In either case, the result of the addition replaces the
    original data word in memory.

3)  dca: Deposit and Clear Accumulator. The dca instruction stores the contents of
    the acc in the referenced location, destroying the original contents of the location. The acc is then set to zero.

4)  jms: JuMp to Subroutine. (Discussed in next section on addressing modes and
    subroutines).

5)  jmp: JuMP. The jump (goto) instruction loads the effective address, calculated
    during instruction decoding, into the program counter pc.

Operate Instructions.
In principle, there are 9 bits available to identify operate class instructions, supposedly therefore allowing some 500 such instructions. It doesn't work that way.
Operate instructions are "microcoded". Specific bits are used to designate specific
"micro-instructions". One "micro-instruction" clears the acc (i.e. sets it to zero);
another clears the link. "Micro-instructions" can be combined in various ways. Thus,
one can clear the accumulator and then go on to increment it (so getting the constant
0001 in the acc) all within a single operate instruction. There are however lots of restrictions on the allowed combinations of micro-instructions. Consequently, only a few of
the 500-odd possible 9-bit binary patterns actually represent valid, executable instructions.
The basic operate "micro-instructions" are:

data manipulation:

i)     cla. Clear the accumulator, i.e. set it to 0000.

ii)    cll. Clear the link.

iii)   cma. Complement the accumulator, i.e. all binary 1's become 0's, all 0's
       become 1's.

iv)    cml. Complement the link.

v)     rar. Rotate the accumulator and link right. This instruction treats the acc and link
       as a closed loop and shifts all bits one position right.

       [Figure: example showing the link and accumulator bits 0..11 before and after a rar.]

vi)    rtr. Rotate two right. A shift of two places to the right is executed. Both rar and rtr
       use what is commonly called a circular shift, meaning that any bit rotated off one
       end of the accumulator will pass into the link and then on again into the other
       end of the accumulator.

vii)   ral. Rotate left. This instruction treats the acc and link as a closed loop and shifts
       all bits in the loop to the left, performing a circular shift left.

viii)  rtl. Two place rotate left.

ix)    iac. Increment the accumulator. The contents of the acc are increased by 1.

x)     nop. No operation is performed: the program control is simply transferred to the
       next instruction in sequence.

A particularly common combined instruction, which has acquired its own
mnemonic, is "cia". "cia" combines complementing and incrementing and is the
instruction necessary to negate the number in the accumulator.
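The rotate and negate operations are easy to get wrong by one bit, so a short Python sketch may help. The function names simply reuse the mnemonics; representing the link and acc as a pair of integers is an assumption made for illustration.

```python
MASK = 0o7777                       # the acc holds 12 bits

def rar(link, acc):
    """Rotate the 13-bit ring (link + acc) one place to the right."""
    new_link = acc & 1              # the lowest acc bit falls into the link
    new_acc = (acc >> 1) | (link << 11)   # old link enters at the top
    return new_link, new_acc

def ral(link, acc):
    """Rotate the 13-bit ring (link + acc) one place to the left."""
    new_link = (acc >> 11) & 1      # the highest acc bit moves into the link
    new_acc = ((acc << 1) & MASK) | link  # old link enters at the bottom
    return new_link, new_acc

def cia(acc):
    """cma then iac: two's complement negation of the acc."""
    return ((acc ^ MASK) + 1) & MASK

print(rar(0, 0o0001))               # the 1 bit moves into the link
print(oct(cia(0o13)))               # minus 13 octal as a 12-bit pattern
```

Since the ring has thirteen positions, thirteen successive rar operations restore the original link and acc; rtr and rtl are just two single rotations in a row.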

skips: (conditional jumps over a single instruction)

i)     sma. Skip on Minus Accumulator. The next instruction is skipped if the contents
       of the accumulator, interpreted as a two's complement number, are less than zero.

ii)    spa. Skip on Positive Accumulator. The next instruction is skipped if the accumulator is greater than or equal to zero.

iii)   sza. Skip on Zero Accumulator. The next instruction is skipped if the accumulator
       is zero.

iv)    sna. Skip on Nonzero Accumulator. The next instruction is skipped if the accumulator is non-zero.

v)     snl. Skip on Nonzero Link. The next instruction is skipped when the link bit is a 1.

vi)    szl. Skip on Zero Link. The next instruction is skipped when the link bit is a 0.

vii)   skp. Unconditional SKiP. The next instruction is skipped.

viii)  hlt. HaLT. The computer will stop at the completion of the current instruction.

4.3. A more realistic example program.
Now that we have at least cursorily covered the general concept of assembly
language programming, and have considered the instruction repertoire available, it's
worth looking at a slightly larger program than the "add three numbers" example. All
that the program does is loop around, for some fixed number of times, storing into
memory a number representing the current count of iterations. The program introduces only one new concept and a further minor detail concerning the arrangement of
programs in memory. The new concept is that of pointers and "indirect addressing".
In this program, it's necessary to refer to successive elements of memory as one
iterates around a loop. On the first iteration, one has to store 1 in, as it happens, location 217; on the next iteration, a 2 must be stored in location 220; then a 3 in 221 and
so forth. Eventually, depending on the number of times the loop must be executed,
reference may need to be made to locations 400, 401 etc. Consequently, we would
appear to require a "store" instruction in which the address gets changed. It is not
possible to do this.
Instead, indirection is used. A variable is used to record the address of that
memory location in which the next number generated is to be stored. After each
number is stored, the value in this variable can be incremented so that the new value
refers to, or points to, the next location to be used. Because it "points" to some
required memory location, such a variable is generally known as a "pointer variable"
or just a "pointer".
The actual store instruction in the program loop contains the address of this
pointer variable. Of course one doesn't want the datum stored in the address specified; that would just cause the value of the pointer to be overwritten. Rather, one wants
to indicate that the value of the referenced pointer be used to determine where exactly the
datum must be stored. Thus, the instruction has to read something like "deposit the
contents of the acc in that memory location whose address is currently stored in this
pointer variable" or, more concisely, "dca indirect pointer".

This implies that we have some means of indicating that the method of interpreting the address part of an instruction be changed. On the PDP-8, one of the nine
address bits in a memory reference instruction is used to designate the "mode" for
interpreting the address. There are thus two address modes; the setting of this control
bit (which is actually bit 3 of the instruction word) determines how the remaining
address bits are to be used. If this bit is zero, we have "direct" addressing. The
address given in the instruction is the address in which data is to be stored, or from
which data is to be fetched, or to which control is to be transferred. If bit 3 is a 1, then
"indirect" addressing is necessary. The address given in the instruction designates the
memory location of a pointer variable: the required, effective address must be fetched
from that location before performing the store, add, jmp or other instruction. (More
sophisticated machines typically exhibit a larger number of addressing modes and
must use more bits of an instruction word to designate an appropriate mode).
Note that indirection makes an instruction take longer to execute. One extra
memory cycle, and some other additional work, are entailed. In decoding the address
in an instruction like "dca indirect pointer", the bits that identify the address of the variable "pointer" have first to be abstracted and interpreted so that the appropriate
memory location of "pointer" is identified. That first decoding step is, of course, standard to all memory reference instructions. However, if indirection is being used, one
must then go on to read from memory the value in this referenced location. So an
extra memory access must be performed. It is the value thus read out of memory that
represents the real effective address. This effective address must then be used in the
subsequent data fetch or data store operation.
The first couple of lines in the program, shown below, initialize a pointer variable,
"ptr", so that it contains the address where the first datum is to be stored. Then, a
counter is initialized with (minus) the number of times the loop is to be traversed. The
body of the loop in this program starts at label "loop" and ends with the "jmp loop"
instruction.

The Source Text of the Program.
/
/     This is an example PDP-8 program written more or less in
/     standard DEC PDP-8 assembler style.
/
/     Basically, the program does the following
/
/         for i=1 to 13 do store[i+216]:=i;  (where all numbers are
/                                             in octal).
/
/     like most PDP-8 programs, it starts at address octal 200.
/
/     what it actually does is,
/
/          1) set a pointer to point to where to store next item
/          2) set a counter to -13 octal (i.e. 7765)
/          3) (ndx, its iteration index, is already zero so does not
/             need to be initialized).
/  loop>
/          4) pick up the value in ndx
/          5) increment it
/          6) store the incremented value back in memory
/          7) pick up the new value of ndx again
/          8) store it "indirect ptr", i.e. ptr is assumed
/             to contain the address where we will store
/             the current contents of the accumulator.
/          9) increment pointer so that it points to the
/             next memory location ready for next time
/             (note, this increment and skip won't cause
/             us to skip since ptr is never going to
/             get to point to location zero).
/         10) increment counter, skip if the value of counter
/             becomes zero. (Since we started off setting
/             counter to -13(octal) we should skip after
/             going round the loop the right number of
/             times).
/         11) if haven't skipped this instruction, go back to "loop"
/         12) halt
/
*20
count,  0
ptr,    0
ndx,    0
*200
        tad tab         / initialize "ptr" from predefined value in "tab"
        dca ptr         /
        tad x           / and "count" from value in "x"
        dca count
loop,   tad ndx         / increment value in ndx
        iac
        dca ndx
        tad ndx
        dca i ptr       / store it in next "array element"
        isz ptr         / update ptr so it references next array element
        isz count       / increment loop control & check for termination
        jmp loop        / if not yet terminated, go back
        hlt
x,      7765            / equivalent to -13 octal
tab,    0217            / constant specifying where array to start.
$

In the body of the loop, the value of "ndx" is incremented and a copy of the incremented value stored in the array. The "dca i ptr" is the store instruction that must in
effect reference successive elements of memory. The "i" between the opcode and the
address is how the use of an "indirect" addressing mode is signalled to the assembler.
If a source statement specifying a memory reference instruction contains an "i"
between the instruction mnemonic and address label, then smap generates an
instruction in which bit 3 is set to 1 (so flagging the indirect mode); usually, bit 3 of a
memory reference instruction is zero (so flagging direct addressing).

After the "dca i ptr" instruction, there are two "isz"s. The first simply increments
the pointer so that the next location is appropriately referenced prior to another iteration of the loop. The second is the instruction that tests for the end of the loop; when
the loop has been completed sufficient times, the count (which started as a negative
value) will reach zero, causing the jump back instruction to be skipped. The program
terminates at the halt instruction.
The minor detail regarding program layout relates to where on page 0 the "global variables" ptr, ndx and count get placed. The first of these is at *20 rather than *0.
It happens that the first sixteen memory locations, addresses 0..17 octal, on a PDP-8
typically have rather special uses and should not be employed simply to hold global
variables.
When this program is processed by smap, the first output produced is a symbol
table. This is written to UNIX standard output, i.e. usually the terminal, at the end of
pass 1. It defines both the labels used in the program being assembled and also the
standard "reserved" words (which are mainly mnemonics for the operate and I/O
instructions). A fragment of the symbol table produced for the example program is
shown below:

SMAP    Pass 1. Symbol table listing:

Symbol table:
Name    Type    Value        Name    Type    Value
adcrb   iot     6601         adsf    iot     6602
and     mri     0000         cia     opr     7041
  .
tab     label   0216         tad     mri     1000
tls     iot     6046         tsf     iot     6041
x       label   0215

On completion of its second pass, smap writes a listing to "standard output" and
an object file. Part of the listing follows. It is typical of listings produced by assemblers,
with various columns of information. The leftmost column identifies the address in
which data is to be stored; next follows the value to be stored in that address; the third
column, i.e. the rest of the line, shows the source text from which these data were
generated. Thus, for instance, the line reading "0202 1215 tad x" shows that
smap has arranged that memory location 0202 will contain the octal value 1215, which
it has determined to be the appropriate binary pattern for an instruction to add the
contents of location 0215 to the acc.

SMAP    Pass 2 Assembly listing.

                /   this is an example pdp-8 ...

                /   12) halt

                *20
0020    0000    count,  0
0021    0000    ptr,    0
0022    0000    ndx,    0

                *200
0200    1216            tad tab
0201    3021            dca ptr
0202    1215            tad x
0203    3020            dca count
0204    1022    loop,   tad ndx
0205    7001            iac
0206    3022            dca ndx
0207    1022            tad ndx
0210    3421            dca i ptr
0211    2021            isz ptr
0212    2020            isz count
0213    5204            jmp loop
0214    7402            hlt
0215    7765    x,      7765
0216    0217    tab,    0217

(The effect of the indirection bit can be seen by comparing the code generated for
"dca count" (3020 in location 203) and "dca i ptr" (3421 in location 210). Bit 3 of the
instruction word corresponds to 0400 octal.)
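The fields of these two words can be pulled apart with a little Python. This is an illustrative sketch only; the function name "decode" and the dictionary keys are not part of smap or the simulator.

```python
def decode(word):
    """Split a 12-bit memory reference instruction into its fields.

    The PDP-8 numbers bits from the left, so bit 0 is the most
    significant bit and bit 11 the least significant.
    """
    return {
        "opcode":   (word >> 9) & 0o7,    # bits 0-2
        "indirect": (word >> 8) & 1,      # bit 3, worth 0400 octal
        "page":     (word >> 7) & 1,      # bit 4, worth 0200 octal
        "offset":   word & 0o177,         # bits 5-11
    }

print(decode(0o3020))   # dca count
print(decode(0o3421))   # dca i ptr
```

Both words have opcode 3 (dca) and refer to page zero; they differ in the address field (020 against 021) and in the indirect bit.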
smap's other output is the object file. This can be seen to contain the essential
summary of the data shown in the first two columns of the listing. Not every address
need be specified because, given the starting point for a code segment, most
addresses are implicit. Also in this object file, following the code, is a copy of the symbol table. This is purely for the convenience of the simulator system. It allows the
<translated instruction> field, in the display, to explicitly refer to the program's original
labels. Thus, the display, at the point where instruction "1215" (at 0202) was being
executed, would read "tad x" rather than just "tad 215".

The Object File

*0020
0000
0000
0000
*0200
1216
3021
1215
3020
1022
7001
3022
1022
3421
2021
2020
5204
7402
7765
0217
$
45
adcrb   3 6601
adsf    3 6602
and     1 0000
  .
tls     3 6046
tsf     3 6041
x       0 0215
$

This example program can be executed on the basic simulator. Since it involves
a few iterations around a multi-instruction loop, it takes some time to run with the very
detailed display.
The display of program execution is of course transient and provides no permanent record of the program's correct running. The program does change the
memory of the simulated PDP-8: the changes effected can show whether or not the
program ran successfully. The simulator contains a provision for the contents of
memory to be printed off both prior to, and subsequent to, execution of a program.
The printout of memory thus obtained is an instance of a program "dump". It
shows addresses and contents of those memory locations that are non-zero. The
dump produced by the simulator shows simply the memory contents in octal. Quite
commonly, computer systems provide a much more detailed "dump" showing each
memory location in several different printing formats, e.g. hex, instructions, characters.
The example shown below demonstrates that the program did indeed modify
memory by writing in integers 1..13 octal starting at memory location 217.

The Program "dump".

Printout of contents of memory prior to program execution.

Address   Contents

0200:     1216 3021 1215 3020 1022 7001 3022 1022
0210:     3421 2021 2020 5204 7402 7765 0217 0000

Printout of contents of registers and memory subsequent to
program execution.

acc: 0000      pc: 0215      link: 0

Address   Contents

0020:     0000 0232 0013 0000 0000 0000 0000 0000
0200:     1216 3021 1215 3020 1022 7001 3022 1022
0210:     3421 2021 2020 5204 7402 7765 0217 0001
0220:     0002 0003 0004 0005 0006 0007 0010 0011
0230:     0012 0013 0000 0000 0000 0000 0000 0000
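The dump can be cross-checked with a very small interpreter. The Python sketch below is not the simulator described in this note; it is an illustration, with invented function names, that implements only the handful of instructions this one program uses (tad, isz, dca, jmp, iac and hlt) and loads the object code from the listing.

```python
MASK = 0o7777                            # words and addresses are 12 bits

def effective_address(word, here, memory):
    """Decode the address field of the instruction fetched from 'here'."""
    addr = word & 0o177                  # bits 5-11: location within the page
    if word & 0o200:                     # bit 4 set: current page
        addr |= here & 0o7600
    if word & 0o400:                     # bit 3 set: one extra memory read
        addr = memory[addr]
    return addr

def run(memory, pc):
    """Execute until hlt; return the final (acc, link, pc)."""
    acc = link = 0
    while True:
        here = pc
        word = memory[here]
        pc = (pc + 1) & MASK
        opcode = word >> 9
        if opcode == 1:                  # tad
            total = acc + memory[effective_address(word, here, memory)]
            if total > MASK:             # carry out complements the link
                link ^= 1
            acc = total & MASK
        elif opcode == 2:                # isz
            ea = effective_address(word, here, memory)
            memory[ea] = (memory[ea] + 1) & MASK
            if memory[ea] == 0:
                pc = (pc + 1) & MASK     # skip the next instruction
        elif opcode == 3:                # dca
            memory[effective_address(word, here, memory)] = acc
            acc = 0
        elif opcode == 5:                # jmp
            pc = effective_address(word, here, memory)
        elif word == 0o7001:             # iac
            if acc == MASK:
                link ^= 1
            acc = (acc + 1) & MASK
        elif word == 0o7402:             # hlt
            return acc, link, pc
        else:
            raise NotImplementedError(oct(word))

PROGRAM = [0o1216, 0o3021, 0o1215, 0o3020, 0o1022, 0o7001, 0o3022, 0o1022,
           0o3421, 0o2021, 0o2020, 0o5204, 0o7402, 0o7765, 0o0217]

memory = [0] * 4096
memory[0o200:0o200 + len(PROGRAM)] = PROGRAM
acc, link, pc = run(memory, 0o200)
print("acc: %04o  pc: %04o  link: %o" % (acc, pc, link))
print(" ".join("%04o" % w for w in memory[0o217:0o232]))
```

Under these assumptions the registers and the contents of locations 0020..0022 and 0217..0231 come out as in the dump above.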


5. The PDP-8's addressing mechanism and subroutine calls.
5.1. Addressing.
In 1965, hardware was relatively expensive so the designers of the PDP-8
skimped on what they provided. The result is a relatively clumsy way of accessing
memory. The following description of the addressing mechanism is taken, with minor
adaptations, from DEC's documentation.
Only nine bits are available to specify a location in a memory referencing instruction. However, a full 12 bits are needed to uniquely address the 4096 locations that are
contained in the PDP-8's memory unit.
To make the best use of the available nine bits, the PDP-8 utilizes a logical division of memory into blocks (pages) of 200-octal (128 decimal) locations each as shown
in the following table (all values in octal):
Page    Memory           Page    Memory
        locations                locations

0       0-177            20      4000-4177
1       200-377          21      4200-4377
2       400-577          22      4400-4577
3       600-777          23      4600-4777
 .                        .
17      3600-3777        37      7600-7777

Since there are 200-octal locations on a page and since seven bits can represent
200-octal different numbers, seven bits (5 through 11) are used to specify the relative
location within the page.
The page is specified by bit 4, called the current page or page zero bit. If bit 4 is
0, then the reference is interpreted as being to a location on page zero, i.e. one of the
first 200 locations in memory. If bit 4 is a 1, the page address is interpreted to be on
the current page, i.e. the page containing the instruction currently being executed. For
example, if bits 5 through 11 represent 123-octal and bit 4 is a zero, the location
referenced is absolute address 123. However, if bit 4 is a one and the current instruction is in a core memory location whose absolute address is between 4600 and
4777 octal then the current page address 123 designates the absolute address 4723.
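The arithmetic in this example can be written out as a short Python sketch. The function names are invented for illustration; the address 4650 is simply an arbitrary location between 4600 and 4777.

```python
def page_of(address):
    """Page number of a 12-bit address (pages hold 200 octal words)."""
    return address >> 7

def direct_address(offset, page_bit, instruction_address):
    """Combine the 7-bit offset with page zero or with the current page."""
    if page_bit == 0:
        return offset                                 # page zero
    return (instruction_address & 0o7600) | offset    # current page

print(oct(direct_address(0o123, 0, 0o4650)))   # page zero reference
print(oct(direct_address(0o123, 1, 0o4650)))   # current page reference
print(oct(page_of(0o4650)))
```

Masking with 7600 octal keeps the five page bits of the instruction's own address, which is all that "current page" means here.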

[Figure: memory reference instruction word, bits 0..11. Bits 0-2: opcode; bit 4: current page or page zero bit; bits 5-11: location in page.]

Current page or page zero bit:
0    page zero
1    current page


Indirect addressing.
The scheme described above allowed addressing of 400-octal locations by any
instruction --- 200 page zero locations and 200 current page locations. How are the
remaining 7400 locations to be accessed?
Bit 3 of a memory addressing instruction identifies the addressing mode. When
bit 3 is a zero, the operand is a direct address. When bit 3 is a one, the operand is an
indirect address. An indirect address (pointer address) identifies the location that contains the desired address (effective address). To address a location that is not directly
addressable, the absolute address of the desired location is stored in one of the 400
directly addressable locations; this pointer address is then used in the memory reference instruction but with bit 3 set to 1. When executing, the machine will fetch the contents of the specified pointer address and then use this to specify the effective address
in a subsequent fetch cycle to get the required datum.

[Figure: memory reference instruction word, bits 0..11. Bits 0-2: opcode; bit 3: direct/indirect addressing bit; bit 4: current page or page zero bit; bits 5-11: page address.]

Direct/indirect addressing bit:
0    direct addressing, referenced memory
     location contains required datum.
1    indirect addressing, referenced memory
     location contains the address of
     the required datum.

5.2. Subroutine calls.
An effective subroutine call mechanism has to resolve two problems. First, as
noted earlier, one needs to maintain some record of where a subroutine was invoked,
so that when the subroutine has been completed one can return and resume the main
program. Second, one requires some mechanism for passing arguments to a subroutine and retrieving results back when it is complete.
5.2.1. Subroutine linkage on the PDP-8.
The method by which connection is established between a calling routine and a
called subroutine is referred to as a subroutine linkage mechanism. A linkage
mechanism must provide some means for preserving the address of the instruction to
which return must be made. Suppose for instance that we have a subroutine call "jms
sub1" at location 0204. This instruction will be fetched, the pc incremented to 0205,
and the instruction decoding process carried out. The address 0205 now in the program counter represents the point to which return must be made. This value must be
saved somewhere before the program counter is changed to point to the first instruction of the subroutine.
The problem is, of course, where to save the return address. On machines with
lots of registers one can use a register; provided that the subroutine knows which
register has thus been reserved to hold its return address, then all is well. The PDP-8
doesn't have any registers to spare and so it can't adopt the "linkage register"
approach. A more satisfactory approach, using "stack" data structures in the main
memory of the computer, will be considered in the lectures. Unfortunately, the
addressing mechanisms on the PDP-8 really preclude the stack oriented approach
(besides, very few people had thought of such sophisticated tricks back in 1965, when
the PDP-8 was designed).
It's necessary on the PDP-8 to use main memory to store the return address. The
memory location used must somehow be known to both subroutine and calling program. The only obvious place is the first location of the subroutine. This is of course
the address referenced in the JuMp to Subroutine instruction and so it's known to the
calling program. A subroutine can be expected to know its own starting address. So,
on the PDP-8, the first location of a subroutine doesn't in fact contain an executable
instruction; instead it's a place for storing the return address.
When a jms instruction is executed the effective address must first be evaluated
(it may be a direct address or entail indirection). This effective address, held temporarily in the mar register, identifies the start of the subroutine. The current value in
the program counter, already incremented and thus pointing to the location following
the jms instruction, is written into the memory address specified by the contents of the
mar register. The contents of the mar register are then incremented by 1 and this
value loaded into the pc. The next instruction fetch will consequently retrieve the first
instruction of the subroutine.
The state of the machine at the point where the jms instruction has been fetched,
decoded but not executed would be something like:

0204    4240            jms sub1

0240    0       sub1,   0
0241    1020            tad count

0257    5640            jmp i sub1

pc [0205]     mar [0240]

After execution of this jms instruction one would have:

0204    4240            jms sub1

0240    0205    sub1,   (original "0" now overwritten by return address)
0241    1020            tad count

0257    5640            jmp i sub1

pc [0241]     mar [0240]     ir [4240]

with the next instruction fetch from location 241, i.e. the first instruction of the subroutine, and with the return address stored at 0240.
Eventually the subroutine will need to return, i.e. it needs to reset the pc to the
value 0205 stored at sub1. This return simply requires an "indirect" jump. The execution of the "jmp i sub1" instruction at 0257 entails the following sequence of actions in
address decoding. First, bit 4 is isolated; as it's a 1 this identifies that the address lies
on the current page. Then bits 5..11 are abstracted and interpreted as identifying the
40th (octal) location on the current page, i.e. 0240. Then, because bit 3 is set, an indirect
cycle is performed; the contents, 0205, of the referenced memory location, 0240, are
retrieved into the mar. Finally, the jmp instruction is executed, i.e. the contents of
mar are copied to pc; and thus the return has been made.
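The call and return sequence just traced can be condensed into a Python sketch. The function names and the use of a dictionary for memory are assumptions for illustration; only the register transfers described in the text are modelled.

```python
MASK = 0o7777

def jms(memory, pc, ea):
    """Execute jms: 'pc' has already been incremented past the jms itself.

    The return address is written into the first word of the subroutine
    and execution continues from the word after it.
    """
    memory[ea] = pc                 # save the return address at sub1
    return (ea + 1) & MASK          # new pc: first real instruction

def jmp_indirect(memory, pointer):
    """Execute 'jmp i pointer': the new pc is the contents of the pointer."""
    return memory[pointer]

memory = {0o240: 0}                 # sub1, 0
pc = jms(memory, 0o205, 0o240)      # jms sub1 fetched from 0204
print(oct(pc), oct(memory[0o240]))
pc = jmp_indirect(memory, 0o240)    # jmp i sub1 at 0257
print(oct(pc))
```

One consequence of keeping the return address inside the subroutine is that a second call made before the first has returned overwrites it, which is one reason the stack approach mentioned above is usually preferred on other machines.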

It was noted earlier that usually the main program is on page 1 (locations
200..377) and that subroutines were on other pages (the basic simulator has only
pages 0..3, the advanced version has pages 0..17). So typically, one would be trying to
call a subroutine on another page. Note that the following code is erroneous:

*200
        jms sub1
        jms sub2

*400
sub1,   0

It's not possible to directly reference location 0400 in an instruction at 0200; such a
reference is to a different page. One can only directly reference a current page or
page zero location. Such a subroutine call must be coded using an indirection via a
pointer variable, viz:

*200
        jms i psub1
        jms i psub2
psub1,  sub1
psub2,  sub2
*400
sub1,   0

The smap assembler should detect and complain about any erroneous cross-page
references.

5.2.2. Passing parameters on the PDP-8.
Normally, on any machine, if only a few arguments need be passed to a subroutine then they are passed in registers. Similarly, results are returned in registers. Of
course, on the PDP-8, we only have the acc. Consequently, we can only pass one
argument into a subroutine via a register and can retrieve only one result back. A single argument and result will be quite sufficient for any example programs that will be
written for the simulator.
If there are insufficient registers to pass all necessary arguments then one typically passes to the subroutine the address of some "array" in memory in which additional required arguments are tabulated. This approach will be covered in lectures on
"stack-oriented" systems for subroutine linkage and parameter passing.

-446. Debugging Assembly Language Programs and the Advanced Simulator.
6.1. The Problem of Errors in Assembly language Programs.
One soon learns that a successful, error free compilation or assemOly represents
but a small step towards a working program. On attempted execution the program may
run. but proouce wrong results. may loop performing forever some mysterious anO
futile computation or may simply ·die· obscurely.
Almost any given algorithm takes far more statem\~nts to express in an assemoly
language than in some high level language such as PASCAL. Even if programmers'
error rates. In errors per hundred statements. were constant. assemOly language programs would inevitably contain more errors because of their greater length. "TYPIcally.
programmers find greater opportunity for error in assembly language ana tlleir error
rate in fact rises. thus compounding the problem.
Errors in assembly language programs tend to be more damaging than those
committed in some high level language. Consider for example the sort of error where a
loop termination condition is inappropriately expressed resulting in reference to some
array element beyond the true array dimension. A PASCAL compiler typically inserts
code to verify each computed array subscript. At run-time, such code will trap an
erroneous reference and terminate the PASCAL program with an error message indicating the general nature of the error and its approximate location (in terms of a line
in the PASCAL source program). Not so with an assembly program. One might conceptually have had an array, equivalent to PASCAL's "var tft : array[0..127] of integer",
stored in memory locations 500..677 octal. A run-time reference to the -3rd element of
this array would simply be a request for the contents of store location 0475. Such a
request is perfectly valid and the contents of that location will be retrieved and used in
the computation. Now address 0475 presumably contains either some other data element used in the program or, possibly, an instruction. An instruction, when interpreted
as data, is of course just another number (save that it has a perverse tendency to
represent the greatest or least element of the array, or whatever else was sought).
Whatever it was that was in the referenced location gets used. Consequently, the program will, probably, run to completion but may yield erroneous results (depending of
course on the particular values in the test data employed).
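The address arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration: the names ARRAY_BASE, element_address and checked_element_address are invented here, and the second function merely imitates the kind of subscript check that a PASCAL compiler is described as inserting.

```python
MEMORY_WORDS = 0o10000   # a PDP-8 addresses 4096 twelve-bit words
ARRAY_BASE = 0o500       # first location of the conceptual array
ARRAY_LENGTH = 128       # locations 0500..0677 octal


def element_address(index):
    """Unchecked address arithmetic, as at assembly language level."""
    return (ARRAY_BASE + index) % MEMORY_WORDS


def checked_element_address(index):
    """The same arithmetic preceded by a PASCAL-style subscript check."""
    if not 0 <= index < ARRAY_LENGTH:
        raise IndexError(f"subscript {index} outside 0..{ARRAY_LENGTH - 1}")
    return ARRAY_BASE + index


if __name__ == "__main__":
    # The "-3rd element" quietly becomes a reference to location 0475.
    print(f"{element_address(-3):04o}")
```

The unchecked version returns a perfectly valid address for any subscript, which is why the erroneous reference goes unnoticed at run-time.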
Even more mysterious behaviours are manifest if, with the same sort of error, one
tried to write to the -3rd element of the array. In so doing, one would overwrite the original and correct contents of memory location 0475. The processing of the array might
well be completed satisfactorily, so enhancing the programmer's confidence in that
processing routine and diverting the search for the error from its true location. Such
an error of overwriting will only be apparent if, at some later stage of the computation,
the original instruction or datum is again required. If the overwritten location was used
to hold an element of program data then the new value, erroneously written into the
location, will cause unexpected, and inexplicable, results to be obtained elsewhere.
Most bit patterns representing numbers can also be interpreted as instructions. Consequently, if the overwritten location was supposed to contain an instruction then there
will be an attempt to interpret the new erroneous value as an instruction. The attempt
may be successful and some instruction, albeit not the intended one, will get executed.
The most common type of assembly language error, apart from the kind of
erroneous data addressing just discussed, probably relates to arbitrary transfers of
control. Jmp, jms and skp instructions all admit any degree of misuse. In PASCAL, programs are forced to be well structured. It is, for example, impossible to jump into the
middle of a "for-loop" and thus end up checking against loop limits that have never
been initialized. Such uncontrolled transfers are trivially realized at assembly language
level. Although all programmers intend to write clear well structured assembly programs from algorithms expressed in some pseudo-PASCAL, the liberty allowed at the
assembly language level frequently subverts these intentions. Programs with complex
control flows are created which, after a couple of rounds of modification, become completely incomprehensible and in which all transfers involving any address computation
become unreliable.
One particular cause of errors, relative jumps defined in the program source text,
is eliminated with the smap assembler. Quite often, one wants to code an instruction
that will effect a jump around, say, the following five instructions (as for instance when
encoding the false branch of an "if" statement). One may not feel inclined to invent an
additional label on the instruction to be jumped to; rather one simply wants to say
"jump over five instructions". Many assemblers do provide a notation, e.g. "jmp .+6",
which achieves the desired effect. Unfortunately, the subsequent addition of a couple
of extra instructions to the "true" portion of the conditional may then have overlooked
side effects. (The lack of such relative addressing in smap is due solely to the laziness of the implementor and not to any overt intent to prevent such self-inflicted
errors).
Other common, but trivial, errors include specifying the wrong address mode (so
perhaps resulting in the use of the address of a variable rather than its value), or
specifying the wrong instruction. In their normal helpful manner, designers of assemblers contrive to provide pitfalls such as instructions with very similar mnemonics. It is
difficult, especially when working with a listing produced on a low quality printer, to
notice that one has written lb where lh was intended. The program that loads one
byte, when two were desired, runs but does not achieve its intended purpose. Not
surprisingly, paranoid delusions concerning malevolent machines are endemic
amongst those first learning assembly language.
The normal response to mysterious bugs in a PASCAL program is to throw in a
whole series of "write" statements. These extra "tracing" statements allow one to print
off tables containing the values of all important control variables, or to print appropriate messages at procedure invocation, loop termination etc. With a few well chosen
trace statements, a programmer can usually rapidly localize the erroneous portion of
code.
This technique is less readily applied to assembly language programs. A one line
PASCAL trace statement, e.g. "writeln('entered sort routine, n = ',n:8);", may well
require a score, or more, of assembly language instructions. A "few well chosen trace
statements" may consequently represent more lines of code than the original erroneous program. Even if one had the tenacity to insert the additional tracing code (and
sheer luck sufficient to perform the insertion without introducing additional errors) one
has a further problem. The new code will have resulted in significant changes to the
detailed construction of the program. The address containing the instruction overwritten in the original buggy program, or address to which erroneous transfer was made,
now most probably contains something different. One still has a bug to chase, but it's a
different bug from that in the original program and may express itself in some completely different fashion.

The response of despair is to use a "program dump". Earlier we examined the
"dump" produced from the basic simulator; it showed the original memory contents
and the contents on completion of the program. If the program contains errors one
can get a dump made at the point at which it died, or was aborted if it was stuck in
some loop. This dump represents the state of memory at some unknown time after the
commission of an error. One may examine the contents of memory as presented in
this dump. One searches for evidence of deviations from the required behaviour of the
program, such as memory locations that contain unexpected and inappropriate data.
From such evidence one may be able to identify the source of the disruptions. Debugging from program dumps requires considerable patience, care and an appropriate
mental attitude. It is a task apparently well suited to acolytes of the Church of Latter
Day Saints.
6.2. Interactive Debugging.
The best approach to debugging assembly language programs (and high level
language programs for that matter) is to work interactively, at a terminal, controlling the execution of the program. Single-stepping of a program is a powerful technique.
One does not work one cycle at a time (as in the basic simulator); instead one executes one statement (i.e. one instruction in assembler) at a time and views the
changes effected before proceeding.
The problem with single-stepping is that it's only viable when one has brought
execution of the program very close to the point where the error is actually committed.
Suppose for example that one had a binary search routine in which the error related to
the test for the two array pointers passing one another. Even in the simplest contrived
test case, it might require several hundred, or several thousand, instructions to be
executed before the erroneous code was reached. It is impractical to single step
through several hundred instructions in order to reach the region of an error.
A more flexible system is required wherein one can run the program normally up
to some point chosen to lie just before the suspect code. Such a point, whereat execution is to be temporarily suspended, is referred to as a "breakpoint". On reaching a
breakpoint, control should be transferred to some interactive routine that will let the
user inspect the values of chosen variables, set subsequent breakpoints and, if
necessary, invoke single-stepping of the next few instructions. When the user is
satisfied, execution of the suspended program should be resumed.
It is easy to implement such a system on a simulator. One simply provides an
extra array of boolean variables of the same size as the array representing memory.
Instructions at which there are to be breakpoints are flagged by setting the
corresponding boolean to true. The simulator program can check in this boolean array
to see if a breakpoint is needed prior to the next instruction fetch. If a breakpoint has
been reached, the simulator can call a "break" handling routine that provides the user
with various options for inspecting memory and cpu registers. When the user indicates
that the program is to continue, the break routine is exited and the simulator resumes with
the next instruction fetch. It is possible, but considerably more difficult, to implement a
similar interactive debugging function for a real computer.
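A minimal sketch of this scheme follows, written in Python rather than in the language of the actual simulator. Everything in it is an assumption made for illustration: the class name, the break_handler that merely records the address instead of talking to the user, and an execute routine that understands only hlt (7402) and treats every other word as a no-op.

```python
MEMORY_WORDS = 0o10000  # 4096 words


class Simulator:
    """Toy fetch loop with one boolean breakpoint flag per memory word."""

    def __init__(self):
        self.memory = [0] * MEMORY_WORDS
        self.breakpoints = [False] * MEMORY_WORDS  # same size as memory
        self.pc = 0o200
        self.halted = False
        self.break_log = []

    def break_handler(self):
        # Stand-in for the interactive "break" routine: record where we stopped.
        self.break_log.append(self.pc)

    def execute(self, word):
        # Hypothetical decoder: only hlt is recognised, all else is a no-op.
        if word == 0o7402:
            self.halted = True

    def run(self):
        while not self.halted:
            # The flag is checked prior to the next instruction fetch.
            if self.breakpoints[self.pc]:
                self.break_handler()
            word = self.memory[self.pc]
            self.pc = (self.pc + 1) % MEMORY_WORDS
            self.execute(word)


if __name__ == "__main__":
    sim = Simulator()
    sim.memory[0o204] = 0o7402      # hlt
    sim.breakpoints[0o202] = True
    sim.run()
    print([f"{address:04o}" for address in sim.break_log])
```

Because the check is made only when a word is about to be fetched as an instruction, a flag set on a word that is never fetched has no effect.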
The "advanced simulator" contains some break (debugging) functions loosely
modelled on standards such as DEC's DDT and UNIX's adb. As well as allowing for
control over the execution of the program, the break functions enable the user to
define the required detail in the display of the simulated computer. Control is passed
to these break functions prior to attempted execution of the user's program; this allows
the initial setting of display options and, if necessary, initial breakpoints. The simulator attempts to trap all errors, such as erroneous memory references, and pass control to the break functions.
6.2.1. The "break" functions.
The break functions announce themselves by clearing the screen, printing the
current value of the program counter (Break address : xxxx) and then prompting the
user with the prompt "break>". The user may then enter a variety of commands.
1)  +/- commands. These turn on (+) or turn off (-) various run-time displays. These
    display options are considered in the next section.
2)  : commands. These commands enable the user to set, or to remove, breakpoints,
    resume execution of the suspended program, abandon execution entirely and to
    obtain various other information.
3)  Memory display commands. These allow the contents of specified locations in the
    memory to be displayed in various chosen formats. One can for example view a
    particular range of locations on the assumption that these locations should contain instructions; comparison of the instructions displayed with those in the original program listing will reveal any that have been overwritten with program data.
    The basic format for these commands is "<address>[,<repeat-factor>]/<format>";
    that is, an address (either an octal constant or the name of one of the program
    labels) optionally followed by a repeat-factor (given as a comma followed by an
    octal number), a "slash" character and then a one character format specification.

Each command entered by the user is processed immediately; on completion of a
command, the break function again prints the "break>" prompt. This cycle only terminates when the user enters ":c" (for "continue" i.e. resume/start execution of user
program) or ":q" (for "quit" i.e. abandon it all).
The full set of ":" commands comprises:
:b    set breakpoint.
:c    continue execution of PDP-8 program.
:d    "dump" registers and memory to terminal screen.
:q    terminate PDP-8 program.
:r    "dump" registers to terminal screen.
:s    list user defined symbols (labels).
:u    undo, i.e. remove, a breakpoint.

For the :b and :u commands (set and remove breakpoint), the break function responds
with "Address for breakpoint>". An address must be either an octal constant or one of
the program labels. Breakpoints can only meaningfully be set at locations containing
instructions that are fetched. Breakpoints set on data elements, or on the first word of
subroutines, are ineffective for no instruction fetch is performed on such locations.
The :r command simply prints the current contents of the acc, link and pc registers. These values are also printed for the :d command, which then proceeds with a
memory dump similar to those previously illustrated and discussed. The :s command is
simply a convenient way of getting a summary of some of the data available in the
symbol table listing.
The formats in which memory words can be displayed are:
c    one ASCII character/word.
d    word value in signed decimal.
i    word interpreted as an instruction.
o    word value in octal.
p    two 6-bit characters packed in a word.
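As a rough illustration of the command syntax and of the signed decimal format, the following Python sketch parses a memory display command and converts a 12-bit word to a signed value. The function names, the regular expression and the symbols dictionary are all assumptions of this sketch and are not taken from the simulator's source.

```python
import re

# <address>[,<repeat-factor>]/<format>
COMMAND = re.compile(
    r"^(?P<address>\w+)(?:,(?P<repeat>[0-7]+))?/(?P<format>[cdiop])$"
)


def parse_display_command(text, symbols):
    """Return (address, repeat count, format letter) for a display command."""
    match = COMMAND.match(text.strip())
    if match is None:
        raise ValueError(f"not a memory display command: {text!r}")
    name = match.group("address")
    # A program label takes precedence; otherwise the address is octal.
    address = symbols[name] if name in symbols else int(name, 8)
    repeat = int(match.group("repeat") or "1", 8)  # repeat factor is octal
    return address, repeat, match.group("format")


def signed_decimal(word):
    """Interpret a 12-bit word as a two's complement number."""
    return word - 0o10000 if word & 0o4000 else word


if __name__ == "__main__":
    print(parse_display_command("loop,10/i", {"loop": 0o204}))
    print(signed_decimal(0o7765))
```

A repeat factor of "10" is read as octal and therefore requests eight locations.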
Some example memory display commands might be (the user's input is the line following each "break>" prompt):
break>
loop,10/i
loop 0204 / tad ndx        0205 / iac
     0206 / dca ndx        0207 / tad ndx
     0210 / dca i ptr      0211 / isz ptr
     0212 / isz count      0213 / jmp loop

(Here the user requested that the contents of 10 (10 octal, i.e. 8 decimal) memory
locations, starting at the address of the label called loop, be displayed as if they
represented instructions).

break>
204,4/o
loop 0204 / 1022           0205 / 7001
     0206 / 3022           0207 / 1022

(Here, the request was for the contents of four memory locations to be displayed in
octal).
break>
215/d
x    0215 / -11

(This was an example of print out as a signed decimal number).

6.2.2. The displays in the advanced simulator.
The display options are set using the "+" and "-" commands in response to the
"break>" prompt. The commands are:
"    list current display settings.
c    set/reset CPU display.
d    set/reset memory display for "data".
i    set/reset memory display for "instructions".
l    set/reset logging of registers on interrupts.
p    set/reset display for status of peripherals.
s    set/reset single step mode.
t    set/reset instruction translation display.
The initial, default, settings for the various options can be obtained by "+"".
break>
+"
Current display settings:
C.P.U.                          true
"Data" Memory Window            true
"Instructions" Memory Window    true
Status of Peripherals shown     true
Single step mode                false
Translation of instructions     true
Some example commands are:

-d
(This command would turn off the "Data" Memory Window during any subsequent
display of program execution).
+s

(This command would cause subsequent program execution to be in single step mode,
i.e. the simulator would pause between each complete instruction until a user
response is received).
Even at their most elaborate, the displays in the advanced simulator are less
complete than those of the basic simulator. The C.P.U. display takes the following form:

C.P.U.
acc [xxxx]    link [x]    pc [xxxx]        Time: xxxx x 10^-7 seconds

Only the acc, link and pc registers are displayed. There is a new component, replacing
mar, mdr and text fields; this is a crude time estimate. The estimate is based on a one
micro-second memory cycle time and various somewhat arbitrary measures of other
components in instruction times. (Rather than use fractional micro-second times, the
units are 10^-7 seconds). If the CPU display is on, then the register displays are
updated each time the corresponding register is changed; the time is updated at the
end of each instruction. If the instruction translation option is set then, when an
instruction has been completely decoded, its "disassembled" form is displayed (it
appears to the right of the point where the pc register would be displayed, irrespective
of whether or not the CPU display option is enabled).
There are two separate windows into memory. One shows the region last
accessed on an instruction fetch, the other the region for the last data fetch or write
operation. These display components are separately selectable. Individually they are
like the single display of the basic simulator. The memory displays, if selected, appear
on that region of the terminal screen below where the CPU display would go. An example with both memory windows selected is:
                    MEMORY
    Instructions              Data
Address  Contents      Address  Contents
 0176     0000          0214     7402
 0177     0000          0215     7765
 0200     1216          0216     0217
 0201     3021          0217     0000
 0202     1215          0220     0000

The peripherals display, which occupies the bottom portion of the screen, will
contain messages identifying any peripherals which currently have "flags" set (and
what data is in any related input buffers). The status of the interrupt line is also
reported.
Maintaining an elaborate display considerably slows down the simulation. In normal use, the display may be limited to just the translated instruction while the program
is executing code that seems correct. More elaborate options can be selected when
the program has been run up to a breakpoint just preceding some, as yet unidentified,
error.

6.2.3. A worked example of the use of the break package.
Systems like interactive debugging functions really require live demonstration on
a program containing genuine bugs. Contrived examples rarely suffice, for the bugs
are known and easy to discover. This example is only partially worked out; it is
intended to suggest some uses for the debug functions.
The program is a slightly modified variant of that presented earlier when discussing indirection. The intended change was simply that the numbers 1..13 octal be
stored on page 0, in locations 150..162, rather than in locations following the code.
Another change, introducing the bug, has also been made.
If the program is executed as is, it will loop seemingly endlessly. It will also be
seen to be executing "and" instructions, although there are none of these in the text.
/ a buggy version of the standard example program for
/ storing some data.
/
*20
count,  0
ptr,    0
ndx,    0
*200
        tad tab
        dca ptr
        tad x
        dca count
loop,   tad ndx
        iac
        dca ndx
        tad ndx
        dca i ptr
        isz count
        isz ptr
        jmp loop
        hlt
x,      7765
tab,    0150
$

Following such an unsuccessful attempt at execution, the program was rerun with
a breakpoint at "loop". The recording shows the initial settings of the controls. Only the
instruction translation display was desired so the others were turned off.
BREAKPOINT
Break address : 0200
break>
-c
break>
-d
break>
-i
break>
-p
break>
:b
Address for breakpoint>
loop
break>
:c
Each time the program completed a cycle it stopped at label loop. It would then have
been possible to inspect the values in "count", "ndx" and "ptr", or in any other memory
locations. Instead, the program was just "continued", i.e. :c response to break>. This
process was repeated until it was noticed, through the translated instruction display,
that mysterious "and" instructions were being executed. Then, at the next breakpoint,
various displays of memory were requested.
Break address : loop
break>
:c
("and" instructions observed in translated instruction display)
Break address : loop
break>
:d
acc: 0000     pc : 0204     link: 0

Address   Contents
0020 :    0025 0207 0000 0000 0000 0000 0000 0000
0150 :    0001 0002 0003 0004 0005 0006 0007 0010
0160 :    0011 0012 0014 0015 0016 0017 0020 0021
0170 :    0022 0023 0024 0025 0026 0027 0030 0031
0200 :    0032 0033 0034 0035 0036 0001 0000 1022
0210 :    3421 2020 2021 5204 7402 7765 0150 0000

break>
200,20/i
     0200 / and 0032            0201 / and 0033
     0202 / and 0034            0203 / and 0035
loop 0204 / and 0036            0205 / and 0001
     0206 / and 0000            0207 / tad ndx
     0210 / dca i ptr           0211 / isz count
     0212 / isz ptr             0213 / jmp loop
     0214 / hlt            x    0215 / opr 20.
tab  0216 / and 0150            0217 / and 0000

break>
:q
Program finished.
Do you want final contents of memory and registers "dumped" to a file? (Y or N)n
These displays showed that some anomaly had occurred at the point where the iterative cycle should have terminated. The value 13 should have been the last value
stored in memory, at location 162; after storing this value, the program should have
halted. Instead, this value is missing and the program has continued its cycle storing
values from 14...36 in successive memory locations. It has then again changed and
stored 1 and 0 in locations 205 and 206. Because it did not stop as expected, it has
overwritten part of the program with the last few values generated. These values, e.g.
0036, when interpreted as instructions are perceived as "and" instructions, 0036 = and
036.
The example was again rerun. This final time, when the loop was near to termination, count = -2, the displays of instructions and data fetches/writes were reenabled. The last few instructions preceding the expected loop termination were
executed in single step mode. Through these methods, the source of the error was
identified.
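The reason a stored number such as 0036 shows up as an "and" instruction follows from the layout of a PDP-8 memory reference instruction: the op-code is in bits 0-1-2, bit 3 selects indirect addressing, bit 4 selects the current page rather than page 0, and bits 5-11 hold the offset. The Python sketch below decodes a word on that basis. The function name and the output format are assumptions of this sketch; in particular the simulator's own display prints label names where it can, whereas this sketch always prints octal addresses and simply prints the low nine bits of iot and opr words.

```python
OPCODES = ["and", "tad", "isz", "dca", "jms", "jmp", "iot", "opr"]


def disassemble(word, address):
    """Render a 12-bit word as text, given the address it was fetched from."""
    opcode = (word >> 9) & 0o7            # bits 0-1-2
    name = OPCODES[opcode]
    if opcode >= 6:
        # iot and opr are not memory reference instructions.
        return f"{name} {word & 0o777:03o}"
    indirect = bool(word & 0o400)         # bit 3
    current_page = bool(word & 0o200)     # bit 4
    offset = word & 0o177                 # bits 5-11
    page_base = address & 0o7600 if current_page else 0
    target = page_base | offset
    return f"{name} {'i ' if indirect else ''}{target:04o}"


if __name__ == "__main__":
    for address, word in [(0o204, 0o0036), (0o210, 0o3421), (0o213, 0o5204)]:
        print(f"{address:04o} / {disassemble(word, address)}")
```

Any small positive number has zeros in its top three bits, so every one of the overwriting values 0032..0036 decodes with op-code 0, which is "and".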

7. Talking to peripheral devices.
So far, our example programs have been contrived to be wholly self contained.
All data that they have required were defined internally. The only "results" generated
are the displays of successful execution (possibly evidenced by the final "dump").
More typically, programs are intended to process data defined at run-time and need to
be able to perform both input and output.
The most common requirement is that a program be able to read a sequence of
characters typed in on a keyboard and to write some appropriately transformed character sequence to a teletype (i.e. printing terminal) or a video screen. The advanced
simulator incorporates a "teletype" and "keyboard" which can be operated through the
standard I/O instructions of the PDP-8.
When a key is struck on a keyboard, electronic or electromechanical devices
cause a sequence of voltage pulses to be transmitted over connecting wires to a keyboard control unit. The particular sequence of voltage pulses encodes the required
character in some agreed manner. The controller contains an eight-bit "buffer" register in which it builds up the character as it receives the successive voltage pulses.
Similarly, a teletype controller contains a buffer register in which it stores the code for
the character to be printed or displayed on the terminal. The circuitry in the teletype
controller reads the successive bits out of this buffer register and uses them to generate voltage pulses for controlling the electronic or electromechanical display device.
These "buffer" registers in the device controllers are also indirectly accessible to
the cpu. The cpu can send, via the bus, a signal to, say, the keyboard controller asking
it for the current contents of the keyboard buffer register. This datum would be
returned, via the bus and mdr register, to the acc where it could be analyzed by the
program. Similarly, the cpu can send a signal to the teletype controller telling it to
copy 8 bits from the acc (sent via mdr and bus) into the teletype (tty) buffer register,
and then to use these data to generate appropriate signals as will cause the
corresponding character to be printed.

[Figure: layout of the 12-bit iot instruction word. Bits 0-2: op-code (6); bits 3-8: device identification; bits 9-11: function.]
On the PDP-8, such requests to device controllers are effected through iot
(input-output) instructions. The opcode for this instruction is, as always, in bits 0-1-2;
for iot's, the opcode is 6. The remaining nine bits of the instruction word identify the
device controller, to which a command is to be sent, and the signal identifying the
function that the controller is required to perform. Six bits are used to identify the device, the last three bits encode the function.
For example, device identifier 04 happens to designate the teletype controller. A
command sent to this tty controller specifying function 4 will cause that controller to
take a copy of the low order 8 bits from the acc (via mdr and bus) and transmit these
to the connected terminal device. The full instruction word would thus be "6044". There
is a mnemonic for this instruction, tpc (Teletype Print Character).
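The packing of the three fields can be checked with a short Python sketch. The function names encode_iot and decode_iot are invented for this illustration; the field widths (three, six and three bits) are those given in the text.

```python
IOT_OPCODE = 0o6


def encode_iot(device, function):
    """Pack a 6-bit device identifier and a 3-bit function into an iot word."""
    if not 0 <= device < 0o100:
        raise ValueError("device identifier must fit in six bits")
    if not 0 <= function < 0o10:
        raise ValueError("function code must fit in three bits")
    return (IOT_OPCODE << 9) | (device << 3) | function


def decode_iot(word):
    """Split an iot instruction word into (device, function)."""
    if (word >> 9) & 0o7 != IOT_OPCODE:
        raise ValueError("not an iot instruction")
    return (word >> 3) & 0o77, word & 0o7


if __name__ == "__main__":
    # Device 04 (teletype controller), function 4 (print character).
    print(f"{encode_iot(0o04, 4):04o}")
```

Because each octal digit covers exactly three bits, the instruction word reads directly as op-code digit, two device digits and one function digit.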
Messages could thus apparently be printed on a teletype by means of the
following code:
/ initialize a pointer to the address of the start of a message.
/ message to consist of individual ASCII characters
/ one per word and terminated by a zero word.
        tad amsg
        dca ptr
/ loop.
/    cycle around until the zero word marking end of message
/    is found; transmit each character to tty.
loop,   tad i ptr
        sna
        jmp done        / found end marker, leave this section
/ have character, transmit to tty.
        tpc
/ increment ptr so that it points to next word, then go
/ back to continue loop
        isz ptr
        jmp loop
amsg,   msg
msg,    110             / H
        145             / e
        0               / end marker

Though reasonably logical, such code would not in fact work correctly. The most likely
result of executing the code segment would be for the teletype to try to produce a
"rubout" character.
The cpu, the bus, the controllers all run in microsecond time quanta. The cpu
sends data to the tty controller --- and they're there, in the teletype buffer register, in
a couple of microseconds. Then it's up to the controller to forward these data bits to
the device.
Electromechanical devices proceed at a slower pace. A printing terminal may
take one tenth of one second to print a character; even a video terminal takes of order
one one-hundredth of a second. The physical nature of the devices limits the speed at
which they can accept data; the controller must limit its transmission rate to what the
device is capable of handling. Consequently, a teletype controller will probably have to
hold, in its buffer register, its copy of the bits representing a character for something
like one tenth of a second. If, as in the example program, the cpu attempts to send a
second character, only a few microseconds after the first, then these new data bits are
'or'-ed in with those representing the previous character. The entire character
sequence representing the message will be loaded into the teletype buffer register
before the controller would have completed sending of the first one bit of the first of
these characters.
If I/O is to proceed correctly, the cpu must wait until a device controller has completed one task before it gives it another. It's simple for the circuits in the device controller to register completion of a task; in the example of the teletype controller the task
would be complete when the last bit of the current character had been successfully
transmitted. At that point, the controller could set a one-bit boolean variable, or flag,
to indicate that it was ready to process more data. So, as well as a buffer register to
hold data being transmitted or received, a typical controller will incorporate a "flag"
register.

[Figure: a device controller attached to the bus, containing a flag register and a buffer register, and connected to its device.]
Like the buffer register, a controller's flag register can also be accessed by the cpu.
What one normally wants to do is make a conditional jump, or skip, if the flag is set so
indicating that the device is ready. A typical I/O instruction testing a flag is "6041" tsf
(Teletype Skip Flag); if the tty controller's flag is set then execution of the tsf instruction causes the pc to be incremented by one so that the next instruction in sequence
is skipped.
Through the use of such instructions one can create loops that wait until a device
is ready:
twait,  tsf
        jmp twait
The program will loop around this pair of instructions until the teletype controller sets
its flag indicating that it can accept another character.
We now have almost the complete mechanism for a viable "send message" routine.
/ loop.
/    cycle around until the zero word marking end of message
/    is found; transmit each character to tty.
loop,   tad i ptr
        sna
        jmp done        / found end marker, leave this section
/ have character, transmit to tty.
        tpc
/ wait until it got there!
twait,  tsf
        jmp twait
/ increment ptr so that it points to next word, then go
/ back to continue loop
        isz ptr
        jmp loop

This program sends the first character of the message and waits appropriately until
that character has been successfully transmitted. The program then loops back, collects the next character and sends it. However, the "ready" flag on the teletype controller is still set as a consequence of the successful transmission of the first
character; so, the wait loop is immediately satisfied and the program proceeds at once
to sending 3rd, 4th and subsequent characters. It is necessary to clear the flag at
some appropriate point in the loop (extra redundant clear flag operations are not in
any way harmful). There is another iot instruction, 6042 tcf (Teletype Clear Flag), which
performs the clear operation (on the simulator it also clears the data buffer).
A correct version of the program is:
loop,   tad i ptr
        sna
        jmp done        / found end marker, leave this section
/ have character, transmit to tty.
/ but first clear ready flag if set
        tcf
        tpc
/ wait until it got there!
twait,  tsf
        jmp twait
/ increment ptr so that it points to next word, then go
/ back to continue loop
        isz ptr
        jmp loop
The two instructions 6042 (tcf) and 6044 (tpc) can, and normally would, be combined
into a single 6046 (tls) instruction.
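The difference between the version that never clears the flag and the correct version can be imitated with a small Python model of the teletype controller. This is a sketch under stated assumptions and not the simulator's code: the class Teletype, the 100000 microsecond print time, the 5 microsecond cost of one pass around the wait loop, and the choice to empty the buffer when a character completes are all invented for illustration. The 'or'-ing of new bits into a busy buffer follows the description given earlier.

```python
class Teletype:
    """Toy model of a tty controller with a buffer and a ready flag."""

    PRINT_TIME = 100_000  # microseconds per character (assumed)

    def __init__(self):
        self.buffer = 0
        self.flag = False
        self.busy_until = None
        self.printed = []

    def tick(self, now):
        # Finish the current character once its print time has elapsed.
        if self.busy_until is not None and now >= self.busy_until:
            self.printed.append(self.buffer)
            self.buffer = 0
            self.busy_until = None
            self.flag = True

    def tcf(self):
        self.flag = False

    def tpc(self, acc, now):
        # New data bits are or-ed in with whatever the buffer still holds.
        self.buffer |= acc & 0o377
        if self.busy_until is None:
            self.busy_until = now + self.PRINT_TIME

    def tsf(self):
        return self.flag


def send(message, clear_flag_first):
    """Run the send-message loop, with or without the tcf instruction."""
    tty = Teletype()
    now = 0
    for ch in message:
        if clear_flag_first:
            tty.tcf()
        tty.tpc(ord(ch), now)
        while not tty.tsf():      # twait, tsf / jmp twait
            now += 5
            tty.tick(now)
    tty.tick(now + Teletype.PRINT_TIME)  # let any last character finish
    return "".join(chr(code) for code in tty.printed)


if __name__ == "__main__":
    print(send("Hel", clear_flag_first=True))
    print(send("Hel", clear_flag_first=False))
```

With the flag cleared before each character the message arrives intact; without it, only the first character is printed correctly and the later ones are merged in the buffer.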
Reading data from a keyboard requires a similar loop. One must wait testing the
flag on the keyboard controller until it gets set to indicate that it's ready to give new
data to the cpu. The two instructions most needed to control the keyboard are ksf
(Keyboard Skip Flag) and krb (Keyboard Read Buffer); the krb instruction clears the
flag ready for next time. Thus, an appropriate wait loop for getting data from a keyboard might be:
kwait,  ksf
        jmp kwait
        krb
The real computer terminal is of course used to control the simulator. It was considered that any arrangement whereby it was also used as the terminal on the simulated PDP-8 would prove too complex (one would need to indicate whether the next
character typed was intended to control the simulator or constituted data to be read by
the PDP-8). Instead, the simulated PDP-8 has a pseudo-keyboard and pseudo-teletype that are, in fact, standard UNIX files. Data are read from and printed to these
files, one character at a time, using standard PDP-8 iot instructions. (Currently, the
files must appear in the user's working directory with fixed names; the input file being
-.B.kbd.l-).
The following is an example program for the simulator that uses these input and
output mechanisms. It copies characters from the pseudo-keyboard to the pseudo-teletype; the program terminates after copying the first zero, '0', character found in the
input file.

/
/ Copy characters from keyboard to teletype.
/ Stop after first zero character.
/
*200
loop,   cla             / wait for keyed input
        ksf
        jmp loop
        krb             / read it
        tls             / print it
wait,   tsf             / wait for printing
        jmp wait
        cia             / check if it was the character zero,
                        / i.e. 60 octal.
        tad const
        sza
        jmp loop
        hlt
const,  60
$

The execution of the two instructions tsf; jmp wait would, on a real machine, take
about 5 microseconds. If one were waiting one tenth of one second, as one would if
waiting for a teletype to print a single character, then one would have to go around
that loop something like twenty thousand times. On the simulator, the devices have
been speeded up by a factor of some three orders of magnitude. Wait loops are still
noticeable, but one only has to go around some twenty times, rather than twenty
thousand.
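The arithmetic behind these figures is simple enough to check in Python. The function name and its parameters are invented for this sketch; the figures of 5 microseconds per pass, one tenth of a second per character and a speed-up of about one thousand are those quoted in the text.

```python
def wait_loop_passes(device_microseconds, loop_microseconds=5, speedup=1):
    """Approximate number of passes around a tsf / jmp wait loop."""
    return device_microseconds // (loop_microseconds * speedup)


if __name__ == "__main__":
    one_tenth_second = 100_000  # microseconds
    print(wait_loop_passes(one_tenth_second))                # real machine
    print(wait_loop_passes(one_tenth_second, speedup=1000))  # simulator
```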
8. More advanced I/O --- interrupts.
The concept of interrupt driven I/O is elaborated in the lecture course. The main
example used concerns a PDP-8 system on which data is to be acquired, at fixed
time intervals, from an analog to digital converter, simultaneous with the processing of
previously obtained data. Acquired data have to be buffered in memory until they can
be processed.
The example program used is included here for reference. The "clock" and "a/d"
converter incorporated in the simulator do not correspond to any real DEC manufactured devices and their device identification numbers and function codes are arbitrary.

- 58-

/
/ Simple example of an interrupt driven program.
/
/ It simulates a simple kind of laboratory data acquisition task:
/ (n.b. it's not a perfect simulation)
/
/ every 200 microseconds sample the a/d, store data in a buffer while
/ processing previously acquired data
/
/ We assume that time taken to process each datum is variable
/ and that, although processor can in long run keep up with data
/ acquisition, there will be short term fluctuations where data acquisition
/ gets ahead. So we have a circular buffer (512 words long starting at
/ location 512) with two pointers into it --- one for the functions
/ that process data and one for those that put the data in. We assume
/ that acquisition process will never get 512 words ahead of processing
/ so that there is no need to check for buffer being full. We assume
/ that it is sufficient just to compare pointers, if they are equal
/ that means that processing has caught up with acquisition and must
/ wait, otherwise assume it means some data available for processing.

/
/ the clock interrupts every 200 microseconds (approx) once started
/ the a/d can be started, it stabilizes and can be read after
/ time equivalent to about 50 microseconds
/ we use both under interrupt control
/
/ algorithm:
/    initialize
/    start clock
/    until there is data to process do loop:
/    processing data:
/       pick up next unprocessed datum from circular buffer, analyze it
/       (the analysis routine is a phony, we just shuffle bits
/       around, number of iterations depends on number of binary
/       ones in the datum (which is actually a random number
/       generated by the program that simulates all this))
/
/ interrupt analysis:
/    save system
/    skip chain to find who done it
/    if clock, clear clock flag (leave it running so will
/       get another interrupt in 200 time units)
/       and start an a/d transfer
/       return from interrupt
/    if a/d, clear a/d flag, read value and save it in
/       buffer
/       return from interrupt
/
/ the main program is on page 1, locations 200-240 octal
/
/ the data analysis routine is on page 3, locations 600-630 or so
/
/ page 0 has a few globals etc. and also the hardwired interrupt
/ entry point at location zero
/
/ the interrupt analysis, skip chain etc is on page 2 around locations
/ 400 etc.
/

*0
/ save program counter on interrupt
        0
/ go off and identify cause of interruption
        jmp i pints
pints,  0400
/ places to save acc and link
accsav, 0
lnksav, 0
*20
ptr1,   0
ptr2,   0
mask,   1777
bstart, 1000
datum,  0
*200
/ main program starts here
start,  cla cll
/ initialize both pointers to start of buffer
        tad bstart
        dca ptr1
        tad bstart
        dca ptr2
/ start the clock.
        6504
/ turn interrupts on
        6001
/ (a rather unsafe check to see if data waiting)
loop,   tad ptr1
/ compare pointers to see if more filled in
        cia
        tad ptr2
        sna cla
        jmp loop
/ since some data, get it and save in datum
        tad i ptr1
        dca datum
/ update pointers, remember it's a circular buffer so this
/ is a bit fussy
        tad ptr1
        iac
        and mask
        sna
        jmp repos
        dca ptr1
        jmp proc
/ to repos, we ran off the top of the circular buffer so reset
/ pointer back to bottom
repos,  tad bstart
        dca ptr1
/
/ to proc, call the subroutine that actually processes datum
proc,   jms i procs
        jmp loop
procs,  0600

/
/ Page 2.
/
/ first save the acc
/ then save link
/ then go down skip chain trying to find who interrupted
*400
ints,   dca accsav
        rar
        dca lnksav
/ first check the clock, device 50, operation 2, skip
/ if clock has raised its flag
        6502
        skp
/ it was clock, go do something about it
        jmp clksrv
/ otherwise, probably the a/d
/ device 60, operation 2, skip if a/d stabilized and flag set
        6602
        skp
        jmp ad

10k.

/
/ oops, something interrupted but don't know what,
/ best stop dead
        hlt
/
/ here is code for controlling return from interrupt
/ we restore the link
xit,    cla cll
        tad lnksav
        ral
/ and restore the accumulator
        tad accsav
/ and turn interrupts back on, note that
/ this is actually delayed a couple of instructions to
/ give us a chance to get back before another interrupt
/ could be accepted
        6001
/ How to return? Just go back to wherever address stored
/ in location zero says
        jmp i 0
/
/ serve the clock.
/ that means just clear its done flag, leave it running
clksrv, 6501
/ and it also means that we should start the a/d on its next sample
        6604
/ but that is all so return from interrupt after
/ restoring status appropriately
        jmp xit
/
/ an interrupt from the a/d, it means the next sample is ready
/ read it into the accumulator, then store it away,
/ update pointer into buffer (usual fuss for a circular pointer)
ad,     6601
/ value from a/d, 10 bits, now in acc
        dca i ptr2
/ value now saved, do the pointer updating
        tad ptr2
        iac
        and mask
        sna
        jmp reset
/ haven't hit end of circular buffer so just
/ carry on
        dca ptr2
/ go and do return from interrupt
        jmp xit
/ if have reached top of circular buffer then reset pointer
/ back to bottom
reset,  tad bstart
        dca ptr2
/ return from interrupt
        jmp xit
/
/ Page 3.
/ the code of the processing routine
/ it's not serious so no comments
/
*600
proc1,  0
        tad datum
        jms count
        cll rar
        snl
        jms count
        cla
        tad datum
        cia
        rar
        snl
        jms count
        cla cll
        jmp i proc1
count,  0
        sna
        jmp i count
        dca word
        dca c1
count1, tad word
        rar
        dca word
        szl
        isz c1
        cla cll
        tad word
        sza cla
        jmp count1
        tad c1
        jmp i count
word,   0
c1,     0
$
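
The circular buffer handling in the listing above can be restated as a Python sketch. The constants 1000 and 1777 (octal) are the values of bstart and mask in the listing; the class and method names are hypothetical, and the sketch models only the pointer arithmetic, not interrupts or timing.

```python
BSTART = 0o1000      # first word of the buffer (512 decimal), as bstart
MASK = 0o1777        # as mask: keeps a pointer within 0..1023
AD_BITS = 0o1777     # a/d values are 10 bits wide

def advance(ptr):
    """Mirror of: tad ptr; iac; and mask; sna; jmp reset (or repos)."""
    nxt = (ptr + 1) & MASK
    return BSTART if nxt == 0 else nxt   # ran off the top: back to bottom

class Acquisition:
    def __init__(self):
        self.memory = [0] * 4096   # the full 12-bit address space
        self.ptr1 = BSTART         # next word for the main program to process
        self.ptr2 = BSTART         # next word for the a/d interrupt code to fill

    def put(self, value):
        """What the a/d interrupt service does with a new sample."""
        self.memory[self.ptr2] = value & AD_BITS
        self.ptr2 = advance(self.ptr2)

    def data_waiting(self):
        """The 'rather unsafe check': the pointers differ."""
        return self.ptr1 != self.ptr2

    def get(self):
        """What the main loop does before calling the processing routine."""
        value = self.memory[self.ptr1]
        self.ptr1 = advance(self.ptr1)
        return value

q = Acquisition()
q.put(5)
q.put(7)
print(q.data_waiting(), q.get(), q.get(), q.data_waiting())
```

If acquisition ever gets a full 512 samples ahead, ptr2 wraps round onto ptr1 and the buffer looks empty, which is why the listing's comments assume that this never happens.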

9. Limitations of the PDP-8 architecture.
The principal advantage of the PDP-8 as an introductory machine is its simplicity.
Despite its somewhat clumsy addressing mechanism, the machine is fairly easy to
program (perhaps because one can usually remember the entire instruction
repertoire(?)). Some disadvantages to the design are obvious. One would like more
instructions. It is, for example, tiresome to have to write a subroutine to perform an
"or" of two data elements through some contrived sequence of complementing and
"anding" of data. Certainly, a few additional instructions would allow for shorter
programs.
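
One such contrived sequence follows from De Morgan's law: a or b equals the complement of (complement of a) and (complement of b). A minimal Python sketch on 12-bit words, with hypothetical function names, shows the idea; it is not taken from the course's own PDP-8 subroutine.

```python
WORD_MASK = 0o7777   # a PDP-8 word is 12 bits

def complement(x):
    """One's complement of a 12-bit word."""
    return ~x & WORD_MASK

def or12(a, b):
    """Inclusive or built only from complementing and anding."""
    return complement(complement(a) & complement(b))

print(oct(or12(0o0017, 0o7400)))
```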
There are more substantial problems in virtually every aspect of the design. The
subroutine call mechanism proves extremely clumsy if one has many arguments and
results to pass; it's impossible to have recursion. The response to interrupts is
unnecessarily slow; a little extra hardware can make things a lot faster. The fixed size,
one word, for every datum is very cramping and leads to clumsy code. The arithmetic
facilities are inadequate: there are no multiply or divide instructions in the basic
machine, and it's difficult to detect arithmetic overflow. The address space is too small;
one can't have large programs. The paged addressing scheme leads to inefficient use
of even such memory as is available. There are too many restrictions on how one may
pass data to and from the arithmetic logic unit.
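
The cost of the paged addressing scheme can be made concrete with a small Python sketch. It assumes the standard PDP-8 memory reference format (a page bit of octal 0200 and a seven bit offset), which this section does not spell out, and the function names are hypothetical.

```python
PAGE_MASK = 0o7600     # top five address bits select one of 32 pages
OFFSET_MASK = 0o0177   # low seven bits: offset within a 128-word page
PAGE_BIT = 0o0200      # set: current page; clear: page zero

def direct_address(instr, pc):
    """Address named by a direct memory reference instruction at pc."""
    offset = instr & OFFSET_MASK
    if instr & PAGE_BIT:
        return (pc & PAGE_MASK) | offset   # same page as the instruction
    return offset                          # page zero

def reachable_directly(pc, target):
    """True if an instruction at pc can name target without indirection."""
    return target < 0o200 or (target & PAGE_MASK) == (pc & PAGE_MASK)

print(oct(direct_address(0o5207, 0o213)), reachable_directly(0o230, 0o600))
```

An instruction on page 1 cannot name location 600 directly, which is why the interrupt example reaches its processing routine through a pointer word (jms i procs). Every such cross-page reference costs an extra word of memory.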
Many of these problems will be addressed in the lecture course when more
sophisticated machine architectures are introduced.

NAME
assemble, exec8, trace8 - prepare and run programs for a simulated minicomputer

SYNOPSIS
/pub/211/assemble name
/pub/211/exec8
/pub/211/trace8

DESCRIPTION
exec8 and trace8 simulate the execution of programs on a Digital Equipment
Corporation PDP-8 minicomputer. assemble converts source programs, written in
PDP-8 assembly language, into object files for these two simulators.
trace8 is intended to help illustrate the basic "fetch-decode-execute" cycle of the
machine. It maintains a display of the cpu registers, of the bus connecting the
cpu to memory, and of a window into memory.
exec8 is used to illustrate more advanced topics, such as flag- and
interrupt-driven i/o. The displays showing the status of the simulated machine are less
comprehensive than those maintained by trace8; various components of these
displays are optionally selectable. exec8 also incorporates a simple interactive
"debugging" package that allows breakpoints to be set and the contents of
memory and cpu registers to be displayed.

FILES
/pub/211/source/symbols, symbol table containing standard opcode mnemonics
used by assemble.
'.8.kbd.t', '.8.tty.t', these two files (in the user's working directory) are the input
for the pseudo-keyboard and output for the pseudo-teletype of the simulated
PDP-8. If the program run on the simulator requires input then this must be
copied into the file .8.kbd.t prior to execution.
name, the file containing the user's PDP-8 source code.
object, a file, created by assemble, containing the assembled object code.
dumpfile, a file (optionally created by exec8 or trace8) showing the contents of
memory of the machine before and after execution of the simulated program.

SEE ALSO
Course notes describing the simulator displays and the debugging options built
into exec8.

BUGS
Please report bugs, by mail nabg, as they are found.

