Cordic Algorithm Implementation for Trigonometric Function Evaluation in Hp21mx by Hu, Peihsung Thomas
THE CORDIC ALGORITHM IMPLEMENTATION FOR
TRIGONOMETRIC FUNCTION EVALUATION
IN HP21MX




PEIHSUNG THOMAS HU
Bachelor of Science 
National Chiao Tung University 
Hsinchu, Taiwan 
1972 
Submitted to the Faculty of the Graduate College 
of the Oklahoma State University 
in partial fulfillment of the requirements 
for the Degree of


























































THE CORDIC ALGORITHM IMPLEMENTATION FOR 
TRIGONOMETRIC FUNCTION EVALUATION 
IN HP21MX 
Thesis Approved: 
Thesis Adviser
Dean of the Graduate College
PREFACE 
This paper describes the Cordic algorithm and its implementation
for the evaluation of the sine function on an HP21MX computer. A
polynomial method is also described and implemented on the HP21MX
computer for the purpose of comparing its results with those of the
Cordic algorithm. HP21MX microprogramming is also applied in this
experiment to increase programming efficiency.
I would like to express my gratitude to my major advisers,
Dr. Edward Shreve and Dr. G.E. Hedrick, for their advice and guidance
during this project. Also, appreciation is expressed to my other
committee member, Dr. T.E. Bailey, for his invaluable assistance in the
preparation of the final manuscript. Thanks are also extended to
Mrs. Pam Haught for typing this paper and for her invaluable help in
preparing the final copy.
TABLE OF CONTENTS 
Chapter 
I. INTRODUCTION
II. STANDARD TECHNIQUE FOR THE EVALUATION OF TRIGONOMETRIC
FUNCTIONS
III. THE CORDIC ALGORITHM
Introduction
Functional Description
Representation of Angles in Cordic
Sine and Cosine Algorithm
IV. COMPUTER IMPLEMENTATION AND PROGRAMMING RESULTS
System Features
Hardware Registers
Display Register
Interrupt System
APL Description of HP21MX
The Processor
Instruction Fetch
Instruction Decoding
Instruction Execution
Interrupt Service
Input/Output Interrupts
Memory Access Routine
Address Computation Routine
Instruction Execution Routine
Microprogramming
Conventional Control Section
Microprogrammed Control Section
The Microprogrammable Computer
Control Section
The Control Processor
Main Memory
Input and Output
Arithmetic and Logic Section
Implementation of a Polynomial Algorithm in 






































Implementation of the Cordic Algorithm on the
HP21MX Computer
Calculation of Execution Time
V. OTHER USES OF CORDIC
Arctangent Algorithm
Functional Description
Decimal to Binary Conversions in Cordic
VI. SUMMARY AND CONCLUSIONS
A SELECTED BIBLIOGRAPHY
APPENDIXES
APPENDIX A - FUNCTIONAL BLOCK DIAGRAM














LIST OF TABLES 
Table
I. Typical Rotation Computing Sequence
II. Typical Vectoring Computing Sequence
III. Interrupt Assignments
IV. "PROC" Program Segments
V. Decoding Vectors
VI. Instruction Classes
VII. The Navigation Matrix
VIII. Polynomial Method Implementation Results (Assembly
Language) of Evaluating the Sine Function
IX. Polynomial Method Implementation Results (Microprogram)
of Evaluating the Sine Function
X. Cordic Algorithm Implementation Results (Assembly
Language) of Evaluating the Sine Function
XI. Cordic Algorithm Implementation Results (Microprogram)
of Evaluating the Sine Function
XII. The Conventional Decimal-To-Binary Conversion
XIII. Decimal-To-Binary Conversions in Cordic
XIV. Generation of + Code for 45°
XV. The Comparison Between the Cordic Algorithm Implementation
Result and the Standard Sine Value
XVI. The Comparison Between the Polynomial Method Implementation
Result and the Standard Sine Value
LIST OF FIGURES 
Figure 
1. Typical Computing Step
2. Cordic Arithmetic Unit
3. Representation of Angles in Cordic
4. The Processor System Program
5. Input/Output Interrupt Generator
6. Instruction Decoding Matrices
7. Memory Access Operation
8. Address Computation Operation
9. EXEC Routine
10. A Microprogram Implementation of One Macroprogram
Instruction
11. Cordic Algorithm
12. The AHPL Description for the Cordic Algorithm in
Implementation in HP21MX Microprogram























































LIST OF SYMBOLS 
Function 
Address computation defined operation 
Instruction execution defined operation 
I/O interrupt generator system program 
Memory access defined operation 
Processor unit system program 
Run indicator 





I/O device flag 
Main memory 





OP code vector 
Current interrupt priority level 
T-bus 
v 56 I/O device control bit
X 16 X-register
Y 16 Y-register
z 56,8 I/O device data buffer
a,b,m,t,i,j Local vectors
d 2 Local vectors
e 4 Program exceptions
e0 Power fail
e1 Memory parity
e2 Dual-channel port controller 1
e3 Dual-channel port controller 2
g 16 Local vectors
h 2 Interrupt holder
h0 Exceptions
h1 I/O interrupt
l 16 Local vector
n 9 Navigation vector
n0,n1,n3 Branch control in EXEC
n2 Entry line in EXEC
n4 Instruction class
q 4 Memory access queue
r 4 Memory access request




CHAPTER I
INTRODUCTION
In the past, the transcendental functions were computed by
mathematicians using many different algorithms. Power series, polynomial
expansions, continued fractions, and Chebyshev polynomials have all
been used. Since the advent of large scale computing in the twentieth
century, many mathematical functions, including transcendental functions,
have been calculated by computers. As a general rule, multiplication
and division are very time-consuming operations compared to addition
and subtraction on a computer. A review of the conventional methods
used for evaluating transcendental functions, such as power series,
polynomial expansions, continued fractions, and Chebyshev polynomials,
shows that they require a number of multiplications and divisions,
which makes their implementation inefficient.
Therefore, much effort has been made to search for alternative methods
that better suit the requirements of speed and programming efficiency
for real-time applications.
Henry Briggs (17) first developed the concept of pseudo-division
and pseudo-multiplication in 1624. He used this method to generate a
table of logarithms.
In 1959, J. E. Volder (9) described a Coordinate Rotation Digital
Computer (Cordic) for the calculation of trigonometric functions,
multiplication, division, and conversion between binary and mixed radix
number systems. In the same year, Daggett (10) discussed the use of the
Cordic computer for decimal-binary conversion. In 1962, Meggitt (11)
developed a pseudo-division and pseudo-multiplication processor using
the Cordic technique, while in 1971 J. S. Walther (12) developed a
technique for calculating elementary functions using Cordic. David
S. Cochran (14) in 1972 implemented the Cordic algorithm in the HP-35
calculator, and Despain (13) in 1974 developed a technique for
Fourier transformation using the Cordic algorithm.
Generally speaking, the trigonometric functions are calculated by 
polynomial expansions, power series, or Chebyshev polynomials in most 
current general purpose computers. 
The major goal of this thesis is to implement the Cordic algorithm
in a general purpose computer for the evaluation of trigonometric
functions. The speed and accuracy of the results are observed and
compared with those of conventional algorithms. Microprogramming has
been used in this research to increase the program efficiency. The
anticipated result is a determination of the method of evaluating the
trigonometric functions that minimizes computer execution time while
still giving reasonable accuracy.
Only the sine function is implemented as a part of this research.
The tasks are divided into four parts:
1. Implement the Cordic algorithm in an assembly coded program.
2. Implement the Cordic algorithm in a microprogram.
3. Implement one of the conventional methods in an assembly 
coded program. 
4. Implement the same conventional method in a microprogram. 
CHAPTER II 
STANDARD TECHNIQUE FOR THE EVALUATION OF 
TRIGONOMETRIC FUNCTIONS 
The evaluation of elementary functions for various values of their
arguments is required to solve a number of mathematical problems.
Because of this, the computation of values of elementary functions was
an important factor in stimulating the development of mathematical
analysis. Therefore, a great deal of effort has been made by many
mathematicians in the past two centuries to find methods of evaluating
these elementary functions. Power series have been and still are used
for this purpose. Mercator used a power series for logarithms; Newton
then used power series for trigonometric and inverse trigonometric
functions; and Euler used one for the exponential function. Iterative
processes (e.g., Newton's method) were also applied for solving
equations (3). Furthermore, in the eighteenth century, many
mathematicians (Lambert, Euler, Lagrange, et al.) used continued
fractions to represent elementary functions. In recent years the
technique of expansions in orthogonal polynomials has been widely
applied for computing elementary functions. The Chebyshev polynomials,
which give good convergence, are widely used for this purpose too.
All the methods mentioned above are well documented and are described
in many mathematics books; thus it is not necessary to explain them
here. Power series are used in this paper as the conventional method
of evaluating trigonometric functions, for comparison with evaluations
using the Cordic algorithm. Therefore, for convenience, the power
series method is described as follows:
Power Series 
The elementary functions can be represented as power series in a number 










Truncating this at the nth term produces an nth-degree polynomial 









The polynomial $S_n(x)$ has the following properties:





where $S_n(x)$ is the unique nth-degree polynomial of best approximation
$P_n(x)$, for which
$f(x) - P_n(x) = O(x^n)$ (2.4)
If $f(x) = \sin(x)$, then $\sin(x)$ can be represented in a power series as:
$\sin(x) = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{(2k+1)!}$ (2.5)
$\cos(x)$ can be represented in a power series as:
$\cos(x) = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k}}{(2k)!}$ (2.6)
In order to implement this algorithm in a computer for the evaluation
of trigonometric functions, the number of terms (i.e., the constant k)
required for a specific accuracy is determined first.
To determine the constant k, the maximum accuracy of evaluation
in the computer must be known first. The computer used in this research
is an HP21MX, the memory word of which contains 16 bits. Although
multiple precision could be achieved by using multiple words in
arithmetic operations, single precision (a single word) is used for
both the Cordic algorithm and the power series here for the sake of
simplicity of programming.
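As an illustration of how the number of terms might be estimated, the following Python sketch counts the terms of the sine series (2.5) needed before the first omitted term drops below a tolerance. The tolerance of 2^-15 (one sign bit plus 15 fraction bits in a 16-bit word) and the argument range up to π/2 are assumptions made for this example, and `terms_needed` is a hypothetical helper name, not part of the thesis programs.

```python
import math

def terms_needed(x, tol):
    """Count the terms of the sine series kept so that the first omitted term is below tol.

    Because the series alternates with decreasing terms, the first omitted
    term bounds the truncation error.
    """
    n = 0                      # number of terms kept so far
    k = 0                      # index of the term being examined
    term = abs(x)              # magnitude of the k = 0 term
    while term >= tol:
        n += 1
        k += 1
        term = abs(x) ** (2 * k + 1) / math.factorial(2 * k + 1)
    return n

# Assumed tolerance: 15 fraction bits; assumed worst-case argument: pi/2.
n_terms = terms_needed(math.pi / 2, 2.0 ** -15)
print(n_terms)
```

Under these assumptions the raw series needs terms up to the ninth power of x, which is one reason fitted polynomials with fewer terms are attractive on a 16-bit machine.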
Hastings (4) set up three equations by using power series to 
evaluate the sine function, which are as follows: 
$\sin\left(\frac{\pi}{2} x\right) = c_1 x + c_3 x^3 + c_5 x^5$ (2.7)
$c_1 = 1.5706268$
$c_3 = -0.6432392$
$c_5 = 0.0727102$





$\sin\left(\frac{\pi}{2} x\right) = c_1 x + c_3 x^3 + c_5 x^5 + c_7 x^7 + c_9 x^9$ (2.9)





where $-1 \le x \le 1$
To determine which equation will be used in this paper, the 
maximum value of the error of each equation is checked. The maximum 
value of the error is 0.0001 for equation (2.7), 0.000001 for equation 
(2.8), and 0.000000005 for equation (2.9). For a 16-bit computer 
word, the maximum accuracy that can be represented is 5 decimal digits. 
The accuracy of equations (2.8) and (2.9) is more than 5 decimal 
digits. If they are used to evaluate sine functions in a 16-bit word 
machine, they will consume a lot more execution time than equation 
(2.7) with just a slightly more accurate result. Therefore, in order 
to get the best execution time and accuracy, equation (2.7) is used in 
this research. 
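A minimal Python sketch of equation (2.7) is given here for illustration. It uses the coefficients quoted above and evaluates the polynomial by Horner's rule, which is a choice made for this example and not necessarily the evaluation order of the thesis programs; `sin_hastings` is a hypothetical name.

```python
import math

# Coefficients of equation (2.7) as quoted in the text.
C1, C3, C5 = 1.5706268, -0.6432392, 0.0727102

def sin_hastings(x):
    """Approximate sin(pi/2 * x) for -1 <= x <= 1 with the fifth-degree polynomial (2.7)."""
    x2 = x * x
    # Horner form: x * (c1 + x^2 * (c3 + x^2 * c5)), four multiplications in total.
    return x * (C1 + x2 * (C3 + x2 * C5))

# Scan the argument range and record the largest deviation from the library sine.
max_err = max(
    abs(sin_hastings(i / 1000) - math.sin(math.pi / 2 * i / 1000))
    for i in range(-1000, 1001)
)
print(max_err)
```

The scan gives a maximum deviation on the order of 0.0001, in line with the error quoted for equation (2.7).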
CHAPTER III 
THE CORDIC ALGORITHM 
INTRODUCTION 
Cordic is a special purpose binary computer whose arithmetic unit
differs from the arithmetic unit of conventional computers. Although
Cordic is a single processor computer, its arithmetic unit is composed
of three shift registers and three adder-subtractors which operate in
parallel instead of sequentially. Each programmed operation is
accomplished in a fixed number of steps. Each step involves modifying
three numbers, which reside in the three arithmetic unit registers, by
adding or subtracting a constant for each one. The setting of all three
adder-subtractors is controlled by the sign of the quantity in one of
the arithmetic unit registers. In this way, calculations related to the
addition or subtraction of constants can be executed simultaneously.
Functional Description 
There are two computing modes in Cordic for the trigonometric
operations: ROTATION and VECTORING. In the ROTATION mode, the coordinate
components of a vector and an angle of rotation are given, and the
coordinate components of the original vector, after rotation through the
given angle, are computed. In the VECTORING mode, the coordinate
components of a vector are given, and the magnitude and angular argument
of the original vector are computed. The basic computing technique
used in both the ROTATION and VECTORING modes in Cordic is a
step-by-step sequence of pseudo-rotations which results in an overall
rotation through a given angle (ROTATION) or in a final angular argument
of zero (VECTORING).
It is necessary that the angular increments of rotation be computed
in decreasing order (9). In order to evaluate the sine and cosine
functions for angles from -180° to 180°, the magnitude chosen for the
first increment should be 90°. The expression for a set of coordinate
components, $X_1$ and $Y_1$, rotated through plus or minus 90° is simply
$Y_2 = \pm X_1 = R_1 \sin(\theta_1 \pm 90^\circ)$ (3.1)
$X_2 = \mp Y_1 = R_1 \cos(\theta_1 \pm 90^\circ)$ (3.2)
where $R_1$ and $\theta_1$ are the magnitude and angle of the vector
$(X_1, Y_1)$, and $X_2$ and $Y_2$ are the coordinates of the vector
$(X_1, Y_1)$ after rotating through 90°.
The first step is unique in that a perfect rotation step is performed.
The remaining computing steps can be clarified by examining the
relationships involved in a typical rotation step, which are shown in
Figure 1. Consider two given coordinate components, $Y_i$ and $X_i$, in
the plane coordinate system shown. In this discussion, the quantity i is
equal to the number of the particular step under consideration. The
components $Y_i$ and $X_i$ are associated with the ith step and describe
a vector of magnitude $R_i$ at an angle $\theta_i$ with respect to the origin












In Figure 1 the angle $\alpha_i$ is the magnitude of rotation associated





$\alpha_i = \tan^{-1} 2^{-(i-2)}$ (3.5)
The reason for choosing this particular magnitude of $\alpha_i$ is that a
rotation of coordinate components through $\pm\alpha_i$ may be accomplished
by the simple process of shifting and adding. The two choices of positive
or negative rotation are shown in Figure 1. The general expressions
for the rotated components are
$Y_{i+1} = \sqrt{1 + 2^{-2(i-2)}}\, R_i \sin(\theta_i \pm \alpha_i) = Y_i \pm 2^{-(i-2)} X_i$ (3.6)
and
$X_{i+1} = \sqrt{1 + 2^{-2(i-2)}}\, R_i \cos(\theta_i \pm \alpha_i) = X_i \mp 2^{-(i-2)} Y_i$ (3.7)
Note that the right-hand terms of (3.6) and (3.7) may be obtained
by two simultaneous shift-and-add operations, if the angular rotation
magnitude is restricted to (3.5). This is the fundamental relationship
upon which the Cordic computing technique is based.
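The relationships (3.6) and (3.7) can be checked numerically with a small Python sketch. The function name `cross_add`, the sign argument `v` (+1 or -1), and the floating-point arithmetic are assumptions made for illustration; in hardware the multiplication by a power of two is a right shift.

```python
import math

def cross_add(y, x, i, v):
    """One pseudo-rotation for step i >= 2 through v * atan(2**-(i-2)), with v = +1 or -1."""
    s = 2.0 ** -(i - 2)                 # in hardware: a right shift by i-2 places
    # Both new components are formed from the old values simultaneously.
    return y + v * s * x, x - v * s * y

y1, x1 = 0.3, 0.8                       # arbitrary starting components
y2, x2 = cross_add(y1, x1, i=3, v=+1)

growth = math.hypot(x2, y2) / math.hypot(x1, y1)
angle_change = math.atan2(y2, x2) - math.atan2(y1, x1)
print(growth, angle_change)
```

For i = 3 the vector turns through tan^-1 2^-1 and its length grows by the radical factor in (3.6) and (3.7), whichever sign is chosen.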
















Figure 1. Typical computing step 
Adding (or subtracting) a shifted value of $X_i$ to $Y_i$ to obtain
$Y_{i+1}$, while simultaneously subtracting (or adding) a shifted value
of $Y_i$ to $X_i$ to obtain $X_{i+1}$, is termed "cross addition".
The terms under the radical in (3.6) and (3.7) indicate the
increase in magnitude when i > 1; either of the two choices of direction
produces the same change in magnitude. If the rotation is always
through either a positive or negative $\alpha_i$ at each step, then the
increase in magnitude may be considered as a constant. This requirement
does not allow the choice of zero rotation at any step. In order to
identify the choice in a particular step, the choice may be represented
by the binary operator $v_i$, where $v_i$ can be either +1 or -1. This
gives
$Y_{i+1} = \sqrt{1 + 2^{-2(i-2)}}\, R_i \sin(\theta_i + v_i \alpha_i) = Y_i + v_i 2^{-(i-2)} X_i$ (3.8)
$X_{i+1} = \sqrt{1 + 2^{-2(i-2)}}\, R_i \cos(\theta_i + v_i \alpha_i) = X_i - v_i 2^{-(i-2)} Y_i$ (3.9)
where $v_i = +1$ or $-1$.
Similarly, after the completion of the rotation step in which the 
i + 1 terms are obtained, the i + 2 terms may be computed from these 
terms with the results 




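$Y_{i+2} = \sqrt{1 + 2^{-2(i-1)}} \sqrt{1 + 2^{-2(i-2)}}\, R_i \sin(\theta_i + v_i \alpha_i + v_{i+1} \alpha_{i+1})$ (3.10)
$X_{i+2} = \sqrt{1 + 2^{-2(i-1)}} \sqrt{1 + 2^{-2(i-2)}}\, R_i \cos(\theta_i + v_i \alpha_i + v_{i+1} \alpha_{i+1})$ (3.11)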
Likewise, these rotation steps can be continued through any
finite, pre-determined number of steps. Consider the initial coordinate




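Suppose the first rotation step is $\pm 90^\circ$ and the number of steps
is determined as n. The expressions for the final coordinate components
will be
$Y_{n+1} = \left( \sqrt{1 + 2^{-0}} \sqrt{1 + 2^{-2}} \cdots \sqrt{1 + 2^{-2(n-2)}} \right) R_1 \sin(\theta_1 + v_1 \alpha_1 + v_2 \alpha_2 + \cdots + v_n \alpha_n)$ (3.14)
and
$X_{n+1} = \left( \sqrt{1 + 2^{-0}} \sqrt{1 + 2^{-2}} \cdots \sqrt{1 + 2^{-2(n-2)}} \right) R_1 \cos(\theta_1 + v_1 \alpha_1 + v_2 \alpha_2 + \cdots + v_n \alpha_n)$ (3.15)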
The increase in magnitude of the components for a particular value
n is a constant and is represented by k. The value selected for n is a
function of the desired computing accuracy and can be a constant for a
particular computer. For example,
if n = 24,
k = 1.646760255.
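The constant can be reproduced with a short Python sketch that multiplies the radicals of steps 2 through n; step 1 is the exact 90° rotation and contributes no growth. The helper name `cordic_gain` is hypothetical, and double-precision floating point is assumed.

```python
import math

def cordic_gain(n):
    """Magnitude growth k after n steps.

    Step 1 (the exact 90 degree rotation) adds no growth; each step
    i = 2..n contributes sqrt(1 + 2**(-2*(i-2))).
    """
    k = 1.0
    for i in range(2, n + 1):
        k *= math.sqrt(1.0 + 2.0 ** (-2 * (i - 2)))
    return k

k24 = cordic_gain(24)
print(k24)
```

In double precision this gives a value of about 1.64676, which agrees with the figure quoted above in its leading digits.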
The basic components required to perform the cross-addition are shown
in Figure 2. It has not yet been shown how the prescribed sequence of
rotation steps can be controlled to effect the desired over-all
rotation. By examination of (3.14) and (3.15), the rotation of a set of




In the VECTORING mode, 








The sequences of (3.18) and (3.19) form a special radix representation
equivalent to the desired angle, $\lambda$ or $\theta$, where




$\alpha_2 = \tan^{-1} 2^{-0} = 45^\circ$ (3.21)
$\alpha_3 = \tan^{-1} 2^{-1}$ (3.22)
$\alpha_i = \tan^{-1} 2^{-(i-2)}$ (3.23)
The $\alpha$ terms are referred to as ATR (Arctangent Radix) constants and
are precomputed and stored in the computer. The $v$ terms are referred
to as ATR digits and are determined during each operation.
In the Cordic computer, the ATR digits are determined sequentially,
















[Figure: block diagram; recoverable labels: angle register, ATR constants]
Figure 2. Cordic Arithmetic Unit
action of the adder-subtractors in the arithmetic unit. The following
paragraphs contain a description of the manner in which the ATR code
representation, $v_1, v_2, v_3, \ldots, v_n$, can be determined for any
given angle, $\lambda$ or $\theta$.
First, for any angle $\lambda$ or $\theta$, there must be at least one set
of values of $v$ for the operators that will satisfy (3.18) and (3.19).
Second, a simple technique must be available for determining the ATR
code digits that satisfy these equations. The following relationships
are necessary and sufficient for any sequence of radix constants to
meet the above requirements (3.9).
(3.24) 
(3.25) 
For the satisfaction of (3.20) through (3.23), it is required that
$\lambda$ or $\theta$ be constrained by
$-180^\circ < \lambda \text{ or } \theta < +180^\circ$ (3.26)
Equation (3.26) imposes no special consideration if the two's complement
notation is used. By employing an additional register and
adder-subtractor (identified in Figure 2 as the angle register), the
relationship of (3.16) (ROTATION mode) can be instrumented by 1) sensing
the sign of the angle of rotation (or of the remainder if i > 1) and
2) either subtracting or adding to the angle the ATR constant
corresponding to the particular step. In each step, the relationship
instrumented is
$|\lambda_{i+1}| = \left| |\lambda_i| - \alpha_i \right|$ (3.27)




Equation (3.24) is equivalent to 
(3.28) 
Application of the relationships of (3.25) results in 
I A 






Equation (3.30) can be used to prove that the remainder in the angle
register converges to zero in the ROTATION mode (9).
The sequence of operation signs used to null $\lambda$ to zero is the
negative of the equivalent ATR code for the original angle. More
simply, the ATR code digit of each step is equal to the sign of the
quantity in the angle register before each step. Therefore,
simultaneously with each step in the angle register, the ATR code digit
may be used to control the cross-addition step in the Y and X registers
(shown in Figure 2) to effect a rotation of components through an equal
angular increment.
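The sign-sensing rule just described can be sketched in Python as follows. The function names, the use of degrees, the choice of 16 steps, and the convention that a zero remainder counts as positive (sign digit 0) are assumptions made for this illustration.

```python
import math

def atr_constants(n):
    """ATR constants in degrees: 90, then atan(2**-(i-2)) for i = 2..n."""
    return [90.0] + [math.degrees(math.atan(2.0 ** -(i - 2))) for i in range(2, n + 1)]

def atr_digits(angle_deg, n):
    """Null the angle register step by step; return the digits v_i and the final remainder."""
    digits = []
    remainder = angle_deg
    for alpha in atr_constants(n):
        v = 1 if remainder >= 0 else -1     # digit = sign of the register before the step
        remainder -= v * alpha              # subtract if positive, add if negative
        digits.append(v)
    return digits, remainder

digits, remainder = atr_digits(100.0, 16)
reconstructed = sum(v * a for v, a in zip(digits, atr_constants(16)))
print(digits, remainder, reconstructed)
```

The signed sum of the ATR constants reproduces the original angle to within the last constant, which is the sense in which the digits form a radix representation of the angle.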
The proof of the convergence of the effective angular argument
$\theta_{n+1}$ to zero, which is necessary in the VECTORING mode, may be
obtained by replacing $\lambda$ by $\theta$. The sign of the angle
$\theta_i$ is obtained by sensing the sign of $Y_i$. The sequence of
signs of $Y_i$ is the negative of the ATR code for the effective
rotation performed on the components $Y_1$ and $X_1$.
During each cross-addition operation in the Y and X registers, the
corresponding ATR constant can be conditionally added or subtracted,
depending on $v_i$, to an accumulating sum in the angle register so that,
at the end of the computing sequence, when $\theta_{n+1} = 0$, the
quantity in the angle register will be equal to the original angular
argument $\theta_1$ of the coordinate components $Y_1$ and $X_1$.
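A floating-point Python sketch of the VECTORING mode is given below for illustration. The function name `cordic_vectoring`, the use of degrees, and the 16-step default are assumptions; the returned X value still carries the magnitude growth of the pseudo-rotations and must be divided by that constant to obtain the true magnitude.

```python
import math

def cordic_vectoring(y, x, n=16):
    """Drive Y toward zero; return (final X, accumulated angle in degrees)."""
    angle = 0.0
    # Step 1: exact rotation through 90 degrees, directed against the sign of Y.
    v = -1 if y >= 0 else 1
    y, x = v * x, -v * y
    angle -= v * 90.0
    # Steps 2..n: cross-addition controlled by the sign of Y.
    for i in range(2, n + 1):
        s = 2.0 ** -(i - 2)
        v = -1 if y >= 0 else 1
        y, x = y + v * s * x, x - v * s * y
        angle -= v * math.degrees(math.atan(s))     # accumulate the ATR constant
    return x, angle

x_final, theta = cordic_vectoring(1.0, -1.0)
print(x_final, theta)
```

For the vector with Y = 1 and X = -1 the accumulated angle approaches 135°, and the final X is close to the growth constant times the square root of 2.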
The step-by-step results of a typical rotation computing sequence
are shown in Table I. The two's complement notation is used for all
quantities, and shifted quantities are truncated without round-off. The
step-by-step results of a typical vectoring computing sequence are
shown in Table II.
Representation of Angles in Cordie 
In Cordic, angles are represented as a binary fraction of a half
revolution ($\pi$) with two's complements for negative angles, as shown
in Figure 3. Since a one to the left of the binary point is used to
represent a negative quantity in the two's complement system, angles
from +180° to slightly less than +360° are interpreted internally as
negative angles measured clockwise from 0°. For example, 45° in
Cordic is
$\frac{\pi/4}{\pi} = (0.25)_{10} = (0.01)_2$
For 90° the Cordic representation is
$\frac{\pi/2}{\pi} = (0.5)_{10} = (0.1)_2$
For 270° the Cordic representation is
$\frac{3\pi/2}{\pi} = (1.5)_{10} = (1.1)_2$
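The representation can be illustrated with a short Python sketch. The 16-bit word width (one bit to the left of the binary point, 15 fraction bits) and the helper names `angle_to_word` and `word_to_angle` are assumptions made for this example.

```python
def angle_to_word(degrees, bits=16):
    """Encode an angle as a two's complement binary fraction of a half revolution."""
    scale = 1 << (bits - 1)                  # weight of the leftmost bit: 180 degrees
    return round(degrees / 180.0 * scale) & ((1 << bits) - 1)

def word_to_angle(word, bits=16):
    """Decode a word back to degrees in the range -180 <= angle < 180."""
    if word & (1 << (bits - 1)):             # a leading one marks a negative angle
        word -= 1 << bits
    return word * 180.0 / (1 << (bits - 1))

for deg in (45, 90, 270):
    w = angle_to_word(deg)
    print(deg, format(w, "016b"), word_to_angle(w))
```

With this scaling 45° is stored as 0.01, 90° as 0.1, and 270° as 1.1 (followed by zeros), and 270° decodes as -90°, matching the interpretation described above.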
TABLE I
TYPICAL ROTATION COMPUTING SEQUENCE
Y Register        X Register        Angle Register
Y1 = 0.0101110    1.1000101 = X1    0.1100101 = λ
   + 1.1000101    - 0.0101110       - 0.1000000    tan^-1 ∞
     1.1000101      1.1010010         0.0100101
   + 1.1010010    - 1.1000101       - 0.0100000    tan^-1 1
     1.0010111      0.0001101         0.0000101
   + 0.0000110    - 1.1001011       - 0.0010010    tan^-1 2^-1
     1.0011101      0.1000010         1.1110011
   - 0.0010000    + 1.1100111       + 0.0001001    tan^-1 2^-2
     1.0001101      0.0101001         1.1111100
   + 1.1110001    + 0.0000101       - 0.0000101    tan^-1 2^-3
     1.0001000      0.0011010         0.0000001
   + 0.0000001    - 1.1111000       - 0.0000010    tan^-1 2^-4
     1.0001001      0.0100010         1.1111111
   - 0.0000001    + 1.1111100       + 0.0000001    tan^-1 2^-5
     1.00010000     0.0011110         0.0000000
TABLE II
TYPICAL VECTORING COMPUTING SEQUENCE
Y Register        X Register          Angle Register
Y1 = 0.0101110    1.1000101 = X1      0.0000000
   - 1.1000101    + 0.0101110         + 0.1000000    tan^-1 ∞
     0.0111011      0.0101110           0.1000000
   - 0.0101110    + 0.0111011         + 0.0100000    tan^-1 1
     0.0001101      0.1101001           0.1100000
   - 0.0110100    + 0.0000110         + 0.0010010    tan^-1 2^-1
     1.1011001      0.1101111           0.1110010
   + 0.0011011    - 1.1110110         - 0.0001001    tan^-1 2^-2
     1.1110100      0.1111001           0.1101001
   + 0.0001111    - 1.1111110         - 0.0000101    tan^-1 2^-3
     0.0000011      0.1111011           0.1100100
   - 0.0000111    + 0.0000111         + 0.0000010    tan^-1 2^-4
     1.1111111      0.1111100 = K R1    0.1100101 = θ1
Sine and Cosine Algorithm 
As mentioned above, there are two computing modes for Cordic,
ROTATION and VECTORING. Evaluating sine or cosine functions makes use
of the ROTATION mode by setting the original vector on the X-axis and
rotating the vector through the angular argument whose sine or cosine
is to be computed.
Functional Description
In order to use the ROTATION computing sequence (Table I) of Cordic
to evaluate sine and cosine functions, several initial conditions and





1) The Y-register is initialized with 0.
2) The X-register is initialized with a unit vector.
3) The A-register is initialized with the angle whose sine or
cosine is to be computed.
4) A sign digit of 0 in the A-register establishes a $v_i$ of +1,
which causes the top adder-subtractor to add, the middle
adder-subtractor to subtract, and the bottom adder-subtractor
to subtract. A sign digit of 1 has the opposite effect.
5) The number of steps (iterations) is initialized depending
on the desired accuracy.
The Cordic ROTATION computing sequence is started as shown in
Table I.
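The initial conditions listed above can be put together in a fixed-point Python sketch of the ROTATION sequence. This is only an illustration under stated assumptions, not the assembly or microprogram of this thesis: the scaling constants `FRAC_BITS` and `HALF_REV`, the function name `cordic_sin_cos`, and the decision to preload X with 1/k (so that no final scaling is needed) are all choices made for this example. Python's `>>` on negative integers truncates toward minus infinity, which matches two's complement shifting without round-off.

```python
import math

FRAC_BITS = 14                 # assumed scaling: 1.0 is stored as 1 << 14
ONE = 1 << FRAC_BITS
HALF_REV = 1 << 15             # assumed angle scaling: 180 degrees is 1 << 15

def cordic_sin_cos(angle_deg, n=16):
    """Return (sin, cos) for -180 < angle_deg < 180 using n shift-and-add steps."""
    # Magnitude growth k of steps 2..n; X is preloaded with 1/k (a choice made here).
    k = 1.0
    for i in range(2, n + 1):
        k *= math.sqrt(1.0 + 4.0 ** -(i - 2))
    # ATR constants tan^-1 2^-(i-2), expressed in angle-register units.
    atr = [round(math.atan(2.0 ** -s) / math.pi * HALF_REV) for s in range(n - 1)]

    y, x = 0, round(ONE / k)                    # Y = 0, X = scaled unit vector
    a = round(angle_deg / 180.0 * HALF_REV)     # angle register

    # Step 1: exact rotation through +90 or -90 degrees.
    v = 1 if a >= 0 else -1
    y, x = v * x, -v * y
    a -= v * (HALF_REV >> 1)

    # Steps 2..n: cross-addition controlled by the sign of the angle register.
    for i in range(2, n + 1):
        s = i - 2
        v = 1 if a >= 0 else -1
        y, x = y + v * (x >> s), x - v * (y >> s)
        a -= v * atr[s]
    return y / ONE, x / ONE

print(cordic_sin_cos(30.0))
```

With 14 fraction bits and 16 steps the results agree with the library sine and cosine to roughly three decimal places; the remaining error comes from truncated shifts and rounded ATR constants.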
The final result is in the Y-register if the function evaluated is
sine and in the X-register if the function evaluated is cosine after











[Figure: angle circle; recoverable labels: 0.00, 1.00, 1.10 (-90° or 270°), -135° or 225°, NEGATIVE]
Figure 3. Representation of Angles in Cordic.
CHAPTER IV 
COMPUTER IMPLEMENTATION AND 
PROGRAMMING RESULTS 
The four tasks described in Chapter I are performed and the
programming results are obtained in this chapter.
The description of the HP21MX computer which is used to aid
this research is given below.
System Features 
The HP21MX computer is a powerful user-microprogrammable
minicomputer with 178 micro-instructions and 4K words of control space.
Each word is 24 bits long. It has 128 standard instructions: 80
emulate the HP 2100 series computer; 42 are new instructions for
indexing, byte and bit manipulation, byte and word moves, and byte
string scanning; and 6 are single-precision floating point
instructions. There are four general purpose registers, two of
which may be used as index registers. It is a fully microprogrammed
processor, including all arithmetic functions, input/output, and
operator panel control. Writable Control Store (WCS) is optional.
The read-only memory (ROM) modules in which microprograms are
stored are referred to collectively as control store. Standard control
store consists of 1,024 directly addressable locations configured into
four modules of 256 locations each. Each control store location
accommodates one micro-instruction, which in turn consists of a 24-bit
word encompassing six micro-orders. The control store address space of
each processor is 4,096 words.
Microprograms in standard control store for executing the various
machine functions are divided into three groups:
Base instruction set (modules 0 and 1) 
Floating point instructions (module 14) 
Extended instruction group (module 15) 
Unused modules of control store are available for user-supplied
microprograms. Microinstructions in control store are 24 bits long,
whereas machine language instructions residing in main memory are
16 bits long. In addition, microinstructions have access to many
internal registers and logic functions that machine language
instructions cannot use.
The Writable Control Store (WCS) option provides a read-write 
control store module which can be used for the development and execution 
of user-supplied microprograms. Microprograms in WCS are executed at 
the same speed as those in the read-only control store. 
Hardware Registers 
A-register
A 16-bit accumulator which holds the results of arithmetic
and logical operations performed by programmed instructions.
B-register
Serves the same purpose as the A-register, but is independent
from it.
M-register
A 16-bit register used to hold the memory address which is
currently being accessed by the CPU.
T-register
A 16-bit register used to hold the data which are stored
into or retrieved from memory.
P-register
Program counter, 16 bits long, pointing to the next instruction
to be fetched from memory.
S-register
A 16-bit utility register. In the halt or run mode, it can
be loaded via the display register.
Extend register
A one-bit register used to link the A- and B-registers by
rotation instructions or to indicate a carry from the most
significant bit (bit 15) of the A- or B-register by an add
instruction or increment instruction.
Overflow register
A one-bit register used to indicate that an add instruction,
divide instruction, or an increment instruction has caused the
A-register or B-register to exceed the maximum positive or negative
number that can be contained in these registers.
Display register
A 16-bit register included in the front panel and used to
display and modify the contents of the six 16-bit working registers
when the computer is in the halt mode.
X- and Y-registers
Two 16-bit registers serving as indexing registers which are
accessed through the use of 30 index register instructions and
2 jump instructions.
S1 to S12 scratch pad registers
Twelve registers (each 16 bits long) used to temporarily




The vectored priority interrupt system has up to 60 distinct
interrupt levels, each of which has a unique priority assignment.
Each interrupt level is associated with a numerically corresponding
interrupt location in memory.
Of the 60 interrupt levels, the first two are reserved for
hardware faults (power failure and parity error); the next two are
reserved for the Dual-Channel Port Controller completion interrupts;
and the remaining levels are available for I/O device channels.
Table III lists the interrupt levels in priority order for the HP
2108 processor of the 21MX.
APL Description of HP21MX 
In the APL description of the HP21MX, the computer system is
described as seen by a programmer, and the description is independent
of any particular hardware implementation. Instructions which are not
connected with this research are not included in this description.
Iverson (2) gives a complete definition of the notation used. The
description is based on the HP21MX Computer Series Reference
Manual (5) and consists of a set of programs and tables.
* Macroprogram - programs stored in main memory. 







TABLE III
INTERRUPT ASSIGNMENTS
Level    Interrupt Location    Assignment
04       00004                 Power Fail Interrupt
05       00005                 Memory Parity/Protect Interrupt
06       00006                 DCPC Channel 1 Completion Interrupt
07       00007                 DCPC Channel 2 Completion Interrupt
10       00010                 I/O Device (highest priority)
11-20    00011-00020           I/O Device (Mainframe)
21-42    00021-00042           I/O Device (Extender No. 1)
43-64    00043-00064           I/O Device (Extender No. 2)
The programs are either system programs or defined operations. 
All programs operate concurrently and continuously, with one line 
active in each program. The defined operation program operates only 
when invoked by another program. In the description presented, PROC 
and IOIG are system programs, whereas ADC, EXEC, and MAC are defined 
operations. 
The Processor 
The PROC program, Figure 4, describes the sequencing and execution
of instructions and the servicing of interrupts. The program
segments and their functions are summarized in Table IV.
TABLE IV 
"PROC" PROGRAM SEGMENTS 
Lines Function 
1-4 Instruction fetch 
5-14 Instruction decoding 
15-26 Instruction execution 
27-30 Trap interrupt service 
Instruction Fetch 
The first step in program execution is to fetch the instruction
from memory. In order to prepare for instruction fetch, the exceptions
vector is initialized to zero (line 1). The 16-bit instruction is
fetched from memory at the address given by the program counter, and
placed in the instruction register (line 2). The program counter is
incremented by 1 (line 3), and in case of any exceptions during
instruction fetch, control branches to line 27. Exceptions during
fetch may be due to errors in the parity check.
[Figure 4. The Processor System Program]
















To determine the operation specified by the instruction, the 
instruction is decoded next. Because the operation code of an instruction 
in this machine may vary from 4 bits to 16 bits, and because several 
microinstructions may be packed into a single instruction word for some 
types of instructions, the decoding task is complicated and tedious. 
Many steps and two sets of decoding vectors, named U and Q, are used in 
this APL description to aid the decoding task. These two sets of 
vectors are listed in Table V. The instructions are divided into 13 
classes, which are summarized in Table VI. The number given in 
that table is used to identify the class of the instruction during 
decoding. 
The class identifiers j and i are initialized in lines 5 and 6. 
The decoding vectors U_i and Q_i are used in lines 7, 8, and 9 to identify 
the class of the current instruction. Once the class of the current 
instruction is found, it is recorded in j (line 10). 
The components of the selection vector k take on the values of the 
fields depending on j (lines 11 and 12). Lines 13 and 14 interpret 
the instruction by selecting a row N^i from the navigation matrix 
N (Table VII) to specify the vector n used in subsequent control of 
the instruction execution. The row of N selected is determined by an 
element of a particular decoding matrix D, Figure 6, specified by 




Class  U                          Q
WMI    U1  = (1000101111111110)   Q1  = (1111111111111110) 
JMPI   U2  = (1000101111110010)   Q2  = (1111111111110111) 
BIMI   U3  = (1000101111111000)   Q3  = (1111111111111000) 
BYMI   U4  = (1000101111110000)   Q4  = (1111111111111000) 
DMI    U5  = (1000001111000000)   Q5  = (1111011111100000) 
IRI    U6  = (1000001111100000)   Q6  = (1111011111100000) 
FPI    U7  = (1000101000000000)   Q7  = (1111010000000000) 
EAMR   U8  = (1000000000000000)   Q8  = (1111010001110000) 
EAR    U9  = (1000000000000000)   Q9  = (1111010000000000) 
IOI    U10 = (0000010000000000)   Q10 = (1111010000000000) 
A/S    U11 = (0000010000000000)   Q11 = (1111010000000000) 
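
One plausible reading of Table V, assumed here rather than stated in the text, is that each Q_i acts as a mask and each U_i as the bit pattern that must remain after masking. The following Python sketch tries the classes in table order and reports the first match. Only the first four rows of Table V are used, and the function name `classify` is hypothetical.

```python
# (class, pattern U, mask Q) triples copied from the first four rows of Table V.
# Assumption: an instruction belongs to class i when (instruction AND Q_i) == U_i.
DECODING_VECTORS = [
    ("WMI",  0b1000101111111110, 0b1111111111111110),
    ("JMPI", 0b1000101111110010, 0b1111111111110111),
    ("BIMI", 0b1000101111111000, 0b1111111111111000),
    ("BYMI", 0b1000101111110000, 0b1111111111111000),
]


def classify(instruction):
    """Return the first class whose masked pattern matches, or None."""
    for name, pattern, mask in DECODING_VECTORS:
        if (instruction & mask) == pattern:
            return name
    return None
```

Under this reading the order of the tests matters: the JLY code 1000101111110010 also satisfies the BYMI pattern, so JMPI has to be tried before BYMI, which agrees with the order of the rows in Table V.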




MRI: Memory reference instructions 
WMI: Word manipulation instructions 
JMPI: Jump instructions 
BIMI: Bit manipulation instructions 
BYMI: Byte manipulation instructions 
DMI: Dynamic mapping system instructions 
IRI: Index register instructions 
FPI: Floating point instructions 
EAMR: Extended arithmetic memory reference 
instructions 

























TABLE VII 
THE NAVIGATION MATRIX 
n0 n1 n2 n3  Class Index Mnemonic Name                          Op Code 
0 0 a0 a3    MRI   1     ADA      Add to A                      -1000----------- 
1 0 a0 a3    MRI   2     ADB      Add to B                      -1001----------- 
0 b0 bs      IRI   3     ADX      Add memory to X               1000101111100110 
1 b0 bs      IRI   4     ADY      Add memory to Y               1000101111101110 
0 e0 e3      S/R   5     ALF      Rotate A left four            0000001111-1-111 
0 e0 e4      S/R   6     ALR      A left shift, clear sign      0000001100-1-100 
0 e0 e5      S/R   7     ALS      A left shift                  0000001000-1-000 
0 0 a0 a2    MRI   8     AND      "AND" to A                    -0010----------- 
0 e0 e6      S/R   9     ARS      A right shift                 0000001001-1-001 
c0           EAR   10    ASL      Arithmetic shift left         100000000001 
c1           EAR   11    ASR      Arithmetic shift right        100000100001 
1 e0 e3      S/R   12    BLF      Rotate B left four            0000101111-1-111 
1 e0 e4      S/R   13    BLR      B left shift, clear sign      0000101100-1-100 
1 e0 e5      S/R   14    BLS      B left shift                  0000101000-1-000 
1 e0 e6      S/R   15    BRS      B right shift                 0000101001-1-001 
0 2 b4 b8    IRI   16    CAX      Copy A to X                   1000001111100001 
0 3 b4 b8    IRI   17    CAY      Copy A to Y                   1000001111111100 
             BIMI  18    CBS      Clear bits                    1000101111111100 
             BYMI  19    CBT      Compare bytes                 1000101111110110 
1 2 b4 b8    IRI   20    CBX      Copy B to X                   1000101111101001 
1 3 b4 b8    IRI   21    CBY      Copy B to Y                   1000101111101001 
0 f0 -       A/S   22    CCA      Clear and complement A        00000111------- 
1 f0 -       A/S   23    CCB      Clear and complement B        00001111------- 
f7           A/S   24    CCE      Clear and complement E        0000-1--11----- 
0 f2 -       A/S   25    CLA      Clear A                       00000101------- 
1 f2         A/S   26    CLB      Clear B 
0 d0 d2      IOI   27    CLC      Clear control                 100011-111 
f5 -         A/S   28    CLE      Clear E                       0000-1--01----- 
0 d0 d6      IOI   29    CLF      Clear flag                    1000-11001 
             IOI   30    CLO      Clear overflow                1000011001000001 
1 b4 b9      IRI   47    DSY      Decrement Y and skip if zero  1000101111111001 
0 0 e0 e7    S/R   48    ELA      Rotate E left with A          0000001110-1-110 
1 0 e0 e7    S/R   49    ELB      Rotate E left with B          0000101110-1-110 
0 1 e0 e7    S/R   50    ERA      Rotate E right with A         0000001101-1-101 
1 0 e0 e7    S/R   51    ERB      Rotate E right with B         0000101101-1-101 
             FPI   52    FAD      Floating point add            1000101000000000 
             FPI   53    FDV      Floating point divide         1000101000110000 
             FPI   54    FIX      Floating point to integer     1000101001000000 
             FPI   55    FLT      Integer to floating point     1000101001010000 
             FPI   56    FMP      Floating point multiply       1000101000100000 
             FPI   57    FSB      Floating point subtract       1000101000010000 
0 d0 d11     IOI   58    HLT      Halt                          1000-1-000 
             A/S   59    INA      Increment A                   000001-------1-- 
             A/S   60    INB      Increment B                   000011-------1-- 
1 0 a0 a2    MRI   61    IOR      "Inclusive OR" to A           -011------------ 
0 0 b4 b9    IRI   62    ISX      Increment X and skip if zero  1000101111110000 
0 1 b4 b9    IRI   63    ISY      Increment Y and skip if zero  1000101111111000 
             A/S   31    CMA      Complement A                  00000110------- 
             A/S   32    CMB      Complement B                  00001110------- 
f6           A/S   33    CME      Complement E                  0000-1--10 
             WMI   34    CMW      Compare words                 1000101111111110 
0 a0 as      MRI   35    CPA      Compare to A                  -1010----------- 
1 a0 as      MRI   36    CPB      Compare to B                  -1011----------- 
2 0 b4 bs    IRI   37    CXA      Copy X to A                   1000001111100100 
2 1 b4 bs    IRI   38    CXB      Copy X to B                   1000101111100100 
3 0 b4 b8    IRI   39    CYA      Copy Y to A                   1000001111101100 
3 1 b4 b8    IRI   40    CYB      Copy Y to B                   1000101111101100 
             EAMR  41    DIV      Divide                        100000010000 
             DMI   42    DJP      Disable mem and jump          1000101111011010 
             DMI   43    DJS      Disable mem and jump to subroutine  1000101111011011 
             EAMR  44    DLD      Double load                   100010001000 
             EAMR  45    DST      Double store                  100010010000 
1 0 b4 b9    IRI   46    DSX      Decrement X and skip if zero  100010111110001 
a0 a9        MRI   64    ISZ      Increment and skip if zero    -0111----------- 
g0           JMPI  65    JLY      Jump and load Y               1000101111110010 
1 a1 a13     MRI   66    JMP      Jump                          -0101----------- 
g4           JMPI  67    JPY      Jump indexed by Y             1000101111111010 
             DMI   68    JRS      Jump and store status         1000101111001101 
0 a1 a12     MRI   69    JSB      Jump to subroutine            -0011----------- 
0 0 b0 b11   IRI   70    LAX      Load A indexed by X           1000001111100010 
0 1 b0 b11   IRI   71    LAY      Load A indexed by Y           1000001111101010 
             BYMI  72    LBT      Load byte                     1000101111110011 
1 0 b0 b11   IRI   73    LBX      Load B indexed by X           1000101111000010 
1 1 b0 b11   IRI   74    LBY      Load B indexed by Y           1000101111101010 
0 0 a1 a7    MRI   75    LDA      Load A                        -1100----------- 
1 0 a1 a7    MRI   76    LDB      Load B                        -1101----------- 
0 b0 b12     IRI   77    LDX      Load X from memory            1000101111100101 
1 b0 b12     IRI   78    LDY      Load Y from memory            1000101111101101 
             DMI   79    LFA      *Load fence from A            1000001111010111 
             DMI   80    LFB      *Load fence from B            1000101111010111 
0 0 d0 d12   IOI   81    LIA      Load into A                   100001-101------ 
1 0 d0 d12   IOI   82    LIB      Load into B                   100011-101------ 
c2           EAR   83    LSL      Logical shift left            10000000001----- 
c3           EAR   84    LSR      Logical shift right           100000100010---- 
             DMI   85    MBF      Move bytes from alternate map 1000101111000011 
             DMI   86    MBI      Move bytes into alternate map 1000101111000010 
             BYMI  87    MBT      Move bytes                    1000101111110101 
             DMI   88    MBW      Move bytes within alternate map  1000101111000100 
0 0 d0 d13   IOI   89    MIA      Merge into A                  100001-100------ 
1 d0 d13     IOI   90    MIB      Merge into B                  100011-100------ 
             EAMR  91    MPY      Multiply                      100000001000---- 
             WMI   92    MVW      Move words                    1000101111111111 
             DMI   93    MWF      Move words from alternate map 1000101111000110 
             DMI   94    MWI      Move words into alternate map 1000101111000101 
             DMI   95    MWW      Move words within alternate map  1000101111000111 
-            S/R   96    NOP      No operation                  0000000000000000 
0 d0 d20     IOI   97    OTA      Output A                      100001-110------ 
0 d0 d20     IOI   98    OTB      Output B                      100011-110------ 
             DMI   99    PAA      Load/store port A map per A   1000001111001010 
             DMI   100   PAB      Load/store port A map per B   1000101111001010 
             DMI   101   PBA      Load/store port B map per A   1000001111001011 
             DMI   102   PBB      Load/store port B map per B   1000101111001011 
0 0 e0 e9    S/R   103   RAL      Rotate A left                 0000001010-1-010 
1 0 e0 e9    S/R   104   RAR      Rotate A right                0000001011-1-011 
0 1 e0 e9    S/R   105   RBL      Rotate B left                 0000101010010010 
1 1 e0 e9    S/R   106   RBR      Rotate B right                0000101011-1-011 
c4           EAR   107   RRL      Rotate left                   100000000100---- 
c5           EAR   108   RRR      Rotate right                  100000100100---- 
             DMI   109   RSA      Read status register into A   1000001111011000 
             DMI   110   RSB      Read status register into B   1000101111011000 
             A/S   111   RSS      Reverse skip sense            0000-1---------1 




             DMI   113   RVB      Read violation register into B  1000101111011001 
0 0 b0 b14   IRI   114   SAX      Store A indexed by X          1000001111100000 
0 0 b0 b14   IRI   115   SAY      Store A indexed by Y          1000001111101000 
             BIMI  116   SBS      Set bits                      1000101111111011 
             BYMI  117   SBT      Store byte                    1000101111110100 
1 0 b0 b14   IRI   118   SBX      Store B indexed by X          1000101111100000 
1 1 b0 b14   IRI   119   SBY      Store B indexed by Y          1000101111101000 
             A/S   120   SEZ      Skip if E is zero             0000-1----1----- 
             BYMI  121   SFB      Skip if flag clear            1000-10010------ 
0 d0 d14     IOI   122   SFC      Skip if flag clear            1000-10011------ 
0 d0 d16     IOI   123   SFS      Skip if flag set              1000-10011------ 
             DMI   124   SJP      Enable system map and jump    1000101000100000 
             DMI   125   SJS      Enable system map and jump to subroutine  1000101111011101 
             S/R   126   SLA      Skip if LSB of A is zero      00000-------1--- 
             S/R   127   SLB      Skip if LSB of B is zero      000010------1--- 
0 d0 d14     IOI   128   SOC      Skip if overflow clear        100001-010000001 
0 d0 d16     IOI   129   SOS      Skip if overflow set          100001-011000001 
             A/S   130   SSA      Skip if sign of A is zero     000001-----1---- 
             A/S   131   SSB      Skip if sign of B is zero     000011-----1---- 
             DMI   132   SSM      Store status register into memory  1000101111001100 
0 1 a1 a7    MRI   133   STA      Store A                       -1110----------- 
1 1 a1 a7    MRI   134   STB      Store B                       -1111----------- 
0 d0 d18     IOI   135   STC      Set control                   100001-111------ 
0 d0 d19     IOI   136   STF      Set flag                      1000-10001------ 
0 d0 d19     IOI   137   STO      Set overflow                  1000010001000001 
0 b0 b13     IRI   138   STX      Store X to memory             1000101111100011 
1 b0 b13     IRI   139   STY      Store Y to memory             1000101111101011 
             DMI   140   SYA      Load/store system map per A   1000001111001000 
             DMI   141   SYB      Load/store system map per B   1000101111001000 
             A/S   142   SZA      Skip if A is zero             000001--------1- 
-            A/S   143   SZB      Skip if B is zero             000011--------1- 
             BYMI  144   TBS      Test bits                     1000101111111101 
             DMI   145   UJP      Enable user map and jump      1000101111011110 
             DMI   146   UJS      Enable user map and jump to subroutine  1000101111011111 
             DMI   147   USA      Load/store user map per A     1000001111001001 
             DMI   148   USB      Load/store user map per B     1000101111001001 
0 0 b4 b15   IRI   149   XAX      Exchange A and X              1000001111100111 
0 1 b4 b15   IRI   150   XAY      Exchange A and Y              1000001111101111 
1 b4 b15     IRI   151   XBX      Exchange B and X              1000101111100111 
1 1 b4 b15   IRI   152   XBY      Exchange B and Y              1000101111101111 
             DMI   153   XCA      Cross compare A               1000001111010110 
             DMI   154   XCB      Cross compare B               1000101111010110 
             DMI   155   XLA      Cross load A                  1000001111010100 
             DMI   156   XLB      Cross load B                  1000101111010100 
             DMI   157   XMA      Transfer maps internally per A  1000101111010000 
             DMI   158   XMB      Transfer maps internally per B  1000101111010010 
             DMI   159   XMM      Transfer maps or memory       1000101111010000 
             DMI   160   XMS      Transfer maps sequentially    1000101111010001 
             MRI   161   XOR      "Exclusive OR" to A           -0100----------- 
             DMI   162   XSA      Cross store A                 1000001111010101 
             DMI   163   XSB      Cross store B                 1000101111010101 
0 2 b4 b8    IRI   164   CAX      Copy A to X                   1000001111100001 
3 b4 b8      IRI   165   CAY      Copy A to Y                   1000001111101001 
1 2 b4 b8    IRI   166   CBX      Copy B to X                   1000101111100001 
1 3 b4 b8    IRI   167   CBY      Copy B to Y                   1000101111101001 
* Base page fence register 
[APL listing of the IOIG program; not recoverable from the scan.]
Figure 5. Input/Output Interrupt Generator 
Instruction Execution 
Instruction execution starts at line 15. The effective 
address computation for MRI is performed at lines 16, 17, 18, and 19. 
Line 20 sets up the immediate value for EAR. Line 21 sets up the I/O 
flag clear/hold information for IOI. Lines 22-24 subdecode the packed 
microinstructions in A/S and S/R instructions. 
Interrupt Service 
Servicing of exceptions is given priority over I/O interrupt 
service. In case of any exception, the corresponding bit (0 for exception, 1 for I/O 
interrupt) in the interrupt holder h is set (line 27). The interrupt 
service sequence is initiated if at least one interrupt is pending 
(line 28). The sequence consists of fetching a new instruction from 
one of the five fixed locations in memory. The interrupt vector address 
of the peripheral device is obtained from the six least significant 












































































[Figure 6. Decoding matrices D for the instruction classes. Surviving panel labels: (c) IRI Instruction, (e) JMPI Instruction, (g) BIMI Instruction, (h) FPI Instruction, (j) EAMR Instruction, (k) IOI Instruction, (l) A/S Instruction. The matrix entries are not recoverable from the scan.]

The I/O interrupt generator (IOIG) system program, Figure 5, 
determines the presence of interrupt requests by peripheral devices and 
sets the bit in the interrupt holder, h, accordingly (line 1). The 
dwell at line 0 checks for interrupts on the device flags. The setting 
of any I/O device flag means an interrupt request from that I/O device. 
If a higher priority device has already gained control of the processor, 
a lower priority device cannot be served until the higher priority 
device has finished its service routine (lines 1, 2, and 3). 
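
The priority rule described above can be sketched in a few lines of Python. The text does not say how priority is assigned, so the sketch assumes that a lower select code means a higher priority; the function names and the dictionary of device flags are hypothetical.

```python
def highest_priority_request(flags):
    """Return the select code of the pending request with highest priority.

    flags maps a device select code to True when its flag is set.
    Assumption: the lowest select code has the highest priority.
    """
    pending = [code for code, is_set in flags.items() if is_set]
    return min(pending) if pending else None


def may_interrupt(requesting, in_service):
    """A request is honored only if no device of equal or higher priority is in service."""
    return in_service is None or requesting < in_service
```

For example, with devices 11 and 12 (octal) both requesting service, device 11 is selected, and device 12 has to wait until the service routine of device 11 is finished.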
Memory Access Routine 
The memory access (MAC) operation, Figure 7, fetches or stores a 
specified number of bytes from or to the memory at a given address. The 
general form of the operation is MAC^i (j; l), where i specifies the 
device requesting access; j is a two-component vector specifying the 
address in memory (j_0) and the type of operation (store: j_1 = s; 
fetch: j_1 = f), respectively; and l specifies the vector into/from which 
the accessed data are to be stored/fetched. 
The request for service is entered in the bus request vector r, 
and in the queue if it is empty (line 0). The program dwells at line 1 
until i is recognized as the first nonzero entry in the queue. After 
the request has been honored, the entry in the request vector is blanked 
out (line 2). The parity error exception is noted (line 5) and entered 
in the exception vector e. If no exception occurs, a fetch (line 4) 
or store (line 7) is performed. 
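
A minimal Python sketch of the fetch/store interface of MAC follows. It leaves out the bus request queue and the parity check, and it uses an out-of-range address as a stand-in for the exception path, which is an assumption made only for illustration. The names `mac`, `FETCH`, and `STORE` are hypothetical.

```python
FETCH, STORE = "f", "s"


def mac(memory, address, op, data=None):
    """Fetch or store one 16-bit word; return (word, exception_flag).

    Assumption: an address outside the memory list stands in for the
    exception case noted in the exception vector e.
    """
    if not 0 <= address < len(memory):
        return None, True
    if op == FETCH:
        return memory[address], False
    word = data & 0xFFFF          # keep the stored word to 16 bits
    memory[address] = word
    return word, False
```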
Address Computation Routine 
The address computation (ADC) operation, Figure 8, is used for 
effective address calculation of the operands. The general form for 
ADC is (m; g; k), where m is the mode of addressing (0 means direct, 
1 means indirect), g is the primary address, and k is the effective 
address returned by the operation. 
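
The ADC operation as described can be sketched as follows. The sketch assumes a single level of indirection and a 15-bit address field; neither detail is stated in the paragraph above, and the function name `adc` is hypothetical.

```python
def adc(mode, primary, memory):
    """Return the effective address k for addressing mode m and primary address g.

    mode 0: direct, the primary address is the effective address.
    mode 1: indirect, the word stored at the primary address supplies it.
    Assumptions: one level of indirection, 15-bit addresses.
    """
    if mode == 0:
        return primary
    return memory[primary] & 0x7FFF
```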
[APL listing of the MAC defined operation; not recoverable from the scan.]
Figure 7. Memory Access Operation 
[APL listing of the ADC (m; d; k) defined operation; not recoverable from the scan.]
Figure 8. Address Computation Operation 
Instruction Execution Routine 
At the entry point EXEC, Figure 9, the routine for an instruction 
is determined by n2 (line a0). Execution involves setting up condition 
codes (if necessary) after the execution. 
All MRI instructions are executed here. AND, IOR, XOR, ADA, ADB, 
CPA, CPB, and ISZ are entered at line a1 to get data from memory. 
STA, STB, LDA, and LDB are entered at line a2. All MRI instructions 
diverge at line a2 and enter their own routines. The "Exit" here 
means return to PROC; an outgoing arrow at the right side of a 
line also indicates a return to PROC if the arrow does not point to any 
other line. This is true not only here, but also in any other line of 




[APL listing of the EXEC routine, MRI segment (lines a0 onward); not recoverable from the scan.]
Figure 9. EXEC Routine 
[Figure 9 (continued). EXEC routine segments for the IRI (b), EAR (c), IOI (d), S/R (e), A/S (f), JMPI, EAMR, and FPI instruction classes; the APL listings are not recoverable from the scan.]

All IRI instructions are executed here. ADX, ADY, LDX, LDY, STX, 
and STY refer to memory locations whose addresses are defined 
in the word following the instruction word; thus some memory access 
and effective address computation tasks must be done (lines b0-b3) prior to the 
execution of these instructions. All the other IRI instructions do 
not require those tasks and enter the routine at line b4 to skip the 
unnecessary steps. 
All the EAR instructions are executed here. Each instruction 
enters at a different line. 
All the IOI instructions are executed here. The I/O devices are 
interfaced with the processor by these instructions; the symbols V, F, and 
Z are used here to represent the control bits, I/O flag bits, and data 
buffers of all the I/O devices. Each indexed symbol refers to a 
specific device. 
All the S/R instructions are executed here. Each S/R instruction 
consists of four microinstructions. Each microinstruction is chosen 
from its own microinstruction set. The first microinstruction set is 
the same as the fourth microinstruction set for S/R instructions. The 
instruction execution is divided into three parts. The first part 
(lines e0-e12) executes the first microinstruction, the second part 
(lines 13-14) executes the second microinstruction, and the third part 
(lines 15-17) executes the third microinstruction. The fourth 
microinstruction is executed in the first part after the previous three 
microinstructions have all been executed. Every S/R instruction must go 
through these four steps to complete the instruction execution. 
All the A/S instructions are executed here. Each A/S instruction 
consists of 8 microinstructions. Thus the instruction execution is 
divided into 8 parts, each part executing one microinstruction. Every 
A/S instruction must go through these 8 parts to complete the instruction 
execution. 
The JUMP instructions JLY and JPY are executed here. A memory 
access must be made to get the destination address of the JUMP instruc-
tion. 
All the EAMR instructions are executed here. Each of the four EAMR 
instructions requires two words of memory: one for the instruction 
code and one for the operand address. Thus at line h0 the program 
counter is incremented by 1, past the second memory word (the operand 
address), to point to the next instruction. The overflow bit is set 
when the DIV instruction is executed if the divisor is zero or too 
small. In the former case (division by zero), the division will not 
be attempted and the B- and A-register contents will be unchanged 
except that a negative quantity will be made positive. In the latter 
case (divisor too small), the execution will be attempted with 
unpredictable results left in the B- and A-registers. 
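
The two overflow conditions of DIV can be illustrated with a short Python sketch. It assumes a signed 32-bit dividend held in (B, A), a signed 16-bit divisor, and truncating division; it does not model what is left in the registers when overflow occurs. The helper names `to_signed` and `div` are hypothetical.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def div(b, a, divisor):
    """Return (quotient, remainder, overflow) for the signed pair (B, A) / divisor.

    Overflow is reported when the divisor is zero or so small that the
    quotient does not fit in 16 bits. Truncating division is an assumption.
    """
    dividend = to_signed((b << 16) | a, 32)
    d = to_signed(divisor, 16)
    if d == 0:
        return None, None, True            # division is not attempted
    quotient = abs(dividend) // abs(d)
    if (dividend < 0) != (d < 0):
        quotient = -quotient
    remainder = dividend - quotient * d
    if not -(1 << 15) <= quotient < (1 << 15):
        return None, None, True            # divisor too small
    return quotient & 0xFFFF, remainder & 0xFFFF, False
```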
All the FPI instructions are executed here. Four of the FPI 
instructions are floating point arithmetic instructions which require 
two words of memory: one for the instruction code and one for the 
operand address. Since a full 15 bits are available for the operand 
address, these instructions can directly address any location in memory. 
The execution of WMI, BIMI, BYMI, and DMI instructions is not 
included in the APL description here because they are not used in 
this paper. 
Microprogramming 
Conventional Control Section 
In a conventional computer control section, the functions performed 
by the instruction set determine the specified hardware design. The 
major advantage of this specially designed hardware is speed of instruc-
tion execution. The major disadvantage is the loss of flexibility for 
special applications or for enhancements. Any changes and additions 
to existing capabilities require changes and additions to hardware 
components. This is no problem for a conventional computer if there are 
no new machine instructions required. "The hardware has been designed 
to minimize timing for the instruction set" (6). 
However, a computer manufacturer rarely produces an instruction 
set that meets the requirements of all potential users. "Hence, the 
manufacturer must either focus his attention on one group of users or 
widen his scope and generalize the hardware design to meet the needs of 
a number of user groups. In the latter case, the user must modify his 
discipline to some extent to meet the limitations of his hardware"(6). 
Microprogrammed Control Section 
"In the microprogrammed computer, all distinct logical functions 
are separated from the sequence in which those functions are per-
formed" (6). Thus, hardware redundancy is reduced. The control store 
holds the microinstructions which define the logical functions. Each 
machine instruction in Main Memory is performed by a sequence of 
microinstructions in Control Store. This sequence of microinstructions 
is called a microprogram and is often referred to as firmware. Software can 
be executed much faster with the application of microprogramming. 
This speed is achieved by two factors: 
1. The memory access time of Control Store is less than 
that of Main Memory. 
2. The microinstruction has more flexibility than the 
normal machine instruction. 
In fact, the HP21MX Control Store, where microinstructions reside, 
cycles more than twice as fast as Main Memory, where normal machine 
instructions reside. In addition, microinstructions have the ability 
to access many internal registers and some logical functions that Main 
Memory programs do not have. 
For example, the HP21MX floating point software subroutines were 
identified as very time consuming. They were microprogrammed by 
Hewlett-Packard and made available in ROM to users. Implementation of 
floating point firmware requires no change to user programs. The 
microprogrammed floating point instructions run about 20 times faster 
than the corresponding software subroutines. 
As in the floating point microprogram, the user can study his 
software, determine the most time consuming functions performed, and 
then microprogram these functions, that is, execute them in Control 
Store using a single memory instruction instead of a sequence of Main 
Memory instructions. Any software that uses these microprogrammed 
functions will execute at a higher speed. 
The Microprogrammable Computer 
Functionally, a computer consists of four major sections: 
Control 
Main Memory 
Input and Output 
Arithmetic and Logic 
Each section executes under the direction of the control section by 
means of a microprogram. The control section reads the user's program 
stored in Main Memory and directs the appropriate hardware in each of 
the other sections. 
Control Section 
The control section fetches an instruction from a certain location 
in memory, which is specified by the Memory Register (MR), and stores it 
into the Instruction Register (IR), as shown in Figure 10. An 
appropriate microprogram is determined by the IR. Conceptually, each 
program instruction in Main Memory is a jump to a microprogrammed 
routine which resides in Control Store. 
The storage area for those microprograms is Control Store, which 
may be either a Read Only Memory (ROM) or a Writable Control Store (WCS). 
The control section that executes microprograms from ROM is referred to as 
a Control Processor. 
The Control Processor 
A microprogram in the Control Processor is in command of the 
computer at all times. A microprogram takes program instructions from 
Main Memory and stores them into the IR. The upper eight bits of the 
IR determine the microprogram address within one of the following 
groups: 
Basic instruction set 
Extended instruction group 
Floating point instruction group 
User microprogram group 
The basic instruction set microprogram can be regarded as a supervisor 
microprogram that determines when a user microprogram is called and then 
passes control to the user microprogram. 
When a microprogram has run to completion, it returns to location 
0 in Control Store (basic instruction set), returning control to the 
supervisor microprogram, after which the next instruction is fetched 
from Main Memory and stored into the IR. Successive microinstruction 
addresses are determined in the following way. The ROM Address Register 
(RAR) is incremented at the start of execution of each microinstruction. 
When a jump is executed, the RAR is loaded with the jump target address. 
When a jump to a subroutine is executed, the RAR is stored into the Save 
Register. When a return from a subroutine is executed (RTN), the 
Save Register contents are transferred into the RAR and the Save Register 
is cleared. Thus at the completion of execution of each microinstruction, 
the RAR holds the address of the next microinstruction. 
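
The address sequencing rules just listed can be collected into a small Python model. It is only a sketch of the four rules as stated (increment, jump, jump to subroutine, return); the class and method names are hypothetical, and timing details of the real control processor are not modeled.

```python
class Sequencer:
    """Toy model of the ROM Address Register (RAR) and the Save Register."""

    def __init__(self):
        self.rar = 0
        self.save = 0

    def step(self):
        """Begin a microinstruction: return its address and increment the RAR."""
        current = self.rar
        self.rar = current + 1
        return current

    def jump(self, target):
        """Jump: the RAR is loaded with the target address."""
        self.rar = target

    def jsb(self, target):
        """Jump to subroutine: the RAR is saved, then loaded with the target."""
        self.save = self.rar
        self.rar = target

    def rtn(self):
        """Return: the Save Register goes back to the RAR and is cleared."""
        self.rar = self.save
        self.save = 0
```

In this model, a return executed while the Save Register is clear leaves the RAR at 0, which matches the statement that a completed microprogram returns to location 0 of Control Store.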
The central data transfer path is the S-bus. The contents of 
all registers except the following can be directed onto the S-bus: 
L-Register, RAR, Save Register, Extend Register, and the Overflow 
Register. The following registers can receive data from the S-bus: 
M-Register, T-Register, L-Register, Counter Register, Display Register, 
Display Indicator, and Instruction Register. 
The T-bus receives data only from the Rotate/Shifter (R/S) but 
can pass data to the following registers: A-Register, B-Register, 
Scratch Pad Registers (S1 through S12), X-Register, Y-Register, 
P-Register, and S-Register (Front Panel Switch Register). 
The I/O bus serves to transfer data to and from external devices 
under program control. In the functional block diagram (Appendix A) 
all the data paths are shown by arrows. For example, the B-Register 
contents can be sent to the S-bus and hence to the M-Register. However, the 
contents of the B-Register cannot be sent to S12 (Scratch Pad 12) without 
passing through the ALU. 
Main Memory 
The M-Register is a 15-bit register which holds memory addresses 
for reading from or writing into Main Memory. Upon storing from the 
M-Register, bit 15 is clear (0). The T-Register, or transfer register, 
holds the data being transferred to or from memory. The contents of 
both of these registers are transferred to and from the S-bus. Four 
loader ROMs, selectable by Instruction Register bits 15 and 14, can 
each contain a 64-word Main Memory program which may be loaded into 
Main Memory and used to load Main Memory from a peripheral device, or to 
perform any other function desired by the user. 
Two flags are associated with memory: the A-Register Addressable 
Flag (AAF) and the B-Register Addressable Flag (BAF). These flags 
are required to allow the A- and B-Registers to be addressed as 
locations 0 and 1, respectively, of Main Memory. 
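
The effect of addressing the A- and B-Registers as locations 0 and 1 can be sketched as follows. This is an illustrative model only: the class name `Machine` and its methods are hypothetical, and the AAF and BAF flags themselves are not modeled.

```python
class Machine:
    """Memory model in which addresses 0 and 1 refer to the A- and B-Registers."""

    def __init__(self, size=32768):
        self.memory = [0] * size
        self.a = 0
        self.b = 0

    def read(self, address):
        if address == 0:
            return self.a
        if address == 1:
            return self.b
        return self.memory[address]

    def write(self, address, value):
        value &= 0xFFFF                 # 16-bit words
        if address == 0:
            self.a = value
        elif address == 1:
            self.b = value
        else:
            self.memory[address] = value
```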
Input and Output 
The Central Interrupt Register (CIR) is a 6-bit register associated 
with the I/O interrupt circuitry. It is loaded with the select code 
of the interrupting device under program control and passed to the S-bus. 
Whenever the CIR is loaded, an Interrupt Acknowledge (IAK) signal is 
issued to the I/O device. The I/O bus transfers data to and from external 
devices. Two flags are associated with I/O: the interrupt pending 
flag and the I/O skip condition met flag. The Interrupt Enable Register 
is used to disable or enable the recognition of all interrupts except 
memory protect, parity, and power failure interrupts. 
Arithmetic and Logic Section 
This section consists of the Arithmetic and Logic Unit (ALU), the 
twenty-two Rotate/Shifter (R/S) registers, and six flags. 
The ALU and R/S are the only units that execute functional 
modifications on the data. The ALU receives input from the S-bus and 
from the L-Register (Latch Register). Output from the ALU goes to the 
R/S, which places its output on the T-bus. 
Output from the ALU and R/S can be stored in one of the following 
registers via the T-bus: A-Register, B-Register, Scratch Pad Registers 
(S1 through S12), X-Register, Y-Register, P-Register, and S-Register. 
Recall that the P-Register holds the macroprogram (main memory)
address. The P-Register must be under control of the microprogram,
which must ensure that it contains the proper address after the
microprogram is complete. When the microprogram is complete, the resulting
P-Register value is the address of the next macroinstruction to be
executed. Note that the Basic Instruction Set fetch routine (at 
Control Store address 0) automatically increments the P-Register 
after the macroinstruction is fetched. Thus for one-word user macro-
instruction function codes, no further incrementing of the P-Register 
is necessary in the user microprogram. 
The S-Register is reserved for internal storage of the Front Panel
Switch Register. Note that all of those registers can also be sent
along the S-bus for storage into memory, passage to an external device,
or input to the ALU.
The Extend Register is a one-bit register used in shift operations
to link the A- and B-Registers or to indicate a "carry" arithmetic
result out of the A- or B-Registers. The Overflow Register is a one-bit
register used to indicate an arithmetic overflow from the ALU. These two
registers can also be used as flags.
Implementation of a Polynomial Algorithm
in the HP21MX Computer
The four tasks which are illustrated in Chapter I are performed in
this chapter. One of them is to program the polynomial algorithm in
HP21MX assembler language for evaluating the sine function. Another
task does the same thing but uses a microprogram instead of a program
coded in assembler language.
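The particular polynomial algorithm used for evaluating sine
functions has been determined in Chapter II and is shown as follows:
$$\sin\left(\frac{\pi}{2}x\right) = c_1 x + c_3 x^3 + c_5 x^5$$
where $c_1 = 1.5706268$
$c_3 = -0.6432292$
$c_5 = 0.0727105$
$-1 \le x \le 1$
For evaluating the sine of an angle $\theta$, $x$ is replaced by
$2\theta/\pi$ in the polynomial above; then $\sin\theta$ can be computed by
$$\sin\theta = c_1\left(\frac{2\theta}{\pi}\right) + c_3\left(\frac{2\theta}{\pi}\right)^3 + c_5\left(\frac{2\theta}{\pi}\right)^5 \qquad (4.1)$$
In order to reduce the execution time when this algorithm is implemented
in the computer, Eq. (4.1) can be factored as follows:
$$\sin\theta = \frac{2\theta}{\pi}\left[c_1 + \left(\frac{2\theta}{\pi}\right)^2\left(c_3 + c_5\left(\frac{2\theta}{\pi}\right)^2\right)\right] \qquad (4.2)$$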
Although Eq. (4.1) and Eq. (4.2) give the same result in computa-
tion, they require a different number of multiplications.
Inspection of Eq. (4.1) shows that the number of multiplications
required is 11, while the number of multiplications required by
Eq. (4.2) is 7. As mentioned in Chapter I, multiplication
is one of the most time-consuming functions. Thus Eq. (4.2)
is more efficient than Eq. (4.1) when implemented in the computer.
For the reason mentioned above, Eq. (4.2) is used for both the 
microprogram and the program coded in assembly language. The results 
of these two implementations are listed in Tables VIII and IX. The 
program listings are listed in Appendix B. 
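The saving from factoring can be sketched in a few lines of Python. This is only an illustration in floating point, not the HP21MX assembler or microcode of Appendix B; the function name `poly_sin` is hypothetical, and the nested grouping used here is one way of factoring Eq. (4.1) that may differ in detail from the grouping actually coded.

```python
import math

# Coefficients quoted in the text for sin(pi*x/2), valid for -1 <= x <= 1.
C1, C3, C5 = 1.5706268, -0.6432292, 0.0727105


def poly_sin(theta):
    """Approximate sin(theta) for -pi/2 <= theta <= pi/2 (hypothetical helper)."""
    x = 2.0 * theta / math.pi      # map the angle onto [-1, 1]
    x2 = x * x                     # the square is computed once and reused
    # Nested (factored) evaluation: x * (c1 + x^2 * (c3 + c5 * x^2))
    return x * (C1 + x2 * (C3 + C5 * x2))


if __name__ == "__main__":
    for theta in (-1.5, -0.5, 0.0, 0.5, 1.5):
        print(f"{theta:5.1f}  {poly_sin(theta):9.6f}  {math.sin(theta):9.6f}")
```

In double precision the polynomial itself stays within roughly 1e-4 of the true sine over the whole range; the tabulated results additionally reflect the 16-bit word length of the machine.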
Implementation of the CORDIC Algorithm
on the HP21MX Computer
The CORDIC algorithm has been introduced in Chapter III. To use
it for evaluation of the sine function, the value selected for n is
a function of the desired computing accuracy. Theoretically, the
larger the value of n, the more accurate the result.
In practice, it is impossible to represent a number to an arbitrary
degree of accuracy in any computer, because the accuracy is
limited by the number of bits in a word. In the HP21MX computer,
there are 16 bits in a word. When the CORDIC algorithm is used to
evaluate the sine function, the value of n not only determines the
accuracy of the result, but also affects the execution time of the
program. There is a trade-off between accuracy and execution time;
i.e., when n increases, the accuracy increases, as does the execution
time. In order to get the greatest accuracy and the least execution
time, the optimum value of n must be found. As discussed in Chapter III,
a set of ATR constants, $\alpha_i$, i = 1, ..., n, can be obtained from Eq. (4.3).




TABLE VIII
POLYNOMIAL METHOD IMPLEMENTATION RESULTS
(ASSEMBLY LANGUAGE) OF EVALUATING
THE SINE FUNCTION
Angle(Radians) Sin Execution Time(Milli-Sec)
-1.5 -0.997558 0.081 
-1.4 -0.985351 0.081 
-1.3 -0.963378 0.081 
-1.2 -0.932128 0.081 
-1.1 -0.891357 0.081 
-1.0 -0.841552 0.081 
-0.9 -0.783447 0.081 
-0.8 -0.717285 0.081 
-0.7 -0.644287 0.081 
-0.6 -0.564697 0.081 
-0.5 -0.479492 0.081 
-0.4 -0.389404 0.081 
-0.3 -0.295654 0.081 
-0.2 -0.198730 0.081 
-0.1 -0.099853 0.081 
0.0 0.0 0.081 
0.1 0.099609 0.081 
0.2 0.198486 0.081
0.3 0.295410 0.081 
0.4 0.389160 0.081 
TABLE VIII (Continued)
Angle(Radians) Sin Execution Time(Milli-Sec)
0.5 0.479248 0.049
0.6 0.564453 0.049
0.7 0.644042 0.049
0.8 0.717041 0.049
0.9 0.783203 0.049
1.0 0.841308 0.049
1.1 0.891113 0.049
1.2 0.931884 0.049 
1.3 0.963134 0.049 
1.4 0.985107 0.049 
1.5 0.997314 0.049 
TABLE IX
POLYNOMIAL METHOD IMPLEMENTATION RESULTS
(MICROPROGRAM) OF EVALUATING THE
SINE FUNCTION
Angle(Radians) Sin Execution Time(Milli-Sec)
-1.5 -0.997558 0.049
-1.4 -0.985351 0.049
-1.3 -0.963378 0.049 
-1.2 -0.932128 0.049 
-1.1 -0.891357 0.049 
-1.0 -0.841552 0.049 
-0.9 -0.783447 0.049 
-0.8 -0.717285 0.049 
-0.7 -0.644287 0.049 
-0.6 -0.564697 0.049 
-0.5 -0.479492 0.049 
-0.4 -0.389404 0.049
-0.3 -0.295654 0.049
-0.2 -0.198730 0.049
-0.1 -0.099853 0.049
0.0 0.0 0.049
0.1 0.099609 0.049 
0.2 0.198486 0.049 
0.3 0.295410 0.049 















TABLE IX (Continued) 






1.0 0.841308 0.081
1.1 0.891113 0.081
1.2 0.931884 0.081
1.3 0.963134 0.081
1.4 0.985107 0.081
1.5 0.997314 0.081
When implementing the CORDIC algorithm in the HP21MX computer,
$\alpha_i$ is divided by 180° and then represented in 16 binary digits.
For example, if $\alpha_1$ = 90°, then 90°/180° = 0.5 (decimal) = 0.40000
(octal), so 040000 (octal) is stored in the computer. If Eq. (4.3) is used
to find the ATR constants for n = 1 to n = 16, the values of $\alpha_i$
in octal are: $\alpha_1$ = 040000, $\alpha_2$ = 020000,
$\alpha_3$ = 011344, $\alpha_4$ = 004773, $\alpha_5$ = 002421,
$\alpha_6$ = 001213, $\alpha_7$ = 000505, $\alpha_8$ = 000242,
$\alpha_9$ = 000121, $\alpha_{10}$ = 000050, $\alpha_{11}$ = 000024,
$\alpha_{12}$ = 000012, $\alpha_{13}$ = 000005, $\alpha_{14}$ = 000002,
$\alpha_{15}$ = 000001, $\alpha_{16}$ = 000000.
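The octal constants can be checked with a short Python sketch. Because Eq. (4.3) is not legible in this copy, the sketch assumes the usual CORDIC choice of a first step of 90° followed by angles arctan 2^-(i-2), scales each angle to a fraction of a half revolution with 15 fraction bits, and truncates; the function name `atr_constants` and these scaling details are assumptions, chosen because they reproduce the listed values.

```python
import math


def atr_constants(n=16, frac_bits=15):
    """ATR constants as integers: floor(alpha_i / 180 deg * 2**frac_bits).

    Assumed angles: alpha_1 = 90 deg, alpha_i = arctan(2**-(i-2)) for i >= 2.
    """
    scale = 1 << frac_bits
    consts = [scale // 2]                      # 90 deg = 0.5 half revolution
    for i in range(2, n + 1):
        frac = math.atan(2.0 ** -(i - 2)) / math.pi   # fraction of 180 deg
        # Truncate to an integer; the tiny epsilon guards exact cases (45 deg).
        consts.append(int(frac * scale + 1e-9))
    return consts


if __name__ == "__main__":
    for i, c in enumerate(atr_constants(), start=1):
        print(f"alpha_{i:<2d} = {c:06o}")
```

The sixteenth constant truncates to zero, which is the observation the following paragraph uses to settle on n = 15.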
Because the ATR constant is represented with a 16-bit word in
the HP21MX computer, when n > 15 the constant is too small to
be represented. Thus the value 15 is the best choice for the value
of n. This yields the most accurate result without excessive execution
time.
Once the value of n is determined, the value of k can be found as
well. The formula to obtain the constant k is:
$$k = \sqrt{1+2^{0}}\,\sqrt{1+2^{-2}}\cdots\sqrt{1+2^{-2(n-2)}} \qquad (4.5)$$
When the constant k is computed by Eq. (4.5) with n = 15, the result is:
k = 1.646744
The original coordinate vector in the CORDIC algorithm is:
v = 1/k = 0.6072589
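As a cross-check on the size of k, the product can be evaluated directly. The sketch below assumes that Eq. (4.5) is the usual CORDIC gain, the product of sqrt(1 + 2^(-2j)) for j = 0 to n-2; the function name `cordic_gain` is hypothetical.

```python
import math


def cordic_gain(n):
    """Product of sqrt(1 + 2**(-2*j)) for j = 0 .. n-2 (assumed form of Eq. 4.5)."""
    k = 1.0
    for j in range(n - 1):
        k *= math.sqrt(1.0 + 2.0 ** (-2 * j))
    return k


if __name__ == "__main__":
    k = cordic_gain(15)
    print(f"k = {k:.6f}, 1/k = {1.0 / k:.7f}")
```

Under this assumption the double-precision product is close to 1.64676, which differs from the quoted 1.646744 by about 2e-5; the quoted figure presumably reflects the precision available when it was computed.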
One critical problem occurs immediately when the CORDIC algorithm
is implemented in the HP21MX computer. Review of the CORDIC
machine in Chapter III shows that the feature of CORDIC which most speeds
up computation is that it has three adder-subtractors which can operate
simultaneously. In the HP21MX computer, although there are two regis-
ters (A and B) which can operate like an adder-subtractor in CORDIC,
they cannot operate simultaneously. Due to this hardware limitation,
the only way to simulate these parallel adder-subtractor operations
is to execute them sequentially.
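The sequential simulation can be pictured with the following Python sketch of rotation-mode CORDIC. It is a floating-point illustration only: the programs in Appendix B work on 16-bit fixed-point words with shifts, whereas here the shifts are written as multiplications by 2^-j, and the function name `cordic_sin` and the step structure (one 90° step followed by n-1 shift-and-add steps) are assumptions based on the standard formulation.

```python
import math


def cordic_sin(theta, n=15):
    """Rotation-mode CORDIC sine for -pi <= theta <= pi (illustrative sketch)."""
    k = 1.0
    for j in range(n - 1):
        k *= math.sqrt(1.0 + 2.0 ** (-2 * j))
    x, y, z = 1.0 / k, 0.0, theta          # start vector of length 1/k
    # First step: rotate by +90 or -90 degrees (length unchanged).
    if z >= 0.0:
        x, y, z = -y, x, z - math.pi / 2
    else:
        x, y, z = y, -x, z + math.pi / 2
    # Remaining steps: the X, Y and angle updates that a CORDIC machine
    # performs in parallel are carried out one after another.
    for j in range(n - 1):
        d = 1.0 if z >= 0.0 else -1.0
        shift = 2.0 ** -j
        x_new = x - d * y * shift          # first adder-subtractor
        y_new = y + d * x * shift          # second adder-subtractor
        z = z - d * math.atan(shift)       # third adder-subtractor (angle)
        x, y = x_new, y_new
    return y


if __name__ == "__main__":
    for theta in (0.0, 0.5, 1.0, 2.0, 3.0):
        print(f"{theta:4.1f}  {cordic_sin(theta):9.6f}  {math.sin(theta):9.6f}")
```

Each pass through the loop needs the old x and y until both new values are formed, which is the storing and restoring that a single arithmetic unit has to do.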
The flowchart for the assembler program which simulates the CORDIC
algorithm in the HP21MX computer is shown in Figure 11.
An AHPL description for the microprogram which emulates the CORDIC
algorithm in the HP21MX computer is shown in Figure 12.
Both program listings are shown in Appendix B. The programming
results for these two implementations are listed in Tables X and XI.
Calculation of Execution Time 
To calculate the execution time of both the macroprogram and the
microprogram, the Time Base Generator (TBG) and the interrupt feature are
used. The TBG generates an interrupt signal at a specified time
interval; the CPU acknowledges the interrupt, suspends the current
program, and transfers control to a service subroutine
which records the number of times that the clock interrupt has occurred.
At the end of the program, the program execution time can be calculated
from the following equation:
$$T = \frac{N \times TI}{L}$$
where
T = program execution time
N = number of clock interrupts
TI = interrupt time interval of the Time Base Generator
L = number of times that the program has been executed
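The timing calculation is simple enough to state as code. In the sketch below the function name `execution_time_ms` is hypothetical, and the sample figures (a 0.1 millisecond interrupt interval and 100 repetitions) are assumptions suggested by the comments in the Appendix B listing rather than values stated in this section.

```python
def execution_time_ms(n_interrupts, interval_ms, repetitions):
    """Average execution time per run: T = N * TI / L."""
    return n_interrupts * interval_ms / repetitions


if __name__ == "__main__":
    # Assumed example: 3304 clock interrupts at 0.1 ms over 100 repetitions.
    print(f"{execution_time_ms(3304, 0.1, 100):.3f} ms")
```

Repeating the program L times is what lets a coarse clock resolve times much shorter than one interrupt interval.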
Figure 11. CORDIC Algorithm (flowchart of the assembler program)
MR +- p 
P +- Tl+.LP 
1 
MAC ( .LMR, f; T) 
X+- T 
MR +- p 
P -<- Tl+ .LP 
1 
MAC (.LMR, f, T) 
S7 +- T 
S6 +-A 
-+ (10,14)AO 
S7 +- S7 




15 X +- 8(16) 
/ 
16 E,S~ +- T(.LS6)+(.LS7) 
17 54 +- 812,13,14(16) 
18 S3 +- 8(16) 
19 SS +- X 
20 L +- y 





CTR + w /S3 
B +-X 
The AHPL Description 
for the Cordie 
Algorithm in Imple-




























(v/CTR):O,(=,+) + (29,25) 
t 
B -<- o B 
(A/CTR):1,(=,f) + (29,27) 
CTR +- T1 + .LCTR 
+ 25 
SS +- B 
8 
CTR +- w /S3 
B +- y 
(V/CTR):O,(=,f) + (37,33) 
t B +- o B 
(A/CTR):1,(=,f) + (37,35) 
CTR +- T1+.LCTR 
+ 33 
MR +- .p 
P +- T1+.LP 
1 
MAC (.LMR, f; T) 
S7 +- T 
+ (42,48)(86)0 
. 16 
E,X +- (17) (.LX)+2 -.LL 
L +- S5 
E,Y +- (17) (.LY)+(.LL) 
L +- S7 
16 E,S6 +- (17) (.LS6)+2 -.LL 
+ 53 
E,X +- (17) (.LX)+(.LL) 
Figure 12. (Continued) 
49 L -<- S5 
50 E,Y + (17)T(~Y)+2 16-~L 
51 L + S7 
52 E,S6 + (17) TUS6)+ H 
53 E, S4 -«- (17)T1+~S4 
54 -+ (57,55)(v/S4) 
55 E,S3 16 + (17)T(~S6)+2 -1 
56 -+ 22 
57 RETURN TO MACROPROGRAM
Figure 12. (Continued)
TABLE X 
CORDIC ALGORITHM IMPLEMENTATION RESULTS 
(ASSEMBLY LANGUAGE) OF EVALUATING 
THE SINE FUNCTION 
Angle(Radians) Sin Execution Time(Milli-Sec)
0.0 0.000244 3.304 
0.1 0.099975 3.297 
0.2 0.198913 3.310 
0.3 0.295410 3.313 
0.4 0.389587 3.300 
0.5 0.479431 3.313
0.6 0.564758 3.305
0.7 0.644226 3.305
0.8 0.717407 3.310
0.9 0.783325 3.304
1.0 0.841369 3.305
1.1 0.891174 3.313
1.2 0.932206 3.311
1.3 0.963562 3.306 
1.4 0.985351 3.311 
1.5 0.997558 3.318 
1.6 0.999450 3.310 
1.7 0.991760 3.318 
1.8 0.973693 3.313
1.9 0.946350 3.305
TABLE X (Continued)
Angle(Radians) Sin Execution Time(Milli-Sec)
2.0 0.909301 3.316
2.1 0.863220 3.320
2.2 0.808654 3.312 
2.3 0.745666 3.310 
2.4 0.675476 3.312 
2.5 0.598510 3.314 
2.6 0.515686 3.324 
2.7 0.427795 3.311 
2.8 0.334716 3.313 
2.9 0.239074 3.321 
3.0 0.140869 3.311 
3.1 0.041564 3.316 
3.2 -0.058654 3.304 
3.3 -0.157592 3.311 
3.4 -0.255798 3.320 
3.5 -0.350646 3.317
3.6 -0.442016 3.320
3.7 -0.530090 3.327 
3.8 -0.611999 3.311 
3.9 -0.687622 3.314 
4.0 -0.756713 3.329 
4.1 -0.818054 3.302
4.2 -0.871520 3.327
4.3 -0.916320 3.321
4.4 -0.951599 3.311
TABLE X (Continued)
Angle(Radians) Sin Execution Time(Milli-Sec)
4.5 -0.977539 3.322
4.6 -0.993774 3.314
4.7 -0.999877 3.314
4.8 -0.996032 3.310
4.9 -0.982422 3.312 
5.0 -0.958801 3.314 
5.1 -0.925901 3.311
5.2 -0.883422 3.314
5.3 -0.832153 3.308 
5.4 -0.772583 3.304 
5.5 -0.705505 3.309 
5.6 -0.631530 3.303 
5.7 -0.550659 3.309 
5.8 -0.464599 3.310 
5.9 -0.373840 3.305 
6.0 -0.279541 3.319 
6.1 -0.182312 3.314 
6.2 -0.082885 3.312 
TABLE XI
CORDIC ALGORITHM IMPLEMENTATION RESULTS
(MICROPROGRAM) OF EVALUATING THE
SINE FUNCTION
Angle(Radians) Sin Execution Time(Milli-Sec)
0.0 0.000244 0.105 
0.1 0.099975 0.126 
0.2 0.198913 0.112 
0.3 0.295410 0.108 
0.4 0.389587 0.106 
0.5 0.479431 0.104 
0.6 0.564758 0.108 
0.7 0.644226 0.107
0.8 0.717407 0.113 
0.9 0.783325 0.097 
1.0 0.841369 0.110 
1.1 0.891174 0.111 
1.2 0.932206 0.114 
1.3 0.963562 0.109 
1.4 0.985351 0.104
1.5 0.997558 0.105
1.6 0.999450 0.104 
1.7 0.991760 0.105 
1.8 0.973693 0.114 
1.9 0.946350 0.106 
TABLE XI (Continued)
Angle(Radians) Sin Execution Time(Milli-Sec)
2.0 0.909301 0.111
2.1 0.863220 0.105
2.2 0.808654 0.106
2.3 0.745666 0.105 
2.4 0.675476 0.116 
2.5 0.598510 0.111 
2.6 0.515686 0.107 
2.7 0.427795 0.116 
2.8 0.334716 0.107 
2.9 0.239074 0.114 
3.0 0.140869 0.102 
3.1 0.041564 0.103
3.2 -0.058654 0.101 
3.3 -0.157592 0.105 
3.4 -0.255798 0.106 
3.5 -0.350646 0.112 
3.6 -0.442016 0.110 
3.7 -0.530090 0.105 
3.8 -0.611999 0.109 
3.9 -0.687622 0.097 
4.0 -0.756713 0.108 
4.1 -0.818054 0.112
4.2 -0.871520 0.107 
4.3 -0.916320 0.107 
4.4 -0.951599 0.111 
TABLE XI (Continued)
Angle(Radians) Sin Execution Time(Milli-Sec)
4.5 -0.977539 0.106
4.6 -0.993774 0.107
4.7 -0.999877 0.107
4.8 -0.996032 0.107
4.9 -0.982422 0.102
5.0 -0.958801 0.102
5.1 -0.925901 0.101 
5.2 -0.883422 0.110 
5.3 -0.832153 0.108 
5.4 -0.772583 0.111
5.5 -0.705505 0.110 
5.6 -0.631530 0.115 
5.7 -0.550659 0.116 
5.8 -0.464599 0.107 
5.9 -0.373840 0.114 
6.0 -0.279541 0.111 
6.1 -0.182312 0.110 
6.2 -0.082885 0.107 
CHAPTER V 
OTHER USES OF CORDIC 
The CORDIC algorithm may also be applied to many other
mathematical problems besides the evaluation of the
sine and cosine functions. Decimal-to-binary and binary-to-decimal
conversion, arctangent function computation, Fourier transformation,
etc., can be done by the CORDIC algorithm in a way different from the
conventional methods. Arctangent function computation and decimal-
to-binary conversion are chosen in this chapter to demonstrate how
the CORDIC algorithm is applied to solve these problems.
Arctangent Algorithm 
This algorithm is obtained by reversing the sine and cosine
algorithm. In this algorithm, the value v, which equals Y/X, is known
(X and Y are the components of a vector). The vector is rotated with
respect to the positive X-axis. The angle traversed is the angle whose
tangent equals Y/X.
Functional Description 
The VECTORING mode is used in this application. To illustrate the
details of this algorithm, Figure 2 in Chapter III is referred to again.
The value of v is checked before the initialization of the X- and
Y-registers. If the value of v is greater than 1, then the Y-register
is initialized with 1 and the X-register is initialized with v;
otherwise the X-register is initialized with 1 and the Y-register is
initialized with v. The Angle Register (A-register) is always initial-
ized with 0. A sign digit of 0 in the Y-register establishes a $v_j$
of -1, which causes the top adder-subtractor to be set to subtract and
the middle and bottom adder-subtractors to add. A sign digit of 1 has
the opposite effect. The ATR constants are the same as those used in
Chapter III. The VECTORING computing sequence as described in Table II
is started. The angle whose tangent equals v is taken from the
A-register after the final computation step.
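The vectoring sequence can be sketched in Python as follows. This is a simplified floating-point illustration, not the Table II sequence itself: it always starts from X = 1 and Y = v, omits the initial 90° step and the test on v > 1, and uses the hypothetical name `cordic_atan`. The sign of Y alone decides the direction of each step.

```python
import math


def cordic_atan(v, n=15):
    """Vectoring-mode CORDIC arctangent of v (illustrative sketch)."""
    x, y, angle = 1.0, float(v), 0.0
    for j in range(n - 1):
        shift = 2.0 ** -j
        if y >= 0.0:
            # Y positive: rotate toward the X-axis, add the ATR constant.
            x, y = x + y * shift, y - x * shift
            angle += math.atan(shift)
        else:
            # Y negative: rotate the other way, subtract the ATR constant.
            x, y = x - y * shift, y + x * shift
            angle -= math.atan(shift)
    return angle


if __name__ == "__main__":
    for v in (0.5, 1.0, 3.0):
        print(f"{v:4.1f}  {cordic_atan(v):9.6f}  {math.atan(v):9.6f}")
```

No scaling by 1/k is needed here, because only the accumulated angle is read out and the growth in vector length does not affect it.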
Decimal to Binary Conversions in CORDIC
A technique is formulated for using the CORDIC arithmetic unit to
convert between angles expressed in binary fractions of a half
revolution and angles expressed in degrees and minutes in the 8421-code.
The CORDIC decimal-to-binary conversion technique may be compared
to a conventional conversion technique in which the 8421-code and
binary arithmetic are utilized. The conventional conversion technique
is based upon the 8421-code definition of the value of a decimal digit,
N, located i places to the left of the units position, as given by
$$N \times 10^i = n_4(8 \times 10^i) + n_3(4 \times 10^i) + n_2(2 \times 10^i) + n_1(1 \times 10^i) \qquad (5.1)$$
where $n_4$, $n_3$, $n_2$, and $n_1$ are equal to zero or one. The constants
$8 \times 10^i$, $4 \times 10^i$, $2 \times 10^i$, and $1 \times 10^i$, evaluated in binary for all
values of i to be used, are required in the conversion. For example,
45° in 8421-code is
45° = (0 × 8 × 10 + 1 × 4 × 10 + 0 × 2 × 10 + 0 × 1 × 10)
+ (0 × 8 + 1 × 4 + 0 × 2 + 1 × 1)
45° = (0100) (0101).
Similarly, 86° can be written as
86° = (1 × 8 × 10 + 0 × 4 × 10 + 0 × 2 × 10 + 0 × 1 × 10) + (0 × 8
+ 1 × 4 + 1 × 2 + 0 × 1)
86° = (1000) (0110).
The conversion of a negative angle is accomplished in the same way, and
the result is then complemented by subtracting the binary magnitude
from zero. For example, -86° is (0111) (1010), which is the 2's comple-
ment of 86°.
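The two worked examples can be reproduced with a few lines of Python. The helper names `to_8421` and `negate_8bit` are hypothetical and exist only for this illustration.

```python
def to_8421(value):
    """Return the 8421 (BCD) digit groups of a non-negative integer."""
    return " ".join(format(int(d), "04b") for d in str(value))


def negate_8bit(bits):
    """Two's complement of an 8-bit pattern such as '1000 0110'."""
    word = int(bits.replace(" ", ""), 2)
    neg = (-word) & 0xFF               # subtract the magnitude from zero
    s = format(neg, "08b")
    return s[:4] + " " + s[4:]


if __name__ == "__main__":
    print(to_8421(45))                  # digit groups for 45
    print(to_8421(86))                  # digit groups for 86
    print(negate_8bit(to_8421(86)))     # complemented pattern for -86
```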
The binary value of 45° as a fraction of a half revolution is shown
in Table XII.
In Table XII at each step a binary constant is either added or
not added, depending upon whether the 8421-code variable is 1 or 0,
respectively. In order to use the CORDIC principle, it is necessary
either to add or to subtract a constant. The use of addition or sub-
traction is controlled by a code variable placed in the sign digit
position of an arithmetic unit register. The problem of conversion by
adding and subtracting constants is considered first. Subsequently,
the method of properly positioning the code variables for control is
presented.
By analogy to the way in which a code variable of +1 is used
to establish the addition of a constant, a variable of -1 is used to
establish subtraction. Therefore, it is desired that a binary code with
+1 and -1 variables be used to represent decimal angles in CORDIC. For
convenience, the desired code is called a ± (plus-minus) code.
TABLE XII
THE CONVENTIONAL DECIMAL-TO-BINARY
CONVERSION
Constants (Degree) Constants (Binary Fraction of Half Revolution) 8421-Code Variable
8 × 10 .01110010 × 0
4 × 10 .00111001 × 1
2 × 10 .00011100 × 0
1 × 10 .00001011 × 0
8 × 1 .00000110 × 0
4 × 1 .00000011 × 1
2 × 1 .00000011 × 0
1 × 1 .00000001 × 1
The 8, 4, 2, 1 weights cannot be applied directly to a four-digit ±
code because all possible sums of binary-weighted ± code digits are odd.
Therefore, a transformation of the decimal digits 0, 1, ..., 9 into
a set of ten odd integers is necessary. The set of ten odd integers
-9, -7, ..., -1, +1, ..., +9 is selected.
The equation transforming a decimal digit N, having one of the
values 0, 1, ..., 9, into a digit Y having one of the values -9, -7,
..., +9 is
$$Y = 2N - 9 \qquad (5.2)$$
The equation for the inverse transformation is
$$N = \frac{Y + 9}{2} \qquad (5.3)$$
Applying the factor of 2 in (5.3) to the 8421 weights results in the ±
code equation
$$N = 4Y_4 + 2Y_3 + Y_2 + \tfrac{1}{2}Y_1 + C \qquad (5.4)$$
where $C = 9/2$ and each $Y_j$ is +1 or -1. A factor of $10^i$ may be
applied to each term in (5.4), as was done in (5.1), to account for the
position of the digit N.
The pattern of the $Y_j$ variables of the code of (5.4), with 1's used
to represent +1's and with 0's used to represent -1's, is identical to
that of the Excess-3 code.
Equation (5.4) can be applied to each digit position, and the
constant term $C \times 10^i$ for all decimal digit positions is added in binary
to the accumulated sum. As an example, 45° will be converted from ±
(excess-3) code to binary as follows:
for 45°
$C_1 = \frac{9}{2} \times 10 = 45$
$C_2 = \frac{9}{2} = 4.5$
$C = C_1 + C_2 = 49.5$ = total constant
Consequently the constant for 45° is 49.5°.
The ±1 code representation is
4 + 3 = 7 (0111)
5 + 3 = 8 (1000)
where each digit must be added to 3 for excess-3. The zero stands for
minus one and the one for plus one. Thus
45° = (-+++) (+---)
The complete conversion of 45° is shown in Table XIII,
where from equation (5.4)
45° = (-40 + 20 + 10 + 5)° + (4 - 2 - 1 - 1/2)° + 49.5°
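The relation between the excess-3 pattern and the ± code can be checked numerically. The sketch below assumes the ± code equation has weights 4, 2, 1, 1/2 and a constant of 9/2 per digit, which is the form implied by the 45° example; the names `plus_minus_digit`, `decode_digit` and `pattern` are hypothetical.

```python
WEIGHTS = (4.0, 2.0, 1.0, 0.5)   # 8421 weights halved (assumed form of Eq. 5.4)
C = 4.5                          # constant term 9/2 for one digit


def plus_minus_digit(n):
    """Four +1/-1 variables for decimal digit n, read off its excess-3 code."""
    bits = format(n + 3, "04b")              # excess-3 pattern of the digit
    return [1 if b == "1" else -1 for b in bits]


def decode_digit(signs):
    """Recover the decimal digit from its +1/-1 variables."""
    return sum(s * w for s, w in zip(signs, WEIGHTS)) + C


def pattern(value):
    """Sign pattern of a multi-digit number, one group per decimal digit."""
    groups = []
    for d in str(value):
        signs = plus_minus_digit(int(d))
        groups.append("".join("+" if s > 0 else "-" for s in signs))
    return " ".join(groups)


if __name__ == "__main__":
    print(pattern(45))
    print([decode_digit(plus_minus_digit(n)) for n in range(10)])
```

Every digit from 0 to 9 decodes back to itself, which is why the excess-3 bits can serve directly as add or subtract controls.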
Successive digits of the ± code must control successive set-
ting of the adder-subtractors in order for the proper sequence of 
additions and subtractions to occur as indicated in the previous table. 
The settings of the adder-subtractors during the conversion operation 
are established by the value of the sign digit located in the Y-register. 
In positioning the ± code digits for control, the technique of
















TABLE XIII
Fraction of Half Revolution ± Code Product
.010001100110 (correction) .010001100110
.001110001110 × -1 = -.001110001110
.000111000111 × +1 = +.000111000111
.000011100100 × +1 = +.000011100100
.000001110010 × +1 = +.000001110010
.000001011011 × +1 = +.000001011011
.000000101110 × -1 = -.000000101110
.000000010111 × -1 = -.000000010111
.000000001011 × -1 = -.000000001011
accumulated sum = 1/4 half revolution = .010000000000
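The accumulation can be replayed in integer arithmetic. The sketch below assumes 12-bit fractions of a half revolution obtained by rounding, eight code variables with weights 40, 20, 10, 5, 4, 2, 1 and 0.5 degrees, and a correction constant of 49.5°; the names `to_fixed` and `convert_45` are hypothetical.

```python
FRAC_BITS = 12


def to_fixed(degrees):
    """Round degrees/180 to a 12-bit binary fraction of a half revolution."""
    return round(degrees / 180.0 * (1 << FRAC_BITS))


def convert_45():
    """Add or subtract one constant per code variable, starting from C."""
    signs = [-1, +1, +1, +1, +1, -1, -1, -1]      # +/- code of 45 degrees
    weights = [40, 20, 10, 5, 4, 2, 1, 0.5]       # degrees per code variable
    total = to_fixed(49.5)                        # correction constant
    for s, w in zip(signs, weights):
        total += s * to_fixed(w)
    return total


if __name__ == "__main__":
    print(format(convert_45(), "012b"))
```

With these assumptions the rounding errors of the individual constants cancel and the sum is exactly one quarter of a half revolution.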
given by the sign of successive remainders. Dividing the number
representing the ± code of the angle by 1 produces the signs of succes-
sive remainders. In CORDIC this is accomplished as follows:
1) If the remainder is positive, subtract the divisor.
If the remainder is negative, add the divisor.
2) Shift the divisor one place to the right.
3) Repeat 1 and 2.
The positioning of digits of the ± code for 45° is illustrated by
following the above rules as shown in Table XIV.
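The three rules amount to a non-restoring division by 1, and can be sketched as below. The scaling is an assumption made for the illustration: the number is taken as the signed sum of the code variables with weights 1, 1/2, 1/4 and so on, and a zero remainder is treated as positive. The name `remainder_signs` is hypothetical.

```python
from fractions import Fraction


def remainder_signs(value, steps=8):
    """Signs (+1/-1) of successive remainders when dividing value by 1."""
    r = Fraction(value)
    d = Fraction(1)                    # the divisor starts at 1
    signs = []
    for _ in range(steps):
        if r >= 0:
            signs.append(+1)           # remainder positive: subtract divisor
            r -= d
        else:
            signs.append(-1)           # remainder negative: add divisor
            r += d
        d /= 2                         # shift the divisor one place right
    return signs


if __name__ == "__main__":
    # Signed sum of (-+++)(+---) with weights 1, 1/2, ..., 1/128.
    print(remainder_signs(Fraction(-15, 128)))
```

Exact fractions are used so that the sign tests are not disturbed by rounding.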
In decimal-to-binary conversion, the ± code for the desired angle is
placed in the Y-register and the divisor of 1 is placed in the X-regis-
ter. A sign digit of 0 in the Y-register establishes a $Y_j$ of -1, which
causes the top adder-subtractor, Figure 13, to subtract and the bottom
adder-subtractor to add. A sign digit of 1 has the opposite effect.
The constant C in (5.4) is initially placed in the angle register and
successive constants are introduced into the bottom adder-subtractor
as shown in Figure 13. As one step of the division is taking place to
establish the next setting of the adder-subtractors, a constant is
being added or subtracted to modify the quantity in the angle register
according to the sign digit in the Y-register at the beginning of the
step. The binary angle is taken from the bottom adder-subtractor on












TABLE XIV
GENERATION OF ± CODE FOR 45°





0011 1000 + 
1 0111 7 in excess 3 
0001 1000 + 
1 
0000 1000 + 
1 
0000 0000 + 
1 
1111 1100 







Figure 13. Implementation of ± Code to Binary Conversion.
CHAPTER VI 
SUMMARY AND CONCLUSIONS 
The results of the programming tasks discussed in the previous
chapters are shown in Tables VIII - XI.
In order to compare the accuracy of the results obtained from each
task, a set of standard sine function values is obtained. The result
of each task is compared to these standard values and the accuracy is
thus determined.
For the convenience of further description, the four tasks which
have been accomplished in Chapter IV are designated Task 1, Task 2,
Task 3 and Task 4:
Task 1 - CORDIC algorithm implemented in assembly coded program.
Task 2 - CORDIC algorithm implemented in microcode.
Task 3 - polynomial method implemented in assembly coded program.
Task 4 - polynomial method implemented in microcode.
Note that the sine values of Task 1 are identical to those of Task 2,
while the sine values of Task 3 are identical to those of Task 4. Thus,
only two sets of results are compared with the standard sine values, as
shown in Tables XV and XVI. According to these tables, both algorithms are
accurate to about three decimal digits; in other words, all the tasks
give about the same accuracy of sine values.
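The three-digit claim can be spot-checked against a library sine. The sketch below copies a handful of CORDIC results from Table X; the choice of sample rows and the name `abs_errors` are made for this illustration only.

```python
import math

# (angle in radians, CORDIC sine) pairs copied from Table X.
SAMPLES = [(0.1, 0.099975), (0.5, 0.479431), (1.0, 0.841369),
           (2.0, 0.909301), (3.0, 0.140869)]


def abs_errors(samples):
    """Absolute difference between each tabulated value and math.sin."""
    return [abs(value - math.sin(angle)) for angle, value in samples]


if __name__ == "__main__":
    for (angle, value), err in zip(SAMPLES, abs_errors(SAMPLES)):
        print(f"{angle:4.1f}  {value:9.6f}  {err:.6f}")
```

For these rows the differences stay below 0.0003, in line with the error column of Table XV.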
The execution time of each task is shown in Tables VIII - XI. By
reviewing those tables it is found that Task 1 is the most time-
TABLE XV
THE COMPARISON BETWEEN THE CORDIC ALGORITHM
IMPLEMENTATION RESULT AND THE
STANDARD SINE VALUE
Angle(Radian) Sin(Cordic) Sin(Correct) Error
0.0 0.000244 0.0 0.000244 
0.1 0.099975 0.0998334 0.0001416 
0.2 0.198913 0.198669 0.000244 
0.3 0.295410 0.29552 0.00011 
0.4 0.389587 0.389418 0.000169
0.5 0.479431 0.479425 0.000006 
0.6 0.564758 0.564642 0.000116 
0.7 0.644226 0.644218 0.000008 
0.8 0.717407 0.717356 0.000051 
0.9 0.783325 0.783327 0.000002 
1.0 0.841369 0.841471 0.000102
1.1 0.891174 0.891207 0.000033 
1.2 0.932206 0.932039 0.000167 
1.3 0.963562 0.963558 0.000004 
1.4 0.985351 0.98545 0.000099 
1.5 0.997558 0.997495 0.000063
1.6 0.999450 0.999574 0.000124
1.7 0.991760 0.991665 0.000095 
1.8 0.973693 0.973848 0.000155 
1.9 0.946350 0.9463 0.00005 
TABLE XV (Continued)
Angle(Radian) Sin(Cordic) Sin(Correct) Error
2.0 0.909301 0.909297 0.000004 
2.1 0.863220 0.863209 0.000011 
2.2 0.808654 0.808496 0.000158
2.3 0.745666 0.745705 0.000039 
2.4 0.675476 0.675463 0.000039 
2.5 0.598510 0.598472 0.000013 
2.6 0.515686 0.515502 0.000184
2.7 0.427795 0.42738 0.000415 
2.8 0.334716 0.334988 0.000272 
2.9 0.239074 0.23925 0.000176 
3.0 0.140869 0.14112 0.000251 
3.1 0.041564 0.0415808 0.0000168 
3.2 -0.058654 -0.0583743 0.0002797 
3.3 -0.157592 -0.157746 0.000154 
3.4 -0.255798 -0.255541 0.000257
3.5 -0.350646 -0.350783 0.000137
3.6 -0.442016 -0.442521 0.000505 
3.7 -0.530090 -0.529836 0.000254 
3.8 -0.611999 -0.611858 0.000141 
3.9 -0.687622 -0.687766 0.000144 
4.0 -0.756713 -0.756802 0.000089 
4.1 -0.818054 -0.818277 0.000223 
4.2 -0.871520 -0.871576 0.000056 
4.3 -0.916320 -0.916166 0.000154 
4.4 -0.951599 -0.951602 0.000003
TABLE XV (Continued)
Angle(Radian) Sin(Cordic) Sin(Correct) Error
4.5 -0.977539 -0.97753 0.000009
4.6 -0.993774 -0.993691 0.000083
4.7 -0.999877 -0.999923 0.000046 
4.8 -0.996032 -0.996165 0.000133 
4.9 -0.982422 -0.982453 0.000031
5.0 -0.958801 -0.958924 0.000123
5.1 -0.925901 -0.925815 0.000086
5.2 -0.883422 -0.883455 0.000033 
5.3 -0.832153 -0.832267 0.000114 
5.4 -0.772583 -0.772765 0.000182 
5.5 -0.705505 -0.70554 0.000035 
5.6 -0.631530 -0.631267 0.000263 
5.7 -0.550659 -0.550686 0.000027
5.8 -0.464599 -0.464602 0.000003 
5.9 -0.373840 -0.373877 0.000037 
6.0 -0.279541 -0.279416 0.000125 
6.1 -0.182312 -0.182163 0.000149 
























TABLE XVI
THE COMPARISON BETWEEN THE POLYNOMIAL
METHOD IMPLEMENTATION RESULT AND THE
STANDARD SINE VALUE













































TABLE XVI (Continued) 
Angle(Radian) Sin(Polynomial) Sin(Correct) Error
0.5 0.479248 0.479425 0.000177
0.6 0.564453 0.564642 0.000189
0.7 0.644042 0.644218 0.000176
0.8 0.717041 0.717356 0.000315
0.9 0.783203 0.783327 0.000124
1.0 0.841308 0.841471 0.000163
1.1 0.891113 0.891207 0.0000937
1.2 0.931884 0.932039 0.0001546
1.3 0.963134 0.963558 0.000424
1.4 0.985107 0.98545 0.000343
1.5 0.997314 0.997495 0.000181
consuming task; Task 2 consumes less time; Task 3 consumes still less
time; Task 4 consumes the least time of all.
Task 1 and Task 2 are the same algorithm implemented in
different ways, so the sine values will be identical but the execution
time may be different. The same applies to Task 3 and Task 4. The
programming results in Chapter IV confirm this assumption.
Task 1 is performed in an assembly coded program, while Task 2
is performed in a microprogram. According to the description of
microprogramming in Chapter IV, the execution time of Task 2 should be
less than that of Task 1. Similarly, Task 4 should have less execution
time than Task 3. The programming results in Chapter IV also confirm
this assumption.
What cannot be predicted before going to the computer
is whether Task 1 or Task 3 will have less execution time, and
whether Task 2 or Task 4 will have less execution time. However, we
expect that Task 1 is faster than Task 3 and Task 2 is faster than
Task 4. If this is true, it means we can improve the speed of evaluation
of trigonometric functions by replacing the conventional polynomial
method with the CORDIC algorithm. Surprisingly, the programming results
in Chapter IV indicate that the conventional polynomial method is faster
than the CORDIC algorithm for computing trigonometric functions.
Although this is disappointing, it is possible to determine exactly
how these results came about.
Although the CORDIC algorithm eliminates the necessity of multipli-
cation, some shifting still must be done. In the real CORDIC machine,
three registers (A, X, Y) can be shifted and added or subtracted
simultaneously. When the CORDIC algorithm is simulated in a general
purpose machine such as the HP21MX, the shifting and adding or subtracting
can only be done sequentially, because the arithmetic unit can only
handle one arithmetic operation at a time. In addition to this, the
result of shifting and adding must be stored, and then the arithmetic
unit must be released for the shifting and adding/subtracting of the
other registers. After all three registers finish their shifting and adding/
subtracting for the current cycle, the next cycle starts. The
shifting and adding/subtracting results of the first register in the
previous cycle are then restored, and so on for the second register and
the third. Therefore, when the computer is running, a lot of storing and
restoring is being performed, and this is very time-consuming. That is
why Task 1 requires more execution time than Task 3. Task 2 implements
the CORDIC algorithm in microcode, so it improves on the speed of Task 1,
but it is still slower than Task 3 and Task 4. Task 4 is in microcode, and
thus improves on the speed of Task 3. Therefore, the conclusions are:
1) The use of the CORDIC algorithm for evaluating trigonometric
functions without hardware extensions will be slower than
using conventional polynomial methods;
2) When using a conventional polynomial method for evaluating the
sine function, the microprogram will be two times faster than
the assembly coded program;
3) In order to use the CORDIC algorithm to improve the speed of
evaluation of trigonometric functions, a lot of hardware work
must be done in the current HP21MX computer.
With the surprising speed of development of the microprocessor
today, it might be very easy in the near future to construct a
microcomputer which has the features of both the general purpose
computer and the CORDIC computer.
A SELECTION BIBLIOGRAPHY
(1) Gear, C. W. Computer Organization and Programming. New York:
McGraw-Hill, 1969.
(2) Iverson, K. E. A Programming Language. New York: John
Wiley, 1962.
(3) Lyusternik, L. A., O. A. Chervonenkis, and A. R. Yanpol'skii.
Handbook for Computing Elementary Functions. New York:
Pergamon Press, 1965.
(4) Hayward, J. T. and J. P. Wong, Jr. Approximations for Digital
Computers. Princeton, New Jersey: Princeton University
Press.
(5) HP21MX Computer Series Reference Manual. Cupertino, California:
Hewlett Packard Company, 1974.
(6) HP Microprogramming 21MX Computers Operating and Reference Manual.
Cupertino, California: Hewlett Packard Company, 1974.
(7) A Pocket Guide to Hewlett-Packard Computers. Cupertino,
California: Hewlett Packard Company, 1974.
(8) A Pocket Guide to Interfacing HP Computers. Cupertino, California:
Hewlett Packard Company, 1974.
(9) Volder, J. E. "CORDIC Trigonometric Computation Technique." IRE
Transactions on Electronic Computers, EC-8, Sept., 1959,
p. 330.
(10) Daggett, D. H. "Decimal-Binary Conversions in CORDIC." IRE
Transactions on Electronic Computers, EC-8, Sept., 1959,
p. 335.
(11) Meggitt, J. E. "Pseudo-Division and Pseudo-Multiplication Pro-
cesses." IBM Journal, April, 1962.
(12) Walther, J. S. "A Unified Algorithm for Elementary Functions."
Spring Joint Computer Conference, 1971.
(13) Despain, A. M. "Fourier Transform Computers Using CORDIC
Iterations." IEEE Transactions on Computers, Vol. C-23,
No. 10, Oct., 1974, p. 993.
(14) Cochran, D. S. "Algorithms and Accuracy in the HP 35." Hewlett-
Packard Journal, June, 1972.
(15) Schmid, H. and A. Bogacki. "Use of Decimal CORDIC for Generation
of Many Transcendental Functions." Electrical Design News,
February, 1973, pp. 64-73.
(16) Richards, R. K. Arithmetic Operations in Digital Computers.
New York: Van Nostrand, 1955.
(17) Briggs, H. Logarithmicall Arithmetike. London: George Miller,
1631.
APPENDIX A 
FUNCTIONAL BLOCK DIAGRAM 
[Functional block diagram not reproduced; legible panel titles: FRONT PANEL SECTION, I/O SECTION]
* TASK 1 TEST PROGRAM---CORDIC ALGORITHM IMPLEMENTED IN ASSEMBLY *
* CODE PROGRAM *
* INPUT PARAMETER-- AN ANGLE WHICH MUST BE IN THE RANGE OF *
* (-360, 360) DEGREE *
* OUTPUT PARAMETER-- SIN VALUE OF THE INPUT ANGLE *
* *
**********************************************************************
H9•m .. 1"1.· f.:.. L .. T 
*SET UP THE CLOCK INTERRUPT VECTOR ADDRESS
ORG :L4B 




*SET UP INTERRUPT TIME PERIOD TO 0.1 MILLISECOND
O'T'F'I :1 .. 4B 
N()f" 
*SET REPETITION COUNT TO 100






















I OF<: FH 
·:>Tf't ::::1··-l 
C>O:H 




El·.n· ::::TH f<:f=t 
F:J'-.ITJ. L..I:•H PH 
*INITIATE TIME CLOCK INTERRUPT
*THE CORDIC COMPUTING SEQUENCE STARTS HERE




I .... I:>E: U'·/ 
CL .. O 
::;::::r:t 
JI''IP C:():l. . 
.:;::TF: 'r' 
f'tl:•r"'t NFE 
BJ.. L..l>:'< :=; T :·: 
::::f-n ' ... t::::o-: cor·l 
:::::n:: F:\1 
.·.=:: T B T E :~: 
L.l)E: .,., 


















f<i: I ... J)'r' '·r' 
:3Sf'l 
Ci··Jt: .. I Nf': 
i'IO'r' ::1.8 
I ... E:>< t:f? 
·:_;;:::H .. F:S::; 








*OUTPUT EXECUTION TIME, SIN AND COS VALUES OF THE INPUT ANGLE
LDB COUNT





JSB OUT
JSB OUT1




JSB OUT
*INCREASE THE INPUT ANGLE BY 0.1 THEN REPEAT THE PROGRAM
DLD ANG
FAD INC
DST ANG
*SERVICE ROUTINE FOR CLOCK INTERRUPT
JMP START
TIME NOP
Ci)lit~T 
COl 
STC 14B,C
ISZ COUNT
JMP TIME,I
OCT 0




f'/1'·./U N':C 0. 1'1 
PI DEC 3.14159
FH OCT 177600 
SN NOP 
UV OCT ~77 s 
NRE OCT 140000
RE OCT 040000 
SJ>:: OCT 1C 
p• .. • E:::s 1 




OCT -14
OCT -13
OCT -12
OCT -11
OCT -10
























X BSS 1
Y BSS 1
LCT OCT 177654
HD OCT 177654
RA NOP
TP:::: J'.J(..IF' 
J···Jc: OCT :, , , 
INC DEC 0.1
**********************************************************************
* TASK 2 TEST PROGRAM--CORDIC ALGORITHM IMPLEMENTED IN MICROCODE *
* PROGRAM *
* INPUT PARAMETER-- AN ANGLE WHICH MUST BE IN THE RANGE OF *
* (-360, 360) DEGREE *
* OUTPUT PARAMETER-- SIN VALUE OF THE INPUT ANGLE *
* *
**********************************************************************
ASMB,A,B,L,T
ORG 14B





*SET UP INTERRUPT TIME PERIOD TO 0.1 MILLISECOND
OTA 14B
NOP
*SET REPETITION COUNT TO 100








*CONVERT THE INPUT ANGLE TO THE CORDIC REPRESENTATION
FT-'..1 PI 








I ... I:OH :1B 
)I• IF' ENT 
C.:H>: 
FHf<: 




J ~=:::? ::.;N 
J I·IF' :;r 
TI'·IP [)'.1 
EJ'.JI ::;·n:t Fc'H 
[I'·,JL:L I .. C•f::l f;:f:i 
*INITIATE TIME CLOCK INTERRUPT
*THE CORDIC COMPUTING SEQUENCE STARTS HERE





*THE ENTRY POINT TO THE MICROPROGRAM WHICH PERFORMS THE CORDIC COMPUTING
*SEQUENCE
OCT :10'"':::1.60 
UV OCT 23J35 
*THE ANGLE CONSTANTS 











OCT 1211ZU .. :•.·1.::: 
CIC,f 0~)~':1<C50":; 
OCT ~Ji,":l024;;: 
ANT10 OCT 000121
ANT11 OCT 000050
ANT12 OCT 000024
ANT13 OCT 000012
ANT14 OCT 000005
ANT15 OCT 000002
ANT16 OCT 000001




*OUTPUT EXECUTION TIME, SIN AND COS VALUES OF THE INPUT ANGLE








1.. .. 1::-E: '., .. 
JSB OUT1
I..I)E: OF 
JSB OUT1
JSB OUT
*INCREASE THE INPUT ANGLE BY 0.1 THEN REPEAT THE PROGRAM
DLD ANG
FAD INC
DST ANG
*SERVICE ROUTINE FOR CLOCK INTERRUPT
JMP START




STC 14B,C
ISZ COUNT
JMP TIME,I
111:: r ';"' 
I'•EC 121. ,:;, 
DEC: l·'~J'5 .. 1 




Y BSS 1
LC OCT 177654
HD OCT 177654
RA NOP
r f-·:=:: I'·,Ji'IJ"·· 
1··11.. 01: ·1 • 
INC DEC 0.1
**********************************************************************
* MICROPROGRAM--USED TO IMPLEMENT THE CORDIC ALGORITHM *
* FOR EVALUATING THE SINE FUNCTION *
* THE ANGLE OF THE SINE FUNCTION SHOULD BE STORED IN *
* THE REGISTER A BEFORE ENTERING THE MICROPROGRAM *
**********************************************************************
$SYMTAB
$ORIGIN=1400
JMP NOP PASS NOP START
'*'OF I C1 I N:o:14.U. 
START NOP NOP PASS NOP NOP
*GET THE UNIT VECTOR FROM MAIN MEMORY
READ NOP INC PNM P
*STORE IT IN REG X
NOP NOP PASS X TAB
*GET THE FIRST ANGLE CONSTANT
READ NOP INC PNM P
*STORE IT IN REG. S7
NOP NOP PASS S7 TAB
*STORE THE ANGLE OF THE SINE FUNCTION IN REG S6
NOP NOP PASS S6 A
*IF THE ANGLE IS LESS THAN 180 DEGREE, BRANCH TO EN1
JMP CNDX AL15 RJS EN1
*GET THE TWO'S COMPLEMENT OF S7
NOP NOP CMPS S7 S7
NOP NOP INC S7 S7
*GET TWO'S COMPLEMENT OF X
NOP NOP CMPS X X
NOP NOP INC X X
*STORE X IN Y
EN1

























lt'/1"1 NOF· L.Ol·l 53 DE: 
NOP NOP PASS S5 A
NOP NOF' PASS L ~ 
JMP NOP NOP NOP BK1
*INITIALIZE THE COUNTER
BK NOP NOP PASS CNTR S3
*RIGHT SHIFT B REG BY THE NUMBER IN THE COUNTER
*THEN STORE THE SHIFTING RESULT IN S5
NOP RPT PASS B X
ARS R1 PASS B B
NOP NOP PASS S5 B







NOP NOP PASS CNTR S3
NOP RPT PASS B Y
ARS R1 PASS B B
NOP NOP PASS L B
*GET THE NEXT ANGLE CONSTANT
READ NOP INC PNM P
NOP NOP PASS S7 TAB
NOP NOP PASS S6 ~~ 
THE ANGLE, IF GREATER THAN 180 DEGREE
. .H'IF' CNr;:.; •. ; I"'LJ."'; r·JOP EI'.L:" 
NOP NOP SUB A
NOP NOP PASS L S5
NOP NOP ADD Y Y
NOP NOP PASS L S7














































GO TO EN2
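
The shift-and-add rotation loop that the microprogram above carries out can be sketched in a high-level language. The following Python fragment is only an illustration under stated assumptions, not a transcription of the listing: the names FRAC_BITS, ITERATIONS, ANGLES, START_X and cordic_sin_cos are hypothetical, the angle is held as a floating-point value in radians rather than in the 16-bit CORDIC representation, and the folding of the input into (-90, 90) degrees is one possible way to cover the (-360, 360) range.

```python
import math

FRAC_BITS = 14     # assumed fixed-point scale: 1.0 is stored as 1 << 14
ITERATIONS = 14    # assumed number of rotation steps

# Rotation angles atan(2**-i); kept as floats in radians for readability.
ANGLES = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]

# Every step stretches the vector by sqrt(1 + 2**(-2i)).
# Starting from 1/gain cancels the accumulated stretch.
GAIN = 1.0
for i in range(ITERATIONS):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))
START_X = int(round((1 << FRAC_BITS) / GAIN))


def cordic_sin_cos(angle_deg):
    """Return (sin, cos) of an angle given in (-360, 360) degrees."""
    # Fold the angle into [-90, 90]; remember whether the result is negated.
    a = math.fmod(angle_deg, 360.0)
    if a > 180.0:
        a -= 360.0
    elif a < -180.0:
        a += 360.0
    sign = 1
    if a > 90.0:
        a -= 180.0
        sign = -1
    elif a < -90.0:
        a += 180.0
        sign = -1

    x, y, z = START_X, 0, math.radians(a)
    for i in range(ITERATIONS):
        # Drive the residual angle z toward zero; x and y need only shifts and adds.
        if z >= 0:
            x, y, z = x - (y >> i), y + (x >> i), z - ANGLES[i]
        else:
            x, y, z = x + (y >> i), y - (x >> i), z + ANGLES[i]
    scale = float(1 << FRAC_BITS)
    return sign * y / scale, sign * x / scale
```

With these assumed parameters the result agrees with the library sine to roughly three decimal places; more iterations and more fraction bits would be needed for higher accuracy.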
**********************************************************************
* TASK 3 TEST PROGRAM--POLYNOMIAL METHOD IMPLEMENTED IN ASSEMBLY *
* PROGRAM *
* INPUT PARAMETER--AN ANGLE RANGED FROM -90 DEGREE TO 90 DEGREE *
* OUTPUT PARAMETER--THE SINE VALUE OF THE INPUT ANGLE *
**********************************************************************
ASMB,A,B,L,T
*SET UP THE CLOCK INTERRUPT VECTOR ADDRESS
FI'H:l 
ORG 14B


























































*SERVICE ROUTINE FOR CLOCK INTERRUPT











STC 14B,C
ISZ COUNT
JMP TIME,I
OCT 0
OCT 0
OCT 177654
OCT 177654
OCT 0
OCT 0
DEC 0. ·::.::.~::<892 
OEC ·-·~) . . 1.65968~; 
DEC 0.0076031915




* TASK 4 TEST PROGRAM--POLYNOMIAL METHOD IMPLEMENTED IN MICROPROGRAM* 
* PROGRAM * 
* INPUT PARAMETER--AN ANGLE RANGED FROM -90 DEGREE TO 90 DEGREE *
* OUTPUT PARAMETER--THE SINE VALUE OF THE INPUT ANGLE *
**********************************************************************
ASMB,A,B,L,T

























STC 14B,C




















JSB OUT
JSB OUT1
L.DE: Fit·~ C) 
JSB OUT1
*INCREASE THE INPUT ANGLE BY 0.1 THEN REPEAT THE PROGRAM
LC·~:I INC 
FIF:::;;, t:IF:::::: 












STC 14B,C
ISZ COUNT
JMP TIME,I
OCT 0
OCT 177654
OCT 177654
OCT 0




* MICROPROGRAM--USED TO EVALUATE SIN FUNCTION BY IMPLEMENTING THE *
* POLYNOMIAL METHOD *
$SYMTAB
$ORIGIN=1400
READ NOP INC PNM P 
*STORE THE VALUE X IN S2 AND S9
NOP NOP PASS S2 TAB
NOP NOP PASS S9 S2
*STORE THE VALUE C1 IN S1
READ NOP INC PNM P
NOP NOP PASS S1 TAB
*STORE THE VALUE C3 IN S3
READ NOP INC PNM P
NOP NOP PASS S3 TAB
*STORE THE VALUE C5 IN S5
READ NOP INC PNM P














*STORE X*X IN S6
NOP NOP PASS S6 B
*COMPUTE C5*X*X
NOF NOF' p,:,:;;:::: f·l 
!'-lOP /'·.lOP PH~;::_; ·::··":0 
JSE: NOP NOr' NOF:' 
NOF NOF'' Pr:ts:::: L. 
*COMPUTE C3+C5*(X*X)
NOP NOP ADD A B
NOP NOP PASS S2 S6
JSB NOP NOP NOP MPY
*ADJUST THE SCALE FACTOR
ARS L1 PASS B B
ARS L1 PASS B B
ARS L1 PASS B B
*COMPUTE C1+(X*X)*(C3+C5*(X*X))
NOP NOP PASS L S1
NOP NOP ADD A B
*COMPUTE X*(C1+X*X*(C3+C5*(X*X)))
NOP NOP PASS S2 S9
JSB NOP NOP NOP MPY
*SAVE THE RESULT IN MAIN MEMORY
NOP NOP PASS T B
WRTE NOP INC PNM P
*RETURN TO MACROPROGRAM
RETURN NOP RTN PASS NOP NOP
*SUBROUTINE FOR COMPUTING THE MULTIPLICATION OF TWO INTEGERS













~:~EF:O E: NOP 
PFI::.;:::: I ::_;;;;: 
f:·f't:::;~:: CNTF': f3 
t"llA:• F: 8 
Pt~ls~:: NOP ::.;7 
f~L:v::; RJ:::: *+:0:: 
::::uE: r::: E: 
PFIS::::: l\tOI''' ::;;:; 
Al15 PJS RETURN 
f"f'6:::: I. :~:? 
SUF.:: E: E: 
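
The nested evaluation order described by the comments of the polynomial microprogram, X*(C1+X*X*(C3+C5*(X*X))), can be sketched as follows. This Python fragment is a minimal illustration only: the function name poly_sin is hypothetical, the argument is taken in radians, and the coefficients are assumed to be the Taylor terms of the sine series, which are not the constants stored in the listings.

```python
import math

# Assumed coefficients: Taylor terms of sin(x). The thesis listings store
# their own constants, which are not reproduced here.
C1, C3, C5 = 1.0, -1.0 / 6.0, 1.0 / 120.0


def poly_sin(angle_deg):
    """Approximate the sine of an angle in [-90, 90] degrees."""
    x = math.radians(angle_deg)
    xx = x * x                              # X*X is formed once and reused
    return x * (C1 + xx * (C3 + C5 * xx))   # nested form: 3 multiplies, 2 adds


for deg in (-90, -30, 0, 45, 90):
    print(deg, poly_sin(deg), math.sin(math.radians(deg)))
```

The nested form needs one multiplication for X*X and two more for the polynomial itself, which is why the microprogram calls its MPY subroutine only a few times per evaluation.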

















~::Tc· :LLE: .. I 
.. 
... 
C:I ..• H 
- SF:::: LlE: 
Jl"'lf" :i·:-·:1 
I::::;.; 
.. H•'IP L OP 
[•LI) ~~:;~-::II,/ 
I .... I>:": :~::H~•/::J. 
Ji'IF' OCII:/.. I 
Ei-.11:-> 
*****************~~****~*********~***************~****~************ 
* CARRIAGE CONTROL ROUTINE--RETURN THE CARRIAGE TO THE BEGINNING OF *







































Peihsung Thomas Hu 
Candidate for the Degree of 
Master of Science 
Thesis: THE CORDIC ALGORITHM IMPLEMENTATION FOR TRIGONOMETRIC 
FUNCTION EVALUATION IN HP21MX 
Major Field: Computing and Information Sciences 
Biographical: 
Personal Data: Born in Taipei, Taiwan, Republic of China, 
July 2, 1950, the son of Mr. and Mrs. B. Y. Hu. 
Education: Graduated from Chengko High School, Taipei, Taiwan, 
Republic of China, in June, 1968; received Bachelor of 
Science degree in Electrical Engineering from Chiao Tung 
University, Hsinchu, Taiwan, Republic of China, in June, 1972; 
completed requirements for the Master of Science degree 
at Oklahoma State University in May, 1978. 
Professional Experience: Graduate teaching assistant, Department 
of Computing and Information Sciences, Oklahoma State 
University, 1975-1976; Software specialist, Atkins & Merril 
Training Equipment Company, 1976-present. 
