The systems organisation of incremental computers. by Bywater, R. E. H.
THE SYSTEMS ORGANISATION
OF
INCREMENTAL COMPUTERS
R E H Bywater, BSc, CEng, MIEE
A thesis submitted for the degree of Doctor of Philosophy 
in the Faculty of Engineering, Department of Electronic 
and Electrical Engineering of the University of Surrey.
April 1973
ProQuest Number: 10798307
Published by ProQuest LLC(2018). Copyright of the Dissertation is held by the Author.
S U M M A R Y
The thesis describes the design of a digital computer system which
can provide the precision of digital techniques, with the speed
of analogue processing, together with the flexibility of the hybrid
computer.
The application of Digital Differential Analyser (DDA) techniques 
is discussed and two forms of DDA are described. The first (SOIC 
I) embodies several improvements on previously applied techniques. 
These include a programmable electronic patching system, a one 
step integration algorithm, coefficient potentiometers and a method 
of implementing a suggestion of McGhee and Nilsen.
The second machine (SOIC II) uses an original form of digital
integrator and consists of 64 computing elements which are
interconnected by a programmable logic system. Each element is capable of
performing any one of four functions including integration,
variables summation and multiplication.
These machines have been designed for interconnection to a general 
purpose digital computer. The latter would be used for loading 
and monitoring the DDA, and other general 'housekeeping' activities.
The complete system should be capable of giving several orders of 
magnitude improvement in the solution of complex physical problems.
To my wife, for her support and encouragement.
A C K N O W L E D G E M E N T S
I wish to thank Professor W F Lovering for the interest and help 
he has given throughout this project, and for his guidance in the 
preparation of this thesis.
I also acknowledge, with thanks, the assistance given by 
Dr L S A Mansi, Messrs P K Warrick, P Thomasson and J Ford for
their many valuable suggestions, Miss W L Hudson and Mr T Brine 
for systems construction, Mr P Simms for the final drawings and 
Miss C Martyniuk for typing the manuscript.
Finally, I am very much indebted to Professor D R Chick for 
supporting this work and allowing me access to the Departmental 
facilities.
THE SYSTEMS ORGANISATION OF INCREMENTAL COMPUTERS
Contents Listing
CHAPTER 1 Introduction
1.1 History of Digital Differential Analysers;
Mechanical and Electronic Aspects.
1.2 Basic Process of Digital Integration and
the Solution of Differential Equations.
1.3 Application areas:-
problem solving, simulation, navigation,
numerical/process control.
1.4 Shortcomings:-
speed versus the analogue computer;
flexibility versus the digital computer;
breadth of application versus the hybrid
computer.
CHAPTER 2 Digital Integration
2.1 Theory of Digital Integration
2.2 Errors Induced Through Truncation Effects
2.3 Data Transmission Between Integrators
(a) Whole Word
(b) Incremental: Binary, Ternary
(c) Serial/Parallel
2.4 Reduction of Round-Off Errors by
Biasing Registers' Contents
2.5 Interaction of Error Forms in
Digital Integrators
2.6 Separation of Errors for Analysis:
Use of Sine/Cosine Loop as Example
2.7 Comparison of DDA Integration Errors
and Those of General Purpose Digital
and Analogue Integrators.
2.8 Systems of Digital Integrators:
Sequential Processing versus Simultaneous
Operation. Prediction and Correction Methods.
2.9 Other Integration Methods:
Single Step Methods: Adams-Bashforth
and Adams-Moulton Methods.
2.10 Discussion.
CHAPTER 3 Problem Solving and Simulation Using DDA's
3.1 Basic Systems.
3.2 Hardware Considerations.
3.2.1 Integrator form.
3.2.2 Interconnections.
3.2.3 Loading of DDA's.
3.2.4 Interconnections/loading logic
considerations.
3.2.5 Outputting/monitoring of machine
elements' variables.
3.3 Software Considerations
3.3.1 Allocation of machine elements.
3.3.2 Loading of machine elements
(IC Storage).
3.3.3 Loading of machine elements
(organization).
3.3.4 Interruptions handling.
3.3.5 Scaling.
3.4 Hybrid Computers Incorporating a DDA.
CHAPTER 4 Extension to the Hardware Method
4.1 An Approach to Combinational Processing
of Integrators.
4.2 Carry-save Methods for Selection Logic.
4.3 Single Step Implementation of the Adams
Integration Methods.
4.4 Enhancement to the Shift Register Method
of Programmed Interconnections.
4.5 Use of Read-Only Memories for Table Look-up
and Logic Synthesis in DDA's.
CHAPTER 5 The SOIC Mk I DDA
5.1 Introduction to Machine: Design Aims:
Applications.
5.2 Differences Between SOIC Mk I and
Existing Machines.
5.3 Problems to be Overcome in the Implementation.
5.3.1 Interconnections selection rate.
5.3.2 Multiplication of variables.
5.3.3 Potentiometers.
5.3.4 Monitoring of variables.
5.4 Introduction to Reference 46.
5.5 Limitations of the Machine.
5.5.1 Accuracy of computation.
5.5.2 Number of computing elements that
can be connected.
5.5.3 Efficiency of functions generation.
5.5.4 Cost of implementation through
non-centralized storage.
5.6 Observations on the Performance of SOIC Mk I.
5.6.1 Arbitrary variables integration (AVI).
5.6.2 Rate of change of increments.
CHAPTER 6 The Second Order Difference Integrator
6.1 The Second Order Method.
6.2 Initial Conditions.
6.3 Facets of the Second Order Method.
6.3.1 Simplification of interconnections.
6.3.2 Other functions that can be generated.
6.3.3 Introduction to Appendix 1.
6.4 Advantages and Disadvantages of the Second
Order Difference Method.
6.4.1 Slew rate.
6.4.2 Cost/storage.
6.4.3 Initial conditions loading.
6.4.4 Second order difference magnitudes.
6.5 Techniques with the Second Order Difference
Method.
6.5.1 Arbitrary variables integration.
6.5.2 Multiplication.
6.5.3 Summation.
6.6 Higher than Second Order Differences.
CHAPTER 7 The Requirements of a DDA/GP Hybrid Computer
(SOIC Mk II)
7.1 Specification.
7.1.1 General specification.
7.1.2 Detailed specification.
7.1.3 Performance.
7.1.4 Technology.
7.2 Maintenance Philosophy.
7.3 Aspects of Software.
7.3.1 Software with or without presence
of a DDA.
7.3.2 Scaling.
7.3.3 Allocation of machine elements.
7.3.4 Programmed interconnections.
7.3.5 Interruptions handling.
7.3.6 Repetitive mode.
7.3.7 Handling of functions not suited to
DDA execution.
CHAPTER 8 Aspects of the SOIC Mk II Design Philosophy
8.1 Storage.
8.2 Integration Algorithms.
8.3 Interconnection Topology for Machine Elements.
8.4 Processing Method.
CHAPTER 9 Hardware Realisation of SOIC Mk II Machine Elements
9.1 General.
9.2 Time Integration (TI).
9.3 Variables Summation.
9.4 Arbitrary Variables Integration (AVI).
9.5 Variables Multiplication.
9.6 Auxiliary Logic for Data Boards.
(a) Output logic.
(b) Overflow logic.
CHAPTER 10 SOIC Mk II Interconnections and Control Hardware
10.1 Interconnections (General).
10.2 Interconnections (Detailed Description)
(a) Intra-group (Δ²Y).
(b) Inter-group (Δ²Y).
10.3 Monitoring Logic.
10.4 Interruptions Logic.
10.5 Hardware Check Facility.
10.6 Power Supply.
10.7 Input-output Hardware Considerations.
CHAPTER 11 System Design Evaluation and Discussion
11.1 Project Progress.
11.2 Discussion.
11.3 Future Work.
REFERENCES
1 - 50.
APPENDICES
1. "Second Order Difference Integrator"
Preprint of Author's paper in Proc IEE, Vol 119,
No 2, February 1972, pp 138-142.
2. "Digital Integration Algorithms and their Errors"
(Review of methods and limitations - includes
some of Author's original work.)
3. "Justification for Integrator Register Truncation"
(Shows formally how second order differences
outputs may be obtained for a time integrator
without use of I[Y] or ΔZ.)
4. "Unregulated Power Supply Tolerancing and Heat
Sink Design".
5. "Word and Label Formats for SOIC Data".
ADDENDA
1. "Proposal for a New SOIC Mk II Interconnections
Topology for 64 elements".
2. "Proposal for Enhancement of SOIC Mk II Machine
Elements' Dynamic Range".
"Brevis esse laboro, obscurus fio"
("I struggle to be brief, and become obscure")
Ars Poetica
Quintus Horace 65-8 BC
C H A P T E R  I
INTRODUCTION
1.1 History of Digital Differential Analysers (DDA)
"It takes a great deal of history to produce a little literature." 
Henry James: Life of Nathaniel Hawthorne
As early as 1876, Sir William Thomson (later Lord Kelvin)(1)
recognised the power of the integrator as a computing tool. He
invented a mechanical integrator which was to be instrumental in
his solutions to certain differential equations. From these
beginnings, the analogue computer, mechanical differential analysers
and hence the digital differential analyser were developed. The
first mechanical analysers for solving general differential equations
appeared in 1930 and were developed by Vannevar Bush at MIT.(2) These
devices, because of their superior accuracy to the analogue computer(3)
(though, of course, not speed!), were constructed as late as 1952.
An account of these machines, their operation and uses is given by
Fifer.(4)
The electronic DDA first appeared in the early 1950's and the name
of Forbes(5) is prominent both in its early development and
mathematical foundations.
At the same time many other digital computing machines, which were
somewhat special purpose, were constructed. These included
devices especially built to solve wave equations and the like.
However, the more generally useful DDA has survived against the
general purpose digital computer (GPDC) better than any of these.
The DDA appeared to be foundering somewhat in the middle 1960's as
the GPDC received so much attention. This was, no doubt, in part
due to the enormous hardware and software development costs of any 
computing machinery at that time which caused the ad hoc machines to 
be ignored in favour of those with the widest possible repertoire, 
albeit at the expense of performance on specific tasks.
At the same time, embellishments were being applied to the analogue
integrator to increase its generality, eg Ref (26). Latterly (post
1967) more and more attention has been paid to special purpose
digital machines due to the ever falling costs of hardware
development. The micro-integrated circuit is largely responsible for this.
Thus machines of many types are being developed, relying on hardware
even more to relieve the software burden, where possible. Examples
of applications are legion; process control, navigation and simulation
being but a few.
This work represents a study of the present day (and some future) 
potentialities of the DDA and, in particular, the ways in which 
benefits may be gained from coupling GPDC's to DDA's with a view to
making use of the flexibility of the GPDC together with the ability 
to solve differential equations of the DDA.
1.2 Digital Integration and the Solution of DE's
A digital integrator is a device which, because of its nature, is 
limited to representing quantities (variables) in discrete amplitude 
steps and is compelled to sample these variables a finite number of 
times, in a given interval, in order to arrive at a result. The 
process has many of the characteristics of the elementary "squares 
counting" integration process that can be carried out on graph paper.
In order to do this digitally, a "cup and bucket" type of system is
required. The cup represents the current value of the input variable 
and hence the new supply of "squares" to be added into the "bucket" 
which represents the integral, ie the running total of "squares".
The process can therefore be carried out quite simply (and crudely) 
by utilising two registers and an adder. The registers, termed Y 
and Z hold the input and output variables, and the adder sums the 
current Y with the old total in Z. Fig 1.1.
[Figure: the sampled input is loaded into the input variable register Y
(MSB to LSB, with sign bit S(Y)); an adder sums Y into the output
variable register Z (the accumulator), which holds the integral.]
Figure 1.1 SIMPLE (EULER) DIGITAL INTEGRATOR
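The two-register arrangement of Fig 1.1 can be sketched in a few lines of Python. This is a minimal illustration only: the class name, the ramp input and the scaling by the square of the step count are assumptions made for the example, not details of any machine described in this thesis.

```python
class EulerIntegrator:
    """Two registers and an adder: Y holds the input, Z accumulates the integral."""

    def __init__(self):
        self.y = 0  # input variable register (the "cup")
        self.z = 0  # output variable register / accumulator (the "bucket")

    def step(self, sample):
        """Load the sampled input into Y, then add Y into the running total Z."""
        self.y = sample
        self.z += self.y
        return self.z


def integrate_ramp(steps):
    """Integrate y = x over [0, 1) by "squares counting".

    Both x and y are quantised in units of 1/steps (an assumption of this
    sketch), so each counted square has area 1/steps**2.
    """
    integrator = EulerIntegrator()
    for i in range(steps):
        integrator.step(i)
    return integrator.z / (steps * steps)
```

With 100 steps the sketch returns a value slightly below the true integral of 0.5, and the shortfall halves when the step count is doubled, which is the behaviour expected of this crude (Euler) scheme.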
Such a system has to be severely modified to be of any practical
use (see Chapter 2) but the basic tenets still hold. If a large
number of these devices is built and interconnected, a device, the
digital equivalent of an analogue computer, can be fabricated. This
would be known as a "simultaneously processed DDA" as every
integration process, like the analogue, is computed in a separate,
physically existing apparatus.
Alternatively, one basic processor (Y, adder, R) complex could be
built and the Y and R values for a collection of "integrators" could
be centrally stored and called down when required (in turn). This
would be known as a sequential processor. This latter is clearly
much slower in its operation than the simultaneous one, as a single
solution step now calls for the calling down and sequential processing
of a large amount of data in a timeshared processor.
Thus, ignoring the sequential alternative, DDA's can be built with
many of the outward physical characteristics of the analogue computer
and can be similarly 'patched' to solve a variety of problems.
1.3 Application Areas for the DDA
A DDA, whether of the simple form just described, or of the more
sophisticated variety as described in Chapters 5, 7 and 8, can be
applied to a large number of jobs. It is evident that the more potent
machines can tackle a wider range of jobs than the simpler ones, either
because of superior cost/effectiveness or simply because it would not
otherwise be possible, in a reasonable time scale, to tackle some very
large problems.
(a) Problem Solving. This heading is intended to cover the spectrum
of jobs undertaken by the analogue computer. The DDA, although
generally slower than the analogue machine, does exhibit two
distinct advantages over it: (1) precision, (2) ability to work
with arbitrary independent variables. Thus the analogue
computer and DDA have many similar facets but have individual
advantages in certain respects.
(b) Simulation. The analogue computer (and hybrid installations)
have been dominant in real time simulation because of solution
speed. However, both the raw speed and speed/accuracy products
that can be achieved with a DDA, together with the ability
(being a logical device) to be programmed easily without
patchboards, make it a potentially much more powerful tool. SOIC
Mk II, a machine described in Chapter 8, and embodying many of
the salient features of this thesis, is an example of an
electronically patchable high speed DDA. It is coupled to a
GPDC in order that its housekeeping, peripheral device control,
compilation etc, are done with the most suitable equipment.
This will result in a very powerful and flexible system.
(c) Navigation. Fixed, or semi-fixed programme DDA's are very
suitable for navigators because of the mass of high resolution
differential equations which have to be solved. By using a DDA
instead of the on-board GPDC, much of the latter's burden is
removed, so that it can get on with its main vehicle monitoring
and management tasks. Several DDA style navigators have been
built and are to be found in aircraft and missiles.(11)
(d) Numerical/process Control. The modularity of the digital
integrator, coupled with its accuracy, makes it very useful for
constructing modular controllers.(12) It can be used not only
for interpolative work for curve following etc, but also for
standard curve generation etc. (In these respects, it has an
application in vector generation for graphics.) DDA techniques
can be used to construct very inexpensive three-term controllers
which have the advantage over their analogue counterparts of
easy set-up from remote digital computers (set-point) and more
direct control of the end of band filters by the use of table
look-up, slew rate limiters, etc.
1.4 Shortcomings of DDA's
"There's something wrong with our bloody ships today."
1st Earl Beatty 1916
(a) Speed vs. the Analogue Machine. Clearly, without invoking
very high degrees of parallelism, a digital integration method
(which requires function slicing to obtain results) is
intrinsically slower than the "continuous" method of an analogue
machine. Thus, it is considered that, taking 'equivalent'
technologies, the analogue device must always be faster than
the digital one.
(b) Flexibility vs. GPDC. The DDA obtains most of its speed
advantage over the GPDC(27) by being tailored for the job and
using parallel processing methods to the full. It pays for
this in lack of computing repertoire. A DDA is specifically
built for the integration of differential equations, nothing
else. It is for these reasons that SOIC's I and II were couched
on a hybrid philosophy (GPDC/DDA).
(c) Application Breadth vs. Analogue/Digital Hybrid. In principle,
the analogue/digital hybrid(37) should be a very potent
combination. The freestanding DDA would appear to be a very poor
match for it both in terms of performance and repertoire.
However, because the hybrid consists of such radically different
machines, in terms of variables representation, etc, the
interfacing and dual running of them is far from easy and leaves
much to be desired. Instances of the unsuitability are:-
(i) Difficulties of analogue-digital conversion.
(ii) Inability of analogue to "hold" well for
the more slothful GPDC.
(iii) Need for two 'languages'.
(iv) Very different electronic systems which 
have to be mated.
(v) Very high system cost.
(vi) Inability to have hands-off (closed-shop) 
programming and processing.
(vii) Machine dependence of analogue programming.
Once again, the GPDC/DDA hybrid has much to offer against this:
no data conversion, indefinite hold ability on part of DDA,
common variables representation, identical electronic systems,
lower system cost due to reconfiguration ability of DDA,
closed shop processing possible through use of simulation
languages, eg CSSL(13), machine independent languages possible.
C H A P T E R  2
DIGITAL INTEGRATION
2.1 Theory of Digital Integration
"Errors, like straws, upon the surface flow; he who would search 
for Pearls must dive below."
John Dryden
The simplest form of digital integration is that which can be
implemented by hand by counting squares under a graph of the function
y = f(x). This illustrates two forms of error that are inherent
in the method:-
(a) The amplitude y value is rounded to the nearest representable
quantity (Y) at every sampling instant.
(b) The sampling is a finite process, ie a limited number of samples
are taken for the interval of interest, and thus assumptions
have to be made about the behaviour of the function in the
intervening periods.
The errors in Eulerian integration are dependent upon the
period between the samples and the quantisation of the variable.
If the sampling is sufficiently frequent and the quantisation
of the variable is not too coarse the method can yield
acceptable results.
[Figure: f(t) plotted against time, with the calculated integral indicated.]
Figure 2.1 FINITE TIME AND AMPLITUDE SAMPLING OF A CONTINUOUS FUNCTION
A digital integrator can be easily built to mechanise this process,
the variable y being held in a Y register and the integral in a
second register. The integration process then merely consists of
adding the updated value of Y at each sampling (or clocking) instant
into the integral (R) register. It is assumed in this simple
illustration that the input signal sampling has been carried out in
some suitable device which not only carries out the rounding process
but also outputs the digital signal in a form suitable for a digital
integrator. This might take the form of an analogue to digital
converter (ADC) if the inputted continuous function is an electrical
signal.
2.2 Errors Induced Through Truncation Effects
If integration is based on the assumption that the function follows 
a straight line path between its sampled values, a much better 
result is obtained than for the Eulerian method. Figure 2.2
illustrates this point for a reasonably well conditioned (behaved) function.
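[Figure: y plotted against the independent variable, comparing the
0 order hold with the first order (straight line) path.]
Figure 2.2 SIMPLE DIGITAL HOLD METHODS FOR FINITE SAMPLING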
It is clear that the Eulerian method (0th order) leaves out slices
of integral at each step which approximate to a triangle (see
Appendix 2). Thus a curve which is sampled very frequently, so that
the third term and higher orders of Taylor's series are
insignificant in their effects, can be far better "tracked" by a first order
method. When integration is carried out using this technique,
squares counting (or rectangles summation) is replaced by the
summation of trapezia, hence "Trapezoidal Integration".
Naturally, when this method is implemented on a calculator or
computer, there can be no a priori knowledge of the value of Y at
the next step in a process, so that the first order insertion has
to be made in retrospect after the value of Y(n+1) has been determined.
The correction, being "late", has many of the characteristics of delay
in an analogue integrator (or phase shift). However, in general,
providing this correction, albeit late, realises a considerable
improvement in the integration process (so long as the original
Eulerian integration was not already dominated by round-off error
effects).
From Taylor’s series, and the figures, it is clear that the per-step 
truncation error induced by Euler’s method is proportional to the 
step length, and that due to the trapezoidal method is proportional to 
the step length squared. Thus step halving is much more effective 
with the latter.
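The benefit of step halving can be checked numerically. The sketch below is an illustration only: the test function sin(t) on [0, pi/2], the function names and the step counts are assumptions chosen for the example. It measures the error accumulated over the whole interval, which is the quantity that falls in proportion to the step length for Euler's method and to the step length squared for the trapezoidal method (the error of any single slice is one order higher in each case).

```python
import math


def euler(f, a, b, n):
    """Rectangles summation: each slice uses the value at its left edge."""
    h = (b - a) / n
    return h * sum(f(a + i * h) for i in range(n))


def trapezoid(f, a, b, n):
    """Trapezia summation: each slice uses the mean of its two edge values."""
    h = (b - a) / n
    interior = sum(f(a + i * h) for i in range(1, n))
    return h * (0.5 * f(a) + interior + 0.5 * f(b))


def errors(n):
    """Absolute errors of both methods for the integral of sin(t) over [0, pi/2]."""
    exact = 1.0
    return (abs(euler(math.sin, 0.0, math.pi / 2, n) - exact),
            abs(trapezoid(math.sin, 0.0, math.pi / 2, n) - exact))
```

Comparing `errors(100)` with `errors(200)` shows the Euler error roughly halving and the trapezoidal error roughly quartering, in the absence of any round-off constraint.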
2.3 Data Transmission Between Integrators
"I only ask for information."
Rosa Dartle in David Copperfield (Charles Dickens)
Digital integrators must be interconnected in a variety of ways
to solve differential equations. The integral (output) is represented
in an integrator by a computer word which may, typically, be up to
24 bits. To provide such a variety of connections (a possible n^2 on
an n element machine), each of 24 bits, is clearly out of the question
by today's circuit technology, if a purely combinational selection
method is contemplated. (For a 64 element, 24 bit machine, over
110,000 gates would be required for the multiplexing alone without
regard for routing logic, fan-in, fan-out considerations.)
To alleviate the problem for DDA's having simultaneously processed
machine elements, several techniques are used:-
(a) Incremental, rather than whole word, information (integral) 
transfer methods to reduce the number of bits.
(b) Data serialization to reduce the number of wires.
The incremental method (a) works on the principle that if the
integration method used is far from perfect, then there is no point
in highly precise information transfer, ie the precision can be
reduced to the point where the round-off error produced is no worse
than comparable with the truncation error. In practice this leads
to systems ranging from the transfer of crude (1 bit) increments of
integral to relatively precise multi-bit (word) increments.
The data serialization method (b) merely transfers the problem
from the space domain (cost in hardware) to the time domain (time
to effect interconnections). This method will be summarily
dismissed on the following grounds for the purposes of project SOIC:-
(i) Serialization is unacceptable within a high speed system
which elsewhere uses parallel bit, parallel (simultaneous)
processing methods.
(ii) Large systems with widespread (topographically) shift register
systems, which are required to work in synchronism, are difficult
to control reliably at high speed due to gate delay tolerances.
Current shift registers (and gates) have, typically, a 3 to 1 spread
in circuit delays.
Three of the most commonly used incremental transfer systems are:-
(i) Binary. In this system, the only information transmitted is
±1 increment of integral. This quantum corresponds to a certain
significance of the R register (ms), and that information which
exists at a lower significance is stored in the R register and added
into the next slice of integral. This system is another source of
round-off error as '+1' has to represent all integral increments
between 0 and < 1½, and '-1' all increments between -1½ and 0. A
further disadvantage is that problems run with a positive going
independent variable followed by a negative going one exhibit a
"hysteresis" effect, ie the result does not return to its original
value. Formally this means that
$$\int_a^b y\,dx \neq -\int_b^a y\,dx$$
generally.
(ii) Ternary. This is the same as binary except that three
representations of increment are possible (+1, 0, -1) for increments
in the range (½ to 1½, -½ to +½, -½ to -1½). The per step round-off
error is thus clearly less and the "hysteresis" effect absent.
However, it is still very crude and places a limit on the
sophistication of integration algorithm that may be usefully used in
conjunction with it. The benefits gained from use of an algorithm
more complex than trapezoidal would be completely swamped by the
errors induced through round-off effects.
(iii) Word Transfer. A compromise between whole word transfer
and the ternary scheme lies in a short word representation. The
length of the word is dictated by economics and the types of
problems to be run on the machine, in particular, iterations per
solution. McGhee and Nilsen(25) have put up a reasonable case for
using word incremental transfers of up to half the number of bits
of the dependent variable. However, even this can incur a heavy
outlay for logic in quite small DDA's, say, of no more than 30
integrators. The author has devised a method, described in detail
in Chapter 6, to allow wide word transfers without incurring undue
expense. However, as it invokes a very different logical structure
for the digital integrator, it will not be discussed here.
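The binary and ternary schemes can be compared with a short sketch. It is illustrative only: the function names, the use of floating point for the R register content, and the thresholds at exactly 0 and ±½ are assumptions of the example, with one quantum of integral taken as 1.0. In both cases the part of the integral that is not transmitted is retained in R and carried into the next slice.

```python
def binary_increment(r):
    """Binary scheme: an increment of +1 or -1 is always transmitted."""
    return 1 if r >= 0 else -1


def ternary_increment(r):
    """Ternary scheme: transmit +1, 0 or -1, whichever is nearest to r."""
    if r >= 0.5:
        return 1
    if r < -0.5:
        return -1
    return 0


def transfer(slices, quantise):
    """Accumulate integral slices in R and emit one increment per step.

    Returns the list of transmitted increments and the residue left in R.
    """
    r = 0.0
    increments = []
    for s in slices:
        r += s                   # add the new slice of integral into R
        inc = quantise(r)        # increment sent to other integrators
        r -= inc                 # lower significance stays in R
        increments.append(inc)
    return increments, r
```

For a constant slice of a quarter quantum, the ternary scheme sends an occasional +1 among zeros, whereas the binary scheme must alternate between +1 and -1 to convey the same total; in both cases the transmitted increments plus the residue equal the true integral.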
2.4 A Note on Round-off Error Reduction
Consider a simple digital integrator using the ternary (Δ = +1, -1, 0)
transfer system (Figure 2.3).
[Figure: integrator 1, comprising the register Y(1) with sign bit S(Y),
fed by ΣΔY, an adder, and the residue register R(1).]
Figure 2.3 RESIDUE (R) REGISTER METHOD
The register transfer equations are:-

$$Y_{(i)} = Y_{(i-1)} + \sum \Delta Y_{(i)}$$

$$Z_{(i)} + R_{(i)} = R_{(i-1)} + Z_{(i-1)} + Y_{(i)}$$

for the Euler integration method. Z is that part of the integral
which is greater in significance than the 'magnitude' portion of Y.
If a second integrator is added (which is fed with the output of
integrator 1), then Y(2) is related to Z(1):-

$$Y(2) = [Z(1) + R(1)] \times 2^{-n}$$

where n is the number of bits in the R(1) register. This
relationship exists as a Y register always holds the input variable
magnitude to an integrator and this is just the scaled sum of all
the outputs which feed that integrator. Z can thus be eliminated
from all integrators, and (in this case) Y(2) merely updated by
carries-out of the top significance of R(1) (incremented), rather
than completely updated at each iteration.
[Figure: integrator 1 (register Y(1), adder, n bit register R(1)) and
integrator 2 (register Y(2) with sign bit S(Y), adder, n bit register
R(2)); overflows from R(1) are passed to Y(2).]
Figure 2.4 CONNECTION OF TWO INCREMENTAL TRANSFER INTEGRATORS
As Y(2) cannot exactly equal $[Z(1) + R(1)] \times 2^{-n}$ due to the
possibility of the latter assuming non-integer values, then Y(2)
(at best) can only equal the nearest integer to
$[Z(1) + R(1)] \times 2^{-n}$.
To achieve this, R(1) can be set to $2^{n-1}$, ie ½ quantum of Z(1),
so that carries-out of the top of R(1) can occur at the most
favourable time to hold Y(2) to the nearest correct integer to
$[Z(1) + R(1)] \times 2^{-n}$.
This method finds an exact analogy in the derivation of rounded
integers in normal numerical work, eg:-
find the nearest integer to 4.52.
If 0.5 is added to the mixed number and the
fraction discarded, the nearest integer remains:
4.52 + 0.5 = 5.02
Result = 5
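The effect of presetting R(1) to half a quantum can be sketched as follows. The function name and the integer model of the registers are assumptions made for illustration; the Y(1) values are taken to be smaller in magnitude than 2^n so that each carry out of R(1) is +1, 0 or -1.

```python
def run_integrator(inputs, n, bias):
    """Add successive Y(1) values into an n bit R(1) register.

    Carries out of the top of R(1) are the increments that update Y(2).
    With bias=True, R(1) starts at 2**(n-1), ie half a quantum of Z(1).
    """
    modulus = 1 << n
    r = modulus >> 1 if bias else 0
    y2 = 0
    for y1 in inputs:
        # divmod keeps r in the range 0 .. modulus-1, as an n bit register would
        carry, r = divmod(r + y1, modulus)
        y2 += carry
    return y2
```

With n = 3 and five inputs of 3, the accumulated integral is 15, ie 1.875 quanta. The unbiased register leaves Y(2) at 1, whereas the biased register gives 2, the nearest integer.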
2.5 Interaction of Error Forms in Digital Integrators
Inevitably, the two major types of error in hardware digital
integration, viz. truncation and round-off, interact. The truncation
errors affect the content of the R register, the collecting point
for integral; it is from the R register that the integral increment
is determined. The behaviour of the Y register, or more explicitly,
the input 'waveform' to the integrator, determines how these effects
interact. For example, a waveform of positive slope causes the R
register to contain less than the correct integral at any given time,
for Eulerian integration. There will thus be a tendency for
insufficient increments to be transmitted. This will tend to be
exacerbated by downward rounding of the R register content during
the interconnections phase. At other times the round-off and
truncation errors can tend to cancel, and this can be very useful
at times. Trapezoidal integrators connected in a two integrator
sine/cosine loop using the ternary transfer scheme will execute
sine waves, neither converging nor diverging, ad infinitum. This is
particularly useful for navigation or (inverse) resolution of (Polar)
Cartesian data.
2.6 Separation of Errors for Analysis
Clearly when digital integrators are to be simulated for
performance checks and behaviour analysis, when interconnected in various
ways, it is helpful to be able to separate the error forms. (As
they interact, separated effects will not sum to their combined
effects; nonetheless separation is useful.)
To separate the round-off effect is comparatively easy using a
GPDC, as the latter, because of its application area, usually has
far more bits per word than a DDA. Thus it is merely necessary to
use the maximum word length available so that round-off diminishes
into comparative insignificance. The use of double precision
variables can further aid this. Unless the simulation is highly
protracted, adding just a handful (say, 8 bits) to the word length
is sufficient to largely eliminate round-off effects. See Appendix 2
on simulation of DDA's.
To remove the truncation effects is more difficult, as more thinly
slicing the function will disturb the round-off error behaviour.
Thus, in general, the analytical solution to the problem must be
known or each step performed first of all in double precision (and
subdivided) to obtain the ostensibly correct value for the next
step. The latter is more easily achieved, in general. See Appendix 2
for the example of a sine/cosine loop.
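A sine/cosine loop simulated in double precision shows truncation effects almost free of round-off. The sketch below is an idealised illustration, not the simulation of Appendix 2: the function names are hypothetical, and the trapezoidal variant uses the textbook implicit trapezoidal rule solved in closed form, which is a substitute for (not a model of) the retrospective correction used in a hardware integrator.

```python
import math


def euler_loop(h, steps):
    """Two Euler integrators in a loop: x' = -y, y' = x, starting at (1, 0)."""
    x, y = 1.0, 0.0
    for _ in range(steps):
        x, y = x - h * y, y + h * x
    return x, y


def trapezoidal_loop(h, steps):
    """The same loop using the implicit trapezoidal rule (an assumed stand-in)."""
    a = h / 2.0
    d = 1.0 + a * a
    x, y = 1.0, 0.0
    for _ in range(steps):
        x, y = (((1.0 - a * a) * x - 2.0 * a * y) / d,
                ((1.0 - a * a) * y + 2.0 * a * x) / d)
    return x, y


def amplitude(point):
    """Radius of the (cosine, sine) point; 1.0 for an ideal loop."""
    return math.hypot(point[0], point[1])
```

With h = 0.01, the Euler loop's amplitude grows by a factor of sqrt(1 + h^2) at every step and the solution spirals outwards, while the trapezoidal loop holds its amplitude at unity to within double precision round-off.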
2.7 Comparison of DDA and GPDC Errors
DDA's and GPDC's, being both purely digital machines, attack digital
integration in much the same manner. The main differences lie in
the round-off error effects and their interaction with truncation
errors. (It is assumed here that basically similar integration
algorithms are being employed.)
The difference comes from the fact that the GPDC has a centralised
storage system, so that whole word transfers are carried out between
integrators, which are merely represented by locations in the main
store. The round-off error thus comes from the scaling of problems,
ie data flow through potentiometers. The latter are normally
multiplication routines. Thus the round-off error occurs in a
slightly different place and so has a slightly different effect
on the truncation errors. In addition, most GPDC's carry out
multiplication producing only truncated results, rarely correctly
rounded. This applies whether the computation is done in floating
point or the less frequent "fixed point" such as is associated with
CORAL 66. The greater number of bits per word in a GPDC also means
that round-off effects are generally very much smaller than with a
DDA. Furthermore, double precision working is possible with many
machines and languages, eg ICL FORTRAN and IBM FORTRAN.
2.8 Systems of Digital Integrators
Although most GPDC's act as sequential machines, ie each derivative 
is computed in turn on a given solution step in a time shared 
arithmetic unit, hardware-sequential systems are nowadays of more 
academic than practical interest. This is largely due to the low cost 
of microcircuits making simultaneous machine element processing 
economically worthwhile. In terms of integration algorithms, it is 
of interest to note that although, clearly, simultaneous machines 
require every integrator to make a one step prediction in order to 
generate the current integral slice, the sequential machine needs 
this of only one element, provided they are all processed in the
correct sequence. Figure 2.5 shows an example of this.
[Figure 2.5(a): "Analogue" computer diagram for a fifth order differential equation.]
[Figure 2.5(b): Sequential DDA diagram for Figure 2.5(a). Integrals and interconnections do not physically exist.]
If element 1 has to predict, then 2 can use this prediction and 
merely interpolate to the desired degree to determine the behaviour 
of its input variable, etc. Thus suitable processing sequences might 
be
    1, 2, 4, 5, 6, 3
    1, 2, 3, 4, 5, 6
    1, 2, 4, 3, 5, 6
subject to the constraints that 2 must precede 3 and 4, that 4 must 
precede 5 and 6, and that 5 must precede 6.
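The precedence rules above can be checked mechanically. The following is a minimal sketch, not part of the original work, which enumerates every processing order that satisfies them. The pair (1, 2) is an added assumption (element 1 is taken to be the predictive element and so is processed first); the function name is hypothetical.

```python
from itertools import permutations

# Precedence constraints quoted in the text: (a, b) means a is processed before b.
# The pair (1, 2) is an added assumption: element 1 is the predictive element
# and is taken to be processed first.
CONSTRAINTS = [(1, 2), (2, 3), (2, 4), (4, 5), (4, 6), (5, 6)]

def valid_sequences(elements, constraints):
    """Return every processing order that satisfies all precedence constraints."""
    valid = []
    for order in permutations(elements):
        position = {element: index for index, element in enumerate(order)}
        if all(position[a] < position[b] for a, b in constraints):
            valid.append(order)
    return valid

sequences = valid_sequences(range(1, 7), CONSTRAINTS)
for order in sequences:
    print(order)
```

Under these assumptions the three sequences quoted in the text all appear in the enumerated set. Exhaustive enumeration is only reasonable for a handful of elements; a topological sort would be the usual choice for larger systems.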
Because only one element is predictive, there is a tendency for 
truncation errors to be less for sequential machines. (See 
Appendix 2.) However, having different algorithms for certain 
machine element(s) is a design nuisance. If all of them use simple 
Eulerian integration, the lack of a post-iterative-corrector makes 
them all work in a similar manner. This was the method adopted in 
many early sequential machines, eg CORSAIR, VERDAN, etc.
2.9 Other Integration Methods
"I never think of the future - It comes soon enough."
Albert Einstein, 1930
More complex methods of integration are commonly used on GPDC's, 
mainly because of the very low iteration rate that is achieved 
through sequential integrator processing. The more complex 
procedures allow greater step lengths to be taken in many cases and 
thus solution times are reduced. However, the more complex 
algorithms, when implemented in hardware, necessarily incur a much
greater capital outlay, not only for the integrators but also for 
commensurately more sophisticated interconnection methods. (There 
is no point in reducing truncation errors if round-off errors 
dominate.)
Despite these limitations, more complex integration methods have 
been implemented in hardware:-
(i) Runge-Kutta Methods. (Appendix 2.) These methods are 
called single step as they require no information pertaining to past 
solution points (n-1), (n-2), etc, in order to determine the next 
solution step (n+1). In effect, the method uses the fact that a 
differential equation
    \dot{y} = f(y, k, n, \ldots)     (1)
is a hypersurface in (n) dimensions so that exploration of a sufficiency 
of this surface between points (n) and (n+1) will yield sufficient 
information to obtain a describing nth degree polynomial for the 
interval (n) to (n+1). This exploration, when implemented, is a 
closed loop procedure, as can be seen from equation 1, and obtains its 
power from this fact. Clearly, if computational loops greater in 
degree than the describing polynomial are simulated, the procedure 
is no longer closed loop and the method no longer works in the 
desired manner.
The method does have the advantage that, as no previous points on 
the solution are used, step changing without algorithmic changes is 
possible. This is very useful in ill-conditioned areas of a problem. 
Unfortunately, the algorithm does not spot the difficult areas coming. 
It also does not directly assess the truncation error, so that a
suitable step length may be chosen. A modification to the Runge- 
Kutta method by Merson (1948) does give an error estimate, which, 
although usually safely pessimistic on the desired step length, 
cannot always be guaranteed to be so.
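As an illustration of the single step property, the following is a minimal software sketch of the classical fourth order Runge-Kutta step applied to a sine/cosine loop. It is an assumption that this particular fourth order form is representative; the function names are hypothetical and the sketch says nothing about any hardware implementation.

```python
import math

def rk4_step(f, t, y, h):
    """Advance the state vector y by one classical fourth order Runge-Kutta step."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def sine_cosine_loop(t, y):
    """y = [sin, cos]; d(sin)/dt = cos, d(cos)/dt = -sin."""
    return [y[1], -y[0]]

def integrate(h, steps):
    """Integrate the loop from t = 0 with sin = 0, cos = 1."""
    t, y = 0.0, [0.0, 1.0]
    for _ in range(steps):
        y = rk4_step(sine_cosine_loop, t, y, h)
        t += h
    return t, y

# Roughly one circle at h = 0.1.
t_end, (s, c) = integrate(h=0.1, steps=63)
print(abs(s - math.sin(t_end)), abs(c - math.cos(t_end)))
```

Because `rk4_step` uses only the current point, the step length `h` may be altered from one call to the next without any other change, which is the property noted above.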
(ii) Adams-Bashforth (A-B) and Adams-Moulton (A-M) methods (28, 29)
In the 19th Century, John Couch Adams (in association with others at 
Cambridge) devised some multi-step integration methods which would 
be suitable for hand calculation in well behaved sets of differential 
equations. For his work (planetary orbit computations) they were 
ideal. The A-B methods were a family based on the use of current 
step and previous step data to determine the next step of a solution. 
In effect, an nth order polynomial is fitted to some of the last 
ordinate and slope values on the solution and assumed to be valid 
up to the next point, ie an nth order extrapolation. In his formulae, 
the current ordinate and slope are used in conjunction with (n-1) 
previous slopes.
An improvement to this scheme, which provides a post-iteration 
correction on the same step (but takes twice as long per solution 
step as a result), is the A-M method. This makes a similar 
extrapolative step during the first phase of the operation, but then 
performs a correction using an nth order polynomial fit with the 
new slope derived from the integrator(s) that input(s) to the 
integrator in question. It has been found that only one phase of 
correction is worthwhile, although any number could be applied. Of 
interest is the fact that an error estimate can be easily derived 
from the formula. Unfortunately, it cannot be simply applied, due 
to the algorithm being multi-step. Step length changing would
involve weighting factor changes for all the current and past 
slopes and ordinates. This would be much too cumbersome for a
hardware integrator. It is generally not even attempted in GPDC's.
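The predictor/corrector idea can be sketched in software as follows. This is a minimal example assuming the second order member of each family (a two-slope Adams-Bashforth predictor followed by one trapezoidal Adams-Moulton correction); the order, the starting procedure and the function name are assumptions made for illustration only.

```python
import math

def adams_predictor_corrector(f, t0, y0, h, steps):
    """Second order Adams-Bashforth predictor with one Adams-Moulton
    (trapezoidal) correction per step. Returns (t, y, last_error_estimate)."""
    # A multi-step method needs one past slope; start it with a single Heun step.
    f_prev = f(t0, y0)
    y_euler = y0 + h * f_prev
    y = y0 + h / 2 * (f_prev + f(t0 + h, y_euler))
    t = t0 + h
    estimate = 0.0
    for _ in range(steps - 1):
        f_now = f(t, y)
        # Predictor: extrapolate using the current and one previous slope.
        y_pred = y + h / 2 * (3 * f_now - f_prev)
        # Corrector: refit using the slope evaluated at the predicted point.
        y_corr = y + h / 2 * (f(t + h, y_pred) + f_now)
        # The predictor/corrector difference gives a local error estimate
        # (the factor 1/6 applies to this second order pair only).
        estimate = abs(y_corr - y_pred) / 6
        f_prev, y, t = f_now, y_corr, t + h
    return t, y, estimate

t_end, y_end, est = adams_predictor_corrector(lambda t, y: -y, 0.0, 1.0, 0.01, 100)
print(y_end, math.exp(-t_end), est)
```

The weights 3/2 and -1/2 in the predictor are valid only for equally spaced past slopes, which is why a change of step length would require the weighting factors to be recomputed.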
In summary, comparing these two methods: R-K does not readily give
an error estimate but can easily have its step length changed;
A-B and A-M can easily be programmed for error estimates but it is 
very difficult to change the step length. Because of costs, neither 
has been extensively implemented in hardware.
2.10 Discussion
A consideration of the cost effectiveness of the various methods 
described, and the systems that have evolved in the past, has led to 
the development of SOIC I (Chapter 5) which uses a relatively simple 
version of Adams method, together with a simple (4 bit) 
word transfer system between machine elements.
Experience with SOIC I led to a very different system (SOIC II) 
being conceived which attempts to use a more sophisticated algorithm, 
but without incurring the hardware penalties that suitably low 
round-off error limits would apparently impose. (Chapter 6.)
CHAPTER 3
PROBLEM SOLVING AND SIMULATION USING DDA's
"Our little systems have their day;
They have their day and cease to be."
Tennyson
3.1 Basic Systems
A basic DDA often performs the same function as an analogue computer, 
ie the solution of sets of differential equations. Therefore, it 
contains, in digital form, many of the components found in an 
analogue machine: integrators, potentiometers, multipliers, function 
generators, etc. In practice, if a digital integrator is used as 
the basic element and can integrate with respect to arbitrary variables, 
rather than only time, all the functional elements can be formed 
by suitably interconnecting a number of such elements. This con­
siderably simplifies the system architecture of the machine compared 
with the analogue, but does leave the problem of having a larger 
number of elements to connect together. For instance, a multiplier 
is a single element in an analogue machine, but is formed from a 
pair of integrators in a DDA, if a single element type is to be used.
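The formation of a multiplier from a pair of integrators rests on the identity d(uv) = u dv + v du. The following is a minimal sketch of that identity working on increments, assuming simple Euler integration and floating point values for clarity; the function name and the test signals are hypothetical.

```python
def product_by_integration(du_stream, dv_stream, u0=0.0, v0=0.0):
    """Accumulate z = u*v from increments using d(uv) = u dv + v du.

    Each of the two terms corresponds to one integrator: one holds u and
    integrates with respect to v, the other holds v and integrates with
    respect to u. Simple (Euler) integration is assumed.
    """
    u, v, z = u0, v0, u0 * v0
    for du, dv in zip(du_stream, dv_stream):
        z += u * dv + v * du   # outputs of the two integrators, summed
        u += du                # update both integrands afterwards
        v += dv
    return u, v, z

steps = 1024
increment = 1.0 / steps
u, v, z = product_by_integration([increment] * steps, [increment] * steps)
print(u, v, z)
```

The accumulated product falls short of u*v by the neglected du*dv terms, which is the truncation error of the Euler form; a trapezoidal integrator would recover them.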
The DDA can take two basic forms:
(a) The sequential machine, which has a single, time shared arithmetic 
unit for generating integrals, and a store whose locations contain 
Y, R etc associated with each "machine element". The system 
is usually very slow in operation because the processor has to 
access each element, in turn, and service it once, to carry out 
each iteration. This makes it little different from the GPDC, 
in concept. It will not be considered in any more detail in 
this chapter.
(b) The simultaneous (parallel) machine, in which each element 
physically exists with its own identity, storage for Y, R etc 
and arithmetic unit. All elements can work concurrently which 
makes the machine work at a speed which is both much greater 
than the sequential machine and independent of problem size. 
However, as the storage is decentralized (unlike the sequential 
machine), an element interconnection system must be incorporated, 
in a like manner to the analogue computer.
The solution of a problem in a simultaneous machine is very similar
in concept to that of the analogue (as shown in Figure 3.1 for a 
ternary/Euler system solving a second order DE). The signal wires 
are, in fact, pairs for the ternary transfer system, and, if the DDA 
is to be programmable, these wires must be routable from any ΔZ 
output "socket" to any (or a selection of) ΔY or ΔX "sockets". The 
freedom of routing determines the 'topology' of the machine: complete 
choice giving a complete topology, limited choice a partial topology. 
Whichever topology is chosen, several methods may be used to implement 
the interconnections: a patchboard in the manner of the analogue 
computer, logic gating using a multiplexing or crossbar technique, 
and others.
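A ternary/Euler solution of a second order DE can be imitated in software as below. This is a minimal sketch under several assumptions that are not taken from the text: a 12 bit register scaling, residues preset to half scale, unity potentiometer coefficients (so the potentiometers are omitted), and a half-scale amplitude to keep the registers in range. The class and variable names are hypothetical.

```python
import math

BITS = 12
SCALE = 1 << BITS          # register value representing 1.0 (assumed scaling)

class Integrator:
    """Ternary/Euler digital integrator with a Y (integrand) and R (residue) register."""
    def __init__(self, y):
        self.y = y
        self.r = SCALE // 2    # assumed preset: half scale, so outputs are rounded

    def step(self, dx=1):
        """Add Y*dx into R and return the overflow as a ternary increment (-1, 0, +1)."""
        self.r += self.y * dx
        if self.r >= SCALE:
            self.r -= SCALE
            return 1
        if self.r < 0:
            self.r += SCALE
            return -1
        return 0

# Second order DE  y'' = -y  as a two-integrator loop, half-scale amplitude.
cos_int = Integrator(SCALE // 2)   # Y holds 0.5*cos(t)
sin_int = Integrator(0)            # Y holds 0.5*sin(t)

for _ in range(SCALE):             # SCALE steps of dt = 1/SCALE give t = 1
    dz_cos = cos_int.step()        # increment of the integral of cos, ie d(sin)
    dz_sin = sin_int.step()        # increment of the integral of sin, ie -d(cos)
    sin_int.y += dz_cos            # both elements are updated concurrently
    cos_int.y -= dz_sin

sin_value = sin_int.y / SCALE
cos_value = cos_int.y / SCALE
print(sin_value, 0.5 * math.sin(1.0))
```

Both outputs are formed from the old Y values before either register is updated, which mirrors the concurrent working of a simultaneous machine.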
3.2 Hardware Considerations
3.2.1 Integrator Form
This choice is governed very much by application and economics. 
However, in order to provide a "feel" for costs, the component cost 
for the data-flow section of various forms of digital integrator will 
be given. Ex-works costs will be typically 4 to 5 times as great 
to cover printed circuit manufacture, connectors, skinning etc. 
(Sixteen bit variables are assumed and standard TTL technology.)
[Figure 3.1(a): Analogue solution to second order D.E.]
[Figure 3.1(b): DDA solution (overall): integrator, potentiometers, integrator.]
[Figure 3.1(c): Register structure for Figure 3.1(b). Y1 and Y4 are initial conditions; Y2 and Y3 are potentiometer coefficients; R1, R2, R3 and R4 are integral residues (set, initially, to a value that is illegible in the source).]
TYPE OF INTEGRATOR                   INTEGRATION   COST OF MICROCIRCUITS   **
                                     TIME          (IN QUANTITY)

Ternary/Euler *                      0.5 µS        ≈ £15                   2.5
Ternary/Trapezoidal *                0.6 µS        ≈ £17                   17
Ternary/Euler                        4 µS          ≈ £8                    0.3
Ternary/Trapezoidal                  4 µS          ≈ £10                   2.5
4 R-K                                5 µS          ≈ £18                   †
4 bit increment Trapezoidal *        1 µS          ≈ £35                   82

*  Parallel arithmetic.
** Approximate speed-accuracy product for sine wave. (Arbitrary scale.)
†  No figure is given for 4 R-K as a 16 bit version 
would be round-off error dominated until the step length 
rose to about 20° per circle. If round-off errors are 
ignored, the error per circle is 4 x 10 (exponent illegible in the 
source) for 4 R-K at h = 0.1. This corresponds (for sine-waves only) 
to a speed-accuracy product of 256.
The figures given do not take account of any control 
logic, nor interconnections logic where programmed 
machine element interconnection is required.
3.2.2 Interconnections (See also Appendix 2)
Although digital integrators may be hard-wired (that is, the program 
determined at manufacture in the rack-wiring), for general purpose 
problem solving and simulation work, the elements in a DDA must be 
patchable.
This may be carried out in several ways, some of which are listed
below:
(i) By patchboard, as in a conventional analogue computer.
(ii) Electronically, by the use of suitable logic forms, 
ie in the manner of a telephone exchange.
(iii) By the use of communication media other than 
electrical, such as optical transceivers.
The use of a patchboard has several disadvantages, especially
for machines such as SOIC I and II:
(a) It is not patchable by programme.
(b) It is unreliable due to the large number of friction or 
sprung contacts that are required.
(c) The contacts form an electrical discontinuity which will 
reflect high speed logic signals and make line termination 
difficult.
(d) The random disposition of the wires on the patchboard will 
give rise to cross-talk between these wires of indeterminate 
magnitude. This will tend to limit the size of patchboard 
that may be used and may force the utilization of screened 
cables with their attendant mechanical termination and 
reliability problems.
(e) Data may have to be serialized in order to reduce the number
of patch-wires.
The use of optical systems, eg gallium arsenide optical sources, 
has certain attractions (eg fast data rate and electrical immunity 
between subsystems) but introduces severe electromechanical problems 
if some sort of auto-patching is required.
For these reasons, an electronic system, directly compatible with 
the machine element logic, was sought. It was decided that the 
following requirements would have to be met for SOIC Mk I.
(i) The interconnections system should make possible the 
connection of any one machine element to any combination of machine 
elements, including itself.
(ii) The interconnection system should be programmable from some 
data source, eg peripheral device, GP computer etc.
(iii) The interconnection system should cost little compared with 
each machine element, but not consume processing time greatly in 
excess of that of the machine elements.
(iv) It should be capable of handling data in a bit parallel 
form in order that the performance criterion could be met.
The SOIC Mk I system that evolved appears schematically in 
Figure 3.2.
[Figure 3.2: SOIC I interconnection scheme. Labels: machine element (1); 4 bit output increment ΔZ(1); ΔZ(14), ΔZ(15), ΔZ(16); note time/space alignment of data; SEL; NEG; 4 bit true/complement data selector; ΣΔY accumulator.]
All machine elements are processed concurrently and produce 4 bit 
output increments which are concurrently loaded into a 4 bits wide, 
16 bit shift register having a parallel entry facility. The four 
bit increments are then broadcast, during the interconnections phase, 
to all machine elements. The occasion, during this phase, that 
any increment appears on the highway depends on the element from 
which it was output, ie its initial position in the shift 
register.
Each machine element contains two shift registers, each of 16 bits 
corresponding to the 16 machine element outputs. The pattern in 
these determines the outputs that will be selected into that 
machine element and whether it should be negated, or not. The code 
used is as follows:

SHIFT        SHIFT        ACTION TO
REGISTER 1   REGISTER 2   BE TAKEN

0            0            Ignore Increment
0            1            Ignore Increment
1            0            Accept Increment
1            1            Accept and Negate Increment

Table 3.1
These patterns form the interconnection information for each 
machine element and are thus loaded, with all other initial con­
ditions into the DDA prior to computation.
The selection/signing shift registers are made circulatory as the 
same information will be required for data selection after every 
iteration. The 4 x 16 ΔZ shift register discards its data as the 
highway is loaded, as the data is not required after the 
interconnections phase is complete.
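The selection scheme of Table 3.1 can be sketched in software as follows. This is a minimal illustration, not the SOIC I logic itself: elements are numbered from zero, four elements are used for brevity, and the function and variable names are hypothetical.

```python
def interconnections_phase(increments, select_bits, negate_bits):
    """Broadcast each output increment in turn and accumulate the accepted ones.

    increments[i]     : signed output increment of element i
    select_bits[j][i] : shift register 1 of element j (1 = accept output i)
    negate_bits[j][i] : shift register 2 of element j (1 = negate if accepted)
    Returns the accumulated input increment for every element.
    """
    count = len(increments)
    accumulators = [0] * count
    for slot in range(count):              # one highway time slot per output
        dz = increments[slot]
        for j in range(count):             # every element sees the same highway
            if select_bits[j][slot]:
                accumulators[j] += -dz if negate_bits[j][slot] else dz
            # with select = 0 the increment is ignored whatever the negate bit
    return accumulators

# Four elements for brevity (SOIC I has 16).
increments = [1, -2, 3, 0]
select = [[0, 1, 1, 0],    # element 0 accepts outputs 1 and 2
          [1, 0, 0, 0],    # element 1 accepts output 0
          [0, 0, 0, 0],    # element 2 accepts nothing
          [1, 1, 1, 1]]    # element 3 accepts everything, including itself
negate = [[0, 0, 1, 0],    # element 0 negates output 2
          [1, 0, 0, 0],    # element 1 negates output 0
          [0, 1, 0, 0],    # negate bit without select: still ignored
          [0, 0, 0, 0]]
result = interconnections_phase(increments, select, negate)
print(result)
```

The outer loop over time slots is what makes the interconnections phase grow with the number of elements, since each accepted increment is added into the accumulator as it arrives.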
The method described has several advantages:
(i) It is cheap to implement and can be accommodated in the body 
of the machine element logic without greatly increasing the latter's 
complexity or connector finger count.
(ii) Only the ΔZ store is best left on a separate logical item 
from the machine elements, as it can be built from shift register IC's.
(iii) The system can work with increments of any number of bits 
without greatly increasing its logical complexity.
In the implementation of SOIC I, some disadvantages were also noted. 
The most serious was that even with only 16 elements to interconnect, 
the interconnections phase was twice as long as that for integration 
(2 µS vs. 1 µS). Thus, even with a small element count, the 
interconnections time was dominant. The reason for this was that, in 
order to avoid the cost of storing the increments individually until 
all had arrived, they were added into an accumulator, as they were 
accepted, and time had to be provided between shift pulses for this
purpose.
The approach described above yielded a total interconnection 
topology, ie any element could be connected to any other. In order 
to speed up the interconnections phase, partial topologies could be 
considered, possibly in conjunction with more highly parallel methods 
of data selection. In the limit, interconnections could be totally
parallel, in a manner similar to a cross-bar telephone system. This 
has already been achieved, for a partial topology, in analogue 
computers (references 35, 36, 41). For DDA's, a partial time and space 
system of selection has been reported for a total topology.
3.2.3 Loading of DDA's
One advantage of the digital integrator structure, particularly 
where a simple, say ternary, interconnection system is used, is that 
very few connections are needed either between machine elements or 
to the outside world. This makes it suitable for packaging either on 
to simple, cheap cards, or, ultimately, onto integrated circuits. A 
machine element built onto a single integrated circuit would only 
require conventional 16 or 24 lead frames, which are already avail­
able. However, such a device would have to accept the limitation of 
serially loaded initial conditions (to the Y register, in particular). 
There are, however, two redeeming features:
(i) A typical run would usually consist of thousands of iterations, 
taking anything from 1 mS to several seconds. Thus, serially loading 
the data (taking about 5 microseconds) is quite acceptable. All 
the machine elements can be loaded concurrently, so that the total 
time for a whole machine need only be 5 µS.
(ii) For many applications, integrators use the same initial 
conditions for several runs, as in hill-climbing, optimisation, etc. 
Thus, use of an auxiliary initial conditions register in each 
integrator can often be justified. Being an internal register, it 
can load the integrator registers broadside (bit-parallel) in a few 
tens of nanoseconds. Only when different initial conditions are 
required need serial loading be tolerated from outside the chip.
The interconnections system, referred to in paragraph 3.2.2, and 
Appendix 2, has the advantage that the interconnections highway to
each integrator can be merged with the initial conditions highway(s).
Furthermore, the increments store (the only interconnections logic 
not internal to the integrators) can be used as a distribution point
for the initial conditions so that the system is very cheap and
compact.
3.2.4 Interconnections/loading Logic Considerations
The basic logic layout for the interconnections/loading system is 
shown in Figure 3.3.
[Figure 3.3: SOIC I loading system. Labels: machine element outputs; initial conditions; clock generator; shift register matrix; programme start/stop; initial conditions and interconnections (ΔY) to an example machine element and to other elements.]
It is clear that the interconnections system requires all machine 
elements to have access to all outputs. This is achieved by 
broadcasting these increments, via the shift register matrix, concurrently 
to all machine elements. However, the initial conditions system 
requires data to be selectively routed to a given element at any one 
time. The major problem here is a decision as to whether the data 
should be broadcast and the elements selectively clocked in order to 
selectively load the IC's or, alternatively, the clock broadcast and 
the data selectively routed. (For interconnections, both are 
broadcast.) In Chapter 5 it is shown that the author adopted the former 
for SOIC Mk I, although either is possible. The reasons for the 
author's choice lie in the fact that selecting the clock needed the 
selection and routing of only one signal per machine element (as 
a single clock is used). However, selectively routing the data would 
have required the routing of five signals. The five initial 
conditions for each integrator are loaded bit serial but word parallel.
3.2.5 Outputting/monitoring of Machine Elements' Variables
There is only one major output signal from a digital integrator 
(although a few simple additional conditioning signals are also 
required). This is Z, the integral output. As the machine works in 
increments, this is not explicitly generated. It only exists as the 
"built-up" accumulation in any integrator to which the one in question 
is connected. Thus, the easiest way to monitor the output of an 
integrator, especially where programmed interconnections are used, is 
to collect and accumulate the increments from an integrator in a 
special element. This element is effectively the same as the front 
end of a conventional integrator, but does no more than put the 
outputted variable into whole number form. The variable can then be
converted to analogue form for on-line monitoring or multiplexed 
with others and sent to a GPDC, as required (Figure 3.4).
[Figure 3.4: SOIC I elements monitoring. Labels: increment selector; ΔZ (selected); monitor accumulator; 16 wires; to analogue/digital converter (ADC); multiplexor; G.P. computer.]
The additional signals referred to earlier would be used to determine 
the status of the elements and can be carried on single wires. They 
are:-
(i) Dependent variable (Y) overflow, analogous to saturation 
in an operational amplifier.
(ii) Breakpoint crossing for function generation. The logic 
for this can take the form of one or more auxiliary registers in each 
element, updated with increments in a like manner to the 
Y register. However, each is loaded with a different 
initial condition so that its crossing of a defined 
boundary can be detected. The most obvious boundaries are 
overflow and zero, both of which are logically easy to detect. The 
author prefers zero crossing detection for the following reasons:
(i) It is easier to detect, through sign bit changing, without 
consideration of the validity of the register's content.
(ii) Having been detected, the auxiliary register contents are 
still meaningful and related to those of the working Y register.
Had the auxiliary register overflowed, it would now contain nonsense 
and have to be adjusted before again being used.
The author adopted the 'sign-changing' method for SOIC Mk II as 
described in Chapter 9.
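Zero crossing detection by an auxiliary register can be sketched as follows. This is a minimal illustration assuming that the auxiliary register is preset to the Y initial condition less the breakpoint value; that choice of offset, and the function name, are assumptions and are not taken from the SOIC II design.

```python
def first_crossing(y_initial, breakpoint, increments):
    """Return the 1-based iteration at which Y first crosses the breakpoint.

    An auxiliary register is loaded with (Y - breakpoint) and then receives
    exactly the same increments as the working Y register; a change in its
    sign bit marks the crossing. Returns None if no crossing occurs.
    """
    y = y_initial
    aux = y_initial - breakpoint          # offset initial condition
    sign = aux < 0                        # sign bit of the auxiliary register
    for iteration, dy in enumerate(increments, start=1):
        y += dy                           # working register
        aux += dy                         # auxiliary register, same increments
        if (aux < 0) != sign:
            # aux still equals y - breakpoint, so it remains meaningful
            assert aux == y - breakpoint
            return iteration
    return None

print(first_crossing(10, 4, [-1] * 20))
```

After detection the auxiliary register still holds a valid offset copy of Y, so it can continue to be used without adjustment, which is the second reason given above.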
3.3 Software Considerations
The SOIC Mk I machine was a hardware test-bed designed and built 
specifically for the purpose of confirming the feasibility of using 
DDA techniques for the efficient solution of problems involving 
sets of differential equations. As such, it required virtually no 
software other than machine code programme material and initial 
conditions appropriate to a given problem. This was supplied via 
paper tape, although any other input peripheral could have been used. 
SOIC Mk I also had no special outputting facilities other than 
addressable monitoring logic and optional digital to analogue 
convertors (DAC) for the continuous monitoring of solutions to 
problems.
However, from a study of the operations of SOIC I, the nature of 
supporting software for a hybrid DDA/GPDC could be established, ie 
for a system such as SOIC II, to be described later.
Broadly, the GPDC must be able to completely service a DDA from the 
housekeeping, loading, I/O and interruptions handling viewpoints. 
These are, in fact, the major tasks of the GPDC, which make its 
attachment to a DDA desirable.
For any DDA/GP hybrid, four of the major software problem areas are 
centred on the following GP activities:-
(i) Allocation of Variables (and their derivatives) to machine 
elements. This is only a problem because of the limited connec­
tivity between machine elements, which makes arbitrary allocation 
impossible.
(ii) Loading of Machine Elements - an organisational problem, 
the nature of which is dependent on the storage available both 
in the GPDC and the DDA and on the nature of the DDA/GPDC interface 
hardware/software.
(iii) Interruptions Handling - the first level of DDA to GPDC 
communication, which keeps the GPDC informed of the progress of a 
solution, at least at an elementary level.
(iv) Scaling - a problem very similar to that encountered in 
a GPDC/analogue hybrid's software. Fortunately it is not so severe, 
due to the higher potential dynamic range of a DDA's machine 
elements.
3.3.1 Allocation of Machine Elements
An analogue computer, with a patchboard, has, in general, an 
unlimited, total inter-element connectivity. The only limitations 
that occur are due to fan-out and the deliberate segmentation of 
machines into relatively autonomous "fields" or subsystems.
A programmed DDA, using logic as the switching/routing medium for 
data, may not be able to enjoy such flexibility, either because of 
the cost of providing a sufficiency of data-routes or because
of the degradation to the machine performance by removing this 
activity from the space to the time domain.
The GPDC can be used in two ways for DDA element allocation:
(i) merely as an interpreter for converting manually prepared 
allocation data, written in some user orientated format, to DDA 
machine code, ie unintelligent housekeeping, or by
(ii) taking the problem, say in a form which identifies machine 
elements and the interconnection matrix, and, by some algorithmic 
means, allocating the GPDC machine's elements to positions in the DDA. 
The allocation must form a matrix which is a subset of the (known), 
incomplete, DDA interconnection matrix.
The DDA may be considered, in the case of SOIC II, and others, as a 
matrix of elements with more than one level of interconnections.
SOIC II can be considered as in Figure 3.5.
[Figure 3.5: A partial interconnections method. Labels: Group 1, Group 2, ... Group X; intra-group connections; inter-group busbars.]
Each group has a certain number of elements (say 8) and the 
intra-connectivity is unlimited within the group (total topology).
The machine has a limited interconnections matrix for the groups, 
using selected members of each group as "donor" and "data recipient"
elements for this matrix. The purpose of the algorithm, or software 
package, is thus to allocate elements in such a way that those of 
high interconnectivity gravitate to the same group and those of low 
connectivity, for which the sparse inter-group matrix is sufficient, 
go to different groups.
It has been found that connectivity can be handled, for the purposes
of grouping, in much the same way as Steinberg (23) groups components
on a plug-card in the placement problem. This appears to give good
results when carried out manually on selected simulation problems.
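The subset condition on an allocation can be illustrated with a short feasibility check. This is a minimal sketch only: it is not the grouping procedure referred to above, it ignores any limit on the number of "donor" and "data recipient" elements per group, and the problem data and function names are hypothetical.

```python
def inter_group_links(connections, group_of):
    """Return the set of (source group, destination group) pairs that an
    allocation needs; connections inside one group need no inter-group route."""
    needed = set()
    for source, destination in connections:
        g_src, g_dst = group_of[source], group_of[destination]
        if g_src != g_dst:
            needed.add((g_src, g_dst))
    return needed

def allocation_is_feasible(connections, group_of, available_links):
    """True if every inter-group link needed is present in the machine."""
    return inter_group_links(connections, group_of) <= available_links

# Hypothetical problem: six variables, connections given as (from, to).
connections = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"),
               ("d", "e"), ("e", "f"), ("f", "d")]
available = {(0, 1)}                       # only group 0 -> group 1 is wired

good = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
poor = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0, "f": 1}
print(allocation_is_feasible(connections, good, available))
print(allocation_is_feasible(connections, poor, available))
```

In the first allocation the two tightly coupled loops each sit inside one group, so only a single inter-group route is needed; the second scatters them and demands routes the machine does not have.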
3.3.2 Loading of Machine Elements (IC Storage)
Once the allocation of variables to machine elements has taken place, 
it is, from a software viewpoint, comparatively simple to suitably 
route the initial conditions data.
However, the exact method of routing is dependent on storage facilities
in the total system. Either
(i) no initial conditions (IC) storage is provided other than 
that in the machine element, ie the IC's are routed directly from
the initial programme data source to the DDA, or
(ii) the GPDC assembles the IC's in matrix form in core so that this 
matrix can be conveniently updated, as necessary, and then dumped to 
the DDA, or
(iii) the DDA has mass data storage in addition to that found in 
machine elements.
System (iii) has the advantage of not constantly hogging I/O every 
time a fresh, possibly similar, DDA run is to be initiated. The DDA
store can be updated on an exceptions only basis. However, the GPDC 
needs to retain certain information in order to know what to update 
in the DDA store. Provision of a DDA store is expensive when considered 
in the light of its very limited usefulness for non-repetitive 
computing.
System (ii) can use core which can be used for other purposes, so it 
is relatively inexpensive. However, I/O is heavily loaded whilst a 
new set of IC's is being loaded to the DDA.
3.3.3 Loading of Machine Elements (Organisation)
When a digital integrator has been used in a program, there is no 
record of the initial conditions that were loaded into it. Thus, to 
repeat a run, it must be reloaded. Initial conditions can be centrally 
stored (say, in a DDA) and reloaded into integrators each time a re-run 
is required. The storage can either be in the DDA or in a GP computer 
store.
If storage is within the DDA, only changes of initial conditions need 
be sent by the GPDC. However, each piece of information must have a 
DDA store location appended. If storage is within the GPDC, the 
information will be sent, en bloc, thus no addressing labels will be 
necessary. The order of transmission will determine locations. The 
software suite depends on the method of storage used.
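The two transmission formats can be contrasted in a short sketch. The following is an illustration only, with hypothetical values and function names; word counts assume one word per value and one word per address label.

```python
def load_en_bloc(dda_store, block):
    """GPDC-held storage: the whole IC set is sent in order, so the position
    of each word in the transmission determines its location."""
    dda_store[:] = block

def load_exceptions(dda_store, updates):
    """DDA-held storage: only changed ICs are sent, each with its store
    location appended as an address label."""
    for location, value in updates:
        dda_store[location] = value

gpdc_matrix = [100, 200, 300, 400]        # hypothetical IC values
dda_store = [0, 0, 0, 0]

load_en_bloc(dda_store, gpdc_matrix)      # 4 words sent, no addresses
words_en_bloc = len(gpdc_matrix)

updates = [(2, 333)]                      # one IC changes before the re-run
load_exceptions(dda_store, updates)       # 1 address + 1 value sent
words_exceptions = 2 * len(updates)

print(dda_store, words_en_bloc, words_exceptions)
```

The addressed form sends fewer words only while fewer than half of the initial conditions change between runs; beyond that the en bloc form is the shorter transmission.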
3.3.4 Interruptions Handling
As with machine loading, the strategy is dependent on the logic 
structure of the DDA. As interruptions (which, incidentally, will 
arise mainly from element overflows, auxiliary register 
interruptions, etc) are comparatively infrequent on a per iteration basis, it 
is considered that they are best collected in a register in the DDA
and a single disjunction signal sent to the parallel logic. This can
then initiate a scan of this register to determine which element caused 
the interruption.
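The collect-then-scan strategy can be sketched as follows. This is a minimal illustration assuming one flag bit per machine element; the class and method names are hypothetical.

```python
class InterruptRegister:
    """One flag bit per machine element, collected inside the DDA."""
    def __init__(self, element_count):
        self.flags = [False] * element_count

    def raise_flag(self, element):
        self.flags[element] = True

    def disjunction(self):
        """The single signal sent onward: logical OR of all the flags."""
        return any(self.flags)

    def scan(self):
        """Find which elements interrupted, then clear their flags."""
        sources = [i for i, flag in enumerate(self.flags) if flag]
        self.flags = [False] * len(self.flags)
        return sources

register = InterruptRegister(16)
register.raise_flag(5)      # eg element 5 overflows
register.raise_flag(11)     # eg element 11's auxiliary register changes sign
if register.disjunction():
    print(register.scan())
```

Only the single OR signal needs attention on every iteration; the comparatively slow scan is performed only when that signal is raised.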
3.3.5 Scaling
This problem is the very kernel of much simulation (42, 43, 45). In a 
GPDC operating in some simulation language, there is no problem, due 
to the powerful combination of long computer words (typically 10 and 
20 equivalent decimal digits) together with, in many cases, a floating 
point facility. This makes for such a wide dynamic range that scaling 
can be very crude and yet perfectly satisfactory. In an analogue
computer, the dynamic range is much less, if accuracy is to be 
maintained, and scaling has to be very carefully applied.
In a DDA, floating point operation is rarely carried out (for an 
exception see Reference 24) so that the dynamic range is obtained
solely by the use of a long computer word. (Floating point, 
simultaneously processed digital integrators would be prohibitively 
expensive for most applications.) Thus, although more precision, and 
hence useful dynamic range, is available, care must still be exercised in 
the scaling of a problem. The methods used are thus very much akin 
to those employed for scaling the analogue computer. However, a DDA, 
being a device based on logic, is quite easy to reconfigure and thus 
the scaling of problems can be adjusted during problem execution so 
as to maximize the usefulness of a given dynamic range. An example 
of this would be in the use of the overflow/interruptions facilities 
to scale down low level signals which had grown and gone into 
over-range, or, alternatively, in the use of auxiliary registers to detect 
decaying signals, so that they may be uprated in amplitude.
3.4 Hybrid Computers Incorporating a DDA
A collection of digital integrators can form the basic machinery for 
the solution of problems involving sets of differential equations. 
However, additional logical items are required to service and monitor 
such integrators. In summary, these items are means of loading, 
monitoring, interruptions handling, interconnections programming and 
set-up, initial conditions loading, I/O to peripheral devices (hardcopy 
generation, plotting, archival storage etc), programme compilation, 
housekeeping etc. It is quite clear that much of this work can be 
carried out by suitable ad hoc logic, etc, built into a DDA, thereby 
making the system freestanding and independent of any other systems, 
but is very conveniently covered by a suitably programmed general 
purpose digital computer (GPDC). In fact, many establishments 
concerned with computing in its multifarious forms might already have 
a suitable GPDC and peripherals. Thus it is only reasonable to 
consider a hybrid configuration of DDA and GPDC as a more economical and 
powerful tool than the freestanding DDA. (SOIC Mk II is an example of 
such a hybrid.)
The two machines are both digital and this obviates the low precision 
analogue to digital conversion that is necessary when analogue and 
digital machines are interfaced. More importantly, the DDA having 
unlimited hold time (without loss of accuracy), is easier to use 
with a GPDC in a loop. (Problem instability can result in the con­
ventional hybrid in loop operation due to computational delays in 
the GPDC.)
The GPDC, through its mainframe store, can easily hold the DDA
program/IC matrix referred to earlier and quickly decant fresh
program material to the DDA through the normal I/O interface.
In effect, the DDA can become a satellite processor for the GPDC with
a rapid reprogram/reconfiguration characteristic virtually unknown
in the conventional hybrid. The communication from DDA to GPDC
can be not only in terms of element monitoring (problem results),
but also by way of interruptions generated in the DDA due to element
overflow and breakpoint crossing by auxiliary Y registers.
SOIC Mk I was built as a prototype DDA capable of being loaded from
an external source. SOIC Mk II is a system design making use of all
the facilities outlined in this Chapter.
C H A P T E R  4
EXTENSION TO THE HARDWARE INTEGRATION METHOD
The SOIC Mk I machine, in order to have a very high performance 
(speed-accuracy product) and be electronically programmable, required 
certain enhancements to be made to the conventional integrator design 
and the logic system in which these integrators were to be embedded.
4.1 An Approach to Combinational Processing of Integrators
A conventional two pass Euler integrator performs the following 
register transfers (Equation 4.1) and logically appears as in Figure 
4.1.
\[
\begin{aligned}
Y_{(i)} &= Y_{(i-1)} + \Sigma\Delta Y_{(i)} \\
\Delta Z_{(i)} + R_{(i)} &= R_{(i-1)} + Y_{(i)} \cdot \Delta X_{(i)}
\end{aligned}
\tag{4.1}
\]
[Figure 4.1: Two pass digital integrator. Selector 1 selects Y(i-1) on pass 1 and, on pass 2, Y(i) if ΔX(i) = 1, -Y(i) if ΔX(i) = -1, or 0 if ΔX(i) = 0. Selector 2 selects ΣΔY(i) on pass 1 and R(i-1) on pass 2.]
With this arrangement, ignoring the irrelevant effect of block carry
systems in the adder, two full carry propagate additions have to take
place. With standard TTL, this takes approximately 200 nS (worst
case) for 16 bit words. Including the necessary delays through the
selectors, plus the staticisation times of the registers when storing
Y(i) and R(i), the integration time is, worst case, 720 nS. If the
two passes could take place nearly concurrently and it was not
necessary for Y(i) to be established in the Y register before pass 2, a
much faster iteration would be possible.
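The register transfers of Equation 4.1 can be sketched in a few lines of Python. This is a minimal illustration only: the function name, the default R register length of 8 bits, and the convention that ΔZ is whatever overflows the R register (with R kept non-negative) are assumptions made for the sketch, not details taken from the SOIC hardware.

```python
def euler_step(y, r, sum_dy, dx, r_bits=8):
    """One iteration of the two pass Euler integrator (Equation 4.1).

    y, r    : contents of the Y and R registers after iteration i-1
    sum_dy  : the summed incoming increments for this iteration
    dx      : increment of the independent variable (e.g. -1, 0 or +1)
    r_bits  : hypothetical length of the R register
    Returns the new (y, r) and the output increment dz.
    """
    # Pass 1: update the Y register.
    y = y + sum_dy
    # Pass 2: add Y * dX into R; the overflow of R is the output dZ.
    modulus = 1 << r_bits
    total = r + y * dx
    dz = total // modulus           # floor division keeps r non-negative
    r = total - dz * modulus
    return y, r, dz


if __name__ == "__main__":
    y, r = 0, 0
    for sum_dy, dx in [(100, 1), (50, 1), (0, -1), (-30, 1)]:
        y, r, dz = euler_step(y, r, sum_dy, dx)
        print(y, r, dz)
```

Under these assumptions the accumulated output, scaled by the R register modulus, plus the residue left in R always equals the running sum of Y·ΔX, which is the sense in which the R register holds the round-off of the integration.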
It is clear that concurrent processing of Y and R calls for two adders.
However, if they are wired so that only combinational logic separates
them, the whole system will work much faster, as they will propagate
almost in tandem.
The reconfigured logic thus looks as in Figure 4.2.
As an example, if the adders are fabricated simply from a collection 
of full adders with carry rippling through, a timing chart for a 
short section will be as in Figure 4.3.
The total time through the two adders (including carry) is only 
slightly more than one full carry propagation time. The worst case 
integration time for this design is 450 nS. The increase in component 
cost is approximately 20%. Thus the improvement in performance-cost 
index is 33%.
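The timing argument and the quoted 33% figure can be checked with a small model. The sketch below assumes, as a simplification, that every full adder stage and every selector has the same delay T and that the adders are plain ripple-carry chains; the function names are illustrative. The performance-cost calculation uses only the figures quoted above (720 nS, 450 nS and a 20% cost increase).

```python
def ripple_settle_times(input_ready, t=1):
    """Settling time of each stage of a ripple-carry adder.

    input_ready[k] is the time at which the operand bits of stage k
    are valid; each stage adds a delay t after its slowest input.
    """
    times = []
    carry_ready = 0
    for ready in input_ready:
        settled = max(ready, carry_ready) + t
        times.append(settled)
        carry_ready = settled
    return times


def two_pass_time(n_bits, t=1):
    """Two separate additions, each followed by a selector delay."""
    one_pass = ripple_settle_times([0] * n_bits, t)[-1] + t
    return 2 * one_pass


def cascaded_time(n_bits, t=1):
    """Second adder fed from the first through a selector only."""
    first = ripple_settle_times([0] * n_bits, t)
    through_selector = [x + t for x in first]
    return ripple_settle_times(through_selector, t)[-1]


def performance_cost_gain(old_ns, new_ns, cost_ratio):
    """Fractional improvement in the performance-cost index."""
    return (old_ns / new_ns) / cost_ratio - 1


if __name__ == "__main__":
    print(two_pass_time(16), cascaded_time(16))
    print(round(100 * performance_cost_gain(720, 450, 1.20)))
```

In this model a 16 bit cascaded pair settles in 18T, only slightly more than the 16T of a single carry propagation, against 34T for two separate passes. The performance-cost improvement follows as (720/450)/1.2 - 1, which is one third.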
[Figure 4.2: Single pass digital integrator. Two adders (ADD) are cascaded, fed from Y(i-1), ΣΔY(i) and R(i-1). Selector 1 selects Y(i) if ΔX(i) = 1, -Y(i) if ΔX(i) = -1, or 0 if ΔX(i) = 0.]
[Figure 4.3: Timing diagram for cascaded full adders, showing a short section of the first adder array, the selectors and the second adder array, with settling times marked in multiples of T. Adder logic: S (= sum) = A ⊕ B ⊕ C, expressed as a sum of four minterms; C' (= carry) = AB + BC + AC. Selector logic: G (= output) is a two-term function of D, E and F. Note: delay through all logic blocks assumed to be T.]
The system benefits in other ways from this approach. In a large
computer system in which a great deal of data has to be staticised
in many registers, which are probably physically far apart, wide
margins of safety have to be left in the logic timing. These margins
are necessary because of the wide spread in the performance of the
buffer gates which are used to service the multitude of strobe inputs
to these registers. Furthermore, the propagation of these strobe
signals may take a negligible time to local registers but a
significant time to those more far-flung. There is a limit set on how
early a strobe pulse may arrive at a register by the settling time
of the data at the input to the register. This will dictate when
the strobe pulses may be expected to arrive at the furthermost
register. Clearly, by minimizing the occurrences of strobe pulses
per iteration, the total time wasted through providing margins can
be minimized. The single pass system helps considerably in this
respect.
4.2 Carry-Save Methods for Selection Logic
"Procrastination is the thief of time"
Night Thoughts: Edward Young
The interconnections method for electronically programmed machines
that was described in Chapter 3 uses a totally time domain oriented
approach using shift registers. Thus the total time taken to
effect interconnections is critically dependent on the shift rate
that can be achieved.
There are two ways in which the selected, incoming increments (ΔY)
can be handled: they may be stored in individual registers until
the interconnections cycle is over, and then summed during the
integration cycle (to form ΣΔY), or they may be summed as they become
available. The first method is expensive both in terms of storage and
arithmetic; the latter is slow because arithmetic must be performed
between each selector shift pulse. However, the latter is
comparatively inexpensive, and it was for this reason that it was
adopted in SOIC I. Nonetheless, in order to make the second method
acceptably fast, bearing in mind that the arithmetic is being
performed on 4 bit increments to form a 7 bit sum (ΣΔY), a rethink of
the conventional logic structure for accumulators was necessary. The
solution lay in the economical use of carry-save methods and, in
particular, in the use of a microcircuit which contains 4 full adders
with a fast internal carry path. This circuit, known as a quadruple
full adder, has logic optimized in the carry path so that the carry
ripple-through time is only 7 nS per bit compared with 40 nS for the
add time. The logic economy in the accumulator thus took the form of
saving only every 4th carry-bit and not every one, as for a
conventional carry-save system.
To accumulate, say, 7 ΔY's, 5 carry-save additions take place plus
a final full carry propagating addition. The increments added on
each occasion are 1 and 2, A + 3, A + 4, A + 5, A + 6 (in carry-save)
and A + 7 in carry propagate. (A is the accumulator.)
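The principle of deferring carry propagation until the last increment has arrived can be sketched as follows. This is a conventional carry-save accumulator in which every carry bit is saved, not the SOIC I variant that saves only every fourth carry, and it performs one carry-save step per increment rather than reproducing the exact sequence of additions listed above. The 8 bit working width and the function names are assumptions for illustration.

```python
MASK = 0xFF  # assumed 8 bit working width, two's complement


def carry_save_add(a, b, c):
    """Reduce three words to a sum word and a carry word (no ripple)."""
    partial_sum = a ^ b ^ c
    carries = ((a & b) | (b & c) | (a & c)) << 1
    return partial_sum & MASK, carries & MASK


def accumulate(increments):
    """Accumulate signed 4 bit increments, assimilating carries once."""
    sum_word, carry_word = 0, 0
    for inc in increments:
        # One fast carry-save step per selector shift pulse.
        sum_word, carry_word = carry_save_add(sum_word, carry_word, inc & MASK)
    # Final carry propagating addition (the carry assimilation).
    total = (sum_word + carry_word) & MASK
    return total - 256 if total & 0x80 else total


if __name__ == "__main__":
    print(accumulate([7, -8, 3, 7, 7, -1, 5]))
```

Each carry-save step costs one adder delay irrespective of word length, so the only full carry propagation occurs once, after the last increment.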
The logic appears as in Figure 4.4. The propagation times, based on
typical circuit delays, are shown.
The final addition is split into two parts: a carry-save addition
(A + 7) and a carry assimilation. The carry-save addition leaves
bits of significance -2^6, 2^5, 2^4, 2^3, 2^2, 2^1, 2^0; thus a spare 2^3
has to be assimilated into (-2^6, 2^5, 2^4, 2^3). This may be done in
one of two ways:-
(i) To use adders working at significances -2^6, 2^5, 2^4, 2^3 as
shown in Figure 4.5.
[Figure 4.4: SOIC I ΔY accumulator logic. The 4 bit ΔY from the selector, with sign propagation, is accumulated by quadruple full adders into an 8 bit parallel entry register holding ΣΔY, followed by the final carry assimilation. One section of a quadruple adder is unused. Expressions for outputs -2^6 through 2^3 are given in Equations 4.1.]
[Figure 4.5: ΣΔY carry assimilation - Method I. The ms part of ΣΔY, from the ΣΔY register, passes through half adders (HA: S = Y ⊕ Z, C' = YZ) and an exclusive OR (EO: S' = W ⊕ X).]
This is more economically carried out by using a single quadruple
full adder (QFA) as in Figure 4.6.
[Figure 4.6: ΣΔY carry assimilation - Method II. A single QFA is fed from the ΣΔY register; its carry out is not used.]
However, the QFA is very under-utilised as can be seen from
Figure 4.6.
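(ii) An alternative approach is to fabricate some ad hoc logic
which is, in effect, a look-up table for carry assimilation. The
logical connectives are:-
\[
\begin{aligned}
\text{O/P bit } (2^{3}) &= D \oplus E \\
(2^{4}) &= C\,\overline{(DE)} + \overline{C}\,(DE) = C \oplus (DE) \\
(2^{5}) &= B\,\overline{(CDE)} + \overline{B}\,(CDE) = B \oplus (CDE) \\
(-2^{6}) &= A\,\overline{(BCDE)} + \overline{A}\,(BCDE) = A \oplus (BCDE)
\end{aligned}
\tag{4.1}
\]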
4.3 Single Step Implementation of the Adams' Integration Methods
SOIC Mk I was built with a fixed integration algorithm in each
machine element: zero-order-hold (Euler), with a retrospective first
order (trapezoidal) correction. This decision came as a result of 
logic simulations of such integrators in various situations on the 
one hand, and economics on the other.
In order to maximize the performance (iteration rate) for such an
algorithm, the "combinational" approach to implementation was adopted.
The relevant register transfers (for time integration) are:-
\[
\begin{aligned}
Y_{(i)} &= Y_{(i-1)} + \Sigma\Delta Y_{(i)} \\
\Delta Z_{(i)} + R_{(i)} &= R_{(i-1)} + \left[ Y_{(i)} + \tfrac{1}{2}\Sigma\Delta Y_{(i)} \right] \Delta t_{(i)}
\end{aligned}
\tag{4.2}
\]
for a basic "trapezoidal" method and:-
\[
\begin{aligned}
Y_{(i)} &= Y_{(i-1)} + \tfrac{3}{2}\Sigma\Delta Y_{(i)} - \tfrac{1}{2}\Sigma\Delta Y_{(i-1)} \\
\Delta Z_{(i)} + R_{(i)} &= R_{(i-1)} + Y_{(i)} \cdot \Delta t_{(i)}
\end{aligned}
\tag{4.3}
\]
for the combinational approach.
Assuming ΣΔY is 0 as an initial condition at i = 1, (ΔZ(i) + R(i))
will be identical in equations 4.2 and 4.3 for all (i). However,
Y(i) will only be the same in equations 4.2 and 4.3 on the occasions
when ΣΔY(i) = 0. Thus Y(i) in equation 4.3 does not always
exactly represent the input dependent variable to an integrator.
In practice, and in SOIC I, this is of no consequence, as Y(i) is
used solely for "feeding" R(i-1); no monitoring, for instance, is
done at this variable. (Monitoring is carried out by collecting
the incoming ΔY's and summing them in a dummy (monitor) machine
element, elsewhere in the machine.)
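The claimed equivalence of Equations 4.2 and 4.3 can be checked numerically. The sketch below compares the quantity that multiplies Δt in each case, taking the ΣΔY that precedes the first iteration as zero. The function names are illustrative, and exact fractions are used only to avoid rounding in the halves; the hardware would of course use a binary shift.

```python
from fractions import Fraction


def integrands_eq_4_2(y0, sums):
    """Y(i) + (1/2) sum_dY(i), with Y updated by the plain increment sum."""
    y = Fraction(y0)
    out = []
    for s in sums:
        y += s
        out.append(y + Fraction(s, 2))
    return out


def integrands_eq_4_3(y0, sums):
    """Y(i) held in the combinational form: the correction is folded into Y."""
    y = Fraction(y0)
    previous = 0  # sum_dY preceding the first iteration, assumed zero
    out = []
    for s in sums:
        y += Fraction(3 * s, 2) - Fraction(previous, 2)
        out.append(y)
        previous = s
    return out


if __name__ == "__main__":
    sums = [2, -1, 0, 3]
    print(integrands_eq_4_2(0, sums))
    print(integrands_eq_4_3(0, sums))
```

Both lists agree term by term, so the quantity fed to the R register is the same in either form; the Y register of the combinational form differs from the true Y by half the current ΣΔY, and coincides with it whenever ΣΔY(i) = 0.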
4.4 Enhancement to the Shift Register Method of Programmed 
Interconnections
The programmed interconnections method described in paragraph 3.2.2
(and implemented in SOIC Mk I) used a single shift register in each
machine element which contained a 'pattern' of 0's and 1's which
corresponded to those element outputs which were to be selected or
rejected for that element. Each output was presented, in turn, to
a parallel bit highway and ANDed with the current selector shift
register output bit. See Figure 4.7.
[Figure 4.7: SOIC I elements output selection logic. The element outputs ΔZ(1) to ΔZ(m) enter (n × m) bit parallel entry shift registers and are gated by the selection logic (1 per element) under control of the selector shift register (1 per machine element); the highway continues to other machine elements. n = number of bits in each ΔZ signal; m = number of machine elements.]
The method is not very fast as it works purely in the time domain,
except for the fact that bit parallelism is used. For SOIC Mk I,
the interconnection time took 2 µS for a 16 element machine, compared
with 1 µS for the integration cycle. An alternative method suggested
by the author (but not implemented in hardware) is shown in Figure
4.8. The interconnection time is reduced by slicing the output and
selector registers into two halves or four quarters (as in Figure
4.8), eight eighths, etc, and making these subregisters work
concurrently. Such a method should basically reduce the
interconnections time to 1/2, 1/4, 1/8, etc of the time of the
original method.
[Figure 4.8: Quadruple rank selection method. The element outputs are fed in parallel into 4 ranks of (n × m/4) bit shift registers (output logic, 1 per system) driving 4 busbars, each of n wires. The input logic (1 per machine element) comprises selector shift registers and a 4 entry accumulator forming ΣΔY; each selected quantity is an n bit word. The busbars continue to other machine elements.]
The advantages of such an approach are:-
(i) No more stores are required to hold the integrator output
increments (they have not increased in number).
(ii) No more stores (in toto) are required per integrator to 
select the desired increments [for the same reason as (i)].
(iii) Multiple ranking is comparatively easy to implement in an 
existing DDA providing certain, obvious, provisions are made in 
advance for the additional logic. Thus increasing the size of a 
machine by two does not necessarily mean a doubling of the intercon­
nection time.
There are, naturally, some disadvantages:-
(i) Each integrator has to be prepared for two or four (or more) 
increments being selected in the same sampling period. These must 
be processed (accumulated or stored) concurrently in this period. 
This is a logical complication.
(ii) The effect of (i), if on-line increment accumulation is
attempted, will probably be a slight increase in the accumulator
delay time (not a doubling or quadrupling, of course).
(iii) Each integrator will now be connected to two, four, or more
busbars, so that if an LSI chip is to be designed in the future, the
lead-frame requirement will stretch somewhat. The increase is
not severe even for rank quadrupling and lead-frame technology can
pretty well cope with it today. It certainly will be able to in
the near future.
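The single rank and multiple rank selection schemes described above can be compared with a short behavioural sketch. It models only the sequence of selector shift steps and the accumulation of selected increments; the function names, and the assumption that the number of elements divides evenly into the ranks, are illustrative rather than taken from the hardware.

```python
def select_single_rank(outputs, selector_bits):
    """Present each element output in turn and AND it with the selector bit."""
    total, steps = 0, 0
    for dz, bit in zip(outputs, selector_bits):
        if bit:
            total += dz
        steps += 1
    return total, steps


def select_multi_rank(outputs, selector_bits, ranks):
    """Slice both registers into `ranks` subregisters shifted concurrently."""
    per_rank = len(outputs) // ranks  # assumes an exact division
    total, steps = 0, 0
    for step in range(per_rank):
        # In one sampling period, one output from every rank is examined,
        # so up to `ranks` increments may have to be accumulated together.
        for rank in range(ranks):
            k = rank * per_rank + step
            if selector_bits[k]:
                total += outputs[k]
        steps += 1
    return total, steps


if __name__ == "__main__":
    outputs = list(range(-8, 8))  # 16 element outputs
    selector = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0]
    print(select_single_rank(outputs, selector))
    print(select_multi_rank(outputs, selector, ranks=4))
```

Both schemes form the same ΣΔY; the quadruple rank version needs a quarter of the shift steps, at the price of handling up to four increments in each sampling period, which is disadvantage (i) above.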
The method used for adding the increments occurring in the sampling 
period depends, largely, on the number of bits in the increment.
For simple ternary systems it is probably easiest to use a tree of 
full adders.
A suitable system would be as shown in Figure 4.9.
It is not particularly fast, but would be adequate for a system 
requiring simplicity rather than the ultimate in speed. Paragraph
4.5 suggests a possible embellishment.
4.5 Use of Read-only Memories for Table Look-up and Logic
Synthesis in DDA's
There are many areas in which look-up tables can be used in a DDA.
It has already been suggested that the final assimilator of a
carry-save selector could use one. Another area is in the summation of
ternary increments in a machine using multiple ranking selection
systems. If four two-bit increments have to be added, a bipolar
ROM provides a fast, compact solution (Figure 4.10). There are
clearly a total of 8 bits to be accounted for, yielding 2^8, ie 256,
possibilities. The maximum excursions of the sum lie in the range
±4, ie 0100 thru 1100. Thus a 256 entry 4 bit table is called
for. This is well within the capabilities of today's technology
and look-up times of the order of 50 to 60 nS are possible. This
compares well with the 100 nS of the scheme in paragraph 4.4. Using
this technique, ranking could be further increased without
necessarily increasing the accumulation time, providing sufficiently
capacious ROM's exist. An alternative approach would be to use the
ROM's for summing the individual positive and negative bits (Figure
4.9) and merge the subtraction with the main integrator by
carry-save techniques. Summing 8 individual, equi-significant bits to
give a sum between 0 and 8 would again require a 256 entry, 4 bit
wide ROM. (One would be needed for negative and one for positive
ΔY increment bits.)
[Figure 4.9: ΣΔY logic for a ternary DDA. The positive and negative bits of 4 ternary coded ΔY's are summed separately, using full adders, half adders (HA) and inverters, to form ΣΔY. Ternary code: 0 = 00, +1 = 01, -1 = 10; 11 is unused.]
[Figure 4.10: ΣΔY logic using a read-only memory (ROM). The four ΔY's, ΔY(1) to ΔY(4), drive address inputs A0 to A7 of a 256 address × 4 bit ROM, whose output is ΣΔY.]
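The contents of such a table can be generated mechanically. The sketch below builds a 256 entry table for four ternary increments using the code of Figure 4.9 (0 = 00, +1 = 01, -1 = 10). The packing of the four increments into the address, lowest increment in the least significant pair of bits, and the use of None for addresses containing the unused code 11 are assumptions made for illustration.

```python
TERNARY = {0b00: 0, 0b01: +1, 0b10: -1}  # code 11 is unused


def build_rom():
    """Return a 256 entry table of 4 bit two's complement sums."""
    rom = []
    for address in range(256):
        total = 0
        valid = True
        for k in range(4):
            code = (address >> (2 * k)) & 0b11
            if code not in TERNARY:
                valid = False  # a "don't care" entry
                break
            total += TERNARY[code]
        rom.append(total & 0xF if valid else None)
    return rom


if __name__ == "__main__":
    rom = build_rom()
    print(format(rom[0b01010101], "04b"))  # four +1 increments
    print(format(rom[0b10101010], "04b"))  # four -1 increments
```

Only 81 of the 256 addresses correspond to valid ternary inputs; the remainder are free to be filled in whatever way suits the device.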
C H A P T E R  5
THE SOIC MK 1 DDA
5.1 Introduction to Machine: Design Aims: Applications
The SOIC Mk 1 machine, per se, was designed to extract the best
performance that could be obtained from a parallel digital
differential analyser, for a given technology (standard TTL) and
budget. In many respects it was a test bed for the several
innovations that had been mooted in previous chapters.
The role of production machines, based on the Mk 1 architecture, was
seen to be largely in real time simulation and problem solving where
computing power was at a premium, and as a replacement for the
analogue computer, in particular, in hybrid environments. With such
a postulated application area, particularly where it meant
"competition" with the analogue computer, not only had its
speed/accuracy product got to be at least comparable with the analogue
machine, but also its raw speed. This latter can be defined in many
ways: real time gain, slew rate or sine wave speed, whichever is more
convenient.
As the machine was intended for hybrid environments associated with 
GPDC's, it was decided, at the outset, that electronically patched 
interconnections were mandatory. Furthermore, in view of the lack 
of information available at the time (and now) on the implications of 
non-total topologies for analogue/DDA computers, it was decided to 
"play safe" and opt for a total interconnection topology.
SOIC 1 was clearly going to be a research instrument for feasibility 
work from the beginning, thus its size both in terms of word length 
and element population was limited, together with its expandability. 
This reduced the sheer labour of unnecessary construction to the 
absolute minimum.
5.2 Differences Between SOIC Mk 1 and Existing Machines
Logically, no two machine types are ever identical, so that this 
subject is best dealt with by comparing concepts and approaches, 
comparing SOIC 1 with its nearest counterpart, if appropriate, in 
each case.
(i) Multiple bit word for increments. All previous
simultaneously processed DDA's have used simple, ie binary or ternary,
increment systems; McGhee and Nilsen(25) (amongst others) have
postulated word increments but have never realised one in hardware.
(ii) Inexpensive interconnections method. This is an
implementation of an innovation. Hyatt and Ohlberg reported a more
complex system which was of only about the same performance.
(iii) Single step integration method. Few machines have ever been
built using a trapezoidal algorithm, none using multi-bit increments
or a single pass algorithm.
(iv) Integrated interconnections and loading logic. No known
precedents.
(v) On board potentiometers. Rae of Membrain has produced a DDA
like machine with on board potentiometers but this was a serial
machine having a patchboard. Iteration time: 100 µS compared with
3 µS for SOIC 1.
There are a multitude of other minor differences. These will not 
be listed.
5.3 Problems to be Overcome in the Implementation
"A master passion is the love of news."
George Crabbe
5.3.1 Interconnections Selection Rate
The method of paragraph 3.2.2 was adopted in its single ranked form.
Furthermore, in order to simplify the implementation of the
post-carry-save assimilator (suitable ROM's were not available at the
time), the machine element fan-in was restricted to 4, thus ΣΔY only
had bits of significance 2^ thru 2^. The assimilation was carried out
by parallel logic, rather than use of the 'ms' adder in the ΣΔY
accumulator, to investigate the delaying factor of such a scheme.
(See paragraph 4.5.)
5.3.2 Multiplication of Variables
To implement the functions ∫y dx or K∫y dt, where K is a constant,
requires a multiplier of some sort to form either Y(i)·ΔX(i) or
K·Y(i). As output increments from the machine were chosen to be
4 bits (a limit set by multiplication and interconnections costs)
the multiplier had only to work with multipliers of 4 bits precision,
which can be achieved by a single row adder and a '2 bits at a time'
multiplication method. To add merely another 2 bits to the
increments would have, of course, added 50% to the cost of
interconnections and caused a requirement for a 2 row adder for
multiplication - a doubling of the multiplier cost.
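The relationship between increment length and the number of adder rows can be illustrated by a '2 bits at a time' multiplication sketch. The digit recoding used here (an unsigned lower digit and a signed top digit) is one simple possibility chosen for illustration and is not necessarily the scheme used in SOIC I; the function names are likewise hypothetical. Each 2 bit digit yields one partial product, and summing p partial products takes p - 1 adder rows.

```python
def two_bit_digits(multiplier, bits):
    """Split a two's complement multiplier into 2 bit digits, LS first.

    The lower digits are unsigned (0..3); the top digit carries the
    sign and so lies in the range -2..1.
    """
    value = multiplier & ((1 << bits) - 1)
    digits = []
    for k in range(0, bits, 2):
        digit = (value >> k) & 0b11
        if k + 2 >= bits and digit >= 2:
            digit -= 4
        digits.append(digit)
    return digits


def multiply(y, increment, bits=4):
    """Return (product, adder_rows) for y times a `bits` wide increment."""
    digits = two_bit_digits(increment, bits)
    partials = [(y * d) << (2 * k) for k, d in enumerate(digits)]
    adder_rows = len(partials) - 1
    return sum(partials), adder_rows


if __name__ == "__main__":
    print(multiply(1234, -3, bits=4))
    print(multiply(1234, 21, bits=6))
```

With 4 bit increments there are two partial products and hence a single adder row; 6 bits gives two rows and 8 bits three, in line with the costs quoted in this chapter.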
5.3.3 Potentiometers
These were chosen as on-board devices so as to reduce the number
of connectable machine elements. The factors K, in K∫y dt, form the
potentiometer coefficients (scaling function), and to reduce costs
scaling is not carried out on arbitrary variables integration. (Two
multipliers would otherwise have been required, one of which would
probably have been redundant during the more common time integration.)
The restriction thus imposed on programming is quite trivial and has
been adopted in at least one other machine.(22)
5.3.4 Monitoring of Variables
The method outlined in paragraph 3.2.5 was adopted, but because
multiple bit increments were used, a fan-in restriction of 1 was
imposed so as to avoid the need for increment accumulators. It was
thought at the time, and later established through machine usage,
that this was in no way restrictive.
5.4 Introduction to Reference 46
Reference 46 is a paper published in the "Radio and Electronic 
Engineer" in May 1972 and contains details of the SOIC Mk 1 machine. 
Performance details and some sample waveforms are also shown.
5.5 Limitations of the Machine
5.5.1 Accuracy of Computation
Comparative figures are given in Reference 46 for computational 
accuracy. This paragraph is intended to highlight the cause of 
accuracy limits.
(i) Use of 12 bit machine variable was, of course, very decisive 
in limiting the accuracy. This limit was imposed on cost grounds 
and the limited speed-accuracy product that may be obtained with an 
integration method which only provides an Eulerian prediction.
Clearly a bigger word and a more ambitious algorithm would have 
improved the basic integrator performance. This has been verified 
in subsequent simulation studies.
(ii) Use of a greater number of bits in the increments would have 
further improved the speed accuracy product and/or reduced the per- 
step round-off error.
(iii) Single ranking of the interconnections matrix, even with only
a 16 element machine, caused interconnections to take twice as long
as integration: 2 µS vs. 1 µS. This further reduced the
speed-accuracy product. There are several other minor causes of loss
of accuracy. These are listed under the appropriate headings in
Appendix 2.
5.5.2 Number of Computing Elements that can be Connected
Because of an insistence on a total interconnections topology and 
single ranking of selectors, the number of elements that could be 
connected was limited to 16. This would be generally considered a 
far from adequate number. To use the same method for 64 elements
would have raised the iteration time from 3 to 9 µS of which only
1 µS would have been for integration.
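The figures quoted here follow from a simple linear model of the interconnection time. In the sketch below the 125 nS per selector shift step is inferred from the 2 µS quoted for 16 elements, and the assumption that the time scales directly with the number of elements and inversely with the number of ranks is an idealisation; the names are illustrative.

```python
STEP_NS = 2000 / 16      # inferred: 2 uS of interconnection for 16 elements
INTEGRATION_NS = 1000    # 1 uS integration cycle


def iteration_time_us(elements, ranks=1):
    """Interconnection plus integration time, in microseconds."""
    interconnection_ns = elements * STEP_NS / ranks
    return (interconnection_ns + INTEGRATION_NS) / 1000


if __name__ == "__main__":
    print(iteration_time_us(16))            # single rank, 16 elements
    print(iteration_time_us(64))            # single rank, 64 elements
    print(iteration_time_us(64, ranks=4))   # quadruple rank, 64 elements
```

On this model a 64 element machine with quadruple ranking would return to the 3 µS iteration time of the 16 element single rank machine, which is the motivation for the multiple rank method of paragraph 4.4.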
5.5.3 Efficiency of Functions Generation
Function generation in a DDA can be conveniently split into two 
forms :-
(i) "Continuous functions" such as can be generated by suitable,
direct programming of integrators, making considerable use of
arbitrary variables integration. These include the generation of
polynomials in terms of arbitrary variables, square roots, quotients,
reciprocals, trigonometric functions, etc.
(ii) Discontinuous and other difficult functions. These are often 
implemented in analogue computers using diodes or on a GPDC by table 
look-up and interpolation or extrapolation. In a DDA, these func­
tions are usually segmented as in a diode function unit. Use is 
made of comparators to detect ends of segments.
SOIC Mk 1 was very inefficient from the point of view of the provision 
of comparators - complete integrators had to be given over to this 
activity. Operating experience with Mk 1 made it very clear that 
an on-board comparator system was really mandatory if efficient 
function generation was to be carried out.
5.5.4 Cost of Implementation through Non-Centralised 
Storage
One item in the SOIC Mk 1 machine dominated all others in terms of 
cost - the storage of variables. Because it was a parallel machine 
working in a simultaneous mode all variables needed concurrent, 
parallel, access and entry. Thus, no use could be made, at all, of 
the then current developments of random access memories (RAM). Since 
1970, they have become very common, comparatively inexpensive but, 
of course, unsuitable for a simultaneous machine. (To decant all
16 words from a 16 word RAM would take every bit as long as
monitoring every bit of a serial access shift register.) However, in
view of the fact that the cost per bit of triggerable flip-flops 
(used in SOIC Mk 1) and RAMS is in the ratio 3 to 1 and rising, this 
form of storage could not be ignored for future machines. The 
question of making good use of them remains unanswered as far as 
SOIC 1 like systems are concerned.
5.6 Observations on the Performance of SOIC Mk I
It was mentioned earlier that the prime reason for the construction
of SOIC Mk I was to confirm the efficacy of the design and
innovations, as outlined in previous chapters, and to use it as a
test-bed for the further development of digital simulators.
The performance of the machine was much as expected from the system
design and simulations carried out on a general purpose digital
computer (GPDC).
However, use of the machine soon made it clear that, despite the 
considerable improvement it brought over existing machines, it was 
still wanting in several respects, if real time simulation was 
contemplated.
Briefly, the speed-accuracy product of the solutions was high, com­
pared with both GPDC and analogue methods: however, the raw speed 
was neither as high as the best analogue devices nor was the solution 
accuracy comparable with the GPDC. Furthermore, it was considerably 
lacking in facilities which would give it the necessary flexibility 
to make it generally useful, ie provision of a sufficiency of poten­
tiometers, comparators, interruptions handling devices, etc.
The shortcomings will now be considered in more detail.
The precision of a DDA is determined by the projected worst case
application to which the machine will be put, for any job. The
integrator gain is 2^n where n is the number of bits by which ΔZ is
shifted before presentation to a following Y register. If the word
(Y register) length is increased by m bits, without changing the
length of ΔZ, then for a given integration algorithm/interconnection
method, the solution (sine-wave) rate will be reduced by a factor of
2^m. Since the R register length is the difference of the lengths of
the Y and ΔZ registers, increasing the length of the Y register (with
a fixed ΔZ length) increases the length of the R register.
Thus, if the same gain factor is required of an integrator with an
increased basic precision, then the increment word lengths must
also be increased by the same number of bits. As these increments
determine the complexity of the built-in multiplier for each
integrator, their length is of crucial importance. For the pilot (Mk 1)
machine, the increment was 4 bits and was easily catered for by a
simple single-row-adder array. (Chapter 4.) If the increment word
length is increased to 8 bits, to allow a 16 bit (Y) machine to be
built of the same gain factor, then one of two changes has to be made:-
(i) the multiplier must be increased to have 3 rows 
with an attendant high cost and power consump­
tion and a slight increase in iteration time,
or
(ii) the multiplication function must be timeshared in 
the manner of normal shifting-and-adding methods.
This would slightly increase the integrator cost 
(for organisational logic and storage of partial 
products) but, more importantly, greatly increase
the iteration time to carry out this process.
The number of outlets provided by each integrator on the pilot machine
was only one, with the physical possibility of being able to provide,
at trivial extra cost, an unscaled version of the same integral. This
was found, in operation, to considerably limit the usefulness of each
integrator. Multiple use of integrators to provide many scaled
versions of the same integral had to be frequently carried out, which
was wasteful of both integrators and the interconnections capacity. The
ideal system for scaled integrators would appear to provide at least
one more scaled output, if not two. However, reference to paragraph
5.1.1 will show that such a requirement only serves to exacerbate
the cost/time problem for the current integrator technology, as either
more high cost multipliers would be needed to maintain the current
solution rate or a greater element of adder time-sharing invoked to
form these products.
The interconnections logic cost is dependent on the number of
integrators (and hence interconnection paths) and on the highway (ΔZ)
word width used. For the largely serial interconnection method used,
the first cost is quite low, leaving the cost dominance to the
highway width. If the ΔZ word width were to be doubled, then there
would be nearly a doubling of the interconnections logic. This comes
about from the fact that parallel-word-highways are used despite word
serialisation. The alternative, that is partial word serialisation,
would only serve to lengthen (double) the interconnection time (the
already dominant member of the two components making up the iteration
time).
Integration with respect to arbitrary variables would also be made
more difficult by lengthening ΔZ's, as this would be reflected as an
increase in the length of ΔX's. However, as the Mk 1 machine did
not provide scaled outputs involving arbitrary variables integration,
ie only K∫y dt and ∫y dx were offered, the multiplier performed
potentiometer scaling for the first integral and (y) × (δx)
multiplication for the latter.
5.6.1 Arbitrary Variables Integration (AVI)
SOIC 1, in common with all first order difference integrators, is
particularly poor when integrating with respect to arbitrary variables.
ΔX, for such integration, is the output of another integrator. Until
the latter produces a ΔZ, which will be at the end of the first
iteration at the earliest, an AVI has no information on which to work
and must, perforce, assume that ΔX = 0. In other words, AVI's not only
produce a one iteration delay, which is characteristic of all digital
integrators, but an additional 1 (at least) iteration delay because
of lack of an independent variable. If a string of AVI's are connected
as in Figure 5.1, the last integrator will produce no output until the
end of the fourth iteration (if the first machine element is a time
integrator).
[Figure 5.1: System of cascaded arbitrary variables integrators, each taking its ΔX from the ΔZ of the preceding element, eg ΔZ(3) = Y(3)·ΔX(3).]
Because of the considerable use made of arbitrary variables integra­
tion in DDA's (for the multitude of functions generated), this is a 
serious shortcoming of the present elements.
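The build-up of delay along a string of AVI's can be illustrated with a small idealised model. It assumes that the first element is a time integrator which can produce an increment in the first iteration, and that each AVI produces an output in any iteration for which its predecessor had already produced one; real integrators may take longer if their increments are zero. The function name is illustrative.

```python
def first_output_iterations(chain_length):
    """Iteration at which each element of the chain first has an output."""
    has_output = [False] * chain_length
    first_seen = [None] * chain_length
    iteration = 0
    while first_seen[-1] is None:
        iteration += 1
        previous = list(has_output)  # state at the end of the last iteration
        for k in range(chain_length):
            # Element 0 integrates with respect to time; every other
            # element needs the dZ its predecessor has already produced.
            active = True if k == 0 else previous[k - 1]
            if active and first_seen[k] is None:
                first_seen[k] = iteration
            has_output[k] = has_output[k] or active
    return first_seen


if __name__ == "__main__":
    print(first_output_iterations(4))
```

For a chain of four elements the model gives first outputs at the ends of iterations 1, 2, 3 and 4 respectively, in agreement with the delay described above.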
In addition to observing the limitations of the Mk 1 machine, tests
were performed on the data highways, ie ΔZ, ΔY, ΔX, to see if the
utilisation warranted the vast number of lines used for data
transmission. If the utilisation was poor, it would indicate that an
alternative strategy should be sought for the transmission and
multiplication of such quantities and hence perhaps the basic structure
of the present integrators. This evaluation could only be reasonably
carried out on a working machine, as the simulation costs and time
involved might be much greater and not highlight all the properties
of the system. The latter point is made possible by the fact that
the simulation was carried out more at a "machine definition" level
rather than a detailed logic level. This detail limitation was
forced because of the vast increase in simulation time that would
have resulted.
5.6.2 Rate of Change of Increments
It was noted that for the majority of the computing time, changes
to $\Delta$ signals between iterations were of zero or unit quanta only.
Expressed in difference form,
$$\Delta^2 Z_{(i)} = \Delta Z_{(i)} - \Delta Z_{(i-1)} \qquad (1)$$
was limited in value.
The majority of exceptions occurred either:-
(a) when very large amplitude signals were involved, which
caused a slightly greater increase of $\Delta^2$ of ±2, or
(b) during the first couple of iterations, when a step or ramp
input was applied to the system, which resulted in temporarily
large values of $\Delta Z$, reflected in large values of $\Delta^2 Z$.
In view of these observations, it was clear that data transmission
could be greatly simplified if multi-bit $\Delta$ highways were replaced
by, say, ternary $\Delta^2$ highways (bit) and an interruption system for
processing the exceptional excursions of $\Delta^2$. Large starting-up
values of $\Delta^2$ could be accounted for by suitable choice of initial
conditions, ie the first order difference for a normal integrator can
be preset, ready for iteration 1.
CHAPTER 6
THE SECOND ORDER DIFFERENCE INTEGRATOR
"Great fleas have little fleas upon their backs to bite 'em, 
And little fleas have lesser fleas, and so ad infinitum."
A Budget of Paradoxes. Augustus de Morgan
6.1 The Second Order Method
If, instead of transmitting increments, $\Delta$, of variables, the second
order differences, $\Delta^2$, are used, then the original variables, X, Y, Z
can be built up from two levels of accumulation in any machine element
(Figure 6.1).
Figure 6.1 BASIC SECOND ORDER DIFFERENCE INTEGRATOR (accumulate $\Delta^2 Y$ to form $\Delta Y$; accumulate $\Delta Y$ to form Y)
In register transfer form, this is:-
$$\Delta Y_{(i)} = \Delta Y_{(i-1)} + \Delta^2 Y_{(i)}$$
and
$$Y_{(i)} = Y_{(i-1)} + \Delta Y_{(i)} = Y_{(i-1)} + \Delta Y_{(i-1)} + \Delta^2 Y_{(i)} \qquad (1)$$
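The two levels of accumulation can be sketched in a few lines. This is only an illustration of the register transfers above, assuming integer registers of unlimited length; the function and argument names are hypothetical.

```python
def rebuild_from_second_differences(y0, dy0, d2y_inputs):
    """Rebuild Y from a stream of second order differences.

    y0 and dy0 are the initial conditions Y(0) and dY(0); each received
    d2Y(i) is first accumulated into dY, and dY is then accumulated into Y.
    """
    y, dy = y0, dy0
    history = []
    for d2y in d2y_inputs:
        dy += d2y          # dY(i) = dY(i-1) + d2Y(i)
        y += dy            # Y(i)  = Y(i-1)  + dY(i)
        history.append(y)
    return history
```

For example, starting from rest, the inputs 1, 0, 0, -1 produce a ramp that rises for three iterations and then levels off.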
The reverse process, that is the generation of second order differences
of information processed in a machine element, is also simple. For
example an integrator working with respect to time, in its simplest
form, would consist of the register structure shown in Figure 6.2.
$$\Delta Y_{(i)} = \Delta Y_{(i-1)} + \Delta^2 Y_{(i)}$$
$$Y_{(i)} = Y_{(i-1)} + \Delta Y_{(i)}$$
$$\Delta Z_{(i)} + R_{(i)} = R_{(i-1)} + Y_{(i)}$$
$$\Delta^2 Z_{(i)} = \Delta Z_{(i)} - \Delta Z_{(i-1)}$$
Figure 6.2 BASIC SECOND ORDER TIME INTEGRATOR
For this treatment, all the registers may be considered as being
multi-bit. Clearly, in the process of adding $Y_{(i)}$ to $R_{(i-1)}$, a
change of $\Delta Z$ can only occur if the topmost bit position of R
transmits a carry through to $\Delta Z$ so as to increment $\Delta Z$ by 1.
However, $Y_{(i)}$ may also differ from $Y_{(i-1)}$ at the same significance
as the bottom bit of $\Delta Z$ due to the register transfer equation,
$$Y_{(i)} = Y_{(i-1)} + \Delta Y_{(i)}$$
If $\Delta Y_{(max)} \le R_{(max)}$, this difference is restricted to ±1 quantum
at the stated significance of Y. See Appendices 1 and 3.
Thus, taking both changes in Y and carries out of R into account,
changes in $\Delta Z$ are limited to two quanta only, making a quinary
($\Delta^2 \in \{\pm 1, \pm 2, 0\}$) system of information transfer possible.
The logic for generating $\Delta^2 Z$ has certain ramifications that are not
obvious in the foregoing description:-
(i) As $\Delta^2$ information is required and not $\Delta$ for outputting to
other machine elements, $\Delta Z$ need not be contained in any registers.
Indeed, it does not even exist as an available piece of information
in a machine element. $\Delta^2 Z$ can be obtained from changes in other
registers/adders since the previous iteration.
(ii) Following from (i), that part of Y of significance that is
congruent with that of $\Delta Z$ is therefore also unnecessary. Thus the
Y register may be truncated to a length equal to that of the R
register. In this text, this part is called F[Y].
(iii) The processing of $\Delta^2 Z$ from carries out of registers is
complicated by the effects of signing of register contents, in
particular, Y. In two's complement arithmetic, a carry out of an
adder due to a negative operand will yield 'no change', whereas the
absence of a carry will signify a decrement of 1 to the result.
The extra complication is minimal from the logic viewpoint.
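The behaviour of the basic second order time integrator can be explored with a small simulation. The sketch below is an assumption-laden model rather than the thesis logic: it keeps a full-length Y register and holds the previous $\Delta Z$ explicitly (the text notes that real hardware need do neither), and it uses Python's floor division to stand in for two's complement overflow out of an R register of `r_bits` bits. All names are hypothetical.

```python
def second_order_time_integrator(d2y_inputs, y0=0, dy0=0, r_bits=16):
    """Return the stream of d2Z values emitted for a stream of d2Y inputs."""
    modulus = 1 << r_bits
    y, dy, r, prev_dz = y0, dy0, 0, 0
    d2z_out = []
    for d2y in d2y_inputs:
        dy += d2y                  # dY(i) = dY(i-1) + d2Y(i)
        y += dy                    # Y(i)  = Y(i-1)  + dY(i)
        total = r + y              # dZ(i) + R(i) = R(i-1) + Y(i)
        dz = total // modulus      # overflow out of R (floor, so sign is handled)
        r = total % modulus        # residue kept in R, always in [0, 2**r_bits)
        d2z_out.append(dz - prev_dz)   # d2Z(i) = dZ(i) - dZ(i-1)
        prev_dz = dz
    return d2z_out
```

As long as $|\Delta Y|$ stays below the capacity of R, the emitted values in this model never exceed two quanta in magnitude, in line with the quinary range discussed above.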
6.2 Initial Conditions
Evidently, for this method, a given machine element must not only
contain $Y_{(0)}$ as an initial condition, but also $\Delta Y_{(0)}$. This latter
variable is easily derived as it is merely the scaled version of
the Y register content of the machine element which feeds the given
integrator (Figure 6.3).
Figure 6.3 INTERCONNECTION OF SECOND ORDER DIFFERENCE INTEGRATORS (an "upstream" machine element (integrator) feeding the element under consideration in the text, para 6.2)
Having this additional difference information available allows the
DDA designer to avoid multiplication in multi-bit $\Delta Z$ machines
(paragraph 5.3.2). This is because the products can, at any time,
be built up from an initially computed product (before the start of
computation) plus a series of increments determined from second
order differences.
As an example, to compute $KY$ for scaled time integration would
normally require a full multiplier as both K and Y are multi-bit
quantities. However, the alternative approach is as follows:-
$$K\Delta Y_{(i)} = K\Delta Y_{(i-1)} + K\Delta^2 Y_{(i)} = K(\Delta Y_{(i-1)} + \Delta^2 Y_{(i)})$$
$$KY_{(i)} = KY_{(i-1)} + K\Delta Y_{(i)} = K(Y_{(i-1)} + \Delta Y_{(i)})$$
$K\Delta^2 Y_{(i)}$ involves multiplying a multi-bit number by a second order
difference (which is a one-bit quantity). This is trivial, hence
$K\Delta Y_{(i)}$ can be formed by simple addition. Hence $KY_{(i)}$, which entails
accumulation (summation) only.
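A minimal sketch of this multiplier-free scaling follows. It assumes integer K and Y, second order differences restricted to the quinary set {-2, -1, 0, 1, 2}, and initial products formed once before the run; the helper `small_multiple` (a hypothetical name) replaces the multiplication by a select, shift and negate.

```python
def small_multiple(k, d2):
    """Return k * d2 for d2 in {-2, -1, 0, 1, 2} using only shift and negate."""
    if d2 == 0:
        return 0
    magnitude = k if abs(d2) == 1 else k << 1
    return magnitude if d2 > 0 else -magnitude


def scaled_y_by_accumulation(k, y0, dy0, d2y_inputs):
    """Track K*Y(i) by accumulation, given K*Y(0) and K*dY(0) as initial conditions."""
    ky, kdy = k * y0, k * dy0      # computed once, before the start of computation
    history = []
    for d2y in d2y_inputs:
        kdy += small_multiple(k, d2y)   # K*dY(i) = K*dY(i-1) + K*d2Y(i)
        ky += kdy                       # K*Y(i)  = K*Y(i-1)  + K*dY(i)
        history.append(ky)
    return history
```

Comparing the accumulated values against a direct product for a short input stream is a quick way to confirm that no full multiplier is needed inside the loop.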
By a similar technique, $Y\,\Delta X$ can be formed by the simple
expedient of reducing both Y and $\Delta X$ to their second order difference
forms and building up a product through accumulation of two sets
of variables:-
$$Y_{(i)} = Y_{(i-1)} + \Delta Y_{(i)}$$
$$\Delta X_{(i)} = \Delta X_{(i-1)} + \Delta^2 X_{(i)}$$
Hence
$$Y_{(i)}\,\Delta X_{(i)} = Y\,\Delta X \ \text{(previous values)} + \text{products involving second order differences} \qquad (3)$$
6.3 Facets of the Second Order Method
6.3.1 Simplification of Interconnections
Since variables having only a very small range of values are
transmitted between machine elements, a great cost saving in this area
is realised. In a typical machine $\Delta^2$ variables would have their
values restricted to ±1, ±2, and 0. This can be implemented in
a simple three-wire quinary-transfer-system.
A comparable machine with 16 bit variables (Y) and 8 bit increments,
$\Delta$, would, of course, require 8 channel highways.
6.3.2 Other Functions that can be Generated
As products can be formed as shown in the previous paragraph, it
is tempting to attempt to form the product of two arbitrary variables
in a single machine element. In a conventional DDA, two variable
products are formed as the summed pair of cross integrals
$$(yV) = \int y\,dV + \int V\,dy \qquad (4)$$
There are several disadvantages to the method in a practical system:-
(i) The accuracy of the method not only depends on the
accumulation of round off errors due to the transmission system used,
but is also dependent on the integration algorithm. It has already
been pointed out that arbitrary variables integration is very poor
on conventional digital integrators.
(ii) Two integrators are committed. In view of the large number
of occurrences of multiplication in real simulation problems, this
is very wasteful both of machine elements and the machine
interconnection system.
(iii) The outputs from the two integrators still have to be summed
in a downstream machine element. Often the product has to be applied
to several other elements, all of which are committed to summing
these two components. This is wasteful of the fan-in capability of
the machine elements.
The second order method can be used to extend the concepts of the
previous two paragraphs to form the product of two variables. The
main disadvantage is that several initial conditions have to
be formed prior to computation. However, they are trivial to derive
and only waste loading time once in a run.
6.3.3 Introduction to Appendix I
Appendix 1, "The Second Order Difference Integrator ..." gives the
theoretical basis for the second order method and shows what
performance figures can be expected from a system built in this
manner. Of particular importance is its appendix which gives a
proof, without recourse to complex state-space analysis, of the
range of $\Delta^2$ values to be expected for a given register structure.
6.4 Advantages and Disadvantages of the Second Order
Difference Method
"How long halt ye between two opinions?"
1 Kings 18:21
6.4.1 Slew Rate
The second order difference system allows large values of first
order differences to be used without having to transmit them
around the machine. In order to do this a restriction is
effectively placed on the differences between them. Thus although
the slope of variables is now allowed to take large values, the
rate of change of slopes is not.
6.4.2 Cost/storage
The largest cost item in the first order system was the storage of 
variables because they all required concurrent parallel bit access, and 
thus had to be fabricated from autonomous flip-flop registers.
The second order method requires more variables to be stored in 
each machine element but because the algorithms are dominated by 
sequential accumulation processes, often using known previous step 
information, register contents can be called in sequence. Thus
random access memories (RAM) can be used to considerable advantage.
An example of a sequential register call is given below:
$$Y'_{(i)} = Y_{(i-1)} + \Delta Y_{(i-1)} \qquad (5)$$
This is a partial sum, which can be computed during the
interconnections period between iteration (i-1) and (i) without reference
to second order data.
After the interconnection period, the rest of the derivation of
$Y_{(i)}$ can be carried out:-
$$Y_{(i)} = Y'_{(i)} + \Delta^2 Y_{(i)}$$
and incidentally,
$$\Delta Y_{(i)} = \Delta Y_{(i-1)} + \Delta^2 Y_{(i)} \qquad (6)$$
The latter, which is probably not required in the derivation of
$Y_{(i)}$ as it has already been effectively implanted in $Y_{(i)}$, could be
computed any time as late as the beginning of the interconnections
period between step (i) and (i+1).
6.4.3 Initial Conditions Loading
This topic has already been mentioned in paragraphs 6.2.1 and 6.2.3.
In summary, the total number of initial conditions to be loaded, for a
unity fan-in machine element, is as follows.
(a) Time Integration
$[K\Delta Y_{(0)}]$, $K[Y_{(0)}]$ (7)
cf first order system: $Y_{(0)}$
(b) Arbitrary Variables Integration
$[Y_{(0)} \cdot \Delta X_{(0)}]$, $[\Delta Y_{(0)} \cdot \Delta X_{(0)}]$, $[\Delta Y_{(0)}]$, $[Y_{(0)}]$, $[\Delta X_{(0)}]$ (8)
cf first order system: $Y_{(0)}$
(c) Multiplication
$[Y_{(0)} \cdot X_{(0)}]$, $[\Delta Y_{(0)} \cdot X_{(0)}]$, $[Y_{(0)} \cdot \Delta X_{(0)}]$, $[\Delta Y_{(0)} \cdot \Delta X_{(0)}]$,
$[\Delta X_{(0)}]$, $[X_{(0)}]$, $[\Delta Y_{(0)}]$, $[Y_{(0)}]$ (9)
cf $X_{(0)}$, $Y_{(0)}$, $[X_{(0)} \cdot Y_{(0)}]$.
6.4.4 Second Order Difference Magnitudes
This is covered from a statistical viewpoint in appendix 1. However,
as an introduction, consider the register structure of Figure 6.4.
Figure 6.4 SECOND ORDER INTEGRATOR REGISTER STRUCTURE
If the simplest form of integration (time: Euler) is used, it is clear
that the range of values that $\Delta Y$ can take is the dominating influence
on the range of $\Delta^2 Z$. See appendix 1. If Y is added to R then the
only changes in contiguous values of $\Delta Z$ can be due to the carry
propagated in the Y/R adder from the top significance of R (or F[Y])
to the bottom of $\Delta Z$ (or I[Y]). Thus $\Delta^2 Z$ can only assume values of
0 or 1 if Y is positive or 0 and -1 if Y is negative. However, in
addition, $\Delta Y$ is added to Y before Y is added to R. Thus a carry can
be propagated from the top bit of F[Y] to the bottom bit of I[Y]
even if $|\Delta Y|$ is very small (> 1). The two carries can add if
suitable initial values of Y and R exist so that the total perturbation
of $\Delta Z$ can be two units, ie $|\Delta^2 Z| = 2$. Provided that $|\Delta Y|_{max} < F[Y]$
the change in I[Y] can only be 1 unit (due to the carry in the Y
adder). However, if $|\Delta Y|_{max} > F[Y]$, then I[Y] can change, of course,
by more than 1 unit (Figure 6.5).
Figure 6.5 SECOND ORDER INTEGRATOR WITH HIGH CAPACITY $\Delta Y$ REGISTER
Thus the maximum value of $|\Delta^2 Z|$ is given by
where ENTIER denotes the nearest integer whose magnitude
is less than [ ],
and n = number of bits in the R register.
6.5 Techniques with the Second Order Difference Method
"Think of your forefathers! Think of your posterity!" 
John Adams 1802
6.5.1 Arbitrary Variables Integration
A typical register structure for carrying out this operation is
shown in Figure 6.6.
Figure 6.6 REGISTER SYSTEM FOR ARBITRARY VARIABLES INTEGRATOR (AVI) (boxed registers in the figure are noted as not implemented in hardware)
This leads to the following magnitude limits
-2 < A2Y $ 2
f  9 2 ** 9
j-2 $ A Y  * 2
( Steps of 2 3
[-215 < AX < 215'
v a y ^
(Steps of 2 )
To S f[y] < 21? '
1 (Steps of 2 )
f-22 3 * I [Y] < 22 3
(Steps of 2 ^ )
•2 8 < I X 'AlY I < 2 '
Y . A X
15 15
-2 < AY.AX < 2
o < / f P-AX]\ 16
U * yF[X.AYU
,19 V i p.AX]) 19 
Z *\l[X.AY]/ z
(Steps of 2^3)
0 £ R < 21 6
•22 3 < AZ < 22 3  
(Steps of 2
A typical algorithm might be
$$\Delta Z_{(i)} = Y_{(i)}\,\Delta X_{(i)} + \tfrac{1}{2}\,\Delta Y_{(i)}\,\Delta X_{(i)} + Y_{(i)}\,\Delta^2 X_{(i)}$$
This is a simple first order method which is not too complex despite 
the fact that there is an intrinsic prediction associated with the 
current value employed for the independent variable.
Graphically, the algorithm looks as in Figure 6.7.
Figure 6.7 GRAPHICAL REPRESENTATION OF AVI ALGORITHM
Area 1 ≡ $Y_{(i)}\,\Delta X_{(i)}$
Area 2 ≡ $\tfrac{1}{2}\,\Delta Y_{(i)}\,\Delta X_{(i)}$
Area 3 ≡ $Y_{(i)}\,\Delta^2 X_{(i)}$
It is readily seen that arbitrary variables integration involves
predictions of variables at two levels, the dependent (as always)
and the independent. Thus an algorithm which is to benefit in
any way from a first order post iterative correction must correct, in
some measure, the integral prediction errors caused by both
predictions. Area 1 represents the integral that would normally be
associated with Euler's algorithm; area two represents a slope
prediction, and area three represents a first order attempt to make
good the area omitted from the previous step independent variable
prediction. To provide a more sophisticated correction is
unlikely to benefit the system much, as in most cases the predictive
error of omitting area three during the previous iteration (i-1) is
fairly serious and to refine the correction would be a case of
providing too little, too late.
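The three areas can be written out as a short function. This is a sketch of the algorithm as stated, ignoring register lengths, scaling and the R residue; the function name is hypothetical.

```python
def avi_increment(y, dy, dx, d2x):
    """Return dZ(i) for arbitrary variables integration as the sum of three areas."""
    area1 = y * dx            # Euler term: Y(i) * dX(i)
    area2 = 0.5 * dy * dx     # slope prediction: (1/2) * dY(i) * dX(i)
    area3 = y * d2x           # post-correction for the independent variable: Y(i) * d2X(i)
    return area1 + area2 + area3
```

With Y = 10, $\Delta Y$ = 2, $\Delta X$ = 4 and $\Delta^2 X$ = 1 the areas are 40, 4 and 10 respectively.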
6.5.2 Multiplication
"The good Lord said to the animals: go out and multiply: 
but the snake answered: How could I? I am an adder!"
Multiplication can use the same register structure as AVI, ie as
shown in Figure 6.5. Multiplication requires one additional variable
to be handled (which is not necessary in AVI), that is X. Hence
sub-products $X_{(i)}\,\Delta^2 Y_{(i)}$ and $\Delta Y_{(i)}\,X_{(i)}$ appear. The magnitude limits
are also as indicated for AVI, as might be expected. See paragraph
6.4.1.
A suitable algorithm, which is based on true product formation and not
on two integrals being summed, is shown below. It omits only
one term, as seen from the graphical interpretation. Simulation
confirmed that this term is very small in almost all cases compared
with the other sub-products and that its omission is certainly always
small compared with the built-in round-off errors inherent in
the system.
Multiplication algorithm
$$\Delta Z_{(i)} = Y_{(i)}\,\Delta X_{(i)} + \Delta Y_{(i)}\,X_{(i)} + \Delta Y_{(i)}\,\Delta X_{(i)} + Y_{(i)}\,\Delta^2 X_{(i)} + X_{(i)}\,\Delta^2 Y_{(i)}$$
This algorithm takes no account of a final term involving the
sub-product $\Delta^2 Y_{(i)} \cdot \Delta^2 X_{(i)}$. This is clearly small compared with the
others.
The algorithm is predictive in terms of both variables and provides
a post correction for both.
Figure 6.8 GRAPHICAL REPRESENTATION OF MULTIPLICATION ALGORITHM (equivalent area (product) at end of time (i-1) and at end of time (i))
Figure 6.9 COMPONENTS OF MULTIPLICATION ALGORITHM
(It can be seen that a superfluous term should also be
subtracted.)
Areas 1-3 represent the new (predicted) area for step (i). Areas
4 and 5 represent a post-correction for step (i-1).
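A suitable register transfer scheme would be:
* $\Delta Y_{(i)} = \Delta Y_{(i-1)} + \Delta^2 Y_{(i)}$
* $\Delta X_{(i)} = \Delta X_{(i-1)} + \Delta^2 X_{(i)}$
* $\Delta Y_{(i)}\,\Delta X_{(i)} = \Delta Y_{(i-1)}\,\Delta X_{(i-1)} + \Delta^2 Y_{(i)}\,\Delta X_{(i-1)} + \Delta^2 X_{(i)}\,\Delta Y_{(i)}$
* $Y_{(i)}\,\Delta X_{(i)} = Y_{(i-1)}\,\Delta X_{(i-1)} + Y_{(i-1)}\,\Delta^2 X_{(i)} + \Delta Y_{(i)}\,\Delta X_{(i)}$
$\Delta Y_{(i)}\,X_{(i)} = \Delta Y_{(i-1)}\,X_{(i-1)} + \Delta^2 Y_{(i)}\,X_{(i-1)} + \Delta Y_{(i)}\,\Delta X_{(i)}$
$Y_{(i)} = Y_{(i-1)} + \Delta Y_{(i)}$
$X_{(i)} = X_{(i-1)} + \Delta X_{(i)}$
$\Delta Z_{(i)} = I\{2^{-16}\,[R_{(i-1)} + Y_{(i)}\,\Delta X_{(i)} + \Delta Y_{(i)}\,X_{(i)} + \Delta Y_{(i)}\,\Delta X_{(i)} + Y_{(i)}\,\Delta^2 X_{(i)} + X_{(i)}\,\Delta^2 Y_{(i)}]\} = I\{P\}$
* $R_{(i)} = P - 2^{16}\,\Delta Z_{(i)}$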
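The sub-product transfers in this scheme can be checked numerically. The sketch below is an illustrative model only: it uses unbounded Python integers, omits the $2^{-16}$ scaling and the R residue, and forms the initial sub-products once from the initial conditions. Function and variable names are hypothetical.

```python
def multiplier_subproducts(d2y_seq, d2x_seq, y0=0, x0=0, dy0=0, dx0=0):
    """Run the sub-product register transfers and return one record per iteration."""
    y, x, dy, dx = y0, x0, dy0, dx0
    dydx, ydx, dyx = dy * dx, y * dx, dy * x   # initial conditions, formed once
    history = []
    for d2y, d2x in zip(d2y_seq, d2x_seq):
        dy_prev, dx_prev, y_prev, x_prev = dy, dx, y, x
        dy = dy_prev + d2y
        dx = dx_prev + d2x
        # dY(i)dX(i) = dY(i-1)dX(i-1) + d2Y(i)dX(i-1) + d2X(i)dY(i)
        dydx = dydx + d2y * dx_prev + d2x * dy
        # Y(i)dX(i) = Y(i-1)dX(i-1) + Y(i-1)d2X(i) + dY(i)dX(i)
        ydx = ydx + y_prev * d2x + dydx
        # dY(i)X(i) = dY(i-1)X(i-1) + d2Y(i)X(i-1) + dY(i)dX(i)
        dyx = dyx + d2y * x_prev + dydx
        y = y_prev + dy
        x = x_prev + dx
        # Unscaled bracket of the dZ(i) transfer, without the R residue
        dz = ydx + dyx + dydx + y * d2x + x * d2y
        history.append({"y": y, "x": x, "dy": dy, "dx": dx,
                        "dydx": dydx, "ydx": ydx, "dyx": dyx, "dz": dz})
    return history
```

Every multiplication inside the loop has a second order difference as one operand, so with quinary inputs each reduces to a select, shift and add; the accumulated sub-products nevertheless agree with the directly computed products at every step.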
6.5.3 Variables Summation
* Also used in AVI
Although a fairly adequate fan-in can be provided by time integrators 
for most purposes, it is uneconomic to provide fan-in which is not 
being well utilised in the majority of problems. Thus fan-in to 
such integrators is often limited and a separate summing facility 
provided. The summation algorithm is particularly simple although 
it is necessary to attenuate the signals fed in to avoid over­
loading the interconnection systems under large signal situations.
The register structure used is the same as for conventional time 
integration. It is desirable that, like the latter, scaling of 
input variables be provided.
A suitable algorithm is:-
$$\Delta Z_{(i)} = G \sum_{n=1}^{j} (K_n\,\Delta Y_n)$$
where G is a gain factor (<1) and
j is the fan-in.
As there is no processing carried out, ie integration, no post
corrections are necessary.
A suitable register transfer scheme would be:-
$$\Delta^2 Y_{(i)} = I\{2^{-8}\,[\ldots]\} = I\{N\}$$
$$F[\Delta^2 Y_{(i)}] = N - 2^{8}\,\Delta^2 Y_{(i)}$$
$$\Delta Y_{(i)} = \Delta Y_{(i-1)} + \Delta^2 Y_{(i)}$$
$$\Delta Z_{(i)} = I\{2^{-16}\,[R_{(i-1)} + \Delta Y_{(i)}]\} = I\{M\}$$
$$R_{(i)} = M - 2^{16}\,\Delta Z_{(i)}$$
where
I{ } = nearest integer which is algebraically less than { }
$K_n$ = potentiometer coefficient n
$\Delta^2 Y_n$ = second order difference input n
$F[\Delta^2 Y]$ is the "fractional" residue of $\sum (K_n \cdot \Delta^2 Y_n)$ of
significance $< 2^{8}$.
6.6 Higher than Second Order Differences
"'Curiouser and curiouser!' cried Alice"
Alice's Adventures in Wonderland, Lewis Carroll
Having embarked on a system of increasing the difference order of 
represented data in a system, the question obviously arises as to 
why ever higher differences might not be applied. It is clear that 
this must always imply ever more accumulation to return to the 
original variables, but this is relatively unimportant especially 
if eventual fabrication of such computing elements is by large scale 
integration (LSI).
The reason for stopping at second order differences can be seen in 
two ways:-
(i) Consider a variable, which during a run, is slowly rising in 
magnitude. If the various differences are plotted against it, the 
following result is obtained:- (Figure 6.10)
Figure 6.10 BEHAVIOUR OF HIGH ORDER DIFFERENCES FOR A UNIT STEP CHANGE OF VARIABLE Y
(Step of 1 quantum in Y. First difference: net area = 1, amplitude = 1. Second difference: net area = 0, amplitude = ±1. Third difference: net area = 0, amplitude = +1, -2. Fourth difference: amplitude = ±3.)
It can be seen that a unit step change in amplitude of Y causes the 
necessary differences to vary in an ever wilder and wilder fashion. 
Up to the second order differences, the amplitudes of these dif­
ferences are well behaved, but beyond the second, the shuttle like
action becomes very marked. Thus the registers needed to store such 
variables would start to get greater rather than less which would be 
defeating the object of the exercise.
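The growth in amplitude is easy to reproduce. The following sketch simply differences a sampled unit step repeatedly; it is an illustration of the argument rather than a model of any machine register, and the function name is hypothetical.

```python
def successive_differences(samples, max_order):
    """Return {order: list of differences} for orders 1..max_order."""
    result = {}
    current = list(samples)
    for order in range(1, max_order + 1):
        # Each pass replaces the sequence by its backward differences
        current = [b - a for a, b in zip(current, current[1:])]
        result[order] = current
    return result


unit_step = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
differences = successive_differences(unit_step, 4)
peak_amplitudes = {order: max(abs(v) for v in values)
                   for order, values in differences.items()}
```

For the unit step, the peak magnitudes of the first to fourth differences are 1, 1, 2 and 3, so the word length needed to carry a difference starts to grow again beyond the second order.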
CHAPTER 7
THE REQUIREMENTS OF A DDA/GP HYBRID COMPUTER (SOIC Mk II)
"In the multitude of counsellors there is safety"
Proverbs: 11:14
SOIC Mk I showed the potentialities of a DDA for solving problems
involving differential equations, but also highlighted the need for
additional hardware for loading, controlling and housekeeping
purposes. The problems, in many ways, are akin to those found in
the use of analogue computers. It was decided that a general purpose
digital computer (GPDC) might be an ideal partner for the DDA in view
of its flexibility, and the ease with which it could be interfaced
with the DDA.
The GPDC should be capable of performing a variety of functions, if 
full use is to be extracted from the system.
In general, the GPDC should form the hub of the system activities
and thus carry out programme compilation, work to some operating
system, execute peripheral control and DDA loading, programme
structuring, solution monitoring and interface control. These are
all activities which are best carried out by software control, either
because of the flexibility of software or on cost grounds. (The
apportioning of activities between the DDA and GPDC should be decided
on the projected application area for the system.)
The many issues that these general considerations raised led to
discussions with organisations with an interest in simulation and
who had experience in the analogue and hybrid computation fields.
From such discussions, a picture was obtained of a reasonable DDA/GPDC 
configuration which would have general simulation applicability.
7.1 Specification
"Iacta alea est"
("The die is cast")
Julius Caesar
7.1.1 General Specification
SOIC Mk II should be a general purpose digital simulation facility
which has the properties of a "closed-shop" computing system, capable
of working in a conventional computing environment, using normal
input/output media.
In particular, the system should be able to accept programme material 
in some or all of the existing simulation languages such as CSSL, 
CSMP, etc. It should be capable of total reconfiguration, patching, 
scaling etc, completely under programme control and be capable of 
continuously monitoring the progress of problems as they are execu­
ted.
7.1.2 Detailed Specification
(i) Number of machine elements: 64
This number is chosen on the basis that it would be sufficient 
for many simulations without recourse to the special software 
needed for programme segmentation using the DDA configuration 
facility.
(ii) Length of basic machine word: 16 bits (15 bits plus sign)
This gives a basal machine precision of better than 0.01%.
However, the logic partitioning of the machine is expected to be
bitwise (vertical) so that wider words could be entertained with
comparative ease.
* Specification drawn up in consultation with the Cranfield 
Institute of Technology (Computing Centre). See p. 173.
(iii) DDA performance
This should be equivalent to an analogue computer sine-wave rate
of about 1 kHz at better than 0.1% accuracy. This figure is
based on the projected integration algorithm (first order hold,
first order correction) and the DDA iteration rate that is
anticipated (200 kHz).
(iv) Machine elements
Each element should be capable of executing the following functions
(a) Integration with respect to time.
Fan-in: 6 (all inputs individually scaled).
Maximum integrator gain factor: $2^{-4}$.
(b) Integration with respect to arbitrary variables.
Fan-in: 1 independent and 1 dependent variable
(neither scaled).
(c) Variables summation.
Fan-in: 6 (all inputs individually scaled).
(d) Variables multiplication.
Fan-in: as for (b). Neither input scaled.
Each machine element should have an unlimited (electronically 
speaking) fan-out.
(v) Auxiliary registers
Associated with machine element functions (a), (b), (c), should be
two auxiliary registers to track the element's Y register and be
used for comparison, break-point detection and overflow detection.
They should be fed with the same $\Delta Y$ signals as Y so that, apart from
initial conditions, they will mimic the changes of Y during a problem.
By having a zero-crossing detector (ZCD) associated with each
auxiliary register, and issuing it with suitable initial conditions,
the state of the main Y register can be implied from the ZCD. The
GPDC should then be informed via a suitable interruption facility.
For element function (d), ie multiplication, 4 registers are
required, two for each variable.
(vi) Initial conditions
Should be supplied by the GPDC in the form of a block transfer. It
is hoped that an on-board store can be appended to the DDA in the
future so that loading from the GPDC can be on an exceptions only
basis in the manner specified in CSSL.
(vii) Interruptions
These should be derived from machine elements and fed to a shift
register: their disjunction to be sent to the GPDC in the form of
an interruption signal. The GPDC would then be able to establish
which machine element interrupted by shifting the register under
a counter's control. The count will be sent to the GPDC as the
indication of the source of the interruption.
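The search procedure can be sketched in software terms. The model below is an assumption about how the shift register and counter might cooperate, not a description of the actual logic: the flags are latched into a register, their OR forms the interruption signal, and the register is shifted one place per count. Names are hypothetical.

```python
def locate_interrupts(flags):
    """Return (interruption_signal, list of element numbers that interrupted)."""
    register = list(flags)              # one latched flag per machine element
    interruption_signal = any(register) # disjunction sent to the GPDC
    sources = []
    count = 0
    while interruption_signal and count < len(flags):
        bit = register.pop(0)           # shift one place under counter control
        register.append(0)
        if bit:
            sources.append(count)       # the count identifies the source element
        count += 1
    return interruption_signal, sources
```

Scanning the whole register, rather than stopping at the first set bit, lets the GPDC pick up several simultaneous sources in one search.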
(viii) Interconnections
The 64 machine elements should be arranged in 8 groups of 8. A
total interconnection topology should exist within, and independently
in, each group, together with a limited topology to connect the
groups. This latter should be in the form of four busbars. Each
busbar should be associated with two elements, one of which will
be able to send data to the busbar, both of which are to be able to
receive data. Thus any given element should be able to receive
data from 8 internal integrators (to its group) and 8 from outside
(including itself). Figure 7.1
(ix) Input/output
Machine variables to be monitored by accessing second-order-differences 
through the intergroup busbars. Any input/output interface to be 
able to select one of the four busbars, thus one of the donor elements 
feeding it. Each I/O element to assemble the intended variable by 
two levels of accumulation as per an ordinary machine element.
7.1.3 Performance
The main performance coefficient for a DDA is the iteration time
and this should not be greater than 5 µs, giving an iteration rate
of 200 kHz. The figure is intended as a guideline in view of the
speculative nature of the project.
From this figure, some figures of merit can be derived:
(i) The maximum gain of the integrators based on a best gain of
$2^{-4}$ per iteration is approximately 12,000, equivalent to a time
constant of less than 100 µs. Such a gain would yield a sine wave
rate of approximately 2 kHz. At such gain settings, the full 16 bit
accuracy would not be maintained even for a whole cycle of the sine
wave. However, simulations show a precision of better than 0.1%
should be achievable. At lower integrator gain (potentiometer)
settings, the truncation errors should be less so that, at sine wave
speeds of 100-200 Hz, the amplitude deviation should be of the order
of 0.01% per circle.
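These figures of merit follow from simple arithmetic, sketched below. The calculation assumes that the sine wave is generated by a two-integrator loop whose angular frequency equals the integrator gain per second, which is the usual analogue-computer convention; the function name is hypothetical.

```python
import math


def integrator_figures(iteration_rate_hz=200_000.0, gain_per_iteration=2.0 ** -4):
    """Return (gain per second, time constant in seconds, sine wave rate in Hz)."""
    gain_per_second = iteration_rate_hz * gain_per_iteration
    time_constant_s = 1.0 / gain_per_second
    # Assumed: loop angular frequency equals the integrator gain
    sine_rate_hz = gain_per_second / (2.0 * math.pi)
    return gain_per_second, time_constant_s, sine_rate_hz
```

With the defaults this gives a gain of 12,500 per second, a time constant of 80 µs and a sine wave rate just under 2 kHz, consistent with the rounded values quoted above.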
Figure 7.1 SOIC II INTER-GROUP CONNECTIONS SYSTEM (Groups 1 to 8 with the inter-group busbars. All elements in a column can receive data from the appropriate busbar; all four elements in the row thus marked can output to that busbar.)
7.1.4 Technology
Standard TTL has been chosen on the grounds of reliability, avail­
ability, cheapness and relative immunity to noise.
7.2 Maintenance philosophy
The number of microcircuits expected to be used in a machine such
as this is very high, and the question of reliability of circuits and
down time during failures is of importance. Furthermore, general
inexperience amongst custom (maintenance) engineers of DDAs makes
the necessity for a sound maintenance philosophy very great.
Thus the following safeguards have been built into the system 
concept
(i) All major data carrying printed circuit boards ie those 
carrying machine element logic should be identical. This would 
minimise the spares holdings and allow easy multiple replacement 
if any doubt exists as to the status of replacement boards.
(ii) Each output slice from a board is to be equivalenced with
the equivalent slice of another group of 8 boards. This is easy
to implement as it is intended that each rack-width crate of logic in
the DDA shall carry two groups. (Four crates for the machine.) By
doing inter-group comparisons, under suitable software control, any
tendency for errors generated in one part of a group to interact
with another part of the same group is eliminated. The inter-group
checking would, of course, be carried out by the simple expedient
of feeding two groups in a given crate with identical programmes,
initial conditions, potentiometer coefficients, etc.
The equivalencing would, effectively, be done on the R register
outputs and, as the bulk of logic on the machine element boards is
combinational, this should quickly pinpoint errors.
(iii) Each board should have its own monolithic regulators. The
effect of this would be twofold:-
(a) any individual regulator failure will only threaten a very small
amount of logic on each board and
(b) only crude, relatively poorly smoothed, DC would need to be 
circulated around the machine. Thus heavy busbars to reduce voltage 
drops would be unnecessary. The only limitation would now be the 
current density in the conductors.
(iv) On-board-regulators allow more even heat dissipation in the
machine, thus making special cooling of the power supply unnecessary.
(It is of interest to note that, at the time of purchasing the power
supply equipment, it proved no more expensive to decentralise voltage
regulation - it was about £0.67 per watt either way. The efficiency
of the system was 70% through the unregulated power supply and 60%
through the regulators - overall 42%.)
7.3 Aspects of Software
7.3.1 Software With or Without Presence of a DDA
The software for this machine is in a very early stage of development
but some general guidelines have already been laid down. One of the
most important is that, from the user's point of view, it should be
of no interest (or concern) whether or not a DDA is
going to implement his simulation. It would, hopefully, reduce both
his budgetary statement and the turnaround time, but that is all.
Thus, the user may wish to implement a simulation using, say, the
Continuous Systems Simulation Language (CSSL), either because it is
appropriate for his job, or because his software already exists in
that language. In view of the fact that several well established
and widely recognised languages such as CSSL exist, there is no
intention to devise a new one just because the system is a DDA/GPDC
hybrid. In fact, there is every incentive, within reason, to configure
the DDA hardware so as to make it as amenable as possible to one of
the existing languages. This will certainly aid compilation,
debugging and the writing of compilers for other languages. For the
purposes of the project, the main problem is which one of the several 
perfectly acceptable languages to adopt? It seems desirable to 
adopt the one that is going to gain the most general acceptance in 
the near future if this can be identified.
7.3.2 Scaling
This is a notoriously difficult problem, either when GPDC simulations
are done on short word fixed point machines, or when analogue
simulations are done. The DDA will have to be treated in a like manner
to the analogue computer but, fortunately, because of the number of
bits in the word, will benefit from a wider useful dynamic range. Thus,
software used at present in hybrids for this purpose can be used,
with the necessary modifications, on the DDA/GPDC hybrid. The
high speed at which the DDA can be reconfigured and, in particular,
have potentiometer coefficients changed, does open the possibility
of rescaling during a problem. This applies to occasions
when machine elements overflow and when they underflow.
7.3.3 Allocation of Machine Elements
This is a software problem only because of the partial interconnection 
topology. It has already been suggested in paragraph 3.3.1 that a 
system based on the Steinberg connectivity algorithm could be used.
Briefly, this would consist of identifying the most heavily connected
machine elements, ie those which connect to the most other elements.
Each one of these is effectively the "kingpin" in a group. As there 
are 8 groups, the 8 most connected would be sought. An alternative 
strategy, if this fails to give a workable allocation, is to try 
what is, in effect, the opposite, ie separate out clumps of sparsely 
connected elements and build these into near autonomous groups of 8.
Which of these strategies works on any occasion depends on the problem 
topology, ie whether the total problem exhibits fairly uniform 
connectivity between elements or whether fairly well defined groupings 
are evident.
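The first strategy can be sketched as follows. This is a minimal illustration only, not the Steinberg algorithm itself: the function names, the tie-breaking rule and the greedy way of filling each group are assumptions made for the example.

```python
from collections import defaultdict

def build_neighbours(connections):
    """Map each element to the set of other elements it connects to."""
    neighbours = defaultdict(set)
    for a, b in connections:
        if a != b:  # a self-connection does not tie an element to another one
            neighbours[a].add(b)
            neighbours[b].add(a)
    return neighbours

def pick_kingpins(connections, groups=8):
    """Return the `groups` most heavily connected elements (ties broken by name)."""
    neighbours = build_neighbours(connections)
    ranked = sorted(neighbours, key=lambda e: (-len(neighbours[e]), e))
    return ranked[:groups]

def seed_groups(connections, groups=8, size=8):
    """Hypothetical greedy fill: each kingpin pulls in its unallocated neighbours."""
    neighbours = build_neighbours(connections)
    kingpins = pick_kingpins(connections, groups)
    allocated = set(kingpins)
    result = {}
    for k in kingpins:
        members = [k]
        for n in sorted(neighbours[k]):
            if len(members) == size:
                break
            if n not in allocated:
                members.append(n)
                allocated.add(n)
        result[k] = members
    return result
```

Elements left unallocated by such a pass would still have to be placed, and a failure to place them is the signal to try the alternative (sparse clump) strategy.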
7.3.4 Programmed Interconnections
Once a successful allocation of machine elements has been effected, 
it is but a simple step to deduce the interconnection linkages that 
have to be formed. The information, however, has to be presented to 
the DDA in the form of a two-dimensional bit-matrix for reasons of 
economising on the number of words sent across the interface. Thus 
a small amount of bit manipulation is necessary to assemble the matrix.
That the information sent across the interface is interconnection 
information and not initial conditions is determined by a label which 
should precede all block transfers. The same applies when subsidiary 
information such as the number of iterations to be executed is entered 
to the DDA. (Appendix 5 summarises the I/O data formats.)
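The bit manipulation involved can be illustrated by the sketch below. The word length, the row-major bit ordering and the label value are assumptions made for the example; the actual formats are those of Appendix 5 and are not reproduced here.

```python
def pack_bit_matrix(links, rows, cols, word_bits=16):
    """Pack (row, col) links of a rows x cols bit-matrix into interface words.

    Bit (row * cols + col) is set for every link; the bits are then cut into
    words of `word_bits` bits, least significant word first (assumed ordering).
    """
    bits = 0
    for r, c in links:
        if not (0 <= r < rows and 0 <= c < cols):
            raise ValueError(f"link {(r, c)} lies outside the matrix")
        bits |= 1 << (r * cols + c)
    n_words = -(-(rows * cols) // word_bits)  # ceiling division
    mask = (1 << word_bits) - 1
    return [(bits >> (i * word_bits)) & mask for i in range(n_words)]

def build_block(label, words):
    """Prefix a block transfer with its identifying label word (hypothetical layout)."""
    return [label] + list(words)
```

For instance, `build_block(0x1, pack_bit_matrix([(0, 0), (1, 1)], 2, 8))` yields a label word followed by a single 16 bit word holding both links.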
7.3.5 Interruptions Handling
The specification states that any one or more machine elements sending 
an interruption shall halt the DDA, inform the GPDC, which will then 
execute a search of the sources of interruption. Thus, the masking 
of interruptions shall be the responsibility of GPDC. It should not 
be handled in hardware by the DDA. A form of masking should be 
inherent in the system in so far that unwanted comparators in the 
machine elements can be set to their extreme values so that they 
cannot signal unless element overflow takes place. Thus unwanted 
interruptions should be very infrequent and merely due to ill 
programming of the DDA in the first instance.
7.3.6 Repetitive Mode
In the present machine, no provision will be made for the storage of 
initial conditions presented by the GPDC. Thus all programme restarts, 
in particular, those for optimisation, will require a reload of the 
DDA registers. The necessary storage will have to be provided in 
the GPDC for this purpose (approximately 1 K.Wd). However, it is 
anticipated that such storage may well be justified for the DDA at 
a later date, in which case a reduction in GPDC storage will be 
effected along with a simplification of the software.
7.3.7 Handling of Functions Not Suited to DDA Execution
The DDA should be capable of computing a wide variety of functions, 
including simple "continuous" ones such as polynomials, roots, 
reciprocals, trigonometric functions, etc. Many others could be 
formed by straight line segments by using the DDA as an extrapolator 
and the GPDC as a store of slopes, breakpoints, etc. The more complex
the function, the more breakpoints needed (the number would not be 
limited by the DDA) and hence the more frequent reference to the 
GPDC and hence computer run time that would be needed. The only 
functions not really suited to DDA execution are those which 
involve discontinuities. Luckily, these are comparatively rare. When 
they occur, a more radical reconfiguration of the DDA at the break­
points would, of course, be necessary. Other difficult functions 
would be those which are multi-dimensional, as the breakpoint setting 
would become very complex.
In really bad cases, it is possible that the function might have to 
be generated by the GPDC in its entirety. In which case, this would 
be an example of the use of the DDA and the GPDC in a truly computa­
tionally closed loop. However, the unlimited hold time for both 
machines would prove a considerable advantage over the analogue/GPDC 
hybrid.
C H A P T E R  8
ASPECTS OF THE SOIC Mk II DESIGN PHILOSOPHY
8.1 Storage
Consideration of the requirements for the SOIC II hardware complex 
showed that, by the use of 6 or 8 bit "word $\Delta Z$'s" in a parallel 
arithmetic structure, the design specification could be met. However, 
both the speed and accuracy of the solutions could be improved by 
the use of second order differences.
A feature of second order difference methods is the necessity to 
provide initial values to the machine elements, not only of the 
functions, per se, but also of their first rate of change ($\Delta$), and 
of products of two variables (ie $\Delta Y.\Delta X$ for multiplication). In a 
hardware realisation, this implies the use of some additional storage 
in each element.
The specification also called for 4 "monitoring" registers $Y_A$, $Y_B$, 
$X_A$, $X_B$ in each machine element. These also require storage. 
Multi-bit, word organised, bipolar memories had recently become available 
(1970-71) at relatively low cost, and it was decided to make use of 
these. Further consideration of the possibilities showed that an 
economical arrangement of a multi-purpose element could be made using 
24 (16 by 4 bit) memories per group of 8 elements. The use of such 
memories in preference to the currently available dual, quad, hex and 
octuple "D type" registers yielded savings in power, printed circuit 
board area, backwiring and component cost. Table 8.1.
"D Registers" 64 Bit 
Bipolar Memories
Bits per IC 2 ,4,6 , 8 64
Power per Bit 40 mW 6 mW
Cost per Bit £0.130 £0.070
Table 8.1
COMPARISON OF TTL IMMEDIATE ACCESS STORAGE METHODS
Each machine element requires only 12 positions to store the data 
for any function as shown in Table 8.2.

Time               Variables          Arbitrary Variables        Variables
Integration        Summation          Integration                Multiplication
(variable, bits)   (variable, bits)   (variable, bits)           (variable, bits)

$K_1$  20          $K_1$  20          $\Delta X$  12             $\Delta X$  12
$K_2$  20          $K_2$  20          $X$  16                    $X$  16
$K_3$  20          $K_3$  20          $\Delta Y$  12             $\Delta Y$  12
$K_4$  20          $K_4$  20          $Y$  16                    $Y$  16
$K_5$  20          $K_5$  20          $\Delta Y.\Delta X$  16    $\Delta Y.\Delta X$  16
$K_6$  20          $K_6$  20          $Y.\Delta X$  16           $Y.\Delta X$  16
$\Delta Y$  12     $\Delta Y$  12     $Y_A$  16                  $\Delta Y.X$  16
$Y$  16            $Y$  16            $Y_B$  16                  $Y_A$  16
$Y_A$  16          $Y_A$  16          $R$  16                    $Y_B$  16
$Y_B$  16          $Y_B$  16                                     $X_A$  16
$R$  16            $R$  16                                       $X_B$  16
$F[\Delta^2 Y]$  8 $F[\Delta^2 Y]$  8

Table 8.2
VARIABLES USED IN SOIC II
Note: (1) $K$ is a "potentiometer coefficient".
(2) $Y_A$, $Y_B$, $X_A$, $X_B$ are auxiliary (monitor) variables.
(3) $F[\Delta^2 Y]$ is the fractional, low significance, residue
resulting from the weighted summation of second order
dependent variables, $K.\Delta^2 Y$.
For economy in the use of the memory units, therefore, the 16
locations of a memory are split into 8 two variable segments, and a
group of 8 machine elements use 24 (16 by 4 bit) memories. The variables 
are allocated as shown in Figure 8.1.
To facilitate the design of printed circuit boards, the 8 machine 
elements are arranged on 4 boards (B1, B2, B3, B4), each board 
containing 4 bits of each variable as indicated in Figure 8.1. This 
arrangement allows the 64 machine elements to be constructed on 32
identical boards, each of reasonable size. The excess 4 bits of the
20 bit "potentiometer coefficients" $K_1$, etc are stored separately on 
the control board associated with each group of 8 machine elements.
The main purpose of this board, however, is the strobing/buffering 
of control signals from the main machine control unit before 
application to a given group of machine elements.
The variables (Figure 8.1) are allocated to memories in such a manner
as to avoid problems of access. For example, in multiplication,
$\Delta X$ and $\Delta Y$ are both used early in the computation of $\Delta^2 Z$. These 
are therefore located in separate memories so that they may be accessed 
concurrently at the start of an iteration.
8.2 Integration Algorithms
From simulations (Appendix 2), it was decided that a first order 
(slope) prediction/first order (trapezoidal) correction integration
algorithm was necessary to obtain the specified speed-accuracy product. 
This applies to both time and arbitrary variables integration.

Figure 8.1 MEMORY LOCATIONS OF VARIABLES IN SOIC II
(RAM address, for 6 off 16 location memories, against machine element number)

Figure 8.1(a) SOIC Mk II RANDOM-ACCESS MEMORIES (RAM)
(SN7489 64-bit fully decoded memory, organized as 16 words of 4 bits; pin 16 = VCC, pin 8 = GND)
8.3 Interconnection topology for Machine Elements
A partial interconnection topology was chosen because a total topology 
would incur undue cost and would offer more programming freedom than 
is needed. Consultation with programmers revealed that splitting the 
64 element machine into an 8 x 8 matrix and providing a total row 
(intra-group) topology and total column (inter-group) topology would 
be adequate. No diagonal connections were provided.
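The consequence of this choice for the allocation software can be sketched as follows. The numbering of elements (group = element number divided by 8, position within the group = remainder) is an assumption made for the example.

```python
def can_connect(src, dst, side=8):
    """True if element `src` can feed element `dst` directly.

    Elements are numbered 0..side*side-1 (hypothetical numbering): the row is
    the group, the column is the position within the group. A direct link
    exists within a row (intra-group) or within a column (inter-group), never
    along a diagonal. An element may also feed itself.
    """
    src_row, src_col = divmod(src, side)
    dst_row, dst_col = divmod(dst, side)
    return src_row == dst_row or src_col == dst_col

def reachable_from(src, side=8):
    """All elements that `src` can feed directly, itself included."""
    return [dst for dst in range(side * side) if can_connect(src, dst, side)]
```

Under these assumptions each element reaches 15 of the 64 elements directly (7 in its row, 7 in its column, and itself); any other connection needs an intermediate element.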
8.4 Processing Method
This is shown skeletally for one group in Figure 8.2 and is seen to be 
based on using combinational logic to derive each new variable ($\Delta Y$, $Y$,
$R$ etc). Thus only one step is used to process each machine element 
and the variables are processed concurrently. About ½ µS is allowed 
for each element giving a total time of 4 µS for 8. This leaves 1 µS 
for interconnections processing at the end of the iteration.
As each RAM contains two words pertaining to each machine element, it 
has to be accessed twice. Thus a temporary store is interposed between 
the RAM’s and the combinational logic for the first six words which 
are accessed. The combinational logic then processes the six words in 
the temporary store and the six new words which appear from the RAM 
on the second access. (The RAM’s are static devices ie provided 
they are enabled, and the addresses are held steady, the outputs 
remain true to the contents of that location accessed.)
Figure 8.2 SOIC II REGISTER STRUCTURE (ONE GROUP)
(Blocks shown: initial conditions input; 12 x 16 memories; temporary store; 
combinational logic; 6 x 2 input multiplexors; 24 bit shifting temporary 
store; return highway to memories (R.A.M.))
When the combinational logic has settled, multiplexing of the 6 pairs 
of new variables is carried out and the results are written back to 
the RAM's in two passes. The delay through the combinational logic 
is sufficiently long that corruption of the multiplexer input does
not take place until long after the results have been stored.
Figure 8.3 shows a timing diagram of the system for the processing 
of one machine element. In order to ensure a safe time for the 
acquisition of results, a temporary store is interposed between the 
multiplexer and the RAM write data terminals. This allows the first 
6 words to be acquired and the second 6 to be "lined-up" at the input 
to this second store without, in any way, corrupting results. By 
making this temporary store a parallel entry shift register, it may 
be used as a data loading register when worked in the shift mode.
Data, once assembled in this register, can then be readily loaded, 
bit parallel, into the appropriate RAM locations. As six RAM's are 
associated with each board, a 24 bit (6 x 4) shift register is required. 
Twenty-four shift pulses are needed to set up one block of input data 
to the RAM's. This takes about 2 µS; thus the 16 locations for a 
RAM can be loaded with initial conditions in about 35 µS, including 
the time to write to the RAM's. As already stated, all boards can be
loaded concurrently, provided data from the GPDC is available. Thus a
complete problem can be loaded in about 35 µS plus the time to 
transfer the data from the GPDC. (This amounts to 768 words ie 64 
elements at 12 words each.)
Figure 8.3 TIMING WAVEFORMS FOR PROCESSING OF ONE MACHINE ELEMENT (SOIC II)
(Waveforms shown, in sequence: read first 6 words from memory; temporarily 
store first 6 words; read second 6 words from memory; combinational logic 
delay; temporarily store first 6 result words, then second 6 words; element 
address; complete processing of 1 machine element)
C H A P T E R  9
HARDWARE REALISATION OF SOIC Mk II MACHINE ELEMENTS
9.1 General
The eight machine elements associated in a group are connected to 
control logic which can produce the connections required to enable 
any (or all) elements to act as
(i) an integrator with respect to time having 
6 scaled inputs generating $Z = \sum (K_1 \int y_1\,dt + K_2 \int y_2\,dt + \dots)$
or (ii) a variables summer having 6 scaled inputs generating
$Z = \sum (K_1 y_1 + K_2 y_2 + \dots)$
or † (iii) an integrator with respect to an arbitrary
variable (X) generating $Z = \int y\,dx$
or † (iv) a two variables multiplier (X and Y) generating 
$Z = X.Y$
The signals transmitted from the output of any element for acceptance 
on an input by another element (or by itself) are all in the 
form of second order differences. Thus, from a time integrator 
the output is $\Delta^2 Z$ ie $(\Delta Z_{(i)} - \Delta Z_{(i-1)})$; from a multiplier the 
output is $\Delta(Y\Delta X + X\Delta Y + \Delta Y\Delta X)$.
Each element must therefore accept inputs in the second order
difference form ($\Delta^2 Y$, $\Delta^2 X$). It generates first order differences
($\Delta Y$, $\Delta X$) and the function value ($Y$) from the received $\Delta^2 Y$ signals
and initial conditions set in the $\Delta Y$ and $Y$ registers. The hardware 
used must therefore provide for the generation operations and for 
the algorithm required to generate the required function.
† See addendum No 2.
In general, the generated function (Z) is not automatically produced
in the machine element. For example, for time integration, the
element receives $\Delta^2 Y$ inputs and produces its output $\Delta^2 Z$ (where 
$Z = \sum (K_1 \int y_1\,dt + K_2 \int y_2\,dt + \dots)$). The integral (Z) may be 
generated, as the contents of a register, by applying the $\Delta^2 Z$ output 
as the $\Delta^2 Y$ input to another element, which will then produce the 
value of Z in its Y register. In other words, if Z is specifically 
required (ie for monitoring) it can be easily obtained. However, it is 
not necessary to generate Z in an element (time integrator) in order to
determine $\Delta^2 Z$.
The various register lengths are arranged so that, whilst Y (or X)
may have a maximum length of 16 bits, the transmitted second order
increments are restricted to the values of 0, ±1, ±2. (See Appendix
1.) The information ($\Delta^2$) is thus transmitted between elements
by a quinary transfer method. This range of $\Delta^2$ values makes scaling
of inputs to elements (ie the generation of $K.\Delta^2 Y$) simple.
The use of second order differences to represent the outputs of
integrators has one further facet. As Z does not have to be formed
in order to generate $\Delta^2 Z$, that part of Z of significance greater
than that of $\Delta^2 Z$ need not exist. Furthermore, as $\Delta^2 Z$ represents
only changes of $\Delta Z$, the whole of $\Delta Z$ does not have to be generated,
only the signals (carries out of adders) that are instrumental in
changing $\Delta Z$ at each iteration. This leads to the redundancy of
the following (parts of) registers that might have been assumed
necessary for a time integrator:-
(a) Most significant portion of Y (about half).
(b) All but 2 bits of $\Delta Z$.
(c) The whole Z register.
Any adders that might have been used to generate (a), (b), (c) 
are also made redundant. Thus, the second order method, which might, 
at first sight, have appeared very expensive to implement, is quite 
economical.
Because any machine element can be programmed to carry out any one 
of 4 functions at any time, some apparent redundancy exists in the 
hardware for two (summation and arbitrary variables integration) of 
them. This is because the algorithms have been arranged such that 
summation and arbitrary variables integration are subsets of time 
integration and multiplication, respectively.
As an example, summation does not require the $\Delta Y$, $Y$ and $R$ registers in 
the formation of its output. However, they are needed for time 
integration, and switching them out of circuit adds unnecessary 
expense to the system. The existence of $\Delta Y$, $Y$ and $R$ is, in no way, 
deleterious to the accuracy of the summer.
A further example is in the generation of terms involving X in 
arbitrary variables integration (AVI) ie ($X.\Delta^2 Y$), ($X.\Delta Y$). These
are used only in multiplication, and there only in the determination
of $\Delta^2 Z$. Switching them out of circuit for AVI is
unnecessarily wasteful of logic.
9.2 Time Integration (TI)
The requirement of an element performing this function is to produce 
an output $\Delta^2 Z$ where
$Z = \int (K_1 y_1\,dt + K_2 y_2\,dt + \dots + K_6 y_6\,dt)$.
Simulations of various integration algorithms revealed that values
of $K_1$, etc as large as $(N \times 2^{-4})$* could be employed for low
accuracy computation. (Such values of K would correspond to steps 
of 3½° in a sine/cosine loop.) The required gain range for the 
integrators (and summers) was $10^6$:1 so that 20 bit scaling (K) 
registers were chosen. This yields integrator gains in the range 
$(N \times 2^{-4})$ to $(N \times 2^{-23})$.
The incoming second order differences $\Delta^2 Y_n$ are scaled by $K_n$
by shifting and complementing as in Table 9.1. See also Figure 9.1.

$\Delta^2 Y$     OPERATION ON K
-2       Shift left 1 and complement
-1       Complement
 0       Inhibit
+1       Unmodified
+2       Shift left 1

TABLE 9.1
As $\Delta^2 Y$ is a 3 bit quantity having bit significances $-2^2$, $2^1$, $2^0$ 
(= a, b, c) the logic connectives are:-
shift left if c = 0
complement if a = 1                          ....9.1
inhibit if $\bar{a}.\bar{b}.\bar{c} = 1$ (ie a = b = c = 0)
*where N is the machine iteration rate/sec.
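The connectives of equation 9.1 can be checked against Table 9.1 with the short sketch below. It is an illustration only: the 24 bit working width and the function names are assumptions, and the addition of 1 after complementing (needed to form a true negative) is modelled here as a carry-in to the following adder.

```python
def decode(d2y):
    """Return (shift, complement, inhibit) for a second order increment.

    The increment is coded as a 3 bit two's complement number with bits
    a, b, c of significance -4, 2 and 1.
    """
    if d2y not in (-2, -1, 0, 1, 2):
        raise ValueError("second order increments are restricted to 0, +/-1, +/-2")
    code = d2y & 0b111
    a, b, c = (code >> 2) & 1, (code >> 1) & 1, code & 1
    inhibit = (a, b, c) == (0, 0, 0)
    shift = c == 0
    complement = a == 1
    return shift, complement, inhibit

def scale(k, d2y, width=24):
    """Form K * d2y using only shift, complement and inhibit (assumed width)."""
    mask = (1 << width) - 1
    shift, complement, inhibit = decode(d2y)
    if inhibit:
        return 0
    word = ((k << 1) if shift else k) & mask
    carry_in = 0
    if complement:
        word = ~word & mask   # one's complement
        carry_in = 1          # the +1 that completes the negation
    return (word + carry_in) & mask

def to_signed(value, width=24):
    """Interpret a `width` bit word as a two's complement number."""
    return value - (1 << width) if value >> (width - 1) else value
```

For every legal increment, `to_signed(scale(k, d2y))` equals `k * d2y`, so no multiplier is needed for the scaling.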
Figure 9.1 SOIC II INTEGRATION WITH RESPECT TO TIME
(Adder chain forming $\Delta^2 Y^*/F[\Delta^2 Y]$, $\Delta Y$ and $\Delta^2 Z/R$ from the 3 bit second 
order increments. Note: $Y/Y_{aux}$ not shown for clarity.)
The summation of the $(K.\Delta^2 Y)$'s produces a sum of which the least 
significant 8 bits, $F[\Delta^2 Y^*]$, are held in an accumulator (low significance
residue) and the most significant portion ($\Delta^2 Y^*$) is used to increment
$\Delta Y$. The $\Delta Y$'s thus formed are accumulated in a "Y register". The 
actual hardware register is only 16 bits in length so that only the
least significant 16 bits of Y remain in it after each iteration.
Any overflow produces 1 bit which is used in the formation of $\Delta^2 Z$.
Such a system makes conventional overflow detection for a Y register 
impossible as there is no built-in check on the number or polarity 
of such "overflows". Overflow in the normal sense is applied to the 
$\Delta Y$ register, which is restricted to a 16 bit signed number. At any
iteration (i) its new value ($\Delta Y_{(i)}$) is the sum of its old value
($\Delta Y_{(i-1)}$) and the incoming increment ($\Delta^2 Y^*_{(i)}$). The overflow logic
is thus:-
$\text{Overflow} = [S(\Delta Y_{(i-1)}) \ne S(\Delta Y_{(i)})] \wedge [S(\Delta Y_{(i-1)}) = S(\Delta^2 Y^*_{(i)})]$   ....9.2
where S( ) is the sign of ( ).
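Equation 9.2 is the usual sign-based test for two's complement overflow. A minimal sketch, assuming a 16 bit register that wraps around on overflow (the function names are illustrative):

```python
def is_negative(value):
    """Sign bit of a two's complement number (zero counts as positive)."""
    return value < 0

def update_dy(dy_prev, d2y_star, bits=16):
    """Add the scaled increment to the dY register and apply equation 9.2.

    Returns the wrapped register contents and the overflow flag.
    """
    half = 1 << (bits - 1)
    raw = dy_prev + d2y_star
    wrapped = ((raw + half) & ((1 << bits) - 1)) - half  # wrap into bits-wide range
    overflow = (is_negative(dy_prev) != is_negative(wrapped)) and \
               (is_negative(dy_prev) == is_negative(d2y_star))
    return wrapped, overflow
```

The flag is raised only when the register and the increment have the same sign and the result changes sign, which cannot happen for a legitimate sum.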
The least significant portion of Y ($F[Y]$) is added to the contents
of a 16 bit unsigned R register at each iteration. Any overflow
from the top of R is also used in the formation of $\Delta^2 Z$.
$\Delta^2 Z$ is the sum of the simultaneous overflows from the R register
and the "Y register" ($F[Y]$) minus the overflow from the R register
on the previous iteration. This comes about from the fact that the
difference of overflows from the R register on contiguous iterations
represents the differences between implied $\Delta Z$'s (ie $\Delta^2 Z$) due to 
the accumulation of fraction portions of variables ($F[Y]$, $R$) and 
$F[Y]$ overflow represents a permanent change of the implied most 
significant portion of Y ($I[Y]$) that is, hereon, being used in the
generation of the implied $\Delta Z$. (The term "implied" is used for 
$I[Y]$ and $\Delta Z$ as neither appears in hardware in the integrator, as 
already explained.)
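The way in which $\Delta^2 Z$ can be formed from overflows alone may be illustrated by the simplified model below. It is a sketch under stated assumptions: only the Eulerian accumulation of $F[Y]$ into R is modelled (the trapezoidal terms are omitted), both registers are taken as 16 bits, and the names are illustrative.

```python
MASK16 = 0xFFFF

def second_order_outputs(dy_seq, y0=0):
    """Form d2Z at each iteration from register overflows only.

    Neither the integer part I[Y] nor dZ is stored; they are only implied.
    Assumes 0 <= y0 < 2**16 and |dy| < 2**16 so each overflow is -1, 0 or +1.
    """
    f_y = y0 & MASK16      # fractional part of Y, the only part held
    r = 0                  # integral residue
    prev_r_carry = 0
    outputs = []
    for dy in dy_seq:
        total = f_y + dy
        f_overflow = total >> 16          # change in the implied I[Y]
        f_y = total & MASK16
        r_total = r + f_y
        r_carry = r_total >> 16           # 0 or 1
        r = r_total & MASK16
        outputs.append(f_overflow + r_carry - prev_r_carry)
        prev_r_carry = r_carry
    return outputs

def accumulate_twice(d2z_seq):
    """Recover Z by summing d2Z twice, as a following element would."""
    dz = z = 0
    for d2z in d2z_seq:
        dz += d2z
        z += dz
    return z
```

Summing the outputs twice reproduces the integer part of the accumulated integral, and no output exceeds 2 in magnitude.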
When initially loaded, $F[Y]$ contains the initial fractional value
of Y and the $\Delta Y$ register the initial value of $\Delta Y$ ($\Delta Y_0$). The latter
may be used in a predictive algorithm.
At the first iteration, $F[Y_0] + \tfrac{1}{2}\Delta Y_0$ is added to the R register
(ie a first order, trapezoidal, prediction) and $F[Y_0]$ is updated
to $F[Y_0] + \Delta Y_0$. At the next iteration, $\Delta Y_0$ is updated to $\Delta Y_1$
(indicating that $\Delta Y$ has changed by $\Delta^2 Y^*_1 = (\Delta Y_1 - \Delta Y_0)$). The additional
integrator correction required to effect a proper trapezoid is 
therefore $\tfrac{1}{2}\Delta^2 Y^*_1$. In effect, on the first iteration, it was "guessed"
that $F[Y_1]$ would be $F[Y_0] + \Delta Y_0$. It was subsequently determined, as
a consequence of the arrival of $\Delta^2 Y^*_1$, that the correct value of $F[Y_1]$
was $(F[Y_0] + \Delta Y_1)$ not $(F[Y_0] + \Delta Y_0)$. The missing integral was thus
a triangle of height $\Delta^2 Y^*_1$. Hence a post-iterative correction of 
$\tfrac{1}{2}\Delta^2 Y^*_1$.
In order to avoid the use of half quantities ($\tfrac{1}{2}\Delta^2 Y^*$) with a 
subsequent loss of accuracy (lost ½ quanta), the $F[Y]$ register is arranged 
to work with double size quantities ie it is initially loaded 
with $F[Y_0]$, left shifted one place. This then requires $\Delta Y$ to be left
shifted one place at each iteration before it is used to update $F[Y]$.
The corrective element $\tfrac{1}{2}\Delta^2 Y^*$ can thus be added, unshifted, into 
the R register. This approach doubles the gain of the integrator, 
but suitable choice of K registers' values restores the original
gain. Had right shifting of corrective quantities ($\Delta^2 Y^*$) taken 
place, a non-symmetric error would have been induced into the 
integrator's output. This error would have accumulated in time and
would thus have been a serious source of error.
(Simulation of this effect with a 50% amplitude sine/cosine loop 
showed an error of equal amplitude to the sine wave in less than 
a million iterations ie 4000 radians.)
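The prediction/correction idea, and the reason for working in double size quantities, can be illustrated as follows. This is a sketch of the principle only and does not reproduce the exact register timing of the machine; exact integers stand in for the registers and the names are illustrative.

```python
def predict_correct_doubled(y0, dy0, d2y_seq):
    """Accumulate twice the trapezoidal integral of Y by prediction/correction."""
    y, dy = y0, dy0
    acc = 0
    for d2y in d2y_seq:
        acc += 2 * y + dy   # prediction made before d2y is known: 2*(Y + dY/2)
        acc += d2y          # post-correction: 2*(d2Y/2), added unshifted
        dy += d2y
        y += dy
    return acc

def trapezoid_doubled(y0, dy0, d2y_seq):
    """Reference: twice the trapezoidal area, computed from the true samples."""
    y, dy = y0, dy0
    total = 0
    for d2y in d2y_seq:
        dy += d2y
        y_next = y + dy
        total += y + y_next
        y = y_next
    return total

def right_shift_bias(d2y_seq):
    """Error accumulated if each correction were halved by a right shift instead."""
    return sum(d2y / 2 - (d2y >> 1) for d2y in d2y_seq)
```

The two accumulations agree exactly. By contrast, `right_shift_bias` is never negative: an arithmetic right shift rounds +1 and -1 in the same direction, which is the non-symmetric error referred to above.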
The registers used are 16 bits in length ($\Delta Y$, $F[Y]$, $R$) but only 
the fractional part of Y is actually held in a register. The 
"effective" length of the Y register is, in fact, 24 bits (but 
because of the manner of generation of Y from 16 bit $\Delta Y$'s, the 
round-off errors are, of course, somewhat larger than in a 24 bit 
machine).
The implied "word $\Delta Z$" is 8 bits in length although only a three 
wire highway is used for interconnections to other machine elements. 
The element thus provides a very good speed accuracy product 
without great cost being involved in the provision of programmable 
interconnections (auto-patch).
Summarising the method in register transfer form, starting at the
beginning of an iteration (i) ie assuming incoming $\Delta^2 Y$'s have 
just arrived:-
(a) Scaled second order differences ($\Delta^2 Y^*$) are formed (equation 9.3),
where $A$ is the scaling register, and $-2^{19} < A < 2^{19}$,
$F[\Delta^2 Y^*]$ is the unused fractional residue of $\Delta^2 Y^*$,
$\Delta^2 Y_{(n,i)}$ is one (n) of 6 incoming $\Delta^2 Y$'s at time (i).
(b) $\Delta Y$ is updated
$\Delta Y_{(i)} = \Delta Y_{(i-1)} + \Delta^2 Y^*_{(i)}$   ....9.4
(c) $F[Y]$ is updated
$F[Y_{(i)}] = F[Y_{(i-1)}] + \Delta Y_{(i)}$
(d) $\Delta^2 Z$ is formed from a trapezoidal prediction/ 
correction where:
$\Delta Z_{(i)} + R_{(i)} = R_{(i-1)} + F[Y_{(i)}] + \tfrac{1}{2}\Delta Y_{(i)} + \tfrac{1}{2}\Delta^2 Y^*_{(i)}$
and $\Delta^2 Z_{(i)} = \Delta Z_{(i)} - \Delta Z_{(i-1)}$   ....9.5
$= f\{O(F[Y]_{(i)}), O(R_{(i)}), O(R_{(i-1)})\}$
where O( ) is the overflow from ( ).
Note: limits:-
$-2^{15} < \Delta Y < 2^{15}$
$-2 \le \Delta^2 Y \le 2$
The register transfers are carried out concurrently in the 
machine element in order to save time ie no adder time-sharing 
takes place. Thus, purely combinational methods are used in every 
update etc.
Other arrangements require the same total number of adders but 
involve greater total delays (eg a "tree” of 4 + 2 + 1 adders 
involves 3 delays whereas adding one variable at each level ie 
a 1 + 1 + 1 .... + 1 tree involves 8 delays).
The logic for forming $(A_n.\Delta^2 Y_n)$ is required to be capable of 
shifting, inhibition and complementation. Integrated circuits 
are used as follows:-                                   ....9.6
(a) A 4 bits wide 1 of 2 selector for generating 
shifted and unshifted A (74157).
(b) A 4 bits wide true/complement selector with 
inhibitor for the rest of the function (74H87).
Figure 9.2 $K.\Delta^2 Y$ LOGIC (SOIC II)
(20 bits (K) feeding 20 x (1 out of 2 selectors) under SHIFT control, 
followed by 20 bits of INHIBIT/COMPLEMENT logic to the output. Per bit 
logic: shift logic selecting bit (n) or bit (n+1); inhibit/complement logic.)
The logic for adding the second order difference sub-products 
$\sum_{n=1}^{6} K_n.\Delta^2 Y_n$ is arranged so that the sub-products and sums are 
fixed in a single logic structure. The shifting/inhibiting and 
complementation has already been discussed. However, the formation 
of negative numbers involves complementation and also the addition 
of 1 in the least significant bit position.
To form the sum of two negative $K.\Delta^2 Y$'s therefore requires an
addition of 2, and for the sum of five negative $K.\Delta^2 Y$'s, a total 
of 5 must be added. This is accomplished with the adder 
connection of Figure 9.1, by adding a 1 into the carry-in ($C_{in}$) position 
of as many adders in the chain as is necessary.
Since the fractional part of $\Delta^2 Y^*_{(i-1)}$ ($F[\Delta^2 Y^*_{(i-1)}]$) is always 
positive there can be no more than 6 negative numbers added by
the adder tree (adders 1 to 6). The additional 1 for any negative
$K_n.\Delta^2 Y_n$ can therefore be routed to the $C_{in}$ terminal of one of the
adders in the chain. Thus $C_{in}$ of adder 1 is used if $\Delta^2 Y_1$ is negative;
$C_{in}$ of adder 2 copes with $\Delta^2 Y_2$ being negative.
This method is economical in adders and avoids any additional 
delays.
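The carry-in arrangement can be modelled as below. It is a sketch under assumptions: a sequential chain of six adders, a 28 bit working width and non-negative K values are chosen for the example, and the names are illustrative.

```python
WIDTH = 28
MASK = (1 << WIDTH) - 1

def sum_subproducts(ks, d2ys, residue=0):
    """Sum six sub-products K_n * d2Y_n plus the (non-negative) residue.

    A negative sub-product is presented to its adder in one's complement
    form; the missing +1 is supplied through that adder's carry-in, so no
    extra adder and no extra delay is needed.
    """
    if len(ks) != 6 or len(d2ys) != 6:
        raise ValueError("six scaled inputs are expected")
    total = residue & MASK
    for k, d2y in zip(ks, d2ys):
        if d2y == 0:
            magnitude = 0                 # inhibit
        elif abs(d2y) == 2:
            magnitude = k << 1            # shift left 1
        else:
            magnitude = k                 # unmodified
        if d2y < 0:
            word, carry_in = ~magnitude & MASK, 1
        else:
            word, carry_in = magnitude & MASK, 0
        total = (total + word + carry_in) & MASK   # one adder of the chain
    return total

def to_signed(value, width=WIDTH):
    """Interpret a `width` bit word as a two's complement number."""
    return value - (1 << width) if value >> (width - 1) else value
```

With all six increments negative, six carry-ins are used, one per adder, which is the worst case described above.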
The whole of the logic (for forming AY, F[Y] etc) is shown in 
Figure 9.1
The logic for forming $\Delta^2 Z$ from the $F[Y]$ and $R$ overflows is shown
in Figure 9.4. It may be seen that the overflow from R on the
previous step (i-1) is negated and added into the new overflow
from R on step (i) thus forming the second order term of Z due
to the accumulation of fractional integral increments from $F[Y]$.
The overflow from $F[Y]$, which forms the other portion of $\Delta^2 Z$, is
added in as well. From this logic, it may be seen that $|\Delta^2 Z|$
can never exceed 2.
Figure 9.3 $K.\Delta^2 Y$'s LOGIC FOR TIME INTEGRATION (SOIC II)
(Carry-in controls: d = 1 if $\Delta^2 Y_4 < 0$; e = 1 if $\Delta^2 Y_5 < 0$; f = 1 if $\Delta^2 Y_6 < 0$. 
Note: $F[\Delta^2 Y] \nless 0$.)

Figure 9.4 $\Delta^2 Z$ LOGIC (SOIC II)
(Inputs: $-\Delta Z_{(i-1)}$; carry out of the R adder, $\Delta Z_{(i)}$; $I[Y]_{(i)}$, not stored, 
only implied.)
$\Delta Z_{(i)}$ is the second order term to be stored, in negated form, in 
readiness for the next iteration, and is, in effect, the overflow 
from R due to the accumulation in R.
Each time integrator has two auxiliary registers associated with 
it. (They are omitted from Figure 9.1 for clarity.) They have 
the same capacity as the Y register and receive the same updating 
signals ($\Delta Y_{(i)}$). However, they are loaded with initial values such 
that they overflow when $\sum \Delta Y$ exceeds some predetermined values.
Thus the Y register is provided with two benchmarks (or comparator 
breakpoints). The overflow detection is as for $\Delta Y$ (as described 
earlier) and causes a flag (flip-flop) to be set (one for each 
auxiliary register). The flags from the auxiliaries are OR'ed
and the resulting signal checked at each iteration. One or more 
auxiliaries overflowing causes firing of this OR gate; the GP 
is interrupted, the DDA halts and a search is initiated for the 
offending auxiliary. The auxiliary is identified and suitable 
GP software action initiated.
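The use of a preloaded register as a comparator can be sketched as follows. The preload formula shown (for an upper breakpoint on a 16 bit signed register) and the function names are assumptions made for the example.

```python
def preload_for_upper_breakpoint(threshold, bits=16):
    """Initial value making the register overflow once the sum of dY exceeds `threshold`."""
    return (1 << (bits - 1)) - 1 - threshold

def run_until_interrupt(preloads, dy_seq, bits=16):
    """Update all auxiliaries with each dY; stop when the OR of the flags fires.

    Returns (iteration index, list of offending auxiliaries), or (None, [])
    if no auxiliary overflows. The search of the flags stands in for the
    GP software search described in the text.
    """
    top = (1 << (bits - 1)) - 1
    bottom = -(1 << (bits - 1))
    registers = list(preloads)
    for step, dy in enumerate(dy_seq):
        flags = []
        for n in range(len(registers)):
            registers[n] += dy
            flags.append(registers[n] > top or registers[n] < bottom)
        if any(flags):  # the OR gate
            return step, [n for n, flag in enumerate(flags) if flag]
    return None, []
```

An unwanted comparator can be given a preload of 0 with a large range to spare, so that it does not signal in normal running.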
9.3 Variables Summation
It was explained in paragraph 9.1, that variables summation is 
arranged in SOIC II to use a subset of the logic used in time 
integration. Thus, much of the description of paragraph 9.2 is 
applicable here.
The requirement of this element is to produce an output $\Delta^2 Z$ where
$Z = \sum_{n=1}^{6} (K_n Y_n)$   ....(9.7)
ie $\Delta^2 Z = \sum_{n=1}^{6} (K_n \Delta^2 Y_n)$   ....(9.8)
where, as for time integration, $K_n$ lies in the range $\pm 2^{-4}$ in
steps of $2^{-23}$.
Since $\Delta^2 Y_1$, etc may each have a value of ±2 the total sum is capable
of reaching 12, if each coefficient K is unity. However, the
output $\Delta^2 Z$ must be restricted to the range ±2. The coefficient
values ($K_1$ etc) are therefore generated as 20 bit binary numbers
(as for time integration) having significances in the range $\pm 2^{-4}$
in steps of $2^{-23}$ (coincidentally, as for time integration). This
gives a maximum possible $\Delta^2 Z$ output of less than 1, since
$6 \times 2 \times 2^{-4} = 0.75$. (The choice
of $2^{-4}$ was determined by convenience in the use of 4 bit "slices"
of logic in the machine.) This restricts the full dynamic range
of $\Delta^2 Z$ and so, marginally, reduces accuracy.
The implementation of variables summation is shown in Figure 9.5.
As with time integration, the 6 subproducts $A.\Delta^2 Y_{(1-6)}$ are added
and the 8 least significant bits are retained in the $F[\Delta^2 Y]$ register
for accumulation with subsequent subproducts. The most significant
portion of $\sum A.\Delta^2 Y$ is used as the "scaled" $\Delta^2 Y$ (called $\Delta^2 Y^*$ to avoid
confusion with incoming increments) and is added into $\Delta Y$ on each
iteration. (This logic is shared with time integration.)
In order to generate the desired algorithm (equation 9.8), and, at
the same time, make use of the time integrator's $\Delta^2 Z/R$ logic, 
$\Delta Y$ must be accumulated in R. (Since adding Y into R creates an 
integral, it is readily seen that adding $\Delta Y$ into R recreates the 
original function.)
In order to provide the monitoring facilities as for time integration, 
$\Delta Y$ must be added into the two auxiliary Y registers in the 
same way as for time integration. ($\Delta Y$ is also added into Y, although 
it is not subsequently used and is therefore not shown in Figure 9.5.) 
All the $\Delta Y$, $R$, $Y$, $Y_{AUX1}$ and $Y_{AUX2}$ registers and most of the logic 
are shared with time integration. The summation algorithm is merely
implemented by inhibiting the passage of $Y_{(i)}$ and $\tfrac{1}{2}\Delta^2 Y^*_{(i)}$ to the
R register and altering the significance at which $\Delta Y$ is added in
to obtain the desired values for $K_{1-6}$.
Summarising, by use of register transfer equations:-
$\Delta^2 Y^*_{(i)} + F[\Delta^2 Y^*_{(i)}] = 2^{-8} \sum_{n=1}^{6} (A_n.\Delta^2 Y_{(n,i)}) + F[\Delta^2 Y^*_{(i-1)}]$   ....9.9
(Symbols as per equation 9.3)
$\Delta Y_{(i)} = \Delta Y_{(i-1)} + \Delta^2 Y^*_{(i)}$   ....9.10
$\Delta Y$ updated at each iteration by scaled second order differences.
$(Y, Y_{AUX1}, Y_{AUX2})_{(i)} = (Y, \text{etc})_{(i-1)} + \Delta Y_{(i)}$   ....9.11
Y and auxiliary registers updated with $\Delta Y$ at each iteration.
$\Delta Z_{(i)} + R_{(i)} = R_{(i-1)} + \Delta Y_{(i)}$
$\Delta^2 Z_{(i)} = \Delta Z_{(i)} - \Delta Z_{(i-1)}$   ....9.12
$\Delta Y$'s summed into R register. Overflows treated, as per time 
integration, to determine $\Delta^2 Z$. (As Y is not added into R, then, effectively,
$I[Y] = 0$ at all times in the $\Delta^2 Z$ determination.)

Figure 9.5 BASIC LOGIC FOR VARIABLES SUMMATION (SOIC II)
(Shift/inhibit/complement input logic as for time integration, feeding R.)
9.4 Arbitrary Variables Integration (AVI)
The requirement of an element producing this function is:-
$z = \int y\,dx$   ....9.13
where y is the dependent
and x is the independent variable.
For reasons that are apparent in the logic implementation, there 
are several differences between the strategy and facilities in this 
function compared with time integration. These all arise from the 
substitution of a general variable (X) for simple time. The changes 
are:-
(i) Only a single dependent and a single independent
variable are used.
(ii) Neither y nor x is scaled.
(iii) A somewhat cruder integration algorithm
is employed. A first order predictor is
used, but the trapezoidal corrector is cruder
as it has to cope with corrections to both 
$Y$ and $\Delta X$. Comparing the algorithm with time 
integration (TI):-
(a) Prediction:
(i) TI: $= (Y.\Delta t)_{(i)} + \tfrac{1}{2}(\Delta Y.\Delta t)_{(i)}$
(ii) AVI: $= (Y.\Delta X)_{(i)} + \tfrac{1}{2}(\Delta Y.\Delta X)_{(i)}$
(b) Post-correction:
(i) TI: $= \tfrac{1}{2}(\Delta^2 Y.\Delta t)_{(i+1)}$
(ii) AVI: $= (Y.\Delta^2 X)_{(i+1)}$
where $\Delta t_{(i)} = \Delta t_{(i+1)}$ etc.
It will be noted that the major post-correction in AVI involves
a product of second order form and, therefore, the third order term
(equivalent to $\Delta^2 Y.\Delta t$ in time integration) is ignored. A second
order product does not arise in time integration as $\Delta^2 t_{(i)} = 0$
for all (i). This is the cause of the greater "crudeness" of AVI.
As no scaling or summing of second order differences is required
at the $\Delta^2 Y$ and $\Delta^2 X$ inputs to AVI, then $\Delta Y$, $\Delta X$ can be obtained
directly by incrementation. Thus:-
$\Delta Y_{(i)} = \Delta Y_{(i-1)} + \Delta^2 Y_{(i)}$
and $\Delta X_{(i)} = \Delta X_{(i-1)} + \Delta^2 X_{(i)}$   ....9.14
Y may be similarly obtained. (X is not required.)
Hence $Y_{(i)} = Y_{(i-1)} + \Delta Y_{(i)}$   ....9.15
$(\Delta Y.\Delta X)_{(i)}$ is a required product for the generation of the output.
It involves two multi-bit variables and thus, in order to avoid
multiplication, is obtained by summing its old value $(\Delta Y.\Delta X)_{(i-1)}$
with other products which involve second order differences.
(Products involving $\Delta^2$ are easy to obtain as explained in paragraph 
9.2.)
$$ (\Delta Y.\Delta X)_{(i)} = (\Delta Y.\Delta X)_{(i-1)} + (\Delta Y.\Delta^2 X)_{(i)} + \Delta X_{(i-1)}.\Delta^2 Y_{(i)} \tag{9.16} $$
Note. $\Delta Y_{(i)}$ is generated for other purposes, such as updating
Y, $Y_{AUX1}$ etc, as well.
$(Y.\Delta X)_{(i)}$ is another product to be generated to obtain the final 
integral. The same technique is used as in equation 9.16 except 
that the result of 9.16 can be used as one of the terms instead 
of going all the way back to second order differences for all terms.
$$ (Y.\Delta X)_{(i)} = (Y.\Delta X)_{(i-1)} + (\Delta Y.\Delta X)_{(i)} + Y_{(i-1)}.\Delta^2 X_{(i)} \tag{9.17} $$
Equations 9.14, 15, 16, 17 can be used for generating the final 
output:-
$$ \Delta Z_{(i)} + R_{(i)} = R_{(i-1)} + (Y.\Delta X)_{(i)} + \tfrac{1}{2}(\Delta Y.\Delta X)_{(i)} + (Y.\Delta^2 X)_{(i)} \tag{9.18} $$
where $R_{(i-1)}$ is the integral residue from the previous iteration,
$(Y.\Delta X)_{(i)}$ is the Eulerian component of the prediction, $\tfrac{1}{2}(\Delta Y.\Delta X)_{(i)}$
provides the first order element to the prediction and $(Y.\Delta^2 X)_{(i)}$
provides the post-correction term for the previous iteration.
Note:- At the start of computation, $\Delta^2$ terms are all zero as
no outputs can have been generated. Thus $(Y.\Delta^2 X) = 0$ for the
first iteration, meaning that no correction is attempted for the 
non-existent previous iteration.
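As a check on equations 9.16 and 9.17, both recurrences can be recovered by expanding the products with the incrementation rules 9.14 and 9.15. The working below is a reader's sketch under that assumption, not part of the original derivation:

$$
\begin{aligned}
(\Delta Y.\Delta X)_{(i)} &= (\Delta Y_{(i-1)} + \Delta^2 Y_{(i)})(\Delta X_{(i-1)} + \Delta^2 X_{(i)}) \\
&= (\Delta Y.\Delta X)_{(i-1)} + \Delta Y_{(i)}.\Delta^2 X_{(i)} + \Delta X_{(i-1)}.\Delta^2 Y_{(i)} \\
(Y.\Delta X)_{(i)} &= (Y_{(i-1)} + \Delta Y_{(i)})\,\Delta X_{(i)} \\
&= Y_{(i-1)}(\Delta X_{(i-1)} + \Delta^2 X_{(i)}) + (\Delta Y.\Delta X)_{(i)} \\
&= (Y.\Delta X)_{(i-1)} + (\Delta Y.\Delta X)_{(i)} + Y_{(i-1)}.\Delta^2 X_{(i)}
\end{aligned}
$$

Every new term multiplies a stored multi-bit quantity by a second order difference of only a few bits, which is consistent with the stated aim of avoiding true multiplication.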
The $\Delta^2 Z$ logic is as for time integration (paragraph 9.2) except 
that, for I[Y] read $I[Y.\Delta X]$. (I[Y] is really shorthand for
$I[Y.\Delta t]$ in time integration.)
The logic for AVI is shown in Figure 9.6. The structure is basic­
ally the same as for time integration except that:-
(i) no input scaling/adders exist.
(ii) Storage is required for the extra terms 
$(\Delta Y.\Delta X)$ and $(Y.\Delta X)$.
(iii) Different terms are added into R.
Much logic is, however, shared with time integration. This includes 
that for:-
(a) Generating $\Delta Y$, Y, $Y_{AUX1}$, $Y_{AUX2}$ and $\Delta^2 Z$.
(b) Detecting $\Delta Y$, $Y_{AUX1}$ and $Y_{AUX2}$ overflows.
Sharing of logic, in this context, means not only employing the
same methods, connections of adders, etc, but the actual components
(adders, stores, etc).
9.5 Variables Multiplication
The requirement of such an element is to produce a function of the 
form:-
$$ z = x.y \quad\text{or}\quad Z = X.Y \quad\text{or}\quad \Delta^2 Z = \Delta^2 (X.Y) \tag{9.19} $$
where X, Y are machine quantized variables.
[Figure 9.6: Basic arbitrary variables integrator (AVI). Labelled signals: Y, $\Delta X$, $(Y.\Delta X)$, $(\Delta Y.\Delta X)$.]
This is a very complex function to perform as both X and Y are
multi-bit numbers. In order to reduce the problems of implementation,
no input scaling for either $\Delta^2 X$ or $\Delta^2 Y$ is provided.
In order to avoid the time consuming process of true multiplication, 
use was made of products involving second order differences to 
build up the desired element output.
The multiplication logic, therefore, has much in common with AVI
and does, indeed, make use of much of its logic and storage. AVI,
however, uses only terms involving $\Delta^2 X$ and $\Delta X$. Hence multiplication,
which uses terms in X, requires rather more storage and 
processing.
In addition, the specification calls for the monitoring of X as 
well as Y. Hence two auxiliary X registers are provided in a 
like manner to Y. They, like $Y_{AUX1}$, cause a processing interruption 
if they cross a breakpoint, or overflow.
Starting from equation 9.19, it is possible to obtain first and 
second order difference forms for Z. These are:-
$$ \Delta Z_{(i)} \simeq Y_{(i)}\Delta X_{(i)} + X_{(i)}\Delta Y_{(i)} \tag{9.20} $$
$$ \Delta^2 Z_{(i)} = \Delta[Y_{(i)}\Delta X_{(i)} + X_{(i)}\Delta Y_{(i)}] = Y_{(i)}\Delta^2 X_{(i)} + X_{(i)}\Delta^2 Y_{(i)} + 2\Delta Y_{(i)}\Delta X_{(i)} \tag{9.21} $$
However, these forms do not take account of the predictive nature 
of simultaneous machine element operation nor of the post-iteration 
corrective potentialities of the system (of which advantage is 
taken in the other functions).
If equation 9.20 is rewritten for step (i-1), the difference taken
to obtain second order form and terms involving products $\Delta^2.\Delta^2$
($=\Delta^4$) ignored, the resulting product predictor/corrector algorithm 
is obtained:-
$$ \Delta^2 Z_{(i)} = (Y.\Delta X)_{(i)} + (X.\Delta Y)_{(i)} + (\Delta Y.\Delta X)_{(i)} + (Y.\Delta^2 X)_{(i)} + (X.\Delta^2 Y)_{(i)} \tag{9.22} $$
This can be formed as for AVI if it is appreciated that the same
process may be used in the formation of $(X.\Delta Y)$ as $(Y.\Delta X)$ and $(X.\Delta^2 Y)$
as $(Y.\Delta^2 X)$.
As the logic for $(X.\Delta Y)$ and $(X.\Delta^2 Y)$ does not appear in other functions
it is specially provided for multiplication - an almost inevitable 
penalty for providing such an exotic single element function.
The output logic for multiplication is shown in Figure 9.7. As may 
be seen, multiplication is much like AVI except for:-
(a) The generation of the "mirror" terms involving X.
(b) The storage for "mirror" terms involving X.
(c) The extra terms that are added into R.
The $\Delta^2 Z$ logic is also slightly different insofar that $I[Y.\Delta X]$ 
is replaced by $I[Y.\Delta X] + I[X.\Delta Y]$.
[Figure 9.7: Multiplication output logic. Terms shown: $(X.\Delta Y)$, $(X.\Delta^2 Y)$, $(\Delta Y.\Delta X)$, $(Y.\Delta^2 X)$.]
The register transfer equations (referring to equations 9.14 to 
9.18 for comparison with AVI) are thus:-
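$$ \Delta Y_{(i)} = \Delta Y_{(i-1)} + \Delta^2 Y_{(i)}, \qquad \Delta X_{(i)} = \Delta X_{(i-1)} + \Delta^2 X_{(i)} \tag{9.23} $$
(update first order differences)
$$
\begin{aligned}
Y_{(i)} &= Y_{(i-1)} + \Delta Y_{(i)} \\
Y_{AUX1,2(i)} &= Y_{AUX1,2(i-1)} + \Delta Y_{(i)} \\
X_{(i)} &= X_{(i-1)} + \Delta X_{(i)} \\
X_{AUX1,2(i)} &= X_{AUX1,2(i-1)} + \Delta X_{(i)}
\end{aligned}
\tag{9.24}
$$
(update main and auxiliary variables)
$$
\begin{aligned}
(\Delta Y.\Delta X)_{(i)} &= (\Delta Y.\Delta X)_{(i-1)} + (\Delta Y.\Delta^2 X)_{(i)} + \Delta X_{(i-1)}.\Delta^2 Y_{(i)} \\
(Y.\Delta X)_{(i)} &= (Y.\Delta X)_{(i-1)} + (\Delta Y.\Delta X)_{(i)} + Y_{(i-1)}.\Delta^2 X_{(i)} \\
(X.\Delta Y)_{(i)} &= (X.\Delta Y)_{(i-1)} + (\Delta Y.\Delta X)_{(i)} + X_{(i-1)}.\Delta^2 Y_{(i)}
\end{aligned}
\tag{9.25}
$$
(update multi-bit terms sub-products).
$$ \Delta Z_{(i)} + R_{(i)} = R_{(i-1)} + (Y.\Delta X)_{(i)} + (X.\Delta Y)_{(i)} + (\Delta Y.\Delta X)_{(i)} + (Y.\Delta^2 X)_{(i)} + (X.\Delta^2 Y)_{(i)} \tag{9.26} $$
(from equation 9.22).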
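The register transfers 9.23 to 9.26 can be exercised in a few lines of Python. This is a minimal sketch rather than a model of the SOIC II hardware: the dictionary keys, the function names, the increment range of -3 to 3 and the omission of the residue register R are all assumptions made for illustration. It checks that the incrementally maintained sub-products stay equal to the directly computed products.

```python
import random


def make_state(x, y, dx, dy):
    """Build a consistent initial state (hypothetical layout, not from the thesis)."""
    return {"X": x, "Y": y, "dX": dx, "dY": dy,
            "dYdX": dy * dx, "YdX": y * dx, "XdY": x * dy}


def multiplier_step(s, d2x, d2y):
    """Advance one iteration following equations 9.23 to 9.26.

    d2x and d2y are the small second order input increments. The residue
    register R is left out, so `out` is the unquantised right-hand side of 9.26.
    """
    dx = s["dX"] + d2x                                  # 9.23
    dy = s["dY"] + d2y
    x = s["X"] + dx                                     # 9.24
    y = s["Y"] + dy
    dydx = s["dYdX"] + dy * d2x + s["dX"] * d2y         # 9.25, first line
    ydx = s["YdX"] + dydx + s["Y"] * d2x                # 9.25, second line
    xdy = s["XdY"] + dydx + s["X"] * d2y                # 9.25, third line
    # Assumption: the correction terms use the updated Y(i) and X(i).
    out = ydx + xdy + dydx + y * d2x + x * d2y          # 9.26 without R
    new = {"X": x, "Y": y, "dX": dx, "dY": dy,
           "dYdX": dydx, "YdX": ydx, "XdY": xdy}
    return new, out


def products_consistent(s):
    """True if the stored sub-products equal the directly computed products."""
    return (s["dYdX"] == s["dY"] * s["dX"]
            and s["YdX"] == s["Y"] * s["dX"]
            and s["XdY"] == s["X"] * s["dY"])


def run_random(steps=200, seed=1):
    """Drive the element with random increments in -3..3 and check every step."""
    rng = random.Random(seed)
    s = make_state(5, 7, 2, -1)
    for _ in range(steps):
        s, _out = multiplier_step(s, rng.randint(-3, 3), rng.randint(-3, 3))
        if not products_consistent(s):
            return False
    return True
```

With both second order inputs held at zero, the three predictor terms reduce to $Y.\Delta X + X.\Delta Y + \Delta Y.\Delta X$, which is exactly the change in X.Y over the next step if the first differences do not change. The correction terms only come into play when the inputs change.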
Multiplication requires the following initial conditions:
$Y_0$, $\Delta Y_0$, $X_0$, $\Delta X_0$, $(\Delta Y.\Delta X)_0$, $(\Delta Y.X)_0$ and $(Y.\Delta X)_0$.
$Y_0$ and $X_0$ are primary initial conditions to the multiplier. $\Delta X_0$
and $\Delta Y_0$ are easily derived as they are directly proportional to
the weighted sum of previous element $Y_0$ values. The other terms
are derived from $Y_0$, $X_0$, $\Delta Y_0$ and $\Delta X_0$.
The limited lengths of registers used in multiplication (and other 
functions, particularly AVI) impose certain limits on the magnitudes
of some of the terms. These limitations are the subject of 
Appendix
9.6 Auxiliary Logic for Data Boards
(a) Output Logic
As was previously shown, contiguous values of $\Delta Z$ can only differ 
as a result of changes to the most significant portions of 
certain registers. For time integration a carry-out of the top
of the R or I[Y] adder would constitute a unit quantum of $\Delta^2 Z$.
A carry out of both would give $\Delta^2 Z = 2$. The registers that 
have to be inspected for carries-out in the cases of the 
other algorithms are:-
(i) Summation:- R.
(ii) AVI:- $Y.\Delta X$, R.
(iii) Multiplication:- $Y.\Delta X$, $X.\Delta Y$, R.
(b) Overflow Logic
Although foreshortening of certain registers is carried out (which 
means they can apparently freely overflow), a check is necessary on 
those not so foreshortened. Overflow of these registers would 
clearly be unacceptable and suitable detection and flagging is 
incorporated. (Particular amongst these registers is $\Delta Y$.)
Overflow in these registers can be quite general ie positive 
or negative. Detection thus takes the form of watching those 
operations which can cause overflow, and trapping them if a result 
with an "unexpected sign" has occurred. Operations capable of 
overflow are:-
(i) addition of quantities of like sign.
(ii) Subtraction of those of unlike sign.
The other two combinations are incapable of causing overflow, if 
the two source operands are in range.
Detection of overflow is of the following form:-
(i) Addition of positive operands causing a "negative" 
result.
(ii) Addition of negative operands causing a "positive" 
result.
(iii) Subtraction of a positive from a negative operand 
causing a "positive" result.
(iv) Subtraction of a negative from a positive operand 
causing a "negative" result.
(i)-(iv) may be expressed symbolically as:-
$$ \text{Overflow} = [S(\text{sum}) \neq S(A)] \wedge [S(A) = S(B)] \wedge \text{addition, etc} \tag{9.27} $$
if S(x) means the sign of (x).
As 2's complement arithmetic is used throughout the machine and 
all updating is of the form
$$ W_{(i)} = W_{(i-1)} + \Delta W_{(i)} $$
detection of overflow due to addition only is required.
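The addition-only rule can be illustrated with a short Python sketch of the sign test in equation 9.27. The 8 bit word length and the function names are assumptions chosen for the example; the thesis does not fix them here.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def is_negative(value):
    """Sign function S(x) of equation 9.27, True for a negative value."""
    return value < 0


def add_with_overflow(a, b, bits=8):
    """Return (wrapped sum, overflow flag) for the update W = W + dW.

    Overflow is flagged when both operands share a sign and the wrapped
    result has the opposite, "unexpected", sign.
    """
    total = to_signed(a + b, bits)
    overflow = (is_negative(a) == is_negative(b)) and (is_negative(total) != is_negative(a))
    return total, overflow
```

For example, `add_with_overflow(100, 50)` wraps to a negative result and sets the flag, whereas operands of unlike sign such as `add_with_overflow(100, -50)` can never set it.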
CHAPTER 10
SOIC Mk II INTERCONNECTIONS AND CONTROL HARDWARE
In order to support the SOIC II data processing hardware, several 
additional hardware/logic systems are necessary.
(i) Interconnections logic for communicating the 
outputs of machine elements to the dependent 
and independent variables of others.
(ii) Monitoring logic for extracting the results 
of calculations and presenting them to the 
computer interface.
(iii) Interruptions logic for trapping anomalous 
results and informing the GP computer.
(iv) Hardware check facilities for aiding diag­
nostics in the event of hardware failure.
(v) Power supplies.
(vi) Input/output hardware for the SOIC II/GP inter­
face.
10.1 Interconnections (General)
The machine consists of 64 computing elements which, for the 
purposes of interconnection, can be considered as a matrix of 
8 rows by 8 columns. Figure 10.1.
[Figure 10.1: SOIC II machine element grouping. Elements 1a to 8h are arranged as groups 1 to 8 (rows), processed in the order a, b, c, ... h. Intra-group interconnections run horizontally; inter-group interconnections run vertically.]
A group consists of a single 8 element row within which a total 
(intra-group) interconnection topology exists. The elements of 
each column are interconnected by a total topology (inter-group), 
but no inter-group connections exist between columns.
The logic for effecting interconnections at intra and inter-group 
levels is similar, although the storage of element outputs is 
centralised for the inter-group logic, but is local for the intra­
group interconnections logic. This is a matter of packaging con­
venience.
10.2 Interconnections (Detailed Description)
(a) Intra-group ($\Delta^2 Y$)
Each group consists of 8 elements, processed sequentially during 
an iteration cycle. The topology is total so that 8 bits of 
information are needed for each element to determine the variables 
which it should input. Because of the sequential processing, a 
totally time domain orientated variables selection method is used. 
Considerable hardware economies are thus realised.
The intra-group logic is as shown in Figure 10.2.
The $\Delta^2 Z$'s produced during an iteration cycle are stored, as they
occur, in a shift register (the $\Delta^2 Z_{(i)}$ store). They are required
for the next iteration, but must not interfere with the present one.
The $\Delta^2 Z$'s being used in the current iteration are stored in a
similar ($\Delta^2 Z_{(i-1)}$) shift register. Every time a fresh element is 
to be processed, the $\Delta^2 Z_{(i-1)}$ store is circulated and its contents 
are presented sequentially to the serial input terminals of a third
($\Delta^2 Y$ store) shift register. Those increments, which are to be
selected, are clocked into the $\Delta^2 Y$ store by a '1' presented to
the $\Delta^2 Y$ store strobe terminals from a fourth (selector) shift
register. The fourth shift register thus contains a pattern corresponding
to those increments which are to be selected by each 
machine element (within a group). This register is thus of 64 
bits corresponding to the 8 sources of increments for each of 8 
elements.
[Figure 10.2: Intra-group interconnections logic. Blocks: $\Delta^2 Z_{(i)}$ and $\Delta^2 Z_{(i-1)}$ stores (each a 3 x 8 bit serial in/serial out shift register); $\Delta^2 Y_{(i)}$ store (3 x 6 bit serial in/parallel out shift register); selector store (1 x 64 bit serial in/serial out shift register); 3 x (2 from 1) selector accepting the selected $\Delta^2 Y$ from the inter-group interconnections logic; clock.]
Notes on the method:
(i) Each machine element can accept up to 6 inputs, thus the
$\Delta^2 Y$ store has a capacity of 6 three bit words. The six
inputs have to be kept separate until after presentation to 
a machine element as they are individually weighted within 
the machine element.
(ii) The output from the intra-group logic (up to 6 three bit 
words) has to be combined with the output from the inter-group
logic (also up to 6 words). As a programming restriction 
is applied which only allows a total of 6 words to be produced 
from the interconnections logic, as a whole, for any element, 
the intra and inter group outputs are OR'ed in such a way 
that each output word appears on a separate line.
This is accomplished by "packing" selected $\Delta^2 Z$'s into a
shift register. By clocking the $\Delta^2 Y$ store (Figure 10.2) with
the data output of the selector store, shifting only occurs
in the $\Delta^2 Y_{(i)}$ store on the occasions that a $\Delta^2 Z$ would be 
accepted from the $\Delta^2 Z_{(i-1)}$ store. If only 3 out of the 8 
available $\Delta^2 Z$'s are selected, these will occupy the first
3 word positions in the $\Delta^2 Y$ shift register. If, by a similar
technique, inter-group $\Delta^2 Z$'s are packed into another shift 
register (from the other end), the shift registers' contents 
can be OR'ed without fear of 2 increments appearing on the 
same output line (Figure 10.3).
[Figure 10.3: Logic for combining selected increments from intra and inter group interconnections matrices. Blocks: intra-group store, inter-group store.]
(iii) Selections for a given element occur during the processing
of the previous element. In order not to create a "dead-space"
when the $\Delta^2 Z_{(i-1)}$ store is being loaded with $\Delta^2 Z_{(i)}$ at the
beginning of a new iteration, the $\Delta^2 Z_{(i-1)}$ store is loaded with 
$\Delta^2 Z_{(i)}$ during the processing of the last (8th) element. Nonetheless,
time must be provided for the $\Delta^2 Y$ store to accept
the output of the 8th element (if required), when $\Delta^2 Z_{(i)}$
appears. This time is about 50 nS.
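The packing scheme of note (ii) can be sketched in Python as follows. This is an illustrative model only: the list representation of the shift registers, the function names and the use of unsigned 3 bit codes for the increments are assumptions, not details taken from the hardware.

```python
WORDS = 6  # capacity of the second order difference input store, from the text


def pack(increments, select_bits, from_input_end=True):
    """Shift selected increments into a 6 word register.

    The register only shifts when the selector presents a '1', so the
    selected words end up packed together at one end.
    """
    store = [0] * WORDS
    for inc, sel in zip(increments, select_bits):
        if sel:
            if from_input_end:
                store = [inc] + store[:-1]   # intra-group: fill from the first position
            else:
                store = store[1:] + [inc]    # inter-group: fill from the other end
    return store


def combine(intra_store, inter_store):
    """OR the two packed registers word by word."""
    return [a | b for a, b in zip(intra_store, inter_store)]


def select_inputs(intra_incs, intra_sel, inter_incs, inter_sel):
    """Apply the programming restriction, then pack and combine."""
    if sum(intra_sel) + sum(inter_sel) > WORDS:
        raise ValueError("programming restriction: at most 6 inputs per element")
    return combine(pack(intra_incs, intra_sel, True),
                   pack(inter_incs, inter_sel, False))
```

Selecting three intra-group and two inter-group increments, for instance, leaves the intra-group words in the first three positions and the inter-group words in the last two, so the OR never merges two increments onto one line.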
(b) Inter-group ($\Delta^2 Y$)
This works in exactly the same manner as intra-group, having the same
number of $\Delta^2 Z$'s to distribute to the same number of slots ($\Delta^2 Y$) in
the same time. However, whereas the $\Delta^2 Z_{(i)}$ store for intra-group
working contains the 8 outputs of a certain group, and there are 
thus 8 autonomous intra-group interconnection logic systems, the 
inter-group logic collects all 64 outputs and aligns them for timely 
inter-group communications.
[Figure 10.4: Inter-group interconnections logic. The outputs of groups 1 to 8 enter the $\Delta^2 Z_{(i)}$ store (1 per machine); a store in each group (1 per group) delivers 6 x 3 bit words to be combined with the intra-group words.]
Referring to Figure 10.1, elements are processed in the order a, 
b, c, ... ie 1a, 2a, 3a .... 8a are all processed at the same 
time, then 1b, 2b, etc.
The 8 three-bit increments ($\Delta^2 Z_{(i)}$) from each column of elements 
are collected in a 3 bits wide, 8 by 8 bit shift register (Figure 
10.4). As columns of elements are processed, the 8 word columns 
pass along the shift register ($\Delta^2 Z_{(i)}$).
The increments are collected at the other (left-hand) end of the 
shift register for distribution, by shifting them at right angles 
to the previously mentioned direction of flow (Figure 10.4). At 
this point, the selection is carried out in a like manner to that 
for intra-group distribution.
The two directions of increment flow are achieved by the use of 
dual source shift registers for the left-most column of the $\Delta^2 Z_{(i)}$ 
store of Figure 10.4.
Although the use of a $\Delta^2 Z_{(i)}$ store may appear wasteful, as the
information is scattered throughout the machine already in intra-group
$\Delta^2 Z_{(i-1)}$ stores, the logical complication of collecting the 
appropriate increments from the various parts of shift registers
in various parts of the machine does not warrant the saving in
storage. Thus the $\Delta^2 Z_{(i)}$ store of Figure 10.4 is implemented in
hardware (there is only one such 192 bit store for the whole 
machine). Furthermore, it is made from serial in/serial out 
shift registers which are particularly inexpensive and light on 
power consumption. The 8 twenty-four bit registers (8 x 3) are 
situated physically in their own group's logic bay and the cross-selection
on the last bit is achieved by a single 3 bit busbar
running, fairy-light fashion, round the 8 groups, making a physically
very long 3 by 8 bit shift register!
10.3 Monitoring Logic
Monitoring nodes are merely dummy machine elements with no outputs, 
which acquire data from the inter-group busbar, select the increment 
(1) required, and convert, through two stages of accumulation, 
into a whole number variable:-
$$ \Delta Y_{(i)} = \Delta Y_{(i-1)} + \Delta^2 Y_{(i)}, \qquad Y_{(i)} = Y_{(i-1)} + \Delta Y_{(i)} \tag{10.1} $$
(Unlike an ordinary machine element variable, Y is not foreshortened.)
The selections logic is as for an ordinary machine element, except
that the fan-in is one. Thus, the logic is merely required to trap
the appropriate $\Delta^2 Z$ as it passes down the $\Delta^2 Z$ busbar.
The only special point of note in the system, as implemented, is 
that 8 monitor points are provided and it is economically expedient 
for them to share the selections logic and arithmetic unit. The 
sequential processing that the latter demands is no disadvantage 
as the I/O interface only allows for one output variable at a time 
to be passed to the GP computer. The monitor logic is thus as 
shown in Figure 10.5.
[Figure 10.5: Monitor logic (8 monitors). Blocks: $\Delta^2 Z_{(i)}$ input (3 bits), 64 bit shift register, 8 word $\Delta Y$ store, 8 word Y store, output to I/O.]
10.4 Interruptions Logic
(This section is only concerned with interruptions arising as a 
result of elements' auxiliary registers crossing breakpoints.)
Each machine element has two (Y) and two (X) auxiliary registers. 
They are set to values such that when the Y (or X) register reaches 
some predetermined value, the auxiliary register changes sign.
The change of sign is easy to detect using an exclusive-OR gate 
(Figure 10.6).
[Figure 10.6: Auxiliary register sign change detection logic.]
The exclusive-OR gate gives a one output if the sign of a register
differs from its previous value - two conditions contained in the
combinational logic of the most-significant data boards.
The change causes the setting of a certain flip-flop of a long 
shift register and also fires a high fan-in OR gate whose output 
interrupts the GP computer. The GP computer can then halt opera­
tions and initiate a search of the shift register to determine which 
bit or bits was/were set (Figure 10.7).
[Figure 10.7: Interruptions store and interrogation logic. Blocks: sign changes in arithmetic, detect shift register, clock, counter, OR gate interrupting the GPDC, output to I/O.]
If the shift register is shifted until a one appears at its output, 
the number of shifts can be counted and the source of interruption 
(auxiliary register sign change) determined. The interruption can 
then be cancelled. If the OR gate is still fired, then, clearly, more 
than one sign change must have occurred on the same iteration.
The search is then continued, together with the shift count, until 
the next interruption "flag" is located.
Each time a flag is located, the shift count is transmitted, via 
the GPDC/DDA interface, to the GPDC. The appropriate software action 
is then initiated, ie a fresh breakpoint entered to the comparator 
and fresh potentiometer coefficients entered to function generators, 
etc.
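The interrogation procedure can be sketched in Python as a loop over a list standing in for the interruptions shift register. The function name, the list model and the convention that the first list entry is the bit at the serial output are assumptions made for illustration.

```python
def locate_flags(flags):
    """Return the shift counts at which set flags emerge from the register.

    The loop runs while any bit is still set, which plays the part of the
    high fan-in OR gate. Popping a set bit corresponds to cancelling that
    interruption once its shift count has been reported.
    """
    register = list(flags)
    found = []
    shifts = 0
    while any(register):
        bit = register.pop(0)     # bit currently at the serial output
        register.append(0)        # a zero is shifted in behind it
        if bit:
            found.append(shifts)  # the shift count identifies the source
        shifts += 1
    return found
```

Two sign changes on the same iteration, at positions 2 and 5 of an 8 bit register say, are reported as two successive shift counts, after which the OR condition clears and the search stops.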
10.5 Hardware Check Facility
In order to gain full benefit from the use of largely identical 
computing elements, a simple hardware check facility is incorporated 
into the machine at the group level. This consists of the provision
of equivalence gating between the R register outputs of the 
two 8 element groups in each rack. Simply, the 4 bit R register
outputs of like significance in 2 groups are compared. Non-equivalence
causes a halt in the computing. By setting similar programs
in the two groups, a check can be made for agreement. The following
points are worthy of mention:-
(i) The similarity of programs must extend to the machine 
element level ie those elements that are processed 
concurrently within the cycle of 8 must be similarly 
programmed. (The check is carried out for each element 
processed during an iteration cycle.)
(ii) By picking on two members of different groups to check 
at any one time, element interaction within a group is 
eliminated. This is particularly important when it is 
remembered that members of a group share arithmetic 
elements, variables storage etc. A catastrophic failure 
of a complex integrated circuit is fully capable of making 
two elements of the same group yield consistent results.
(iii) This check facility is normally only used during a
maintenance check activity and not during normal computation,
say, with unused elements. This is necessary 
as the checking slows down the machine appreciably.
(iv) Most processing operations affect the R register so 
that failure of an IC will usually cause incorrect 
results to appear at the output of the R register.
However, auxiliary registers, in particular, can be 
processed wrongly without affecting R. They can be 
checked by setting breakpoints and checking that 
breakpoint crossings occur at the same time for two 
groups.
10.6 Power Supply
Although the power required for this machine is considerable - 
approximately 100 A at 5 V - it is derived by largely conventional
means. A double-wound transformer and silicon bridge 
rectifier set is used to generate unsmoothed DC. This is partially 
smoothed by an inductor capacitor filter. The final smoothing 
and stabilisation is carried out by a bank of 5 V/1 A silicon
monolithic voltage regulators, using internal reference voltages. 
However, the essential difference between this system and many 
others is that the regulators are mounted on the printed circuit 
boards (PCB). This has the following advantages:-
(i) Unregulated DC is distributed through the machine, 
the voltage drops due to busbar resistance being 
equalised automatically in the regulators (providing 
more than 7 V (instantaneous minimum) reaches the 
regulators). Therefore busbars are not unduly thick 
or cumbersome.
(ii) Regulated DC is kept local to individual PCBs.
The problems of busbar impedances and decoupling are 
considerably reduced.
(iii) Heat is dispersed more easily in the machine. If
the power supply had contained the regulators at the 
bottom of a 19" rack, it would have been difficult 
to remove the heat other than by convecting/blowing 
up the rack chimney. This would have caused unnecessary
heating of the logic modules. In the system 
adopted, the regulators and their heat sinks form a 
chimney at the front of the rack, quite separately 
from the logic column.
(iv) Use of separate regulators causes only local damage 
if a regulator goes short circuit (the most common 
failure mode). The use of parallel regulators in a 
single outlet power supply could conceivably cause 
widespread damage to the machine.
The power supply is designed to achieve satisfactory machine 
operation at extremes of current consumption and available mains 
voltage. Detailed calculations are given in Appendix 4.
At "mains-low" the problem is one of providing the minimum necessary
input voltage for the IC regulators (7 V). The lowest input 
voltage occurs just before the power supply rectifiers conduct 
(once every 10 mS).
At "mains-high" the problem is one of keeping the junction temper­
ature of the regulators within prescribed limits, even at maximum 
current drain. This is achieved by the use of a suitable size of 
heat sink in a suitably convected or forced-air chimney.
10.7 Input-Output Hardware Considerations
Input-output (I/O) for SOIC Mk II will have the following purposes:-
(i) Loading of SOIC machine elements with initial 
conditions, breakpoints, potentiometer settings, 
interconnections (patching) data, iteration 
count, monitoring information.
(ii) Outputting of results from SOIC, interruptions, 
breakpoints identification.
SOIC is designed to work largely autonomously, for extended periods 
- maybe several seconds in exceptional circumstances. The through­
put rate of I/O will therefore not be excessive and a standard 
interface system (CAMAC) has been chosen in preference to a high 
performance, non-standard, interface. The choice of CAMAC, of 
course, makes possible the connection of other equipment to the
GPDC/DDA system at a later date without any system redesign.
As all information for SOIC is contained in addressable RAM’s, 
the interface/local storage connection will be particularly simple. 
Each block of data from the GPDC will be headed by a label stating 
the nature of the data and its destination (where applicable) and 
this will set up the appropriate path to the RAM's addresses. As 
stated in a previous chapter, the RAM’s have local buffer stores/ 
shift registers which are used to prealign data before it is 
presented to the RAM's.
Using RAM's for storage means that they will be able to consume 
data much faster than the GPDC can issue it, so that no special 
storage will have to be provided for holding data whilst it is 
being distributed through SOIC II.
CHAPTER 11
SYSTEM DESIGN EVALUATION AND DISCUSSION
11.1 Project Progress
The theoretical work on the organisation of an incremental computer 
has shown that such a system could yield a significant advance in 
computing speed. A detailed logic design of an original form of 
digital differential analyser (DDA) has been completed and has 
demonstrated that the manufacture of such a DDA is a practical 
proposition.
Cranfield Institute of Technology is interested in the production 
of a complete installation of the new computer system and has 
collaborated in the simulation of the proposed logic designs. In 
addition, they have written some software for initial connection 
and loading of the machine.
The production machine has been designed with 8 groups of 8 elements 
and a central control for loading, interconnections, etc. A pro­
totype single group of 8 elements has been constructed (using hand- 
wiring) and an ad-hoc logic control has been used for loading and 
interconnections.
The results of experiments upon the hardware (and the simulations 
which have been done) show the following:-
(i) Time Integration
With a 200 KHz iteration rate, the maximum sine-wave 
frequency, using the full dynamic range, is 1,000 Hz 
with a non-cumulative error of 0.003% (measured and 
simulated for a sine-cosine loop).
(ii) Integration with Respect to Arbitrary Variables
When tested in the same situation as (i), at a 1 KHz
sine-wave rate the error was 0.2% (non-accumulative).
The error was found to be inversely proportional 
to frequency for the first 4 octaves below 1 KHz, 
then levelled off below 60 Hz.
(iii) Summation
Summation was tested by the insertion of such 
elements (2) in a sine-cosine loop. The errors 
they introduced were found to be dominated by 
the delays they incurred, not their basic accuracy.
(iv) Multiplication
A sine-cosine loop was formed from two time integrators.
The sine and cosine were multiplied to 
form the first overtone ie
$$ \sin 2\theta = 2 \sin\theta \cos\theta $$
The overtone was found to be very stable both in 
terms of amplitude (0.2%) and frequency at 1 KHz.
The amplitude error, over the period of simulation
"i
(10 cycles), did not grow or decay.
The simulations have revealed no intrinsic errors or weaknesses 
in the logic design of SOIC II. However, over 130 hours of 
simulation have represented only some 90 seconds of SOIC II 
time and thus the only realistic method of testing SOIC will be 
after construction.
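To put that ratio in perspective, a rough calculation from the figures quoted above gives the following. It assumes that the 200 KHz iteration rate of the time integration test applies throughout the 90 seconds, which the text does not state explicitly:

$$ \frac{130 \times 3600\ \text{s}}{90\ \text{s}} = 5200, \qquad 90\ \text{s} \times 200\,000\ \text{iterations/s} = 1.8 \times 10^{7}\ \text{iterations} $$

On these assumptions the software simulation ran some 5200 times slower than the proposed hardware.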
Cranfield has also simulated the interconnections logic, mainly 
with a view to establishing the viability of the proposed inter- 
element connections topology and for finding ways to automate 
element allocation within these topological constraints.
The original topology, ie that described in this thesis, was 
found not to be suitable for the most complex simulations and 
has since been modified. However, the topology described is considered
suitable for a wide variety of other applications, 
particularly the economical implementation of sparse matrices.
The new topology*, which provides a slightly wider choice of 
inter-group paths, has been found to work on all the problems 
attempted so far, including some involving over 95% of the available
machine elements.
11.2 Discussion
SOIC II appears to have provided a possible solution to the 
problem of fast, economical simulation. It also has shown many 
characteristics which are worthy of further investigation.
If DDA complexes of this form are available, there is the pos­
sibility of having several of these (with local buffer storage) 
connected to a general purpose digital computer acting as a 
central processor. Such systems could deal with many problems 
involving iterative solutions ie turbine blade vibration, 
distillation column design, bridge vibration analysis, etc. Com­
puting systems such as this could provide several orders of 
magnitude improvement on current solution rates. (A private 
communication from the Cambridge Engineering Laboratories is
even more optimistic!)
* See Addendum 1.
11.3 Future Work 
This would appear to fall into two categories
(a) Final construction and subsequent evaluation of SOIC II 
in the simulation activities for which it is intended.
(b) More general evaluation of the second order difference 
technique, with a wider sphere of application in mind. 
Particular issues of interest are:-
(i) The efficacy of the machine algorithms - should 
there be more algorithms or better ones - should 
all elements be able to perform any function in 
the system repertoire? Are predictors the best 
approach or should algorithms more akin to the 
methods of Adams or Runge-Kutta be attempted?
(ii) The inter-element connection topology. Is there
a topology that will suit most incremental computer 
applications, that is economical, and will not 
be deleterious to the system? Is the topology 
specified for SOIC II too sparse, or otherwise?
Do some topologies suit element allocation by 
computer algorithm better than others?
(iii) Would a second order difference system benefit
from a local buffer which is fed on an exceptions
References
1 Thomson, J
"On an Integrating Machine having a New Kinematic Principle." 
Proc Royal Soc (London) Vol 24, pp 262-265 (1876)
2 Bush, V
"The Differential Analyser, a New Machine for Solving 
Differential Equations."
J Franklin Institute, Vol 212, pp 447-448 (1931)
3 Sorensen, E G
"Construction and Maintenance Report on the UCRL 
Synchro-driven Differential Analyser."
Univ California Rad Lab Rept, UCRL-1717,
February 1952
4 Fifer, S
"Analogue Computation."
Vol III pp 665-714, McGraw-Hill 1961
5 Forbes, G F
"Digital Differential Analysers."
(Pacoima) 1957
6 Leondes, C T, Rubinoff, M
"DINA, a Digital Analyser for Laplace, Poisson, Diffusion 
and Wave Equations."
Trans Am Inst Elec Engrs, Vol 71, p 303, 1952
7 Westinghouse Electric Corp Contracts Repts
"Numerical Solution of Partial Differential Equations 
on a Parallel Digital Computer."
NASA Contract NAS5-2730 (1963)
8 Adams, D P
"Multi-special Purpose Computer."
Proc 1962 Workshop on Computer Organisation.
Spartan Books, P89 Washington 1963
9 Wallman, H
"An Electronic Integral Transform Computer and the 
Practical Solution of Integral Equations."
J Franklin Inst, Vol 250, pp 45-61, July 1960
10 Tomovic R, and Parayanovic, N
"Solving Integral Equations on a Repetitive Differential 
Analyser."
Trans IRE on Electronic Computers, Vol EC-9,
pp 503-506, December 1960
11 Hyatt, G P, Ohlberg, G
"Electrically Alterable Digital Differential Analyser." 
Proc AFIPS Spring Joint Computer Conference, 
1968, pp 161-169
12 Hatvany, J
"The DDA Integrator as the Iterative Module of a 
Variable Structure Process Control Computer." 
Automatica, Vol 5, No 1 pp 41-49, 1969
13 Strauss, Jon C, (Ed)
"The SCI Continuous System Simulation Language (CSSL)." 
Simulation, December 1967, pp 281-303
14 Teska, J T, Maddox, H M
"Verdan Logical Description"
Autonetics Report EM-1808, March 1963
15 Sizer, T R H, (Ed)
"The Digital Differential Analyser."
Chapman and Hall, London, 1968
16 Norton, B R
"Numerical Approximation."
Routledge and Kegan Paul Library of Mathematics
17 Owen, P L et al
"CORSAIR - a Digital Differential Analyser."
Elec Eng, Vol 32, No 394, P740, 1960
18 Owen, P L et al
“The DDA and its Realisation in Digital Form."
Elec Eng, Vol 32, p 614 and p 700, 1960
19 Doran, J F
"The Serial Memory DDA."
Maths Tables Aids Computation, Vol 6, p 102 (1952)
20 Lance, G N
"Numerical Methods for High Speed Computers."
Iliffe, 1960, (London)
21 Mesniaeff, P G
"DDA Computes Elevator Travel Curve."
Control Engineering, September 1970, pp 76-77
22 Rae, W
"The CETA Digital Simulator."
Company Report. MEMBRAINE, Poole, Dorset, 1971
23 Steinberg, L
"The Backboard Wiring Problem; a Placement Algorithm."
SIAM Review, Vol 3, No 1, P37, January 1961
24 Elshoff, J L, Aulina, P T
"The Binary Floating Point DDA."
Fall Joint Computer Conference, 1970,
Vol 37, pp 369-376
25 McGhee, R B, Nilsen, R N
"The Extended Resolution Digital Differential Analyser; 
a New Computing Structure for Solving Differential 
Equations."
IEEE Trans, Vol C-19, pp 1-9, 1970
26 Paul, R J A, Galhand, H B
"Design and Some Applications of a Generalised Integrator." 
Proc IEE, Vol 114, No 9, pp 1193-1205, September 1967
27 Parasuraman, B
"Solution Time Comparisons of Digital Computers and DDA's." 
Proc IEEE, pp 324-326, March 1972
28 Benyon, P R
"A Review of Numerical Methods for Digital Simulation."
pp 219-238, November 1968
29 Martens, H R
"A Comparative Study of Digital Integration Methods." 
Simulation, pp 87-94, February 1969
30 Turtle, Q C
"Incremental Computer Error Analysis."
IEEE Trans Comm Electronics
Vol CE-82, pp 492-495, September 1963
31 Gilbert, E G
"Dynamic Error Analysis of Digital and Combined 
Analogue-Digital Computer Systems."
Simulation, pp 241-257, April 1966
32 Kutta, W
Z Math Phys, Vol 46, p 435 (1901)
33 Gill, S
"A Process for the Step-by-Step Integration of 
Differential Equations in Automatic Digital 
Computing Machines." Proc Camb Philos Soc, 1959, p 96
34 Bradley, R E and Genna, J F
"Design of a One-Megacycle Iteration Rate DDA."
Proc 1962 Spring Joint Comp Conf, pp 353-364
35 Hannover, G
"Automatic Patching for Analogue and Hybrid Computers." 
Simulation, May 1969
36 Kurokawa, K
"All IC Hybrid Computer Eliminates the Patchwork 
from Programming."
Electronics, pp 100-107, March 17, 1969
37 Bekey and Karplus
"Hybrid Computation."
(Wiley)
38 Milne, W  E
"Numerical Solution of Differential Equations."
(Wiley) 1953
39 Kopal, Z
"Numerical Analysis."
Chapman and Hall, 1961
40 Fröberg, C E
"Introduction to Numerical Analysis."
Addison Wesley, 1964
41 Gracon, T J, Strauss, Jon, C
"Design of an Auto-Patching System for Analogue 
Computers."
AFIPS, Spring Joint Computer Conf, 1970 
Vol 36, pp 31-38
42 Gill, A
"Systematic Scaling for DDA's."
Trans IRE on Elec Comp, Vol EC-8, pp 486-489 
December 1959
43 Knudsen, H K
"The Scaling of DDA's."
Trans IEEE on Elec Comp, Vol EC-14, pp 583-590, 
August 1965
44 Kella, J
"On the Reversibility of Computations in a DDA."
Trans IEEE on Elec Comp, Vol C-17, pp 283-284 
March 1968
45 Leake, R J, Althou, H L
"DDA Scaling Graph."
Trans IEEE on Elec Comp, Vol C-17, pp 81-84 
January 1968
46 Bywater, R E H, Lovering, W F
"A Programmable Extended Resolution Digital 
Differential Analyser."
Radio and Electronic Engineer, Vol 42,
No 5, May 1972, pp 203-212
47 Bywater, R E H
"One Step Integration Method for Digital 
Differential Analysers."
Electronics Letters, 1970, Vol 19, 
pp 613-614
48 Bywater, R E H
"Digital Integrator Design Incorporating an 
Output Scaler."
Automatica 1971, Vol 7, pp 735-739
49 Bywater, R E H
"A Modified Gray to BCD Convertor"
Electronic Engineering, May 1972, pp 67-68
50 Burgess, A D, Coales, J F, Stojak, P F, White, G W T
"Modelling and Control of Complex Industrial 
Plant using DDA and a Multiprocessor 
Configuration."
Private communication, University of 
Cambridge, Department of Engineering 
CUED/B - Control/TR47(1973)
Design base for realising high-resolution digital differential analysers
R. E. H. Bywater, B.Sc. (Eng.)
Indexing terms: Integrating circuits, Differential analysers
ABSTRACT
The limited information transmitted between integrators in binary and ternary systems has led to consideration of 
multibit methods. However, conventionally implemented, these are expensive, because each integrator requires 
not only a more complex interconnection matrix, but also a full multiplier. The use of 2nd-order differences 
leads, in many instances, to a cheaper system, because, in a fast machine, the integral increments will not change 
significantly between solution steps.
LIST OF SYMBOLS
Y                = integrand
X                = independent variable
Z                = integral
ΔX, ΔY etc.      = increments of X, Y etc.
Δ²X, Δ²Y etc.    = 2nd-order differences of X, Y etc.
R                = low-significance integral residue
F[Y], F[Z] etc.  = fractional portion of mixed quantities Y, Z etc.
I[Y] etc.        = integer portion of Y etc.
n, m, p          = register bit lengths of R, ΔY and X, Y, Z, respectively
K                = integrator scale (gain) factor
t                = time
X(0), Y(0) etc.  = initial values of X, Y etc.
X(i), Y(i) etc.  = values of X, Y etc. at iteration i
j                = integrator 'fan-in'
1 INTRODUCTION
The digital differential analyser (d.d.a.) has been a subject 
of interest, for many years, as a special-purpose computer 
for the solution of problems involving differential equations, 
in particular, for process control,1 navigation and missile 
guidance.2-4
Basically, it consists of an assemblage of digital integrators 
and other items of special-purpose logic, i.e. function 
generators, input/output interfaces and controllers. There is, 
thus, an obvious parallel with the familiar analogue 
computer/controller. Its digital character allows certain 
advantages over analogue processors to be readily realised, such 
as integration with respect to variables other than time, 
unlimited precision which can be traded with speed, and good 
low-frequency stability.5 Further, it can be simply 
interfaced with a general-purpose digital computer (g.p.d.c.) to 
form a hybrid system.
However, because the performance attainable with a d.d.a. 
is bought by sacrificing a general-purpose computing facility, 
it must achieve a considerable advantage over the g.p.d.c., 
in what it can do, to be justifiable. (Occasionally, the sheer 
simplicity of d.d.a. techniques can be a justification in itself, 
e.g. for c.r.t. vector generation.)
Basically, a digital integrator realises the solution of the 
1st-order difference equation
$$\Delta Z = Y\,\Delta X$$
This is a substitute for the 1st-order differential equation 
$$y' = f(y, x)$$
The g.p.d.c. is usually a sequential processor, with 
centralised storage, and hence the logic architecture invariably 
allows whole-number representations of variables and 
whole-number transfer of information between 'integrators'. On 
the other hand, d.d.a.s often have separate integrators for 
each derivative of a problem so that all the derivatives for 
a solution step can be computed at the same time. This is 
one way in which the d.d.a. can gain its speed advantage over 
the g.p.d.c. However, because the number of wires 
connecting integrators is very high, simultaneous routing of 
signals poses a severe technology problem.
Paper 6624 E, received 11th August 1971
Mr. Bywater is with the Department of Electronic & 
Electrical Engineering, University of Surrey, Guildford, 
Surrey, England
To realise the full potentialities of parallel processing, this 
signal routing must be accomplished in a minimum of time. 
Various systems have been proposed, using both patchboards3 
and logic systems.7 The latter have the advantage of being 
electrically programmable, and cause fewer signal 
discontinuities and crosstalk effects when high-speed logic families 
are used. However, both would be difficult, or impossible, to 
implement if other than very simple information-transfer 
systems were needed.
It is clear that both digital computer and d.d.a. solutions to 
problems are subject to similar types of errors, namely 
roundoff and truncation.6 Both errors are due to signal 
quantisation and describe those pertaining to the dependent and 
independent variables, respectively.
Roundoff errors reflect the 'coarseness' of information 
transmitted by integrators; i.e. the fewer bits involved in carrying 
the integral, the higher the error at any solution step. Thus, 
the integral consists of two parts: the most significant part 
(ΔZ), which is transmitted, and a less significant part, which 
is retained and accumulated with subsequent integral 
increments. This leads to the following system of equations for 
Euler's (rectangles-summation) method:
$$\begin{aligned} Y_{(i)} &= Y_{(i-1)} + \Sigma\Delta Y_{(i)} \\ \Delta Z_{(i)} + R_{(i)} &= R_{(i-1)} + Y_{(i)}\,\Delta X_{(i)} \end{aligned} \tag{1}$$
Fig. 1A indicates the register structure for implementing 
eqn. 1. Thus ΔZ(i), as a representation of integral, may be 
in error by an amount e(i) at any iteration, where
$$0 \le e_{(i)} < +1 \tag{2}$$
if roundoff is downwards, or
$$-\tfrac{1}{2} \le e_{(i)} < +\tfrac{1}{2}$$
if symmetrical roundoff is used. Clearly, from eqn. 1, the 
latter can be arranged by setting R(0) to ½(ΔZ), at the start
[Fig. 1A: Basic d.d.a.-register structure (input ΣΔY, Y register, ΔZ and R registers)]
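The register transfers of eqn. 1 can be illustrated with a short integer model. This is a minimal sketch rather than the hardware described in the paper: the function name `run_integrator`, its arguments, and the use of floor division to split the sum into ΔZ and R are assumptions made for illustration. Y is held as an integer whose least significant bit has the same weight as that of an n-bit R register.

```python
def run_integrator(y0, dy_inputs, dx_inputs, n_bits, r0=0):
    """Euler (rectangles-summation) integrator of eqn. 1 in integer arithmetic."""
    scale = 1 << n_bits          # one dZ quantum, in units of the R l.s.b.
    y, r = y0, r0
    dz_outputs = []
    for sum_dy, dx in zip(dy_inputs, dx_inputs):
        y += sum_dy              # Y(i) = Y(i-1) + sum of dY(i)
        total = r + y * dx       # R(i-1) + Y(i) * dX(i)
        dz = total // scale      # transmitted part dZ(i), rounded downwards
        r = total - dz * scale   # retained residue R(i), 0 <= R < 2**n_bits
        dz_outputs.append(dz)
    return dz_outputs, r


if __name__ == "__main__":
    dys = [3, -1, 2, 0, 5] * 20
    dxs = [1] * len(dys)
    # r0 = 0 rounds downwards; r0 = half a quantum gives symmetrical roundoff
    for r0 in (0, 1 << 5):
        dzs, r = run_integrator(37, dys, dxs, n_bits=6, r0=r0)
        print(r0, sum(dzs), r)
```

With `r0 = 0` the accumulated output falls short of the exact sum by R/2^n, which lies in the range 0 to 1 quantum; presetting R to half a quantum shifts that range to between -1/2 and +1/2.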
PROC. IEE, Vol. 119, No. 2, FEBRUARY 1972
registers. This reduces the solution speed at the same rate 
as it reduces the per-step roundoff error. This is clearly 
unacceptable in many circumstances, although it does, in 
Euler's method, similarly reduce the truncation error.
(b) Transmit a greater proportion of the integral; i.e. 
include more bits of lesser significance. McGhee and Nilsen8 
have shown that a considerable number of bits need to be 
included in ΔZ, if the higher-order algorithms (e.g. Heun) 
are to be justified. It has been suggested that up to half the 
R register should be transmitted in some cases, for clearly 
there is no point in reducing the truncation errors in a 
solution, if the roundoff errors are dominant. In a totally-parallel 
machine, this seems to present an almost intractable 
interconnection problem, as the number of wires involved 
is the product of the number of interconnection paths and 
the bits in each integral transmitted. For a 16-integrator 
16 bit machine, about 4000 wires would be needed. One 
alternative, that is data serialisation, would normally carry 
an unacceptable time penalty. (Sequentially-processed 
machines, or those using only serial arithmetic do not, of 
course, encounter this problem).
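The wire count quoted above follows from a simple product. As a rough check, assuming a full crossbar in which each of N integrator outputs can reach each of N integrator inputs (the N² path count is an assumption, not stated in the text), a hypothetical helper `wire_count` gives:

```python
def wire_count(n_integrators, bits_per_transfer):
    """Wires in a fully parallel matrix: paths times bits per transmitted integral."""
    paths = n_integrators * n_integrators   # assumed: every output to every input
    return paths * bits_per_transfer


print(wire_count(16, 16))   # 4096, i.e. "about 4000" wires
print(wire_count(16, 2))    # 512 for a 2-wire (ternary) transfer
```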
The subject of this paper is a solution to this problem 
involving the use of 2nd-order differences, i.e. the differences 
between successive integral increments are transmitted, 
rather than the increments themselves. The advantages 
gained are a reduction in the number of wires needed for 
interconnecting integrators and removal of the large, full 
multiplier used in conventional multibit machines. The 
design and performance limitations of such a scheme will 
be discussed in the context of the more popular integration 
algorithms in current use.
2 2ND-ORDER-INCREMENT METHOD
The 2nd-order-increment method is based on sending 
changes in integral increment Δ²Z between integrators, 
rather than the true increments ΔZ.
If the sampling rate for a digital integrator is fast compared 
with the rates of change of the variables involved, the 
Y-register contents will not change significantly between 
solution steps (iterations). Consequently, successive values of 
ΔZ will also only differ by small amounts. This argument 
applies to both ternary- and multibit-transfer systems, 
although it is evident that small changes in Y will be reflected 
earlier in ΔZ in integrators with longer ΔZ registers.
The basic units, for an integrator using the 2nd-order 
difference approach, are shown in Fig. 1B. It will be noted that 
familiar registers, such as those containing Y(i), ΣΔY(i), have 
now been replaced by temporary stores for subproducts. 
These stores are little different from the original ones in 
terms of length, but have a totally different function.
The operation of the integrator is, in some respects, similar 
to that of a conventional integrator. The differences are 
mainly:
(a) The input and output must be modified to accept and 
generate 2nd-order differences of variables. Thus 2nd-order 
inputs (ΣΔ²Y) must be accumulated to form ΣΔY before the 
Y register can be updated; and, at the output, differences of 
successive integral-increments (Δ²Z) must be formed from 
the integral.
(b) In order to scale the integrator output, all variables 
Y, ΣΔY etc. must be changed to KY, KΣΔY, where K is 
effectively a potentiometer fraction. By doing the scaling inside 
the integrator, and in particular, performing the 
multiplication by K on ΣΔ²Y, considerable hardware savings are 
realised. It is evident that the size of the multiplier forming KΣΔ²Y 
will depend on the constraints imposed on the 2nd-order 
differences and the integrator fan-in. If only ternary transfers 
are allowed, the product will take three values (-j, 0, +j), 
where j is the fan-in. If, as is often the case, j equals 1, the 
multiplier is avoided; otherwise, j is usually not greater than 
10. Thus, a single-row adder may be used in a 'two-bits at a 
time' multiplication scheme, as shown in Fig. 1C. 
Alternatively, for low-fan-in integrators, it may be economically 
worthwhile to form all the products (j, K), and store them in 
a random-access memory (r.a.m.).
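One way to picture the 'two-bits at a time' scheme is as a single addition of two selected multiples of K. The sketch below is an illustration only: it assumes the selections labelled in Fig. 1C (0, ±K, ±2K on one adder input and 0, ±4K, ±8K on the other), and the names `select_terms` and `scaled_sum` are hypothetical.

```python
LEFT_CHOICES = (0, 1, -1, 2, -2)     # assumed l.h. selection: 0, +/-K, +/-2K
RIGHT_CHOICES = (0, 4, -4, 8, -8)    # assumed r.h. selection: 0, +/-4K, +/-8K


def select_terms(j):
    """Split the multiplier j into one left-hand and one right-hand selection."""
    for right in RIGHT_CHOICES:
        left = j - right
        if left in LEFT_CHOICES:
            return left, right
    raise ValueError("multiplier outside the range covered by one addition")


def scaled_sum(k, j):
    """Form K*j with a single addition of two selected multiples of K."""
    left, right = select_terms(j)
    return left * k + right * k


# Every multiplier from -10 to +10 is reachable with one addition.
print(all(scaled_sum(7, j) == 7 * j for j in range(-10, 11)))
```

Under these assumptions a fan-in of up to 10 is covered without a full multiplier, which is consistent with the limit on j mentioned in the text.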
[Fig. 1B: Scaled 2nd-order-difference integrator ('patching' matrix, algorithm matrix, KΣΔ²Y multiplier, KY(i-1) store, optional interruption output, ΔZ)]
[Fig. 1C: ΣΔ²Y by K multiplier (l.h. selection ±K, ±2K; r.h. selection ±4K, ±8K; single-row adder)]
Having secured the product KΣΔ²Y, it is an obvious step to 
modify the hitherto ΣΔY and Y registers to contain KΣΔY 
and KY, respectively, thus reducing their update to mere 
addition; i.e.
$$\begin{aligned} K\Sigma\Delta Y_{(i)} &= K\Sigma\Delta Y_{(i-1)} + K\Sigma\Delta^2 Y_{(i)} \\ KY_{(i)} &= KY_{(i-1)} + K\Sigma\Delta Y_{(i)} \end{aligned} \tag{4}$$
The values of KY on the first three iterations of a solution 
are
$$KY_{(0)} \qquad \{\Sigma\Delta Y_{(0)} = 0\} \tag{5a}$$
$$KY_{(1)} = KY_{(0)} + K\Sigma\Delta Y_{(1)} = KY_{(0)} + K\Sigma\Delta^2 Y_{(1)} \tag{5b}$$
[KΣΔ²Y(1) to be formed in the multiplier]
$$KY_{(2)} = K\{Y_{(1)} + \Sigma\Delta Y_{(2)}\} = K\{Y_{(0)} + \Sigma\Delta Y_{(1)} + \Sigma\Delta Y_{(2)}\} = KY_{(0)} + 2K\Sigma\Delta Y_{(0)} + 2K\Sigma\Delta^2 Y_{(1)} + K\Sigma\Delta^2 Y_{(2)} \tag{5c}$$
[KΣΔ²Y(2) to be formed in the multiplier]
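The update of eqn. 4 can be checked numerically. The following is a minimal sketch under stated assumptions: the helper name `scaled_y_history` is hypothetical, and K is held as an exact fraction so that the comparison is free of floating-point rounding. The only product formed on each iteration is K times the incoming 2nd-order difference sum; everything else is addition.

```python
from fractions import Fraction


def scaled_y_history(k, y0, d2y_sums, sum_dy0=0):
    """Return KY(0), KY(1), ... using only additions after forming K * d2Y(i)."""
    k_sum_dy = k * sum_dy0       # K * sum of dY at iteration 0
    k_y = k * y0                 # KY(0)
    history = [k_y]
    for d2 in d2y_sums:
        k_sum_dy += k * d2       # the only product needed on each iteration
        k_y += k_sum_dy          # KY(i) = KY(i-1) + K * sum of dY(i)
        history.append(k_y)
    return history


k = Fraction(3, 8)
hist = scaled_y_history(k, y0=100, d2y_sums=[1, -1])
# With sum dY(0) = 0: KY(2) = KY(0) + 2*K*d2Y(1) + K*d2Y(2)
print(hist[2] == k * 100 + 2 * k * 1 + k * (-1))   # True
```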
Since the quantity transmitted is not an increment 
of integral, but a difference, it is not necessary to store those 
bits of KY which are more significant than the most 
significant bit of the R register. This is because 2nd-order changes 
of integral are completely defined by the most significant 
bits and overflows from the fractional portion of the KY and 
R registers (see Appendix 7). This allows the KY register 
to be reduced to contain F[KY], typically a 50% reduction 
in register length. Furthermore, the adders which form 
F[KY(i)] and R(i) need only be a similar length.
The interconnection matrix may take any form, from a totally 
time-domain-oriented selection using, say, shift registers, 
through partial time and space systems,7 to a totally parallel 
’crossbar'-type matrix. The form is unimportant from the 
viewpoint of this paper, as time selection will inevitably 
terminate in some form of increment accumulator or a space 
selection in a tree of adders followed by a register.
The weighting matrix is also of secondary importance to this 
paper, and it is only shown for its position in the logic layout. 
The matrix may be implemented quite readily for the Adams- 
Bashforth algorithms, as suggested by Bywater.9
It is shown in the following Section that 2nd-order 
differences, greater than one, can occur under certain 
circumstances. These may be catered for in the logic by one of 
two methods:
(a) Interruption: If, for any integrator, |Δ²Z| > 1, an 
interruption can be sent to the control unit, which will initiate an 
extra, dummy, iteration. This will, in effect, allow the extra 
K{ΣΔY(i) + ΣΔ²Y(i+1)} addition(s) to take place. If the d.d.a. 
is working in a nonreal-time mode, such as for general 
problem solving, this extra computation delay will be of little 
importance. It will merely be removed from time plots etc.
If real-time process-control work is being executed, then a 
'performance margin' must be left, and the d.d.a. allowed to 
run a little ahead of the process. If the d.d.a. is either 
coupled to a g.p.d.c., or has a suitable buffer store, this margin 
may be accounted for and controlled.
(b) Multibit transfers: The author considers that, for 
real-time work, the addition of a busbar to carry Δ²Z = ±2 is 
preferable to using 'interruption' methods. This is because 
the frequency of such interruptions is somewhat 
unpredictable. The logic in an integrator must, of necessity, already 
exist to detect such differences, whichever method is used. 
There is thus no great increase in integrator cost to 
implement this alternative scheme.
3 DYNAMIC RANGE OF 2ND-ORDER DIFFERENCES
Although an integrator of the type described could be scaled 
so that 2nd-order differences (of +1, -1, 0) only, could 
occur, this would be wasteful of the Y registers' dynamic 
ranges. Thus, larger differences should be catered for. This 
requires a knowledge of the likely occurrence of such 
differences under various conditions. The manner in which these 
cases are handled will depend on the results of such an 
analysis.
3.1 Integration with respect to time
Here, the independent variable may be taken to be regular, 
unchanging increments, scaled by some predetermined factor. 
If the basic integrator-register structure is as in Fig. 1A, 
with relative significances of 'words' as illustrated, the 
worst-case positive change of ΔZ, i.e.
$$\Delta^2 Z_{(i)} = \{\Delta Z_{(i)} - \Delta Z_{(i-1)}\}$$
will occur for
$$\begin{aligned} R &\approx R_{(\max)} = 2^{(n-1)} - 1 \\ \Sigma\Delta Y &= 2^{(m-1)} - 1 \\ Y &= K(2^{p-1} - 1) \\ 0 &\le K < 1 \end{aligned} \tag{6}$$
where K is a scaling factor (potentiometer fraction).
$$\Delta^2 Z \begin{cases} = 1\ (\max) & \text{for } \Sigma\Delta Y < 2 \\ = -2\ (\max) & \text{for } \Sigma\Delta Y \le -2KY < 0 \end{cases} \tag{7}$$
Further, if ΣΔY = 0, i.e. Y = constant,
$$\Delta^2 Z \in (+1, 0, -1)$$
only.
In practice, for n > m, |Δ²Z| will rarely exceed 1. Thus a 
2nd-order difference-integrator can be realised using a 
ternary-transfer system, with some additional provision for 
the exceptional excursions of Δ²Z.
The statistics surrounding the occurrence of |Δ²Z| > 1 
depend greatly on the application to which the integrator is put, 
the extent to which use is made of the dynamic range of the 
Y register, and the fan-in to the integrator.
As a guide, if the Y register is linearly scanned through its 
complete range from -2^(p-1) to 2^(p-1) - 1, using the 
maximum allowable value for ΣΔY, i.e. 2^(m-1) - 1 throughout, 
then ΔZ will traverse its complete range in the shortest 
possible time. This will naturally maximise the 
opportunities for Δ²Z greater than 1 to occur. Table 1 shows the 
frequency f (%) of such occurrences for some sample values 
of p, n, m. The theoretical probabilities g (%) are also 
indicated.
TABLE 1
OCCURRENCE OF |Δ²Z| > 1 FOR INTEGRATOR WITH MAXIMUM ΣΔY

Register lengths (bits)    Occurrence of |Δ²Z| > 1 (%)
Y     ΣΔY    R             f (measured)    g (calculated)
12    4      8             < 0.1           < 0.1
12    5      7             < 1.0           < 1.0
12    6      6             5.8             5.0
12    7      5             66              70
12    8      4             100             100
16    7      9             < 1.0           < 1.0
16    8      8             7.0             6.0
16    9      7             73              74
Appendix 7 indicates a simple method for determining this 
worst case, from a probability viewpoint. However, this is a 
very pessimistic estimate for practical purposes, but does 
show that, for n - m greater than about 2 or 3, these 
occurrences are quite rare.
A more realistic estimate may be obtained, by considering 
a typical integrator application, such as the 2nd-order 
differential equation
$$\ddot{x} + x = 0 \tag{8}$$
If this is set up with various initial conditions and register 
lengths, the results are as in Table 2, where f is averaged 
for two integrators over one complete cycle.
TABLE 2
MEASURED OCCURRENCE OF |Δ²Z| > 1 FOR SOLUTION OF ẍ + x = 0

Initial conditions    Register length (bits)    Occurrence (%)
x       ẋ             Y     R     ΣΔY
0.75    0.0           12    6     6             2.1
0.50    0.0           12    6     6             0.4
0.75    0.0           16    8     8             1.8
0.45    0.0           16    8     8             0.7
occur in the region n -  m.)
3.2 Integration with respect to arbitrary variables 
If the independent variable is not time; i.e.
$$z = \int y\,dx \quad\text{or}\quad Z = \Sigma(Y\,\Delta X)$$
a multiplication of two variables is involved, both of which 
may change at any time during computation. Whatever 
variations are found to be possible for a given ΔZ, these must, 
for consistency, be allowable for ΔX, as the latter is derived 
from the output of some similar integrator.
If the registers Y, ΔX are taken to be fractional [-1 ≤ 
(Y, ΔX) < +1], so that their relative significances always 
allow ΔZ to lie in the same range, then it may be seen that 
Δ²Z can vary greatly in the worst case. Thus ΔZ depends 
heavily on ΔX (and hence Δ²Z on Δ²X). In comparison, ΣΔY 
tends only to have a 2nd-order effect. The maximum 
dependence occurs for Y = ±1, whereupon ΔZ = ΔX and 
Δ²Z = Δ²X. Hence, if a given integrator is 'patched', with 
its ΔZ output connected to its own ΔX input, any value of 
Δ²Z is possible within the dynamic range of the ΔZ and ΔX 
registers.
However, the rate of swing of such an output can be 
controlled, by suitable scaling, to yield reasonable sampling and 
solution rates.
The frequency with which |Δ²Z| might be expected to exceed 
1 need not therefore differ significantly from the 
time-integrator case.
4 CONCLUSION
It has been indicated that a limit is imposed, by simple (e.g. 
ternary) information-transfer systems, on the usefulness of 
the higher-order integration algorithms, such as 
Adams-Bashforth, due to roundoff-error accumulation.
The system makes use of the fact that 2nd-order differences 
Δ² associated with a d.d.a. solution are generally 'well 
conditioned', due to the high sampling frequency, or iteration 
rate, that may be achieved. This allows the ternary 
technique, with a few modifications, to be employed in the 
transmission of Δ² rather than Δ information. The system is well 
behaved, up to a limit where the effective lengths of Δ 
registers approach half that of the whole-number Y registers.
The use of 2nd-order-increment methods has allowed 
expensive full-field multipliers to be avoided. (The technique of 
scaling differences could be applied equally well to 
integrators employing only 1st-order transfers.)
(Springer, Wien. 1961), pp. 139-209
5 SIZER, T. R. H. (Ed.): 'The digital differential analyser' 
(Chapman & Hall, 1968)
6 GILBERT, E. G.: 'Dynamic error analysis of digital and 
combined analog-digital computer systems', Simulation, 
April 1966, pp. 241-257
7 HYATT, G. P., and OHLBERG, G.: 'Electrically alterable 
digital differential analyser'. Proceedings of the 
AFIPS joint computing conference, Spring 1968, 
pp. 161-169
8 McGHEE, R. B., and NILSEN, R. N.: 'The extended 
resolution digital differential analyser: a new computing 
structure for solving differential equations', IEEE Trans., 
1970, C-19, pp. 1-9
9 BYWATER, R. E. H.: 'One-step integration method for 
digital differential analysers', Electron. Lett., 1970, 19, 
pp. 613-614
10 BYWATER, R. E. H.: 'Digital integrator design 
incorporating an output scaler', Automatica, 1971, 7, 
pp. 735-739
7 APPENDIX
Probability treatment of worst-case integrator operation
A worst-case analysis may be carried out, assuming that the 
maximum possible value of ΣΔY is maintained for a 
monotonic swing of Y from -Y(max) to +Y(max). By so doing, 
the maximum rate of change of Y is realised and hence the 
greatest chance of obtaining values of |Δ²Z| greater than unity. 
If the value of Y is taken as a mixed number, with the integer 
and fraction portions coincident with the significances of the ΔZ 
and R registers, respectively, then all ΔZs are integral, and 
R lies in the range
$$0 \le R \le +(1 - 2^{-n})$$
where n is the number of bits in the R register. (It is assumed 
that Y, ΔY and ΔZ are all signed quantities using 
two's-complement notation.)
Thus, for this analysis, it is only necessary, over two 
iterations, to consider the behaviour of the R register, and the 
'fractional portion' F[Y] of Y, for a given value of ΣΔY. The 
latter will, naturally, have only fractional significance unless 
the length of the ΔZ register (and hence ΣΔY) is greater 
than that of R. If the weighted sums of the Y, R, ΔY fractions 
can, for a given pair of iterations, have integral portions 
differing by more than one quantum, then |Δ²Z| > 1.
If, for simplicity, Euler's algorithm is chosen, the register 
transfers are
$$\begin{aligned} Y_{(i)} &= Y_{(i-1)} + \Sigma\Delta Y_{(i)} \\ \Delta Z_{(i)} + R_{(i)} &= R_{(i-1)} + Y_{(i)}\,\Delta t_{(i)} \end{aligned} \tag{9}$$
and comparison is being sought between I{F[Y(i)] + R(i-1)} 
and I{F[Y(i+1)] + R(i)}, i.e. between 
I{F[Y(i)] + R(i-1)} and I{F[Y(i)] + ΣΔY(i+1) + R(i)}.
If other algorithms are used, or other integration methods,7,9 
suitable weighting factors must be appended to the contents 
of each register. However, as most other algorithms would 
only provide 2nd- or lesser-order changes to the contents of 
either the Y or R registers, little difference will be 
encountered in the results.
F[Y(i)] and R(i-1) are assumed to have an equal chance of 
any value in the range 0 to +(1 - 2^-n). The further addition 
to the fractional portion of the sum of these two fractions of 
F[Y(i)] + ΣΔY(i+1) may produce a sum with a different 
integral portion. However, to compute the chances of the integral 
portion changing, as distinct from the two sums differing by 
more than 1, is not easy. However, the situation may be 
shown graphically using a 'layer cake' type of diagram for 
any value of ΣΔY, including values greater than 1 (Fig. 2a).
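The comparison described above can also be evaluated by direct enumeration rather than graphically. The following sketch assumes that F[Y(i)] and R(i-1) are n-bit fractions with every value equally likely, and that ΣΔY(i+1) is supplied in units of the R-register least significant bit; the function name and this discretisation are assumptions for illustration, not the author's procedure.

```python
def fraction_exceeding_one(n_bits, sum_dy_units):
    """Fraction of (F[Y], R) pairs whose integer portions differ by more than 1."""
    scale = 1 << n_bits
    hits = 0
    for f in range(scale):                       # F[Y(i)] = f / scale
        for r in range(scale):                   # R(i-1) = r / scale
            first = f + r                        # F[Y(i)] + R(i-1)
            i_first = first // scale             # integer portion of first sum
            r_next = first % scale               # residue carried forward, R(i)
            second = f + sum_dy_units + r_next   # F[Y(i)] + sum dY(i+1) + R(i)
            i_second = second // scale           # integer portion of second sum
            if abs(i_second - i_first) > 1:
                hits += 1
    return hits / (scale * scale)


# Half a quantum of sum dY with a 6-bit R register
print(fraction_exceeding_one(6, 32))   # 0.0625
```

For small ΣΔY the enumerated fraction falls quickly towards zero, in line with the observation that such occurrences are rare when n exceeds m by a few bits. The figures need not reproduce Table 1 exactly, since the table's scaling conventions are not restated here.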
The limitations of the system lie in a slightly variable real­
time solution rate due to the occasional interruptions in 
computation (if a ternary system is used), and a slightly 
greater logic cost. Objections to the latter will almost cer­
tainly fade as l.s.i. techniques become more established.
The small number of leadouts from each integrator, owing 
to the retention of a 2- or 3-wire transfer system, makes 
current lead frames quite adequate.
5 ACKNOWLEDGMENTS
Particular thanks are given to Prof. W. F. Lovering for his 
help and many valuable suggestions, and to Prof. D. R. Chick 
for supporting this work.
6 REFERENCES
1 HATVANY, J.: 'The d.d.a. as the iterative module of a 
variable structure process control computer', 
Automatica, 1969, 5, pp. 41-49
2 DORAN, J. F.: 'The serial memory d.d.a.', Math. 
Tables & Other Aids to Comput., 1952, 6, p. 102
A  digital-integrator design is proposed, which permits the 
use of 'extended' resolution techniques for increasing the 
speed-accuracy product of solutions, without precluding the 
use of an economic, all electronic, interconnection facility.
representing the subtotals for various values of F[Y(i)] and 
R(i-1). As the latter two can take any fractional value, with 
an equal chance of occurrence, any increment of area, in the 
plan view of the solid, has an equal chance of being initially 
encountered.
axis. For Δ²Z = (0,1) a composite curve is shown, hence 
its different form from that of the others.
[Fig. 2: R-register behaviour over two iterations]
[Fig. 3: Plan view of Fig. 2, one panel per value of ΣΔY; shading distinguishes Δ²Z = 2, Δ²Z = 3 and Δ²Z = 4]
The first sum F[Y(i)] + R(i-1) appears as an oblique slice, 
as in Fig. 2b. Removal of the integer portion modifies this 
solid to that of Fig. 2c.
The next increment {F[Y(i)] + ΣΔY(i+1)} is a 'wedge' of 
height F[Y(i)], plus a cuboid of height ΣΔY(i+1), as shown in 
Fig. 2d. The total is thus a stack of these solids, suitably 
distorted in the vertical axis to fit surface-to-surface.
In plan, this becomes, for say, ΣΔY = ¼, ½, 1, 1½, 2½, as in 
Figs. 3a-3e, respectively. The percentage area of each square, 
shaded, corresponds to the probability g of Δ²Z = 2, 3, 4 
etc. The maximum value |Δ²Z| can take is
$$\begin{aligned} &\{\Sigma\Delta Y_{(\max)} + 1 - 2^{-(n-1)}\} && \Sigma\Delta Y > 0 \\ \text{or}\quad &-\{\Sigma\Delta Y_{(\max)} - 1\} && \Sigma\Delta Y < 0 \end{aligned} \tag{10}$$
[Fig. 4: Worst-case 2nd-order-difference-integrator operation (occurrence, %, plotted against ΣΔY; curves for the total and for I{F[Y(i)] + R(i-1)})]
APPENDIX 2
DIGITAL INTEGRATION ALGORITHMS AND THEIR ERRORS
"'I could have done it in a much more complicated way' 
said the Red Queen, immensely proud."
Lewis Carroll
CONTENTS
1.0 INTRODUCTION
1.1 General
1.2 Scope of analysis
2.0 ELEMENTARY METHODS OF INTEGRATION
2.1 Euler's prediction
2.2 Trapezoidal prediction
3.0 ERROR COMPONENTS IN NUMERICAL INTEGRATION
3.1 Rounding errors
3.2 Truncation errors
3.3 Inherited errors
4.0 IMPLEMENTATION BY THE USE OF CURVE FITTING
4.1 Simple correction techniques
4.2 Higher order algorithms
4.3 Single step methods
4.4 Practical notes on implementation
5.0 SEQUENTIALLY PROCESSED INTEGRATORS
6.0 COMPARISON OF INTEGRATION METHODS FOR SELECTED PROBLEMS
LIST OF SYMBOLS
f(t)     Continuous function of time which will be sampled for 
         digital integration.
ΔZ, R    Most and least significant portions of a digital integral. 
         ΔZ is outputted to other machine elements.
Yn       Sampled integrand at time n.
Y'       A difference variable, i.e. a forward, backward or 
         central difference operation on Y.
Δt       Sampling interval for Y.
h        Arbitrary variable interval associated with a continuous 
         function.
Other symbols are defined in the text.
1.0 INTRODUCTION
1.1 General
The process of numerical integration (28-29), whether implemented 
by hand, general purpose calculator or special purpose digital 
integrator, is subject to several forms of error (30, 31). All of 
these result in errors when applied to the solution of problems on 
a computer, and must therefore be reduced to an acceptably low 
level. Furthermore, a knowledge of their nature, predictable or 
otherwise, must be sought in order to achieve a certain solution 
accuracy over the required interval. A table of results for various 
integrator forms is given at the end of this Appendix.
A digital integrator can be represented by:-
m = n - 1  , ,
r = z Y + Y .
n m  n
m  = o
AZ + R - f Y , Y
q q 
Y (q+1 )’ Y (q+D (q+1 )
n
q 0  
p >_ 0
0 > R > AZ
at time t .
n
It is the number of terms taken in the second identity that largely 
defines the precision of the integration method used. Y(q) will 
be defined either in terms of backward (∇), forward (Δ), or 
central (δ) differences, where
$$\begin{aligned} \nabla f(t) &= f(t) - f(t-h) \\ \Delta f(t) &= f(t+h) - f(t) \\ \delta f(t) &= f(t+h/2) - f(t-h/2) \end{aligned}$$
where h is the interval 
(dt) over which the iteration takes place.
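The three difference operators defined above translate directly into code. This is a small illustrative sketch; the function names are hypothetical, and f(t) = t² is chosen only because the results are easy to check by hand.

```python
def backward_diff(f, t, h):
    """Backward difference: f(t) - f(t - h)."""
    return f(t) - f(t - h)


def forward_diff(f, t, h):
    """Forward difference: f(t + h) - f(t)."""
    return f(t + h) - f(t)


def central_diff(f, t, h):
    """Central difference: f(t + h/2) - f(t - h/2)."""
    return f(t + h / 2) - f(t - h / 2)


def square(t):
    return t * t


t, h = 2.0, 0.5
print(backward_diff(square, t, h))   # 4 - 2.25 = 1.75
print(forward_diff(square, t, h))    # 6.25 - 4 = 2.25
print(central_diff(square, t, h))    # 5.0625 - 3.0625 = 2.0
```

For this quadratic the central difference equals h times the derivative at t exactly, while the backward and forward differences are offset by -h² and +h² respectively.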
1.2 Scope of Analysis
For the purposes of the present discussion, the following 
assumptions will be made about the integrator structure:
(i) Simultaneously Operating Integrators.
The evaluation of a solution through an interval 
t = tn to (tn + Δt) requires all machine elements to 
be operated once. Therefore every integrator is 
predicting the course of events from tn to (tn + Δt), 
as no information exists about the solution during 
this interval.
(ii) The Interval 'h' (= Δt) will be Assumed Constant.
This rules out integration methods such as that due 
to Gauss.
2.0 ELEMENTARY METHODS OF INTEGRATION
If a curve y = f(t) is to be integrated over the interval t0 to tn,
then at any point t2, the point t3, where t3 = t2 + h, may be
deduced, say, from Taylor's series by:-
$$y(t_3) = y(t_2 + h) = y(t_2) + h\,y'(t_2) + \frac{h^2}{2!}\,y''(t_2) + \dots \text{ higher powers of } h \tag{4}$$
2.1 Euler's Method
If the integration method is Eulerian, the first term only would be 
used: if trapezoidal, the first two: if paraboloidal, the first 
three, etc.
If the interval 'h' is small and the curve 'well-conditioned', it 
is usually sufficient to calculate the error induced, through 
taking only 'n' terms, by evaluating the component due to the 
missing (n+1)th term. Thus the Eulerian method has an error 
approximately equal to h.y'(t2) in the interval t2 to t3.
2.2 Trapezoidal Prediction
An interesting manoeuvre can be effected by launching a tangent 
(slope = y'(t2)) from t2. Unless y^(n)(t2) = 0 where n ≥ 2, this is
obviously going to give a poor prediction (on average, not much 
better than Euler). A better prediction may be obtained by launching 
a chord from t1, of slope y'(t2), over two intervals to t3. 
Graphically this looks like:-
[Figure: chord of slope y'(t2) launched from t1 over two intervals to t3]
This may also be obtained by developing Taylor's series.
Let $t_2 = n$ and $t_3 - t_2 = t_2 - t_1 = h$. Therefore equation (4)
becomes

$$y_{(n+1)} = y_{(n)} + h\,y'_{(n)} + \frac{h^2}{2!}\,y''_{(n)} + \frac{h^3}{3!}\,y'''_{(n)} + \dots \qquad (5)$$

similarly

$$y_{(n)} = y_{(n-1)} + h\,y'_{(n-1)} + \frac{h^2}{2!}\,y''_{(n-1)} + \frac{h^3}{3!}\,y'''_{(n-1)} + \dots \qquad (6)$$

Differentiating equation (6) m times yields

$$y'_{(n)} = y'_{(n-1)} + h\,y''_{(n-1)} + \frac{h^2}{2!}\,y'''_{(n-1)} + \dots \qquad (7)$$

$$y^{m}_{(n)} = y^{m}_{(n-1)} + h\,y^{m+1}_{(n-1)} + \frac{h^2}{2!}\,y^{m+2}_{(n-1)} + \frac{h^3}{3!}\,y^{m+3}_{(n-1)} + \dots \qquad (8)$$

substituting equation (8) in (5) and (6) yields:-

$$y_{(n+1)} = y_{(n-1)} + 2h\,y'_{(n)} + \frac{h^3}{3}\,y'''_{(n)} + \dots \qquad (9)$$

The third term error now involves ($h^3$) in lieu of ($h^2$) and is
therefore smaller (if h is small). This assumes that there are no
singularities in y = f(t).
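The difference between the tangent and the chord prediction can be checked numerically. The sketch below is illustrative only: it assumes y = sin(t), whose slope is known exactly, and the function names are hypothetical.

```python
import math


def tangent_prediction(y_n, slope_n, h):
    """Predict y(n+1) by launching a tangent from the current point."""
    return y_n + h * slope_n


def chord_prediction(y_prev, slope_n, h):
    """Predict y(n+1) by launching a chord from y(n-1) over two intervals."""
    return y_prev + 2.0 * h * slope_n


def prediction_errors(h, t=1.0):
    """Return (tangent error, chord error) when predicting sin(t + h)."""
    exact = math.sin(t + h)
    tan_err = abs(tangent_prediction(math.sin(t), math.cos(t), h) - exact)
    chord_err = abs(chord_prediction(math.sin(t - h), math.cos(t), h) - exact)
    return tan_err, chord_err


if __name__ == "__main__":
    for h in (0.1, 0.05):
        print(h, prediction_errors(h))
```

Halving h should reduce the tangent error by a factor of about four and the chord error by a factor of about eight, in line with the $h^2$ and $h^3$ leading terms.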
3.0 ERROR COMPONENTS IN NUMERICAL INTEGRATION
When equations 1 and 2 are implemented, three types of error evolve.
3.1 Rounding Errors
When a quantity is represented in a radix 'r' to a finite number
of places 'n', then an error e will, in general, accompany it where

$$|e| < \tfrac{1}{2}\,r^{-n} \qquad (10)$$

Even this assumes that rounding up and down is possible. If only
rounding down is allowed (a frequent constraint on a computer
implementation), the error lies in the range

$$0 \le e < r^{-n}$$

Both of these forms are unpredictable and hence non-linear. However,
the latter is biased, ie has a mean value of $\tfrac{1}{2}\,r^{-n}$, and therefore
causes the integral to drift inexorably in a given direction (for a
given sign of integrand slope). The former is unbiased, ie has a
zero mean value, and is therefore less troublesome. Together with
the influence of the other errors, it is still unpredictable so that
worst case or maybe stochastic error analysis is the only way to
keep track of its build up during a problem.
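The bias of rounding down, and its absence in symmetric rounding, can be illustrated with exact rational arithmetic. This is a sketch under stated assumptions: binary radix, an evenly spaced set of test values, and hypothetical function names.

```python
from fractions import Fraction
import math


def quantise_down(x, n):
    """Round x down to n binary places."""
    q = Fraction(1, 2 ** n)
    return math.floor(x / q) * q


def quantise_nearest(x, n):
    """Round x to the nearest multiple of 2**-n (halves round up)."""
    q = Fraction(1, 2 ** n)
    return math.floor(x / q + Fraction(1, 2)) * q


def mean_error(quantiser, n, samples):
    """Mean of (true value - quantised value) over the samples."""
    return sum(x - quantiser(x, n) for x in samples) / len(samples)


if __name__ == "__main__":
    samples = [Fraction(k, 1000) for k in range(1000)]
    print("round down :", float(mean_error(quantise_down, 4, samples)))
    print("round near :", float(mean_error(quantise_nearest, 4, samples)))
```

With four binary places the quantum is 1/16, so the rounded-down values should show a mean error close to 1/32 while the symmetrically rounded values should average close to zero.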
Round-off errors have two sources in a digital computer:-
(a) In the Initial Condition Representation
This is a most unfortunate place to introduce an error as it
will propagate itself throughout the entire problem time. The
error is merely a function of the number of places to which
quantities are represented in the integrand (Y) registers.
For most DDA's, this is between 12 and 24 bits, ie e lies in
the range
$$2^{-24} < |e| < 2^{-12}$$
(if $Y_0$ is assumed to lie in the range
$-1 \le Y \le 1 - 2^{-n}$),
where 'n' is the number of bits in the Y register.
(b) In the Integral Transmitted Between Integrators
Normally, the solution time for a problem is set by adjusting
a scale factor 'k' for $\Delta Z$, when transmitted. A downstream
integrator therefore receives $k\,\Delta Z$ and not $\Delta Z$ so that, strictly,
its integrand is updated thus:-
$$Y_{(n)} = Y_{(n-1)} + k\,\Delta Z_{(n)} \quad \text{where } 0 < k \le 1 \qquad (11)$$
'k' is often merely a shifting index (as this is simple to
implement) so that $\Delta Z$ is scaled down by $k = 2^{-m}$.
If k = 1 (m = 0) there is no rounding error provided that the
total integral is transmitted.
As m increases, so does the error, implying that longer
solution times incur greater rounding errors as implemented.
For k = 1:

[Figure: Integrator 1 (registers $Y_1$, $R_1$) feeding Integrator 2 (registers $Y_2$, $R_2$).]

Integrator 1: $\Delta Z_{(n)} = f(Y_{(n)}, Y_{(n-1)}, \text{etc})$
Integrator 2: $Y_{(n+1)} = Y_{(n)} + \Delta Z_{(n)}$, $\Delta Z_{(n+1)} = f(Y_{(n+1)}, Y_{(n)}, \text{etc})$
Thus equation 2 is modified to make R = 0, as the integral may be
transmitted in its entirety. In practice, this makes the solution
time for a problem too short (h large relative to $(t_n - t_0)$),
inducing unacceptably large truncation errors; see below. Thus m
is usually greater than 0, yielding:-
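[Figure: Integrator 1 feeding Integrator 2, with the transmitted increment scaled by $2^{-m}$.]

Integrator 1: $\Delta Z_{(n)} + R_{(n)} = R_{(n-1)} + f(Y_{(n)}, Y_{(n-1)}, \text{etc})$
Integrator 2: $Y_{(n+1)} = Y_{(n)} + 2^{-m}\,\Delta Z_{(n)}$, $\Delta Z_{(n+1)} + R_{(n+1)} = R_{(n)} + f(Y_{(n+1)}, \text{etc})$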
Although $R_1$ contains a portion of $\int_{t_0}^{t_n} Y_1\,dt$, it is not transmitted
due to lack of significance in the $Y_2$ register. Hence $\Delta Z_1$ is in
error by e where

$$0 \le e < 2^{-(n+1)} \quad \text{or} \quad 0 \le e < 2^{-n} \qquad (13)$$

depending on whether $\Delta Z$ is rounded symmetrically.
Thus $Y_2$ is in error by $e_y$ where:-

$$0 \le e_y < 2^{-(m+n+1)} \quad \text{or} \quad 0 \le e_y < 2^{-(n+m)}$$

as appropriate.
But (m+n) is a constant (equal to the number of bits in the Y
register). Hence, regardless of 'm', the error is the same (as
received by $Y_2$). The difference merely lies in how often an error
e is allowed to be introduced to $Y_2$. This is, of course, equal to
the number of iterations in the solution, ie $(t_n - t_0)/h$.
In general, an error of a given range introduced more times in a 
solution will build up a greater error. This applies both to 
biased and unbiased errors.
Rounding Errors - Worst Case
The most pessimistic situation is found (for +ve integrals) wherein
$\Delta Z + R = 1 - 2^{-m}$ where m is the number of fraction bits in Y (or R).
This yields the maximum rounding error as $\Delta Z = 0$ and R is full.
A vast number ($2^m$) of initial conditions can induce this rounding
error and hence the 'criterion of pessimism' lies in finding the
combination of these to do the most damage in a given concatenation
of machine elements.
Consider the conformation "below
It is evident that a stable or neutrally stable loop of integrators 
will not give the worst case as either (i) a decaying sinusoid, or 
(ii) a decaying monotonic function will be time limited in its 
ability to generate rounding errors. The former, in fact, because 
of its possible excursion through +ve and -ve y, will tend to 
induce a degree of error cancellation. The worst case is therefore 
to be found in a straight "chain" of integrators or an unstable 
loop such as y = ke3t.
First Consider a Chain.
Evidently if each integrator can, at all times, be coerced into
producing the maximum rounding error, the output of the chain will 
be worst case (for a chain).
For a ternary machine (rounded) the error is, worst case, 2 ^AZ . 
transmitted, ie ^_2  ^ (AZ = 1). Hence the error in Y ? is y_ 2 
after one iteration and if Y. *= 1 — 2 n , R. = 0 for all time
h o )  _  , u + p  h o )
then Yj will be in error by ^.m.2 ‘ ' after m  iterations. This
error in Y. will have its greatest (%) effect on Y^ if Y^ is mini­
mal (=0) and. R^ maximal ie 1 - 2  n . The rounding error from 
integrator (i) can now be judged by the delay in producing its 
first in relation to the time integral.
If the rounding error is asymmetric, effectively Y ^  - 0, =
0, R(i) = 1, R ^ y =  1. ie Y.dX should be a ramp (slope = 2 n) but 
the output is zero. A worst case analysis is therefore inappropriate 
because:
(a) it is unlikely to occur,unless problem scaling 
has been grossly in error.
(b) Rounding errors in any but trivial length problems 
will be virtually random, so that a worst case error 
of |.2 n per iteration is very pessimistic. For 
random errors j'e I < I (■? .2 n where m is the number of
r 4 3
iterations. This is a reasonable benchmark for such 
errors although about 5% of iterations will yield errors 
rising above ±j^ .2. t
A particular integrator structure of interest in the field of DDA’s 
is that in which V  = 1. This is commonly called the Ternary AZ 
system as $\Delta Z \in (+1, 0, -1)$.
Here, the error on $\Delta Z$, ie $|\Delta Z - \int y\,dt|$, lies between 0 and 100%
if $\Delta Z$ is unrounded (as is often the case) or ±50% if rounded. The
latter may be implemented by two methods (both are used).
(i) Set R initially to $\Delta Z/2$ in machines in which
R is constrained to lie in the range $0 \le R < 1$.
or
(ii) If $R_n \ge \tfrac{1}{2}$ set $\Delta Z = 1$ and $R_n$ to $R_n - 1$;
if $R_n < -\tfrac{1}{2}$ set $\Delta Z = -1$ and $R_n$ to $R_n + 1$.
This latter assumes R can assume +ve and -ve values.
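Method (ii) can be sketched as a single register update. The sketch is illustrative rather than a description of any particular machine: it assumes the integrand is scaled so that |y| does not exceed 1, uses exact fractions in place of a fixed point register, and the names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def ternary_step(r, y):
    """Add integrand y to residue r and emit dz in (+1, 0, -1).

    The overflow is pre-empted at plus or minus one half, so the residue
    left behind stays in the range -1/2 <= r < 1/2 (assuming |y| <= 1).
    """
    r = r + y
    if r >= HALF:
        return 1, r - 1
    if r < -HALF:
        return -1, r + 1
    return 0, r


def run(ys, r0=Fraction(0)):
    """Feed a sequence of integrand values; return (list of dz, final r)."""
    r, out = r0, []
    for y in ys:
        dz, r = ternary_step(r, y)
        out.append(dz)
    return out, r


if __name__ == "__main__":
    dzs, r = run([Fraction(3, 8)] * 16)
    print(dzs, r)
```

Whatever the input, the sum of the transmitted increments plus the final residue equals the exact sum of the integrand values, so the rounding error on the transmitted integral never exceeds half a quantum.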
3.2 Truncation Errors
Referring to equation (4), it is evident that, regardless of rounding
errors, to take only a finite number of terms from Taylor's series
must result in an error being introduced into the integral $\int y\,dt$.
If 'h' is small, it is evident that the dominant error component
will be the first term in Taylor's series that is excluded.
In view of the fact that in a simultaneous machine, every iteration
executed by every integrator is partly extrapolative, the accuracy
must be dictated as much by this as by any retrospective corrections
applied.
If time $t = t_2$, the current iteration has, in effect, to estimate
$f(t_3)$ by extrapolation, deduce $\int_{t_2}^{t_3} f(t)\,dt$ from it, and when
the correct value of $f(t_3)$ is known, make any necessary retrospective
correction. If high accuracy is required, this initial
'guesstimate' must be consistently good because the correction is
one iteration late. An integrator working in a dynamic system
must necessarily not only produce a correct integral but also
produce it at the right time.
Prediction for any system is necessarily a dangerous practice but, 
at least, as physical systems will be most often represented in 
simulations, it seems reasonable to say that the first derivative 
of a sampled quantity must be well conditioned. On this basis, 
trapezoidal extrapolation can be entertained.
3.3 Inherited Errors
At any iteration 'n', an integrator output will not only be a
function of the truncation and rounding errors already discussed 
but the truncation and rounding errors inherited from previous 
iterations. On the first iteration, there will only be an inherited 
rounding error from the digitization of the initial conditions.
Thus the inherited errors will be non-linear due to the influence 
of accumulated rounding errors, although in a practical digital 
integrator, the component due to truncation effects will tend to 
dominate.
Using the error due to truncation only, and assuming a chord
trapezoidal extrapolation, at time t = n, the correct integrand in
an integrator might be $z_n$ and the real one $y_n$. The difference
is the inherited error $e_n$.
A new point $y_{(n+1)}$ is computed from $y_{(n+1)} = y_{(n-1)} + 2h\,y'_n$.
However, this is in error due to $y'''_{(n)}$ and higher powers being
ignored. Thus $z_{(n+1)}$ would be more correctly computed from

$$z_{(n+1)} = z_{(n-1)} + 2h\,z'_n + \frac{h^3}{3}\,z'''_{(n)}$$

thus

$$e_{(n+1)} - e_{(n-1)} = 2h\,(y'_n - z'_n) - \frac{h^3}{3}\,y'''_{(n)}$$

Using the mean value theorem in the interval $(y_n, z_n)$, the
recurrence relation

$$e_{(n+1)} - e_{(n-1)} - 2h\,g_n\,e_n = -\frac{h^3}{3}\,y'''_{(n)}$$

where $g = f_y(t, y)$ is obtained.
Over a small interval, y JI* .and g^.may be considered constantIn)
yielding
e^ = A(q)n + B(q)n + K/h where A,B,K are constants
obtained from the initial conditions where
2 2
q = hg + j (1+h g )>> 1 for g > 0  
giving an exponentially rising error, or
2 2
q = -hg + / (1 +h g ) for g < 0
I
(°8)
giving an oscillating error of rising amplitude.-'
If, as an example, the sine/cosine loop is implemented, the individual
contributions of truncation error at any iteration 'n' may
be obtained by replacing the calculated value at time t = n by the
correct value obtained to a sufficient accuracy by some other
means. (sin(ωt) and cos(ωt) may be obtained by series summation
to any degree of precision.) The new value $y_{(n+1)} = f(t_{(n+1)})$ can
then be compared with the correct value at t = (n+1) and replaced
by it, and so on.
4.0 IMPLEMENTATION BY THE USE OF CURVE FITTING
The integration formula for 'n' terms in Taylor's series may be
produced as follows:-
For n terms in Taylor's series, n points on the curve y = f(t)
must be known. They may include the current ordinate plus (n-1)
previous ones which have been stored for this purpose. The points
are to be fitted to a polynomial $y = f(t) = \sum_{m=0}^{n-1} a_m t^m$, ie of
order (n-1). 'n' points can, in fact, be used to 'fit' any polynomial
of order (n-1) or greater, but the lowest order is picked
on the basis of the curve being well conditioned and using 'h'
sufficiently small that this is reasonable. (Digital integration
must be governed by laws common to all sample data systems and may
thus be 'guarded' by the same means, ie external forcing functions
can be monitored with a view to detecting all ill-conditioned
changes as far as the sampling system is concerned.) This means
that the function y = f(t) should be differentiable n times, each
coefficient of t being continuous in the interval of interest.
Formally: $P(t) = y = \sum_{k=1}^{n} y_k L_k(t)$ where

$$L_k(t) = \frac{(t-t_1)(t-t_2)\cdots(t-t_{k-1})(t-t_{k+1})\cdots(t-t_n)}{(t_k-t_1)(t_k-t_2)\cdots(t_k-t_{k-1})(t_k-t_{k+1})\cdots(t_k-t_n)}$$

for $t = t_r$ ($1 \le r \le n$),
where P(t) = f(t) for $t = t_k$, k = 1, 2, ..., n, but
P(t) ≠ f(t) for all other t except where f(t) is of degree (n-1).
The error term is:-

$$\frac{f^{(n)}(\xi)}{n!} \prod_{k=1}^{n} (t - t_k) \quad \text{where } \xi = F(t)$$

which is dominated by the first excluded derivative of f(t) in
the Taylor's series.
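The Lagrange form can be evaluated directly from stored ordinates. The following is a minimal sketch using exact fractions; the function names are hypothetical and the sample points are chosen only for illustration.

```python
from fractions import Fraction


def lagrange_basis(ts, k, t):
    """Value at t of the k-th Lagrange basis polynomial for nodes ts."""
    num = Fraction(1)
    den = Fraction(1)
    for j, tj in enumerate(ts):
        if j != k:
            num *= (t - tj)
            den *= (ts[k] - tj)
    return num / den


def lagrange_poly(ts, ys, t):
    """Evaluate the interpolating polynomial P(t) through (ts, ys)."""
    return sum(y * lagrange_basis(ts, k, t) for k, y in enumerate(ys))


if __name__ == "__main__":
    # Three points of f(t) = t**3 are fitted by a parabola.
    ts, ys = [0, 1, 2], [0, 1, 8]
    print(lagrange_poly(ts, ys, 3))   # extrapolated value of P at t = 3
```

Three points reproduce any parabola exactly. For the cubic used in the example the extrapolated value at t = 3 falls short of f(3) = 27 by 6, which agrees with the error term: the third derivative is 6, divided by 3!, multiplied by (3-0)(3-1)(3-2).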
For small values of n, it is probably easier just to solve for the
coefficients of P(t), by using simultaneous equations ie

$$y_1 = K_{(n-1)}\,t_1^{(n-1)} + K_{(n-2)}\,t_1^{(n-2)} + \dots + K_1 t_1 + K_0$$
$$\vdots$$
$$y_n = K_{(n-1)}\,t_n^{(n-1)} + K_{(n-2)}\,t_n^{(n-2)} + \dots + K_1 t_n + K_0$$

ie $[y] = [t]\,[K]$.
From this, $K_0, K_1, \dots, K_{(n-1)}$ may be obtained; hence P(t). This
may be integrated from $t = t_n$ to $t = (t_n + h)$ yielding the integral in
the current interval in terms of $f(t)_n$. In practice, it is preferable
to recast the correction integral in terms of the current
value of f(t) and differences Δf(t) as these are readily available
and require less storage space than true ordinates.
Particular examples of this technique are:-
4.1 Simple Correction Technique
(a) Trapezoidal Correction
The 'true' area should be $\tfrac{1}{2}(f(t)_n + f(t)_{(n-1)})$. This effectively
fills in the triangle left by Euler's method:-
The register transfers for a given scaling factor are thus:-

$$Y_{(n)} = Y_{(n-1)} + \Sigma\Delta Y_{(n)} \quad \text{where } \Delta Y_{(n)} = f(t)_n - f(t)_{(n-1)}$$
$$\Delta Z_{(n)} + R_{(n)} = R_{(n-1)} + Y_{(n)} + \tfrac{1}{2}\,\Sigma\Delta Y_{(n)} \qquad (14)$$

To avoid adding significance to the Y and R registers to accommodate
the $\tfrac{1}{2}\,\Sigma\Delta Y_{(n)}$, the whole problem can be scaled up by two, effectively
halving the integrator time constant.
Hence (14) becomes:-

$$Y_{(n)} = Y_{(n-1)} + 2\,\Sigma\Delta Y_{(n)}$$
$$\Delta Z_{(n)} + R_{(n)} = R_{(n-1)} + Y_{(n)} + \Sigma\Delta Y_{(n)} \qquad (15)$$

These transfers assume Eulerian Prediction.
For trapezoidal prediction, the value of $Y_{(n+1)}$ is guesstimated
by assuming $\Sigma\Delta Y_{(n+1)} = \Sigma\Delta Y_{(n)}$:

$$Y_{(n)} = Y_{(n-1)} + \Sigma\Delta Y_{(n)}$$
$$\Delta Z_{(n)} + R_{(n)} = R_{(n-1)} + Y_{(n)} + \Sigma\Delta Y_{(n)} - \tfrac{1}{2}\,\Sigma\Delta Y_{(n-1)} \qquad (16)$$

ie on the previous step $Y_{(n)}$ was guessed to be $Y_{(n-1)} + \Sigma\Delta Y_{(n-1)}$.
Now that

$$Y_{(n)} = Y_{(n-1)} + \Sigma\Delta Y_{(n)}$$

the initial area guess of:-

$$Y_{(n-1)} + \tfrac{1}{2}\,\Sigma\Delta Y_{(n-1)}$$

had to be modified to:-

$$Y_{(n-1)} + \tfrac{1}{2}\,\Sigma\Delta Y_{(n)}$$

for the interval:-

$$t_{(n-1)} \to t_{(n)}$$

In practice, (16) would be scaled up by two. As was shown by (9),
a better trapezoidal extrapolation can be used than a tangent from
the current value of f(t), ie a chord from $f(t)_{(n-1)}$. This cannot
be obtained directly as it requires a knowledge of the slope
f'(t) at a point. In a DDA integrator, only ordinates f(t) are
available so that an alternative method must be sought.
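The bookkeeping behind equations (14) and (16) can be followed with a small simulation. This is a sketch under several simplifying assumptions that are not part of the text: a unit step, no scaling by two, exact fractions, and ΔZ + R lumped into a single running total instead of being split into a transmitted part and a residue. All names are hypothetical.

```python
from fractions import Fraction


def trapezoid_area(ys):
    """Exact trapezoidal area under the sampled ordinates (unit step)."""
    return sum((a + b) / 2 for a, b in zip(ys, ys[1:]))


def euler_pred_trap_corr(ys):
    """Accumulate Y(n) + dY(n)/2 per iteration, as in equation (14).

    Each term is the Eulerian prediction Y(n) for the coming interval
    plus the correction dY(n)/2 for the interval just completed.
    """
    total = ys[0]                      # Eulerian prediction for the first interval
    for prev, cur in zip(ys, ys[1:]):
        total += cur + (cur - prev) / 2
    return total


def trap_pred_trap_corr(ys):
    """Accumulate Y(n) + dY(n) - dY(n-1)/2 per iteration, as in equation (16)."""
    total = ys[0]                      # first prediction, with dY(0) taken as zero
    dy_prev = Fraction(0)
    for prev, cur in zip(ys, ys[1:]):
        dy = cur - prev
        total += cur + dy - dy_prev / 2
        dy_prev = dy
    return total


if __name__ == "__main__":
    ys = [Fraction(n * n) for n in range(6)]     # samples of y = t**2
    print(trapezoid_area(ys), euler_pred_trap_corr(ys), trap_pred_trap_corr(ys))
```

In both cases the running total equals the true trapezoidal area up to the current ordinate plus the one prediction that has not yet been corrected, which is the sense in which the correction runs one iteration late.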
(b) Method of Continuous Second Derivative.
It was assumed earlier that a curve passing through 'n' points
should, for the purposes of interpolation, be continuous in the
interval of interest. Hence, in the case of trapezoidal extrapolation,
it may be assumed that the second derivative f''(t) is
continuous and constant in the interval ($t_1$, $t_3$). Ignoring terms of
P(t) greater than $t^2$, this leaves $P(t) = a_0 + a_1 t + a_2 t^2$, to fit
$f(t)_1$, $f(t)_2$, $f(t)_3$. Hence, f''(t) is constant over
this interval. Therefore, $f'(t)_3 - f'(t)_2 = f'(t)_2 - f'(t)_1$. Therefore it
must be assumed that

$$\Sigma\Delta Y^{**}_{(n+1)} = 2\,\Sigma\Delta Y_{(n)} - \Sigma\Delta Y_{(n-1)}$$

This is a slope difference formula for the next iteration. This
result can equally easily be obtained by the use of the mean value
theorem in the intervals ($t_1$, $t_2$) and ($t_2$, $t_3$) so as to establish f'(t)
at the mid-points of these intervals.
Thus:-

$$Y^{**}_{(n+1)} = Y_{(n)} + \Sigma\Delta Y^{**}_{(n+1)}$$

(guesstimate) = (current ordinate) + (guesstimate)
At the end of iteration 'n', the real $Y_{(n+1)}$ will come to light
and hence the area for iteration 'n' must be corrected by a quantity:-

$$\tfrac{1}{2}\,[\Sigma\Delta Y_{(n+1)} - \Sigma\Delta Y^{**}_{(n+1)}]$$

The register transfer equations are thus:-

$$Y_{(n)} = Y_{(n-1)} + \Sigma\Delta Y_{(n)}$$
$$\Delta Z_{(n)} + R_{(n)} = R_{(n-1)} + Y_{(n)} + \tfrac{1}{2}\,[3\,\Sigma\Delta Y_{(n)} - 3\,\Sigma\Delta Y_{(n-1)} + \Sigma\Delta Y_{(n-2)}] \qquad (18)$$

(Note that in the trivial case of:-

$$\Sigma\Delta Y_{(n-2)} = \Sigma\Delta Y_{(n-1)} = \Sigma\Delta Y_{(n)}$$

ie a straight line
y = at + b
equation (18) reduces back to:-

$$\Delta Z_n + R_n = R_{(n-1)} + Y_{(n)} + \tfrac{1}{2}\,\Sigma\Delta Y_{(n)}$$

ie equation (14).)
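The right-hand side of equation (18) can be checked on simple data. The sketch below assumes a unit step and exact fractions, and omits the R(n-1) term so that only the quantity added per iteration is shown; the function name is hypothetical.

```python
from fractions import Fraction


def slope_predicted_increment(y_n, dy_n, dy_n1, dy_n2):
    """Quantity added to R(n-1) in equation (18).

    dy_n, dy_n1 and dy_n2 are the differences dY(n), dY(n-1) and dY(n-2).
    """
    return y_n + Fraction(1, 2) * (3 * dy_n - 3 * dy_n1 + dy_n2)


if __name__ == "__main__":
    # Samples of y = t**2 at t = 0, 1, 2, 3 give differences 1, 3, 5.
    print(slope_predicted_increment(9, 5, 3, 1))
```

For a parabola the slope-difference guess is exact, so the increment at n = 3 equals the trapezoid (9 + 16)/2 for the coming interval and no correction is left outstanding. For a straight line the three differences are equal and the increment collapses to Y(n) plus half a difference, as the note on equation (14) states.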
4.2 Higher Order Algorithms
In order to further improve the speed/accuracy product of integrators, 
it is possible to use further terms in the Taylor's series. However, 
it is necessary to accompany a high order correction with at least 
a comparably good extrapolation. The reason is derived from the fact 
that the correction will be applied one iteration in arrears and 
therefore constitutes a rounding error for that iteration. If the
prediction is 'good', the magnitude of the correction will be minimal.
Using Lagrange's interpolation formula to obtain P(t), the following
higher order algorithms may be developed.
(a) Parabolic Correction/Eulerian Prediction
$P(t) = f(t^2, t, k)$
$Y_{(n+1)}$ is predicted to be $Y_{(n)}$.
Hence correction at time t = (n+1) is:-

$$\tfrac{5}{12}\,\Sigma\Delta Y_{(n+1)} + \tfrac{1}{12}\,\Sigma\Delta Y_{(n)}$$

Typical register transfers would be:-

$$Y_{(n)} = Y_{(n-1)} + 12\,\Sigma\Delta Y_{(n)}$$
$$\Delta Z_{(n)} + R_{(n)} = R_{(n-1)} + Y_{(n)} + 5\,\Sigma\Delta Y_{(n)} + \Sigma\Delta Y_{(n-1)} \qquad (19)$$

(b) Parabolic Correction/Trapezoidal Prediction (Tangent)
$P(t) = f(t^2, t, k)$
$Y_{(n+1)}$ is predicted to be $(Y_{(n)} + \Sigma\Delta Y_{(n)})$.
Hence correction at time t = (n+1):-

$$\tfrac{5}{12}\,\Sigma\Delta Y_{(n+1)} - \tfrac{5}{12}\,\Sigma\Delta Y_{(n)}$$

Therefore transfers are:-

$$Y_{(n)} = Y_{(n-1)} + 12\,\Sigma\Delta Y_{(n)}$$
$$\Delta Z_{(n)} + R_{(n)} = R_{(n-1)} + Y_{(n)} + 11\,\Sigma\Delta Y_{(n)} - 5\,\Sigma\Delta Y_{(n-1)} \qquad (20)$$

Notice, once again, that the trivial case of Y = f(t) = at+b
would give:-

$$\Sigma\Delta Y_{(n)} = \Sigma\Delta Y_{(n-1)} = \Sigma\Delta Y_{(m)} \ne 0$$

and hence (20) would become:-

$$\Delta Z_n + R_n = R_{(n-1)} + Y_{(n)} + 6\,\Sigma\Delta Y_{(n)}$$

which is equivalent to equation (14).
(c) Parabolic Correction/Trapezoidal Prediction (Chord)
$P(t) = f(t^2, t, k)$
$Y_{(n+1)}$ is predicted to be:-
$(Y_{(n)} + 2\,\Sigma\Delta Y_{(n)} - \Sigma\Delta Y_{(n-1)})$.
Hence correction at time t = (n+1) is:-

$$\tfrac{5}{12}\,\Sigma\Delta Y_{(n+1)} - \tfrac{11}{12}\,\Sigma\Delta Y_{(n)} + \tfrac{6}{12}\,\Sigma\Delta Y_{(n-1)}$$

Hence the transfers are:-

$$Y_{(n)} = Y_{(n-1)} + 12\,\Sigma\Delta Y_{(n)}$$
$$\Delta Z_{(n)} + R_{(n)} = R_{(n-1)} + Y_{(n)} + 17\,\Sigma\Delta Y_{(n)} - 17\,\Sigma\Delta Y_{(n-1)} + 6\,\Sigma\Delta Y_{(n-2)} \qquad (21)$$

(d) Cubic Correction/Trapezoidal Prediction (Chord)
$P(t) = f(t^3, t^2, t, k)$.
$Y_{(n+1)}$ is predicted to be $[Y_{(n)} + 2\,\Sigma\Delta Y_{(n)} - \Sigma\Delta Y_{(n-1)}]$
yielding register transfers (suitably scaled) of:-

$$Y_{(n)} = Y_{(n-1)} + 24\,\Sigma\Delta Y_{(n)}$$
$$\Delta Z_{(n)} + R_{(n)} = R_{(n-1)} + Y_{(n)} + 33\,\Sigma\Delta Y_{(n)} - 32\,\Sigma\Delta Y_{(n-1)} + 11\,\Sigma\Delta Y_{(n-2)} \qquad (22)$$
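The integer weights in transfers (19) to (22) can be checked against a curve whose integral is known. The sketch below is illustrative: it works in unscaled form (weights divided by 12 or 24 rather than scaling the problem up), assumes a unit step and integer differences, omits R(n-1), and uses hypothetical function names.

```python
from fractions import Fraction


def inc_19(y, d0, d1):
    """Parabolic correction, Eulerian prediction (unscaled form of (19))."""
    return y + Fraction(5 * d0 + d1, 12)


def inc_20(y, d0, d1):
    """Parabolic correction, tangent prediction (unscaled form of (20))."""
    return y + Fraction(11 * d0 - 5 * d1, 12)


def inc_21(y, d0, d1, d2):
    """Parabolic correction, chord prediction (unscaled form of (21))."""
    return y + Fraction(17 * d0 - 17 * d1 + 6 * d2, 12)


def inc_22(y, d0, d1, d2):
    """Cubic correction, chord prediction (unscaled form of (22))."""
    return y + Fraction(33 * d0 - 32 * d1 + 11 * d2, 24)


if __name__ == "__main__":
    # y = t**2 sampled at t = 0..3: Y(3) = 9, differences 5, 3, 1.
    print(inc_19(9, 5, 3), inc_20(9, 5, 3), inc_21(9, 5, 3, 1), inc_22(9, 5, 3, 1))
```

For this parabola the chord-predicted forms return 37/3, the exact integral from t = 3 to t = 4. The Eulerian form returns 34/3, which is the prediction Y(3) = 9 plus a correction of 7/3, the amount by which the previous guess Y(2) = 4 fell short of the exact integral 19/3 from t = 2 to t = 3.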
(e) Predictor/Corrector Methods
In one sense, the methods described in paragraphs (a) thru (d) have
been predictor/corrector methods inasmuch as both of these activities
take place in an integrator. A more advanced predictor/corrector
scheme would ensure that the 'correction' would take place in the
same iteration as the prediction using preliminary estimates of the
'end of iteration' conditions to evaluate the next ordinate $y_{(n+1)}$.
In general purpose computer simulations, the working quantities
would be $y_m$ and $y'_m$ where $m \le n$ and 'n' is current time. The next
ordinate $y^P_{(n+1)}$ is predicted from the current ordinate ($y_n$) and
current and past slopes. Using the appropriate function of y',
ie y' = f(y,t), $y'^P_{(n+1)}$ is then found and used in a second
(different) correction formula to give a better estimate for $y_{(n+1)}$.
The fourth order Adams-Moulton formula is both typical and popular,
for which the equations are:-

$$y^P_{(n+1)} = y_{(n)} + \frac{h}{24}\,[55\,y'_{(n)} - 59\,y'_{(n-1)} + 37\,y'_{(n-2)} - 9\,y'_{(n-3)}]$$
$$y'^P_{(n+1)} = f[y^P_{(n+1)}, t_{(n+1)}]$$

then

$$y_{(n+1)} = y_{(n)} + \frac{h}{24}\,(9\,y'^P_{(n+1)} + 19\,y'_{(n)} - 5\,y'_{(n-1)} + y'_{(n-2)}) \qquad (23)$$
$$y'_{(n+1)} = f(y_{(n+1)}, t_{(n+1)})$$

The difference between $y^P_{(n+1)}$ and $y_{(n+1)}$ yields the error correction
that was necessary (for that iteration only) and is thus an
indication of the efficacy of the method. The method does not
indicate the truncation (or round-off) error build up since t = 0.
(A knowledge of the sensitivity equations associated with the
problem is necessary for such an estimate.)
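The cycle of equations (23) can be sketched for a general purpose computer as follows. The test problem y' = -y, the use of exact exponential values as the four starting points, and all function names are assumptions made for this illustration only.

```python
import math


def abm4_step(f, t_n, y_n, slopes, h):
    """One predict, evaluate, correct, evaluate cycle of equations (23).

    slopes holds [y'(n-3), y'(n-2), y'(n-1), y'(n)].
    Returns (predicted y(n+1), corrected y(n+1), y'(n+1)).
    """
    d3, d2, d1, d0 = slopes
    y_pred = y_n + h / 24 * (55 * d0 - 59 * d1 + 37 * d2 - 9 * d3)
    dy_pred = f(y_pred, t_n + h)
    y_next = y_n + h / 24 * (9 * dy_pred + 19 * d0 - 5 * d1 + d2)
    return y_pred, y_next, f(y_next, t_n + h)


def solve_decay(h=0.1, steps=7):
    """Integrate y' = -y, starting from four exact values of exp(-t)."""
    f = lambda y, t: -y
    ts = [i * h for i in range(4)]
    ys = [math.exp(-t) for t in ts]
    slopes = [f(y, t) for y, t in zip(ys, ts)]
    t, y = ts[-1], ys[-1]
    worst_gap = 0.0
    for i in range(steps):
        y_pred, y, dy = abm4_step(f, t, y, slopes, h)
        worst_gap = max(worst_gap, abs(y - y_pred))   # predictor/corrector gap
        slopes = slopes[1:] + [dy]
        t = (4 + i) * h
    return t, y, worst_gap


if __name__ == "__main__":
    t, y, gap = solve_decay()
    print(t, y, math.exp(-t), gap)
```

The returned gap is the largest difference between predicted and corrected ordinates, ie the per-iteration indicator described above; it says nothing about the error accumulated since t = 0. The need for four starting values is visible in `solve_decay`, which has to borrow them from the analytic solution.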
Equations (23) (or any other predictor/corrector formula) may be
recast in difference form suitable for a DDA:-

$$y^P_{(n+1)} = f[y_{(n)}, \Delta Y_{(m)}]$$
$$y_{(n+1)} = f[y_{(n)}, \Delta Y_{(m)}, \Delta Y^P_{(n+1)}] \qquad (24)$$

Clearly, evaluation of equation (24) is going to involve twice as
much computation time (or effort) as the methods described in previous
paragraphs and this must be weighed against its merits of
precision. As twice as many computations are involved, it is
reasonable to suggest that, for a given computer word length,
round-off errors will be increased. This may partially account for
the results obtained by Martens who indicated that the Adams-Moulton
methods (using derivatives) were inferior to Runge-Kutta, Kutta-Blum
and Runge-Kutta-Merson, in evaluating the step response of a
fourth order system:-

$$G(s) = \frac{0.5\,s^2 + 3.5\,s + 1}{(s+1)^4}$$
for which the analytic solution is
2 3
y
Another disadvantage of Adams-Moulton (which it shares with other 
polynomial fit methods) is the inability to start, as previous 
ordinates and/or differences/derivatives are required. (These do 
not exist at t = 0.) Other, simpler algorithms must be used to 
"get it going".
4.3 Single Step Methods 
C33 3 Q AO)
Single step methods (the better known being due to Runge-Kutta,
Ref: 32) use calculations from solely within the time interval
t = (n) to (n+1) to calculate $y_{(n+1)}$, and hence

$$\int_{t=n}^{t=n+1} y\,dt.$$

They are therefore self starting (needing no past values) and thus
very amenable to step length changes during a run. However, in
general, they do not give an error estimate unless severely modified
(eg Runge-Kutta-Merson). The step length is then changed according
to an algorithm based on:-

$$h = f(e_n)$$

This usually takes the form of step halving or doubling based on
this error. There is, of course, no estimation in such an algorithm
for error build-up due to round-off effects.
To deduce -y(n+]j anc* use ^  such that Taylorfs series is matched
t“Ti
up to the r power, an exploration is carried out in the region 
t(n) t° t(n+l) easting tangents from (yn ,tn ) a t  slopes determined 
by the slopes at the end points (y^n+g) ,t(n+ct) •
The following identities are formed:-
Kx = h.f(t,yn )
K2 = h,f(tn+dh,yn+ek1)
K3 ~ h *f ^ (n+cijh) ,yn+ 3lkl + W
K
. . . . . .  (26)'
4 = h *f(tn+a2h>yn + 32kl + '^ 2k2 + 52k3}I -
; ic = ^  + r2 k2
At this point, ctn>$n and are arbitrary.
The number of fR f identities determines the system order. The 
constants a , 3 etc., can be manipulated to yield formulae for 
y (n+!) ^ave certain qualities convenient for the computation
method used.
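As an example, the fourth order R-K system can be expressed as:-
(a) For Equal Subdivisions of the Interval $t_n$ to $t_{(n+1)}$,
ie $\alpha_1 = \tfrac{1}{3}$, $\alpha_2 = \tfrac{2}{3}$

$$K_1 = h\,f(t_n, y_n)$$
$$K_2 = h\,f(t_n + \tfrac{1}{3}h,\; y_n + \tfrac{1}{3}K_1)$$
$$K_3 = h\,f(t_n + \tfrac{2}{3}h,\; y_n - \tfrac{1}{3}K_1 + K_2)$$
$$K_4 = h\,f(t_{(n+1)},\; y_n + K_1 - K_2 + K_3)$$
$$K = \tfrac{1}{8}\,(K_1 + 3K_2 + 3K_3 + K_4)$$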
(b) If is chosen as | and = i
A simpler formula results as is made zeroi­
se.
h-f(tn ,yn)
and
K2 = h.f (t^ + jh, + 1^ )
K3 = h'£(t(n+l)’yn + V
K = i  (Kl + 4k2 + r )
(27)
Runge deduced that a convenient substitution formula was

$$K_1 = h\,f(t_n, y_n)$$
$$K_2 = h\,f(t_n + \tfrac{1}{2}h,\; y_n + \tfrac{1}{2}K_1)$$
$$K_3 = h\,f(t_n + \tfrac{1}{2}h,\; y_n + \tfrac{1}{2}K_2)$$
$$K_4 = h\,f(t_{(n+1)},\; y_n + K_3) \qquad (28)$$

if $R_2 = R_3 = \tfrac{1}{3}$,
and thus $K = \tfrac{1}{6}\,(K_1 + 2K_2 + 2K_3 + K_4)$.
This requires less storage than for other α.
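Both fourth order forms, the equal-thirds subdivision and the substitution formula (28), can be sketched as single-step functions. The test problem y' = -y, the step length and the function names are assumptions made for this illustration only.

```python
import math


def rk4_classical_step(f, t, y, h):
    """Fourth order step using half-interval slopes, as in formula (28)."""
    k1 = h * f(t, y)
    k2 = h * f(t + h / 2, y + k1 / 2)
    k3 = h * f(t + h / 2, y + k2 / 2)
    k4 = h * f(t + h, y + k3)
    return y + (k1 + 2 * k2 + 2 * k3 + k4) / 6


def rk4_thirds_step(f, t, y, h):
    """Fourth order step with the interval divided into equal thirds."""
    k1 = h * f(t, y)
    k2 = h * f(t + h / 3, y + k1 / 3)
    k3 = h * f(t + 2 * h / 3, y - k1 / 3 + k2)
    k4 = h * f(t + h, y + k1 - k2 + k3)
    return y + (k1 + 3 * k2 + 3 * k3 + k4) / 8


def integrate(step, f, t0, y0, h, steps):
    """Apply a single-step method repeatedly; no past values are needed."""
    y = y0
    for i in range(steps):
        y = step(f, t0 + i * h, y, h)
    return y


if __name__ == "__main__":
    decay = lambda t, y: -y
    for step in (rk4_classical_step, rk4_thirds_step):
        print(step.__name__, integrate(step, decay, 0.0, 1.0, 0.1, 10) - math.exp(-1.0))
```

Neither function keeps any history between calls, which is what makes these methods self starting and tolerant of step length changes. The half-interval form needs fewer intermediate quantities held at once, consistent with the remark on storage.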
Merson (Ref: 30) suggested a method of determining the interval for the
Runge-Kutta methods which is very quick (only one extra computation)
and gives about 20% faster solutions than the inevitably slightly
over-cautious fixed interval trials.
He evolved the following modification to the fourth order method
The step change method was this:-
if e > accuracy limit, halve h and recompute that step
For a DDA, there are two ways of implementing equation (30):-
(i) Leave the equations in their present form and
structure the integrators to work with derivatives,
ie the GPDC approach suitably system-tailored and
optimised for Runge-Kutta.
(ii) Recast equation (3) in finite differences ∇, Δ or δ.
(29)
The estimated truncation error is e where
e - |  K3 + -£K5)/5
ft double h M it it
4.4 Practical Notes on Implementation
(i) It is apparent from equations (18) thru (22) that the higher 
order algorithms are increasingly expensive to implement both in 
terms of the amount of, and complexity of, the arithmetic and 
storage space required for previous ordinates and/or backward 
differences.
(Note that chord prediction used with cubic correction does not
increase the number of past ordinates to be stored, whereas one
more is needed if only parabolic correction is used. The same
applies to trapezoidal (tangent) prediction and parabolic correction.)
The expense incurred must be weighed against
(a) the increase in speed/accuracy product over simpler
algorithms (allowing for any differences in iteration
time).
(b) The magnitude of 'h' in terms of real time that can
be tolerated.
(ii) Integral Offset
Consider an integrator being fed from a 'perfect' analogue to
digital converter (ADC) whose output is compatible with the integrator
and which rounded down its output y = f(t) to n binary bits.
The Y register in the integrator will interpret the i/p f(t) as
shown shaded. This would cause a consistent error build up due to
effectively rounding down all inputs so that an average error of
$\tfrac{1}{2}\,2^{-n}$ would be introduced at each iteration. If an offset is built
into the ADC so that inputs in the range

$$y_0 - \tfrac{1}{2}\,2^{-n} < y < y_0 + \tfrac{1}{2}\,2^{-n}$$

yield an output $y_0$ (instead of a range

$$y_0 < y < y_0 + 2^{-n})$$

then an improvement results.
If an integrator is feeding the one in question, the same argument,
of course, applies.
(a) For a ternary machine, the improvement can be implemented either
by pre-setting all R registers to ½ if they are of the type
working in the range $0 \le R < 1$ or by pre-empting the real overflow
at |R| > ½ if R works in the range -½ < R < ½. The
algorithm would thus be:-
If R > ½ set ΔZ = +1 and perform R := (R-1)
If R < -½ set ΔZ = -1 and perform R := (R+1)
(b) For a non-ternary machine a method similar in principle to
(i)(b) can be used.
If this improvement is not carried out, there are many machine
situations which will show up this omission quite pointedly.
For instance, in the sine/cosine loop, the rounding errors are
all biased in a given direction as each quadrant boundary is
being crossed. When the boundary is crossed, they will all
round in the opposite direction, ie sin θ as it crosses 1.
The result is a pronounced step in the radius of the solution
$\sqrt{\sin^2(\omega t) + \cos^2(\omega t)}$ in the region of (ωt) = π. The step
has an amplitude of approximately one quantum of ΔZ (as would
be expected). It is most noticeable on high order algorithms
where the radius is being held very constant (save for a small
'g' causing exponential growth: see paragraph 3.4.3 'Inherited
Errors').
(iii) Initial Conditions
The higher order algorithms, trapezoidal and above, require derivative
information in order to predict points etc. This
information is not available at the start of a computation. This
problem may be overcome by several methods, the first of which can
introduce an unacceptable error.
(a) Set all derivatives f'(t) or backward differences ∇f(t) to
zero. This is obviously erroneous and becomes more acute as
the order of the algorithm rises due both to the sensitivity
of them to errors, and the time it takes to make the highest
order stored backward difference correct (n iterations).
(b) Compute by hand (or using a GP computer supplied also with 
integrator interconnection details) the derivatives in the 
region t = 0. This is tedious and, depending on the complexity 
of the algorithm, highly complex when has to be deduced.
(c) Do 'n' iterations as a dummy run, say, starting with all
derivatives equal to zero, where 'n' is the number of derivatives
needed for the algorithm. These 'n' iterations can then be repeated
a number of times using the computed derivatives from the previous dummy
run. The correct derivatives should be approached monotonically
provided the curve y = f(t) is well conditioned in the interval
t = 0 to t = nh. In practice these dummy runs are better run
on a GP to avoid delays due to constantly reloading the DDA
with trial initial conditions.
(iv) Use of Self-Starting Algorithms
The well known Runge-Kutta methods are an example of self-starting
algorithms, the second order one of which is mentioned by McGhee (25).
The implementation is due to Heun. Once started, it has an accuracy
rather better than the trapezoidal method. The disadvantage of
this approach is the greater complexity of the basic algorithm which,
in the case of Heun, is now spread over four distinct periods compared
with two for the non-self starting methods.
(v) Negative integral Residues
For positive integral values, it is evident that ΔZ and R are
both going to be positive, lying in the ranges
$0 \le \Delta Z < N$, $N > 0$
$0 \le R < 1$
In the case of negative integral, two choices exist.
(i) $0 \ge \Delta Z > -N$, $N > 0$
$0 \le R < 1$
or
(ii) $0 \ge \Delta Z - 1 > -N$, $N > 0$
$0 \ge R > -1$
This is really the same question as fixed point division using
negative dividends and/or divisors wherein ΔZ and R are synonymous
with the quotient and remainder.
Example: (-9) ÷ (+5) = -2, remainder (+1)
or = -1, remainder (-4)
In GP computers it is general practice to make the sign of the
remainder equal to that of the dividend, ie as per (ii). Digital
integrators, often fabricated using unsigned R registers, effectively
do not. It can be seen that there is virtually no difference
between the two, sometimes one being better than the other and
sometimes vice versa. The author's conclusion is that this is only,
in practice, going to have, at worst, a second order non-biased
effect on the rounding errors. It may thus be ignored for most
functions, especially those which make equal excursions through
positive and negative values of f(t). Monotonic and single quantities
may be slightly adversely affected although the effect must be small.
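The two conventions in the division example can be reproduced directly. This is a small illustration with hypothetical function names: Python's built-in `divmod` floors the quotient, which leaves a non-negative remainder for a positive divisor, while the second function truncates towards zero so that the remainder takes the sign of the dividend.

```python
def floor_divide(dividend, divisor):
    """Quotient rounded down; remainder takes the sign of the divisor."""
    return divmod(dividend, divisor)


def truncating_divide(dividend, divisor):
    """Quotient rounded towards zero; remainder takes the sign of the dividend."""
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    return quotient, dividend - quotient * divisor


if __name__ == "__main__":
    print(floor_divide(-9, 5))        # quotient -2, remainder +1
    print(truncating_divide(-9, 5))   # quotient -1, remainder -4
```

In both cases quotient times divisor plus remainder recovers the dividend, so nothing is lost; the choice only affects which side of the true value the transmitted quantity falls on.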
5.0 SEQUENTIALLY PROCESSED INTEGRATORS
'Life's too short for chess'
Henry Byron
Because of prohibitive hardware costs prior to the advent of the 
integrated circuit, DDA’s were processed sequentially so that there
was only one time shared arithmetic unit. The integrands etc., were 
drum or core store locations and therefore inexpensively realised.
The demand for high iteration rates has tended to highlight the 
advantages of the simultaneous machine despite its higher cost 
and complexity.
The sequential machine is still worthy of consideration because of
its cost advantage and its ability to obviate the integrator
interconnection problem by centralising information storage. Also,
prediction for a group of integrators is now restricted to one, as all
others can now interpolate using that one prediction, provided they
are processed in a suitable sequence. With only one integrator
predicting, there is less induced error in a given group of integrators.
6.0 COMPARISON OF INTEGRATION METHODS IN SELECTED PROBLEMS
(i) Effect of 1 Stage Delay Inserted into a Two Integrator
Sine/Cosine Loop
Parameters:- Algorithm: Euler (y'' + Ky = 0)
Initial Conditions: y(0) = 50%
                    y'(0) = 0%

L(Y) (bits)   Delay Inserted?   Divergence (%) at ωt = 2π
    7              NO                 6.25
    7              YES               15.6
    8              NO                 3.12
    8              YES                7.8
    9              NO                 1.56
    9              YES                3.9
   10              NO                 0.78
   10              YES                2.0

TABLE 1
Conclusion
One unit delay incurs approximately 2.5 times the error.
With or without delay, the error halves for each additional bit
in the Y register.
(ii) Comparison of Solutions to Sine/Cosine Problem 
Using Various Digital Integrators
Problem: y'' + Ky = 0.
Initial conditions: y(0) = [illegible]
Additional stage delays: none.

L(Y) (bits)   Iterations per circle   Divergence % after 1 circle   Algorithm (Pred/corr)
     6               201                      6.25                  EUL/-
     7               402                      3.12                    "
     8               804                      1.56                    "
     9              1608                      0.78                    "
    10              3217                      0.39                    "
    11              6434                      0.19                    "
     6               101                      0.0                   EUL/TRAP
     7               201                      0.0                     "
     8               402                      0.0                     "
     9               804                      0.0                     "
    10              1608                      0.0                     "
    11              3217                      0.0                     "
    10               201                     -0.8                   SLOPE/TRAP
    11               402                     -0.4                     "
    12               804                     -0.2                     "
    22                66                      0.015                 PARAB/TRAP
    23               133                      0.0075                  "
    24               266                      0.004                   "
    22                66                      0.009                 CUBIC/TRAP
    23               133                      0.004                   "
    24               266                      0.002                   "
TABLE 2 
PREDICTOR/CORRECTOR METHODS
Algorithm    Step length (radians)    Divergence (after 1 circle)
2 R-K              0.1                     7 x 10^-2
                   0.01                    8 x 10^-5
4 R-K              0.1                    -4 x 10^-7
                   0.01                   -5 x 10^-12
TABLE 3
SINGLE STEP METHODS 
(iii) Comparison of Integrators used in a Fourth 
Order DE
Problem: coupled tuned circuits
$$\frac{d^4 y}{dt^4} + \alpha\,\frac{d^2 y}{dt^2} + \beta = 0$$
coupling coefficient = 0.28 (= K)
initial conditions: y(0) = 1, dy/dt(0) = 0, 
d²y/dt²(0) = 1/(1 - K²), d³y/dt³(0) = 0
Algorithm (PRED/CORR)   Step length (h) rads   Divergence % (after 1 complete envelope)
EULER/TRAP                    0.1                    +1.2
                              0.01                   -0.03
TRAP/TRAP                     0.1                     1.4
                              0.01                   -0.04
EULER/-                       0.1                    useless
                              0.01                  +12.3
2 R-K                         0.1                    +0.3
4 R-K                         0.1                    -0.3
TABLE 4
A P P E N D I X  3 
JUSTIFICATION FOR INTEGRATOR REGISTER TRUNCATION 
The SOIC Mk II machine, which is based on second order differences, 
does not generate its output by the straightforward method of 
subtracting two successive integral increments ΔZ(i) and ΔZ(i-1). 
It was found that sufficient information existed in the portions 
of Y(i) etc. at significances less than ΔZ in the register 
structure to determine the second order integral differences.
Proof.
For the register structure shown:-
[Diagram: register structure, with the Y and R registers divided into integral (I) and fractional (F) portions.]
where I[Y] etc. means the (loosely) integral 
portion of [Y] etc., i.e. the M.S. portion, and 
F[Y] etc. means the L.S., or fraction, 
portion.
If, for time integration, the slope prediction, trapezoidal 
correction algorithm:-
o
= (Y(i) +M^(i) +z& ^(i)^ is used, then:-
for a 16 bit R register
AZ,.. Y,., AYf.. A2Y , m
p r C1) 1 _ -p rpr -I 4. 4.  ,
F [ ^ I ^ 3 ~ F [F[ 216 ] 216 216 3
2
However, A Z ^  = ^ Z . ^  *"^2 (i-l) -Wief.inition)
A2Z,.x A Z ... AZy. - v
* T r (i ) i_ x r (i; _ U “ l) ,
*’ ~p[6 216 216
AZ... A Z .. - v AZ...
. =  1 1 - ^ f e -  3 -  1  f - $ ■  ] -
A Z (i-l) '
AZ... Y,.v Y ... AY,.-v A 2Y,.\
and 1 [ - £ | >  ] - 1 I [ F  [ 4 i > ] +  - i | >  ■+.-? £ i ]
• r r A 2 z (i)1:. x , Y (i), Ir Y (i-1 ). t I r F r Isii1 + '
** 1 216 3~ V 216 3 [ 216 [ [  216
AY(i) + A i ) ,
2 16 2 16 1
— i jf £ ]+ -!(r.U + A!^i-2) j
1 1  216 ] 216 216 ]
AZ/.V AZ r • 1 \
-■ I [F [
Y (i)
Y...
- F [ f [  - I ^ ]  + 
2
A2Z.. v Y... 1
•• - f t  - i r f -  1 i-
. A z (i-X)
] ]1 ; 216
A Y (i) . A i )
.16
2 2
A Y (i> , A i )
9 1 6
(i-1). 
2  1
916
 ----- — V    .
Overflow from F (AZ)
A P P E N D I X  4 
UNREGULATED POWER SUPPLY TOLERANCING AND HEAT SINK DESIGN
Design Parameters
(i) Instantaneous minimum output voltage = 7.0 V
(ii) Mains fluctuation assumed (a) LOW -10%
                               (b) HIGH +5%
(iii) Rectification: full-wave silicon bridge
(iv) Allowable output voltage ripple: 1.0 V
(v) Maximum junction temperature of regulators: 30°C
(vi) Thermal resistance of regulators/heatsinks: 
30°C/watt/sq in overall
Mean O/P voltage at mains "low" = 7.5 V (minimum + 0.5 x pk-pk ripple).
Assuming sensitivity to mains voltage is 130% (typical).
Therefore nominal O/P voltage = 7.5 [1 + (1/0.9 - 1) x 1.3]
                              = 8.6 V
Therefore mean O/P for mains high
 = 8.6 [1 + (1.05 - 1) x 1.3]
 = 9.2 V
Therefore worst case dissipation of regulators (at 1 A)
 > [(9.2 - 5) x 1] W 
 = 4.2 W
Heatsink area = (4.2 x 30)/30 sq ins/amp
              = 4.2 sq ins/amp
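The tolerancing arithmetic above can be checked with a short sketch. The constant names are illustrative. Two readings are assumed rather than stated in the appendix: the 5 V figure is taken to be the regulated output implied by "(9.2 - 5)", and the divisor of 30 in the heatsink step is taken to be the permitted temperature rise in °C. Intermediate values are rounded to one decimal place, as in the appendix.

```python
V_MIN = 7.0            # instantaneous minimum output voltage, volts
RIPPLE_PK_PK = 1.0     # allowable output ripple, volts peak-to-peak
SENSITIVITY = 1.3      # output change per unit mains change (130%)
MAINS_LOW = 0.90       # mains at -10%
MAINS_HIGH = 1.05      # mains at +5%
V_REGULATED = 5.0      # assumed regulator output, from "(9.2 - 5)"
LOAD_AMPS = 1.0        # load current per regulator
THETA = 30.0           # degC per watt per sq in, regulator plus heatsink
MAX_RISE = 30.0        # assumed permitted temperature rise, degC

# Mean output at mains "low": the minimum plus half the peak-to-peak ripple.
mean_low = V_MIN + RIPPLE_PK_PK / 2

# Scale from mains "low" back up to nominal mains.
nominal = round(mean_low * (1 + (1 / MAINS_LOW - 1) * SENSITIVITY), 1)

# Scale from nominal mains up to mains "high".
mean_high = round(nominal * (1 + (MAINS_HIGH - 1) * SENSITIVITY), 1)

# Worst-case dissipation in the regulator and the heatsink area it needs.
dissipation = (mean_high - V_REGULATED) * LOAD_AMPS
area_per_amp = dissipation * THETA / MAX_RISE

if __name__ == "__main__":
    print(mean_low, nominal, mean_high, round(dissipation, 1), round(area_per_amp, 1))
```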
A P P E N D I X  5
WORD AND LABEL FORMATS FOR SOIC DATA
(i) Initial conditions
22 9
Type of Data (IC 
in this case)
20
Initial Condition (I.C)
(ii) Interconnection Data
Type of data
Single/double
Length
1 2
Inter/Intra
Connection Addresses
Element
Type
(iii) Iteration Count
Type of data
20
Number of iterations
(iv) Monitor Outputs
Type of data
20
20 Most significant (MS) 
bits of Y registers
(v) Interruptions
Type of data
Element Address
Auxiliary register flags 
(Y1, Y2, X1, X2) respec.
ADDENDUM 1
PROPOSAL FOR A NEW SOIC Mk II INTERCONNECTIONS TOPOLOGY FOR 64 ELEMENTS 
Consider the 64 elements as a chess-board arrangement:
[Figure: chess-board map of the 64 elements, arranged in 8 rows (groups) numbered 0 to 7.]
Proposed method for an element of fan-in = 6 (TI and E) is that each 
recipient element shall have access to any one element of a row. (Six 
rows out of 8 are chosen for this purpose.)
Each group (i.e. row as represented on the map) shall thus have access 
to certain combinations of outputs from 48 elements out of the 64.
Considering the chess board as a cylinder with a horizontal axis, 
i.e. row 0 joined to row 7, the 6 chosen rows (groups) for a given 
group shall be those immediately adjacent (including itself).
Example: Group 2's (row 2) elements shall all have access to 
outputs from groups 7, 0, 1, 2, 3, 4 (one output from each). The 
outputs chosen by one element in a recipient group would not have 
to be the same as for another element in the same group. Thus
element (2,6) could input Aq 2, A^ ,. and element (2,3) could
input A72, Aq2, A^^, a25, A37, A^7.
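The row selection on the cylinder can be sketched as follows. The offsets -3 to +2 are inferred from the single worked example (group 2 sees groups 7, 0, 1, 2, 3, 4); the function names and the (row, column) tuple representation are illustrative assumptions, not part of the proposal.

```python
ROWS = 8  # the chess-board has 8 rows (groups), joined row 0 to row 7

def accessible_groups(group: int) -> list[int]:
    """Six source rows seen by a recipient row on the cylinder."""
    # Offsets -3..+2 reproduce the example given for group 2: 7, 0, 1, 2, 3, 4.
    return [(group + offset) % ROWS for offset in range(-3, 3)]

def valid_selection(group: int, sources: list[tuple[int, int]]) -> bool:
    """True if sources holds exactly one (row, column) element from each accessible row."""
    rows = sorted(row for row, _ in sources)
    columns_ok = all(0 <= col < ROWS for _, col in sources)
    return columns_ok and rows == sorted(accessible_groups(group))

if __name__ == "__main__":
    print(accessible_groups(2))
```

Two elements of the same group may pass `valid_selection` with different column choices, which mirrors the freedom described in the example.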
The method shall be similar to the method proposed in the body of 
the thesis, i.e. largely time domain orientated. However, no 
distinction will now be necessary between intra- and inter-group 
interconnections as no intra-group autonomy would be in force.
This topology is much less sparse than the "Castle's Move" method, 
is slightly more costly but easier to program.
ADDENDUM 2
PROPOSAL FOR ENHANCEMENT OF SOIC Mk II MACHINE ELEMENTS' DYNAMIC
RANGE
Introduction
Simulations have shown that the lack of any scaling for AVI and 
Multiplication causes amplitude scaling difficulties, particularly 
when such elements are mixed with conventional "time integrators".
Proposal
The inputs (Δ²) to AVI and M elements shall be presentable at 
different significances relative to ΔX and ΔY. This will give 
powers of 2 scaling over a range 2^0 -> 2^-8. The extra ΔX and ΔY 
significance shall be provided by extending the least significant 
ends of ΔX and ΔY.
However, in order not to change the basic machine element design 
(and to avoid an imbalance in the logic), the Δ² extension shall not be 
joined to the ΔY and ΔX registers, but merely used to generate a 
modified stream of Δ² signals. In order that this stream shall be 
correctly sequenced as far as the 'all important' Y and X registers 
are concerned, they will also have significance extensions. These 
extensions will also not be physically joined to the existing 
registers. However, these extensions will provide the necessary 
"integral" control influence on the streaming of the modified Δ² 
signals to be presented to the machine element proper.
[Diagram: the incoming Δ²Y range spans the existing ΔY register and an 8 bit ΔY extension; beneath it lie the existing F[Y] and an 8 bit F[Y] extension.]
Modified Δ²Y (to M/C element) = Δ² F[Y] extension.
U.D.C. 681.332.64
A Programmable 
Extended Resolution 
Digital Differential 
Analyser
R. E. H. BYWATER, B.Sc (Eng.)* 
and
Professor W. F. LOVERING,
M.Sc , M.I.M.C., C.Eng., F.I.E.E.*
List of Symbols
Z          integral
Y          integrand
R          integral residue
P          potentiometer fraction register
H(i)       value of H at time i
H(0)       initial condition (t0) of H
ΔY, ΔZ     increment of Y, Z etc.
ΣΔY        summed increments for Y
t          time (iterations)
ΔX         increment of independent variable
S(H)       sign of H
ε          round-off error due to variable discretization
ω          sine wave generator angular frequency
φ          arbitrary phase angle
ż          integrator slew rate
S U M M A R Y
A digital differential analyser is described which 
features the economical use of parallel arithmetic, 
programmable interconnexions and internally-scaled 
integrators. It is suitable for connexion to a general-
purpose digital computer to form a hybrid facility and 
as such would be capable of solving dynamical 
problems and simulating systems. The use of carry-save 
techniques and a 'trapezoidal' integration algorithm 
make real-time simulation possible at an acceptable 
precision.
* Department of Electronic and Electrical Engineering, 
University of Surrey, Guildford, Surrey.
1. Introduction
There are many problems, for example, in the real-time 
simulation of continuous physical processes, where the 
solution rate of a general-purpose digital computer 
(g.p.d.c.) is inadequate. This stems, in part, from the fact 
that the g.p.d.c. uses sequential processing methods to 
obtain a solution. Although the solution time may be 
reduced by taking wide steps in conjunction with such 
methods as the Runge-Kutta integration procedure, the 
improvement possible is limited by the solution error/ 
instability which may be tolerated.
Although the use of special digital integrators was 
suggested in the 1950s, the cost and speed of available 
circuits offered little advantage.1-4 The situation has 
changed with the availability of modern microcircuit 
elements, and it is now possible to build differential 
analysers (d.d.a.) to considerable advantage. These 
devices can make use of special-purpose logic to effect 
substantial performance gains over g.p.d.c.s.5, 6 The 
use of digital integrators obviates some of the basic 
difficulties of analogue methods. It is possible to include 
logic decisions in a program. Such integrators may be 
simply interfaced with a digital computer to make a 
powerful hybrid facility, and are programmable without 
needing special analogue switches or servo-potentiometers.
This paper outlines the design of a d.d.a. which is able 
to use currently available components. In particular, 
the integrator is described together with suitable 
programmable interconnexion techniques.
2. D.D.A. Principles
2.1. The Digital Integrator
The basic form of a digital integrator is illustrated in 
Fig. 1. The inputs are increments in the values of the 
variables X and Y. A 'clock' signal is used to control the 
operation. At each clock pulse the inputs are 
interrogated. The increment ΔY (+1, 0, -1) is added 
to the previous value of Y and stored in a register. 
If the ΔX increment is positive, the contents of the Y 
register are added into the contents of the R register. If 
the ΔX increment is negative, Y is subtracted from the
The Radio and Electronic Engineer, Vol. 42, No. 5, May 1972 203
contents of the R register; if the ΔX increment is zero, 
the R register contents are unaltered. The R register thus 
accumulates the integral increments Y.ΔX. As the 
increment information can take only three states, it is 
known as the ternary transfer system. When the total 
in the R register reaches or exceeds some predetermined 
value ±M (= 2^n), an output signal is produced and the 
number M is subtracted from the register contents. 
Each output signal then represents an increment 
±M (= ΔZ) of Y.ΔX, that is, ΔZ = Y.ΔX.
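The ternary integrator described above can be sketched in a few lines. The class and method names are illustrative, and the sketch assumes |Y| stays below M so that at most one output quantum is produced per clock pulse; it is a behavioural model only, not a description of the hardware.

```python
class TernaryIntegrator:
    """Behavioural sketch of the basic ternary-transfer digital integrator."""

    def __init__(self, y0: int, n_bits: int):
        self.y = y0               # Y register (integrand)
        self.r = 0                # R register (integral residue)
        self.m = 1 << n_bits      # output threshold M = 2**n

    def step(self, dy: int, dx: int) -> int:
        """One clock pulse: take increments dy, dx in {-1, 0, +1}, return dz."""
        if dy not in (-1, 0, 1) or dx not in (-1, 0, 1):
            raise ValueError("ternary increments must be -1, 0 or +1")
        self.y += dy              # update the integrand
        self.r += self.y * dx     # add +Y, -Y or 0 into the residue
        if self.r >= self.m:      # positive overflow: emit +1, remove M
            self.r -= self.m
            return 1
        if self.r <= -self.m:     # negative overflow: emit -1, restore M
            self.r += self.m
            return -1
        return 0

if __name__ == "__main__":
    integ = TernaryIntegrator(y0=4, n_bits=3)       # Y = 4, M = 8
    print([integ.step(0, 1) for _ in range(4)])
```

With Y held at half of M, an output increment appears on every second clock pulse, so the output rate is proportional to Y as the text describes.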
Fig. 1. Basic digital integrator. (Diagram labels: ΔY input; clock control; Y (integrand) register; R (integral residue) register; add +Y, -Y or 0.)
In the simplest form, the R register is incremented 
only according to the current values of Y and ΔX. By 
the use of additional stores, the integration algorithm may 
be made to take account of the rate of change of Y and to 
introduce a corrective element. In this case, the manner 
in which the R register is incremented is controlled by 
the current and previous values of ΔY and ΔX. For 
simple time-dependent equations, the ΔX input is 
provided by a clock. The ability to integrate with respect 
to any variable, however, adds to the power of the d.d.a.
In an analogue computer, multipliers and sign reversers 
are needed. These are not necessary in a d.d.a. since
$$xy = \int x\,dy + \int y\,dx$$
can be simply programmed, obviating the need for a 
separate multiplier. The complement of a digital number 
is always available, and this also makes special sign 
reversers unnecessary.
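A minimal sketch of multiplication by two integrations follows. It works on exact integer increments and ignores register quantisation. One choice is made here that the paper does not state: the second integration uses the already-updated y, which absorbs the dx.dy cross term and makes the running sum exact. The function name is illustrative.

```python
def product_by_integration(increments, x0=0, y0=0):
    """Accumulate xy from increment pairs (dx, dy) using two integrations."""
    x, y = x0, y0
    xy = x0 * y0
    for dx, dy in increments:
        xy += x * dy      # first integrator: integrand x, increment dy
        y += dy
        xy += y * dx      # second integrator: integrand (updated) y, increment dx
        x += dx
    return x, y, xy

if __name__ == "__main__":
    steps = [(1, 1), (1, 0), (0, -1), (-1, 1)] * 5
    print(product_by_integration(steps, x0=3, y0=-2))
```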
At the expense of more electronic gates, the ΔZ signals 
may be obtained as binary words representing a range of 
values for the ΔZ increment. If this is done then, to solve 
a differential equation, subsequent integrators must be 
capable of accepting ΔY increments in the form of binary 
words. If a word ΔZ is used as a ΔX input, then a binary 
multiplication is needed to form Y.ΔX.
The effect of using 'word ΔZs' is to define more closely 
the value of an output increment. Thus, the difference 
between the true value of the increment and its 
quantized representation is reduced. For every extra bit 
that is used to form ΔZ, the average per-step round-off 
error is halved.
A  d.d.a. being an assemblage of digital integrators and 
other special-purpose logic elements (function generators, 
and the like), must have these elements interconnected in 
various ways to solve different problems. This is the basis 
of hardware programming. If simple transfer systems are 
used to interconnect these elements, i.e. ternary, then the 
interconnexion logic will be simple and inexpensive. 
However, use of word transfers increases the cost about 
in proportion to the number of bits transmitted so that
an engineering compromise is needed between solution 
accuracy and machine cost.
2.2. Program Patching
Connexion between analogue integrators needs only a 
single wire which then can convey an ‘infinity’ of voltage 
levels. Digital transmission, however, requires either one 
wire per bit of information or the data may be serialized 
and transmitted a bit at a time along a single wire.
If the output of a digital integrator is serialized, 
considerable time may be used in communicating integral 
outputs. For, say, 16- or 24-bit machines, this could be a 
large proportion of the total machine iteration time. 
However, the system does have the merit of simplicity 
and low cost.
The alternative, that is, parallel data transmission, is 
much faster, but somewhat more costly. However, the 
higher iteration rates that result allow simple integration 
algorithms to be used without loss of accuracy. The 
savings on algorithm complexity can do much to offset 
the greater cost of interconnexions. Furthermore the 
simpler algorithms tend to have a much more predictable 
performance.
It seems, therefore, that the best way to achieve a very 
high solution rate (as required for real-time simulation, 
etc.) is by parallel arithmetic techniques together with an 
efficient interconnexion method.7
This paper describes a method of obtaining reasonably 
fast interconnexions which is not costly. The method 
outlined is a parallel-bit, serial-word version of a total 
interconnexion topology. By transmitting a certain 
portion of a unit’s output (integral), it represents an 
engineering compromise between a high-cost total 
integral system and the low-cost ternary transfer method.
3. Overall System
The d.d.a., as built, is a parallel-arithmetic, 12-bit 
analyser capable of interconnecting 16 units (integrators, 
etc.) and has an iteration period of about 3 μs (min). 
Of this period, 1 μs is spent in the machine cycle proper, 
the other 2 μs in effecting the inter-unit connexions.
The integrators implement the trapezoidal correction 
method on an Euler prediction, and have built-in 
potentiometers for scaling the output of each integrator over a 
wide range. Integration can be with respect to time or the 
output of any one unit. All unit outputs are transmitted 
as 4-bit rounded increments.
4. Integrator Design
A  digital integrator may be considered as a device for 
integrating (or summing) quantized, sampled input data, 
so producing a quantized output. This is equivalent to 
finding the area under the graph of Fig. 2.
The actual input as seen by the integrator is the set of 
points only, and it is apparent that some uncertainty exists 
(and errors may be induced) by the amplitude and time 
discretization. However, the integrator has only these 
discrete values to work with, and to what good effect it 
uses these points, either singly or severally, for each 
integration step, will determine the precision of its 
output.8
Fig. 2. Time and amplitude discretization of a continuous input function.
The integrator implements an Euler prediction 
followed by a retrospective (first-order) trapezoidal 
correction (Fig. 3). Thus on step (i), the current 
ordinate y is Y(i) (= Y(i-1) + ΔY(i)). The integral is 
thus P.Y(i).Δt(i) where P is a fractional scaling factor 
(potentiometer fraction). Using incremental notation for 
input and output:
$$Y_{(i)} = Y_{(i-1)} + \Delta Y_{(i)}$$
or
$$Y_{(i)} = Y_{(i-1)} + \Sigma\Delta Y_{(i)}$$
(if the integrator is to sum several inputs)
and
$$\Delta Z_{(i)} + R_{(i)} = R_{(i-1)} + P\,Y_{(i)}\,\Delta t_{(i)} \qquad (1)$$
where ΔZ(i) is an increment of integral and 
R(i) is the untransmitted residue of integral of 
lesser significance than ΔZ(i).
The post correction (PC) using the new ordinate 
increment ΣΔY(i+1) is a triangle of area:
$$PC = \tfrac{1}{2}\,\Sigma\Delta Y_{(i+1)}\,\Delta t_{(i+1)} \qquad (2)$$
Equations (1) and (2) may be recast as follows:
$$Y_{(i)} = Y_{(i-1)} + \Sigma\Delta Y_{(i)}, \qquad \Delta Z_{(i)} + R_{(i)} = R_{(i-1)} + P\,\Delta t_{(i)}\,\{Y_{(i)} + \tfrac{1}{2}\Sigma\Delta Y_{(i)}\} \qquad (3)$$
It has been pointed out9 that equation (3) may be recast 
so as to avoid additions of quantities involving long 
registers, i.e. Y, R, as follows:
$$Y_{(i)} = Y_{(i-1)} + \tfrac{3}{2}\Sigma\Delta Y_{(i)} - \tfrac{1}{2}\Sigma\Delta Y_{(i-1)}, \qquad \Delta Z_{(i)} + R_{(i)} = R_{(i-1)} + P\,Y_{(i)}\,\Delta t_{(i)} \qquad (4)$$
Fig. 3. Euler integration of y = f(t) with retrospective trapezoidal 
correction. (Area A = prediction at t(i-1); B = correction to A deduced at t(i); C = prediction at t(i).)
Although one more addition is involved, the two 
additions of ΣΔY are of very short words. Furthermore, 
as will be shown in the hardware description, 
carry-save techniques can be more effectively implemented.
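The equivalence of the two ways of organising the update can be checked numerically. The sketch below compares a direct form, in which the integrand is Y plus half the current summed increment, with a recast form in which the Y register itself is advanced by (3/2)ΣΔY(i) - (1/2)ΣΔY(i-1), as described later for the Y adder. It assumes Δt = 1, a ΣΔY(i-1) register reset to zero at the start, and exact rational arithmetic in place of fixed-length registers; the function names are illustrative.

```python
from fractions import Fraction
from math import floor

def direct_form(increments, y0, p):
    """Integrand is Y(i) + (1/2) * summed increment; returns the dz stream."""
    y, r, out = Fraction(y0), Fraction(0), []
    for s in increments:                    # s = summed increment for this step
        y += s
        total = r + p * (y + Fraction(s, 2))
        dz = floor(total)                   # transmitted (integral) part
        r = total - dz                      # untransmitted residue
        out.append(dz)
    return out

def recast_form(increments, y0, p):
    """Y register advanced by (3/2)s(i) - (1/2)s(i-1); integrand is Y alone."""
    y, r, prev, out = Fraction(y0), Fraction(0), 0, []
    for s in increments:
        y += Fraction(3 * s, 2) - Fraction(prev, 2)
        prev = s                            # kept for the next iteration
        total = r + p * y
        dz = floor(total)
        r = total - dz
        out.append(dz)
    return out

if __name__ == "__main__":
    incs = [3, -2, 5, 0, -8, 7, 1]
    print(direct_form(incs, 10, Fraction(23, 32)))
    print(recast_form(incs, 10, Fraction(23, 32)))
```

Both forms produce the same output stream because the recast Y register always holds the direct form's Y plus half the latest summed increment.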
5. Integrator Hardware
The system is completely built around currently 
available 7400 series TTL medium speed integrated 
circuits. Lately, several new circuits have come on to the 
market which could improve packaging and performance.
Fig. 4. Integrator dependent variable input logic. (Diagram labels: ΔY input with carry logic; ΣΔY(i) and ΣΔY(i-1) registers (4 bits); Y adder and Y register; Y.P multiplier; ΔZ/R adder and R register; output ΔZ(i); adders are type 7483.)
In describing the integrators, it is difficult to be precise 
about the point where interconnexion logic gives way to 
integrator logic, particularly as the integrators, as built, 
can accept up to four inputs. However, the integrator 
will be described from the point of receipt of ΣΔY(i) 
prior to staticization. (Partial sums of ΣΔY(i) will have 
been stored, as formed, in a register, and the final value 
will be the sum of the penultimate partial sum and the last 
entry.)
5.1. Dependent Variable Logic
Figure 4 shows the logic for the ΣΔY(i) accumulator 
and the associated summing logic. The design is arranged 
to make the best use of available microcircuits both 
from the cost and performance viewpoints. Thus despite 
some logic redundancy, the quadruple full adder (7483) 
frequently occurs. The system built uses four bits (2's 
complement notation) to represent machine increments 
(ΔX, ΔY, ΔZ) giving a range -8 ≤ ΔX, ΔY, ΔZ ≤ +7. 
An integrator fan-in of four thus causes ΣΔY to lie in the 
range -32 ≤ ΣΔY ≤ +28, i.e. 100000 through 011100, 
i.e. six bits.
Carry-save techniques are used in the accumulator. 
The carry-out from the least significant (l.s.) quadruple 
adder is stored in a single flip-flop, to be applied to the 
carry-in on the next clock pulse. After the final 
interconnexion clock pulse, the last 'saved' carry is assimilated 
into the intermediate sum of digits of the m.s. part of 
ΣΔY(i). The assimilator (Fig. 4) uses a logic array 
implementing:
$$x = a\,\overline{(bc)} + \bar{a}\,bc = a \oplus (bc) \quad\text{and}\quad y = b \oplus c = b\bar{c} + \bar{b}c \qquad (5)$$
Carry-save allows selection of ΔYs to take place at about 
8 MHz, despite the adder logic delays. The final 
carry assimilation delay is about 20 ns. This method 
compares with a typical, estimated, selection rate of 4 
to 5 MHz if carry-save is not used. The six bit ΣΔY(i) 
is added, suitably weighted, to ΣΔY(i-1) to form 
B = (3/2)ΣΔY(i) - (1/2)ΣΔY(i-1). The whole expression is 
formed in two stages by two cascaded adders. Two 
possible reduced versions of B are
$$B = 2\,\Sigma\Delta Y_{(i)} - \tfrac{1}{2}\Sigma\Delta Y_{(i)} - \tfrac{1}{2}\Sigma\Delta Y_{(i-1)} = d + e + f$$
or
$$B = \Sigma\Delta Y_{(i)} + \tfrac{1}{2}\Sigma\Delta Y_{(i)} - \tfrac{1}{2}\Sigma\Delta Y_{(i-1)} = g + h + f \qquad (6)$$
for which a total of six combinations of processing exist, 
i.e. (d + e) + f, (e + f) + d, etc. The method chosen is 
(h + f) + g as it calls for the least logic without impairing 
the machine performance. The machine, with an 
integrator fan-in of four and four-bit increments, needs only 
seven-bit adders at both the first and second stages. 
Pairs of quadruple full adders are used so that fan-in 
expansion can take place, if desired.
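The carry-save accumulation and the two groupings of B can be sketched behaviourally. The sketch assumes that the saved carry from the low 4-bit adder is fed to the more significant part on the following clock pulse and assimilated once at the end; the paper does not spell out this routing, so it is one plausible reading. All function names are illustrative.

```python
from fractions import Fraction

def carry_save_sum(increments):
    """Accumulate 4-bit two's complement increments with one saved carry."""
    lo, hi, saved = 0, 0, 0
    for inc in increments:
        if not -8 <= inc <= 7:
            raise ValueError("increment must fit in 4 bits")
        nibble = inc & 0xF                 # low 4 bits as seen by the adder
        sign_ext = -1 if inc < 0 else 0    # sign extension, weight 16
        hi += sign_ext + saved             # carry saved on the previous pulse
        total_lo = lo + nibble
        saved = total_lo >> 4              # carry-out held in the flip-flop
        lo = total_lo & 0xF
    hi += saved                            # final assimilation of the last carry
    return hi * 16 + lo

def b_groupings(s_now, s_prev):
    """Return B formed as d + e + f and as (h + f) + g."""
    d, e, f = 2 * s_now, -Fraction(s_now, 2), -Fraction(s_prev, 2)
    g, h = s_now, Fraction(s_now, 2)
    return d + e + f, (h + f) + g

if __name__ == "__main__":
    print(carry_save_sum([7, 7, 7, 7]), carry_save_sum([-8, -8, -8, -8]))
    print(b_groupings(12, -5))
```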
There is only combinatorial logic between the ΣΔY(i) 
accumulator all the way through to the ΔZ logic 
(Fig. 4). All the register updates, i.e. Y(i), R(i), etc., are 
formed as 'spin-off' results down the logic chain. This 
is only slightly more expensive than a system which forms 
and staticizes intermediate results at each stage, since 
although adder time sharing cannot take place, the 
expense of adder input selection logic is eliminated. A 
system such as this, which uses a number of logically 
cascaded adders, gives a better performance than a 
time-sharing adder system. The total carry propagation 
through all the adders is only slightly greater than that 
through one. In addition, delays in adder selection logic 
are obviated.
When the iteration has finished, the ΣΔY(i-1) register 
is updated with the current value of ΣΔY(i) ready for 
the next iteration. The same clock signal is used for 
updating R(i) and ΔZ(i). At each iteration
((3/2)ΣΔY(i) - (1/2)ΣΔY(i-1)) is added to the current contents 
(Y(i-1)) of the Y (ordinate) register. This is done 
with a separate set of three 7483s (Fig. 4) so as to be 
consistent with the combinatorial method mentioned.
The ΣΔY(i-1) register has a reset line which is actuated 
when the rest of the machine is reset. Ideally, at time 
t = 0, the ΣΔY(i-1) register would be preset to the slope 
of Y. In practice, this initial slope is not known, unless a 
'dummy' iteration has been carried out with guessed 
values of ΣΔY(0) to give a reasonable estimate. (This 
would be somewhat on the lines of the Adams-Moulton 
predictor-corrector scheme.)
However, for iterative computing and hill-climbing, 
it is desirable, at least, to have repeatable initial 
conditions for all parts of the machine. Resetting the ΣΔY(i-1) 
register to zero at t = 0 satisfies the condition of 
repeatability. The error induced is not very great as it only 
directly affects the first iteration.
The Y register is capable of being overloaded 
(overflow), and suitable logic is appended to detect this. As 
only addition of Y(i-1) and B can take place, the overflow 
(F) is simply described by
$$F = 1 \ \text{if}\ [S(Y_{(i)}) \neq S(Y_{(i-1)})] \wedge [S(Y_{(i-1)}) = S(B)]$$
The Y register can be foreshortened from 12 to 8 bits if 
higher speed, lower accuracy computation is required. 
Separate overflow logic is sited at the appropriate points 
on the Y(i) adder, to cater for both situations.
5.2. Scaling Logic
In order to reduce the number of possible machine 
elements, and hence interconnexion logic, the output of 
each integrator should be scalable by some constant. This 
is equivalent to hardwiring a potentiometer to each 
integrator output in an analogue system. Furthermore, it 
was decided that, as machine expansion might demand 
that more than one scaled value should be made 
available, the logic design should be capable of such an 
expansion merely in the form of added logic, not a complete 
redesign. Because scaling involves the multiplication of 
one number (the potentiometer fraction, held in a 
register) by the current contents of the Y register, it 
would appear that a full multiplier is necessary. However, 
despite the fact that the machine contains 12-bit Y and 
8-bit P fields, only a 12-bit product is required, of which 
4 bits are used for ΔZ. Thus P can be reduced to 4 bits 
provided these 4 bits are chosen so that (a) they average out to the 
desired 8-bit fraction over a large number of iterations 
and (b) the average P at any time is as nearly rounded to 
the desired fraction as possible.
The P register is constrained to perform in the manner 
outlined by adding a second register and adder which 
accumulate the amount by which the currently used 4-bit 
version of P differs from the desired fraction. The 
operation is not unlike that of the ΔZ/R logic. Thus P is added, 
once per iteration, into its auxiliary P' register and the 
most significant 4 bits of the sum are used as the multiplier 
for forming Y(i).P. The less significant residue is 
staticized in P' to be accumulated with P on subsequent 
iterations (Fig. 5).
To form a fraction 23/32 from a 4-bit multiplier (m), 
i.e. m = k/8 where -8 ≤ k ≤ 8, requires m to assume 
values of:
6/8, 6/8, 5/8, 6/8, 6/8, 6/8, 5/8 ... etc. at t = i, i+1, etc.
The limit at which there is any meaning in the precision 
of P is determined by the rate at which Y(i) can change 
from its initial value at (i) while P is being run through a 
cycle of values whose mean corresponds to the desired 
fraction (in this case 23/32 in 4 cycles). This will be 
determined by the machine accuracy required, the 
integrator fan-in, and the significance at which B is added 
to Y.
Fig. 5. Scaling factor generator.
Before computation starts, P' is preset to +1/16 so that 
the top 4 bits of P' are suitably rounded to the nearest 1/8 
quantum to give the best multiplier for any given step. 
For the example cited (P = 23/32), the register contents are:

t                     0       1       2       3       4     etc.
P' (sum)            1/16    25/32   24/32   23/32   26/32   etc.
P' (m.s. 4 bits)     0       6/8     6/8     5/8     6/8    etc.
P' (l.s.)           1/16    1/32     0      3/32    1/16    etc.
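The register contents for the 23/32 example can be regenerated with a short sketch. It uses exact fractions in place of the 8-bit P and P' registers, and the function name is illustrative; the preset of half a quantum (1/16) follows the description above.

```python
from fractions import Fraction

def scaling_multipliers(p, steps, quantum=Fraction(1, 8)):
    """Return the 4-bit multipliers produced by the P / P' accumulator."""
    residue = quantum / 2                  # P' preset to half a quantum (1/16)
    multipliers = []
    for _ in range(steps):
        total = residue + p                # P added into P' once per iteration
        m = (total // quantum) * quantum   # most significant part, in eighths
        residue = total - m                # less significant part stays in P'
        multipliers.append(m)
    return multipliers

if __name__ == "__main__":
    ms = scaling_multipliers(Fraction(23, 32), 8)
    print([str(m) for m in ms], "mean =", sum(ms) / len(ms))
```

Over each cycle of four iterations the multipliers average exactly 23/32, although every individual multiplier is a multiple of 1/8.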
The multiplier is fed with Y(i) on one side (12 bits) and 
P' on the other (4 bits). It was found that an economical 
multiplier could be formed from three 7483s by encoding 
P' such that 2 bits at a time multiplication was carried 
out on either side of the adder, i.e. either ±1 or ±1/2 
times Y(i) was set to one side and either ±1/8 or -1/4 times Y(i) 
to the other. Thus the selection table for all P' becomes:

P' (x 1/8)       -8   -7   -6   -5   -4   -3   -2   -1    0    1    2    3    4    5    6    7    8
L.H.S.           -1   -1  -1/2 -1/2 -1/2 -1/2   0    0    0    0   1/2  1/2  1/2  1/2   1    1    1
R.H.S. (x 1/8)    0    1   -2   -1    0    1   -2   -1    0    1   -2   -1    0    1   -2   -1    0
Adder carry-in   [entries illegible]

For P' = -6/8 and -5/8, a carry-in of significance 2^1 (not 
2^0) is required as both left- and right-hand factors to the 
adder are negative. In fact, occasional bottom bit errors 
thus introduced are of little importance compared with 
those due to per step round-off error (relative significance 
to round-off is 2^-q where q is the length of the R register).
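The two-sided encoding of the multiplier can be expressed as a small decomposition. The modular formula and the function name below are a convenience of this sketch, not something given in the paper; they are chosen so that the right-hand factor always lies in {-2, -1, 0, 1} eighths and the left-hand factor in {-1, -1/2, 0, 1/2, 1}.

```python
from fractions import Fraction

def split_multiplier(k: int):
    """Split m = k/8 into (left-hand factor, right-hand factor in eighths)."""
    if not -8 <= k <= 8:
        raise ValueError("k must lie in -8..8")
    rhs = ((k + 2) % 4) - 2          # right-hand side: -2, -1, 0 or 1 (x 1/8)
    lhs = Fraction(k - rhs, 8)       # left-hand side: -1, -1/2, 0, 1/2 or 1
    return lhs, rhs

def scaled(y: int, k: int) -> Fraction:
    """Form Y.(k/8) as the sum of the two partial products."""
    lhs, rhs = split_multiplier(k)
    return lhs * y + Fraction(rhs, 8) * y

if __name__ == "__main__":
    for k in range(-8, 9):
        print(k, split_multiplier(k))
```

Only k = -6 and k = -5 give two negative factors at once, which corresponds to the special carry-in case mentioned in the text.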
5.3. Output Logic 
The scaled value of Y(i) is applied to the R register in a 
conventional way to form the sum (R(i-1) + P.Y(i)). 
The 4 most significant bits of this sum form ΔZ(i), the 
least significant are stored as a less significant residue in 
the R register. The updating of this register is under the 
command of the main clock. As with other incremental
systems, the output (ΔZ) can be rounded correctly to the 
nearest quantum by setting R0 to +1/2 unit quantum before 
computation starts, thus ensuring that the error due to 
discretization (ε) of ΔZ lies in the range -1/2 ≤ ε < +1/2 
and not -1 < ε < 0 as would otherwise occur. The 
improvement is far better than two-fold as the error is 
now symmetrical about zero.
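The effect of presetting the residue register can be seen in a small sketch. It tracks the running difference between what has been transmitted and the true accumulated integral, in units of one output quantum. The sign convention (transmitted minus true), the quantum of 16 and the function name are assumptions of this sketch.

```python
from fractions import Fraction

def running_errors(values, quantum, preset):
    """Running (transmitted - true) error, in quanta, after each step."""
    residue, sent, true_sum, errors = preset, 0, 0, []
    for v in values:
        residue += v
        dz = residue // quantum          # transmitted increment (floor)
        residue -= dz * quantum          # untransmitted residue stays in R
        sent += dz
        true_sum += v
        errors.append(Fraction(sent * quantum - true_sum, quantum))
    return errors

if __name__ == "__main__":
    data = [5, 7, 3, 11, -2, 9, -13, 6]
    print([str(e) for e in running_errors(data, 16, preset=0)])
    print([str(e) for e in running_errors(data, 16, preset=8)])
```

With no preset the error is one-sided, whereas a preset of half a quantum keeps it within half a quantum either side of zero.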
If required, the P register, multiplier, R register and 
ΔZ logic may be repeated several times so as to produce 
several scaled versions of the integral. The most 
economical approach is probably to have a scaled and an 
unscaled version of the output, thereby making only an 
extra R (ΔZ) adder/register necessary. The integrator 
may be connected to accept one 4-bit word (e.g. an 
output ΔZ from an integrator) as a ΔX input. The 
multiplier is then used to form the product Y.ΔX 
(instead of Y.P used in scaling). Thus each integrator 
may be used to give either scaled time integration or 
unscaled integration with respect to an arbitrary variable.
If it should be required to form (Y.ΔX1 + Y.ΔX2) 
this can be effected by two integrators receiving the same 
ΔY signal and different ΔX signals. The outputs are then 
summed in the following machine element.
6. Interconnexions
The d.d.a. built is capable of interconnecting any 
machine element to any other by logic gating. The result 
is a compact system which not only allows programmed 
interconnexions, using paper tape or computer control, 
but also provides the hardware for entering initial con­
ditions to the machine elements.
For economy, a simple interconnexion system is used 
making use of the time domain to effect all the possible 
interconnexion paths (Fig. 6). The 4-bit outputs from the 
16 machine elements are concurrently staticized in a 4 
by 16-bit shift register under main clock command. They 
are then serially entered, via the shift register, to a 4-wire 
busbar connecting all the machine elements. Whilst 
this shifting is taking place, a shift register, sited in each 
integrator, is circulated. The pattern in this shift register 
corresponds to those ΔZ which the machine element 
containing this shift register requires to accept.
Thus a '1' can be used for acceptance of a given 
increment, '0' for rejection. Necessarily, as many shift 
pulses are required to circulate the data as there are 
machine elements.
Two embellishments can be applied to this method.
(i) The first, which has been implemented, is to add a 
second circulatory shift register to each integrator. 
This is used to determine whether any given 
accepted increment should be accepted per se, or 
complemented, i.e. a 0 for original form, a 1 for 
complemented form. This makes programming 
much more flexible and economical as inverters 
are rendered unnecessary. Thus, the same output 
from a given integrator may be presented in its 
true form to some integrators and in its 
complemented (negated) form to others. This 
technique also makes it possible to restrict the range 
of P to positive values only, without loss of 
programming flexibility.
(ii) The second possible improvement is to multiple-
rank the ΔZ and selector shift registers, say, 
either in twos or fours and make each only a half 
or quarter of its original length. No extra logic is 
required to store the ΔZ or selection/negation 
patterns, only the ability to add more than one 
4-bit increment at a time into the ΣΔY(i) 
accumulator. The extra logic is not very great. 
However, there is approximately a two- or four-fold 
decrease in the interconnexion time. This may be 
worthwhile in a large d.d.a.
An alternative way of reducing the 
interconnexion time is by a change of the 
interconnexion topology. This is discussed in the 
'Machine Expansion' section below.
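The time-domain selection scheme, including the sign-selector register of embellishment (i), can be sketched as a loop over one shift pulse per machine element. The list-of-bits representation and the function name are assumptions of this sketch, and complementing is modelled simply as arithmetic negation.

```python
def accumulate_inputs(outputs, select, negate):
    """Sum the broadcast increments accepted by one machine element.

    outputs: the 4-bit increments (-8..7) placed on the busbar, one per pulse
    select:  1 to accept the increment on that pulse, 0 to reject it
    negate:  1 to take an accepted increment in complemented form
    """
    if not (len(outputs) == len(select) == len(negate)):
        raise ValueError("one selector bit and one sign bit per element")
    total = 0
    for dz, accept, complement in zip(outputs, select, negate):
        if accept:                       # selector register bit for this pulse
            total += -dz if complement else dz
    return total

if __name__ == "__main__":
    dz_store = [3, -1, 0, 7, -8, 2, 0, 0, 1, 1, -4, 5, 0, 0, 6, -2]
    select = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
    negate = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
    print(accumulate_inputs(dz_store, select, negate))
```

With 16 machine elements the loop runs for 16 pulses. A real element would additionally limit the number of accepted inputs to its fan-in, which this sketch does not enforce.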
Fig. 6. Interconnexion logic. (Diagram labels: ΔZ outputs from integrator 1 etc.; 4 by 16-bit shifting ΔZ store (1 per machine); interconnexions clock; selections shift register (1 per machine element); busbar.)
7. Input/output
7.1. Machine Loading 
Since the interconnexion system communicates from a 
central point to every machine element it may be used for 
initial loading. The four ΔZ registers together with a fifth 
16-bit register are connected during the loading phase as 
one single 80-bit shift register.
The following initial conditions are serially entered to 
the 80-bit shift register (Fig. 7):
(i) The scaling fraction P
(ii) The initial value of Y (Y(0))
(iii) Interconnexion selector contents
(iv) Interconnexions sign selector contents
(v) ΔX selector contents.
Fig. 7. Data entry/ΔZ store register matrix.
[Diagram labels: ΔZ entry; ΔY selection; ΔY complementing pattern; data input register/ΔX selector; serialized data input (1 line); parallel data entry/hand-keyboard (2 × 8 or 16 lines); display/monitor panel (16 lines).]
In effect, five 16-bit registers are connected as one long 
shift register to allow data assembly prior to distribution. 
The fifth shift register is arranged to accept data either (a) 
purely serially from a telecommunications line, (b) in 
parallel from an I/O busbar, or (c) separately from panel 
push-buttons. It is evident that making the ΔZ registers 
double as data assembly registers has two consequences:
(i) The combined lengths of the shift registers used 
must be sufficient to hold all the initial conditions 
for a single integrator. For this machine, the P  and 
Y  registers are 8 and 12 bits respectively and the 
three selectors each of 16 bits. The shift registers 
must also be individually long enough to hold the 
4-bit outputs from each machine element. The 
machine built, therefore, has a 5 by 16-bit shift 
register matrix.
(ii) Each machine element must be capable of routing 
accepted data from the busbars either to the initial 
conditions registers during loading or the incre­
ment input registers during computation. This 
was achieved by broadcasting the initial conditions 
serially along the ΔZ busbars to all machine 
elements and, at the same time, clocking only the 
element that was to input these data. Although 
there are only 4 ΔZ bits broadcast by the 
interconnexion logic during computation, there are 
5 busbars so that all the initial conditions can be 
transmitted in the same way. Machine loading is 
carried out serially because the frequency of load­
ing compared with that of iterating is so small that 
the extra logic needed for parallel loading is dif­
ficult to justify. Furthermore, the very high gate/ 
pin ratio so far achieved for the machine elements 
would be largely lost if parallel loading were 
implemented. The possibility, as i.c. technology 
advances, of putting complete elements on a chip, 
is quite evident. However, chip technology is 
allowing gate counts per chip to rise faster than 
lead-outs. To some extent this is to be expected, 
but it does mean that computer sub-systems of 
ever higher gate/pin ratios are going to be 
demanded if full use of l.s.i. is to be realized. 
(There is also a maintenance/reliability penalty
208 The Radio and Electronic Engineer, Vol. 42, No. 5
associated with i.c.s of high lead-out counts.) 
The same arguments tend to make serial 
interconnexion methods more desirable than parallel 
ones even if there is a speed penalty.
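As an illustration of the data assembly described in this section, the following sketch packs the five initial-condition words for one integrator into an 80-bit serial stream. The field order, the right-justified placement of P and Y within their 16-bit registers, and the least-significant-bit-first shifting order are assumptions; the text fixes only the field widths (8, 12 and three of 16 bits) and the 5 by 16-bit register matrix.

```python
def assemble_load_bits(p, y, select, sign, dx_select):
    """Return the 80-bit load stream as a list of bits.

    Each field is given as an unsigned bit pattern and occupies one
    16-bit register (assumed right-justified). Bits are emitted
    least-significant first (an assumed shifting order).
    """
    fields = [(p, 8), (y, 12), (select, 16), (sign, 16), (dx_select, 16)]
    bits = []
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("field does not fit its register")
        bits.extend((value >> k) & 1 for k in range(16))
    return bits


# Hypothetical patterns: P = 0x7F, Y(0) = 0x800, all sources selected,
# no complementing, first Delta-X source selected.
stream = assemble_load_bits(0x7F, 0x800, 0xFFFF, 0x0000, 0x0001)
```

The stream length is 5 × 16 = 80 bits, of which only 8 + 12 + 3 × 16 = 68 carry information, which is why the combined register length is sufficient for a single integrator.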
7.2. Data Output 
The transmission of data between machine elements 
is incremental (4 bits). Furthermore, the use of the single-
step hardware method for implementing the trapezoidal 
algorithm means that a Y register rarely contains the 
actual variable y that might require monitoring. (Y 
corresponds to y only if ΣΔY(t) = 0, whereupon any 
outstanding ½ΣΔY(t−1) has already been processed.) 
Thus, if inspection of any given y is required, it is evident 
that the most desirable way is to assemble the ΔZ in 
counters which can be addressed like any other machine 
element. That is, the ΔZ may be accumulated to form 
Z (= ΣΔZ), see Fig. 8. Data assembly of this sort might 
be used for outputting information to the general-purpose 
digital computer or to parallel logic/decision systems local 
to the d.d.a. These might include comparators and 
limiters and g.p.d.c. interruption generators. Also, the 
output may, via digital-to-analogue convertors (d.a.c.s), 
be used to monitor solutions on a c.r.o., strip recorder or X-Y plotter.
Fig. 8. Addressable d.d.a. monitor logic.
[Diagram labels: selectors (1) and (2) on the ΔZ busbar; adders (1) and (2) accumulating Z(1) and Z(2); DAC(1) and DAC(2) driving outputs (1) and (2); comparator testing Z(1) = Z(2), etc., and raising a g.p.d.c. interruption.]
The machine, as it stands, merely makes use of a pair 
of d.a.c.s to monitor the outputs of two selected integra­
tors. These are wired across at present, for simplicity. 
However, to provide the extra facilities cited is very 
straightforward. In particular, providing the outputs to 
the input/output busbar of a g.p.d.c. allows the latter’s 
peripheral devices to be used, if required. Such a pro­
vision, together with a data input facility, can provide 
completely ‘closed shop’ hybrid computing devoid of the 
necessity for analogue computer style patchboards or 
expensive servo-setting electronics.
8. Performance
8.1. General Considerations
The total iteration time of a parallel d.d.a. is a com­
bination of integration and interconnexion time. The 
former might be termed ‘useful’ algorithm time, the latter 
an undesirable overhead. It is desirable to make the inter­
connexion/integration time ratio as small as possible. 
This ratio, to a first approximation, is a constant for a 
given system design, regardless of technology used.
With 7400 series TTL a clock rate of about 8 MHz 
can be achieved for the interconnexion phase. The 
limitation lies in the ability of the ΣΔY logic to 
assimilate selected ΔY from the ΔZ busbar. The use of 
carry-save in the input logic is thus well justified as it 
represents a strengthening of the weakest link. (The 
shift registers are capable of conventional operation at 
about 20 MHz.) The interconnexion period for this 
16-element, single-rank selection machine is thus
16 × 0.125 µs = 2 µs.
The integration time, which starts from the final 
selection, is dependent purely on combinational logic, 
followed by a single clock pulse to staticize the result in 
the ΔZ register. The time is that of a cascaded carry-
save network and not substantially greater than a single 
carry-propagate add time. This arises because all the 
adders are ‘settling’ (starting at the l.s. end), almost 
concurrently. There is approximately a 1-bit stagger 
between each stage. However, this is not consistent at 
all times because the 7483 (quad full adder) that is widely 
used has an internal carry predictor. Thus, assuming 
the carry-in is settled (and the 9 inputs), the carry-out is 
true before the sum outputs. Furthermore, the latter are 
not generally produced with a straightforward stagger 
because of production spreads, etc. However, the time 
measured on several integrators indicated a worst case 
ΔZ/ΔY output delay of rather less than 1 µs. (Quite 
evidently, a factor of 8-10 improvement in the figures 
might be expected as a direct consequence of using a 
faster logic family, e.g. ECL.)
The total iteration time is thus 3 µs for the machine, 
as built. From this, the performance of the elements can 
be derived in a manner directly comparable with both 
g.p.d.c.s and analogue components.
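The arithmetic behind the iteration time can be restated as a short sketch. The function name and its parameters are illustrative; the default values (16 elements, an 8 MHz interconnexion clock, an integration time taken as the 1 µs upper bound) come from the text. The `ranks` parameter models the multiple-ranking embellishment, under the assumption that the integration time is unaffected by it.

```python
def iteration_time_us(elements=16, shift_clock_mhz=8.0,
                      integrate_us=1.0, ranks=1):
    """Estimate the total iteration time in microseconds.

    Interconnexion needs one shift pulse per machine element, divided
    by the number of ranks examined per pulse. The integration time is
    assumed to be independent of the ranking.
    """
    shift_pulses = -(-elements // ranks)          # ceiling division
    interconnect_us = shift_pulses / shift_clock_mhz
    return interconnect_us + integrate_us


as_built = iteration_time_us()            # single-rank machine
four_rank = iteration_time_us(ranks=4)    # hypothetical four-rank variant
```

With the defaults this gives 2 µs of interconnexion plus 1 µs of integration. The four-rank figure shows how the overhead ratio falls when the selectors are multiple-ranked, since only the interconnexion term shrinks.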
8.2. Slew Rate
The trapezoidal algorithm, using 4-bit increments 
and a 12-bit word, gives a slew time from maximum 
negative to maximum positive Y of
$$\frac{2^{12} \times 3}{2^{4}}\ \mu\text{s} \approx 770\ \mu\text{s}. \qquad (8)$$
This is, of course, at the full 12-bit precision of the Y 
register and is not directly comparable with the slew rate 
of analogue comparators and the like. It is comparable 
with g.p.d.c.s.
8.3. Sinewave Rate
From this slew time the sinewave speed can be derived. 
Alternatively, it may be obtained by considering a 
sine/cosine loop in which a given integrator 
output ψ is given by:
$$\psi = A \sin(\omega t - \phi)$$
$$\dot{\psi} = A\omega \cos(\omega t - \phi) \qquad (9)$$
i.e. $\dot{\psi} = A\omega$ at $\omega t = \phi$.
The maximum slew rate (ξ) for ψ occurs at
$$t = \frac{2K\pi + \phi}{\omega}, \quad K = 0, 1, 2, \ldots$$
i.e. ξ(max) = Aω.
Fig. 9.
(a) Solution of ẏ + ky = 0, parameter k
(b) Solution of ÿ + ky = 0, y(0) ≠ 0, ẏ(0) = 0, parameter k
(c) Solution of ÿ + ky = 0, y(0) ≠ 0, ẏ(0) ≠ 0, parameter k
Fig. 10.
(a) Solution of ÿ + aẏ + by = 0; y displayed
(b) Solution of ÿ + kẏ + ay = 0; y displayed; parameter k
(c) Solution of third-order differential equation illustrating 
overloading of integrators
PROGRAMMABLE EXTENDED RESOLUTION D.D.A.
Therefore, for this machine, the maximum sinewave 
speed is about 400 Hz. Due to the interaction between 
the truncation and round-off errors in the system, the 
amplitude remains constant for a large number of cycles and 
is certainly accurate to ±1 bit of the Y register for 100 cycles.
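The figure of about 400 Hz can be checked from the quoted slew time of about 770 microseconds and the relation between maximum slew rate, amplitude and angular frequency. The sketch below assumes a full-scale sinewave, i.e. an amplitude of half the 12-bit register range; that choice of amplitude and the function name are assumptions, not statements from the paper.

```python
import math


def max_sine_frequency_hz(word_bits=12, slew_time_s=770e-6):
    """Estimate the highest full-scale sinewave frequency.

    The register can traverse its whole range (2**word_bits counts) in
    slew_time_s, and a sinewave of amplitude A and angular frequency w
    has a maximum slew rate of A * w.
    """
    full_range = 2 ** word_bits            # counts, max negative to max positive
    amplitude = full_range / 2             # assumed full-scale amplitude
    max_slew = full_range / slew_time_s    # counts per second
    omega = max_slew / amplitude           # from max slew rate = A * omega
    return omega / (2 * math.pi)


f_max = max_sine_frequency_hz()
```

Under these assumptions the result is a little over 400 Hz, in line with the figure quoted in the text. A sinewave of smaller amplitude could be generated at a proportionately higher frequency, at the cost of resolution.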
8.4. Parameter Scanning
The main advantages that have been obtained from the 
construction of this machine, apart from its solution, lie 
in its versatility for use in iterative computing situations. 
These include the ease and speed with which it may be 
loaded, reprogrammed, restarted or made to act on con­
ditions arising during computation.
Compared with the conventional analogue computer, 
it can be loaded with both initial conditions and patching 
information directly from a paper tape reader, keyboard 
or computer interface. The loading speed is about 60 000 
machine elements per second provided that the data 
source can match this rate. This applies whether the 
machine is being initially loaded or reprogrammed part 
way through a problem.
Iterative computing usually requires that problems be 
solved for a variety of initial conditions or potentio­
meter coefficients. It may even require changes to the 
equations ‘patched’. In a hybrid computer, these 
changes are sometimes carried out by parallel logic 
specially patched in or by a digital computer. The pro­
gram is usually held up for some time so that servo- 
set potentiometers, etc., may be actuated.
The d.d.a. can effect any of these changes very quickly 
just by changing the contents of one or more registers, 
and this is ideally suited for high-speed iterative com­
puting. Consider, for example, the analysis of a network or 
structure which must be described by different equations 
at the boundaries. The d.d.a., without any physical modi­
fication, is capable of analysing the system, node by node, 
and reconfiguring itself as boundaries are approached.
Some examples of parameter scanning are shown in 
Figs. 9 and 10. These have been achieved through parallel 
logic which can be readily introduced to the system, as 
desired.
Figure 9(a) shows solutions to a simple first-order 
differential equation where the time-constant (= 1/k) is 
varied as a parameter. As many values of k may be used 
as there are values of potentiometer fraction available. 
In this machine, P may be varied in steps of 1/128 from 
−1 to +1. Only five solutions have been shown in 
Fig. 9(a) for clarity. The waveform shown uses only the 
6 most significant bits of the solution so that digitization 
is visible. Furthermore, as it is the content of a Y 
register that is displayed, small ‘impulses’ in the 
waveform are visible. These pre-empt a final change of value 
of Y, and indicate the action of the integrator logic 
implementing the trapezoidal algorithm. This causes the 
Y register to be offset by half the ΔY increment on any 
step.
Figure 9(b) shows a family of sinewaves being 
generated using various values of P to control the 
frequency. If non-zero conditions are used for both 
integrators, then varying P affects both the amplitude and 
frequency (Fig. 9(c)).
Figures 10(a) and (b) show solutions to second-order 
equations, once again, with a 6-bit digitization of the 
output. P  registers can be used to vary both frequency 
and damping factor through a wide range of values.
Because 2’s complement number notation is used, 
digital integrators do not saturate, but merely recycle 
through their range of values. Thus, if 1 is added to the 
maximum positive value that can be accommodated in 
a register, the maximum negative quantity results. In the 
solution of an equation, an overloading integrator 
will suffer a step change of value equal to its dynamic 
range ($2^{12}$ in this machine). The integrator after it will 
then suffer a step change of slope as shown at the top of 
Fig. 10(c), and the next integrator a step change of slope 
rate.
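The recycling behaviour of a 2's complement register can be shown with a few lines of arithmetic. This is a sketch of the number representation only, not of the integrator hardware; the helper name `wrap` and the example values are illustrative.

```python
def wrap(value, bits=12):
    """Reduce an integer to the range of a bits-wide 2's complement register."""
    span = 1 << bits                      # dynamic range, 4096 for 12 bits
    return (value + span // 2) % span - span // 2


# An integrator close to positive full scale receiving constant increments.
y = 2040
trace = []
for _ in range(3):
    y = wrap(y + 5)
    trace.append(y)
```

The second addition carries the register past the maximum positive value of 2047, so the stored value jumps by the full dynamic range of 4096 and continues from the negative end.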
8.5. Loading
The time to load a single integrator (with all 5 words of 
data) is limited in the present machine by the input device 
(a 200 characters per second paper tape reader). Even if 
a very fast parallel computer interface was provided, this 
would still be the case as the 8 MHz shift rate is capable 
of disposing of a byte (character) in 1 µs, giving a load 
time inclusive of data assembly in the ΔZ registers of 
14 µs. The 8 MHz rate, as has been stated, is a limitation 
imposed by the integrators’ ΣΔY logic. If a separate 
clock were provided for data assembly/loading, the load 
time could be approximately halved as all active logic 
in this phase is only in the form of conventional shift 
registers.
8.6. Accuracy of Computation
The accuracy of the d.d.a. is governed in very much the 
same way as a g.p.d.c. Both data round-off effects and 
independent variable discretization (truncation) gradually 
erode the precision of solutions on a per step basis. The 
only major difference between the g.p.d.c. and the d.d.a. 
is that the former generally incurs round-off errors by 
permanently losing the lower bits of variables after 
multiplication by Δt or ΔZ. The d.d.a., on a given step, 
suffers the same round-off, but the non-transmitted data 
are preserved in the R (residue) register to be accumulated 
with subsequent low significance integral increments. The 
round-off error therefore tends to manifest itself as a 
varying time or phase delay with a non-linear characteristic 
not unlike that exhibited by a g.p.d.c. algorithm.
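The difference between discarding the low-order bits and keeping them in a residue can be sketched as follows. This is a simplified model, not the integrator described in the paper: it accumulates an integer integrand scaled by a hypothetical factor of 2^-8 per step, ignores the 4-bit limit on transmitted increments, and uses invented function names.

```python
def integrate_with_residue(y_values, frac_bits=8):
    """Emit integer increments while keeping the remainder in a residue.

    Returns (sum of transmitted increments, final residue).
    """
    r = 0
    total = 0
    for y in y_values:
        r += y                       # accumulate the full-precision product
        dz = r >> frac_bits          # transmitted (high-order) increment
        r -= dz << frac_bits         # residue retained for the next step
        total += dz
    return total, r


def integrate_discarding(y_values, frac_bits=8):
    """Same integration, but the low-order bits are lost on every step."""
    total = 0
    for y in y_values:
        total += y >> frac_bits
    return total


steps = [100] * 256
kept = integrate_with_residue(steps)
lost = integrate_discarding(steps)
```

With a constant integrand of 100 over 256 steps, the residue version transmits a total of 100 with nothing left over, whereas the discarding version transmits nothing at all. The increments in the first case arrive later than they ideally would, which is one way to picture the round-off appearing as a delay rather than a permanent loss.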
The truncation errors are exactly akin to those of a 
g.p.d.c. algorithm, the effects of which may be deter­
mined by solution of the associated difference equations. 
The only point of divergence lies in the fact that both 
hardware and algorithmic approaches to integration 
inevitably allow the round-off and truncation errors to 
interact. Because the effects of the round-off error differ, 
this reflects, to a small extent, on the truncation errors 
that are apparently induced.
The machine built uses a trapezoidal integration 
algorithm to provide a first-order post-correction for the 
well known Euler method. This, for a wide variety of 
problems, shows good stability with both a predictable 
and moderately easily calculated error growth rate. Also, 
because of the high cost per integrator of implementing
R. E. H. BYWATER and W. F. LOVERING
the more exotic algorithms, this was, to some extent, an 
engineering compromise.
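The benefit of the trapezoidal correction over the plain Euler (rectangular) rule can be seen in a small numerical comparison. The sketch below works on sampled values of an integrand in floating point; it illustrates the arithmetic idea only and is not the one-step hardware method used in the machine. The function names and the test integrand are assumptions.

```python
def rectangular(samples, h):
    """Euler (rectangular) rule: each step uses the value at its start."""
    return h * sum(samples[:-1])


def trapezoidal(samples, h):
    """Trapezoidal rule: each step uses the mean of its two end values."""
    return h * sum((a + b) / 2 for a, b in zip(samples, samples[1:]))


# Integrate y = t from 0 to 1 in ten steps; the exact answer is 0.5.
h = 0.1
samples = [i * h for i in range(11)]
euler_result = rectangular(samples, h)
trap_result = trapezoidal(samples, h)
```

For this linear integrand the rectangular rule underestimates the integral by half a step's worth of slope on every step, while the trapezoidal rule recovers it exactly. For curved integrands the trapezoidal rule still carries an error, but of higher order in the step size.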
The machine built also employed a 4-bit transfer 
method as a compromise between the high round-off 
errors per step associated with, say, binary Δ(−1, +1) 
or ternary Δ(−1, 0, +1) transfer methods on the one 
hand, and transmission of the integral, in toto, on the 
other hand. The compromise on the latter part must 
necessarily depend on the expected number of iterations 
per solution for the problems likely to be encountered.
9. Machine Expansion
There are several ways in which this basic design can 
be extended both for the purposes of software simplifica­
tion and reduction of the demands on outside hardware, 
particularly in the interface to the g.p.d.c.:
(a) To include in each integrator a separate ‘Y’ 
register which is loaded at the same time as the 
operational one. This would store the initial 
condition, i.e. would not be updated during 
computation, but would refresh the operational Y  
register, if necessary. This would allow more ready 
implementation of certain simulation languages, 
in particular, the Simulation Council Inc. version, 
CSSL.¹¹ In this, the updating of registers for each 
run is done on an exceptions-only basis, which is 
particularly powerful for fast parameter sweeping 
or hill-climbing. From the hardware viewpoint, it 
reduces the data flow through the interface and 
hence loading of the g.p.d.c.'s I/O busbar. It does 
mean, of course, that selective or addressable 
updating must be built in rather than the hitherto 
simpler approach of a systematic refill scan through 
all the machine elements.
(b) As mentioned in the description of the inter­
connexions method, other machine topologies can 
be tried with a view to reducing the time taken for 
interconnexions. This is very important for large 
machines. The extension can also include the use 
of the space as well as time domain, i.e. by 
multiple ranking of selectors (channels). In the 
limiting case of use of the space domain only, a 
totally combinatorial system is achieved but is 
bound to be costly, and raises the problem of con­
currently adding several input increments to any 
integrator. ‘Treeing’ of adders to do this, even 
with carry-save techniques, will accumulate many 
gate delays. A possible solution to this problem is 
the (partial) use of look-up tables. Such tables, at 
present generally manufactured as metal oxide 
silicon devices, are very fast but would need to be 
very capacious for high fan-in machine elements. 
A compromise solution may lie in using them 
just for summing each set of equally weighted 
bits from a group of incoming increments. A  
single device could then be time-shared for each 
significance of such increments.
As an example, a 256-word by 4-bit (1024 bits) 
m.o.s. read-only memory (r.o.m.) could sum 8 
equally weighted bits to produce a 4-bit word of
conventionally weighted bits. This is not, at 
present, a standard chip design, but the introduc­
tion of programmable r.o.m.s (p.r.o.m.) has 
allowed non-standard contents to be electrically 
formed, thereby eliminating the high pattern 
design cost.
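The look-up table suggestion can be made concrete with a short sketch. The table below has 256 words, each holding the count of ones in its 8-bit address, which fits in 4 bits; the same table is then reused for each bit significance in turn. Treating the increments as unsigned is an assumption made to keep the example short (sign handling is omitted), and the function names are illustrative.

```python
# 256-word table: each word is the number of 1s in its 8-bit address (0..8).
ROM = [bin(address).count("1") for address in range(256)]


def sum_equal_weight_bits(bits):
    """Sum eight equally weighted bits with a single table look-up."""
    address = 0
    for b in bits:                    # pack the incoming bits into an address
        address = (address << 1) | (b & 1)
    return ROM[address]


def sum_increments(increments, width=4):
    """Add up to eight unsigned increments, one bit column at a time."""
    total = 0
    for k in range(width):            # the table is time-shared per significance
        column = [(x >> k) & 1 for x in increments]
        column += [0] * (8 - len(column))
        total += sum_equal_weight_bits(column) << k
    return total


column_sum = sum_equal_weight_bits([1, 0, 1, 1, 0, 0, 1, 0])
increment_sum = sum_increments([3, 5, 7, 1, 0, 15, 2, 9])
```

Each look-up replaces a tree of adders for one bit column; the partial sums are then weighted and combined. The table holds 256 × 4 = 1024 bits, matching the device size mentioned in the text.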
10. Conclusion
The d.d.a. built has indicated that, with technology 
and systems practices now available, it is justifiable to 
build this special form of hardware for real-time 
simulation. The speed advantage gained from special- 
purpose logic structures can more than offset the limited 
repertoire of the systems.
As little as five years ago it would have been, in most 
circumstances, very difficult to obtain a clear advantage 
from such techniques. Also, the development, in the last 
decade, of sophisticated design tools such as c.a.d., 
automatic circuit design/layout and logic fault detection 
methods, allows new electronic systems to come into 
being more quickly, and with a greater chance of initial 
success, when manufactured. From the system architec­
ture viewpoint, the d.d.a. represents one of the structures 
most amenable to parallel processing techniques and this 
is, no doubt, due to the comparative simplicity of each 
machine element, in particular its interface with the out­
side world. This, together with the very small number of 
lines needed to effect this interface, makes it an attractive 
logic base for large-scale integration.
11. Acknowledgments
The authors wish to acknowledge, with thanks, the 
support given to this project by Professor D. R. Chick 
of the Department of Electronic and Electrical Engineer­
ing, University of Surrey.
12. References
1. Owen, P. L., Partridge, M. F., and Sizer, T. R. H., ‘CORSAIR - 
a digital differential analyser’, Electronic Engineering, 32, 
p. 740, 1960.
idem, ‘A transistor digital differential analyser’, J. Brit. I.R.E., 
22, p. 83, August 1961.
2. Forbes, G. F., ‘Digital Differential Analysers’ (Pacoima, 1957).
3. Donan, J. F., ‘The serial memory dda’, Computation, 6, p. 102, 
1952.
4. Von Handel, P. (Ed.), ‘Electronic Computers’, pp. 139-209 
(Springer Verlag, Vienna, 1961).
5. Benyon, P. R., ‘A review of numerical methods for digital 
simulation’, Simulation, November 1968, pp. 219-238.
6. Kopal, Z., ‘Numerical Analysis’ (Chapman & Hall, 1955).
7. Hyatt, G. P. and Ohlberg, G., ‘Electrically alterable digital 
differential analyser’, Proc. A.F.I.P.S. Spring Joint Computer 
Conference, 1968, pp. 161-169.
8. Sizer, T. R. H. (Ed.), ‘The Digital Differential Analyser’ 
(Chapman & Hall, London, 1968).
9. Bywater, R. E. H., ‘One step integration method for digital 
differential analysers’, Electronics Letters, 6, p. 613, 1970.
10. Hatvany, J., ‘The d.d.a. integrator as the iterative module of 
a variable structure process control computer’, Automatica, 5, 
No. 1, pp. 41-9, 1969.
11. Strauss, Jon C. (Ed.), ‘The SCi continuous system simulation 
language (CSSL)’, Simulation, December 1967, pp. 281-303.
Manuscript received by the Institution on 29th November 1971. 
(Paper No. 1446/Comp. 140.)
© The Institution of Electronic and Radio Engineers, 1972
