University of Wollongong

Research Online
Department of Computing Science Working
Paper Series

Faculty of Engineering and Information
Sciences

1981

A hybrid real time performance analyser
P. J. McKerrow
University of Wollongong, phillip@uow.edu.au

Recommended Citation
McKerrow, P. J., A hybrid real time performance analyser, Department of Computing Science, University of
Wollongong, Working Paper 81-5, 1981, 24p.
https://ro.uow.edu.au/compsciwp/16

A HYBRID REAL-TIME PERFORMANCE ANALYSER
Phillip John McKerrow
Department of Computing Science,
The University of Wollongong,
Post Office Box 1144,
Wollongong, N.S.W. 2500
Australia.

ABSTRACT
Hybrid monitors attempt to take advantage of the complementary
nature of hardware and software tools. Research into ways of
dividing the monitoring functions between the hardware and
software sections of a tool and co-ordinating their interaction
has guided the design of an easy-to-use hybrid performance
analyser. The proposed tool can be added to an existing system or
implemented as part of an integrated design of a new system.

The real power of the tool is in its ability to reduce and
analyse the data being read. A hierarchy of performance
measurement strategies is discussed. Execution time, execution
path, execution frequency and stimulus information can all be
obtained without having to refer to the code of the modules,
considerably simplifying the measurement process.

Real-time analysis provides the information needed for model
development and verification, module spectral analysis,
bottle-neck analysis and adaptive system control.

Keywords: Hybrid Performance Analyser, Performance Evaluation,
Hardware Monitoring, Software Monitoring, System Design,
Operating System Design, Adaptive Control, Data Collection, Data
Analysis, Module, Module Number, Task, Event, Execution Time,
Execution Path, Logic State Analyser, Performance Evaluation
Models, Cache Memory, Spectral Analysis, Real-time Analysis,
Bottle-neck Analysis.

1. INTRODUCTION

Performance evaluation can be split into three broad areas:
measurement of system parameters, evaluation of the data and
modifications to improve performance. The goal of the evaluator
is to improve system performance by either identifying and
correcting bottle-necks or by selecting optimum system constants.
Some workers [4, 7, 30] have suggested that the ultimate goal of
performance evaluation research is to be able to measure and
adaptively control performance indices in real-time in order to
maintain optimum performance under varying system load conditions.
The state of the art of control theory, and technology, is
such that practicing control engineers generally feel that if a
parameter, in an industrial plant, can be measured or accurately
modelled, then it can be controlled. Computing power is no
longer a limitation in developing highly sophisticated control
algorithms and systems. In order to develop the necessary
control algorithms, extensive measurement, modelling and
evaluation must be done.
When evaluating and controlling performance the analyst
first has to decide what to measure and then how to measure it.
A high performance monitoring tool designed to provide accurate
real-time information should:
(i) be simple to use, with an ergonomic human interface,
(ii) be flexible enough to provide both microscopic
     (instruction level) and macroscopic (process level)
     measurements,
(iii) provide data reduction, analysis and display in real-time,
(iv) not require an intimate knowledge of the system before any
     useful results can be obtained,
(v) eliminate the need to spend hours digging through code,
(vi) cause minimal performance degradation due to interference,
(vii) not be expensive.

The design of such a tool requires an integrated approach to
system design.
If, at system design time, the indices to be
measured and the method of measurement can be specified then the
necessary hardware and software can be included in the computer
and operating system. This would provide considerable improvement over the current method of adding tools to an existing system, often in an ad-hoc way.

2. HUMAN ENGINEERING

Increasingly computer users are faced with black boxes built
without regard for the user's need for information about the system operation. In many ways, computers complement human intelligence but if computers fail to provide information that is easy
for the user to assimilate, confidence in the system is diminished.
Often one types in a copy command on a dual floppy-disc system and wonders if the copy is going the correct way. Lights
indicating reading and writing operations would give the user
increased confidence and more rapid feedback. Computers are
excellent at handling numbers and tables; humans at recognizing
patterns and pictures. The removal of speakers and front panels
from computers has made them less friendly.
The operating system used in a real-time control project [2]
was modified to illuminate an individual light on the console for
each process, when that process was active. As many of the
processes were cyclic, due to the nature of the external
machines, a regular pattern could be observed on the lights when
the system was operating correctly. This enabled the operation
of a complex control system to be analysed visually. On a number
of occasions, control system problems were diagnosed, by an
engineer, using information about the light pattern relayed over
the phone by electricians who subsequently fixed the faults.
Often by concentrating on the technical aspects of a problem, we
overlook human engineering and consequently miss an elegantly
simple solution. Looking at the performance measurement problem
from a human engineering and control system point of view led to
the proposal discussed in this paper.

3. MEASUREMENT TECHNIQUES AND PROBLEMS

The main sources of measurement data are existing accounting
software, hardware monitors attached to the system and software
monitors. Normally, accounting data does not contain sufficient
information and a hardware or software monitor is required.
Conceptually, a measuring instrument consists of four parts
(figure 1). The sensor's task is to sense the magnitude of the
quantity being measured and thus it includes the software or
hardware probes. In the transformer section the measured data is
manipulated and reduced to the desired form, e.g. a state
analyser's ability to distinguish desired states from a
continuous data stream. After data reduction the measured
information can be displayed directly or it can be analysed and
the results of the analysis displayed.

3.1. Hardware Monitors

Hardware tools consist of additional hardware attached to
the computer's backplane via test probes. Attaching the probes
requires a detailed knowledge of the backplane and a clear
specification of which signals are to be monitored. Consequently
they tend to make maintenance people nervous.
They can detect
events occurring at microscopic levels with high accuracy, e.g.
individual instruction fetches. Thus, potentially their scope is
broad since most of the interesting points in the system can
usually be reached. The effect of the increasing scale of chip
integration can be offset if probe points are included at design
time.
Hardware monitors are often used to check the accuracy of
software monitors.
Their main advantage over software monitors
is that they do not interfere with the system being measured.
They are inherently limited to measuring information that can be
interpreted at the hardware level without knowledge of operating
system activities. Consequently, information like cpu service
time and file activity by process cannot easily be obtained.
Distribution estimates, e.g. mean channel service time, can be
obtained for the system as a whole but not for specific work load
classes. For these reasons hardware monitors are often
supplemented by software monitors or accounting data.
Logic state analysers used as hardware monitors [1] provide
very accurate microscopic measurement, e.g. instruction execution
path and time.
They can also be used for certain types of
macroscopic measurement, e.g. process execution time or path, but
again they lack operating system specific information.
It is
difficult to analyse some processes (e.g. the scheduler) without
intimate knowledge of the current system load.

3.2. Software Monitors

In event-driven software monitors significant events are
defined (e.g. process switch) and the operating system is
modified to record information about the event. Small measurement
sessions can be recorded in memory but most require disc file or
tape storage. Monitoring detail is limited by the number of
events recorded. Consequently, if after data analysis it is
found that a significant event has not been recorded then at
best the session has to be rerun, and at worst the operating
system has to be modified to record the event.
An event-driven monitor has greater flexibility than a
sample-driven monitor but is likely to interfere with the system
more. Depending upon the particular system and monitor, a
software monitor may consume 20% or more of system resources
(particularly cpu and channel time) and thus produce very
questionable results. If this overhead can be kept to 5%, by
appropriate event definition and probe implementation, software
monitor results are reasonably accurate [3].
In addition to interfering with system performance, software
monitors have two other significant problems. First, the amount
of data produced by an event trace may require excessive
post-session data reduction before meaningful results can be
obtained. Secondly, software monitors must be specifically
designed for the operating system and machine architecture. If
not designed into the system they can be difficult to add.
However, software monitors can get at operating-system-specific
information (e.g. system queues) which hardware monitors cannot.
Also they can readily associate physical activity with logical
entities (e.g. disc access with file name). In many ways,
software and hardware tools complement one another. Hybrid tools
have been developed to try to utilize the advantages of both.

3.3. Hybrid Monitors

Hybrid tools require the addition of both extra hardware and
software to the system to be measured. They consist of external
hardware tools which receive data collected by a software or
firmware tool running in the system being measured [14, 27]. The
hardware monitoring device is no longer invisible to the
operating system, but it is treated as an "intelligent peripheral
device" which can be used by a software monitor [16, 23]. Some
even allow sections of the hardware tool, e.g. event counters, to
be allocated to users under the control of software. A device of
this type has been used to derive an on-line histogram of
subroutine utilization and other similar tasks [17]. Another type
of hybrid monitor uses part of the existing hardware (a channel
processor) as the monitor [18].
Burroughs have included a monitor micro instruction in the
firmware of some of their computers [13, 19]. This instruction
writes a bit pattern specified by the programmer to pins
accessible to the probes of a hardware tool. Nutt [16] indicates
that the system/monitor interface technique used by Hughes and
Cronshaw [21] and by Ruud [22] best illustrates the trend toward
hybrid monitoring.
A hybrid monitor (figure 2) consists of a hardware monitor
plus a data channel between the monitor and the computer being
evaluated, which can be used for transferring information between
the two. While using the hardware probes to monitor an event the
data channel can be used to obtain information about the stimulus
of the event, thus overcoming the inherent limitation of hardware
monitors. Alternatively, an event driven software monitor can
interrupt the external monitor which picks up the required event
data from memory, via the data channel, thus reducing the
overhead of the software monitor. In addition, software can
request a measurement to be made by the hardware. The addition of
computing power into the monitor allows considerable real-time
data reduction and analysis to be carried out and increases the
flexibility of the monitor.

[Figure 2. Basic Components of a Hybrid Monitor. Block diagram:
the computer under evaluation (CPU, memory, I/O, address bus,
data bus, DMA channel, serial link) connected through probes and
a data channel to the hybrid monitor (interface, data reduction,
computer and data analysis, control, keyboard and display,
secondary storage, recording unit).]
Thus hybrid monitors are an attempt to take advantage of the
complementary nature of hardware and software tools. Ferrari [4]
comments that the simultaneous usage of co-operating tools of the
same or different nature, their coordination, and the
partitioning of their functions and jurisdictions create problems
which are still open to research. The proposal developed in this
paper is an attempt to solve some of these problems.

4. COMPUTER AND O/S DESIGN FOR MONITORING

The majority of monitoring tools are designed to monitor
existing systems. Consequently, serious problems, due to the
constraints imposed by the organization of the system being
monitored, are often encountered. These problems include:
(i) Events of interest are not accessible.
(ii) Excessive interference.
(iii) Difficulty in verifying collected data.
(iv) Modifications to the system to provide the required data
     may be difficult, risky or expensive.

These problems could be minimised if a comprehensive set of
measurement facilities were included during system design. This
practice has not been popular so far among computer manufacturers
even though such tools would then be available for use during
system implementation and debugging. Research projects into this
have included the instrumentation of Multics [6] and of PRIME
[5]. One of the main problems faced by the designers was to
predict what events would be of interest for measurement purposes
once the system was implemented. As a result, both projects
adopted a mixture of ad-hoc fixed tools and general purpose
tools.
To be effective in the long term, performance evaluation has
to be considered when the system, both hardware and software, is
designed. We should no longer design hardware, software and
instrumentation separately and then patch until they work. An
integrated approach is necessary. Before such an integrated
approach is achieved, performance evaluators will have to be able
to specify more clearly, at design time, what they want to
measure and why. Current research into modelling methods is
providing much of this information. When performance evaluation
theory has developed to the stage where what is to be measured,
and why it is to be measured and evaluated, can be specified at
system design time, then the manufacturers are more likely to
include instrumentation in their systems.

5. HYBRID ANALYSER DESIGN - DATA COLLECTION

Before commencing to evaluate the performance of a specific
computer system, the evaluator must first define what is meant by
"performance" of the system. This includes establishing what the
user, or whoever requested the evaluation, wants out of the
system and what the system was designed to do in the first place.
Performance indices can then be defined for the system and the
measurements needed to give values to these indices specified.
Thus performance requirements and performance indices will
vary from system to system and from user to user within the same
system, e.g. the systems programmer and the payroll clerk have
completely different requirements. However, even though the
actual variables measured depend upon the context, the type of
measurement is similar.
Measurements of interest to all include:
(i) Execution time.
(ii) Execution path.
(iii) Frequency of execution.
(iv) Bottleneck identification within a task.
(v) Stimulus information related to the above.

The above hold true whether context is defined as a user
program, a segment of an operating system, a whole operating
system, a disc server or a model of the system. A proposed hybrid
real-time performance analyser designed to make these
measurements is discussed in the following sections.
Implementation effort is currently directed toward verifying the
basic theory.

5.1. Hardware Probes

The hardware to be added to the system under measurement is
a simple digital output port (figure 3) and some connectors. The
output port consists of a sixteen bit module-number
probe-register and a thirty-two bit stimulus probe-register; each
with a light emitting diode display and a connector. The stimulus
probe-register can be used to store a single thirty-two bit
variable, two sixteen bit variables or four eight bit variables.
The light emitting diodes are for visual analysis and user
confidence only.
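
The three ways of using the thirty-two bit stimulus probe-register can be illustrated with a small packing routine. This is only a sketch: the paper does not specify the order of the fields within the register, so placing the first variable in the most significant bits is an assumption, and the function names `pack_fields` and `unpack_fields` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def pack_fields(values, width):
    """Pack equal-width variables into one 32-bit stimulus word.

    Assumption: the first value occupies the most significant bits.
    """
    if width not in (8, 16, 32):
        raise ValueError("field width must be 8, 16 or 32 bits")
    count = 32 // width
    if len(values) != count:
        raise ValueError(f"expected {count} values of {width} bits")
    word = 0
    for value in values:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
        word = (word << width) | value
    return word & MASK32


def unpack_fields(word, width):
    """Split a 32-bit stimulus word back into its equal-width variables."""
    if width not in (8, 16, 32):
        raise ValueError("field width must be 8, 16 or 32 bits")
    count = 32 // width
    mask = (1 << width) - 1
    return [(word >> (width * (count - 1 - i))) & mask for i in range(count)]
```

The analyser would need to know, for each module, which of the three layouts is in use; in the proposal that information belongs in the module map.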
Software can store information in these registers pertaining
to program operation at any time. When either probe-register is
updated a clock pulse is sent to the analyser. Software overhead
would be less in a hardware system that used memory mapped
input/output than in a system that had special input/output
instructions, especially if they could only be executed in
supervisor mode.

[Figure 3. Hardware Probes Connection. Block diagram of the host
computer (CPU, memory, I/O, DMA control, intelligent peripheral
controller, optional disk) showing the signals brought out to the
analyser: the module number and stimulus information probe
registers, the address bus, the status lines (bus master,
interrupt level, instruction/operand/data fetch, memory
read/write, user/supervisor mode), clock 1 (memory access),
clock 2 (probe register full), the serial link and the DMA
connection.]
Other hardware includes connectors for a serial
communications port, a direct memory access channel (only
necessary in some situations) and state analyser signals
including the address bus and processor status lines, e.g.
user/supervisor mode, interrupt level, type of memory access,
etc.

5.2. Software Probes

Every module of code (including processes, traps, interrupt
handlers, subroutines) is given a unique module number. A module
is a contiguous piece of code that is executed in response to an
event.
An event is any action that initiates a significant
change in system state.
Events include interrupts, software
traps, process switches, subroutine calls and, at a higher level,
user commands, but do not include programme branches. The first
statement in a module writes the module number to the output
port. The number remains there until that module completes execution and the next module overwrites it. When a subroutine is
called the calling process saves the number of the calling module
so that on return from the subroutine it can be written to the
output port. Thus the number in the module-number probe-register
always corresponds to a unique module and is there only while
that module is executing (or not executing in the case of idle).
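
The software probe convention described above can be sketched in a few lines. This is an illustration under stated assumptions, not the implementation used in the project: the `ProbePort` class merely stands in for the sixteen bit module-number probe-register, the module numbers are invented, and module number 0 is assumed to mean idle.

```python
import functools
import itertools

IDLE = 0  # assumed module number for "no module executing"


class ProbePort:
    """Stand-in for the 16-bit module-number probe-register."""

    def __init__(self):
        self.current = IDLE
        self.trace = []                 # (tick, module number) per register write
        self._tick = itertools.count()  # stands in for the analyser's clock

    def write(self, module_number):
        if not 0 <= module_number <= 0xFFFF:
            raise ValueError("module number must fit in 16 bits")
        self.current = module_number
        self.trace.append((next(self._tick), module_number))


PORT = ProbePort()


def probed(module_number):
    """Write the module number on entry and restore the caller's on return."""
    def wrap(func):
        @functools.wraps(func)
        def inner(*args, **kwargs):
            caller = PORT.current       # number of the calling module, saved
            PORT.write(module_number)   # first action of the module
            try:
                return func(*args, **kwargs)
            finally:
                PORT.write(caller)      # rewritten on return from the call
        return inner
    return wrap


MAIN, ISQRT = 1, 2  # hypothetical module numbers from a module map


@probed(ISQRT)
def isqrt(n):
    return int(n ** 0.5)


@probed(MAIN)
def main():
    return isqrt(16) + 1
```

Calling `main()` leaves the sequence of module numbers 1, 2, 1, 0 in `PORT.trace`, which is the stream the analyser would clock in from the probe-register.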
In the case of a re-entrant module, such as a terminal
driver, additional numbers can be written to the stimulus register indicating, in this case, terminal number and calling module
number. In the case of a disc driver, module information such as
disc number, channel number, file number, number of records
transferred or calling module number could be written out.
Operating system specific information, e.g. number of active
tasks, can be written out when operating system modules, e.g. the
scheduler, are executing.
A system module map must be available to the operator,
preferably in a form that can be transferred to the analyser for
display to the analyst and for use by the analysis programmes.
Entries in the map include the name, function, module number and
stimulus variables that are written out for each module.
For many evaluation studies it may be necessary to update
the stimulus probe-register several times during the module. The
sequence in which information is written to the stimulus
probe-register is defined in the module map. Sequence order
variation between different module execution paths can be
compensated for by assigning several sequential module numbers to
the module. In effect a module can be broken up into sub-modules,
each with a unique module number, if the situation requires. The
flexibility of the software tool can be increased if stimulus
information is obtained using pointers rather than directly. This
allows either a table of pointers from which one, defined by a
global index, is selected, or pointers that can be modified
either by software from a terminal or by the external analyser
over the direct memory access connection.

5.3. Implementation

The hardware and software needed in the computer system are
easily included in an integrated system. A structured approach
to operating system design should simplify the inclusion of the
necessary software. The overhead created by the inclusion of the
software should be minimal and in fact can be measured by the
analysis tool. If the overhead is considered too high it could
be reduced by using an automatic checkpoint insertion system
similar to that used in the Informer [23] where the software is
only installed for the measurement period. Modifications to an
existing operating system may be difficult particularly if it has
grown like Topsy. Partial implementation to allow study of
specific modules can easily be achieved either by modifying
process start and termination supervisor calls or by providing a
small output routine for user programmes. In either case a
module map will be needed.
In a new design the additional hardware could be easily
included. Existing hardware could be modified by adding a
digital output port (preferably memory mapped) to the system.
Other probe points should also be wired to connectors to make
attaching and detaching the analyser easier. A further advantage
of the registers being memory mapped is that the analyser can
test the probes out by writing to the registers over the direct
memory access channel and then reading the registers via the
probes. The address bus probes could also be checked.

5.4. Analyser

The signals available for analysis are:
(i) Probe register contents - module number etc.
(ii) Address bus.
(iii) CPU status signals
      Processor mode, user/supervisor,
      Memory read/write,
      Memory read type, instruction fetch, operand fetch,
      data fetch,
      Bus master, CPU/DMA,
      Interrupt mask/level.
(iv) Serial link.
(v) DMA connection.
(vi) Clock signals
      Memory access,
      Probe registers full.

These signals must be collected, reduced, analysed,
displayed and recorded in real-time. The type of tool (figure 4)
needed is an extension of the logic state analyser [1, 24] but
with greater intelligence and flexibility. Thirty-two bit
microcomputers have the power needed to make this possible at
relatively low cost. It must be easily attached to the computer
to obtain the above signals, using quick-connect connectors, and
must not interfere with the host hardware. The typical probe
width is 98 bits consisting of:
(i) probe register 48 bits,
(ii) address bus 32 bits,
(iii) CPU status 16 bits,
(iv) and 8 bits for clock generation.

State analyser front end technology typically samples at 25 MHz,
sufficient for reading individual memory cycles of current mini
and micro-computers.
The ability of the state analyser [1] to selectively trigger
and selectively trace specified sequences of states needs to be
expanded to allow at least twenty states to be specified in each
sequence. It should be possible to specify a trigger sequence
and a trace sequence for each clock. The resultant time-stamped
trace will be a mixture of both traces. A flag will indicate
which sequence the sample belongs to. Also the ability to trace
a number of unknown states before and after a specified state
would be very useful for determining which programme called the
module under study.
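
Sequence triggering with a window of states traced before and after the trigger point can be sketched as follows. The class name `SequenceTrigger` and its parameters are hypothetical, and the sketch assumes that the states of the trigger sequence must occur in order but need not be adjacent in the sampled stream; a real analyser would do this in high speed comparator hardware.

```python
from collections import deque


class SequenceTrigger:
    """Trigger on an ordered sequence of states, keeping a trace window.

    Assumption: other states may occur between the states of the sequence.
    """

    def __init__(self, sequence, pre=2, post=2):
        if not sequence:
            raise ValueError("trigger sequence must not be empty")
        self.sequence = list(sequence)
        self.pre = deque(maxlen=pre)  # most recent states seen before the trigger
        self.post_left = post         # states still to record after the trigger
        self.matched = 0              # how many sequence states have been seen
        self.capture = None           # the trace, once triggered

    def sample(self, state):
        """Feed one sampled state; return the trace when complete, else None."""
        if self.capture is None:
            if state == self.sequence[self.matched]:
                self.matched += 1
                if self.matched == len(self.sequence):
                    # Trigger point: freeze the pre-trigger history.
                    self.capture = list(self.pre) + [state]
            self.pre.append(state)
        elif self.post_left > 0:
            self.capture.append(state)
            self.post_left -= 1
        done = self.capture is not None and self.post_left == 0
        return self.capture if done else None
```

With the trigger sequence `[7, 9]`, `pre=2` and `post=1`, the stream 1, 2, 7, 3, 4, 9, 5 yields the trace `[3, 4, 9, 5]`: the two states preceding the final trigger state show how the stream arrived there.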
In addition the analyser should be double buffered so that
it can monitor continuously. Current state analysers stop
reading when their buffer is full. Even in their pseudo-continuous
mode, input data is ignored until the buffer is transferred to a
mass storage device. Continuous monitoring allows the analyser
to analyse, display and record one trace while continuing to
trace, consequently no data is lost.
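
A minimal sketch of the double buffering idea, assuming a fixed buffer size and a consumer that removes full buffers with a hypothetical `take` call. Data is lost only if the analysis side fails to empty one buffer before the other fills, which the `lost` counter makes visible.

```python
class DoubleBuffer:
    """Two trace buffers: one fills while the other is being analysed."""

    def __init__(self, size):
        if size <= 0:
            raise ValueError("buffer size must be positive")
        self.size = size
        self.filling = []   # buffer currently receiving samples
        self.ready = None   # full buffer awaiting analysis, or None
        self.lost = 0       # samples dropped because both buffers were full

    def _swap(self):
        # Hand over the full buffer only if the previous one has been taken.
        if self.ready is None:
            self.ready, self.filling = self.filling, []

    def store(self, sample):
        if len(self.filling) == self.size:
            self._swap()
            if len(self.filling) == self.size:
                self.lost += 1   # analysis has fallen behind
                return
        self.filling.append(sample)
        if len(self.filling) == self.size:
            self._swap()

    def take(self):
        """Remove and return the full buffer for analysis (None if not ready)."""
        full, self.ready = self.ready, None
        return full
```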

[Figure 4. Hybrid Real-Time Performance Analyser. Block diagram:
probes; high speed interface; high speed comparators for
triggering and trace sequencing; high speed data reduction and
comparison with doubly buffered output; time of day and precision
clock; 32 bit micro system for data reduction, data analysis,
data display, spectral analysis, system control and user
programming; interactive, graphical user console.]

A current limitation of logic state analysers is their
inability to modify the clock signal by logically combining
signals to form a composite clock. Also the ability to combine
the clock signal with an analysis input to produce a composite
clock is desirable, for example combining the memory access clock
with the instruction fetch signal to produce an instruction fetch
clock.
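
The instruction fetch clock in this example is simply the logical AND of two probe signals. As a sketch, with a hypothetical `BusSample` record standing in for one sampled state of the probes:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusSample:
    address: int
    memory_access: bool       # memory access clock asserted
    instruction_fetch: bool   # CPU status: the cycle is an instruction fetch


def instruction_fetch_clock(sample):
    """Composite clock: memory access clock AND instruction fetch signal."""
    return sample.memory_access and sample.instruction_fetch


def clocked(samples, clock):
    """Keep only the samples on which the given (composite) clock is active."""
    return [s for s in samples if clock(s)]
```

Tracing on this composite clock records instruction addresses only, so operand and data fetches do not consume trace memory.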

6. HYBRID ANALYSER DESIGN - DATA ANALYSIS

The real power of the tool is determined by the analysis it
can perform on the collected data. As each module is executed
its exact execution time and frequency of execution can be
measured, simply by tracing the contents of the probe registers.
Normal entry and exit from modules is confirmed by monitoring
entry and exit addresses obtained using the address-bus probe.
Thus, significant information can be obtained just by reference
to the module map and simple macro-level tracing. The analyst
does not have to consult the "mythical" system link map, which
will change the next time the system is linked, to find module
starting addresses. In fact the analyser can find them for him
directly, if they are needed.
The execution time for any module will vary depending upon
the path taken through it and thus some form of data reduction is
needed.
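
One simple form of this data reduction can be sketched as follows, assuming the trace is a list of time-stamped module-number writes in ascending time order. Each interval between two register updates is attributed to the module whose number was in the register, so a module that resumes after a subroutine call is counted once for each interval; the function name `reduce_trace` is hypothetical.

```python
from collections import defaultdict


def reduce_trace(trace, end_time):
    """Reduce (timestamp, module number) writes to per-module statistics.

    end_time closes the interval opened by the last register write.
    """
    intervals = defaultdict(list)
    following = trace[1:] + [(end_time, None)]
    for (start, module), (stop, _) in zip(trace, following):
        intervals[module].append(stop - start)
    return {
        module: {
            "count": len(times),
            "total": sum(times),
            "mean": sum(times) / len(times),
        }
        for module, times in intervals.items()
    }
```

For the trace `[(0, 1), (10, 2), (25, 1), (30, 0)]` ending at time 40, module 1 is credited with two intervals totalling 15 time units and module 2 with one interval of 15.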

6.1. Models

Kumar and Davidson [25] have argued that a hierarchy of
performance models, ranging from analytical models to detailed
simulation models, is a very useful tool in the design of
computer systems. The development and validation of these
models, both structural and functional, require the measurement
of actual system values.

(i) Service times and visit ratios for analytical queueing
    models [?6] can be calculated from data produced by taking
    the mean of the execution times of the appropriate modules
    and their frequency of usage during the period under study.
    Multiprogramming level can be obtained either as a probe
    output from the scheduler or by averaging the number of user
    programmes executed during the measurement period. Job class
    is an obvious probe output for user programmes. These
    calculations can be provided both as a standard instrument
    function or implemented by user written software running in
    the analyser.

(ii) A simulation model represents the behaviour of a system
    in the time domain. As there is a conceptual similarity
    between simulation and measurements the results obtained by
    measuring module execution time and sequence can be used for
    both development and validation of simulation models.

[Figure 5. Program Spectrum: execution time (µ sec) versus number
of executions, e.g. time of day clock measured over a one hour
period. Average frequency of execution = total number of
executions / length of measurement period.]

6.2. Module Spectral Analysis

A frequency spectrum is produced by plotting the amplitude
of an electrical signal versus the frequency of the same signal
on a graph. A module spectrum (figure 5) can be produced by
plotting the time taken to execute the module versus the number
of module executions. As there is a discrete number of paths
through a module the spectrum consists of a vertical line for
each execution path. Continuous monitoring of a particular
module allows a history of execution to be accumulated from which
a module spectrum can be plotted. This could be done at the end
of a measuring period or displayed on a video monitor, in
real-time, so that the progressive development of the spectrum
can be observed.
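
Accumulating a module spectrum amounts to counting how often each distinct execution time occurs. A minimal sketch, assuming execution times have already been measured in whole clock units so that each execution path produces one repeatable value; the function names are hypothetical, and the average frequency formula is the one given with figure 5.

```python
from collections import Counter


def module_spectrum(execution_times):
    """Map each observed execution time to the number of executions."""
    return dict(sorted(Counter(execution_times).items()))


def average_frequency(total_executions, measurement_period):
    """Total number of executions divided by length of measurement period."""
    if measurement_period <= 0:
        raise ValueError("measurement period must be positive")
    return total_executions / measurement_period
```

For the measured times 12, 30, 12, 12, 45, 30 the spectrum has three lines, at 12, 30 and 45 time units, with heights 3, 2 and 1. Updating the counts as each execution completes gives the progressively developing display described above.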
Once the spectrum is obtained the analyser could be armed to
trigger on the occurrence of a particular execution time and to
trace the path taken through the module. Comparison of this to
the code will identify the exact execution path. Analysis of the
other data stored in the probe registers should provide
sufficient stimulus information to characterise the system at the
time. Data to be stored in the stimulus probe-registers is an
important consideration in module design.
Module spectral analysis indicates areas of poor performance
that require further investigation. Tracing a desired path
through the module allows detailed analysis of the problem. Thus
the tool is useful for finding and analysing bottlenecks. Module
spectral analysis is one example of a general purpose graphics
utility which will accept any two input variables and plot them
either in real-time or at the end of the measurement period.
A higher level example is the concept of task spectral
analysis where both the modules and the time between modules of a
sub-system are monitored. The analysis of disk file handling
falls into this category. A task is a collection of modules that
interact to perform a desired operation. Examples include disk
file transfers, interactive editing, process switching and batch
processing.
The highest level is spectral analysis of the whole system,
facilitating such measurements as the detection of the most
commonly used modules and resource usage by job class or
multiprogramming level.

6.3. Module Execution Path

Connection of the address bus to the analyser allows the
execution paths through any module to be traced and measured.
One obvious advantage this tool has over a logic state analyser
is that the majority of measurements can be made without having
to dig through the code. Module entry and exit addresses can be
determined simply by reading the address bus when the
module-number probe-register is updated.

Setting the analyser to record all instructions executed
during a module allows the analyser to accumulate an empirical
flow diagram of the module. Comparison of execution paths
quickly pinpoints branches and loops. Thus a complete flow
diagram of the module, which shows not only execution paths and
address relative to the start of the module but execution times
and loop frequencies as well, can be constructed and displayed.
Symbol table information, obtained by the operator over the
serial link to the host, can then be used to place labels on the
flow diagram. For high level languages it may be possible to
place line numbers on the diagram. For high level languages that
produce assembler mnemonics as an intermediate compilation stage
both labels and line numbers could be placed on the diagram.
At this stage the code can be consulted to fill in final
details and to analyse time consuming sections etc. It is this
feature which makes the analyser much easier to use than other
tracing tools. The operator can obtain the majority of the
information needed without having to refer to link maps and
assembly code.
Discontinuities occur in module execution as a result of
servicing external interrupts, waiting for input/output
transfers, supervisor calls and time slicing. These can be
detected and their effect eliminated by monitoring the module
number while tracing the instruction path.
At a higher level, monitoring the modules in a system or
sub-system enables typical task execution paths to be established
and task flow diagrams to be displayed. From this task
bottlenecks can be pinpointed and analysed. The presence of
stimulus information at the probes allows the effect of job class
on task operation to be analysed.
6.4. Performance Measurement Hierarchy

A top down approach to performance measurement is both
desirable and achievable. Execution time, execution path,
execution frequency and stimulus information can all be measured at
system level, task level and module level. Understanding
individual modules in an operating system is relatively easy, but
understanding the complex web of interactions between modules can
be a mind-boggling exercise. Establishment of typical task
execution paths should ease this problem.
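
The hierarchy can be illustrated by rolling the same measurements
up from module level to task level and system level. The sketch
below is illustrative only: the record format (task name, module
name, execution time), the names and the figures are assumptions,
and only execution time and execution frequency are shown.

```python
def roll_up(records):
    """Aggregate module measurements at module, task and system level.

    records is a list of (task, module, elapsed) tuples, one per module
    execution (a hypothetical format). "executions" counts module
    executions at every level.
    """
    system = {"time": 0, "executions": 0}
    tasks = {}
    modules = {}
    for task, module, elapsed in records:
        buckets = (
            system,
            tasks.setdefault(task, {"time": 0, "executions": 0}),
            modules.setdefault((task, module), {"time": 0, "executions": 0}),
        )
        for bucket in buckets:
            bucket["time"] += elapsed
            bucket["executions"] += 1
    return system, tasks, modules


# Invented measurements for two tasks.
records = [
    ("print", "spool", 5),
    ("print", "driver", 12),
    ("print", "spool", 7),
    ("edit", "terminal", 3),
]
system, tasks, modules = roll_up(records)
print(system)                       # {'time': 27, 'executions': 4}
print(tasks["print"])               # {'time': 24, 'executions': 3}
print(modules[("print", "spool")])  # {'time': 12, 'executions': 2}
```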


6.5. Come From

It is often desirable to determine which processes call a
particular module and how often, i.e. how did the computer get
here. This tool can be set up to trigger on entry to a module
and record either the number of the previous module or a
specified number of instructions prior to the module entry point.
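
Both forms of "come from" recording can be sketched in a few
lines. The input formats, function names and data below are
assumptions made for illustration: the first function takes the
successive values of the module number, and the second takes
(module number, address) pairs and keeps a short history of
addresses in a ring buffer.

```python
from collections import Counter, deque


def come_from(module_sequence, target):
    """Count which module was executing immediately before each entry to target."""
    callers = Counter()
    previous = None
    for module in module_sequence:
        if module == target and previous is not None and previous != target:
            callers[previous] += 1
        previous = module
    return callers


def instructions_before_entry(samples, target, depth):
    """Capture the last `depth` addresses seen before each entry to target."""
    history = deque(maxlen=depth)  # ring buffer of recent addresses
    captures = []
    previous_module = None
    for module, address in samples:
        if module == target and previous_module != target:
            captures.append(list(history))
        history.append(address)
        previous_module = module
    return captures


# Invented data: module 4 is entered twice from module 1 and once from module 2.
callers = come_from([1, 4, 2, 4, 4, 3, 1, 4], target=4)
print(callers[1], callers[2])  # 2 1

samples = [(1, 0x10), (1, 0x12), (1, 0x14), (4, 0x80),
           (4, 0x82), (2, 0x40), (4, 0x80)]
captures = instructions_before_entry(samples, target=4, depth=2)
print(captures)  # [[18, 20], [130, 64]]
```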
6.6. High Speed Data Transfer

The direct memory access connection allows the analyser to
read specific memory locations, e.g. contents of loop counters,
process table numbers, etc. Also, the analyser would be able to
modify memory locations in the host computer, e.g. tuning
constants in an adaptive control loop.

Information such as operating system configuration, module
map and symbol tables could be transferred over this link or over
the low speed serial link. An obvious extension to this is to
make the analyser an intelligent peripheral so that measurements
can be requested from the host computer. In large computer
systems this would be economical, but in many installations a
portable tool would be desirable.
6.7. Multics Style Instrumentation

Multics was developed as a research project whose intent was
to create an operating system centred around the ability to share
information in a controlled way which could support a wide
variety of computational jobs [6]. Instrumentation, integrated
into the system at design time, proved very useful in detecting
performance problems and identifying their causes. The
measurements made by the Multics hardware tools can all be done
by the analyser as defined.

Some of the software tools included in Multics performed the
tasks described below:
(i)   A general measuring package recorded the time spent
      executing selectable supervisor modules and their
      frequency of execution.

(ii)  A segment utilization meter sampled, every ten
      milliseconds, the segment number of the segment
      which was executing and stored the result in a table.
      This provided a simple way of detecting how time
      spent in the system was distributed among the various
      components.

(iii) The number of missing pages and segments encountered
      during execution in a segment was recorded on a per
      segment basis.

(iv)  The number of procedure calls was counted.

(v)   A graphics display monitor [28] on a separate computer
      connected by a channel included a variety of
      standard displays which were used to observe the
      traffic controller's queues, the use of primary
      memory and arrays produced by other tools. During
      system initialization, Multics built a table
      containing pointers to interesting data bases for use by
      this hybrid tool.

(vi)  The effect of the system's multiprogramming effort on
      an individual user was traced.

(vii) Feedback was provided to the user about the resource
      utilization of the command just typed.

All the monitoring tasks performed above can be carried out
using the monitor as described except the last, which is more
properly a user service.
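
The sampling technique of item (ii) is easily illustrated. In the
sketch below, which is an illustration rather than a description
of the Multics meter, the samples are assumed to be the segment
(or module) numbers read at each sampling instant; the fraction
of samples in which a segment appears estimates the fraction of
time spent in it. The function name and data are invented.

```python
from collections import Counter


def utilization_from_samples(samples):
    """Estimate the share of time per segment from periodic samples."""
    counts = Counter(samples)
    total = len(samples)
    return {segment: count / total for segment, count in counts.items()}


# Eight samples taken at a fixed interval (invented data).
samples = [3, 3, 7, 3, 9, 7, 3, 3]
shares = utilization_from_samples(samples)
print(shares[3], shares[7], shares[9])  # 0.625 0.25 0.125
```

The estimate improves as the number of samples grows. An analyser
that records every change of module number can instead measure
the distribution directly.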
6.8. Accounting

One of the problems with software added to a system for
measurement is interference with the system being monitored. In
this case the tool can be used to measure the interference. Also,
the interference can be compensated for by reducing the
accounting programme to provide only user specific accounting
data, as all other information can be obtained more easily and
more accurately by the analyser.
7. Flexibility

In many circumstances the analyst is interested in the
occurrence of events and the relationship between these events.
All events result in the execution of a module. For example, a
page fault causes a handler routine to be executed. The onset of
thrashing can be detected by monitoring the frequency of
execution of the swapping modules.
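
Detection of the onset of thrashing from the execution frequency
of the swapping modules might be sketched as follows. The window
length, the threshold and the event times are assumptions chosen
for the example, and the function name is hypothetical; the
sketch reports the times at which the number of swapping module
executions within a trailing window first reaches the threshold.

```python
from collections import deque


def thrashing_onsets(swap_times, window, threshold):
    """Times at which swaps within the trailing window first reach threshold.

    swap_times are the times, in ascending order, at which a swapping
    module was entered. window must be greater than zero.
    """
    recent = deque()
    onsets = []
    above = False
    for t in swap_times:
        recent.append(t)
        # Discard executions that have fallen out of the window.
        while recent[0] <= t - window:
            recent.popleft()
        if len(recent) >= threshold:
            if not above:
                onsets.append(t)
            above = True
        else:
            above = False
    return onsets


# Isolated swaps, then a burst starting at t=100 (invented data).
swap_times = [0, 50, 100, 102, 104, 106, 108, 300]
onsets = thrashing_onsets(swap_times, window=10, threshold=4)
print(onsets)  # [106]
```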

User programmability and a suite of general purpose graphics
display programmes allow the analyst to tailor the analysis to
his particular application. Considerable research is being
conducted into the design and improvement of algorithms. The
addition of a special high speed supervisor call would enable user
programmers to pass information to the probe-registers for
accurate analysis of algorithm performance.


8. THE PROBLEM OF CACHE MEMORY

Cache memory creates a significant problem for state
analysis. The analyser must be connected to the address bus
after relocation from virtual to physical address space and
before cache memory; otherwise the addressing information will not
reflect programme execution. Also, the memory-read status signals
must represent processor requests, not cache requests. On some
minicomputers, the cache memory is an integral part of the
processor and the address bus is only accessible after cache. Thus
it does not contain the address of the current instruction during
a cache hit. On a cache miss, several instructions (which may not
be executed) after the current one are read into cache, and again
the addressing signals to main memory may not reflect programme
execution.
Adding an additional status signal which indicates whether
the current memory reference is a cache hit or miss allows cache
performance for instruction fetches, operand fetches and data
fetches to be measured for individual modules. Once again, the
need to define performance measurement requirements at design
time and build them into an integrated hardware/software design
is emphasized.
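
Given such a status signal, cache performance per module and per
type of reference reduces to a simple tally. The sketch below
assumes a record of (module number, reference type, hit flag) for
each memory reference; the record format, the function name and
the data are assumptions made for the example.

```python
from collections import defaultdict


def cache_hit_ratios(references):
    """Hit ratio for each (module, reference type) pair.

    references is an iterable of (module_number, reference_type, hit)
    tuples, where hit is True for a cache hit (a hypothetical format).
    """
    tallies = defaultdict(lambda: [0, 0])  # [hits, total references]
    for module, reference_type, hit in references:
        entry = tallies[(module, reference_type)]
        if hit:
            entry[0] += 1
        entry[1] += 1
    return {key: hits / total for key, (hits, total) in tallies.items()}


# Invented reference stream for modules 5 and 8.
references = [
    (5, "instruction", True), (5, "instruction", True),
    (5, "instruction", False), (5, "operand", False),
    (8, "instruction", True), (5, "instruction", True),
]
ratios = cache_hit_ratios(references)
print(ratios[(5, "instruction")])  # 0.75
print(ratios[(5, "operand")])      # 0.0
print(ratios[(8, "instruction")])  # 1.0
```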
9. CONCLUSION

Throughout this paper a theory of performance measurement
has been developed. Performance requirements vary from system to
system and user to user within the same system. Even though the
actual variables measured depend upon the context, the type of
measurement is the same in all contexts. The basic measurements
are:
(i)   Execution time.

(ii)  Execution path.

(iii) Frequency of execution.

(iv)  Bottleneck identification.

(v)   Stimulus information related to the above.

A hierarchy of performance analysis strategies is provided.
The system under study can be broken down into a number of tasks.
A task is a collection of modules that interact to perform a
desired operation.
A module is a contiguous piece of code that
is executed in response to an event. An event is any action that
initiates a significant change in system state. A module map
defines the information contained in the probe-registers during
the execution of each module.

An easy-to-use, real-time, hybrid performance analyser which
will provide the above information has been described. It can be
added to an existing system or implemented as part of an
integrated design of a new system.

The real power of the tool is in its ability to reduce,
analyse and display the data being read. Execution time,
execution path, execution frequency, and stimulus information can all
be obtained without having to refer to the code of the modules,
considerably simplifying the measurement process. A flow diagram
of a module or a task can be built up and displayed. Access to
symbol tables allows labels and source code line numbers to
be included on the display. At this stage, when the module is
fully characterised and understood, the analyst consults the code
to fill in final details or make corrections.

Real-time data reduction and analysis provides the
information needed for model development, model verification, module
spectral analysis, task spectral analysis, bottleneck analysis
and adaptive system control.

Bibliography

1.  P.J. McKerrow, Evaluation of Interrupt Handling Routines
    with a Logic State Analyser, Submitted to Performance
    Evaluation, North Holland.

2.  P.J. McKerrow, Control of Coating Mass on a Continuous Hot
    Dip Galvanizing Line, Master of Engineering Thesis, The
    University of Wollongong, 1978.

3.  C.H. Sauer & K.M. Chandy, Computer Systems Performance
    Modelling, Prentice-Hall, 1981.

4.  D. Ferrari, Computer Systems Performance Evaluation,
    Prentice-Hall, 1978.

5.  D. Ferrari, Architecture and Instrumentation in a Modular
    Interactive System, Computer 6, 11, November 1973, pages
    25-29.

6.  J.H. Saltzer & J.W. Gintell, The Instrumentation of Multics,
    Communications of the ACM, Volume 13, Number 8, August 1970,
    pages 495-500.

7.  G. Boulaye et al., A Computer Measurement and Control
    System, Proc 3rd International Symposium, Measuring, Modelling
    and Evaluating Computer Systems, October 1977, North Holland.

8.  C.D. Warner, Hardware Monitors: The State of the Art,
    Infotech Reports, Vol 18, 1974, pages 623-631.

9.  Carlson, Hardware Monitoring for System Tuning, Infotech
    Reports System Tuning, 1977, pages 255-274.

10. W.C. Lynch, Operating System Performance, Communications of
    the ACM, Volume 15, Number 7, July 1972, pages 579-585.

11. Update on Hardware Monitors, EDP Performance Review, Volume
    2, Number 10, October 1974, 8 pages.

12. D. Ferrari & M. Spadoni, Lecture Notes of the Second Summer
    School on Computer Systems Performance Evaluation, SOGESTA,
    Urbino, Italy, June 1980, North Holland.

13. W.T. Wilner, Design of the Burroughs B1700, AFIPS Conf.
    Proc. 41 (FJCC), 1972, pp 489-497.

14. L. Svobodova, Online System Performance Measurements with
    Software and Hybrid Monitors, Operating Systems Review, 7,
    4, October 1973, pages 45-53.

15. A.H. Agajanian, A Bibliography on System Performance
    Evaluation, Computer, Volume 8, No. 11, November 1975, pp 63-74.

16. G.J. Nutt, Tutorial: Computer System Monitors, Computer,
    Volume 8, No. 11, November 1975, pp 51-61.

17. A.G. Nemeth & P.D. Rovner, User Program Measurement in a
    Time-Shared Environment, Communications of the ACM, Vol 14,
    No. 10, pp 661-666, October 1971.

18. O.F. Stevens, System Evaluation of the Control Data 6600,
    Proceedings of the IFIP Congress, pp 34-38, 1968.

19. W.M. Denny, The Burroughs B1800 Microprogrammed Measurement
    System: A Hybrid Hardware/Software Approach, Proc of the
    10th Annual Microprogramming Workshop called MICRO 10, ACM,
    New York, 1977.

20. G. Estrin, D. Hopkins, B. Coggan and S.D. Crocker, SNUPER
    COMPUTER: A Computer in Instrumentation Automation, AFIPS
    Conf. Proc 30 (SJCC), pp 645-656, 1967.

21. J. Hughes and D. Cronshaw, On Using a Hardware Monitor as an
    Intelligent Peripheral, Performance Evaluation Review, 2, 4,
    pages 3-19, December 1973.

22. R.J. Ruud, The CPM-X, A Systems Approach to Performance
    Measurement, Proceedings of the FJCC, Vol. 41, Part II, pp
    949-957, 1972.

23. P. Deutch and C.A. Grant, A Flexible Measurement Tool for
    Software Systems, Information Processing 71, Proc IFIP
    Congress 71, North Holland, 1971.

24. R.W. Comerford, Measurement-Computer Era Arrives,
    Electronics, September 8, 1981, pages 96-100.

25. B. Kumar & E.S. Davidson, Computer System Design Using a
    Hierarchical Approach to Performance Evaluation,
    Communications of the ACM, Vol 23, No. 9, September 1980, pp 511-521.

26. C.A. Rose, A Measurement Procedure for Queueing Network
    Models of Computer Systems, Computing Surveys, Vol 10, No.
    3, September 1978.

27. P.R. Sebastian, Hybrid Events Monitoring Instrument, Proc
    1974 SIGMETRICS Symposium, ACM, New York, pp 127-139.

28. J.M. Grochow, Real-time Graphic Display of Time-Sharing
    System Operating Characteristics, Proc AFIPS 1969 FJCC, Vol 35,
    AFIPS Press, pp 379-386.

29. D.E. Morgan, W. Sanks, D.P. Goodspeed, R. Kolanko, A Computer
    Network Monitoring System, IEEE Transactions on
    Software Engineering, Vol SE-1, No. 3, September 1975, pp
    299-311.

30. A. Geck, Performance Improvement by Feedback Control of the
    Operating System, Performance of Computer Systems, North
    Holland, 1976, pages 459-471, Ed. M. Arato.

