TABLE 1. SIB Address Map
(Byte addresses in Hex)

ADDRESS                     Description
__________________________________________________
0,000,000 - 0,000,FFC       SBus Boot PROM
0,001,000 - 0,001,1FC       MultiKron II
0,001,200 - 0,001,20C       SIB
    0,001,200                   SIB Configuration Register
    0,001,204                   SIB Memory Address Pointer
    0,001,208                   SIB TEST Operation
    0,001,20C                   SIB Software Reset
0,001,210 - 0,FFF,FFC       N/A
1,000,000 - 1,FFF,FFC       SIB Local Memory
* used for testing only

To keep the stored Samples from being scrambled by a write to the memory address pointer during Sample storage, a safety interlock has been programmed into the controlling logic, which ignores any write to the memory address pointer while in LOCAL mode. If there is a need to write to the counter during the course of an experiment, the user should wait for any Samples en route to be safely written, then disable LOCAL mode in the SIB Configuration register, write the new value into the memory address pointer, and re-enable LOCAL mode. Enabling or disabling LOCAL while Samples are being taken has an undetermined effect on the state machine handling that transfer and therefore should be avoided.
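The interlock and the recommended write sequence can be illustrated with a small software model. This is only a sketch in Python, not driver code: the class SibPointerModel and the helper set_pointer_safely are hypothetical names, and the model assumes that no Samples are en route when the pointer is changed.

```python
LOCAL_BIT = 1 << 9  # LOCAL is bit 9 of the SIB Configuration register


class SibPointerModel:
    """Toy model of the memory address pointer interlock (hypothetical, not driver code)."""

    def __init__(self, config=LOCAL_BIT, pointer=0):
        self.config = config
        self.pointer = pointer

    def write_pointer(self, value):
        # The interlock ignores any pointer write while LOCAL mode is enabled.
        if self.config & LOCAL_BIT:
            return False
        self.pointer = value & 0xFFFFFF  # the pointer is a 24-bit counter
        return True


def set_pointer_safely(sib, value):
    """Disable LOCAL, write the pointer, re-enable LOCAL.

    ASSUMPTION: the caller has already waited for any Samples en route
    to be written; the model does not simulate Samples in flight.
    """
    sib.config &= ~LOCAL_BIT
    accepted = sib.write_pointer(value)
    sib.config |= LOCAL_BIT
    return accepted


sib = SibPointerModel()
ignored = sib.write_pointer(0x100)          # LOCAL is on, so the write is ignored
accepted = set_pointer_safely(sib, 0x100)   # follows the recommended sequence
```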
Before sampling begins, the user should initialize the SIB memory address pointer. The initialization code distributed with the Experimenter's Toolkit provides for this initialization. The user can read the memory address pointer during the experiment to determine how much of the memory space has been written. The memory address pointer points to the next available memory location.
In the simple buffer mode, Sample writing is halted when the memory is full. If new Samples continue to be generated, the SIB FIFO will eventually fill and stop accepting data from the MultiKron II. The MultiKron II will then fill up its small internal FIFO, and either discard new Samples or cause an SBus time out, depending on how the MultiKron II is configured [MIN94] .
In the circular buffer mode, when the memory is full, Sample writing continues from the beginning of memory by wrapping the memory address pointer around to zero, overwriting the oldest Samples and thus retaining the most recent Samples generated.

Each transfer into the SIB FIFO, which is 16 bits wide by 1024 ranks deep, consists of eight data bits along with an "end of Sample" flag and an odd parity bit. Thus 6 of the 16 SIB FIFO bits are not used. The SIB FIFO provides buffer storage for up to 51 20-byte Trace Samples to improve peak Sample rate performance and reduce the risk of Samples being lost due to collection delays. For diagnostic purposes, there are SIB commands to manually control the flow of data from the SIB FIFO.
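The FIFO figures quoted above follow from simple arithmetic, reproduced here as a short Python check. The constant names are illustrative only; the values are the ones stated in the text.

```python
FIFO_DEPTH = 1024          # ranks in the SIB FIFO
FIFO_WIDTH_BITS = 16       # bits per rank
DATA_BITS = 8              # Sample data bits per transfer
FLAG_BITS = 1              # "end of Sample" flag
PARITY_BITS = 1            # odd parity bit
TRACE_SAMPLE_BYTES = 20    # size of one Trace Sample

# Bits of each 16-bit rank that carry nothing.
unused_bits = FIFO_WIDTH_BITS - (DATA_BITS + FLAG_BITS + PARITY_BITS)

# Each rank holds one byte of Sample data, so the FIFO buffers 1024 bytes.
fifo_data_bytes = FIFO_DEPTH * DATA_BITS // 8

# Number of complete Trace Samples that fit in the FIFO.
whole_samples = fifo_data_bytes // TRACE_SAMPLE_BYTES

print(unused_bits, fifo_data_bytes, whole_samples)  # 6 1024 51
```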
The input and output of the FIFO are independent of one another. An SIB FIFO write clock is produced by the MultiKron II (at half the rate of the MultiKron II input NodeClk). The FIFO read clock is the MultiKron II input NodeClk. The FIFO full flag generated by the FIFO input logic is fed into the MultiKron II, where it can stop further MultiKron II output when the SIB FIFO is full. The MultiKron II can continue to generate Samples until its small internal FIFO is filled, at which point it either forces the processor/SBus to wait until room is available or discards new Samples and sets an error flag. The course of action to take is determined by the configuration instructions which are written to the MultiKron II prior to Sampling. The FIFO empty flag generated by the FIFO output logic is used to determine whether there is data available to read. This data will either be sent over the external cable (16 bits at a time) to an S16D interface on another computer or written into the SIB local memory (32 bits at a time), depending on the SIB configuration option selected.
FIFO Testing
A test mode can be activated by bit 8 of the SIB Configuration register. Enabling this test mode allows processor control of the SIB FIFO output. A read to the SIB TEST address (see Table 1) will remove one entry from the FIFO and send it to the processor via the SBus. A write to the SIB TEST address will remove four entries from the FIFO and send them to the SIB local memory at the location pointed to by the SIB memory address pointer. The SIB memory address pointer will then be incremented.
External Cable Interface
The External Cable Interface provides the logic needed to extract two entries from the SIB FIFO, combine them into a single 16-bit value, and send it over an external cable to an S16D interface [EDT91], a commercial SBus I/O interface board, installed on another machine. The S16D board plugs into an SBus slot of the external data collection computer. Disabling the LOCAL option ("0"), bit 9 in the SIB Configuration register, allows experimenters to perform "on-the-fly" analysis of measurement Samples and also to store more Samples than the SIB local memory can hold. The average Sample collection rate is lower with the non-LOCAL option than with the SIB local memory.
SIB Local Memory
The 16 Mbyte SIB memory is configured as four banks of 1M x 32 bits, built from 4M x 4 bit DRAM chips and a commercial DRAM controller chip. Four entries are extracted from the FIFO and combined into a 32-bit word and then written into the local memory via a pipeline holding register. Although a FIFO entry is 16 bits, only 8 bits are Sample data. The SIB local memory requires 22 SBus address bits to access any 32-bit word, 2 bits for bank select and 20 bits for word select. Since these are word accesses aligned on word boundaries, the byte level address is always 00. The SIB local memory may be accessed via the SBus at any time, since SIB memory arbitration will interleave SBus access with Sample storage. Writing to the SIB local memory by the processor is provided mainly for testing purposes.
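As an illustration of the addressing just described, the following Python sketch splits an SBus offset within the SIB local memory into a bank number and a word number. The function name split_local_address is hypothetical, and the choice of the two most significant of the 22 word-address bits as the bank select is an assumption; the text does not say which bits select the bank.

```python
LOCAL_MEMORY_BASE = 0x1000000   # start of SIB Local Memory in Table 1
LOCAL_MEMORY_LAST = 0x1FFFFFC   # last word address in Table 1


def split_local_address(sbus_offset):
    """Return (bank, word) for a word-aligned SIB local memory address."""
    if not LOCAL_MEMORY_BASE <= sbus_offset <= LOCAL_MEMORY_LAST:
        raise ValueError("address is outside the SIB local memory")
    if sbus_offset & 0x3:
        raise ValueError("accesses must be aligned on 32-bit boundaries")
    word_address = (sbus_offset - LOCAL_MEMORY_BASE) >> 2  # 22-bit word address
    bank = word_address >> 20        # ASSUMPTION: upper 2 bits select the bank
    word = word_address & 0xFFFFF    # remaining 20 bits select the word
    return bank, word


first = split_local_address(0x1000000)
last = split_local_address(0x1FFFFFC)
```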
A 24-bit counter serves as the memory address pointer for Sample storage; the SBus address is used for processor access. In principle, this counter may be read or written by the processor at any time. However, a write to the counter while it is storing MultiKron II Samples could scramble the stored Samples.

The SIB and the MultiKron II are memory mapped into the 32-bit memory address space. The SIB address and control decoders recognize SBus accesses and convert them into the specified SIB operations. Configuration, access, and testing of the MultiKron II are supported through SBus access and the SIB Configuration register. Two MultiKron II output Sample collection methods are supported by the SIB. One method stores Samples in the SIB local memory, which requires a memory arbiter/controller and a memory address pointer. The other method is via an external cable connected to an S16D interface card [EDT91] on a separate SUN workstation. The architecture of these facilities is discussed more fully below.
The SIB is based on a synchronous design using the SBus supplied clock. A local oscillator, or an optionally supplied external clock, is used to derive various clocks for the MultiKron II and the SIB memory. All decoding and controls are implemented in programmable devices, which are installed via sockets on the SIB for easy replacement/reprogramming.
SBus Interface
The SIB is a standard SBus card and fits into an SBus slot. All SIB accesses are 32-bit data transfers aligned on 32-bit boundaries (the two least significant address bits are zero), with the exception of SBus boot operations, which are 8-bit data transfers aligned on 8-bit boundaries. Block transfer and DMA requests are not implemented. The SBus implements a separate address space, starting at zero, for each device that can be plugged into a slot. The address map for the SIB is listed in Table 1.
MultiKron II CPU ID Signals
The eight unencoded processor ID lines (also called CPUID) are used to allow hardware identification of individual processors in multiprocessor environments. In its intended use, this feature has two functions: (1) it is encoded in a three-bit field in the Sample header to identify the processor taking the Sample, and (2) it selects the contents of one of the eight 32-bit Source Address registers to be placed in the Sample. The MultiKron II Source Address registers are written before Sampling starts and should be updated on context switches. They are intended to contain the node number (if applicable) and the process identification of the process writing the Sample. The processor ID input consists of eight input lines, one per processor, so only one line may be asserted at any time. The SIB provides two options to drive these lines: (1) an SIB register (bits 0-7 of the Configuration Register), or (2) external signal lines, via the SIB connector, for hardware identification of the active processor. EXT_CPU, bit 10 of the Configuration Register (see Table 2), controls this selection. The connector option will require custom wiring to the connector and consultation with NIST to obtain the connector specification. For single processor machines, the SIB register option should be selected and initialized once. The effect is to select a single MultiKron II Source Address Register for all Samples.
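The relation between the eight unencoded lines and the three-bit processor field can be sketched in Python as follows. The function name cpu_index is hypothetical. The mapping of line n to index n is an assumption that agrees with the one case given in the text (CPU_ID = 01 Hex selecting Source Address Register 0); the other seven cases are extrapolated.

```python
def cpu_index(cpu_id_lines):
    """Encode a one-hot 8-bit CPU ID value as a 3-bit processor index.

    ASSUMPTION: line n (bit n of the value) corresponds to processor n
    and to Source Address Register n.
    """
    one_hot = (
        0 < cpu_id_lines <= 0xFF
        and cpu_id_lines & (cpu_id_lines - 1) == 0  # exactly one bit set
    )
    if not one_hot:
        raise ValueError("exactly one of the eight CPU ID lines must be asserted")
    return cpu_id_lines.bit_length() - 1


single_processor = cpu_index(0x01)  # the single-processor setting from the text
```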
MultiKron II External Counter Signals
The MultiKron II external inputs are one of the selectable counting sources for the MultiKron II Resource Counters. An SIB connector option to provide these external signals will require custom wiring to the connector and consultation with NIST to obtain the connector specification.
SIB FIFO
The MultiKron II output network sends measurement Sample data to the SIB FIFO, which is 16 bits wide by 1024 ranks deep.
Output Data Collection
The SIB provides two ways in which to collect MultiKron II output, selectable via the "LOCAL" option, bit 9 of the SIB Configuration Register. One can collect Samples directly in the SIB local memory, which is accessible immediately, without any additional devices or wires. The Samples can remain stored there until they are read out for processing or storage. The second method is to collect Samples on an external machine. In this case an external cable connects the SIB to a hardware interface, an S16D commercially available SBus interface [EDT91] , on another machine. Information will be supplied by NIST to allow the experimenter to obtain the correct cable and connector. Using the S16D interface option on the SIB allows experimenters to perform "on-the-fly" analysis of the measurement data and store more Samples than will fit in the SIB local memory.
The SIB local memory contains 16 megabytes, and can store up to 838,860 Samples (20 bytes per Trace Sample). This memory can be read and written directly by the CPU as 32-bit words, even simultaneously while Samples are being taken. The SIB has a dedicated address pointer which it uses to linearly place Samples into the SIB local memory. The CPU can read this address pointer to find out how many Samples are in the local memory and where the last Sample is located. The CPU can also write to this address pointer, but this is primarily for testing purposes and it is not expected to be used operationally.
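The capacity figure, and the use of the address pointer to count stored Samples, can be checked with a few lines of Python. This is a sketch under stated assumptions: the pointer is taken to be a byte offset initialized to zero, only 20-byte Trace Samples are stored, and the pointer has not wrapped. The function name samples_stored is hypothetical.

```python
MEMORY_BYTES = 16 * 1024 * 1024   # 16 megabytes of SIB local memory
TRACE_SAMPLE_BYTES = 20           # bytes per Trace Sample

# Largest number of complete Trace Samples that fit in the local memory.
max_samples = MEMORY_BYTES // TRACE_SAMPLE_BYTES


def samples_stored(pointer):
    """Complete Trace Samples written so far.

    ASSUMPTIONS: pointer is a byte offset that started at 0, every
    Sample is a 20-byte Trace Sample, and no wrap-around has occurred.
    """
    return pointer // TRACE_SAMPLE_BYTES


print(max_samples)            # 838860
print(samples_stored(0x190))  # 0x190 = 400 bytes, i.e. 20 Samples
```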
The SIB local memory can be configured as a simple buffer, or as a circular buffer, via the "NO_WRAP" option, bit 14 of the SIB Configuration Register. As a simple buffer, loading starts at address 0 and ends at FFFFFF. Once the memory address pointer reaches its maximum value, it stops incrementing, and new writes are disabled. Any Samples arriving after the local memory is full will back up through the MultiKron II's FIFO. As a circular buffer, loading starts at address 0, but upon reaching FFFFFF the memory address pointer "wraps around" to 0 and starts overwriting older data. Thus, when configured as a simple buffer the SIB memory will retain the oldest data, and when configured as a circular buffer it will retain the newest data (the cockpit voice recorder mode).
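The difference between the two modes can be seen in a toy Python model that uses a four-word memory instead of the real 16 megabytes. The function fill is hypothetical and simplifies the hardware: words that arrive after a simple buffer fills are simply skipped here, whereas on the SIB they back up through the FIFOs.

```python
def fill(words, capacity, no_wrap):
    """Store words into a small model memory and return its final contents.

    no_wrap=True models the simple buffer, no_wrap=False the circular buffer.
    """
    memory = [None] * capacity
    pointer = 0
    stopped = False
    for word in words:
        if stopped:
            continue  # simple buffer is full: no further writes are made
        memory[pointer] = word
        pointer += 1
        if pointer == capacity:
            if no_wrap:
                stopped = True   # pointer stops, writes are disabled
            else:
                pointer = 0      # wrap around and overwrite the oldest data
    return memory


simple = fill(range(6), capacity=4, no_wrap=True)     # keeps the oldest words
circular = fill(range(6), capacity=4, no_wrap=False)  # keeps the newest words
```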
Default Configuration
Initially, it is anticipated that the experimenter will not connect any external wires to the SIB. Therefore, the SIB Configuration Register should be set to 0C3201 (Hex). This configuration sets all the test controls to inactive. It stores MultiKron II Samples in the local memory, treating it as a circular buffer. All Samples are identified with CPU ID 0 (bits 0-7 of the SIB Configuration Register set = 01 (Hex)) and Source Address Register 0. It sets the MultiKron II wait states to 0, and enables MultiKron II outputs. The local oscillator is selected and its frequency is divided by two for the MultiKron II NODECLK. The NODECLK frequency is further divided by four to obtain the MultiKron II Timestamp frequency. This is the default setting used by the initialization routine supplied with the Experimenter's Toolkit.
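The default value can be checked by decoding it field by field. The Python sketch below uses the bit positions and names given for the SIB Configuration Register (Table 2); the helper names decode and timestamp_hz are hypothetical, and the 40 MHz oscillator frequency is an invented figure used only to make the clock arithmetic concrete.

```python
# Field name -> (lowest bit, width), following the Table 2 bit assignments.
FIELDS = {
    "CPU_ID": (0, 8), "SIB_TEST": (8, 1), "LOCAL": (9, 1), "EXT_CPU": (10, 1),
    "WAIT": (11, 1), "MK_TESTB": (12, 1), "OUTEN": (13, 1), "NO_WRAP": (14, 1),
    "EN_EXT": (15, 1), "TS_2": (16, 1), "TS_3": (17, 1), "TS_4": (18, 1),
    "DIV_2": (19, 1),
}


def decode(value):
    """Split a Configuration Register value into its named fields."""
    return {name: (value >> low) & ((1 << width) - 1)
            for name, (low, width) in FIELDS.items()}


def timestamp_hz(oscillator_hz, fields):
    """Derive the Timestamp frequency from the clock-related fields."""
    nodeclk = oscillator_hz // 2 if fields["DIV_2"] else oscillator_hz
    selected = [d for d, name in ((2, "TS_2"), (3, "TS_3"), (4, "TS_4"))
                if fields[name]]
    if len(selected) != 1:
        raise ValueError("exactly one of TS_2, TS_3, TS_4 must be set")
    return nodeclk // selected[0]


default = decode(0x0C3201)
# ASSUMPTION: 40 MHz is a made-up oscillator frequency for illustration.
example_timestamp = timestamp_hz(40_000_000, default)
```

Decoding 0C3201 this way gives CPU_ID = 01, LOCAL = 1, MK_TESTB = 1, OUTEN = 1, TS_4 = 1, and DIV_2 = 1, with every other field zero, which matches the description above.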
HARDWARE ARCHITECTURE
The SIB is designed to provide control, access, and testing of the MultiKron II, and also to provide collection of the MultiKron II output measurement Samples. The principal MultiKron II signals are the address and data lines (used for processor interaction), the output network lines (for output of measurement Samples), the processor ID input lines (used to identify the CPU triggering a Sample), the resource counter external inputs (a MultiKron II selectable option used to count external signal occurrences), and control lines (used for testing and initialization). A block diagram of the SIB is shown in Figure 1 and the printed circuit board layout is shown in Figure 2. Processor access to the SIB is provided via the SBus interface.

Restricting time-critical Sampling operations to 32-bit data fields reduces the range of values possible in Sampling data by wasting the upper bits of the data path. Less time-critical operations, for example loading the Resource Counter control registers, should occur very infrequently and can endure the required overhead.
An SIB Configuration register (Table 2) provides the means for an experimenter to control the MultiKron II and the SIB. Its bits are defined as follows.
Bits 0-7, CPU_ID, of the SIB Configuration Register represent the individual processor ID signals to the MultiKron II. The CPU_ID signals are used in multiprocessor applications to identify which processor triggered each measurement Sample and select the corresponding MultiKron II Source Address Register to be included in that Sample. Assuming a single processor, one can permanently set CPU_ID = 01 (Hex), resulting in Source Address Register 0 being selected for each Sample.
Bit 8, SIB_TEST, of the SIB Configuration Register is used for testing and should always be in the operational state ("0") indicated in Table 2.

Bit 9, LOCAL, of the SIB Configuration Register selects the destination of measurement Samples. The choice is between the SIB local memory ("1") and another machine via an external cable ("0"). This is discussed in more detail below.
Bit 10, EXT_CPU, of the SIB Configuration Register selects whether the MultiKron II CPU_ID inputs are connected to the external hardwired SIB input signals ("1") that the experimenter may configure, or to bits 0-7 of the SIB Configuration Register ("0").
Bit 11, WAIT, of the SIB Configuration Register enables a wait state for the MultiKron II on all processor interactions. This value is read by the MultiKron II only upon a RESET, either a hardware reset or a software reset. For fastest operation no wait state ("0") is recommended, while a wait state ("1") will provide slower operations.
Bit 12, MK_TESTB, of the SIB Configuration Register is used for testing and should always be in the operational state ("1") indicated in Table 2.

Bit 13, OUTEN, of the SIB Configuration Register is the output enable signal to the MultiKron II. This bit should be placed in the operational state ("1") when using the MultiKron II. If this bit is off ("0"), then all MultiKron II output signals are disabled.
Bit 14, NO_WRAP, of the SIB Configuration Register controls whether the SIB local memory operates as a simple buffer ("1") or a circular buffer ("0"). This is discussed in more detail below.
Bit 15, EN_EXT, of the SIB Configuration Register is used to select the source of the MultiKron II NODECLK and RESET signals. The choice is between a local oscillator on the SIB ("0") or an externally supplied clock signal ("1"). The SIB uses the SBus clock for all its processor interactions (25 MHz maximum). An SIB reset command always resets the MultiKron II. If the external clock option is selected, then in addition to the SIB reset command, an externally supplied signal will also cause a MultiKron II reset.
Bits 16-18, TS_2, TS_3, and TS_4, of the SIB Configuration Register are used to select the MultiKron II Timestamp clock rate as a function of the MultiKron II NODECLK rate. The choices are 1/2 (bit 16, the default), 1/3 (bit 17), or 1/4 (bit 18).
Bit 19, DIV_2, of the SIB Configuration Register is used to select the MultiKron II NODECLK frequency as unmodified ("0") or divided by two ("1").

The Toolkit spares experimenters the engineering effort required to design and build a hardware interface between the MultiKron and their computer. Up to 838,860 measurements can be collected during an experiment to the SIB on-board memory; a practically unlimited number of measurements can be collected if an external data-collection computer is used.
During execution of a program under test, performance measurement data (Samples) are acquired as directed by measurement probes. There are two types of measurement probes, hardware and software. A hardware measurement probe is a wire physically connected to an electrical signal in the system being measured. One of the options for the MultiKron Resource Counters is to count the occurrences of these external signals via a dedicated pin for each Resource Counter. A software measurement probe is the code that generates a Sample. This code is an assignment statement to a memory mapped MultiKron address. An experimenter wishing to obtain performance measurements via the SIB must insert measurement probes at points in their program and recompile the program. The Samples acquired can be processed "on-the-fly" using a separate data-logging computer. Otherwise they are processed, after program execution, by reading the Samples out of the SIB local memory. An SIB/MultiKron initialization routine and an analysis program are supplied as part of the MultiKron Experimenter's Toolkit, but the experimenter can replace or modify them as desired.
FUNCTIONAL OVERVIEW
The SIB provides the necessary interfacing capabilities between a processor and the MultiKron II [MIN94] (an enhanced version of the MultiKron [MIN92] ) via a standard SBus, and provides MultiKron II output data collection facilities.
Processor Interface
The SIB is memory mapped, so all interactions are memory reads and writes (i.e., assignment statements in high level languages) to addresses listed in Table 1 . The SBus is an I/O bus and thus processor interactions are slower than if the MultiKron II were directly interfaced to the processor memory bus. The SIB provides facilities to control, access, and test the MultiKron II from the processor being measured. The SBus-to-SIB data path is 32 bits wide. Although the MultiKron II is a 64-bit device, it can accommodate 32-bit data transfers with the aid of an internal holding register. Thus the MultiKron II must be placed in its 32-bit mode when used in the SIB. The MultiKron II's internal holding register is designed to hold the 32 high order MultiKron II data bits (bits 32-63) during CPU interactions. This is a single register used for both input and output CPU interactions. For a 64-bit input (CPU write) operation, the experimenter should first load the MultiKron II internal holding register with the high order 32 bits of data to be written (if any), and then write the low-order 32 bits directly to the MultiKron II causing a full 64-bit input to the MultiKron II. Similarly for output, the experimenter should read the low-order 32 bits of data directly from the MultiKron II, which causes the high-order 32 bits of data to be loaded into the MultiKron II internal holding register. Then the experimenter can read the high-order 32 bits of data from the MultiKron II internal holding register.
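The two-step transfers described above can be sketched with a small software model in Python. This is a hypothetical illustration, not the Toolkit's code: the class MultiKronPortModel stands in for a single 64-bit MultiKron II location and its shared holding register, and the real register addresses are not modeled.

```python
MASK32 = 0xFFFFFFFF


class MultiKronPortModel:
    """Toy model of one 64-bit MultiKron II location accessed in 32-bit mode."""

    def __init__(self):
        self.holding = 0   # internal holding register (bits 32-63)
        self.register = 0  # the 64-bit location being accessed

    def write_holding(self, high32):
        self.holding = high32 & MASK32

    def write_low(self, low32):
        # Writing the low half completes the full 64-bit input.
        self.register = (self.holding << 32) | (low32 & MASK32)

    def read_low(self):
        # Reading the low half loads the high half into the holding register.
        self.holding = self.register >> 32
        return self.register & MASK32

    def read_holding(self):
        return self.holding


def write64(port, value):
    """High half into the holding register first, then the low half."""
    port.write_holding(value >> 32)
    port.write_low(value & MASK32)


def read64(port):
    """Low half first, then the high half from the holding register."""
    low = port.read_low()
    return (port.read_holding() << 32) | low


port = MultiKronPortModel()
write64(port, 0x0123456789ABCDEF)
round_trip = read64(port)
```

Because the model has a single holding register for both directions, as the text describes, any other access that lands between the two halves of write64 or read64 would disturb the transfer.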
The MultiKron II does not provide the mechanism to handle its 32-bit mode as an indivisible operation. Thus, if an interrupt occurs between writing the high-order 32 bits of data into the MultiKron II internal holding register and writing the low-order data to the MultiKron II, another MultiKron II write may occur that will overwrite the MultiKron II internal holding register. A similar problem could happen between reading the low-order 32-bit data and the high-order 32-bit data. If the computer being measured is a multiprocessor, the problem is even more pronounced. It is up to the experimenter to guarantee that register overwrite will not occur. Because of the anticipated processing overhead to handle this indivisibility correctly, it is recommended that only 32-bit data fields be transferred to the MultiKron II in time-critical Sampling operations.
Operating Principles of the SBus MultiKron Interface Board
By A. Mink, G. Nacht, and J. Antonishek
Advanced Systems Division
National Institute of Standards and Technology
mink@cmr.ncsl.nist.gov
ABSTRACT
The MultiKron* Experimenter's Toolkit contains an SBus MultiKron interface board (SIB) or alternatively a VME MultiKron interface board (MIB), installation software, data logging software, and analysis software; all of the software supplied is written in C. The Toolkit allows users to take advantage of the NIST MultiKron performance measurement chip in systems that do not already have a MultiKron designed into them. The SIB is applicable to both multiprocessor systems and single-processor systems. The Experimenter's Toolkit allows researchers to obtain hands-on experience with the MultiKron performance measurement chip, without the engineering effort required to design and build a hardware interface between the MultiKron and their computer. Over 800,000 Trace Samples can be collected during an experiment to the SIB on-board memory; a practically unlimited number of Samples can be collected if an optional external data-collection computer is used.
Key words: Computers; hardware support; MIMD; multiprocessor computers; performance characterization; printed circuit board; PCB; VLSI
INTRODUCTION
The MultiKron* Experimenter's Toolkit contains the SBus MultiKron interface board (SIB) or alternatively a VME MultiKron interface board (MIB) [MIN93], installation software, data logging software, and analysis software. The SIB is to be inserted into the SBus of the computer being measured. All of the software supplied with the MultiKron Experimenter's Toolkit is written in C and distributed in source code. The Toolkit allows users to take advantage of the MultiKron [MIN92, MIN94] performance measurement chip in systems that do not already have a MultiKron designed into the system. The SIB is applicable to both multiprocessor systems and single-processor systems. The Experimenter's Toolkit allows researchers to obtain hands-on experience with the MultiKron performance measurement chip, without the engineering effort required to design and build a hardware interface between the MultiKron and their computer.
_________________________________
* MultiKron is a trademark of NIST.
This National Institute of Standards and Technology contribution is not subject to copyright in the United States. Certain commercial equipment, instruments, or materials may be identified in this paper to adequately specify experimental procedures. Such identification does not imply recommendation or endorsement by the National Institute of Standards and Technology, nor does it imply that materials or equipment identified are necessarily the best available for the purpose.
This work was partially sponsored by the Advanced Research Projects Agency.
