A computer system to control and acquire data from a set of ten neutron and x-ray scattering and diffraction experiments located at the High Flux Beam Reactor at Brookhaven National Laboratory has operated in a routine manner for over two years. The system has been constructed according to a functionally distributed architecture and thus consists of a set of functional nodes. Ten of these nodes, the private or application nodes, perform the function "execute programs to control and acquire data from experiment number x". An additional functional node, the common or shared service node, performs the function "provide a set of shared services to the application nodes".
Introduction
A computer system to control and monitor and acquire data from nine neutron scattering and diffraction experiments and one x-ray diffraction experiment located at the High Flux Beam Reactor at Brookhaven National Laboratory has operated in a routine manner for over two years. This computer system, the Reactor Experiment Control Facility, has been constructed according to a distributed function architecture. A detailed discussion of the principles upon which this architecture is based has been given elsewhere,1 as has a complete overview of the Facility.2 The more important of these principles are briefly reviewed in order to establish the functional definition of an important part of this system, the common or shared service node.
Once this node has been functionally defined, principles of systems analysis and a knowledge of the currently available system implementation elements can be used to implement the node in the best way possible. In addition, intuitive extrapolations of current trends in the production of system implementation elements should be made in order to anticipate ways in which future extensions in existent elements and additions of new ones could be utilized to extend the node. (It is important to note, however, that the functional definition of a node does not change with time.) The application of systems analysis principles in this manner is illustrated with a discussion of the implementation and subsequent extension of a major part of the common node, the common node operating system.
System Functional Definition
The system function of the Reactor Experiment Control Facility is to "control and monitor and acquire data from a set of laboratory experiments". Once the system function has been identified and stated, the system can be designed according to the architecture of functional distribution.
Functional Distribution
The primary objective in structuring a system according to a functionally distributed architecture is to iteratively partition the system function into a set of subfunctions at a lower complexity level in such a manner that the resulting subfunctions have the following properties:
(1) the boundaries (i.e., the input to and output from the subfunctions) of all subfunctions in the set have approximately the same complexity level and an approximately uniform structure;
(2) the subfunctions in the set are further separable into subsets which do not overlap each other with respect to the hardware structures and software structures which are required to implement the subfunctions.
When such a set of subfunctions has been identified, each subfunction can be confined to a node of the system, where a node contains all of the hardware and software required to implement the subfunction. This confinement process can also be viewed as a distribution process for the system function and is thus the fundamental design procedure which gives the architecture its name.
In general, the partitioning process can be iterated within a system node to isolate nodes at still lower levels of functional complexity. However, with currently available hardware and software, it is usually not economically feasible to implement more than the first level of nodes. This is the case in the present system, and the system nodes at this level will be referred to simply as nodes. The functional level of the subfunctions performed by the nodes is called the node level.
The advantages which accrue to a computer system as a result of distributing in this manner the function which it performs have been discussed in detail elsewhere.1 Two of the more important of these advantages should be mentioned here:
(1) the functions which are very expensive to implement can be confined to one node of the system and performed for the other system nodes in a shared manner;
(2) a set of functions which tend to remain constant in time can be confined to one node which can be left undisturbed (with respect to its implementation hardware and software) throughout the lifetime of the system.
In general, there are many ways in which the function which a system is to perform can be partitioned into subfunctions. However, the partitioning process must be carried out with additional constraints in mind. Many of these constraints are indirectly imposed by the necessity of confining the resulting subfunctions to system nodes and implementing the nodes with currently available implementation elements. Additional constraints may more correctly be considered to be due to adherence to general principles of systems analysis.
In particular the definition of the subfunctions must be realized in such a way that they are readily visible to the users of the system.
Function Visibility
A great advantage of the functional approach to systems design is that the description of the system which results corresponds closely both in its terminology and in its partitioned subparts to the description which the system users would employ in relating how they interact with and utilize the system. That is, it is believed that users think in terms of functions. They divide their work (a function at the highest complexity level) up into units (lower level functions) which they comprehend well and set about executing these units one-after-another in sequential fashion. A computer system which is partitioned in a manner which corresponds well to the way a system user analyzes his use of the system will be easy for the user to understand.
It is likely, then, if certain other important criteria are met, that the user will consider the system to be a success. The realization of such an opinion from a majority of the users of the system is the ultimate goal of the systems analyst.
A careful analysis of a computer system along functional lines usually leads to definition of a set of functions just below the system level which can be considered to be standard in that they are in one-to-one correspondence with the individual operations which a majority of the users would expect to perform in the process of utilizing the system. As the functional partitioning process proceeds to the middle levels of functional complexity, it becomes more difficult to define functions which are easily recognized by users. This is because most users have little previous experience with operations at these levels, have no preconceived notions of the form that the operations should assume, and in addition hesitate to examine the operations because they consider these levels the private domain of the systems analyst. In fact, it might be expected that this same situation would prevail at even lower levels of functional complexity, and in some systems, especially those devoted to numerical analysis and computation, such is the case. A striking exception arises in the case of computer systems for experiment control and data acquisition. Most users have very definite ideas about the form which operations at the lowest functional complexity level should assume in such systems.
Thus in the case of computer systems for experiment control and data acquisition, the constraints on function definition imposed by the requirement that the system be highly visible to its users are most numerous at the highest and lowest levels of functional complexity.
First Partitioning of the System Function
The system function of the Reactor Experiment Control Facility, given above, is partitioned at the second level of functional complexity into the functions "develop programs for experiment control and data acquisition" and "perform operations to control and acquire data from a set of laboratory experiments". The first of these functions can be confined to a system node, the program development node, as it stands. The second function must be further partitioned into the n functions "perform operations to control and acquire data from experiment number x", where n is the total number of laboratory experiments. Each of these functions is further partitioned into the function "execute program to control and acquire data from experiment number x" and the function "provide a set of services required to control and acquire data from experiment x". The n functions in the first set are confined to n private or application nodes. The n functions in the second set are collected together to form the function "provide a set of shared services required for experiment control and data acquisition" and this function is confined to a common or shared service node. The detailed justifications for this particular partitioning of the system function have been given elsewhere.1,2 Also, the implementation of the program development and application nodes has been discussed at length.1,2 Here, the original implementation of the common node3-5 is reviewed in order to provide a basis for discussing the extensions to this node. Thus an additional constraint on implementation of the common node is that the processor (or processors) present at this node must execute the same instruction set as the application node processors.
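The partitioning described above can be summarized as a small data structure. The following is a minimal sketch only: the function name `partition_system_function` and the node labels used as dictionary keys are illustrative assumptions, while the quoted function strings are taken from the text.

```python
def partition_system_function(n_experiments):
    """Return a mapping of node label -> function confined to that node.

    The node labels are hypothetical names chosen for this sketch.
    """
    nodes = {
        "program development node":
            "develop programs for experiment control and data acquisition",
        "common node":
            "provide a set of shared services required for experiment "
            "control and data acquisition",
    }
    # One private (application) node per laboratory experiment.
    for x in range(1, n_experiments + 1):
        nodes[f"application node {x}"] = (
            "execute program to control and acquire data from "
            f"experiment number {x}"
        )
    return nodes


# Ten experiments, as in the Facility described here.
nodes = partition_system_function(10)
```

With ten experiments the sketch yields twelve first-level nodes: ten application nodes, the program development node, and the common node.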
Since, by functional definition, the common node provides a set of services to the application nodes, it is readily apparent that failure of this node to operate can have an adverse effect on the operation of not just one but all of the application nodes. In the worst case, all laboratory experiments may cease to operate; in practice certain of the experiments can continue without the common node services in a very limited stand-alone mode. Thus by far the most important factor to be considered in the implementation of the common node is the reliability of this node. Accordingly, the node has been implemented with completely main memory resident software. All software elements, both the tasks which supply the services and the operating system which supports task execution, reside permanently in the main memory of the node processor. The continuing decrease in the cost of large capacity core memory arrays is the implementation element trend which first indicated that such an implementation would be possible and led to its eventual acceptance. This trend and others which influenced the implementation of the Reactor Experiment Control Facility have been given elsewhere.1 A second major implementation element which can contribute substantially to the reliability of a node is a memory management option for the node processor. Such an element can be used to provide hardware isolation between the various modes of logical address space utilized at the processor. In particular, tasks which execute in the processor can be isolated from the operating system which supports their execution. In addition, the various portions of a logical space which can be modified by code resident in the logical space can be isolated from areas which the code should access in read-only fashion. For these reasons the common node has been implemented with a processor which includes a memory management option.
A third consideration to be taken into account in implementing the common node is that the reliability of the node should increase as the complexity of the operations performed by the node decreases. It will be shown below that the common node has been implemented as a transaction processor. The node responds to requests for service submitted to it in the form of transactions over the communication links between it and the application nodes. However, no additional inputs to this node, not even a console terminal, have been allowed. The common node responds to transaction requests and nothing else. Other implementation elements present at this node are listed below.
Common Node Implementation
The common node has been implemented as a transaction processor. Each request for service, the processing implied by the request, and the response to the request take the form of a transaction.
Transactions
A transaction consists of two mandatory and one optional transmissions over the communication link between an application node and the common node. These transmissions consist of the following:
(1) a REQUEST transaction parameter block (32 words) is transmitted from the application node to the common node. One parameter in this block, the function code, labels the service which the application node is requesting;
(2) an ACKNOWLEDGE transaction parameter block (32 words) is transmitted from the common node back to the application node. Parameters in this block specify whether or not the requested function has been performed successfully. In addition, the ACKNOWLEDGE block may contain information which represents the output of the requested function;
(3) an optional transaction DATABLOCK may be transmitted between the nodes in either direction. The DATABLOCK length can vary but has a maximum of 8192 words.
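The three transmissions listed above can be modelled with a pair of small record types. This is a hedged sketch rather than the Facility's actual layout: the block sizes (32 words, 8192 words maximum) come from the text, but the class names, the helper `make_request`, and the position of the function code within the block (`FUNCTION_CODE_INDEX`) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

PARAMETER_BLOCK_WORDS = 32    # from the text
MAX_DATABLOCK_WORDS = 8192    # from the text
FUNCTION_CODE_INDEX = 0       # assumption: the text does not give the position


@dataclass
class ParameterBlock:
    """A REQUEST or ACKNOWLEDGE transaction parameter block."""
    words: List[int]

    def __post_init__(self):
        if len(self.words) != PARAMETER_BLOCK_WORDS:
            raise ValueError("a parameter block is exactly 32 words long")


@dataclass
class Transaction:
    request: ParameterBlock                       # mandatory
    acknowledge: Optional[ParameterBlock] = None  # mandatory, filled in later
    datablock: Optional[List[int]] = None         # optional

    def attach_datablock(self, words):
        """Attach the optional DATABLOCK, enforcing the maximum length."""
        if len(words) > MAX_DATABLOCK_WORDS:
            raise ValueError("DATABLOCK exceeds 8192 words")
        self.datablock = list(words)


def make_request(function_code):
    """Build a REQUEST block whose function code labels the service wanted."""
    words = [0] * PARAMETER_BLOCK_WORDS
    words[FUNCTION_CODE_INDEX] = function_code
    return ParameterBlock(words)
```

Rejecting an oversized DATABLOCK at the point where it is attached keeps the length limit in one place instead of scattering checks through the transmission code.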
In order to perform its function as a transaction processor, the common node must contain code and hardware to carry out the block transmissions which comprise a transaction and to perform the operations specified by the transaction function. The hardware and software elements used to implement the common node are listed below.
Hardware Implementation Elements
Hardware components utilized to implement the common node include the following: (1) 
Software Implementation Elements
The form of the software present at the common node is a reflection of the functional definition of the node as a transaction processor. In particular, four types of code, divided into common node subsystems at various levels of functional complexity, are present: task subsystems, service subsystems, the unsolicited transaction handler subsystem, and the device driver subsystems. These components have been discussed elsewhere -8 in some detail and will be reviewed here only briefly.
Task Subsystems. Each transaction submitted to the common node is processed by two task sequences. Tasks in the first sequence access information in the REQUEST transaction parameter block in order to assemble an ACKNOWLEDGE transaction parameter block for transmission back to the requesting application node. Part of the information in the ACKNOWLEDGE block consists of parameters of the transaction DATABLOCK if a DATABLOCK transmission is required. A transaction is complete with respect to the application node once the transaction DATABLOCK transmission has taken place. However, at the common node, execution of a second task sequence may be required upon completion of this transmission. The second task sequence is usually required when the direction of transmission of a transaction DATABLOCK is from an application node to the common node. Upon completion of the second task sequence, the transaction is complete with respect to the common node and may be removed from this node.
A task sequence consists of a number of tasks executed one-after-another in series, i.e., no two tasks in a sequence execute in parallel. When a task in a sequence has finished its processing for a transaction, it indicates the identification of the next task in the sequence ("primes" the next task), attaches the transaction to this next task, calls for execution of the next task to begin ("starts" the next task), and voluntarily gives up the processor ("exits"). When a task exits without having passed its transaction on to another task, the current task sequence is considered to be finished. 
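The prime/start/exit protocol described above can be illustrated with a toy scheduler. This is a sketch under stated assumptions, not the common node operating system: the class `Scheduler`, the task names `decode_request` and `build_acknowledge`, and the use of a dictionary to stand for a transaction are all hypothetical.

```python
from collections import deque


class Scheduler:
    """Toy scheduler: tasks of a sequence run strictly one after another."""

    def __init__(self, tasks):
        self.tasks = tasks      # task name -> callable(transaction, scheduler)
        self.primed = None      # (task name, transaction) awaiting start
        self.ready = deque()    # started tasks waiting for the processor
        self.trace = []         # order in which tasks actually executed

    def prime(self, task_name, transaction):
        """Identify the next task and attach the transaction to it."""
        self.primed = (task_name, transaction)

    def start(self):
        """Call for execution of the primed task to begin."""
        if self.primed is None:
            raise RuntimeError("no task has been primed")
        self.ready.append(self.primed)
        self.primed = None

    def run(self, first_task, transaction):
        self.prime(first_task, transaction)
        self.start()
        while self.ready:
            name, txn = self.ready.popleft()
            self.trace.append(name)
            # Returning from the callable models the voluntary "exit".
            self.tasks[name](txn, self)
        return self.trace


def decode_request(txn, sched):
    txn["decoded"] = True
    sched.prime("build_acknowledge", txn)   # pass the transaction on
    sched.start()


def build_acknowledge(txn, sched):
    txn["acknowledge"] = "ok"
    # Exits without priming another task: the sequence is finished.


transaction = {}
scheduler = Scheduler({"decode_request": decode_request,
                       "build_acknowledge": build_acknowledge})
executed = scheduler.run("decode_request", transaction)
```

Because each task primes at most one successor before it exits, the ready queue never holds more than one entry for a given transaction, which is what keeps the sequence serial.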
Physical Resources. Physical resources4 are blocks of main memory which are dynamically allocated, i.e., allocated and deallocated after the common node operating system has been loaded to main memory and initialized. Physical resources are accessed by both task and service level routines. Resources accessed at the task level include buffers for input/output operations and control blocks for carrying out such operations. Resources accessed by service level routines include blocks to contain information about the status of task execution and the collection of resources associated with a task.
Transaction Processing Scheme
The scheme employed for processing transactions at the common node is very simple to describe but can involve some unexpected subtleties if it is rigidly enforced:
All resources, both logical and physical, required to completely process a transaction are claimed as a group before processing of the transaction commences.
In this manner, the classic lockout problem is avoided.
The lockout problem occurs when two partially completed transactions each require additional resources for their completion and at least one resource required by one transaction is currently assigned to the other transaction and vice versa. In general, the set of task level resources required to process a transaction to completion is easily determined. What is not so obvious is the group of resources accessed at the service level which can be required for the processing.
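The all-or-nothing claiming rule can be sketched as follows. The text does not describe the mechanism actually used at the common node, so the class `ResourcePool`, its methods, and the resource names are assumptions chosen for illustration; the point shown is only that a transaction which cannot obtain its whole group holds nothing while it waits.

```python
class ResourcePool:
    """Grants a transaction all of its resources at once, or none of them."""

    def __init__(self, names):
        self.free = set(names)
        self.owner = {}          # resource name -> owning transaction id

    def claim_all(self, txn_id, needed):
        needed = set(needed)
        if not needed <= self.free:
            return False         # claim nothing; no partial holdings exist
        self.free -= needed
        for name in needed:
            self.owner[name] = txn_id
        return True

    def release_all(self, txn_id):
        released = {n for n, o in self.owner.items() if o == txn_id}
        for name in released:
            del self.owner[name]
        self.free |= released


pool = ResourcePool(["io_buffer", "control_block"])
first_granted = pool.claim_all("A", ["io_buffer", "control_block"])
second_granted = pool.claim_all("B", ["control_block", "io_buffer"])  # must wait
pool.release_all("A")
second_retry = pool.claim_all("B", ["control_block", "io_buffer"])
```

Since transaction B never holds `control_block` while waiting for `io_buffer`, the circular wait that defines the lockout cannot form.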
Conversion of the Common Node Operating System
For reasons discussed above, the common node operating system is completely main memory resident. A discussion of the conversion of the operating system from two- to three-mode operation reduces to a discussion of the details of the memory management scheme employed, the layout of the operating system in physical memory, and the methods of communicating between different main memory logical address spaces.
Memory Management Scheme
The memory management scheme employed in the PDP-11 series of Digital Equipment Corporation computers supports a number of spaces, or modes, of logical main memory addresses. Each mode can be used to reference logical address locations 000 000 to 177 777 (octal), 
In addition, a page of logical address space can have a variable length of from 000 100 to 020 000 logical address locations (corresponding to 001 to 200 memory management units). Thus in some instances, a particular mode of logical space may have unused portions or "holes". This feature of the memory management scheme is used only at the task level in the present system. Associated with each mode of logical address space is a stack pointer register. Only one set of general registers is used for operations in all modes of the operating system, however. Two processor instructions are provided for moving small quantities of information from one mode to another. In particular, in order to move a word (two bytes) of information to or from the current mode of logical address space, the currently executing routine must be able to generate (either implicitly or explicitly) the following parameters:
(1) the mode of the logical address space which is to be the source or destination of the word to be transferred. This space is always referred to as the "previous" logical address space and its mode is referred to as the "previous" mode;
(2)
A second near-standard page assignment holds for logical space page number three; this page is almost always used to access the transaction information block,3 a dynamically allocated resource. Since part of this block becomes the ACKNOWLEDGE transaction parameter block, the task must load into it information which comprises the output of the transaction function. Hence this page is always assigned modification access.
It is worth noting here that the memory management scheme is deficient in one respect. The scheme does not divide up the mode logical address space into enough pages. Thus for the more complex task level subsystems, where many different resources must be accessed, the pages of logical address space must be dynamically switched back and forth between resources, a very cumbersome procedure.
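The page arithmetic behind this limitation can be checked directly from the octal figures quoted earlier: a mode spans locations 000 000 to 177 777, a page holds at most 020 000 locations, and page lengths are set in units of 000 100 locations. The sketch below is illustrative only; the function names are hypothetical, and `in_hole` assumes that a short page occupies the low end of its address range, which the text does not state.

```python
UNIT = 0o100               # address locations per memory management unit
MAX_PAGE = 0o20000         # maximum page length in address locations
MAX_UNITS = 0o200          # maximum page length in memory management units
PAGES_PER_MODE = 8         # eight pages per mode, as stated in the text
MODE_LIMIT = 0o177777      # highest logical address in a mode


def page_length(units):
    """Length in address locations of a page of the given number of units."""
    if not 1 <= units <= MAX_UNITS:
        raise ValueError("page length must be 001 to 200 (octal) units")
    return units * UNIT


def split_address(addr):
    """Return (page number, offset within page) for a logical address."""
    if not 0 <= addr <= MODE_LIMIT:
        raise ValueError("logical address out of range")
    return divmod(addr, MAX_PAGE)


def in_hole(addr, units_per_page):
    """True if addr falls in the unused part ("hole") of a short page.

    Assumption: a short page occupies the low end of its address range.
    """
    page, offset = split_address(addr)
    return offset >= page_length(units_per_page[page])


# Page three shortened to a single unit; every other page is full length.
layout = [MAX_UNITS] * PAGES_PER_MODE
layout[3] = 1
```

Eight full-length pages account for exactly the 200 000 (octal) locations of one mode, so a task that needs more than eight distinct regions has to remap pages while it runs.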
Memory Management of Service Routines. Code and resources belonging to the service level subsystems reside at the lowest level mode of logical address space, the kernel mode. The assignment of service level routines to the eight pages of kernel mode logical space is summarized in Table II. In contrast to the management of user mode logical space, the correspondence between kernel mode logical space and the physical main memory occupied by the routines executed in kernel mode is established at the time the common node operating system is loaded and initialized and does not change. This means that the kernel mode routines and resources do not overlap in logical address space.
The reasons for keeping the logical-to-physical space map constant in time and having the service level routines occupy kernel mode space in a non-overlapping manner are a result of both the functional definition of these routines and practical considerations. In practice, the kernel mode logical space is mapped into physical space on a one-to-one correspondence. An exception is the "external" page which contains logical space addresses used to access peripheral device registers. This page must be mapped to physical addresses 760 000 - 777 777. The logical space to physical space map for the kernel mode is summarized in Table II.
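The kernel mode map just described reduces to a very small translation function. The following is a sketch, not a transcription of Table II: the physical range 760 000 - 777 777 comes from the text, but the choice of logical page 7 as the external page (`EXTERNAL_PAGE`) and the function name are assumptions.

```python
PAGE_SIZE = 0o20000
EXTERNAL_PAGE = 7                   # assumption: not stated in the text
EXTERNAL_PHYSICAL_BASE = 0o760000   # from the text


def kernel_logical_to_physical(addr):
    """Translate a kernel mode logical address to a physical address."""
    if not 0 <= addr <= 0o177777:
        raise ValueError("logical address out of range")
    page, offset = divmod(addr, PAGE_SIZE)
    if page == EXTERNAL_PAGE:
        # Peripheral device registers live at the top of physical memory.
        return EXTERNAL_PHYSICAL_BASE + offset
    # Every other page is mapped one-to-one.
    return addr
```

Because the map is fixed when the operating system is loaded, a function of this kind never needs any state beyond the constants above.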
Inter-mode Communication Scheme for Control Parameters
Whenever a task requests a service from the operating system, it must first submit a small (≲ 10) number of control parameters which describe the service to be provided and then call the appropriate service routine into operation. As mentioned above, the mechanism for calling a service level routine into operation is the software trap. Communication of the control parameters to the requested routine must be accomplished in a manner which conforms to the system design objectives.
According to these system design objectives, communication between routines at different levels should be implemented by means of a physically (and logically) contiguous area of main memory space which is mapped by a set of globally defined logical offsets into the space. The method for accessing the communication area is to set a general register to its logical space start address and then access individual address units within the area by adding the offsets to the contents of this base register. In this manner, both the length and arrangement of the information within the area can be modified by redefining the logical offsets and relinking the routine. In the case of the service level routines, this scheme for communicating between routines has been implemented by utilizing the mode stacks.
A task which must request a service establishes a space on its stack (user mode stack) by adjusting the contents of its stack pointer register so that a logical number of address locations are added to the stack. Control parameters are moved into this space by utilizing the stack pointer and a set of the above-mentioned offsets. When task execution continues upon completion of the service, the control parameters returned by the service are also present in the stack area and are retrieved by utilizing the stack pointer and additional offsets in the set. After retrieving the returned parameters, the task destroys the communication area by removing the required number of logical address locations from the stack.
Thus one of the objectives of the functional approach, the rigorous specification of the input and output of a function, is realized. The input to and output from each service level routine are defined by a logical length of main memory space and a set of logical offsets. The parameters of a typical service which can be requested by a task level routine are illustrated in Fig. 1.
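The stack-based communication area can be sketched as a base address plus a set of globally defined offsets. Everything named here is hypothetical: the offsets `OFF_FUNCTION`, `OFF_ARGUMENT`, and `OFF_STATUS` do not reproduce the parameters of Fig. 1, and the class `ModeStack` models memory as a list of words with a stack that grows toward lower addresses.

```python
# Hypothetical globally defined offsets (in words) into the communication area.
OFF_FUNCTION = 0
OFF_ARGUMENT = 1
OFF_STATUS = 2
AREA_LENGTH = 3


class ModeStack:
    """One mode's stack; the stack pointer doubles as the area's base register."""

    def __init__(self, size=64):
        self.size = size
        self.memory = [0] * size
        self.sp = size              # empty stack; grows toward lower addresses

    def allocate(self, length):
        """Add `length` locations to the stack by adjusting the stack pointer."""
        if length > self.sp:
            raise OverflowError("stack overflow")
        self.sp -= length

    def release(self, length):
        """Remove `length` locations, destroying the communication area."""
        if self.sp + length > self.size:
            raise ValueError("releasing more than was allocated")
        self.sp += length

    def store(self, offset, value):
        self.memory[self.sp + offset] = value

    def load(self, offset):
        return self.memory[self.sp + offset]


user_stack = ModeStack(16)
user_stack.allocate(AREA_LENGTH)          # establish the communication area
user_stack.store(OFF_FUNCTION, 7)         # control parameters go in by offset
user_stack.store(OFF_ARGUMENT, 42)
function_seen = user_stack.load(OFF_FUNCTION)
user_stack.release(AREA_LENGTH)           # destroy the area afterwards
```

Only the offset constants know the layout, so lengthening or rearranging the area means redefining those constants and relinking, with no change to the accessing code.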
When the software interrupt (trap) instruction which calls a service routine into execution is encountered, the following sequence of operations takes place:
(1) execution of the software trap instruction itself causes the logical space mode of operation of the processor to be switched to a lower level;
(2) execution control moves via the software interrupt vector to a routine which examines the interrupt label and dispatches to the correct trap service routine;
(3) the first instruction executed by the service routine is a call to a special purpose subroutine which allocates an area on the current mode stack which is equal in length to the stack area prepared by the requestor routine;
(4) the control parameters which form the input to the function (service routine) are moved from the previous mode stack to the current stack.
The service routine, utilizing the same set of globally defined offsets mentioned above, accesses the input parameters and carries out the requested operations. The offsets are also used to place output parameters into the stack communication area. Upon completion of its operations, the service routine effects a return to its requestor by initiating the following operations:
(5) the next to last instruction executed by the service routine is a call to a special purpose subroutine which moves the control parameters to be returned from the current mode stack to the previous mode stack and removes the control parameter area from the current mode stack;
(6) the service routine executes its last instruction, a return-from-software-interrupt. This instruction switches the logical space mode of operation of the processor back to the mode of the requesting routine.
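Steps (1) through (6) can be traced in a small simulation. This is a sketch under stated assumptions and not the PDP-11 trap mechanism itself: the class `Processor`, the trap label `"ADD"`, the service `add_service`, and the three offsets are hypothetical, and each mode stack is modelled as a Python list whose end is the top of the stack.

```python
# Hypothetical offsets shared by the requestor and the service routine.
OFF_A, OFF_B, OFF_SUM = 0, 1, 2
AREA_LENGTH = 3


class Processor:
    def __init__(self, dispatch_table):
        self.stacks = {"user": [], "kernel": []}   # end of list = top of stack
        self.mode = "user"
        self.previous_mode = "user"
        self.dispatch_table = dispatch_table       # trap label -> routine

    def _base(self):
        return len(self.stacks[self.mode]) - AREA_LENGTH

    def read(self, offset):
        return self.stacks[self.mode][self._base() + offset]

    def write(self, offset, value):
        self.stacks[self.mode][self._base() + offset] = value

    def copy_in(self):
        """Steps (3) and (4): allocate the area and copy the input parameters."""
        previous = self.stacks[self.previous_mode]
        self.stacks[self.mode].extend(previous[-AREA_LENGTH:])

    def copy_out(self):
        """Step (5): return the parameters and remove the current mode area."""
        current = self.stacks[self.mode]
        values = current[-AREA_LENGTH:]
        del current[-AREA_LENGTH:]
        self.stacks[self.previous_mode][-AREA_LENGTH:] = values

    def trap(self, label):
        saved = (self.mode, self.previous_mode)
        self.previous_mode, self.mode = self.mode, "kernel"   # step (1)
        self.dispatch_table[label](self)                      # step (2)
        self.mode, self.previous_mode = saved                 # step (6)


def add_service(cpu):
    cpu.copy_in()
    cpu.write(OFF_SUM, cpu.read(OFF_A) + cpu.read(OFF_B))
    cpu.copy_out()


cpu = Processor({"ADD": add_service})
cpu.stacks["user"].extend([2, 3, 0])      # task prepares the area on its stack
cpu.trap("ADD")
result = cpu.read(OFF_SUM)                # task retrieves the returned value
del cpu.stacks["user"][-AREA_LENGTH:]     # task destroys the area
```

The service routine never touches the requestor's stack except through the two copy subroutines, which is the isolation the separate modes are meant to enforce.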
Not only tasks, but service routines themselves may request services supplied by the routines operating in kernel mode space. Thus an implicit hierarchical structure can be ascertained amongst the service level subsystems. It is shown below that this hierarchical structure becomes more explicit when the operating system is extended to three modes of logical address space. Requests in the other direction, i.e., from a service routine to a task, are, of course, not allowed.
Inter-mode Communication Scheme for Data
Quantities of information which exceed the maximum length specified for service routine control parameters are transferred between modes via physical resources. Also, such information has, in general, much longer lifetime requirements than do the control parameters. Space for control parameters is allocated at the time that execution of a service routine is requested and deallocated upon completion of this execution. Hence space for the parameters is reserved for a maximum of ~10^2 - 10^3 μsec. In contrast, space required for physical resources is allocated at the start of transaction processing and deallocated when the transaction has been completely processed. As mentioned above, this processing may require ~10 - 10^3 milliseconds.
Thus the time requirements for the re-entrancy of the two types of information, as well as the constraints on their maximum amounts, are different by a factor of 10 - 10^3.
These differences in re-entrancy requirements between the two types of information lead to differences in the manner in which multi-mode access to the information is provided. Whereas the control parameters, being few in number, are simply copied to the logical space of the new mode when a service is requested, quantities of information in physical resources are accessed in different modes by establishing a correspondence between the different logical space start addresses of the physical resources in the different modes. Tables and code to Tables III and IV .
Conclusions
Extension of the operating system at the common node has made it possible to increase the number of task level subsystems present at the node from four to twelve; correspondingly, the total number of tasks has been increased from 33 to 128. Perhaps the most important aspect of this exercise is that it has provided some working experience in dealing with the problems which arise in the process of partitioning functions, defining rigorously the methods for communicating between the functions, and establishing a hierarchical framework for the functions.
It is believed that the requirements which must be met when sets of functions are confined to separate modes of logical address space closely approximate those which would have to be met if the functions were to be partitioned and distributed onto individual system nodes at a level below that of the common node itself.
While such a partitioning exercise is theoretically possible in a computer which has only one mode of logical address space, in practice an actual implementation on a multimode machine is required to bring out all the subtleties of the technique and to rigidly enforce, during the implementation phase, adherence to the principles of the technique. It has been the author's experience that attempts to implement a partitioned system on a single-mode machine are always foiled by the enormous pressures placed on the systems analyst to take short cuts and violate the logical space boundaries which, in the case of such single-mode machines, can be only artificially enforced.
Future Work
The next constraint on extending the set of common node services will be imposed when the physical space available to the task level routines and resources is exhausted. However, the node has been functionally defined in such a way that only one task need be in execution at any one time. This means that the operations required to establish the logical-to-physical space map need to be performed only a small number of times in the course of a complete execution of a task. Furthermore, the logical space (and hence the maximum physical space) available to a task has been defined in such a way that it constitutes only a portion of the total physical space available for user mode routines. These facts suggest that additional memory management hardware could be added to the node in order to map a portion of the user mode physical main memory space into a large bank of external memory. Such memory management hardware, a module of which is termed a multiport memory controller, has already been developed here at Brookhaven and is described in detail elsewhere.9 Such an addition to the node hardware would allow many more tasks to be added to the common node. Since all task code executes at the same page of logical address space, the linking procedures for these new tasks would be the same as those for the present tasks. A rather sophisticated extension to the present task loader program would be required, however.
As mentioned above, the eight pages of logical address space are insufficient. After pages have been assigned to the task stack, the task itself, and its standard statically allocated resources, and an additional page has been assigned to the transaction information block, only two or three pages remain for accessing the dynamically allocated resources. It is believed that sixteen pages would be a comfortable number. The prospects for a machine in the PDP-11 series with such a memory management scheme appearing on the market appear to be nonexistent. It may be possible, however, to implement such a memory management scheme by utilizing the microcoding capabilities of the PDP-11/60. Also, more than three hierarchy levels are present among the routines used to implement the common node. If a memory management scheme with more modes were available, even more partitioning of these routines to separate logical address spaces could take place and a more reliable common node would result.
... present at the common node to lower level nodes. In this scheme, each subsystem would have its own processor, probably an LSI-11, and the main function of the common node system level processor would be to arbitrate the access that these processors would have to a large bank of main memory. It is believed that the present work is a step toward the realization of such a system.
