Abstract
Introduction
Recently, PAFAS has been proposed as a useful tool for comparing the worst-case efficiency of asynchronous systems [11, 6]. PAFAS is a CCS-like process description language [15] where basic actions are atomic and instantaneous but have an associated time bound, interpreted as a maximal time delay for their execution. As discussed in [11, 6], due to these upper time bounds time can be used to evaluate efficiency, but it does not influence functionality (which actions are performed); so, like CCS, PAFAS treats the full functionality of asynchronous systems. Processes are compared via a variant of the testing approach developed in [8]. Unlike [8], our tests are not simply test environments but test environments together with a time bound. A process is embedded into the environment (via parallel composition) and satisfies a (timed) test if success is reached before the time bound in every run of the composed system, i.e. even in the worst case. This gives rise to a preorder relation over processes which is naturally an efficiency preorder. This efficiency preorder can be characterized as inclusion of some kind of refusal traces; this also provides a decidability result for the preorder for finite-state processes. Furthermore, the preorder is independent of the choice to let time progress in a continuous or discrete way; therefore, we only consider discrete time in this paper. These ideas and results were originally studied within the Petri net formalism [16, 10]. We refer the reader to [6] for more details and results on PAFAS.
This paper shows the applicability of PAFAS to concrete meaningful examples. We consider three different implementations of a bounded buffer and relate them according to the above-mentioned efficiency preorder. The three implementations are called Fifo, Pipe and Buff. Fifo is a bounded-length first-in-first-out queue, which one could also consider as the specification of a bounded buffer; Pipe is a sequence of one-place buffers connected end to end; and Buff is an array used in a circular fashion.
We prove that Fifo and Pipe are unrelated according to our (worst-case) efficiency preorder (unrelated means that the former process is not more efficient than the latter one and vice versa); this is presumably in contrast to expectation, since only in Pipe items have to be transported in several steps from one end to the other. Similarly, one would expect Buff to be faster than Pipe, since the latter needs more such steps, but they also turn out to be unrelated. We give good reasons for these results and also prove that Fifo is more efficient than Buff, but not vice versa.
For Buff and Pipe, the same results were obtained in [16], where more or less the same efficiency preorder was defined and studied for Petri nets as specification model instead of a process description language such as PAFAS. This shows that the ideas behind our efficiency preorder are not model-dependent, though, of course, the different models impose a different development.
The same buffers we consider were also contrasted in [2]. Their approach is based on a bisimulation-based preorder; visible actions are regarded as instantaneous and the costs are measured as the number of internal actions. Hence [2] presents an interleaving approach, which disregards the parallel execution of actions. According to this efficiency measure, it has been proven that Fifo is more efficient than Buff and Buff is more efficient than Pipe. Since parallel execution of actions is taken into account in the present paper, Pipe is incomparable to the others in our approach.
The rest of the paper is organized as follows. The next section briefly recalls PAFAS and the characterization of the testing scenario in terms of refusal traces. Section 3 provides a description of the three buffer implementations and their operational behaviour, while Section 4 studies their relationships according to the efficiency preorder.
PAFAS and Its Alternative Characterization
This section briefly presents our process description language PAFAS, its operational semantics and the preorder relating processes according to their worst-case efficiency. For a more detailed treatment of the general framework we refer the reader to our original paper [6]. A full version of this paper is [7].
Processes and Refusal Traces
In [6] a CCS-like process algebra, called PAFAS, is introduced for comparing asynchronous systems w.r.t. functionality and efficiency. The faster-than relation $\sqsupseteq$ is characterized using some sort of refusal traces, which provide decidability of the testing preorder for finite-state processes (such as those considered in [12]).
Due to space limitations, we omit the presentation of the respective testing theory and pay more attention to the refusal semantics, considered for our comparison.
A (ranged over by a, b, c pects from asynchronous systems. PAFAS is a slight extension of the architectural description language described in [12, 13]. It has a CCS [15] sequential composition (called action prefix) and a TCSP [9] parallel composition. In this paper we only report the PAFAS operators strictly needed for our comparison (see [6] for a full account of the language).
We assume that time elapses in a discrete way. In the syntax of process terms, $\gamma$ is $\alpha$ or $\underline{\alpha}$ for some $\alpha \in \mathbb{A}_\tau$, $\Phi$ is a general relabelling function, $x \in \mathcal{X}$ and $A \subseteq \mathbb{A}$ is possibly infinite. $0$ is the Nil-process, which cannot perform any action, but may let time pass without limit; a trailing $0$ will often be omitted, so e.g. $a.b + c$ abbreviates $a.b.0 + c.0$. $\alpha.P$ and $\underline{\alpha}.P$ are (action-)prefixing, known from CCS. In particular, process $\alpha.P$ performs $\alpha$ with a maximal delay of 1; hence, it can either perform $\alpha$ immediately, or can idle for time 1 and become $\underline{\alpha}.P$. In the latter case, the idle-time has elapsed; hence $\alpha$ must either occur or be deactivated (in a choice-context) before time may pass further, unless it has to wait for synchronization with another component (in case $\alpha \neq \tau$). This means that our processes are patient: as a stand-alone process, $\underline{a}.P$ has no reason to wait; but as a component in $\underline{a}.P \parallel_{\{a\}} a.Q$, it has to wait for synchronization on $a$ and this can take up to time 1, since component $a.Q$ may idle this long. $P_1 + P_2$ models the choice between two conflicting processes $P_1$ and $P_2$. $P_1 \parallel_A P_2$ is the parallel composition of two processes $P_1$ and $P_2$ that run in parallel and have to synchronize on all actions from $A$; this synchronization discipline is inspired by TCSP. $P[\Phi]$ behaves as $P$ but with the actions changed according to $\Phi$. $\mu x.P$ models a recursive definition; in the examples, we will define processes by recursive equations instead.
We are now ready to define the refusal traces of a process $P$. Intuitively, a refusal trace records, along a computation, which actions process $P$ can perform ($P \xrightarrow{\alpha}_r P'$, $\alpha \in \mathbb{A}_\tau$) and which actions $P$ can refuse to perform ($P \xrightarrow{X}_r P'$, $X \subseteq \mathbb{A}$). A transition like $P \xrightarrow{X}_r P'$ is called a time step.
The actions listed in the set $X$ are not urgent; hence, $P$ is justified in not performing them, but performing a time step instead. Other actions might be urgent, so as a stand-alone process $P$ might actually be unable to make a time step; but as a component of a larger system, it might take part in a time step if it has to synchronize on those other actions with the environment and the latter can refuse them (see rule $\mathrm{Pref}_{r2}$ in Fig. 1). $P$ can make a time step in any context if $X = \mathbb{A}$. The time-step relations satisfy $\xrightarrow{X}_r \subseteq (\mathbb{P} \times \mathbb{P})$, where $X \subseteq \mathbb{A}$.
The rules in Fig. 1 explain the operational semantics of PAFAS processes. A process like $\alpha.P$ can either perform action $\alpha$ and then become $P$ (rule $\mathrm{Pref}_{a1}$), or can let time 1 pass and refuse to perform $\alpha$ (after which action $\alpha$ becomes urgent, rule $\mathrm{Pref}_{r1}$). A process $P$ prefixed by an urgent action $\underline{\alpha}$, $\underline{\alpha}.P$, can perform the action $\alpha$ (rule $\mathrm{Pref}_{a2}$) and on its own cannot delay such an execution (rule $\mathrm{Pref}_{r2}$). Since internal action $\tau$ never has to be synchronized, a process prefixed by an urgent $\tau$ cannot make a time step. Another rule worth noting is Par, which defines which actions a parallel composition can refuse during a time step. The intuition is that $P_1 \parallel_A P_2$ can refuse an action $a$ if either $a \notin A$ ($P_1$, $P_2$ are not forced to synchronize on $a$) and both $P_1$, $P_2$ can refuse $a$, or $a \in A$ ($P_1$, $P_2$ are forced to synchronize on $a$) and either $P_1$ or $P_2$ can refuse $a$. The other rules are as expected.* For sequences $w \in (\mathbb{A}_\tau \cup 2^{\mathbb{A}})^*$, we define $P \xrightarrow{w}_r P'$ as expected: $P \xrightarrow{w}_r P$ if $w = \varepsilon$ (the empty sequence), or there exist $Q \in \mathbb{P}$ and $\mu \in (\mathbb{A}_\tau \cup 2^{\mathbb{A}})$ such that $P \xrightarrow{\mu}_r Q \xrightarrow{w'}_r P'$ and $w = \mu w'$. Define $P \stackrel{v}{\Rightarrow}_r P'$ if $P \xrightarrow{w}_r P'$ and $v$ is the sequence $w$ with all $\tau$'s removed. Finally, $RT(P) = \{ w \mid P \stackrel{w}{\Rightarrow}_r \}$ is the set of refusal traces of $P$.
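The time-step rules described above can be illustrated with a small Python sketch. It covers only Nil, prefix and parallel composition, assumes a finite set of visible actions, and computes the largest set a term can refuse in a time step together with the resulting term. The class names (`Nil`, `Prefix`, `Par`) and the function `time_step` are hypothetical and not part of PAFAS or of [6]; the rule for parallel composition follows the informal reading given in the text, not the exact rule of Fig. 1.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple, Union

TAU = "tau"  # internal action; it is never synchronized


@dataclass(frozen=True)
class Nil:
    pass


@dataclass(frozen=True)
class Prefix:
    action: str
    urgent: bool  # True once time 1 has elapsed for this action
    cont: "Term"


@dataclass(frozen=True)
class Par:
    left: "Term"
    sync: FrozenSet[str]  # actions the two components must synchronize on
    right: "Term"


Term = Union[Nil, Prefix, Par]


def time_step(p: Term, alphabet: FrozenSet[str]) -> Optional[Tuple[FrozenSet[str], Term]]:
    """Return (largest refusable set, successor), or None if no time step exists."""
    if isinstance(p, Nil):
        # Nil lets time pass without limit and refuses everything.
        return alphabet, p
    if isinstance(p, Prefix):
        if not p.urgent:
            # A non-urgent prefix may idle for time 1; its action becomes urgent.
            return alphabet, Prefix(p.action, True, p.cont)
        if p.action == TAU:
            # An urgent tau cannot be delayed by any environment.
            return None
        # An urgent visible action cannot be refused by the process itself.
        return alphabet - {p.action}, p
    left = time_step(p.left, alphabet)
    right = time_step(p.right, alphabet)
    if left is None or right is None:
        return None
    (x1, p1), (x2, p2) = left, right
    # Synchronized actions: one refusing component suffices.
    # Other actions: both components must refuse.
    refusable = (p.sync & (x1 | x2)) | ((x1 & x2) - p.sync)
    return refusable, Par(p1, p.sync, p2)
```

For the patient process $\underline{a}.P \parallel_{\{a\}} a.Q$ from the text (with $P = Q = 0$), the sketch reports that the composition can refuse $a$ for one time step, after which $a$ is urgent in both components and can no longer be refused.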
The efficiency preorder $\sqsupseteq$ is characterized by refusal-trace inclusion (see [6] for the proof).
Theorem 2.3 (Characterization of the efficiency preorder)
Process $P$ is faster than $Q$ ($P \sqsupseteq Q$) if and only if $RT(P) \subseteq RT(Q)$.
*[6] uses a different rule $\mathrm{Rec}_a$, which is equivalent due to guardedness.
Description of the Three Buffers
We now describe three implementations of a buffer of capacity N + 2, where N is a fixed positive natural number.
The buffer receives and stores values from $V = \{0, 1\}$. We use the following notation: strings are denoted $s, t, \ldots$ and $|s|$ denotes the length of $s$; thus, $|s| = 0$ means $s = \varepsilon$. $V^k$ denotes the set of strings in $V^*$ of length $k$, while $V^{\le k} = \bigcup_{i \le k} V^i$ denotes the set of strings of length at most $k$.
In the following, we use the notation $\sum_{e \in V} P(e)$, where the "value-variable" $e$ appears in $P(e)$; the notation stands for $P(0) + P(1)$, where $P(0)$ is obtained by substituting 0 for $e$ in a way that will be obvious, and similarly for $P(1)$.
We also extend $V$ to include values $\underline{0}, \underline{1}$ and put $D = \{0, 1, \underline{0}, \underline{1}\}$ and $D_\bot = D \cup \{\bot, \underline{\bot}\}$. $\bot$ is a special value which will denote the absence of values. We assume the properties $\underline{\underline{a}} = \underline{a}$ and $\underline{\underline{\bot}} = \underline{\bot}$. The extension of underlining to strings is as expected.
Buffer "Fifo"
Buffer Fifo directly implements a first-in-first-out queue of capacity $N + 2$; it has no overhead in the form of internal actions, and it is purely sequential. The state is denoted by the string contained in Fifo; thus, the state space is given by $V^{\le N+2}$. The software architecture of Fifo is described in Fig. 2.
The target state for a time step of $Fifo(s)$ is denoted by $\underline{Fifo}(s)$ and is defined by the following recursive definitions:
if $|s| = N + 2$ and $s = ds'$, then
$|s| < N + 2$ implies $Fifo(s) \xrightarrow{in(d)}_r Fifo(sd)$;
Observe that $\underline{Fifo}(s)$ for $0 < |s| < N + 2$ has an urgent input and an urgent output to perform; after each it becomes some $Fifo(s')$, which has no urgent action. This reflects that a sequential process may let time 1 pass after each action.
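The functional (untimed) behaviour of Fifo can be sketched in Python as follows. The sketch enumerates the state space $V^{\le N+2}$ and the action transitions of a state; time steps and urgency are deliberately left out. The function names are hypothetical, and the output label `out(d)` is an assumption made for illustration, since only the input transition is legible in the text above.

```python
from itertools import product
from typing import List, Tuple

V = ("0", "1")  # the value set V = {0, 1}


def fifo_states(n: int) -> List[str]:
    """All strings over V of length at most n + 2 (the state space of Fifo)."""
    capacity = n + 2
    return ["".join(t) for k in range(capacity + 1) for t in product(V, repeat=k)]


def fifo_actions(s: str, n: int) -> List[Tuple[str, str]]:
    """Action transitions of Fifo(s) as (label, target state) pairs."""
    moves: List[Tuple[str, str]] = []
    if len(s) < n + 2:
        # Input appends the new value at the end of the queue.
        moves += [(f"in({d})", s + d) for d in V]
    if s:
        # Output (assumed label) delivers the oldest value, the head of s.
        moves.append((f"out({s[0]})", s[1:]))
    return moves
```

For $N = 1$ the sketch yields $2^{4} - 1 = 15$ states, and a full buffer such as `"011"` offers only the output of its oldest value.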
Buffer "Pipe"
A buffer can also be seen as the "concatenation" of cells, each of them containing at most one value. A cell is an input-output device. It is defined as follows: 
$C(x)[\Phi_j]$,
where the relabelling $\Phi_j$ is defined as:
$\bot$ otherwise and for $j = 1, \ldots, N$. Then, all action transitions of the process $Pipe(s)$ are the following:
Buffer "Buff"
Assume that N cells are not connected end to end (as in Pipe) but are used as a storage. The cells interact with a centralized buffer controller that can store two more values.
The software architecture of Buff is described in Fig. 4.
First, we describe the functional behaviour of store Mem.
Definition 3.7 (Mem)
Let $x \in D_\bot$ and $j = 0, \ldots, N-1$.
The $j$-th element of Mem is described by process $B_j(x) \equiv C(x)[\Phi'_j]$, where the various relabelling functions are defined as:
and let $Mem(s) \equiv B_0(s_0) \parallel_\emptyset \cdots \parallel_\emptyset B_{N-1}(s_{N-1})$.
Mem acts as a store. It is used by a buffer controller (BC) to store data received from the external environment. BC uses the $N$ cells of Mem as a circular queue (ordered as $0 < 1 < \ldots < N-1$). In more detail, the buffer controller accepts a value from the external environment and then writes the accepted value into the first available empty cell. It cannot accept any other value until the accepted one is actually stored in one of the $N$ cells. BC also retains the oldest undelivered value and delivers it whenever possible.
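The circular use of the $N$ cells can be sketched in Python as below. This is only an illustration of the indexing discipline, under the assumption that $i$ is the index of the oldest stored value and $m$ the number of stored values, so that the next write goes to cell $(i + m) \bmod N$. It ignores timing, the synchronization between BC and Mem, and the two extra values held by the controller; the class name `CircularStore` is hypothetical.

```python
from typing import List, Optional


class CircularStore:
    """N cells used as a circular queue, ordered 0 < 1 < ... < N-1."""

    def __init__(self, n: int) -> None:
        self.cells: List[Optional[int]] = [None] * n  # None plays the role of "no value"
        self.i = 0  # index of the oldest stored value
        self.m = 0  # number of values currently stored

    def write(self, d: int) -> int:
        """Store d in the first available empty cell and return its index."""
        n = len(self.cells)
        if self.m == n:
            raise OverflowError("all cells are full")
        j = (self.i + self.m) % n
        self.cells[j] = d
        self.m += 1
        return j

    def read(self) -> int:
        """Remove and return the oldest stored value."""
        if self.m == 0:
            raise LookupError("store is empty")
        a = self.cells[self.i]
        self.cells[self.i] = None
        self.i = (self.i + 1) % len(self.cells)
        self.m -= 1
        return a
```

With three cells, writing 1, 0, 1 fills cells 0, 1, 2; after one read the next write wraps around to cell 0, and values still leave the store in first-in-first-out order.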
The state BC(z, y, i, m ) of BC is determined by the four arguments:
(1) x E V, the value in input. That is, the value that BC has recently accepted from the environment; if z =I, then a new value can be accepted. The buffer controller is then defined as follows.
(2) y E V, the value in output. That is, the value read from Mem that can be made available to the external environment; if y =I, then no value in output is avail- 
$Buff(s, x, y, i, m) \equiv Mem(s) \parallel_B BC(x, y, i, m)$

$BC(d, \bot, i, 0) \equiv w_i(d).BC(\bot, \bot, i, 1)$

$\underline{BC}(d, \bot, i, 0) \equiv \underline{w_i}(d).BC(\bot, \bot, i, 1)$

$0 < m$ implies $\underline{BC}(\bot, \bot, i, m) \equiv \left(\sum_{d \in V} in(d).BC(d, \bot, i, m)\right) + \left(\sum_{a \in V} p_i(a).BC(\bot, a, (i + 1) \bmod N, m - 1)\right)$

$BC(\bot, a, i, m) \equiv (\sum_{d \in V} in(d). \ldots$

$0 < m < N$ and $j = (i + m) \bmod N$ imply $\underline{BC}(d, \bot, i, m) \equiv (\sum_{a \in V} p_i(a).BC(d, a, \ldots$
The following proposition deals with $\underline{Buff}$ in place of $Buff$.
Proposition 3.11 Let $a, d \in V$ and
Comparing the three buffers
This section is the core of the paper. We compare the three buffer implementations with respect to the efficiency preorder $\sqsupseteq$ discussed in Section 2.1. To do this, we will exploit the alternative characterization in terms of refusal traces.
Given two processes $P$ and $Q$, to prove that $P$ is not more efficient than $Q$, $P \not\sqsupseteq Q$, we exhibit a refusal trace of $P$ that $Q$ cannot perform. This is sufficient since, by Theorem 2.3, $RT(P) \not\subseteq RT(Q)$ implies $P \not\sqsupseteq Q$. On the other hand, to prove that $P \sqsupseteq Q$, we show that $RT(P) \subseteq RT(Q)$, again by Theorem 2.3. Proving this turns out to be a little more involved. It is well known, however, that trace inclusion can be shown by exhibiting a suitable simulation relation. In our setting, to prove $RT(P) \subseteq RT(Q)$, we give a simulation relation between states of the refusal transitional semantics of $P$ and of $Q$. A simulation relation is defined as follows:
A relation $\mathcal{R}$ is a simulation relation for two processes $P$ and $Q$ if $(P, Q) \in \mathcal{R}$ and whenever $(R, S) \in \mathcal{R}$ and
The existence of a simulation relation for two processes $P$ and $Q$ ensures $RT(P) \subseteq RT(Q)$ and hence $P \sqsupseteq Q$.
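For finite-state systems, the search for a simulation relation can be mechanized. The following Python sketch computes the greatest simulation between two finite labelled transition systems by iterated refinement. It is a sketch under simplifying assumptions that are not taken from the paper: transition systems are given as dictionaries, labels (actions or refusal sets) are compared for plain equality, and internal actions are not abstracted away. The names `greatest_simulation` and `simulated_by` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A transition system maps each state to its list of (label, successor) pairs.
LTS = Dict[str, List[Tuple[str, str]]]


def greatest_simulation(p: LTS, q: LTS) -> Set[Tuple[str, str]]:
    """Largest relation in which every move of the left state is matched on the right."""
    rel = {(r, s) for r in p for s in q}
    changed = True
    while changed:
        changed = False
        for (r, s) in list(rel):
            # Every transition of r needs an equally labelled transition of s
            # whose target is still related to the target of r.
            matched = all(
                any(lab2 == lab and (r2, s2) in rel for (lab2, s2) in q[s])
                for (lab, r2) in p[r]
            )
            if not matched:
                rel.discard((r, s))
                changed = True
    return rel


def simulated_by(p: LTS, p0: str, q: LTS, q0: str) -> bool:
    """True if some simulation relation contains the pair of initial states."""
    return (p0, q0) in greatest_simulation(p, q)
```

A positive answer implies trace inclusion for the two systems, while a negative answer does not by itself disprove it, since simulation is only a sufficient condition.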
Relating Fifo and Pipe
We start with $Fifo \equiv Fifo(\varepsilon)$ and $Pipe \equiv Pipe(\bot^{N+2})$ (for our fixed positive $N$) and prove that they are unrelated.
Relating Fifo and Buff
Now we relate $Buff \equiv Buff(\bot^N, \bot, \bot, 0, 0)$ and Fifo for our fixed positive $N$.
Buff, like Pipe, performs internal activities to manage the store. Hence, it cannot be more efficient than Fifo.
Although Buff is a distributed implementation (as Pipe is), it works sequentially (see above). Therefore, we do not have the effect that prevents Fifo from being more efficient than Pipe, and Fifo is indeed strictly more efficient than Buff.
Relating Pipe and Buff
Finally, we relate Pipe and Buff. Again, one would expect that Buff is more efficient, because it takes less time to move an item from input to output, but actually these two buffer implementations are unrelated in general, as the following theorem shows.
Concluding Remarks
In this paper, we have shown the applicability of PAFAS to concrete meaningful examples by contrasting the worst-case efficiency of three bounded buffers.
The worst-case efficiency is one of the most prominent efficiency measures in the traditional theory of computation and complexity. This paper shows how such an efficiency measure can be successfully and smoothly applied to compare complex systems.
Several papers in the literature have advocated the importance of making qualitative and quantitative analysis of complex systems early in the software life cycle. This explains the recent proliferation of papers aiming at comparing systems according to non-qualitative aspects. Most of them concentrate on performance evaluation in the traditional sense, allowing for analysis of throughput, response time, component utilization etc. Typically they derive performance models such as queuing networks or Markov chains (see [1] and [3] resp., and references therein) from the dynamic behaviour of the system description. Over such performance models, performance analysis as mentioned above can be carried out by exploiting standard techniques.
Our framework allows for the integrated study of both functionality and efficiency of the system dynamic behaviour. Besides comparing different system descriptions according to these two aspects (as we have done in the current paper), the framework is also particularly suitable during the step-wise refinement of abstract specifications. In [6] we consider a system server that manages requests from the external environment. We start with a purely sequential specification and then refine, step-by-step, this very abstract description by adding more and more parallelism to reach a completely parallel description. This latter more concrete view gives a full account of the logical/physical structure of the system we want to implement and can be used as a guide to the system implementation. Each refinement step is validated according to our efficiency preorder in order to make sure that each description behaves as dictated by the refined one while, in addition, its worst-case efficiency improves.
During this stepwise refinement, we show how our preorder is able to remove useless non-deterministic sub-behaviours from the various descriptions.
We do not need to generate extra models apart from the state transition system describing the dynamic behaviour of the system. Standard simulation-based techniques on system states can then be exploited to verify whether or not a system description is more efficient than another. If $P$ is not faster than $Q$, i.e. $P \not\sqsupseteq Q$, then there is a refusal trace of $P$ that is not one of $Q$. This is a witness of slow behaviour of $P$; it is diagnostic information that tells us why $P$ is not faster. If $P$ and $Q$ are finite-state, inclusion of refusal traces can be checked automatically; a respective tool, FastAsy, has been developed for a Petri net setting [4], and we plan to adapt this for PAFAS. In case $P$ is not faster, FastAsy presents a respective refusal trace; this can be used to improve $P$, and in practice it can also help to find errors that can occur when formalizing an intuitive idea as a PAFAS process.
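How a witness trace can be found for finite-state systems is sketched below in Python. The sketch is not FastAsy and makes no claim about how that tool works; it is a generic trace-inclusion check that explores the first system together with the set of states the second system can be in after the same label sequence. Transition systems are assumed to be given as dictionaries with string labels (for instance an action name or a printed refusal set), every label sequence is treated as a trace, and internal actions are not hidden. The name `witness_trace` is hypothetical.

```python
from collections import deque
from typing import Deque, Dict, FrozenSet, List, Optional, Tuple

# A transition system maps each state to its list of (label, successor) pairs.
LTS = Dict[str, List[Tuple[str, str]]]


def witness_trace(p: LTS, p0: str, q: LTS, q0: str) -> Optional[List[str]]:
    """Return a trace of p that q cannot perform, or None if every trace of p is one of q."""
    start = (p0, frozenset({q0}))
    seen = {start}
    queue: Deque[Tuple[Tuple[str, FrozenSet[str]], List[str]]] = deque([(start, [])])
    while queue:
        (r, qs), trace = queue.popleft()
        for label, r2 in p[r]:
            # All states q may reach by performing the same label.
            qs2 = frozenset(s2 for s in qs for (lab, s2) in q[s] if lab == label)
            if not qs2:
                return trace + [label]
            nxt = (r2, qs2)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, trace + [label]))
    return None
```

A returned list plays the role of the diagnostic information mentioned above: it is a behaviour of the first system that the second one cannot match.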
We are also investigating the possibility of deriving performance measures for the (worst-case) time needed by a given system to satisfy specific user requests.
