The impact of timing on linearizability in counting networks by Mavronicolas, M. et al.





















Counting networks form a new class of distributed, low-contention data structures, made
up of balancers and wires, which are suitable for solving a variety of multiprocessor
synchronization problems that can be expressed as counting problems. A linearizable counting
network guarantees that the order of the values it returns respects the real-time order in which
they were requested. Linearizability significantly raises the capabilities of the network, but at
a possible price in network size or synchronization support [13, 18]. In this work, we further
pursue the systematic study of the impact of timing assumptions on linearizability for
counting networks, along the line of research recently initiated by Lynch et al. in [18]. We
consider two basic timing models: the instantaneous balancer model, in which the transition
of a token from an input to an output port of a balancer is modeled as an instantaneous
event, and the periodic balancer model, where balancers send out tokens at a fixed rate.
In both models, we assume lower and upper bounds on the delays incurred by wires
connecting the balancers. We present necessary and sufficient conditions for linearizability in
these models, in the form of precise inequalities that involve not only parameters of the
timing models, but also certain structural parameters of the counting network, which may
be of more general interest. Our results extend and strengthen previous impossibility and
possibility results on linearizability in counting networks [13, 18].

This work will also appear for publication elsewhere. Contact author: Philippas Tsigas
† Partially supported by funds for the promotion of research at University of Cyprus (research project
"Parallel, Concurrent and Distributed Computations"). Part of the work of this author was performed while visiting
Max-Planck Institut für Informatik, Saarbrücken, Germany.
1 Introduction
In the counting problem, a number of concurrent processors repeatedly assign themselves successive
values from a given range, such as memory addresses or destinations on an interconnection
network. A solution is said to be linearizable [14] if the order of the assigned values reflects
the real-time order in which they were requested. Linearizable counting provides the ground for
a number of concurrent solutions to significant multiprocessor synchronization problems, such
as time-stamp generation, multi-version database handling, scheduling of multi-threaded computations,
implementation of data structures, dynamic load balancing, and buffer management
(see, e.g., [7, 9, 11, 23]).
A counting network [3] is a highly concurrent data structure used to implement a counter.
Roughly speaking, a counting network is a directed graph whose nodes are simple computing
elements called balancers, and whose edges are called wires. A request for a counter value is
represented by a token, which enters on one of the network's input wires, propagates through
the network asynchronously by traversing a sequence of balancers, and leaves on an output
wire. Counting networks are among the very few counting techniques that are known to be
scalable, since they minimize contention ("hot-spots") as concurrency increases by distributing
memory accesses, thus increasing parallelism and throughput (see, e.g., [3, 6, 12, 21, 22]).
In order to enhance the design of concurrent counting techniques, so that they are both
scalable and support effective specification and analysis of MIMD shared memory algorithms
(which rely on linearizability for correctness), it would be desirable to construct linearizable
counting networks. Alternatively, it would be useful to study the possibility of using additional
software constructs in order to extend a given counting network to become linearizable [13],
or the conditions under which implemented counting networks always exhibit a linearizable
behavior [18].
In this work we pursue the latter approach, following a direction pointed out by a recent
watershed paper [18], which studied linearizability properties of uniform counting networks
relative to timing assumptions on traffic speed. We further continue the systematic study of
the impact of timing assumptions on linearizability for counting networks. More specifically,
we study the boundaries between linearizable and non-linearizable behaviors of any counting
network with respect to speed variations of its tokens and balancers, in the hope of providing
practitioners with additional formal tools to support decision making in the design phase.
We provide necessary and sufficient conditions for a counting network to be linearizable, in the
form of precise inequalities expressed in terms of specific graph-theoretic parameters and their
relation to variations in traffic speed.
In more detail, we consider two basic timing models for balancer implementations in either
shared memory or message passing. In both models, we follow Lynch et al. [18] and assume
lower and upper bounds, $c_{\min}$ and $c_{\max}$, respectively, on the time it
takes a token to traverse a wire from balancer to balancer. In the instantaneous balancer model,
introduced and studied by Lynch et al. [18], the transition of a token from an input to an output
port of a balancer is modeled as an instantaneous event. As mentioned in [18], this can be




bounds including the traversal of a node, but the
output being determined at the instant of the token arrival. However, in some implementations
of balancers there may be restrictions due to bandwidth or clock rates. In shared memory
implementations of balancers, memory accesses to variables implementing the balancers may
require a constant number of steps to be completed due to restrictions in bandwidth, while
in message passing implementations, processors that use messages to "simulate" the balancers
may have access to clocks running at a bounded or fixed rate (see, e.g., [3, 12, 21, 22]).
We model these "non-instantaneous" implementations by introducing a new timing model,
called the periodic balancer model, assuming a constant period at which a balancer forwards
tokens to its outputs. This assumption is motivated by periodic constraints commonly used in
many real-time problems (especially in scheduling real-time tasks on multiprocessors [15, 17]),
and resembles a timing model for periodic processes studied by Rhee and Welch [20]. The
periodic balancer model is more realistic in that it models balancer delay to be proportional
to the number of tokens concurrently traversing a balancer; this modeling is aligned with
the concept of stalls introduced and used by Dwork et al. [6] in their elegant framework for





$r_{\min}$ and $r_{\max}$ denote the minimum and maximum balancer periods, respectively,
over all balancers in the network.
We study, in particular, uniform and non-uniform counting networks; in uniform networks [18],
each node lies on some path from inputs to outputs, and all paths from inputs to
outputs have the same length. Our study introduces and identifies a crucial graph-theoretic
parameter of counting networks, called influence radius and denoted $\mathrm{irad}(\tilde{G})$ for the undirected
version $\tilde{G}$ of $G$; roughly speaking, the influence radius is defined to be the maximum common
length of two maximal paths from an inner node of the network to any two output nodes,
and captures the maximum degree of influence two output nodes can receive in common. It
turns out that the influence radius determines in a precise, quantitative way whether a uniform
counting network is linearizable. We believe that the influence radius plays a similar role for
other classes of networks as well. Our specific results and their relation to previous work are
as follows.
Necessary conditions
For the instantaneous balancer model, we show that any uniform counting network of depth
$d$ is not linearizable if $c_{\max}/c_{\min} > 1 + d/\mathrm{irad}(\tilde{G})$. For the case of diffracting trees
[22], where $\mathrm{irad}(\tilde{G}) = d = \log n$, this result constitutes an alternative proof of an impossibility result of
Lynch et al. [18, Theorem 4.1].
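As a hedged worked instance of this necessary condition, the substitution for diffracting trees can be spelled out. The block reads the left-hand side of the inequality as the ratio $c_{\max}/c_{\min}$, which is an assumption consistent with the sufficient condition stated later in this section; the second line uses hypothetical values $d = 6$ and $\mathrm{irad}(\tilde{G}) = 3$ chosen only for illustration.

```latex
\[
\frac{c_{\max}}{c_{\min}} \;>\; 1 + \frac{d}{\mathrm{irad}(\tilde{G})},
\qquad
\mathrm{irad}(\tilde{G}) = d
\;\Longrightarrow\;
\frac{c_{\max}}{c_{\min}} \;>\; 1 + \frac{d}{d} \;=\; 2 .
\]
\[
\text{Hypothetical values } d = 6,\ \mathrm{irad}(\tilde{G}) = 3:
\qquad
\frac{c_{\max}}{c_{\min}} \;>\; 1 + \frac{6}{3} \;=\; 3 .
\]
```

Under this reading, a smaller influence radius relative to the depth raises the ratio beyond which linearizability is lost.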
For the limit case where $c_{\max}$ tends to infinity, corresponding to completely asynchronous




































Figure 1: Pictorial summary of the results for the instantaneous balancer model
condition for linearizability in uniform counting networks; this constitutes an alternative
proof of an impossibility result of Herlihy et al. [13, Theorem 5.1].
Moreover, our previous necessary condition justifies a method proposed in [18, Corollary




= k, for any constant
k > 2, into a linearizable network of depth O(d); the method prepends each of the network's
inputs with a simple path consisting of d  k single-input single-output balancers. This can be
seen as a mechanism for safeguarding the appropriate ratio $d/\mathrm{irad}(\tilde{G})$.
We next turn to the periodic balancer model, for which we show a necessary condition
for linearizability, which depends on the influence radius of the network and the product of
fan-outs of balancers along a certain crucial path from inputs to outputs. The proof of this
condition follows the same structure as the one for the instantaneous balancer model, but,
because of the more delicate timing assumptions in the periodic balancer model, it requires
substantially more careful timing arguments.
Sufficient condition




$\le 2s/d$, where $s$, the shallowness of the network, is the length of the shortest path
from inputs to outputs and $d$ is
the depth of the network. This result extends the tight result by Lynch et al. [18], which is for
uniform networks, to the non-uniform case. The aforementioned result shows that for uniform




$\le 2$ is sufficient to guarantee linearizability.
Our results agree well with and complement known results [13, 18] on linearizability
properties of counting networks, in the ways explained in the previous paragraphs. More
importantly, our necessary and sufficient conditions together imply that, in general, linearizability
may not be dictated by local conditions [18, Sections 3 and 4], but, rather, by conditions



















































Figure 2: A (2,3)-balancer: symbolic and node representations
remark that our impossibility results are shown using very simple, lock step and round-robin
executions, which are expected to be common in practice. Besides, we believe that our proof
techniques may be of independent interest.
Our results, which for the instantaneous balancer model are depicted in Figure 1, imply
that given a (uniform) counting network, we can determine traffic classes for which the network
is linearizable or not by simply computing its depth and influence radius. We proceed by
describing counting networks and the model of computation in Section 2. Sections 3 and 4
contain our necessary and sufficient conditions, respectively, for linearizable counting networks.
We conclude, in Section 5, with a discussion of our results and some open problems.
2 Preliminaries
Our definitions for balancing networks are standard and follow those in [1, 2, 3, 8, 10, 13, 18].





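An $(f_{in}, f_{out})$-balancer, or balancer for short, is a computing element receiving tokens on $f_{in}$
input wires and sending out tokens on $f_{out}$ output wires; $f_{in}$ and $f_{out}$ are called the fan-in and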
fan-out of the balancer, respectively. Tokens arrive on the balancer's input wires at arbitrary
times and are output on its output wires. Intuitively, a balancer resembles a toggle mechanism
which, given a stream of input tokens, alternately forwards them to its output wires, from top
to bottom; thus, a balancer effectively balances the number of tokens that have been output
on its output wires. We denote by $x_i$, $i \in [f_{in}]$, the number of tokens ever received on the $i$-th
input wire of a balancer, and by $y_j$, $j \in [f_{out}]$, the number of tokens ever sent on its $j$-th output
wire. (We use $x_i$ (resp., $y_j$) as both the
name of the $i$-th input wire (resp., output wire) and the number of tokens received (resp., sent)
on the wire.) The state of a balancer at a given time is the collection of tokens on its input
and output wires partitioned per input or output wire. In the initial state all wires contain no





















; that is, the number of tokens that entered the balancer























(a balancer never creates output
tokens).













)-balancer, the balancer reaches within a finite amount of time a
















$= m$ (a balancer never
"swallows" input tokens).




$0 \le y_i - y_j \le 1$ for any pair of indices $i$ and $j$ such
that $0 \le i < j \le f_{out} - 1$
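The toggle behaviour and the step property of a balancer can be illustrated with a short sketch. This is not the paper's formal model: the class name `Balancer`, the method `traverse`, and the helper `has_step_property` are hypothetical names introduced here, input wires are ignored, and the balancer is treated as instantaneous.

```python
class Balancer:
    """Toggle-style balancer sketch with fan_out output wires (hypothetical helper)."""

    def __init__(self, fan_out):
        self.fan_out = fan_out
        self.sent = [0] * fan_out  # sent[j] plays the role of y_j
        self._next = 0             # output wire the next token will use

    def traverse(self):
        """Route one token to the next output wire, top to bottom, then wrap around."""
        j = self._next
        self.sent[j] += 1
        self._next = (j + 1) % self.fan_out
        return j


def has_step_property(y):
    """Check 0 <= y[i] - y[j] <= 1 for every pair of indices i < j."""
    return all(
        0 <= y[i] - y[j] <= 1
        for i in range(len(y))
        for j in range(i + 1, len(y))
    )


balancer = Balancer(3)
ports = [balancer.traverse() for _ in range(7)]
```

After seven tokens the counts on the three output wires differ by at most one, with the surplus on the top wire, which is what the step property demands.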





)-balancing network is a collection of interconnected balancers; such a network is

















which represent the input and output wires of the network,
and a finite number of inner nodes, which represent the balancers of the network. The edges









)-balancer's input and output wires, respectively;
the sink and source nodes have degree 1. We denote by $\tilde{G}$ the undirected version of $G$.
Throughout the paper we consider acyclic networks. The size of a balancing network is the
total number of its balancers. For any wire z in a balancing network, its depth, denoted















. The depth d of a balancing
network is the maximal depth over all of its wires. For any balancer b in a balancing network,
its depth, denoted depth(b), is the maximal wire depth over all of its input wires. Each maximal
set of balancers having the same depth l is called the level l of the network.
A balancing network is uniform [18] if each node of the network lies on some path from
inputs to outputs, and all paths from inputs to outputs have the same length, which is equal
to the depth of the network. Note that for any balancer $b$ in a uniform balancing network,
$\mathrm{depth}(b) = \mathrm{dist}_G(x, b)$ for any source node $x$ in $G$. The configuration of a balancing network
at a given time is defined as the tuple of states of its component balancers at that time.
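The uniformity test can be sketched on a small graph encoding. The sketch below makes several assumptions that are not in the paper: the network is given as a successor dictionary, sources are the nodes without predecessors, sinks are the nodes without successors, and path length is counted in edges (which may differ by a constant from the paper's depth convention). The names `path_length_range` and `is_uniform` are hypothetical.

```python
from functools import lru_cache


def path_length_range(succ):
    """Return (shortest, longest) source-to-sink path length, in edges, of a DAG.

    succ maps each node to a list of its successors; nodes absent from the
    keys are treated as sinks.
    """
    targets = {v for outs in succ.values() for v in outs}
    nodes = set(succ) | targets
    sources = [u for u in nodes if u not in targets]

    @lru_cache(maxsize=None)
    def to_sink(u):
        outs = succ.get(u, ())
        if not outs:
            return (0, 0)
        ranges = [to_sink(v) for v in outs]
        return (1 + min(lo for lo, _ in ranges), 1 + max(hi for _, hi in ranges))

    ranges = [to_sink(s) for s in sources]
    return (min(lo for lo, _ in ranges), max(hi for _, hi in ranges))


def is_uniform(succ):
    """All source-to-sink paths have the same length (in a DAG every node lies on one)."""
    shortest, longest = path_length_range(succ)
    return shortest == longest


# One balancer b with two inputs and two outputs.
single = {"x0": ["b"], "x1": ["b"], "b": ["y0", "y1"]}
# Balancer a feeds sink y0 directly and balancer b, so path lengths differ.
skewed = {"x0": ["a"], "a": ["b", "y0"], "b": ["y1", "y2"]}
```

The same traversal yields both extremes at once, so the shortest value can double as a stand-in for the shallowness used in Section 4, under the same edge-counting assumption.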
For a conguration , denote by state

(b) the state of balancer b in . A conguration is

















; that is, the number of tokens that entered the network is equal to
the number of tokens that left it. The safety and liveness properties of a balancing network









)-balancing network for which, in any quiescent
configuration, $0 \le y_j - y_k \le 1$, for any pair of indices $j$ and $k$ such that $0 \le j < k \le w - 1$; that
is, the output of a counting network has the step property.
Each one of the $w_{out}$ outputs of a balancing network is connected to an atomic counter (sink




, are consecutively assigned the integers $j, j + w, j + 2w, \ldots$. The integer assigned to
a token $T$ by a counter is called the returned value, or value for short, of the token, and is
denoted by $\mathrm{val}(T)$.
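The value assignment at the output counters can be sketched as follows. The class `OutputCounter` and the variable names are hypothetical, and the list `exit_wires` is a made-up exit order in which tokens leave in round-robin fashion, as they would in a quiescent run that satisfies the step property.

```python
class OutputCounter:
    """Atomic counter attached to output wire j of a width-w network (sketch)."""

    def __init__(self, j, w):
        self.j = j
        self.w = w
        self.count = 0  # tokens seen so far on this output wire

    def assign(self):
        """Return the next value in the sequence j, j + w, j + 2w, ..."""
        value = self.j + self.count * self.w
        self.count += 1
        return value


w = 4
counters = [OutputCounter(j, w) for j in range(w)]
exit_wires = [0, 1, 2, 3, 0, 1]  # hypothetical order in which tokens exit
values = [counters[j].assign() for j in exit_wires]
```

With this exit order the six tokens receive the consecutive values 0 through 5, which is the counting behaviour the step property is meant to guarantee.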
We briefly describe below our model of multiprocessor computation, following [4, 14, 19].
We model computations of the system as sequences of (atomic) events, or events for short. Each
event is either a balancer transition event, denoted by $\mathrm{trans}\langle T, b \rangle$, representing transition of





transition of token $T$ through a wire connecting an output of $b_1$ to an input of $b_2$.
An execution $E$ of a balancing network is a (possibly infinite) sequence of alternating configurations










; : : : , where 
0









$< \infty$, be the minimum and maximum time,
respectively, that it takes for a token to traverse a link of the network. In the asynchronous




is unbounded, while in the completely synchronous it equals 1. A





$r_{\min} = \min r(b)$ and $r_{\max} = \max r(b)$, where the minimum and maximum
are taken over all balancers. In the instantaneous balancer model, $r_b = 0$ for all balancers
$b$. In the periodic balancer model, $r_b > 0$ for all balancers $b$. In the instantaneous balancer





, but the output of the node is determined at the instant of the token arrival; in the
periodic model there are no bounds on the time that it takes for a token to traverse a node.
A timed event is a pair $(t, e)$, where $t$, the "time", is a nonnegative real number, and





















; : : : ; where the times are nondecreasing











; : : : is an execution;
2. (Balancer transition time)


















) such that the kth





(b) if the jth timed event is (t
j
























































for some balancer b
2
.
The original definition of linearizability proposed by Herlihy and Wing [14] is adapted to
counting networks in the natural way (cf. [13]). Given a timed execution $E$, for any token $T$,
define $t_{in}$




i) is an event in E , and t
out
(T; E)


















to denote this precedence. A timed










, it holds that $\mathrm{val}(T_1, E) < \mathrm{val}(T_2, E)$ for the values that they get in $E$. A balancing
network is linearizable if each of its timed executions is linearizable.
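The linearizability condition on a single timed execution can be checked mechanically once each token's entry time, exit time, and value are known. The sketch below assumes the usual reading of precedence, namely that $T_1$ precedes $T_2$ when $T_1$ exits before $T_2$ enters; the record type `TokenRecord`, the function names, and the sample execution are hypothetical. The sample mimics a depth-2 counting tree in which one token stalls on a wire.

```python
from typing import NamedTuple


class TokenRecord(NamedTuple):
    name: str
    t_in: float   # time the token enters the network
    t_out: float  # time the token exits the network
    val: int      # value returned by the output counter


def violations(tokens):
    """Pairs (a, b) where a completely precedes b but does not get a smaller value."""
    return [
        (a.name, b.name)
        for a in tokens
        for b in tokens
        if a.t_out < b.t_in and not a.val < b.val
    ]


def is_linearizable(tokens):
    return not violations(tokens)


# Hypothetical execution: T0 stalls after the root, T1 finishes with value 1,
# then T2 enters later, overtakes T0, and obtains value 0.
execution = [
    TokenRecord("T0", 0.0, 10.0, 2),
    TokenRecord("T1", 1.0, 2.0, 1),
    TokenRecord("T2", 3.0, 5.0, 0),
]
bad_pairs = violations(execution)
```

Here T1 exits before T2 enters yet receives the larger value, so the pair (T1, T2) witnesses non-linearizability; overlapping tokens such as T0 and T1 impose no constraint.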
3 Necessary Conditions
Consider an arbitrary uniform counting network $G$ and its undirected version $\tilde{G}$





















(there is at least one simple path connecting them, since $\tilde{G}$ is














). Notice that for uniform networks, the length of any geodesic is odd; hence,

























































) and a closest





















that if two nodes u and v are both closest common ancestors for the same pair of sink nodes in
e




$\tilde{G}$), is the maximum influence
radius among any of its sink nodes.
Sections 3.1 and 3.2 present necessary conditions for linearizability for uniform networks,
in the instantaneous and periodic balancer models, respectively, in terms of timing and the
depth and influence radius of the network. In all proofs of necessary conditions, we use a
timed execution in which tokens propagate through the network in lock step, and each token




, respectively. It is called
a fast, synchronous timed execution.
3.1 Instantaneous Balancer Model
We prove the following theorem:
Theorem 3.1 In the instantaneous balancer model, a linearizable, uniform counting network























The following is an informal outline of the proof. We start with a fast, synchronous timed
execution of $G$ in which two distinguished tokens exit $G$ through two antipodal sink nodes.
By "retiming", we slow down the token receiving the smaller value, while maintaining the
propagation of the other token through $G$; thus, the latter token receives the same value after
"retiming". The "retimed" timed execution is further "augmented" to include a sufficient
number of tokens fed into the network after the "fast" token exits it, and performing a fast
traversal. The assumption on the timing parameters implies that at least one of the additional
tokens will bypass the "slow" token of the previous timed execution and attain the value
it received before retiming. This value is smaller than that of the "fast" token, contradicting
linearizability. We now present the details of the formal proof.
















G). Let E be a timed execution in which we feed the network
with k + 1 tokens T
0








and executing in lockstep at the maximum speed ($1/c_{\min}$). Consider the network
after it reaches a quiescent state; the output will have the step property and hence $y_i$






be the tokens that exit








in E traverse the network through
the paths (T
|
















Now, "perturb" $E$ to create another timed execution $E'$ of $G$ in the following way: let
the same $k + 1$ tokens enter the network from the same input nodes as in $E$ and keep the
same timing as in $E$ until $T_{\jmath}$
















































for 0  l  d  rad(
e
G).





to any node of (T

; E).






is not retimed in E
0
















with 0  i 
d irad(
e






with i > d irad (
e
G)+1















sink nodes are antipodal if they realize the influence radius
Lemma 3.3 In $E'$ each balancer is visited by the same number of tokens as in $E$.
Proof. Since the total number of tokens entering the network in $E'$ is the same as in $E$, the
lemma holds for each balancer at level 0. By a simple inductive argument using the liveness













. Assume that for










is traversed by the
same number of tokens as in $E$; none of these tokens is $T_{\jmath}$ (from Lemma 3.2). Moreover, from
the construction of $E'$, all the tokens but $T_{\jmath}$





output port after traversing b
;l
as in E . This shows that T

will go to the same balancer of











Proof. Since the network is counting, after a quiescent state is reached, the output must
have the step property. Since only $k + 1$ ($k < w_{out}$) tokens enter the network and since, by
lemma 3.4, T

exits from the k-th output node, T
|
must exit from a node with a lower index.
2
We now perturb $E'$ to obtain a third timed execution $E''$





















+  (recall that dc
min
is the time that T

exits the network), where  is an arbitrarily
small constant, and propagate in the network in lockstep and at maximum speed ($1/c_{\min}$).
We will prove that there is at least one token T










, and that will bypass $T_{\jmath}$ before $T_{\jmath}$ exits the network. Let $W(B, \jmath)$
denote the output wire of each balancer $B$ that token $T_{\jmath}$ has used in $E'$.























. Since they execute in lockstep, by time dc
min












them have exited balancer b
0
|;0
, which proves the base case. Assume that the lemma holds for















+  + (l + 1)c
min





) of these tokens are output on the same port
that T
|
is output in E
0
and are, hence, forwarded to b
0
|;l+1
. Since all delays in link traversals
involving these tokens are equal to c
min
, it follows that by time dc
min
+  + (l + 2)c
min
they
will have exited b
0
|;l+1
, thus showing the lemma for l+ 1, as well. 2
Lemma 3.6 shows that by time 2dc
min
+ at least one token in  will exit the network from




exits from in E
0
. The tokens in  enter the network immediately
after T

exits and, hence T

precedes them in E
00


















) time units. The tokens in  need dc
min
















G) + 1) T
|
will be bypassed by at least one T

2 , which
will be the rst to exit from y
r




) = r; from lemma 3.5 it is known
that r < k. Since T

has exited the network before the tokens in  entered, it follows that it
will again, as in E
0













fact that r < k implies that the linearizability condition is violated. 2
3.2 Periodic Balancer Model
We prove the following theorem:
Theorem 3.7 In the periodic balancer model, a linearizable, uniform counting network does
























Proof. Assume, by way of contradiction, that a linearizable, uniform counting network $G$
exists while the condition of the theorem holds, where $F_{out}$ is the maximal product of fan-outs
of balancers along a path from a source to a sink node $y_j$, taken over all such paths.
We construct a timed execution of $G$ which is not linearizable. The structure of the proof is
similar to that of Theorem 3.1 for the instantaneous balancer model, but, because of the
more delicate timing assumptions in the periodic balancer model, it requires substantially more
careful timing arguments.





















$0 \le l \le k$, enters the network through source node $x_{l \bmod w_{in}}$, each balancer outputs one token
per $r_{\min}$ time, and all links incur a delay of $c_{\min}$. By the step property for counting networks,
when $G$ reaches a quiescent configuration, $y_l$








, respectively, so that val(T
|
; E) = j
and val(T





traverse the network through the paths
(T
|


















encounters d balancers while traversing the network. Since E involves








time to traverse each of the d balancers,
and c
min











We "perturb" $E$ to obtain another timed execution $E'$ of $G$ so that each event occurring in $E$






i is not retimed, while later events are retimed so that




outputs one token per r
max
time, and incurs a
c
max







































for 0  l  d   irad(
e





by | in E is not retimed in E
0
























































Hence, it follows that;





to any node of (T

; E).








Proof. Since tokens propagate through $G$ in lock step, and no event occurring no later than the
traversal of level $d - \mathrm{irad}(\tilde{G})$ is retimed in $E'$








. We proceed to show by induction on l, where d   irad(
e






. For the base case where l = d irad(
e































where d   irad(
e
G) + 1  l < d   1, and consider the balancer b
0
;l+1
. Since the execution is
lock step, all not retimed in E
0
tokens will reach b
0
;l
, and by Claim 3.8, no retimed token will
reach b
;l




will follow the same link out of b
;l
as





, as needed. 2
Since only $k + 1 \le w_{out}$ tokens are involved in $E'$








We now "perturb" $E'$ to obtain a third timed execution $E''$












); that is, F
out





















































. All additional tokens traverse the network in round robin order, each balancer
they encounter outputs one token per $r_{\min}$
































































































































token per time r
min



















, which proves the base case.







































. Since links incur a delay of c
min







































































































one token per time r
min




















, which proves that the lemma holds for l + 1, as well. 2






















































is \fast" in E
00





, and \slow" afterwards.
Hence, even if $T_{\jmath}$
never waits at a balancer en route due to other tokens concurrently traversing













































































) = r. Since val(T








, a contradiction. 2
4 Sufficient Conditions
In this section, we present our sufficient condition for linearizability in counting networks,
which generalizes the respective result in [18] to the case of non-uniform networks.




)-counting network $G$ (uniform or not), and let its shallowness








); that is, $s$ is the length of the shortest directed path from inputs to outputs in $G$. We
prove:













$\le 2s/d$, while there exists a uniform
counting network $B$ which is not linearizable. By definition of linearizability, there exists a















We start with some auxiliary definitions. Following [18], we associate with each balancer





which formalize the "knowledge" $b$ and $T$ have, respectively, at time $t$ about other tokens in
the network. Formally, at time 0, $H_b(0) = \emptyset$ and $H_T(0) = T$. Each time a token traverses a
balancer, the knowledge of the two is combined; formally, if a token $T$ traverses a balancer $b$














is the time of occurrence of the
timed event immediately preceding this in the timed execution. For a token T traversing B




; : : :; b
d 1


















in the timed execution.
Let y
k
be the output node through which T

exits B in E . The proof of [18, Lemma 3.1]
does not rely on uniformity; hence, it applies to non-uniform networks as well to yield:














w(a  1) + k + 1.
We now "perturb" $E$ to obtain a timed execution $E'$








; E)) following the same timing in traversing B. Since no events were retimed and
we only removed tokens about which T

did not \know" in E , token T













; E) in E
0
. Note also that since in E T













; E)); hence, T

is not participating in E
0























at the time it is exiting B, and this information must have been propagated
through B from some input wire to y
k
, it must have traversed at least s wires. Hence, T

must
have entered B by time sc
min

























The rest of the proof shows that T

must have been bypassed in E by some faster token, which,
in turn, has similarly been bypassed by some other faster token; repeating over this argument
yields that T

was bypassed by T

; note that T














have entered B before T








Let V be the set of balancers visited by tokens during E
0
. Then,




Proof. Since the total number of tokens entering the network in $E'$ is less than in $E$, the lemma
holds for each balancer at level 0. By a simple inductive argument using the safety property





is traversing the network through a sequence of balancers b
0






), by going through b
i
at time instant t

i






















balancers in E and its history sequence numbers corresponding to them equal the respective






in E may simulate itself). From the previous
lemma we conclude that there exist tokens T

1
; : : : ; T

"










. The following lemma is essential for the completeness of the
proof of our theorem.
Lemma 4.5 For any T

j
























; : : : ; b
i
x



















). At the rst wire T

simulates itself.
Tokens can not traverse links slower than 1=c
max
and a token that is simulating T

has to
bypass a token that is already simulating T

... that has simulated T

because it has bypassed

































; E) + dc
max
































Theorem 4.1 essentially says that the less "equilateral" the network is, the smaller the
variations in token speeds under which it can retain linearizability; specialized to the case




$\le 2$ is sufficient for
linearizability, a result shown in [18].
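The sufficient condition of this section reduces to a one-line test once the timing bounds and the two structural parameters are known. The sketch below assumes the condition reads $c_{\max}/c_{\min} \le 2s/d$ in the instantaneous balancer model; the function name and its argument checks are hypothetical, and the test is only a sufficient one, so a `False` result says nothing about non-linearizability.

```python
def satisfies_sufficient_condition(c_min, c_max, s, d):
    """Return True when c_max / c_min <= 2 * s / d.

    c_min, c_max: lower and upper bounds on wire delay (0 < c_min <= c_max).
    s, d: shallowness and depth of the network (0 < s <= d).
    """
    if not 0 < c_min <= c_max:
        raise ValueError("expected 0 < c_min <= c_max")
    if not 0 < s <= d:
        raise ValueError("expected 0 < s <= d")
    # Cross-multiplied form avoids division and keeps integer inputs exact.
    return c_max * d <= 2 * s * c_min
```

For a uniform network, where $s = d$, the test reduces to $c_{\max} \le 2 c_{\min}$; for example, `satisfies_sufficient_condition(1, 2, 5, 5)` holds, whereas a network with $s = 2$ and $d = 4$ tolerates no speed variation at all under this bound.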
5 Discussion
We presented necessary and sufficient conditions for linearizability in counting networks, under
different timing assumptions on balancers and wires. Although we do not yet have a complete
characterization of linearizability for the specific timing models we consider, our results demonstrate
how the possibility of achieving linearizability depends on both timing parameters of
the model and structural parameters of the network.
We remark that the proofs of our necessary conditions can be extended to apply to other
classes of balancing networks too, suggesting that it is not the requirement for the step property
that has been the main obstacle to implementing linearizable counting networks, but, rather,
the requirement for linearizability.
Our work leaves open several interesting problems. Can our necessary conditions be extended
to general (non-uniform) counting networks? An obvious open problem is to prove a
sufficient condition for linearizability in the periodic balancer model. It would also be interesting
to understand how far from linearizable a counting network may be in case linearizability
is impossible; Lynch et al. [18, Theorem 4.4] take the first step in this direction by providing
a lower bound on the non-linearizability fraction for the special case of the bitonic counting
network. Our necessary conditions should yield similar results for any uniform network in
the models we studied. Does our sufficient condition for linearizability hold also for counting
networks required to handle both tokens and antitokens [21]?
References
[1] E. Aharonson and H. Attiya, "Counting Networks with Arbitrary Fan-Out," Distributed
Computing, Vol. 8, pp. 163–169, 1995. Preliminary version: Proceedings of the 3rd Annual
ACM-SIAM Symposium on Discrete Algorithms, pp. 104–113, January 1992.
[2] W. Aiello, R. Venkatesan and M. Yung, "Coins, Weights and Contention in Balancing
Networks," Proceedings of the 13th Annual ACM Symposium on Principles of Distributed
Computing, pp. 193–205, August 1994.
[3] J. Aspnes, M. Herlihy and N. Shavit, "Counting Networks," Journal of the ACM, Vol. 41,
No. 5, pp. 1020–1048, September 1994. Preliminary version: "Counting Networks and
Multi-Processor Coordination," Proceedings of the 23rd Annual ACM Symposium on
Theory of Computing, pp. 348–358, 1991.
[4] H. Attiya and M. Mavronicolas, "Efficiency of Semi-Synchronous versus Asynchronous
Networks," Mathematical Systems Theory, Vol. 27, No. 6, pp. 547–571,
November/December 1994.
[5] C. Busch and M. Mavronicolas, "A Combinatorial Treatment of Balancing Networks,"
Proceedings of the 13th Annual ACM Symposium on Principles of Distributed Computing,
pp. 206–215, August 1994.
[6] C. Dwork, M. Herlihy and O. Waarts, "Contention in Shared Memory Algorithms,"
Proceedings of the 25th Annual ACM Symposium on Theory of Computing, pp. 174–183, 1993.
[7] C. S. Ellis and T. J. Olson, "Algorithms for Parallel Memory Allocation," Journal of
Parallel Programming, Vol. 17, No. 4, pp. 303–345, August 1988.
[8] E. W. Felten, A. LaMarca and R. Ladner, "Building Counting Networks from Larger
Balancers," Technical Report 93-04-09, Department of Computer Science and Engineering,
University of Washington, April 1993.
[9] A. Gottlieb, B. D. Lubachevsky and L. Rudolph, "Basic Techniques for the Efficient
Coordination of Very Large Numbers of Cooperating Sequential Processors," ACM
Transactions on Programming Languages and Systems, Vol. 5, No. 2, pp. 164–189, April 1983.
[10] N. Hardavellas, D. Karakos and M. Mavronicolas, "Notes on Sorting and Counting
Networks," Proceedings of the 7th International Workshop on Distributed Algorithms
(WDAG-93), Lecture Notes in Computer Science, Vol. 725 (A. Schiper, ed.),
Springer-Verlag, pp. 234–248, Lausanne, Switzerland, September 1993.
[11] M. Herlihy, "A Methodology for Implementing Highly Concurrent Data Structures,"
Proceedings of the 2nd Annual ACM Symposium on Principles and Practice of Parallel
Programming, pp. 197–206, March 1990.
[12] M. Herlihy, B.-C. Lim and N. Shavit, "Low Contention Load Balancing on Large-Scale
Multiprocessors," Proceedings of the 4th Annual ACM Symposium on Parallel Algorithms
and Architectures, pp. 219–227, July 1992.
[13] M. Herlihy, N. Shavit and O. Waarts, "Linearizable Counting Networks," Distributed
Computing, to appear, 1996. Preliminary version: "Low Contention Linearizable Counting
Networks," Proceedings of the 32nd Annual IEEE Symposium on Foundations of Computer
Science, pp. 526–535, October 1991.
[14] M. Herlihy and J. Wing, "Linearizability: A Correctness Condition for Concurrent
Objects," ACM Transactions on Programming Languages and Systems, Vol. 12, No. 3, pp.
463–492, July 1990.
[15] K. Jeffay, D. F. Stanat and C. U. Martel, "On Optimal, Non-Preemptive Scheduling of
Periodic and Sporadic Tasks," Proceedings of the 12th IEEE Real-Time Systems Symposium,
pp. 129–139, December 1991.
[16] M. Klugerman and C. Plaxton, "Small-Depth Counting Networks," Proceedings of the
24th Annual ACM Symposium on Theory of Computing, pp. 417–428, May 1992.
[17] C. L. Liu and J. W. Layland, "Scheduling Algorithms for Multiprogramming in a Hard
Real-Time Environment," Journal of the ACM, Vol. 20, No. 1, pp. 46–61, January 1973.
[18] N. Lynch, N. Shavit, A. Shvartsman and D. Touitou, "Counting Networks are
Practically Linearizable," Proceedings of the 15th Annual ACM Symposium on Principles of
Distributed Computing, May 1996, to appear.
[19] N. Lynch and M. Tuttle, "An Introduction to Input/Output Automata," CWI Quarterly,
Vol. 2, No. 3, pp. 219–246, September 1989.
[20] I. Rhee and J. L. Welch, "The Impact of Time on the Session Problem," Proceedings of
the 11th Annual ACM Symposium on Principles of Distributed Computing, pp. 191–202,
August 1992.
[21] N. Shavit and D. Touitou, "Elimination Trees and the Construction of Pools and Stacks,"
Preliminary version: Proceedings of the 7th Annual ACM Symposium on Parallel
Algorithms and Architectures, pp. 54–63, July 1995.
[22] N. Shavit and A. Zemach, "Diffracting Trees," Proceedings of the 6th Annual ACM
Symposium on Parallel Algorithms and Architectures, pp. 167–176, June 1994.
[23] H. S. Stone, "Database Applications of the Fetch-and-Add Instruction," IEEE
Transactions on Computers, Vol. C-33, No. 7, pp. 604–612, July 1984.
