University of Nebraska - Lincoln

DigitalCommons@University of Nebraska - Lincoln
CSE Journal Articles

Computer Science and Engineering, Department
of

3-1992

PPMB: A Partial-Multiple-Bus Multiprocessor Architecture with
Improved Cost-Effectiveness
Hong Jiang
University of Nebraska-Lincoln, jiang@cse.unl.edu

Kenneth C. Smith
Texas A & M University - College Station

Follow this and additional works at: https://digitalcommons.unl.edu/csearticles
Part of the Computer Sciences Commons

Jiang, Hong and Smith, Kenneth C., "PPMB: A Partial-Multiple-Bus Multiprocessor Architecture with
Improved Cost-Effectiveness" (1992). CSE Journal Articles. 57.
https://digitalcommons.unl.edu/csearticles/57

This Article is brought to you for free and open access by the Computer Science and Engineering, Department of at
DigitalCommons@University of Nebraska - Lincoln. It has been accepted for inclusion in CSE Journal Articles by an
authorized administrator of DigitalCommons@University of Nebraska - Lincoln.


IEEE TRANSACTIONS ON COMPUTERS, VOL. 41, NO. 3, MARCH 1992


PPMB: A Partial-Multiple-Bus Multiprocessor
Architecture with Improved Cost-Effectiveness
Hong Jiang and Kenneth C. Smith

Abstract-This paper addresses the design and performance analysis
of partial-multiple-bus interconnection networks. They are bus architectures that have evolved from multiple-bus structure by dividing buses
into groups and reducing bus connections. Their effect is to reduce
cost and alleviate arbitration and drive requirements without degrading
performance significantly. One such structure, called processor-oriented
partial-multiple-bus (or PPMB), is proposed. It serves as an alternative
to the conventional structure called memory-oriented partial-multiple-bus
(or MPMB) and is aimed at higher system performance at lower or equal
system cost. It has been shown, both analytically and by simulation, that
a substantial increase in system bandwidth (up to 20%) is achieved by
the PPMB structure over the MPMB structure. With very large systems,
the results also imply a significantly improved cost-effectiveness over the
conventional multiple-bus architecture.
Index Terms-Cost-effectiveness, interconnection network, load-balancing arbitration, multiprocessor architecture, partial multiple-bus structures, performance evaluation.

I. INTRODUCTION
Due to their reliability and cost-effectiveness, multiple-bus structures have assumed considerable importance in both research on, and
applications of, interconnection networks in the multiprocessor arena.
As a result, a great deal of work has been done in the performance
analysis of multiple-bus systems. Such analysis shows that among
the three major categories of interconnection networks (i.e., crossbar
networks, multistage networks, and multiple-bus networks), multiple-bus
structures are the most reliable and, under certain circumstances,
the most cost-effective [1]-[3], [5], [6], [8]. Nevertheless, multiple-bus
structures might still be too costly for very large systems, due to
the arbitration and drive requirements they entail.
Lang et al. [6] proposed, based on the conventional multiple-bus
structure, a new network structure called a partial multiple-bus.
The motivation for proposing the new structure was to reduce the
cost of the system in exchange for an acceptable
degree of performance degradation. This structure is derived from a
conventional multiple-bus structure by dividing memory modules and
buses into identical parts (or groups) while maintaining the connection
of each processor to every bus. This partial-multiple-bus structure
is shown in Fig. 1. As shown in [6], the performance degradation
of a partial-multiple-bus is not significant. For a two-group
partial-multiple-bus system of size 16 (i.e., N = M = 16, where N is
the number of processors and M the number of memory modules), the
decrease in performance (system bandwidth) is below 6%.
For the sake of simplicity and consistency, we shall call this structure
memory-oriented partial-multiple-bus, or MPMB.

Manuscript received May 10, 1989; revised October 23, 1990.
H. Jiang was with the Department of Computer Science, Texas A&M
University, College Station, TX 77843. He is now with the Department of
CS&E, University of Nebraska, Lincoln, NE 68588.
K. C. Smith is with the Department of Electrical Engineering and Computer
Science, University of Toronto, Toronto, Ont., M5S 1A4 Canada.
IEEE Log Number 9102592.

A different partial multiple-bus structure is proposed in this paper
as an alternative to the one proposed by Lang, and as one which
provides higher system bandwidth and faster arbitration at lower
or equal cost. Derived also from the conventional multiple-bus
structure, this structure, called processor-oriented partial multiple-bus,
or PPMB, divides processors and buses into identical groups while
maintaining the connection of each memory module to every bus.
A notable difference between this structure and the one by Lang
is that in it, a memory module has a maximum of B potential
paths (where B is the number of buses) to processors while, in
Lang’s, a memory module has a maximum of only B/g potential
paths to processors (where g is the number of groups of buses).
This structural difference gives rise to a distinguishing feature of
the PPMB structure, namely of having potential for load-balancing
arbitration. Load balancing, aimed at fully exploiting the potential
for higher bandwidth inherent in the structure, is able to provide a
substantial improvement in system performance. As a matter of fact,
analytical and simulation results have both shown a maximum of 20%
increase in system bandwidth of the PPMB over MPMB. Meanwhile,
the cost of a PPMB system has been shown in general to be less than
or equal to that of an MPMB of the same size. Note that while the
partial-multiple-bus structure, proposed by Lang, was motivated to
reduce cost and arbitration time without reducing system bandwidth
significantly, we have shown as well that the PPMB structure can
lead to a substantial improvement in cost-effectiveness when system
size is very large.
In the section that follows, details of the PPMB structure
and its load-balancing feature are discussed on a comprehensive
basis. Section III introduces probabilistic models for evaluating
synchronous-system bandwidth of the structures under study and
comparisons are made between PPMB and MPMB. The numerical
results produced by them all lie within ±3% of the results of
simulation, implying a high level of confidence in the models. Finally,
some concluding remarks are given in Section IV.

II. PROCESSOR-ORIENTED PARTIAL MULTIPLE-BUS STRUCTURE (PPMB)
A. The Structure
In PPMB, shown in Fig. 2, N processors are divided into g groups
with each group of (N/g) processors fully connected to a set of
(B/g) buses, whereas all M memory modules are connected to
all B buses. This is to be contrasted with MPMB in which the
M memory modules are divided into g groups where each group of
(M/g) memory modules is fully connected to a set of (B/g) buses,
and all of the N processors are fully connected to all buses. For both
MPMB and PPMB, g is assumed to be a factor of both B and M
(or N).
In the rest of this paper, we will refer to an
N × M × B/g system as a partial multiple-bus system that has
B buses, M memory modules, N processors, and is divided into
g groups. In addition, we will replace the notation M/g, N/g, and
B/g with MG, NG, and BG, respectively.
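The two wiring rules just described can be illustrated with a minimal Python sketch. The function names, the zero-based indices, and the assumption that groups are contiguous blocks of indices are all introduced here for illustration; they are not part of the original design.

```python
def group_of(index, per_group):
    """Group number of an element when groups are contiguous index blocks."""
    return index // per_group

def memory_buses_ppmb(m, M, B, g):
    """PPMB: every memory module is wired to all B buses."""
    return set(range(B))

def memory_buses_mpmb(m, M, B, g):
    """MPMB: memory module m is wired only to the BG buses of its own group."""
    MG, BG = M // g, B // g
    k = group_of(m, MG)
    return set(range(k * BG, (k + 1) * BG))

def processor_buses_ppmb(n, N, B, g):
    """PPMB: processor n is wired only to the BG buses of its own group."""
    NG, BG = N // g, B // g
    k = group_of(n, NG)
    return set(range(k * BG, (k + 1) * BG))
```

For a system with M = 16, B = 8, and g = 2, a memory module reaches all 8 buses under the PPMB rule but only 4 under the MPMB rule, which is the B versus B/g difference in potential paths noted in the Introduction.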

0018-9340/92$03.00 © 1992 IEEE

Digital Object Identifier: 10.1109/12.127450


[Fig. 1. MPMB structure (M memory modules, B buses).]

[Fig. 2. PPMB structure (M memory modules, B buses, N processors).]

One of the important issues in designing a multiple-bus is how to
control the traffic flow in the network. A mechanism for handling
traffic control is often referred to as an arbitration scheme.

B. Load-Balancing Arbitration
As a widely accepted arbitration mechanism, a two-level arbitration
scheme, proposed by Lang et al. [7], is assumed for the MPMB
structure in this study. The scheme operates in an N × M × B/g system
as follows. Associated with each memory module is an N-user →
1-server type arbiter, since there are N demand inputs (each from a
single processor) and only one can be granted. This arbiter performs
the first level of arbitration that selects one among the processors
that require a particular memory. Once this is done, g MG-user →
BG-server type arbiters, one for each pair of groups of memory
modules and buses, then carry out the second level of arbitration
that selects, within each pair of groups, min(BG, j) of the
j memory modules that have at least one outstanding request.
Therefore, for an N × M × B/g system, the arbitration mechanism
is composed of basically M N-user → 1-server type arbiters
and g MG-user → BG-server type arbiters. Different designs of
N-user → 1-server type arbiters and M-user → B-server type
arbiters can be found in the literature [7].
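One bus cycle of the two-level MPMB scheme described above can be sketched behaviourally in Python. This is not the hardware arbiter design of [7]: the function name, the dictionary representation of requests, and the use of random selection at both levels are assumptions made for illustration.

```python
import random

def mpmb_arbitrate(requests, M, B, g, rng=random):
    """One MPMB arbitration cycle.

    requests maps processor index -> requested memory module (0-based).
    Returns a list of (processor, module, bus) grants.
    """
    MG, BG = M // g, B // g
    # First level: one winning processor per requested memory module.
    by_module = {}
    for proc, mod in requests.items():
        by_module.setdefault(mod, []).append(proc)
    winners = {mod: rng.choice(procs) for mod, procs in by_module.items()}
    # Second level: within each group, at most BG requested modules get a bus.
    grants = []
    for k in range(g):
        demanding = [mod for mod in winners if mod // MG == k]
        rng.shuffle(demanding)
        for slot, mod in enumerate(demanding[:BG]):
            grants.append((winners[mod], mod, k * BG + slot))
    return grants
```

In each group the number of grants is min(BG, j), where j is the number of modules of that group with at least one outstanding request, as stated in the text.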
It is observed, however, that in a PPMB system, a memory module
with outstanding requests may be granted a bus in any of the g groups,
depending on the processor from which the accepted request is made.
This is true simply because any memory module is connected to all
groups of buses. In contrast, in an MPMB system, a memory module
with an outstanding request can only be granted a bus from the group
to which the memory module belongs.
This distinguishing feature of PPMB makes the arbitration scheme
employed in MPMB no longer suitable, giving rise to the need for a
new one. This feature also suggests that the new scheme should be
load-balancing, such that the memory module that has outstanding
requests should always be granted a bus, as long as there is at least
one free bus in a group to which any of the memory’s requesting
processors belongs. This is possible since when a memory fails to
win the arbitration in one group, it can (literally) always participate
in the arbitration process of other groups where its other requesting
processors (if any) belong. In other words, memory requests are
accepted in such a way that the processors generating the accepted
requests are distributed in the most balanced way possible among
different groups. The new scheme is thus called load-balancing
arbitration. Due to lack of space, details of
the design and implementation of the load-balancing arbiter, given
in [5], will not be presented here. However, an outline of them is
sketched, in order to provide the reader with a better insight into the
proposed structure.
There are two levels of arbitration. The first level selects one
request from each memory queue (if nonempty) as a participant for
the second level of arbitration. Each memory module is associated
with an arbiter, called a First-Level-Arbiter (FLA_i). An FLA_i
consists of g NG-user → 1-server type arbiters (NG1A_j) and a
logic component LB performing the load-balancing function. The
FLA_i takes N inputs, one from each processor, as request lines
and another g sets of inputs, log_2 MG in number, for use by the
LB logic. Each NG1A_j performs arbitration among the competing
processors of its corresponding group. Outputs of all NG1A_j's are
then used as inputs to the LB logic. The LB logic decides, based on
the “Least Demanded Group First” (LDGF) policy, which one of the
first-round winners [outputs of the NG1A_j's, i.e., memory requests from
different processor groups (if any)] is to participate in the second-level
arbitration, and outputs the group number gn_i, which designates where
the final winner (if any) belongs. If there is such a winner, FLA_i
raises a binary signal D_i, indicating that memory M_i is demanding
a bus from group gn_i. The LDGF policy simply says that among the
first-round winners, the one whose processor group has requested the
least number of memory modules in the current bus cycle is selected
as the final winner of the first-level arbitration.
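The LDGF selection performed by one first-level arbiter can be sketched as follows. This is a behavioural illustration only, not the circuit given in [5]; the function name, the random choice inside each per-group arbiter, and the tie-break by lowest group number are assumptions introduced here.

```python
import random

def first_level_arbiter(module, requests, N, g, rng=random):
    """Return (processor, group) selected for `module`, or None if unrequested.

    requests maps processor index -> requested memory module (0-based).
    """
    NG = N // g
    # Number of distinct memory modules demanded by each processor group.
    demand = [len({mod for p, mod in requests.items() if p // NG == k})
              for k in range(g)]
    # First round: each per-group arbiter picks one requester of `module`.
    round_one = {}
    for k in range(g):
        candidates = [p for p, mod in requests.items()
                      if mod == module and p // NG == k]
        if candidates:
            round_one[k] = rng.choice(candidates)
    if not round_one:
        return None
    # LDGF: the winner comes from the group demanding the fewest modules;
    # ties go to the lowest group number (an assumption of this sketch).
    best = min(round_one, key=lambda k: (demand[k], k))
    return round_one[best], best
```

With N = 8 and g = 2, suppose processors 0, 1, and 2 (group 0) request modules 5, 6, and 7 while processor 4 (group 1) requests module 5. Group 1 demands only one module, so the request for module 5 is forwarded on behalf of processor 4, leaving the buses of group 0 for modules 6 and 7.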
The Second-Level-Arbiter (SLA) is composed of M combinational
modules MB(i) that perform the assignment of the g × BG buses,
and a state-register which stores the state of the arbiter after each
assignment subcycle. It takes the outputs of the FLA_i's as its
inputs. The M MB modules, interconnected in a ring fashion by
lines carrying arbitration information in combination with bypassing
switches, function at any given arbitration cycle as k (k ≤ g)
embedded NG-user → BG-server type arbiters that are dynamically
distributed among the M MB modules. Each such NG-user →
BG-server arbiter is associated with, and arbitrates on, a group
that has more than BG memory modules demanding its buses. The
outputs of SLA give the locations of the granted buses and the
corresponding memory modules to which they are assigned.
The load-balancing arbitration mechanism, together with the structure of PPMB, is shown to improve the system performance substantially.
It has been shown that the cost of a multiple-bus can be approximately
estimated in terms of bus connections [5], [6]. For instance,
the cost of an N × M × B system can be said to be proportional to
B(M + N). This measure can be directly adapted to partial
multiple-bus systems MPMB and PPMB, of size N × M × B/g, with resulting
costs being B(N + (M/g)) for MPMB and B(M + (N/g)) for PPMB.
It is apparent that one design can be more costly than the other
depending on the values of N and M. However, simulations performed
in this study and in the literature [6], [8] have shown that, with
B fixed, the increase in M beyond N results in very insignificant
improvement in system performance. As well, these simulations
indicate that the effect on performance due to a change in M within
the range [N/2, N] is much lower than that in the range [0, N/2].
Therefore, in applications it is wise to choose M and N such that
N/2 ≤ M ≤ N. This implies that (M + (N/g)) ≤ (N + (M/g)),
indicating that the cost of PPMB can be generally less than, or equal
to, that of MPMB.
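The connection-count cost measure can be evaluated directly. The short Python sketch below uses function names introduced here for illustration and returns the number of bus connections, to which the cost is taken to be proportional.

```python
def cost_multiple_bus(N, M, B):
    """Connections in a conventional N x M x B multiple-bus system."""
    return B * (N + M)

def cost_mpmb(N, M, B, g):
    """Connections in an N x M x B/g MPMB system: B(N + M/g)."""
    return B * (N + M // g)

def cost_ppmb(N, M, B, g):
    """Connections in an N x M x B/g PPMB system: B(M + N/g)."""
    return B * (M + N // g)

if __name__ == "__main__":
    # Check the inequality cost_ppmb <= cost_mpmb over the range N/2 <= M <= N
    # for one illustrative configuration (N = 32, B = 16, g = 4).
    N, B, g = 32, 16, 4
    for M in range(N // 2, N + 1, g):
        assert cost_ppmb(N, M, B, g) <= cost_mpmb(N, M, B, g)
    print(cost_multiple_bus(32, 32, 16), cost_mpmb(32, 32, 16, 4),
          cost_ppmb(32, 32, 16, 4))
```

The inequality follows from M + N/g ≤ N + M/g being equivalent to (M − N)(1 − 1/g) ≤ 0, which holds whenever M ≤ N and g ≥ 1; the two costs coincide when M = N.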


III. PERFORMANCE ANALYSIS

Performance measures of system bandwidth will now be described.
Here, system bandwidth is defined as the expected number of busy
buses in each bus cycle. Mathematical models are introduced for
these performance measures of the PPMB system. For the purpose
of comparison, a probabilistic model developed by Das and Bhuyan
[4] is employed to produce numerical results for the MPMB system.
Assumptions: The general assumptions incorporated in the analysis are the following:
1) The processors are synchronized;
2) The memory requests are independent and uniformly distributed random variables;
3) The cycle time of all the memory modules is the same and
constant;
4) A processor issues a new request in the next cycle with
probability p, after receiving memory service. Probability p is
also the request rate, taking the bus cycle time as the basic unit;
5) The propagation delays and arbitration times associated with
the interconnection network are not included explicitly but may
be thought of as forming part of the memory cycle;
6) Buses are assumed to be assigned at random to the memory
modules that have at least one outstanding request. This is
done on a cyclic basis;
7) For each memory module that has been granted a bus, a processor is selected at random (also on a cyclic basis) from those
with outstanding requests for that module. Other processors are
blocked and may request again during the next cycle.
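For a simulation built on these assumptions, the request pattern of a single cycle in which no processor is blocked might be generated as in the Python sketch below. The function name and the dictionary representation are assumptions introduced here; resubmission of blocked requests, which a full simulation would also have to track, is deliberately left out.

```python
import random

def generate_requests(N, M, p, rng):
    """Draw one cycle of requests: processor -> memory module (0-based).

    Each of the N processors issues a request with probability p, and the
    target is chosen uniformly among the M memory modules.
    """
    return {n: rng.randrange(M) for n in range(N) if rng.random() < p}

if __name__ == "__main__":
    rng = random.Random(0)
    print(generate_requests(N=8, M=8, p=0.5, rng=rng))
```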
Probabilistic Model: Here we further assume that the requests
issued in any cycle are independent of those of the previous cycle.
This implies that a rejected request is discarded, rather than being
resubmitted in the next cycle.
Now consider an N × M × B/g system, regardless of orientation,
with p defined as above. The probability of processor P_i requesting
memory module M_j, for 1 ≤ i ≤ N and 1 ≤ j ≤ M, is
given by p/M. It follows that the probability of P_i not requesting
M_j is (1 − (p/M)). Furthermore, the probability of none of the
N processors requesting M_j is given by (1 − (p/M))^N. Therefore,
the probability that M_j is requested by at least one processor is
given by

$$q = 1 - \left(1 - \frac{p}{M}\right)^{N}. \tag{1}$$

Supposing that i memory modules have outstanding requests, it
is necessary to consider all possible ways of distributing i memory
modules among g groups, since it is equally likely that any of
the i memory modules may have been requested by any one of
the N processors. In addition, the fact that the arbitration is
load-balancing-oriented must also be taken into account.
Now let us first find the expression for the number of ways that
i items are distributed among g groups of NG places, given that
each place can only hold one item. The expression is derived in a
constructive way:

[Equation (2), defining N(i), with its accompanying definitions]

For each combination G = (G_1, ..., G_g), G_1 is the number of
memory modules (out of i) that have been requested by processors
from group 1, and $\binom{NG}{G_1}$ is the number of ways of selecting G_1
processors from NG; G_2 is the number of memory modules (also out
of i) that have been requested by processors from group 2, and
$\binom{NG}{G_2}$ is the number of ways of selecting G_2 processors from NG; and so
on, and so forth. Therefore, the number of buses that will be assigned
to the i memory modules, given a combination G = (G_1, ..., G_g),
is given by

$$\sum_{l=1}^{g} \min(BG, G_l) \tag{3}$$

with probability

[probability expression]

Given that there are exactly i memory modules being requested, the
mean number of buses that will be assigned with memory modules,
is therefore

[Equation (4)]

Since N(i), as given in (2), produces all possible combinations,
(4) can be explicitly expressed as

[Expanded form of (4)]

Note that this does not mean that (2) as a whole is multiplied by
the sum $\sum_{l=1}^{g} \min(BG, G_l)$. Instead, it means that each sum of a
particular combination is multiplied by one of (2)'s product terms for
the same combination.
To take into account the load-balancing effect, (3) is replaced by
the following expression

[Equation (5), with the accompanying definitions of Z and R]

Z is the number of memory modules, in the combination G, that
would not be assigned with buses if the load-balancing mechanism
were not employed. R is the number of buses still available. For the
convenience of later discussion, let $\mathcal{Z}$ be a set containing exclusively
those Z modules, and $\mathcal{R}$ be a set containing exclusively those R
buses. q_1 is the probability that any one of the Z memory modules
described above is requested at the same time by at least one processor
from a group that still has free buses, under the given conditions.
According to the load-balancing policy, a memory module in $\mathcal{Z}$
may be granted a bus in $\mathcal{R}$ as long as it is requested by a processor
from a group to which buses in $\mathcal{R}$ belong. If the number of such
memory modules is less than or equal to R, all of them are granted
buses. Otherwise, only R of them can be assigned with buses. The
second term on the right-hand side of (5) gives the expected
number of such memory modules being granted buses.
Finally, the bandwidth of PPMB is given by the following expression:

[Equation (6)]

where q is given in (1).
A bandwidth expression for the MPMB system is given as [4]:

$$BW_{MPMB} = g \cdot MG \cdot q - g \sum_{i=BG+1}^{MG} (i - BG)\, p(i). \tag{7}$$

Improved Model: Because of the assumption that any rejected
request is discarded, the models in the previous section tend to underestimate the system bandwidth. If a rejected request is resubmitted in
the next cycle, then (intuitively) the rate at which a processor issues
requests is higher than it would be otherwise.
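For concreteness, the unimproved MPMB model referred to here, that is, (1) combined with (7), can be evaluated numerically. The Python sketch below is illustrative only: the function names are introduced here, and p(i) is assumed to be the binomial probability that exactly i of the MG modules in a group are requested, each independently with probability q. That form of p(i) is an assumption of this sketch rather than something spelled out in the surrounding text.

```python
from math import comb

def request_prob(p, N, M):
    """Equation (1): probability that a given module is requested."""
    return 1.0 - (1.0 - p / M) ** N

def bandwidth_mpmb(p, N, M, B, g):
    """Equation (7) with an assumed binomial p(i) over the MG modules of a group."""
    MG, BG = M // g, B // g
    q = request_prob(p, N, M)
    # Expected number of requested modules per group that exceed the BG buses.
    excess = sum((i - BG) * comb(MG, i) * q**i * (1.0 - q) ** (MG - i)
                 for i in range(BG + 1, MG + 1))
    return g * MG * q - g * excess

if __name__ == "__main__":
    for rate in (0.2, 0.6, 1.0):
        print(rate, round(bandwidth_mpmb(rate, 32, 32, 16, 8), 4))
```

When BG = MG the sum is empty and the expression reduces to M·q, the bandwidth of a system in which a request is never blocked for lack of a bus.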
[Fig. 3. Bandwidth as a function of request rate, for PPMB and MPMB systems with g = 4, 8, and 16.]

To take this fact into account, and thus make the analytical model
more realistic and accurate, Yen et al. [9] proposed a method called
the Steady-State Flow Approach to model the memory interference in
synchronous multiprocessor systems. We now adapt it to the partial
multiple-bus case to modify our analytical models. The basic idea
is to modify q, the probability that there is at least one request for
a particular memory module as defined earlier, so as to reflect the
effect of the blocked requests that are to be resubmitted. A modified
q is given as follows:

[Equation (8)]

where f is a degradation factor for system performance and also the
processor utilization in the steady state. The first product term on the
right side of (8) represents the probability that none of the processors
has a request for a particular memory module, whereas the second
product term is an estimate of the probability that there is no blocked
(queued) request for a particular memory module.
Finally, the analytical models in the previous section are modified
by replacing the expression for q in (6) and (7) for the PPMB and
MPMB systems, respectively, by (8), and then the equation

[Equation (9)]

is solved for f by iteration using Newton's method. Here BW(f) is
the bandwidth expression, and f is initially set to one.
Numerical Results: Numerical results produced by the improved
model are displayed in Table I for the PPMB system, and are
compared with the results of simulation. As shown, agreement is
very good. In fact, all results shown are within ±3% of the results of
simulations, a significant improvement over the unimproved models.¹

¹Analytic results of this study for the PPMB, and of [8] and [2] for the
MPMB, using the unimproved models, indicate errors within 7% of the
simulation results.

TABLE I
BANDWIDTH OF PPMB SYSTEM
N = M = 32, B = 16, g = 4

p      Analytical   Simulation   Percentage
       Results      Results      Error
0.1    3.1833       3.14         +1.37
0.2    6.2519       6.233        +0.3
0.3    9.0365       9.048        -0.12
0.4    11.318       11.422       -0.91
0.5    12.943       13.1692      -1.7
0.6    13.943       14.354       -2.7
0.7    14.55        14.979       -2.8
0.8    14.911       15.238       -2.1
0.9    15.132       15.354       -1.4
1.0    15.277       15.4264      -0.9
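
The percentage-error column of Table I appears to be the deviation of the analytical value relative to the simulated one. That definition is inferred here from the tabulated numbers rather than stated in the text, and it reproduces the column to within a few tenths of a percentage point. A short Python sketch, with names introduced for illustration:

```python
def percentage_error(analytical, simulation):
    """Signed error of the analytical value relative to the simulated value."""
    return 100.0 * (analytical - simulation) / simulation

# (request rate, analytical bandwidth, simulated bandwidth) from Table I.
TABLE_I = [
    (0.1, 3.1833, 3.14), (0.2, 6.2519, 6.233), (0.3, 9.0365, 9.048),
    (0.4, 11.318, 11.422), (0.5, 12.943, 13.1692), (0.6, 13.943, 14.354),
    (0.7, 14.55, 14.979), (0.8, 14.911, 15.238), (0.9, 15.132, 15.354),
    (1.0, 15.277, 15.4264),
]

if __name__ == "__main__":
    for rate, analytical, simulated in TABLE_I:
        print(f"p = {rate:.1f}: {percentage_error(analytical, simulated):+.2f}%")
```

Every row computed this way stays inside the ±3% band quoted above.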

PPMB System Versus MPMB System: Based on simulation results,
Table II shows the degree of performance improvement of the PPMB
structure over the MPMB structure.
A maximal increase of almost 20% in system bandwidth is
achieved (see Table II) while cost remains the same and could even be
decreased in applications where M < N, as discussed in Section II.
Further, Fig. 3 shows another feature that further evidences the
cost-effectiveness of the PPMB structure. As we can see in the figure,
a 32 × 32 × 16/4 MPMB system is equivalent (or even a little inferior)
to a 32 × 32 × 16/16 PPMB system. However, the cost of
such a PPMB system is considerably lower than that of the MPMB system.
Recall that the cost of a partial multiple-bus system is in part inversely
proportional to the number of groups into which it is divided.


TABLE II
BANDWIDTH IMPROVEMENT OF PPMB OVER MPMB
N = M = 32, B = 16

Number of   Request   BW of      BW of      Percentage
groups      rate      PPMB       MPMB       improvement
8           0.2       6.2646     6.2300     +1.40
8           0.4       11.0700    10.6700    +3.40
8           0.6       13.8936    12.6168    +10.12
8           0.8       14.9458    13.3140    +12.25
8           1.0       15.1842    13.6804    +11.00
16          0.2       6.1628     6.0486     +1.80
16          0.4       10.6596    9.8634     +8.07
16          0.6       13.2930    11.4652    +16.00
16          0.8       14.4126    12.0196    +19.50
16          1.0       14.7456    12.4504    +18.40

At this point, it is necessary to emphasize that the introduction
of the load-balancing arbitration mechanism into PPMB does not
necessarily imply an increase in cost nor a decrease in arbitration
speed. First of all, the extra logic in the arbiter may indeed increase
the complexity, but with present-day VLSI technology, this is unlikely
to lead to difficulty with implementation since, as partly shown in
Section II and expanded in [5], the total number of wires going into
and out of the arbiter is not changed from that conventionally required
[7]. Furthermore, the arbitration time may even decrease, because
in the new scheme g parallel NG-user → one-server type arbiters,
instead of a single N-user → one-server type arbiter, are used at
each memory module for first-level arbitration, an arrangement which
likely decreases the arbitration time at that level by almost a factor of
g. Moreover, the second-level arbiter of PPMB is virtually a dynamic
combination of up to g NG-user → MG-server type arbiters, and
therefore will not increase arbitration time at this level either.

IV. CONCLUSIONS
The processor-oriented partial-multiple-bus structure (PPMB),
proposed here as an alternative to the memory-oriented
partial-multiple-bus structure (MPMB), has been shown to improve system
performance substantially. It can provide an increase in system
bandwidth of up to 20%, without the tradeoff in cost usually
demonstrated by alternative systems. That this occurs is not totally
surprising in view of the structural difference between these two
partial-multiple-bus systems. That is, in a PPMB structure, a memory
module has a maximum of B potential paths (where B is the
total number of buses) to processors while, in an MPMB structure,
a memory module has a maximum of only B/g potential paths
(where g is the total number of groups of the partial-multiple-bus) to
processors. This potential for improvement of system bandwidth is
fully realized by the load-balancing arbitration mechanism, whose
positive effect is demonstrated by both analytical results and
simulations. While the MPMB structure was motivated to reduce
the cost of a very large system without degrading its performance
significantly, the PPMB structure will evidently substantiate this
perspective by outperforming MPMB itself. To further increase the
cost-effectiveness of the structures under study, however, future
research must be directed to incorporating a cache mechanism into
the structures (both PPMB and MPMB) and analyzing its effect on
system performance.

REFERENCES

[1] D. P. Bhandarkar, “Analysis of memory interference in multiprocessors,”
IEEE Trans. Comput., vol. C-24, pp. 897-908, Sept. 1975.
[2] L. N. Bhuyan, “A combinatorial analysis of multibus multiprocessors,”
in Proc. ’84 Int. Conf. Parallel Processing, pp. 225-232.
[3] L. N. Bhuyan, “An analysis of processor-memory interconnection networks,”
IEEE Trans. Comput., vol. C-34, pp. 279-283, Mar. 1985.
[4] C. R. Das and L. N. Bhuyan, “Bandwidth availability of multiple-bus
multiprocessors,” IEEE Trans. Comput., vol. C-34, pp. 918-926, Oct. 1985.
[5] H. Jiang, “Partial-multiple-bus computer structures with improved
cost-effectiveness,” M.A.Sc. Thesis, Dep. Elec. Eng., Univ. of Toronto,
Jan. 1987.
[6] T. Lang et al., “Bandwidth of crossbar and multiple-bus connections
for multiprocessors,” IEEE Trans. Comput., vol. C-31, pp. 1227-1233,
Dec. 1982.
[7] T. Lang and M. Valero, “M-users B-servers arbiter for multiple-buses
multiprocessors,” Microprocessing and Microprogramming,
North-Holland Publishing Company, 10, 1982, pp. 11-18.
[8] Q. Yang, “Communication performance in multiple-bus systems,”
M.A.Sc. Thesis, Dep. Elec. Eng., Univ. of Toronto, 1985.
[9] D. W. L. Yen, J. H. Patel, and E. S. Davidson, “Memory interference in
synchronous multiprocessor systems,” IEEE Trans. Comput., vol. C-31,
pp. 1116-1121, Nov. 1982.


