Generalized hypercube structures and hyperswitch communication network by Young, Steven D.
JUNE 1992
https://ntrs.nasa.gov/search.jsp?R=19920016722 2020-03-17T10:54:37+00:00Z
NASA Technical Memorandum 4380
Generalized Hypercube
Structures and Hyperswitch
Communication Network
Steven D. Young
Langley Research Center
Hampton, Virginia
National Aeronautics and
Space Administration
Office of Management
Scientific and Technical
Information Program
1992
The use of trademarks or names of manufacturers in this
report is for accurate reporting and does not constitute an
official endorsement, either expressed or implied, of such
products or manufacturers by the National Aeronautics and
Space Administration.
Abstract
One of the Grand Challenges of the Federal High Performance Computing
and Communications (HPCC) Program is in remote exploration and
experimentation (REE). The goal of the REE Project is to develop a
space-borne computing technology base that will enable the next
generation of missions to explore the Earth and the Solar System. This
paper discusses an ongoing study that uses a recent development in
communication control technology to implement hybrid hypercube
structures. These architectures are similar to binary hypercubes, but
they also provide added connectivity between the processors. This added
connectivity increases communication reliability while decreasing the
latency of interprocessor message passing. Because these factors
directly determine the speed that can be obtained by multiprocessor
systems, these architectures are attractive for applications such as
REE, where high performance and ultrareliability are required. This
paper describes and enumerates these architectures and discusses how
they can be implemented with a modified version of the hyperswitch
communication network (HCN). The HCN is analyzed because it has three
attractive features that enable these architectures to be effective:
speed, fault tolerance, and the ability to pass multiple messages
simultaneously through the same hyperswitch controller.
1. Introduction
One of the Grand Challenges of the Federal High Performance Computing
and Communications (HPCC) Program is in the area of remote exploration
and experimentation (REE). The goal of the REE Project is to develop a
space-borne computing technology base that will enable
high-performance, fault-tolerant, adaptive space systems for a new
generation of missions to explore the Earth and the Solar System. The
specific objectives of the REE Project are to demonstrate that a
thousandfold increase in performance is feasible and to identify a
parallel, scalable architecture that can incorporate new technologies
to meet a broad range of requirements. As described in The Remote
Exploration and Experimentation Project Plan by the Jet Propulsion
Laboratory, the architecture must also provide affordable fault
tolerance and long-term reliability in an environment of limited power
and weight, high radiation, and no maintainability. To meet these
objectives, new architectures must be investigated with consideration
given to REE-type applications.
This paper discusses an ongoing study that attempts to use a recent
development in hypercube communications control technology, the
hyperswitch communication network (HCN) chip set (ref. 1), to implement
a variety of generalized and hybrid hypercube architectures. These
architectures are similar to binary hypercubes, but they also provide
added connectivity between the processors. This added connectivity
increases communication reliability while decreasing the latency
incurred when passing messages between processors. Because these
factors directly determine the speed that can be obtained with
multiprocessor systems, these architectures are attractive for
applications such as REE, where high performance and ultrareliability
are required.
This paper describes and enumerates these architectures and discusses
how they can be implemented with a modified version of the HCN chip set
developed at the Jet Propulsion Laboratory. The HCN chip set is
analyzed here because it has three attractive features that enable
these architectures to be effective: speed, fault tolerance, and the
ability to pass multiple messages simultaneously through the same
hyperswitch controller.
This paper is organized as follows. Section 2 describes generalized
interconnection networks: both their organization and their relation to
binary hypercube implementations. Expressions are given for the number
of links, the number of disjoint paths between nodes, and other
characteristic indices. Section 3 describes the hyperswitch
communication network chip set: both its capabilities and its
limitations. Section 4 describes and enumerates the possible
generalized hypercubes that become feasible when hyperswitch technology
is used in the network input/output (I/O) elements. Section 5 describes
how the HCN chips can be modified to implement these architectures.
Section 6 presents the benefits of these networks when used for
multiple instruction multiple data (MIMD) architectures and how these
networks can be used to increase system performance and reliability.
features: the ability to pass multiple messages simultaneously through
the same hyperswitch (up to 11), the ability to reroute around busy
channels, and, most importantly, the ability to reroute these messages
quickly (less than 200 μsec for 512-byte messages).
The hyperswitch chip set (HSP) (fig. 5) consists of a custom
hyperswitch (crossbar) element (HS), a hyperswitch I/O element (HSIO),
and a message dispatch processor element (DP) (ref. 5). The HSP
interfaces with other HSP's through 11 bidirectional channels (Ch0 to
Ch10). These chips were designed specifically to provide fast dynamic
circuit- and packet-switching capabilities in binary hypercube
architectures.
[Figure: dispatch processor (MC88000), data bus (32), header bus (16),
and crossbar switch with channels Ch0, Ch1, Ch2, ..., Ch10.]
Figure 5. Hyperswitch processor.
In circuit-switching mode, the HSP establishes a path from source to
destination before message transmission. This path is established by
emitting a circuit probe (1 to 4 bytes) from the source node. The probe
contains the destination node address, message length information,
distance information, and some history information in case backtracking
is required to establish the virtual link. The probe is then sent
through intermediate nodes to the destination and the virtual link is
established. At this time, the message itself can be transmitted across
the virtual link at a rate equal to the link bandwidth. For
circuit-switching mode, the message transmission latency $T_{ckt}$ is
$$T_{ckt} = \frac{S_{probe}\,H}{B_{link}} + \frac{S_{msg}}{B_{link}} \qquad (4)$$
where $S_{probe}$ is the size of the probe, $H$ is the number of hops
in the virtual link, $B_{link}$ is the bandwidth of the links, and
$S_{msg}$ is the size of the message.
In packet-switching mode, the HSP passes an entire message as a packet
or set of packets, just as it passes a probe in circuit-switching mode.
For packet-switching mode, the message transmission latency $T_{pkt}$
is
$$T_{pkt} = \frac{S_{pkt}\,N\,H}{B_{link}} \qquad (5)$$
where $S_{pkt}$ is the size of each packet, and $N$ is the number of
packets required to send the entire message.
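As a rough numerical illustration of equations (4) and (5), the sketch below evaluates both latencies for the same message. It assumes that each term is a size (times hops) divided by the link bandwidth; the function names and the numbers used (a 4-byte probe, one 512-byte packet, 5 hops, and a link bandwidth of 1e6 bytes per second) are hypothetical and are not taken from the HCN specification.

```python
def circuit_latency(probe_bytes, msg_bytes, hops, link_bw):
    """Equation (4): the probe crosses every hop, then the message
    streams once over the established virtual link."""
    return probe_bytes * hops / link_bw + msg_bytes / link_bw


def packet_latency(pkt_bytes, n_packets, hops, link_bw):
    """Equation (5): every packet is passed along at each hop."""
    return pkt_bytes * n_packets * hops / link_bw


# Hypothetical values chosen only for illustration.
LINK_BW = 1e6  # bytes per second (assumed, not from the source)
t_ckt = circuit_latency(probe_bytes=4, msg_bytes=512, hops=5, link_bw=LINK_BW)
t_pkt = packet_latency(pkt_bytes=512, n_packets=1, hops=5, link_bw=LINK_BW)
print(f"circuit: {t_ckt * 1e6:.0f} usec, packet: {t_pkt * 1e6:.0f} usec")
```

With these assumed values, the circuit-switched transfer takes about 532 μsec and the packet-switched transfer about 2560 μsec, because only the short probe pays the per-hop cost in circuit-switching mode.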
In busy networks, both equations (4) and (5) must be extended to
include the effects of encountering busy or failed links when
establishing a path from source to destination. When a busy or failed
link is encountered, one of three options is available: buffer the
message until the link becomes available, drop the transaction and try
again at a later time, or detour around the link. Each of these options
increases the overall message latency.
Each HSP has 11 hyperswitch elements that act as the I/O ports for each
node in the hypercube. Therefore, for binary hypercubes, the maximum
number of nodes is 2^11 (2048) because only one port is needed for each
dimension. For nonbinary (e.g., generalized) hypercubes, a slightly
different interpretation is discussed in section 4. For each
hyperswitch, an HSIO performs the parallel-to-serial and
serial-to-parallel conversion of the 8-bit data that travel between the
hyperswitch and the serial links that connect to neighboring HSP's (up
to 11 serial links connect every node).
The DP is a Motorola MC88000 32-bit reduced
instruction set computer (RISC), which can provide
17 million instructions per second. The DP performs
transfers to and from system memory and acts as
the interface between the HSP and the application
processor. This processor also controls all crossbar
settings in the hyperswitches of the HSP when es-
tablishing paths from source to destination during
message transmission. The DP can act as the appli-
cation processor as well.
Message routing latency is reduced with an adap-
tive backtracking algorithm implemented in the DP.
This algorithm automatically avoids congested links
based on its current knowledge of congestion in the
network. When a message encounters a busy link,
it does not wait for the link to become idle; instead,
it tries to reach the destination by backtracking to
the previous intermediate node and departing from
another port. Virtual links between nodes are es-
tablished by the switching elements in the HSP's of
each node. This dynamic routing method has been
shown to significantly reduce message routing over-
head as well as increase the communication reliability
because of the ability to backtrack and avoid busy or faulty network
links (ref. 4).
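The backtracking idea can be sketched for a binary hypercube as follows. This is a simplified illustration, not the algorithm implemented in the DP: it assumes node addresses are integers whose bits are the hypercube coordinates, it only tries hops that move the probe closer to the destination, and the names `find_path` and `busy` are hypothetical.

```python
def find_path(src, dst, dims, busy):
    """Probe from src toward dst in a binary hypercube of `dims` dimensions.

    `busy` is a set of frozenset({u, v}) links that are busy or failed.
    Returns the list of nodes on the established path, or None when every
    shortest path is blocked (this sketch does not try longer detours).
    """
    path = [src]

    def probe(node):
        if node == dst:
            return True
        diff = node ^ dst  # bits that still differ from the destination
        for d in range(dims):
            if not (diff >> d) & 1:
                continue  # this dimension is already correct
            nxt = node ^ (1 << d)
            if frozenset((node, nxt)) in busy:
                continue  # busy or failed link: depart from another port
            path.append(nxt)
            if probe(nxt):
                return True
            path.pop()  # backtrack to the previous intermediate node
        return False

    return path if probe(src) else None


# 3-cube example: the link between nodes 3 and 7 is busy.
clear_route = find_path(0, 7, 3, set())
detour_route = find_path(0, 7, 3, {frozenset((3, 7))})
print(clear_route, detour_route)
```

In the example, the probe reaches node 3, finds the link to node 7 busy, backtracks to node 1, and departs through another port to reach node 7 by way of node 5.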
4. Generalized Structures and the HCN
Using an HSP as the I/O controller at each node of a generalized
hypercube architecture allows a wide variety of configurations to be
implemented. As discussed previously, each HSP has 11 I/O ports that
can be used to interconnect a number of processing sites. The chip set
specification states that one of these ports should be used for
diagnostic purposes; that is, it should be connected to itself and
periodically have test data run through the port. The other 10 ports
are then free to be interconnected to the HSP's of other nodes in the
system.
Therefore, we can now calculate the number of possible generalized
hypercube architectures that can be constructed with a maximum of 10
ports per node. This number equals the number of unique integer
partitions of 10 as well as any integer less than 10. An integer
partition of an integer r is the division of r into a number of
integers whose sum is r. Thus, the list of generalized hypercubes that
can be implemented with the hyperswitch can be represented by any set
of integers whose sum is less than or equal to 10. For example, the
partition {2, 2, 3, 3} is an integer partition of 10. The corresponding
four-dimensional generalized hypercube is a (3,3,4,4) configuration
consisting of 144 nodes. The integers in the partition correspond to
the number of ports required in each dimension.
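The mapping from a partition to a configuration can be written out as a short sketch. It assumes, as in the example above, that a dimension with m nodes is fully connected and therefore needs m - 1 ports per node; the function name `config_from_ports` is hypothetical.

```python
from math import prod


def config_from_ports(ports_per_dim):
    """Map an integer partition (ports used in each dimension) to the
    generalized hypercube it describes.

    Returns (nodes per dimension, total ports per node, total nodes).
    """
    nodes_per_dim = tuple(p + 1 for p in ports_per_dim)
    return nodes_per_dim, sum(ports_per_dim), prod(nodes_per_dim)


# The partition {2, 2, 3, 3} of 10 from the text.
config, ports, nodes = config_from_ports((2, 2, 3, 3))
print(config, ports, nodes)  # (3, 3, 4, 4) 10 144
```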
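From reference 6, the number of unique integer partitions of a number r
is obtained from the coefficient of $x^r$ in the following generating
function:
$$G(x) = \prod_{m=1}^{\infty} \sum_{k=0}^{\infty} x^{km} \qquad (6)$$
Specifically, for $r \le 10$,
$$\begin{aligned}
G(x) = {} & (1 + x + x^2 + \cdots + x^8 + x^9 + x^{10}) \\
 & \times (1 + x^2 + x^4 + x^6 + x^8 + x^{10}) \\
 & \times (1 + x^3 + x^6 + x^9)(1 + x^4 + x^8) \\
 & \times (1 + x^5 + x^{10})(1 + x^6)(1 + x^7) \\
 & \times (1 + x^8)(1 + x^9)(1 + x^{10})
\end{aligned} \qquad (7)$$
or
$$\begin{aligned}
G(x) = {} & 1 + x + 2x^2 + 3x^3 + 5x^4 + 7x^5 \\
 & + 11x^6 + 15x^7 + 22x^8 + 30x^9 + 42x^{10}
\end{aligned} \qquad (8)$$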
In equations (7) and (8), all terms with powers larger than 10 have
been eliminated because 10 is the maximum r we are interested in for
this example. Furthermore, the generating function in equation (8)
indicates the number of possible architectures with respect to the
number of ports required per node (table 3). Finally, we can calculate
the total number of generalized hypercube architectures possible by
simply adding the coefficients of equation (8) as follows:
1 + 1 + 2 + 3 + 5 + 7 + 11 + 15 + 22 + 30 + 42 = 139
Table 3. Possible Generalized Hypercubes

Number of ports/node      0  1  2  3  4  5   6   7   8   9  10
Number of architectures   1  1  2  3  5  7  11  15  22  30  42
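The coefficients in table 3 can be checked numerically. The sketch below multiplies out the truncated product of the generating function one factor at a time, keeping only powers up to 10; the function name `partition_counts` is hypothetical.

```python
def partition_counts(r_max):
    """Coefficients of x^0 .. x^r_max of the partition generating function.

    coeffs[r] is the number of integer partitions of r. Each pass of the
    outer loop multiplies the running polynomial by
    (1 + x^m + x^(2m) + ...), discarding powers above r_max.
    """
    coeffs = [1] + [0] * r_max  # start from the constant polynomial 1
    for m in range(1, r_max + 1):
        for power in range(m, r_max + 1):
            coeffs[power] += coeffs[power - m]
    return coeffs


counts = partition_counts(10)
print(counts)       # [1, 1, 2, 3, 5, 7, 11, 15, 22, 30, 42]
print(sum(counts))  # 139
```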
These architectures are listed in the appendix (with the exception of
the trivial architecture that has 0 ports per node) and grouped
according to the number of dimensions. The one-dimensional
architectures in the appendix represent the fully connected systems
that can be implemented. In addition to the list in the appendix, a
large number of hyperrectangular and hybrid hypercubes can be
constructed. Again, the only constraint imposed is the number of I/O
ports required per node.
Architectures can now be chosen based on the characteristics of the
application. For example, consider an application with three distinct
distributed components: A, B, and C. Each component has increasing
levels of communication bandwidth requirements. Choose a
three-dimensional architecture with the processors in dimension 1
connected in a ring, processors in dimension 2 connected in a mesh, and
processors in dimension 3 fully connected. Finally, map component A
onto the processors in dimension 1, component B onto the processors in
dimension 2, and component C onto the processors in dimension 3.
Choosing the number of processors in each dimension now depends on the
amount of parallelism inherent in the corresponding distributed
components of the application.
5. Modifying HSP Element
To implement generalized hypercubes with the
hyperswitch network element (fig. 5), two issues must
be addressed. The first issue relates to the header
information within the probes and message packets.
The second issue requires changes in the coding of
the DP as well as any hardwired functions pertain-
ing to the architecture being configured (neighbor ad-
dresses) and the routing algorithm used.
Appendix
Generalized Hypercubes With the HCN
Tables A1 to A10 list the generalized hypercubes that can be implemented with a modified version of
the hyperswitch communication network (HCN). Architectures are described by the generalized hypercube
representation (which conveys the number of nodes in each dimension and the number of dimensions d), the
number of I/O ports required for each node P, the number of bits required to represent the node addresses Bg,
and the total number of nodes in the topology N.
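The table columns can be reproduced from a configuration with a short sketch. P and N follow directly from the definitions above. The rule used for Bg (one binary address field per dimension, each just wide enough for that dimension's node count) is an assumption inferred from the tabulated values rather than a formula stated in the text, and the function name `describe` is hypothetical.

```python
from math import prod


def describe(config):
    """Return (d, P, Bg, N) for a generalized hypercube configuration.

    config lists the number of nodes in each dimension.
    """
    d = len(config)                               # number of dimensions
    P = sum(m - 1 for m in config)                # I/O ports per node
    # Assumed rule: ceil(log2(m)) address bits for a dimension of m nodes.
    Bg = sum((m - 1).bit_length() for m in config)
    N = prod(config)                              # total number of nodes
    return d, P, Bg, N


print(describe((2, 3, 4, 5)))        # (4, 10, 8, 120)
print(describe((2, 2, 2, 2, 3, 5)))  # (6, 10, 9, 240)
```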
Table A1. Ten-Dimensional Generalized Hypercubes

Configuration          P   Bg  N
2,2,2,2,2,2,2,2,2,2    10  10  1024

Table A2. Nine-Dimensional Generalized Hypercubes

Configuration          P   Bg  N
2,2,2,2,2,2,2,2,2      9   9   512
2,2,2,2,2,2,2,2,3      10  10  768

Table A3. Eight-Dimensional Generalized Hypercubes

Configuration          P   Bg  N
2,2,2,2,2,2,2,2        8   8   256
2,2,2,2,2,2,2,3        9   9   384
2,2,2,2,2,2,3,3        10  10  576
2,2,2,2,2,2,2,4        10  10  512

Table A4. Seven-Dimensional Generalized Hypercubes

Configuration          P   Bg  N
2,2,2,2,2,2,2          7   7   128
2,2,2,2,2,2,3          8   8   192
2,2,2,2,2,3,3          9   9   288
2,2,2,2,2,2,4          9   8   256
2,2,2,2,3,3,3          10  10  432
2,2,2,2,2,3,4          10  9   384
2,2,2,2,2,2,5          10  9   320

Table A5. Six-Dimensional Generalized Hypercubes

Configuration          P   Bg  N
2,2,2,2,2,2            6   6   64
2,2,2,2,2,3            7   7   96
2,2,2,2,3,3            8   8   144
2,2,2,2,2,4            8   7   128
2,2,2,3,3,3            9   9   216
2,2,2,2,3,4            9   8   192
2,2,2,2,2,5            9   8   160
2,2,3,3,3,3            10  10  324
2,2,2,3,3,4            10  9   288
2,2,2,2,4,4            10  8   256
2,2,2,2,3,5            10  9   240
2,2,2,2,2,6            10  8   192

Table A6. Five-Dimensional Generalized Hypercubes

Configuration          P   Bg  N
2,2,2,2,2              5   5   32
2,2,2,2,3              6   6   48
2,2,2,3,3              7   7   72
2,2,2,2,4              7   6   64
2,2,3,3,3              8   8   108
2,2,2,3,4              8   7   96
2,2,2,2,5              8   7   80
2,3,3,3,3              9   9   162
2,2,3,3,4              9   8   144
2,2,2,4,4              9   7   128
2,2,2,3,5              9   8   120
2,2,2,2,6              9   7   96
3,3,3,3,3              10  10  243
2,3,3,3,4              10  9   216
2,2,3,4,4              10  8   192
2,2,3,3,5              10  9   180
2,2,2,4,5              10  8   160
2,2,2,3,6              10  8   144
2,2,2,2,7              10  7   112

Table A7. Four-Dimensional Generalized Hypercubes

Configuration          P   Bg  N
2,2,2,2                4   4   16
2,2,2,3                5   5   24
2,2,3,3                6   6   36
2,2,2,4                6   5   32
2,3,3,3                7   7   54
2,2,3,4                7   6   48
2,2,2,5                7   6   40
3,3,3,3                8   8   81
2,3,3,4                8   7   72
2,2,4,4                8   6   64
2,2,3,5                8   7   60
2,2,4,5                9   7   80
2,2,3,6                9   7   72
2,2,2,7                9   6   56
3,3,4,4                10  8   144
3,3,3,5                10  9   135
2,4,4,4                10  7   128
2,3,4,5                10  8   120
2,3,3,6                10  8   108
2,2,5,5                10  8   100
2,2,4,6                10  7   96
2,2,3,7                10  7   84
2,2,2,8                10  6   64
References
1. Chow, E.; Madan, H.; Peterson, J.; Grunwald, D.; and Reed, D.:
Hyperswitch Network for the Hypercube Computer. The 15th Annual
International Symposium on Computer Architecture, IEEE Catalog
No. 88CH2545-2, IEEE Computer Soc. Press, 1988, pp. 90-99.
2. Bhuyan, Laxmi N.; and Agrawal, Dharma P.: General Class of Processor
Interconnection Strategies. The 9th Annual Symposium on Computer
Architecture, IEEE Catalog No. 82CH1754-1, IEEE Computer Soc. Press,
1982, pp. 90-98.
3. Bhuyan, Laxmi N.; and Agrawal, Dharma P.: Generalized Hypercube and
Hyperbus Structures for a Computer Network. IEEE Trans. Comput.,
vol. C-33, no. 4, Apr. 1984, pp. 323-333.
4. Peterson, J.; Chow, E.; and Madan, H.: A High-Speed Message-Driven
Communication Architecture. Conference Proceedings 1988 International
Conference on Supercomputing, Assoc. for Computing Machinery, 1988,
pp. 355-366.
5. Hypercube Project: Hyperswitch Communication Network Chip Set.
JPL D-5956, California Inst. of Technology, May 1989.
6. Liu, C. L.: Introduction to Combinatorial Mathematics. McGraw-Hill,
Inc., c.1968.
7. Young, Steven D.; and Yalamanchili, Sudhakar: Adaptive Routing in
Generalized Hypercube Architectures. Proceedings of the Third IEEE
Symposium on Parallel and Distributed Processing, IEEE Catalog
No. 91TH0396-2, IEEE Computer Soc. Press, 1991, pp. 564-571.

REPORT DOCUMENTATION PAGE (Form Approved, OMB No. 0704-0188)
1. AGENCY USE ONLY (Leave blank)
2. REPORT DATE: June 1992
3. REPORT TYPE AND DATES COVERED: Technical Memorandum
4. TITLE AND SUBTITLE: Generalized Hypercube Structures and Hyperswitch
Communication Network
5. FUNDING NUMBERS: WU 509-10-04
6. AUTHOR(S): Steven D. Young
7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES): NASA Langley
Research Center, Hampton, VA 23665-5225
8. PERFORMING ORGANIZATION REPORT NUMBER: L-16984
9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES): National
Aeronautics and Space Administration, Washington, DC 20546-0001
10. SPONSORING/MONITORING AGENCY REPORT NUMBER: NASA TM-4380
11. SUPPLEMENTARY NOTES
12a. DISTRIBUTION/AVAILABILITY STATEMENT: Unclassified-Unlimited;
Subject Category 62
12b. DISTRIBUTION CODE
14. SUBJECT TERMS: Hypercube; Hyperswitch; Integer partitions; Adaptive
routing
15. NUMBER OF PAGES: 13
16. PRICE CODE: A03
17. SECURITY CLASSIFICATION OF REPORT: Unclassified
18. SECURITY CLASSIFICATION OF THIS PAGE: Unclassified
19. SECURITY CLASSIFICATION OF ABSTRACT
20. LIMITATION OF ABSTRACT
NSN 7540-01-280-5500
Standard Form 298 (Rev. 2-89), prescribed by ANSI Std. Z39-18, 298-102
NASA-Langley, 1992
