On the design of a dynamic reconfigurable network switch by Smit, Gerard J.M. et al.
Microprocessing and Microprogramming 34 (1992) 59-62
North-Holland
On the design of a dynamic reconfigurable network switch
Gerard J.M. Smit, Paul J.M. Havinga, Pierre G. Jansen
University of Twente, Dept. of Computer Science
P.O. Box 217, 7500 AE Enschede, the Netherlands
e-mail: smit@cs.utwente.nl
Keywords: Kautz graphs, programmable architecture, network
switch, Field Programmable Gate Array (FPGA).
1. Introduction
In this paper we will present a reconfigurable network
switch for multi-computer systems. A multi-computer system
is defined as a collection of linked node computers
(abbreviated as nodes), in which the nodes communicate via
message passing [Dally 87]. As a communication network
for our system we use a Kautz network [Kautz 68].
Each node consists of three autonomous sub-systems: a
Computation Processor (CP) with local memory, a Router
(R) and a Network Switch (NS). The Computation Processor
is a standard "off-the-shelf" type of processor that executes
application programs. The Router provides the interface to
other nodes and implements the communication protocol.
The routes in the network are generated by the Route
Generator (RG) [Smit 91b], which is part of the Router. The
Network Switch and Router are implemented with Field
Programmable Gate Array (FPGA) technology [Xilinx 91]. In
this technology the gate arrays can be re-programmed an
unlimited number of times. Essential in our approach is that
FPGAs are used as dynamic programmable units, whose
function can be changed on-the-fly under program control.
Therefore they can be used in designs where hardware is
changed dynamically, or when hardware must be adapted to
different user applications.
In order to provide full connectivity in a network of
computers, routing mechanisms must be used. These mechanisms
must satisfy a number of requirements such as: free of
deadlocks, no starvation, low latency and high throughput.
Well-known routing mechanisms are store-and-forward,
worm-hole routing and virtual cut-through. When a message in
the network is unable to proceed because some resource it
needs is held by other messages (collision), some action has
to be taken. Possible options are: blocking the message,
buffering the message prior to the node where the collision
occurs, dropping and retransmission of the message, or
misrouting the message to a free link. It is known that some
of these solutions have a potential danger of deadlock. There
are a number of mechanisms for avoiding communication
deadlocks in networks. Most techniques, such as virtual
networks and class climbing, are based on breaking loops in the
dependency graph [Dally 87]. Dropping and retransmission
of messages is inherently free of deadlocks. This last
method, also called the method of the nosy worms
[Whobrey 88], is used in the configuration presented in this
paper. The method of misrouting as used in the Connection
Machine is not applicable, because misrouting is not
deadlock free and in case of a misroute a new route has to be
computed.
The main function of the Network Switch is to route
messages in the communication network. The NS does not
compute nor change the contents of the messages. It only uses
the information in the route field of a message to control the
destination of that message. The route itself is computed by
the Router and the Route Generator. The NS communicates
with other switches and the Router via links. The NS
manages incoming messages autonomously: it establishes the
route and passes the data of the messages through, or returns
a message if all links are busy. Because the switches are
implemented in FPGA technology, the precise message
format is not fixed by design. For instance the decision for fixed
or variable length messages can be taken at a later stage in
the design process.
The Router assembles outgoing messages, sends these
messages to the NS and handles incoming messages. The data of
the messages generally comes from and goes to the local
memory. The Router interacts with the memory without
intervention from the CP. The Route Generator, a sub-unit of
the Router, is a logic unit that generates the d node disjoint
routes of a Kautz graph, given a source and a destination
node (see section 2). If the NS of the source reports that a
message did not reach the destination (due to congestion or
link/node failures), the Router reads a new route from the
Route Generator and assembles a new message. If all node
disjoint paths have been tried the CP is informed. The CP
can decide to try again later after a random delay
[Whobrey 88].
2. Kautz networks 
We use Kautz networks in our project because these
networks have interesting properties [Bermond 89].
Particularly, they interconnect considerably more nodes than the
usual topologies, and they have a small diameter, and a small
and fixed degree. Furthermore they are highly fault tolerant,
admit self-routing and can embed standard computation
graphs. Imase [Imase 86] showed the existence of d node
disjoint paths between any pair of nodes in a Kautz graph of
N = d^k + d^(k-1) nodes. These properties make Kautz graphs
suitable as an interconnection network for large scale
parallel computer systems.
Definition of Kautz graphs [Kautz 68].
Fig. 1: Example of a Kautz graph (K(2,3)).
The Kautz digraph K(d,k) with in-degree and out-degree d
and diameter k is the digraph whose vertices are labelled
with words (x_1,...,x_k) of length k from an alphabet of d+1
letters, by removing those words in which there are two
consecutive identical letters (x_i ≠ x_{i+1}, for 1 ≤ i ≤ k-1).
There is an arc from a vertex x to a vertex y if and only if the
last k-1 letters of x are the same as the first k-1 letters of y.
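The definition can be checked by brute force for small parameters. The following is a minimal sketch, not taken from the paper: the function names `kautz_vertices` and `kautz_arcs` are hypothetical, and letters are represented as the integers 0..d.

```python
from itertools import product

def kautz_vertices(d, k):
    """All words of length k over d+1 letters without two equal neighbours."""
    letters = range(d + 1)
    return [w for w in product(letters, repeat=k)
            if all(w[i] != w[i + 1] for i in range(k - 1))]

def kautz_arcs(d, k):
    """Arc x -> y when the last k-1 letters of x equal the first k-1 of y."""
    arcs = {}
    for x in kautz_vertices(d, k):
        # Shift x one letter to the left and append any letter != x[-1].
        arcs[x] = [x[1:] + (a,) for a in range(d + 1) if a != x[-1]]
    return arcs

verts = kautz_vertices(2, 3)
arcs = kautz_arcs(2, 3)
print(len(verts))        # node count of K(2,3), to compare with d^k + d^(k-1)
print(arcs[(1, 2, 0)])   # successors of node (120)
```

Every vertex obtains exactly d successors, which matches the stated out-degree d.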
A straightforward generic route of length k can be found by
simple concatenation of source and destination word.
However there may be routes with length < k [Smit 91a].
Example 1 (see fig. 1)
In the graph we find the route Rg = <120201> from
(120) to (201) via nodes (202) and (020). This route has
length 3 (= k).
The shortest route is Rs = <1201> of length 1.
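Example 1 can be reproduced with a few lines of code. This is an illustrative sketch only: the helper names are hypothetical, node labels are plain strings, and the overlap-based merge is the standard shift-register argument for this kind of graph rather than the algorithm of [Smit 91a]. The plain concatenation is only a valid Kautz word when the last letter of the source differs from the first letter of the destination.

```python
def generic_route(src, dst):
    """Concatenate source and destination word (assumes src[-1] != dst[0])."""
    return src + dst

def overlap_route(src, dst):
    """Merge src and dst on the longest suffix of src that is a prefix of dst."""
    k = len(src)
    for overlap in range(k, -1, -1):
        if src[k - overlap:] == dst[:overlap]:
            return src + dst[overlap:]

def nodes_on_route(route, k):
    """Each window of k consecutive letters is a node visited by the route."""
    return [route[i:i + k] for i in range(len(route) - k + 1)]

rg = generic_route("120", "201")
rs = overlap_route("120", "201")
print(rg, nodes_on_route(rg, 3))   # route word and the nodes it passes
print(rs, nodes_on_route(rs, 3))
```

The number of hops of a route is the number of visited nodes minus one.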
Table 1 compares Kautz digraphs with "de Bruijn" digraphs
[de Bruijn 46] and the binary hypercube [Hillis 85], [Seitz
85]. The de Bruijn digraph has been selected because its
definition is closely related to Kautz digraphs.
The difference between a de Bruijn digraph B(d,k) and a
Kautz digraph K(d,k) is that in a de Bruijn digraph two
consecutive letters in the word representing a particular vertex
may be equal. As a consequence, this digraph contains
self-loops.
             d=k=4   d=k=6   d=k=8   number of nodes
hypercube       16      64     256   N = 2^k
de Bruijn       16     729   65536   N = d^k
Kautz           24     972   81920   N = d^k + d^(k-1)
Table 1: Number of nodes in some graphs.
Note that for the de Bruijn and Kautz digraphs the
out-degree and in-degree are half the degree mentioned in the
table. Thus a Kautz digraph with in-degree and out-degree 4
and a diameter of 8 connects 81920 nodes, which is
significantly more than the 256 nodes in a hypercube.
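The node counts of Table 1 follow directly from the three formulas. The sketch below recomputes them; the function names are hypothetical, and the in-degree and out-degree of the two digraphs is taken as half the listed degree, as stated in the note above.

```python
def hypercube_nodes(degree):
    """Binary hypercube: degree and diameter are both equal to the dimension."""
    return 2 ** degree

def de_bruijn_nodes(d, k):
    return d ** k

def kautz_nodes(d, k):
    return d ** k + d ** (k - 1)

for degree in (4, 6, 8):
    d, k = degree // 2, degree   # in/out-degree d is half the total degree
    print(degree, hypercube_nodes(degree),
          de_bruijn_nodes(d, k), kautz_nodes(d, k))
```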
The Route Generator as described in [Smit 91b] generates
the d node disjoint routes with increasing length. The routes
are as short as possible and free of loops.
3. The Network Switch
A designer of communication systems for multi-computers
is faced with conflicting demands, due to varying
application requirements, fast changing technology, etc.
Experiences with existing parallel machines and simulations have
shown for instance that no single routing mechanism is
optimal for all kinds of applications. In the case of low or
medium communication traffic, circuit switching or
worm-hole routing seem to have advantages over
store-and-forwarding. But in case of intensive communication, as for
instance due to frequent broadcasting, the store-and-forward
mechanism is more adequate [Seidel 89]. If a
communication network is designed with full-custom VLSI
components, the designer is forced to make crucial decisions early
in the design process.
In our design the Network Switch and Router are
implemented with Field Programmable Gate Array (FPGA)
technology. This technology allows the gate arrays to be
reprogrammed an unlimited number of times. Therefore they
are suited for designs in which the functions of the hardware
need adaptations in order to meet changing application
requirements. This has a number of advantages, such as:
- The selection of the most suitable communication
  mechanism can be postponed to a later stage of the design.
- The system designer or application programmer can
  design application specific communication primitives
  and mechanisms. Knowledge of the communication
  structure of the computation can be used to tune the
  network to the requirements of the computation on-the-fly.
- Due to its flexibility the system can be used in a wide
  variety of applications, ranging from high speed
  computations to dedicated real-time applications.
- The design cycle of FPGAs is very short. Minor design
  changes can be made instantaneously.
- The cost of a prototype is very low. FPGAs are standard
  components, so there are no design costs for full-custom
  components.
In this paragraph we present a possible Network Switch
configuration. The most important design decisions are:
- Uni-directional links of 12 bits wide.
- We have chosen a fixed network topology and a fixed
  number of wires for each link because we cannot change
  the physical wiring of the system dynamically.
- We use worm-hole routing [Dally 87]. This type of
  routing suits well to routing in Kautz networks, and gives a
  low latency. As soon as a Network Switch reads the
  header of a message, containing the route information, it
  selects the next link on the route and forwards the
  remaining part of the message down that link. Each NS
  consumes one byte of the route.
- To avoid deadlock we use the nosy worms protocol
  [Whobrey 88]. If a message is blocked it is recoiled to
  the source. Because a Kautz network has d node-disjoint
  routes we expect that the probability of a collision is
  acceptable.
- As the amount of memory in an FPGA is relatively small,
  the store-and-forward routing mechanism seems less
  obvious, although the local memory could be used as a
  buffer space for store-and-forward routing.
  Store-and-forward has a latency that is proportional to the product
  of packet length and number of hops. So worm-hole
  routing or virtual cut-through are more appropriate for
  our system.
- All Network Switches synchronize with each other via
  two synchronization signals per link.
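The latency argument in the list above can be made concrete with a small calculation. This is a rough, contention-free sketch under stated assumptions: the store-and-forward model is the product of packet length and hops as given in the text, while the worm-hole model (the header needs one byte time per hop and the data streams behind it) is a common textbook approximation that is not taken from the paper. The function names and the `byte_time` parameter are hypothetical.

```python
def store_and_forward_latency(packet_bytes, hops, byte_time=1.0):
    """Each hop receives the whole packet before sending it on."""
    return hops * packet_bytes * byte_time

def wormhole_latency(packet_bytes, hops, byte_time=1.0):
    """Assumed model: header advances one hop per byte time, data is pipelined."""
    return (hops + packet_bytes) * byte_time

for hops in (1, 4, 8):
    print(hops,
          store_and_forward_latency(64, hops),
          wormhole_latency(64, hops))
```

Under these assumptions the gap between the two mechanisms grows with the number of hops, which is why a large-diameter route favours worm-hole routing.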
A message consists of: a route field, a variable length data
field and an End Of Data marker. The route field of the
packet defines precisely the route the message takes from
source to destination.
As soon as a NS reads the header of a message, containing
the route information, it selects the next link on the route,
"consumes" the used route information and forwards the
remaining part of the message down that link. However, if
the output link is occupied it returns a negative acknowledge
(NACK). Each NS between source and destination
consumes one byte of the route until the message arrives at the
NS of the destination. Now the route field of the message is
empty and the message is delivered at the Router of the
destination. The end of the data is indicated with an End Of
Data (EOD) mark.
We assume that the Router of the destination never refuses
an incoming message forever.
If the Router of the source receives a NACK (due to
congestion or link/node failures), the Router assembles a new
message with another node disjoint route. If all node disjoint
paths have been tried, the CP is notified.
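The forwarding and retry behaviour described above can be sketched in software. The model below is a simplification and not the FPGA design: each route byte is assumed to hold the number of the output link to take, a link is identified by the sequence of link choices leading to it from the source, and the names `send`, `router_send` and the `busy` set are hypothetical.

```python
def send(route, data, busy):
    """Walk a message along its route; every switch consumes one route byte."""
    remaining = bytes(route)
    taken = b""                       # link choices made so far
    while remaining:                  # empty route field: destination reached
        link, remaining = remaining[:1], remaining[1:]
        taken += link
        if taken in busy:             # requested output link is occupied
            return ("NACK", len(taken) - 1)
    return ("DELIVERED", data)

def router_send(routes, data, busy):
    """Try the node disjoint routes in order; give up when all return a NACK."""
    for route in routes:
        status = send(route, data, busy)
        if status[0] == "DELIVERED":
            return status
    return ("CP_NOTIFIED", None)

routes = [bytes([0, 1, 0]), bytes([1, 0, 1])]
busy = {bytes([0, 1])}                # second link of the first route is busy
print(send(routes[0], b"hi", busy))
print(router_send(routes, b"hi", busy))
```

The second element of a NACK result is the hop at which the message was blocked. Retrying after a random delay, as the CP may decide to do, is left out of this sketch.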
4. Implementation 
Fig. 2 shows the internal structure of the data-path of the
Network Switch. The switch has 3 input and 3 output links.
One of the input links and one of the output links is
connected to the Router. With this switch several physical
networks with in-degree and out-degree 2 can be built, such as:
torus, mesh, de Bruijn networks and Kautz networks. A link
consists of the following 12 uni-directional signals: 8 data
bits, 1 type bit, a NACK signal and 2 synchronization
signals (cl_i and cla_i). The type bit is used to indicate the start of
a message and the end of the message (EOD). The NACK
signal goes in the direction opposite to the data.
Fig. 2: Internal structure of the Network Switch.
In the presented system we use an externally asynchronous
and internally clocked design methodology.
Fig. 3: State diagram and block diagram of a
Muller-C element.
The design process of synchronous circuits is much easier
compared to the design of asynchronous circuits. Moreover,
the structure of FPGAs suits well to synchronous designs.
However in large scale multi-computers there is a difficult
and often underestimated problem of clock distribution. In
our design we found a compromise that invokes the
advantages of both the synchronous and asynchronous design
methodologies, but without many of their disadvantages.
From the perspective of the external world such a system
operates asynchronously but its internal clock is driven
following a specific handshake protocol [Tinder 91]. In this
way the problems of clock skew can be avoided. The internal
clock is generated by a Muller C element, also called
"Rendezvous Module". All internal registers are clocked on the
rising edge of the internal clock. The synchronization works
as follows: the Muller-C element of a switch receives the
clock signals (cl_i) of all neighbor switches (see fig. 3). When
all clock signals are asserted it negates the internal clock.
This signal is also sent as an acknowledge (cla_i) to all its
neighbors. If a C element has received all the acknowledges
(all cl_i signals negated) it asserts the internal clock and so the
cla_i signals again. To assure the correct synchronization of
the switches all switches must respond to the assertion and
negation of the cl_i signals of the neighbors even if they have
nothing to send.
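The behaviour of the Muller C element can be illustrated with a small software model. This is a sketch based on one reading of the description above, not the circuit of fig. 3: the class name `MullerC` is hypothetical, and the internal clock is modelled as the inverse of the C element output because the text says the clock is negated when all cl_i signals are asserted.

```python
class MullerC:
    """Output changes only when all inputs agree; otherwise it holds its value."""

    def __init__(self, n_inputs, initial=0):
        self.n = n_inputs
        self.out = initial

    def update(self, inputs):
        if len(inputs) != self.n:
            raise ValueError("wrong number of inputs")
        if all(inputs):
            self.out = 1
        elif not any(inputs):
            self.out = 0
        return self.out               # mixed inputs: previous value is kept

c = MullerC(3)
trace = []
for cl in [(1, 0, 0), (1, 1, 0), (1, 1, 1), (0, 1, 1), (0, 0, 0)]:
    trace.append(1 - c.update(cl))    # internal clock under the assumed polarity
print(trace)
```

The clock only changes once every neighbor has raised or lowered its signal, so a slow neighbor stalls the switch instead of causing a skewed clock edge.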
5. Realization
The above mentioned network switch configuration has
been realized with a XC3042 Field Programmable Gate
Array of Xilinx [Xilinx 91]. It uses a decentralized control,
so six bytes can be handled simultaneously by a switch in
one "clock cycle". Table 2 gives some results of the
realization. The clock speed is derived from the output of the
hardware simulator of Xilinx. The implementation was first
described in VHDL and then simulated. After that the switch
was automatically synthesized by the VHDL synthesizer
from Viewlogic [Viewlogic 90]. Due to the high level
specification in VHDL and the powerful synthesizer, the whole
design process took only 3 weeks. We expect that the speed
of the switch can be improved significantly by a careful
(manual) redesign.
Number of CLBs (Configurable Logic Blocks) used: 126   available: 144
Number of IO pins used: 73   available: 96
Number of logic levels used: 4
Maximum "clock" speed: 10 MHz
Transfer rate per link: 80 Mbit/sec
Table 2: Results of a realization with a XC3042 FPGA.
6. Conclusion
In this paper we have presented the design of a Network
Switch for a multi-computer. It can efficiently support
different styles of communication, such as worm-hole routing,
store-and-forward routing and virtual cut-through. Each
node of the multi-computer consists of three autonomous
subsystems: the Computation Processor, the Router, and the
Network Switch. The Router and the Network Switch are
implemented with FPGA technology. This implies that the
system designer can alter the communication mechanism,
even in between the execution of two application programs.
The communication network is based on a Kautz topology.
Kautz graphs form a class of interconnection networks with
interesting properties such as: small diameter, large number
of nodes (N = d^k + d^(k-1)), the degree is independent of the
network size, the network is fault-tolerant, it can embed
standard computation graphs and has a simple routing
algorithm.
The presented network switch uses worm-hole routing, and
a simple deadlock avoidance protocol. Worm-hole routing
suits well to routing in Kautz networks and gives a low
latency. All Network Switches synchronize with each other,
such that a global clock is not required.
Processors communicate with non neighboring nodes
directly without involving the Computation Processor and
Routers at the intervening nodes.
References
[Bermond 89] Bermond J.C., Homobono N., Peyrat C.:
"Large Fault-Tolerant Interconnection Networks",
Graphs and Combinatorics, 1989.
[Dally 87] Dally W.J.: "A VLSI Architecture for Concurrent
Data Structures", Ph.D. thesis, Computer Science,
California Institute of Technology, 1987.
[de Bruijn 46] de Bruijn N.G.: "A combinatorial problem",
Koninklijke Nederlandse Academie van
Wetenschappen Proc. A49, pp 758-764, 1946.
[Hillis 85] Hillis W.D.: "The connection machine", The
MIT press, 1985.
[Imase 86] Imase M., Soneoka T., Okada K.: "A fault-
tolerant processor interconnection network" (original
in Japanese), translated in Systems and Computers in
Japan, vol 17, no 8, pp 21-30, 1986.
[Kautz 68] Kautz W.H.: "Bounds on directed (d,k) graphs.
Theory of cellular logic networks and machines",
AFCRL-68-0668 Final report, pp 20-28, 1968.
[Seitz 85] Seitz C.L.: "The cosmic cube", Comm. ACM, Vol
28, no 1, Jan. 1985.
[Smit 91a] Smit G.J.M., Havinga P.J.M., Jansen P.G.: "An
algorithm for generating node disjoint routes in Kautz
digraphs", Proceedings Fifth International Parallel
Processing Symposium, pp. 102-107, May 1991.
[Smit 91b] Smit G.J.M., Havinga P.J.M., Jansen P.G., de
Boer F., Molenkamp B.: "On hardware for generating
routes in Kautz graphs", Proceedings Euromicro91,
1991.
[Tinder 91] Tinder R.F.: "Digital Engineering Design, A
Modern Approach", pp. 638-646, Prentice Hall, 1991.
[VHDL 87] "VHDL Language Reference Manual", IEEE-
STD-1076-1987, IEEE Computer Society.
[Viewlogic 90] "VHDL-Designer User's Guide",
VIEWlogic Systems, Inc., April 1990.
[Whobrey 88] Whobrey D.: "A communications chip for
multiprocessors", Proc. CONPAR 88, pp 464-473,
1988.
[Xilinx 91] "The Programmable Gate Array Data Book",
Xilinx Inc., 1991.
