Hybrid fault tolerant routing algorithm in NoC  by Bishnoi, Rimpy
Perspectives in Science (2016) 8, 586–588
Available online at www.sciencedirect.com
ScienceDirect
journal homepage: www.elsevier.com/pisc
Hybrid fault tolerant routing algorithm in NoC
Rimpy Bishnoi
Department of Information Technology, JB Institute of Engg. & Technology, Hyderabad, India
Received 21 February 2016; accepted 14 June 2016
Available online 18 July 2016
KEYWORDS
NoC;
SoC;
Routing
Summary  Network-on-Chip (NoC) has been a growing design paradigm with the rise of Multi-Processor Systems-on-Chip (MPSoCs), primarily due to its scalability. While regular meshes (2- or 3-dimensional) are the usual proposal for such a paradigm, a real chip may not follow them. Heterogeneous cores, hardware failures or manufacturing defects can cause irregular topologies in a Network-on-Chip. Selection of a routing algorithm is an important challenge in NoC design as it affects power consumption, communication latency and overall system performance. Routing can be supported in such a faulty environment by the use of routing tables, but this is not a scalable solution as table size grows with network size. Logic Based Distributed Routing (LBDR) has been proposed as a routing implementation technique which offers a compact routing implementation and fault tolerance without the use of routing tables. In this paper we propose a Hybrid Fault Tolerant Routing Algorithm (HFTRA), which aims to provide fault tolerance support in the presence of on-chip link failures. The proposed routing is implemented with the LBDR scheme. Analysis of the method shows that the proposed scheme is more effective when implemented using LBDR than when implemented using routing tables.
© 2016 Published by Elsevier GmbH. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
This article belongs to the special issue on Engineering and Material Sciences.
E-mail address: rimpybishnoi@gmail.com
http://dx.doi.org/10.1016/j.pisc.2016.06.028
2213-0209/© 2016 Published by Elsevier GmbH. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Introduction

The chip design concept has shifted from single-core to multicore system design. As the number of cores on a single chip continues to increase, focus has shifted from computation to communication (Lee et al., 2008). The communication architecture greatly impacts the area, performance and energy consumption of the overall multicore system design. To meet the challenges of current and future multicore architectures, the packet-switched Network-on-Chip (NoC) interconnect emerged as an alternative to the traditional bus-based and point-to-point interconnects (Dally and Towles, 2001). NoC has proven to be a scalable, reliable and efficient communication paradigm (Benini et al., 2002).

Multicore systems consist of multiple heterogeneous Intellectual Property (IP) cores (CPU, memory controllers, DSP modules, etc.) that communicate through the underlying network infrastructure. As there can be many possible paths between a source and a destination IP core, selection of some paths may degrade the system's performance
drastically (Guerrier and Greiner, 2000). Therefore the way messages are routed is one of the key challenges in NoC design (Marculescu et al., 2009). The routing algorithm determines the way messages are routed through the network, and it affects the communication latency, power consumption and overall performance of the underlying network architecture (Bjerregaard and Mahadevan, 2006).

Based on the way the next hop along the route is determined, a routing algorithm can be deterministic or adaptive (Rantala et al., 2006). A deterministic routing algorithm does not take the current network status into account and always generates a fixed path for each source-destination pair in the network. Adaptive routing, on the other hand, generates multiple paths for each source-destination pair and selects the final path based on the current network status. Based on the number of hops a message takes, routing algorithms can be minimal or non-minimal. The performance of a routing algorithm also depends on its implementation. A routing algorithm can be implemented as source or distributed routing (Rantala et al., 2006). In source routing, the complete path from source to destination is computed at the source node and stored in the packet header. In distributed routing, the path is computed at each hop based on the destination address stored in the packet header. To address the challenges posed by irregular topologies, both techniques make use of routing tables. This is not an efficient solution, as the table size increases with the network size, which in turn increases the size and complexity of the router. Recently, Logic Based Distributed Routing (LBDR) (Flich and Duato, 2008) has been proposed as a new routing implementation technique which is compact and offers high fault tolerance. With increasing chip densities, the probability that a manufacturing defect affects the communication system also increases. Defects affecting the links of a NoC can be handled by fault tolerant routing algorithms.
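
As a rough illustration of the two implementation styles described above, the Python sketch below routes a packet with XY routing on a fault-free 2D mesh. It does so once as source routing (the whole hop list is computed at the source and carried in the header) and once as distributed routing (each switch computes only the next direction from the destination address). The row-major node numbering with row 0 at the north edge, the mesh width and all function names are assumptions made for this example; the paper does not prescribe them.

```python
WIDTH = 3  # assumed mesh width; nodes are numbered row-major, row 0 at the north edge

def xy_next_direction(cur, dst, width=WIDTH):
    """Distributed routing: one switch picks the next hop from the header's destination."""
    cur_row, cur_col = divmod(cur, width)
    dst_row, dst_col = divmod(dst, width)
    if dst_col != cur_col:                 # XY routing: finish the X dimension first
        return "E" if dst_col > cur_col else "W"
    if dst_row != cur_row:
        return "S" if dst_row > cur_row else "N"
    return None                            # the packet has arrived

def move(node, direction, width=WIDTH):
    """Neighbour of `node` in `direction` (hypothetical helper, no bounds check)."""
    offsets = {"N": -width, "S": width, "E": 1, "W": -1}
    return node + offsets[direction]

def source_route(src, dst, width=WIDTH):
    """Source routing: the complete list of directions is computed at the source."""
    header, cur = [], src
    while (direction := xy_next_direction(cur, dst, width)) is not None:
        header.append(direction)
        cur = move(cur, direction, width)
    return header

print(source_route(0, 8))        # hop list that would be stored in the packet header
print(xy_next_direction(4, 2))   # decision taken locally at switch 4
```

Both styles produce the same deterministic path here; they differ only in where the computation happens and in what the packet header has to carry.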
In this paper we propose a novel Hybrid Fault Tolerant Routing Algorithm (HFTRA) that uses minimal and non-minimal paths to route the traffic. It uses different virtual channels, each implementing a different routing algorithm. All the algorithms are implemented using LBDR, which offers a great advantage over an implementation using routing tables. In the absence of failures, HFTRA behaves like minimal path routing for a given source-destination pair. When a fault occurs, it first searches for a fault-free minimal path. If none is available, it routes the traffic to virtual channels with non-minimal routing. In this way it enjoys the benefits of both minimal and non-minimal routing: it bypasses defective links where necessary and otherwise sticks to minimal paths for efficiency.
The rest of the paper is structured as follows. The "Related work" section presents an overview of fault tolerance. The "LBDR overview" section gives a brief overview of the LBDR technique. The "Proposed hybrid fault tolerant routing" section explains the proposed method. The "Analysis" section analyzes the proposed scheme. Finally, the "Conclusion and future work" section concludes the paper and introduces directions for future work.

Related work

Duato et al. (1997) explained the effect of faults on the correctness of a routing algorithm. They presented common
fault models along with various definitions related to fault tolerant routing algorithms. Sui and Wang (1997) proposed a constrained fault model in which faults form a convex shape (ring or chain), and messages are routed alongside the faulty region until they are back on their normal route. In their approach, flits use specific virtual channels based on what route they are on. This resulted in poor utilization of the available virtual channels. An improvement to this approach proposed by Rezazadeh et al. (2009) uses fewer virtual channels but allows U-turns (180 degrees) to support non-minimality of the route. Li et al. (2009) apply multi-level congestion control for load balancing and detour packets based on the same mechanism when a fault is encountered. Certain approaches use deactivation of healthy nodes over virtual channels to ensure deadlock freedom.
LBDR overview

Distributed routing algorithms often deploy turn restrictions to obtain freedom from deadlock and livelock. A routing algorithm can therefore be described solely by its routing (turn) restrictions. Routing algorithms can be implemented as source or distributed routing, with or without routing tables. To map an irregular topology, both schemes require routing tables. As the size of a routing table grows with network size, tables are not a scalable solution for representing irregular topologies. Logic Based Distributed Routing (LBDR) has been proposed as a new mechanism for routing implementation without routing tables (Flich and Duato, 2008). Any topology derived from an initial 2D mesh can be easily mapped using LBDR. LBDR uses 2 sets of configuration bits to map the topology and the routing algorithm to the NoC. For each output direction of a switch, a routing bit Rxy indicates whether a turn towards 'y' is allowed at the next hop in the 'x' direction, and the connectivity bit Cz indicates whether the switch is connected to its neighbour in the z direction (the connection may not exist due to irregularity in the topology or faults). For a 2-dimensional mesh, there are 2 routing bits and 1 connectivity bit per output direction, hence a total of 8 routing bits and 4 connectivity bits (12 bits) per switch. Table 1 shows the routing and connectivity bits corresponding to XY routing in a 3 × 3 topology. XY routing prohibits YX turns. The logic of the distributed routing computation is described below.

A direction (say North (N)) can be taken if one of the following conditions is satisfied:
1) Packet destination is in the North (N) direction.
2) Packet destination is in the NE direction and an East (E) turn is allowed on the next hop.
3) Packet destination is in the NW direction and a West (W) turn is allowed on the next hop.

Similarly, LBDR computes the availability of all other directions, and one of the available directions is chosen.
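
The three conditions above, repeated for each of the four output directions, can be sketched in Python as follows. This is a minimal illustration rather than the reference LBDR hardware logic. The row-major node numbering with row 0 at the north edge is inferred from the connectivity columns of Table 1. Requiring the connectivity bit of the chosen output to be set follows from the definition of Cz given above. The function and variable names are hypothetical.

```python
BIT_NAMES = ("Rne", "Rnw", "Ren", "Res", "Rse", "Rsw", "Rwn", "Rws",
             "Cn", "Ce", "Cw", "Cs")

def make_bits(values):
    """Pair one row of LBDR configuration bits (Table 1 column order) with their names."""
    return dict(zip(BIT_NAMES, values))

def lbdr_directions(cur, dst, bits, width=3):
    """Return the output directions LBDR allows at switch `cur` for destination `dst`."""
    cur_row, cur_col = divmod(cur, width)
    dst_row, dst_col = divmod(dst, width)
    # Where does the destination lie relative to this switch?
    n, s = dst_row < cur_row, dst_row > cur_row
    e, w = dst_col > cur_col, dst_col < cur_col

    def allowed(go, side_a, side_b, r_a, r_b, connected):
        # Destination is straight ahead, or diagonally ahead with the matching
        # turn bit set; in both cases the output link itself must exist.
        straight = go and not side_a and not side_b
        turn_ok = (go and side_a and r_a) or (go and side_b and r_b)
        return bool(connected and (straight or turn_ok))

    result = []
    if allowed(n, e, w, bits["Rne"], bits["Rnw"], bits["Cn"]):
        result.append("N")
    if allowed(e, n, s, bits["Ren"], bits["Res"], bits["Ce"]):
        result.append("E")
    if allowed(s, e, w, bits["Rse"], bits["Rsw"], bits["Cs"]):
        result.append("S")
    if allowed(w, n, s, bits["Rwn"], bits["Rws"], bits["Cw"]):
        result.append("W")
    return result

switch4 = make_bits((0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1))  # row ID 4 of Table 1
print(lbdr_directions(4, 2, switch4))   # destination lies to the north-east

# Same switch with its east link marked as failed (Ce = 0): XY bits leave no option.
switch4_east_fault = make_bits((0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1))
print(lbdr_directions(4, 2, switch4_east_fault))
```

With the XY bits of Table 1 a north-east destination can only be reached through the east output, so a single failed east link leaves that switch with no permitted direction.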
Proposed hybrid fault tolerant routing

In this section we propose the Hybrid Fault Tolerant Routing Algorithm (HFTRA). The proposed algorithm is hybrid in the sense that it suggests an effective way to combine multiple routing algorithms over different virtual channels, each one implementing a separate routing algorithm.

Table 1  Routing and connectivity bits for XY routing.
ID  Rne  Rnw  Ren  Res  Rse  Rsw  Rwn  Rws  Cn  Ce  Cw  Cs
0   1    1    1    1    0    1    1    1    0   1   0   1
1   1    1    1    1    0    0    1    1    0   1   1   1
2   1    1    1    1    1    0    1    1    0   0   1   1
3   0    1    1    1    0    1    1    1    1   1   0   1
4   0    0    1    1    0    0    1    1    1   1   1   1
5   1    0    1    1    1    0    1    1    1   0   1   1
6   0    1    1    1    1    1    1    1    1   1   0   0
7   0    0    1    1    1    1    1    1    1   1   1   0
8   1    0    1    1    1    1    1    1    1   0   1   0
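
The connectivity columns (Cn, Ce, Cw, Cs) of Table 1 follow directly from the position of each switch in the 3 × 3 mesh, and a link failure simply clears the corresponding bit. The Python sketch below derives them under two assumptions that are not stated in the paper: switches are numbered row-major with row 0 at the north edge (which is consistent with the table), and a faulty link is represented as an unordered pair of switch IDs.

```python
def connectivity_bits(node, width=3, height=3, faulty_links=frozenset()):
    """Connectivity bits of one switch; a bit is 0 at a mesh edge or on a faulty link."""
    row, col = divmod(node, width)
    neighbours = {
        "Cn": node - width if row > 0 else None,
        "Ce": node + 1 if col < width - 1 else None,
        "Cw": node - 1 if col > 0 else None,
        "Cs": node + width if row < height - 1 else None,
    }
    return {
        name: int(nbr is not None and frozenset((node, nbr)) not in faulty_links)
        for name, nbr in neighbours.items()
    }

for node in range(9):
    print(node, connectivity_bits(node))

# A failed link between switches 4 and 5 clears Ce at switch 4 and Cw at switch 5.
fault = {frozenset((4, 5))}
print(connectivity_bits(4, faulty_links=fault), connectivity_bits(5, faulty_links=fault))
```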
In the presence of single or multiple failures, HFTRA first tries to route the traffic along minimal paths only. When all available minimal paths are faulty, it routes the traffic along a non-minimal path. In the event of a failure, it looks for a virtual channel which can bypass the faulty routes. For example, suppose there are initially three virtual channels (VC1, VC2, VC3) implementing XY, YX and west-first routing, respectively. If the failure is in the east direction of the source, then the proposed algorithm forwards the packet using VC3, as it offers fault-free minimal or non-minimal paths. In the absence of failures, the proposed algorithm can route traffic to any of the virtual channels, thus improving the overall performance. Moreover, using different algorithms on the VCs helps to distribute traffic evenly over all the links. Also, all algorithms are implemented using LBDR, which further improves performance as it does not require any access to a routing table. HFTRA maintains deadlock freedom by not allowing any transition between virtual channels.
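
The three-channel example above can be sketched in Python as follows. This is only an illustration under simplifying assumptions, not the paper's implementation. The virtual channel is chosen once at the source with full knowledge of the faulty links, whereas the paper computes routes hop by hop with LBDR. VC1 and VC2 are modelled as plain XY and YX paths. VC3 is modelled as the shortest fault-free path that takes all of its west hops first, found by breadth-first search. The node numbering is row-major with row 0 at the north edge, and all names are hypothetical.

```python
from collections import deque

OFFSETS = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}

def neighbour(node, direction, width, height):
    """Node reached by one hop in `direction`, or None at the mesh edge."""
    row, col = divmod(node, width)
    d_row, d_col = OFFSETS[direction]
    row, col = row + d_row, col + d_col
    return row * width + col if 0 <= row < height and 0 <= col < width else None

def dimension_order_path(src, dst, width, x_first):
    """Deterministic minimal path: XY routing if x_first, otherwise YX routing."""
    row, col = divmod(src, width)
    dst_row, dst_col = divmod(dst, width)
    path = [src]
    for axis in (("x", "y") if x_first else ("y", "x")):
        if axis == "x":
            while col != dst_col:
                col += 1 if dst_col > col else -1
                path.append(row * width + col)
        else:
            while row != dst_row:
                row += 1 if dst_row > row else -1
                path.append(row * width + col)
    return path

def fault_free(path, faulty_links):
    """True if no hop of the path uses a faulty link."""
    return all(frozenset(hop) not in faulty_links for hop in zip(path, path[1:]))

def west_first_path(src, dst, width, height, faulty_links):
    """Shortest fault-free path in which every west hop precedes all other hops."""
    start = (src, True)                    # True: west hops are still permitted
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        node, may_go_west = state
        if node == dst:
            path = []
            while state is not None:       # walk the parent chain back to the source
                path.append(state[0])
                state = parent[state]
            return path[::-1]
        for direction in "WNES":
            if direction == "W" and not may_go_west:
                continue
            nxt = neighbour(node, direction, width, height)
            if nxt is None or frozenset((node, nxt)) in faulty_links:
                continue
            nxt_state = (nxt, may_go_west and direction == "W")
            if nxt_state not in parent:
                parent[nxt_state] = state
                queue.append(nxt_state)
    return None

def select_vc(src, dst, faulty_links, width=3, height=3):
    """Pick the virtual channel for a packet; it stays on that VC for the whole route."""
    for name, x_first in (("VC1", True), ("VC2", False)):
        path = dimension_order_path(src, dst, width, x_first)
        if fault_free(path, faulty_links):
            return name, path
    path = west_first_path(src, dst, width, height, faulty_links)
    return ("VC3", path) if path is not None else None

faulty = {frozenset((3, 4))}               # the link east of switch 3 has failed
print(select_vc(3, 5, faulty))
```

In this example both dimension-order channels need the failed link between switches 3 and 4, so the packet is assigned to VC3 and takes a longer, non-minimal detour. Because the channel is fixed at the source, the sketch also respects the rule that packets never move between virtual channels.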
Analysis

In this section we analyse the impact of implementing HFTRA using LBDR as compared to routing tables. The idea is that when LBDR is used to implement HFTRA, a single routing logic is used irrespective of the number of virtual channels. It captures the rules of the different routing algorithms by using a separate set of LBDR bits for each routing algorithm. A routing table implementation, on the other hand, requires as many routing tables as there are virtual channels, each one storing routes calculated according to a different routing algorithm. This increases the overall area overhead compared to the proposed approach.
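
To give a feel for the scaling argument, the following Python sketch counts configuration storage per switch for both options. The numbers are back-of-the-envelope estimates under assumptions that are not taken from the paper: each routing table holds one entry per destination, each entry encodes one of five output ports (N, E, S, W, local), and each virtual channel keeps its own full set of 12 LBDR bits (8 routing and 4 connectivity bits, as in Table 1). The area of the routing logic itself is ignored, so these are not measured results.

```python
import math

def routing_table_bits(num_nodes, num_vcs, num_ports=5):
    """Bits per switch for one table per VC with one output-port entry per destination."""
    bits_per_entry = math.ceil(math.log2(num_ports))
    return num_vcs * num_nodes * bits_per_entry

def lbdr_config_bits(num_vcs, bits_per_algorithm=12):
    """Bits per switch for one set of LBDR bits (8 routing + 4 connectivity) per VC."""
    return num_vcs * bits_per_algorithm

for side in (3, 8, 16):
    nodes = side * side
    print(f"{side}x{side} mesh, 3 VCs: "
          f"tables {routing_table_bits(nodes, 3)} bits, LBDR {lbdr_config_bits(3)} bits")
```

Under these assumptions the table storage grows linearly with the number of switches, while the LBDR storage depends only on the number of virtual channels.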
Conclusion and future work

In this paper we have presented a hybrid fault tolerant routing algorithm for NoC. The proposed method is based on the fact that if we use different algorithms for different
virtual channels then it would offer fault tolerance by choosing the virtual channel which offers a fault-free path. Area overhead can be controlled by implementing the algorithms using LBDR. In future, we will study the impact on fault tolerance of allowing transitions between virtual channels while preserving deadlock freedom.
References

Benini, L., De Micheli, G., 2002. Networks on chips: a new SoC paradigm. Computer 35 (1), 70–78.
Bjerregaard, T., Mahadevan, S., 2006. A survey of research and practices of network-on-chip. ACM Comput. Surv. 38 (June (1)).
Dally, W.J., Towles, B., 2001. Route packets, not wires: on-chip interconnection networks. In: Design Automation Conference, 2001. Proceedings, pp. 684–689.
Duato, J., Yalamanchili, S., Ni, L.M., 1997. Interconnection Networks: An Engineering Approach. IEEE.
Flich, J., Duato, J., 2008. Logic-based distributed routing for NoCs. CAL 7 (1), 13–16.
Guerrier, P., Greiner, A., 2000. A generic architecture for on-chip packet-switched interconnections. In: Design, Automation and Test in Europe Conference and Exhibition 2000. Proceedings, pp. 250–256.
Lee, H.G., et al., 2008. On-chip communication architecture exploration: a quantitative evaluation of point-to-point, bus, and network-on-chip approaches. ACM Trans. Des. Autom. Electron. Syst. 12 (May (3)), 231–2320.
Li, X., et al., 2009. Fault-tolerant routing algorithm for network-on-chip based on dynamic XY routing. Wuhan Univ. J. Nat. Sci. 14 (4), 343–348.
Marculescu, R., et al., 2009. Outstanding research problems in NoC design: system, microarchitecture, and circuit perspectives. IEEE Trans. CAD-ICS 28 (1), 3–21.
Rantala, V., et al., 2006. Network on Chip Routing Algorithms.
Rezazadeh, A., Fathy, M., Rahnavard, G., 2009. An enhanced fault-tolerant routing algorithm for mesh network-on-chip. In: International Conference on Embedded Software and Systems, 2009. ICESS'09, pp. 505–510.
Sui, P.-H., Wang, S.-D., 1997. An improved algorithm for fault-tolerant wormhole routing in meshes. IEEE Trans. Comput. 46 (9), 1040–1042.
