Stochastic Automata Network for Performance Evaluation of Heterogeneous
  SoC Communication by Deshmukh, Ulhas & Sahula, Vineet
arXiv:2006.05503v1 [cs.PF] 7 Jun 2020
Stochastic Automata Network for Performance
Evaluation of Heterogeneous SoC Communication
Ulhas Deshmukh
Lecturer in ECE, Govt. Polytechnic, Dhule, India &
Research Scholar MNIT, Jaipur, India
Email: deshmukhur@gmail.com
Vineet Sahula, Senior Member, IEEE
Professor, Deptt. of Electronics & Comm. Engg.
Malaviya National Institute of Technology, Jaipur, India
Email: sahula@ieee.org
Abstract—To meet ever increasing demand for performance
of emerging System-on-Chip (SoC) applications, designers employ
techniques for concurrent communication between components.
As a result, the communication architecture becomes complex and
a major performance bottleneck. Early performance evaluation of
the communication architecture is key to reducing design time,
time-to-market and, consequently, the cost of the system. Moreover,
it helps to optimize system performance by selecting an appropriate
communication architecture. However, a performance model of
concurrent communication is complex to describe and hard to
solve. In this paper, we propose a methodology for performance
evaluation of bus-based communication architectures, in which
modeling is based on a modular Stochastic Automata Network
(SAN). We employ a Generalized Semi-Markov Process (GSMP)
model for each module of the SAN, which emulates the dynamic
behavior of a Processing Element (PE) of an SoC architecture.
The proposed modeling approach provides an early estimation of
performance parameters viz. memory bandwidth, average queue
length at memory and average waiting time seen by a processing
element; while we provide parameters viz. number of processing
elements, the mean computation time of processing elements
and the first and second moments of connection time between
processing elements and memories, as input to the model.
I. INTRODUCTION
Modern-day System-on-Chip (SoC) platforms use a large
number of embedded processors and application specific hard-
ware components [1]. An integration of these heterogeneous
components into a single chip makes communication among
them critical. Besides, these components are pre-verified and
optimized. Hence, the communication architecture emerges as a
key performance-determining component of these multiprocessor
SoC (MP-SoC) platforms. Furthermore, the availability of several
commercial communication architectures, such as AMBA and
CoreConnect, and their customization options gives the designer
a variety of design alternatives. Therefore, system-level
performance estimation is essential for selecting the optimum
communication architecture from a wide design space at an
early stage of the design cycle.
System-on-Chip applications use different types of commu-
nication architectures viz. bus-based, Network-on-Chip (NoC)
based, hybrid bus-NoC architecture and crossbar architecture.
Bus based architectures can be further classified as dedicated
buses, single shared bus and network of shared buses. In
SoCs and embedded applications, bus based architectures are
popular because these are simple, consume less power and
area. Moreover, the performance of bus-based architectures not
only suffices for low-end, high-volume applications but
also results in a cheaper design. This motivates our effort
to estimate the performance of bus-based communication
architectures at the system level.
In this paper, we propose system level performance esti-
mation of bus based communication architectures based on
Stochastic Automata Network (SAN). Mainly, we focus on
formulation of SAN model for a Single Shared Bus (SSB) ar-
chitecture and its extension for Hierarchical Bus Bridge (HBB)
architecture. The approach has been proposed as an extension
of GSMP based performance model of these architectures
[2]. In Section II, we present basic concepts and terminology
of SAN, related work and our contribution. In Section III,
we propose the SAN framework of an SSB architecture for
performance estimation. Section IV extends the SAN formulation
to the HBB architecture. We present the results in Section V
and conclude in Section VI.
II. BACKGROUND
A. Stochastic Automata Network: an overview
A stochastic automata network consists of a number of
modules, or stochastic automata. A module is modeled by a
set of states and a set of transitions, which together determine
the dynamic behavior of one component of the parallel system.
The state of one module is called a local state, while the global
or system state is the collection of the local states of all modules.
In short, the SAN model is a modular representation of a parallel
system. The modules of a SAN model interact with each other
using local and synchronizing events. A local event changes the
state of a single module by triggering a local transition. A
synchronizing event modifies the states of more than one module
through simultaneous transitions in those modules. Probabilities
of local and synchronizing transitions can be functional or
non-functional. In a functional transition, the transition
probability is a function of the states of other modules, whereas
it is constant in a non-functional transition.
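The difference between the two kinds of events can be illustrated with a minimal Python sketch. The helper names `fire_local` and `fire_synchronizing` and the state labels `idle` and `busy` are hypothetical and chosen only for illustration; they are not part of the SAN formalism in [3].

```python
def fire_local(x, i, new_state):
    """Local event: only automaton i (0-based) changes its local state."""
    y = list(x)
    y[i] = new_state
    return tuple(y)


def fire_synchronizing(x, moves):
    """Synchronizing event: several automata change state in one step.

    moves maps an automaton index to its new local state.
    """
    y = list(x)
    for i, new_state in moves.items():
        y[i] = new_state
    return tuple(y)


# Hypothetical global state of a three-module network.
x = ("idle", "idle", "busy")
x_local = fire_local(x, 0, "busy")
x_sync = fire_synchronizing(x, {1: "busy", 2: "idle"})
```

A functional transition would additionally inspect the other entries of `x` before deciding whether, or with what probability, to fire.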
For a formal description, consider a SAN model with $N$ component
modules and a set of events $E$. The $i$th automaton, $A^{(i)}$
(where $i = 1, 2, \ldots, N$), has a set of states
$S^{(i)} = \{a^{(i)}, \ldots, z^{(i)}\}$ with cardinality $n_i$.
The local state variable of $A^{(i)}$ is denoted by $x^{(i)}$.
Hence, the global state of the SAN is the collection of all local
states, i.e. the vector $\tilde{x} = (x^{(1)}, x^{(2)}, \ldots, x^{(N)})$,
whereas $S = S^{(1)} \times S^{(2)} \times \cdots \times S^{(N)}$
is called the global state space. The details of SAN can be found
in [3] and the references therein.
Pre-print of manuscript in NORCHIP-2008
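As a small illustration of the global state space, the following Python sketch enumerates it as the Cartesian product of the local state spaces. The three-automaton example and the local state labels (borrowed from the PE model used later in the paper) are assumptions made only for this example.

```python
from itertools import product
from math import prod

# Hypothetical example: N = 3 automata with the same four local states.
LOCAL_STATES = ("CP", "AC", "FW", "RW")
local_spaces = [LOCAL_STATES] * 3  # S(1), S(2), S(3)


def global_state_space(spaces):
    """Return every global state, i.e. the product S(1) x ... x S(N)."""
    return list(product(*spaces))


S = global_state_space(local_spaces)

# The size of the product space equals the product of the cardinalities n_i.
size = prod(len(s) for s in local_spaces)
```

The product space grows multiplicatively with the number of automata, and not every element of it need be reachable in a given model.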
B. Related Work
The work reported in [4] uses a static performance estimation
technique for allocation of communication channels. Our previous
work [2] proposes an analytical performance evaluation
of SSB and HBB architectures based on a GSMP model. The
analytical approach in [5] estimates communication overhead in
a pipelined communication path and considers the impact
of various protocol parameters on data transfer. The work in [6]
proposes a simulation-based approach, built on the Operation State
Machine, for performance estimation of the system. The authors of
[7] propose a two-phase hybrid performance estimation
approach, which first performs an initial co-simulation with
abstract communication and then analyzes the resulting
time-inaccurate communication graph by specifying the
communication architecture. A large body of work dealing with
SAN formalization is available in [3], [8]. The authors of [9] use
a SAN model for performance analysis in platform-based design.
C. Contribution of the paper
The main contribution of the paper is a proposal for
system-level performance estimation of SSB and HBB
architectures. The formulation is based on the SAN model
of communication architectures. We present a high-level
simulation model of these architectures in the Stateflow component
of MATLAB.
The proposed modeling approach provides an early estimate
of memory bandwidth ($BW$), average queue length ($L$) and
average waiting time ($W$) for an SSB architecture. For an
HBB architecture, we estimate local bandwidth ($BW_\ell$),
local average queue length ($L_\ell$), local average waiting time
($W_\ell$), global memory bandwidth ($BW_g$), global average queue
length ($L_g$) and global average waiting time ($W_g$). The input
parameters to the model are the number of Processing Elements
(PEs) ($N$), the mean computation time ($T$) and the first and second
moments of the connection time of PEs ($C$, $C^2$). Additional input
parameters for the HBB architecture are the probabilities of local and
global requests ($X_\ell$ and $X_g$) and the first and second moments
of the local and global connection times ($C_\ell$, $C_\ell^2$, $C_g$, $C_g^2$).
III. SAN BASED MODEL FOR SSB ARCHITECTURE
In this section, we propose the SAN model of a hetero-
geneous SSB architecture for evaluating performance metrics.
The model has been proposed as an extension of GSMP based
performance model of a homogeneous SSB architecture [2].
Two types of abstract communication models are used in
SoC platforms: (i) the message passing communication model and
(ii) the shared memory communication model. Our formulation
is based on the latter, in which the SoC function involves
communication of the PEs with the memories. Figure 1 shows
a synchronous SSB architecture, which consists of N heterogeneous
processing elements, PE1, PE2, ..., PEN, competing for
the use of a bus. We assume that bus arbitration is based on
fixed priorities of the PEs: the lowest priority is assigned to
PE1 and the highest to PEN. Bus access is assumed to
be non-preemptive. An arbiter of the N-user, one-server type
resolves bus access conflicts.
Fig. 1. A single shared bus communication architecture. (Diagram labels: Arbiter, MEM, PE1, PE2, PEN, I/F, single shared bus.)
A. Model formulation
The stochastic automata network of a heterogeneous SSB
architecture is modeled as a collection of interacting PE modules.
We employ the GSMP model [2] for each module, which
represents the dynamic behavior of a PE. We use functional and
synchronizing transitions to describe the interactions among
these modules. Figure 2 depicts the SAN model of an SSB
architecture, whereas Fig. 3 shows details of one automaton
$A^{(i)}$ that represents the GSMP model of $PE_i$. The computing
state, labelled $CP_i$, corresponds to the situation in which $PE_i$
is computing. In the accessing state $AC_i$, $PE_i$ accesses
MEM. In the full waiting state, labelled $FW_i$, $PE_i$ waits
for MEM for the full connection time of another PE that is
accessing MEM, while in the residual waiting state, labelled
$RW_i$, $PE_i$ waits for MEM for the residual connection time
of an accessing PE. In each state, the model spends a random amount
of time with mean value $\eta_k$, called the mean sojourn time of the
$k$th state ($k = CP_i, AC_i, FW_i, RW_i$).
We express the state transition probabilities of the SAN model
in terms of the transition probabilities of the GSMP model of a
homogeneous SSB architecture [2]. They are as follows.
(i) $\alpha^*_{0i}$: a local transition that involves only $A^{(i)}$, with
constant probability $\alpha_{0i}$.
(ii) $\alpha^*_{1i}$: a functional transition that depends on the global
state of the system. This transition takes place if all
higher-priority PEs are in computing states.
(iii) $\alpha^*_{2i}$: a synchronizing transition that synchronizes with
event $e_j$ (any $\alpha_1$ transition of a higher-priority PE) with
probability $p_e$ and alternate probability 1.
(iv) $\alpha^*_{3i}$: a functional transition that takes place if any one
of the PEs is in the accessing state.

$$\alpha^*_{0i} = \alpha_{0i} = 1$$

$$\alpha^*_{1i} = f(x^{(j)}) = \begin{cases} 1 & \text{if } x^{(j)} = CP_j,\ j = i+1, \ldots, N \\ 0 & \text{otherwise} \end{cases}$$

$$\alpha^*_{2i} = (e_j, p_e, 1), \quad j = i+1, \ldots, N$$

$$\alpha^*_{3i} = f(x^{(j)}) = \begin{cases} 1 & \text{if } x^{(j)} = AC_j,\ j = 1, 2, \ldots, N \\ 0 & \text{otherwise} \end{cases}$$

$$\bar{\alpha}^*_{1i} = 1 - \alpha^*_{1i}$$
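To make the two functional transitions concrete, the following Python sketch evaluates them on a global state. It assumes, as an illustration only, that the global state is a tuple of local state labels ordered from PE1 to PEN and that a higher index means a higher priority, as in the text; the function names are hypothetical.

```python
# Local state labels of one PE automaton, as named in the text.
CP, AC, FW, RW = "CP", "AC", "FW", "RW"


def alpha1_star(i, x):
    """Functional probability for PE i (1-based).

    Returns 1 if every higher-priority PE j = i+1..N is computing, else 0.
    x is the global state: x[j - 1] is the local state of PE j.
    """
    n = len(x)
    return 1 if all(x[j - 1] == CP for j in range(i + 1, n + 1)) else 0


def alpha3_star(x):
    """Functional probability: 1 if any PE j = 1..N is accessing, else 0."""
    return 1 if any(state == AC for state in x) else 0


# Hypothetical three-PE global state: PE2 holds the bus.
x = (CP, AC, CP)
a1_pe1 = alpha1_star(1, x)  # PE2 is not computing, so this evaluates to 0
a1_pe2 = alpha1_star(2, x)  # PE3 is computing, so this evaluates to 1
a3 = alpha3_star(x)         # some PE is accessing, so this evaluates to 1
```

For the highest-priority PE the condition in `alpha1_star` ranges over an empty set, so the function returns 1 for every global state.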
Fig. 2. The SAN model for a heterogeneous SSB communication architecture. (Diagram: automata A(1), A(2), ..., A(N) for PE1, PE2, ..., PEN, each with states 0 to 3. State 0: CP, State 1: AC, State 2: FW, State 3: RW.)
Performance parameters of the $i$th PE are computed from
steady state probabilities [2], viz. $BW_i = P^i_{AC}$,
$PU_i = P^i_{AC} + P^i_{CP}$, $L_i = P^i_{FW} + P^i_{RW}$ and
$W_i = (\eta^i_{FW}\,\alpha^*_{2i} + \eta^i_{RW}\,\alpha^*_{3i}) / \alpha^*_{1i}$
(where $P_k$ is the steady state probability of the $k$th state).
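The performance parameters can be computed mechanically once the steady state probabilities are known. The Python sketch below is only an illustration of those formulas: the function name `pe_metrics` and all numeric inputs are hypothetical values chosen for easy arithmetic, not results from the paper.

```python
def pe_metrics(p, eta_fw, eta_rw, a1, a2, a3):
    """Compute BW, PU, L and W of one PE.

    p       : dict mapping state label (CP, AC, FW, RW) to its
              steady state probability
    eta_fw  : mean sojourn time of the full waiting state
    eta_rw  : mean sojourn time of the residual waiting state
    a1..a3  : values of the transition probabilities alpha*_1i..alpha*_3i
    """
    if a1 == 0:
        raise ValueError("W is undefined when alpha*_1i is zero")
    return {
        "BW": p["AC"],
        "PU": p["AC"] + p["CP"],
        "L": p["FW"] + p["RW"],
        "W": (eta_fw * a2 + eta_rw * a3) / a1,
    }


# Hypothetical inputs for illustration only.
p_example = {"CP": 0.5, "AC": 0.25, "FW": 0.125, "RW": 0.125}
metrics = pe_metrics(p_example, eta_fw=4.0, eta_rw=2.0,
                     a1=0.5, a2=0.25, a3=0.25)
```

With these inputs the waiting time evaluates to (4 x 0.25 + 2 x 0.25) / 0.5 = 3 time units.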
Fig. 3. An automaton A(i) representing GSMP model of PEi. (Diagram: states $CP_i$, $AC_i$, $FW_i$ and $RW_i$, with transitions labelled $\alpha^*_{0i}$, $\alpha^*_{1i}$, $\alpha^*_{2i}$ and $\alpha^*_{3i}$.)
IV. SAN BASED MODEL FOR HBB ARCHITECTURE
In this section, we extend the SAN modeling approach to the HBB
architecture. The HBB architecture is composed of two shared
buses, BUS1 and BUS2, connected by a bus bridge
as shown in Fig. 4. Here, N PEs on each bus
compete to access the shared memories MEM1 or MEM2. At
the bridge level, communications on the two buses are concurrent,
whereas at the bus level, the behaviors of the PEs are concurrent.
For simplicity, let us consider a scenario in which a PE mapped to
BUS1 generates either a local request to access MEM1 or a
global request to access MEM2. With reference to this PE, the
parameters of MEM1 and MEM2 are referred to as local and
global parameters, respectively. Let $X_\ell$ be the probability of
a local request, in which only BUS1 is used to access
MEM1 and arbitration of BUS1 is sufficient. Let $X_g$ be
the probability of a global request, in which both BUS1 and BUS2
are used to access MEM2 and two-stage arbitration of
BUS1 and BUS2 is required.
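The request scenario can be sketched in a few lines of Python. The sketch assumes that every request of a PE on BUS1 is either local or global, so that the global probability is one minus the local probability; the function names, the seed and the value 0.7 are hypothetical and serve only as an illustration.

```python
import random


def classify_request(u, x_local):
    """Map a uniform draw u in [0, 1) to a request type.

    x_local is the probability of a local request; the remainder
    is assumed to be the probability of a global request.
    """
    return "local" if u < x_local else "global"


def buses_to_arbitrate(request):
    """Local requests need BUS1 only; global requests need both buses."""
    return ("BUS1",) if request == "local" else ("BUS1", "BUS2")


rng = random.Random(0)  # fixed seed so the run is repeatable
x_local = 0.7           # hypothetical probability of a local request
requests = [classify_request(rng.random(), x_local) for _ in range(10000)]
local_share = requests.count("local") / len(requests)
```

Over many draws `local_share` approaches `x_local`, and each global request goes through two arbitration stages instead of one.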
A. Model formulation
We propose a two-level SAN model for the HBB architecture. At
the bridge level, the SAN consists of two automata that correspond to
BUS1 and BUS2 and are similar to Fig. 2. At the bus level,
each module is composed of the automata of the PEs. At the bridge
level, the two bus automata interact with each other, while at the
bus level, the interaction among the PE automata is modeled.
The automaton of $PE_{1i}$ in the aforementioned scenario (mapped
to BUS1) is depicted in Fig. 5. States $lAC_i$, $lFW_i$ and
$lRW_i$ correspond to the local memory MEM1 and are
similar to the states of the automaton of $PE_i$ in the SSB architecture
(Fig. 3). The global accessing state $gAC_i$, the global
full waiting state $gFW_i$ and the global residual
waiting state $gRW_i$ are the analogous states when
a PE attempts to access MEM2. A detailed discussion of the model
equations and performance parameters is omitted.
Fig. 4. Hierarchical bus bridge communication architecture. (Diagram labels: BUS 1, BUS 2, Bridge, Bus I/f, Arbiter, MEM1, MEM 2, PE 1, PE 2, PE N, I/F.)
Fig. 5. An automaton A(i) of PEi in HBB architecture. (Diagram: states $CP_i$, $lAC_i$, $lFW_i$, $lRW_i$, $gAC_i$, $gFW_i$ and $gRW_i$, with transitions labelled $\alpha^*_{0i}$ to $\alpha^*_{6i}$, $\beta^*_{1i}$ and $\beta^*_{2i}$.)
V. RESULTS
In this section, we present performance evaluation results
for SSB and HBB architectures obtained using the proposed
modeling approach. We captured the SAN model of
both architectures, with a fixed-priority arbitration scheme, in the
Stateflow component of MATLAB. Simulation was performed on a
P-IV, 1 GB Linux workstation. In both examples, random
computation and communication times of the PEs were generated
by MATLAB m-functions with a generalized distribution.
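The following Python sketch gives a rough idea of what such a simulation involves. It is not the authors' Stateflow model: it is a simplified cycle-stepped simulation of a single shared bus with fixed priorities and non-preemptive access, it assumes exponentially distributed times rounded to whole cycles (the paper does not give the distribution in this form), and it lumps the full and residual waiting states into a single waiting state. The function name `simulate_ssb` and its parameters are hypothetical.

```python
import random


def simulate_ssb(mean_T, mean_C, cycles=200_000, seed=1):
    """Estimate per-PE bandwidth and queue length on a single shared bus.

    mean_T[i], mean_C[i]: mean computation and connection time (cycles)
    of PE i+1. A higher index means a higher priority.
    Returns (bw, ql): fractions of time each PE spends accessing, waiting.
    """
    rng = random.Random(seed)
    n = len(mean_T)

    def draw(mean):
        # Assumed distribution: exponential, rounded, at least one cycle.
        return max(1, round(rng.expovariate(1.0 / mean)))

    state = ["CP"] * n                     # each PE is in CP, WAIT or AC
    remaining = [draw(t) for t in mean_T]  # cycles left in CP or AC
    owner = None                           # index of the PE holding the bus
    acc = [0] * n
    wait = [0] * n

    for _ in range(cycles):
        # A free bus goes to the highest-priority waiting PE.
        if owner is None:
            waiting = [i for i in range(n) if state[i] == "WAIT"]
            if waiting:
                owner = max(waiting)
                state[owner] = "AC"
                remaining[owner] = draw(mean_C[owner])
        # Account for this cycle, then advance the timers.
        for i in range(n):
            if state[i] == "WAIT":
                wait[i] += 1
                continue
            if state[i] == "AC":
                acc[i] += 1
            remaining[i] -= 1
            if remaining[i] == 0:
                if state[i] == "AC":
                    owner = None           # release; never preempted earlier
                    state[i] = "CP"
                    remaining[i] = draw(mean_T[i])
                else:
                    state[i] = "WAIT"      # computation done, request the bus
    return [a / cycles for a in acc], [w / cycles for w in wait]


# Example run: three PEs with mean computation and connection times of 2.
bw, ql = simulate_ssb(mean_T=[2, 2, 2], mean_C=[2, 2, 2], cycles=50_000)
```

Because the bus serves one PE at a time, the estimated bandwidths sum to at most one. The numbers produced by this sketch are not expected to match the figures reported in the paper, since the distributions and the level of modeling detail differ.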
As a first example, we considered an SSB architecture
with three PEs: PE1, PE2 and PE3. We assigned the lowest
priority to PE1 and the highest to PE3. The mean
computation times of the PEs were set to $T_1 = T_2 = T_3 = 2$
cycles. We varied the mean communication time ($C_1$) of PE1, with
$C_2$ and $C_3$ as parameters. Various performance parameters of
the PEs, viz. $BW$, $L$ and $W$, were estimated. For brevity,
we present only the results for $BW_1$ and $L_1$ of PE1, shown in
Fig. 6(a) and 6(b).
Fig. 6. Variation of (a) BW1 and (b) L1, with C1. (Plots: horizontal axis, communication time C1 from 2 to 20; vertical axis, (a) bandwidth BW1 from 0 to 0.4 and (b) queue length L1 from 0 to 0.5; curves for C2=C3=2 cycles, C2=2, C3=4 cycles, C2=4, C3=2 cycles and C2=C3=4 cycles.)
As observed from Fig. 6(a), bandwidth increases with
communication time, owing to the increase in the mean sojourn
time of the $AC_1$ state. The figure also shows the influence of $C_2$
and/or $C_3$ on $BW_1$. A reduction in bandwidth is observed when
$C_2$ and/or $C_3$ is changed from two to four cycles, since PE1 has
to spend more time in the waiting states. PE1 received its maximum
bandwidth (25 %) when $C_2 = C_3 = 2$ cycles and $C_1 = 20$ cycles,
and its minimum bandwidth (3 %) when $C_2 = C_3 = 4$ cycles
and $C_1 = 2$ cycles. Figure 6(b) reveals the converse behavior
for the queue length, $L_1$. For higher values of $C_2$ and/or $C_3$,
PE2 and/or PE3 access MEM for longer than PE1. As a
consequence, PE1 spends more time in the waiting states. Hence,
a higher value of $L_1$ is noted for $C_2 = C_3 = 4$ cycles.
In the second example, we considered an HBB architecture
with two PEs mapped to each bus. Processing elements PE11
and PE12 are mapped to BUS1, while PE21 and PE22 are
mapped to BUS2. We assigned descending priorities to the
global requests of PE22, PE21, PE12 and PE11, and then to the
local requests in the same order. The model input parameters
are assigned the following values: $X_{\ell 11} = 0.7$, $X_{\ell 12} = 0.8$,
$X_{\ell 21} = 0.7$, $T_{11} = T_{12} = T_{21} = T_{22} = 2$ cycles,
$C_{\ell 11} = C_{\ell 12} = C_{\ell 21} = 2$ cycles
and $C_{g11} = C_{g12} = C_{g21} = 2$ cycles (here, $\ell$ and $g$ denote
local and global parameters and are followed by the PE number). Of the
evaluated performance parameters of the PEs, we present the local and
global bandwidth ($BW_{\ell 22}$, $BW_{g22}$) of PE22. We varied $C_{\ell 22}$
for local bandwidth and $C_{g22}$ for global bandwidth. Figures
7(a) and 7(b) plot these parameters against the probability
of local request, $X_{\ell 22}$, with $C_{\ell 22}$ and $C_{g22}$ as parameters.
We observe that the local bandwidth $BW_{\ell 22}$ increases with
$X_{\ell 22}$ as well as with $C_{\ell 22}$. At higher values of
$X_{\ell 22}$, $BW_{\ell 22}$ is more sensitive to $C_{\ell 22}$. The influence
of $C_{g11}$ on $BW_{\ell 22}$ is clearly visible in Fig. 7(a): the share of
local bandwidth declined as we increased $C_{g11}$ from two cycles
to four cycles. In the case of the global bandwidth $BW_{g22}$, a gradual
decrease is observed with increasing $X_{\ell 22}$. At the same value
of $X_{\ell 22}$, PE22 received more bandwidth with higher $C_{g22}$.
Variations in $BW_{g22}$ with $C_{g22}$ at higher values of $X_{\ell 22}$ are
not significant.
Fig. 7. Effect of Xℓ22 on (a) BWℓ22 and (b) BWg22. (Plots: horizontal axis, probability of local request from 0.1 to 0.9; vertical axis, (a) local bandwidth BWℓ22 and (b) global bandwidth BWg22, both from 0 to 0.3; curves in (a) for Cℓ22=2 cycles, Cℓ22=4 cycles and Cℓ22=2, Cg11=4 cycles; curves in (b) for Cg22=2 cycles, Cg22=4 cycles and Cg22=6 cycles.)
VI. CONCLUSIONS
This paper presents a SAN-based modeling approach
for system-level performance evaluation of SSB and
HBB architectures. For the SSB architecture, we evaluated
the performance metrics bandwidth, queue length and waiting
time as functions of the communication times of the processing
elements. For the HBB architecture, performance parameters
for the local and global memories were evaluated against the
probability of local requests. The proposed approach provides an
early estimate of performance metrics that can help the designer
select an appropriate communication architecture for SoC
and embedded applications.
Acknowledgments: We gratefully acknowledge the financial
support provided by the Department of IT, Ministry of Communication
& IT, Govt. of India under the SMDP-VLSI-II project.
REFERENCES
[1] International Technology Roadmap for Semiconductors (ITRS), 2007
Edition, [online] Available: http://public.itrs.net.
[2] U. Deshmukh and V. Sahula, “Interactive generalized semi Markov
process model for evaluating arbitration schemes of SoC bus architectures,”
in Second UKSIM European Sym. on Computer Modeling and Simulation,
Sep. 2008, pp. 578–583.
[3] B. Plateau and K. Atif, “Stochastic automata network for modeling
parallel systems,” IEEE Trans. on Software Eng., vol. 17, no. 10, pp.
1093–1108, 1991.
[4] J. M. Daveau, T. B. Ismail, and A. A. Jerraya, “Synthesis of system-level
communication by an allocation-based approach,” in Proc. of the 8th Int.
Sym. on System Synthesis, Sep. 1995, pp. 150–155.
[5] P. V. Knudsen and J. Madsen, “Integrating communication protocol
selection with partitioning in hardware/software codesign,” in Proc. 11th
Int. Sym. on System Synthesis, Dec. 1998, pp. 111–116.
[6] X. Zhu, W. Qin, and S. Malik, “Modeling operation and microarchitecture
concurrency for communication architectures with application to
retargetable simulation,” IEEE Trans. VLSI Systems, vol. 14, no. 7, pp.
707–716, Jul. 2006.
[7] K. Lahiri, A. Raghunathan, and S. Dey, “System-level performance
analysis for designing on-chip communication architectures,” IEEE Trans.
on CAD of ICs, vol. 20, no. 6, pp. 768–783, Jun. 2001.
[8] W. J. Stewart, K. Atif, and B. Plateau, “The numerical solution of
stochastic automata networks,” European Journal of Operational Research,
vol. 86, no. 3, pp. 503–525, 1995.
[9] A. Nandi and R. Marculescu, “System-level power/performance analysis
for embedded systems design,” in DAC, 2001, pp. 599–604.
