Synchronous elastic networks by Krstic, Sava et al.
Synchronous Elastic Networks
Sava Krstić
Strategic CAD Labs, Intel Corporation
Hillsboro, Oregon, USA
Jordi Cortadella
Universitat Politècnica de Catalunya
Barcelona, Spain
Mike Kishinevsky, John O’Leary
Strategic CAD Labs, Intel Corporation
Hillsboro, Oregon, USA
Abstract— We formally define—at the stream transformer
level—a class of synchronous circuits that tolerate any variabil-
ity in the latency of their environment. We study behavioral
properties of networks of such circuits and prove fundamental
compositionality results. The paper contributes to bridging the
gap between the theory of latency-insensitive systems and the
correct implementation of efficient control structures for them.
I. INTRODUCTION
The conventional abstract model for a synchronous circuit is
a machine that reads inputs and writes outputs at every cycle.
The outputs at cycle i are produced according to a calculation
that depends on the inputs at cycles 0, . . . , i. Computations
and data transfers are assumed to take zero delay.
Latency-insensitive design by Carloni et al. [2] aims to relax
this model by elasticizing the time dimension and so decou-
pling the cycles from the calculations of the circuit. It enables
the design of circuits tolerant to any discrete variation (in
the number of cycles) of the computation and communication
delays. With this modular approach, the functionality of the
system only depends on the functionality of its components
and not on their timing characteristics.
The motivation for latency-insensitive design comes from
the difficulties with timing and communication in nanoscale
technologies. The number of cycles required to transmit data
from a sender to a receiver is governed by the distance
between them, and often cannot be accurately known until
the chip layout is generated late in the design process. Tra-
ditional design approaches require fixing the communication
latencies up front, and these are difficult to amend when
layout information finally becomes available. Elastic circuits
offer a solution to this problem. In addition, their modularity
promises novel methods for microarchitectural design that
can use variable-latency components and tolerate static and
dynamic changes in communication latencies, while—unlike
asynchronous circuits—still employing standard synchronous
design tools and methods.
Cortadella et al. [4] present a simple elastic protocol, called
SELF (Synchronous Elastic Flow) and describe methods for
efficient implementation of elastic systems and for conversion
of regular synchronous designs into elastic form. Inspired by
the original work on latency-insensitive design [2], SELF also
differs from it in ways that render the theory developed in [2]
hardly applicable.
In this paper we give theoretical foundations of SELF: a
novel and arguably more practicable definition of elasticity,
and the basic compositionality results. For space reasons, the
Fig. 1. (a) Conventional synchronous adder, (b) Synchronous elastic adder.
proofs are omitted, but are available in the technical report
[7].
A. Overview
Figure 1(a) depicts the timing behavior of a conventional
synchronous adder that reads input and produces output data
at every cycle (boxes represent cycles). In this adder, the i-th
output value is produced at the i-th cycle. Figure 1(b) depicts
a related behavior of an elastic adder—a synchronous circuit
too—in which data transfer occurs in some cycles and not in
others. We refer to the transferred data items as tokens and we
say that idle cycles contain bubbles.
Put succinctly, elasticization decouples cycle count from
token count. In a conventional synchronous circuit, the i-th
token of a wire is transmitted at the i-th cycle, whereas in
a synchronous elastic circuit the i-th token is transmitted at
some cycle k ≥ i.
Turning a conventional synchronous adder into a syn-
chronous elastic adder requires a communication discipline
that differentiates idle from non-idle cycles (bubbles from
tokens). In SELF, this is implemented by a pair of single-
bit control wires: Valid and Stop. Every input or output wire
Z in a synchronous component is associated to a channel in
the elastic version of the same component. The channel is a
triple of wires 〈Z, validZ , stopZ〉, with Z carrying the data and
the other two wires implementing the control bits, as shown
in Figure 2(b). A token is transferred on this channel when
validZ ∧¬stopZ : the sender sends valid data and the receiver
is ready to accept it; see Figure 4. Additional constraints that
guarantee correct elastic behavior are given in Section III.
There we define precisely the class of elastic circuits and what
it means for a circuit Ae to be an elasticization of a given circuit
A. In particular, our definition implies liveness: Ae produces
infinite streams of tokens if its environment produces infinite
streams of tokens at the input channels and is ready to accept
infinite streams at the output channels.
Suppose N is a network of standard (non-elastic) compo-
nents, as in Figure 2(a). Suppose we then take elasticizations of
Proceedings of the Formal Methods in Computer Aided Design (FMCAD'06)
0-7695-2707-8/06 $20.00  © 2006
Fig. 2. A synchronous network (a) and its elastic counterpart (b).
these standard components and join their channels accordingly,
as in Figure 2(b), ignoring the buffer. Will the resulting
network N e be an elasticization of N ? Will it be elastic at all?
These fundamental questions are answered by Theorem 4 of
Section IV, which is the main result of the paper. The answers
are “yes”, provided a certain graph ∆e(N e) associated with
N e is acyclic. This graph captures the information about paths
inside elastic systems that contain no tokens—analogous to
combinational paths in ordinary systems. Importantly, ∆e(N e)
can be constructed using only local information (the “sequen-
tiality interfaces”) of the individual elastic components.
Since elastic networks tolerate any variability in the latency
of the components, empty FIFO buffers can be inserted in
any channel, as shown in Figure 2(b), without changing the
functional behavior of the network. This practically important
fact is proved as a consequence of Theorem 4.
Synchronous circuits are modeled in this paper as stream
transformers, called machines. This well-known technique (see
[8] and references therein) appears to be quite underdeveloped.
Our rather lengthy preliminary Section II elaborates the nec-
essary theory of networks of machines, culminating with a
surprisingly novel combinational loop theorem (Theorem 1).
Figure 3 illustrates Theorem 1 and, by analogy, Theorem 4
as well. It relies on the formalization of the notion of combina-
tional dependence at the level of input-output wire pairs. Each
input-output pair of a machine is either sequential or not, and
the set of sequential pairs provides a machine’s “sequentiality
interface”. When several machines are put together into a
network N , their sequentiality interfaces define the graph
∆(N ), the acyclicity of which is a test for the network to
be a legitimate machine itself.
Elasticizations of ordinary circuits are not uniquely defined.
On the other hand, for every elastic machine A there is a
unique standard machine, denoted Aᵀ, that corresponds to it.
We do not discuss any specific elasticization procedures in this
paper, but state our results in the form that only involves elastic
machines and their unique standard counterparts. This makes
the results applicable to multiple elasticization procedures.
B. Related Work
Carloni et al. [2] pioneered a theory of latency-insensitive
circuits based on their notion of patient processes. Patient
processes are defined at a high level of abstraction that models
communication on a channel only by “token or bubble”, leav-
ing implementation protocol(s) unspecified. In the companion
paper [3], Carloni et al. give an incomplete description of
an implementation protocol. Assuming our recovery of that
protocol (let us call it LID) is accurate, its transfer condition is
Fig. 3. Four machines (left) put into a network N (middle), and the network’s
dependency graph ∆(N ) (right). The nodes of ∆(N ) are wires; internal
wires get two labels. The arcs are non-sequential input-output wire pairs of
component circuits. Dotted arcs indicate that (1,2) and (7,10) are sequential
pairs for A and C resp.; they are not part of ∆(N ) so ∆(N ) is acyclic.
more complex than that of SELF (Figure 4) and consequently
LID requires significantly more complex implementation. For
example, conversion of a regular design into LID form needs
a wrapper or registers around every module, increasing the la-
tency of each module’s computation by two cycles—a penalty
that is not required in the SELF elasticization. There might
also be practical challenges in interfacing a LID system with
an existing non-LID module, requiring the latter to generate
stop signals with complex semantics.
cycle    0  1  2  3  4  5  6  7  8  9  . . .
dataZ    ∗  A  B  B  B  C  ∗  ∗  D  D  . . .
validZ   0  1  1  1  1  1  0  0  1  1  . . .
stopZ    0  0  1  1  0  0  0  1  1  0  . . .
SELF     @  t  @  @  t  t  @  @  @  t  . . .
LID      @  t  t  @  t  t  @  @  @  t  . . .
Fig. 4. Comparing the SELF and LID protocols. The bottom rows show the
states of the channel Z, differentiating between bubbles (@) and tokens (t).
When ¬validZ , the value at the data wire is irrelevant (labelled ∗ in cycles
0, 6 and 7). The receiver can issue a stopZ even when the sender does not
send valid data (cycle 7). In the cycles 3, 4, and 9, the sender persistently
maintains the same valid data as in the previous cycle. In SELF, data transfer
takes place in cycles 1,4,5,9, so the transferred sequence is ABCD . . .. In
LID, the same sequence of values on the channel wires signifies transfer of a
different sequence of data: ABBCD . . . This is because a token is transferred
on the LID channel when validZ ∧ ¬(stopZ ∧ pre(stopZ)), where pre
stands for the value during the previous cycle. (The first occurrence of the
stop request stopZ = 1 means “perhaps you will need to stop next cycle”
and the data item B sent through the channel during cycle 2 is assumed to
be successfully transmitted to the receiver.)
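The two transfer conditions compared in Fig. 4 can be replayed on the trace shown there. The following is a minimal sketch, not part of either protocol's specification: the function names are hypothetical, and the value of pre(stopZ) in cycle 0 is assumed to be 0.

```python
# Trace of channel Z copied from Fig. 4 (cycles 0..9).
# None stands for an irrelevant data value (shown as * in the figure).
data = [None, "A", "B", "B", "B", "C", None, None, "D", "D"]
valid = [0, 1, 1, 1, 1, 1, 0, 0, 1, 1]
stop = [0, 0, 1, 1, 0, 0, 0, 1, 1, 0]


def self_transfers(valid, stop):
    """SELF: a token moves in cycle i iff valid[i] and not stop[i]."""
    return [bool(v and not s) for v, s in zip(valid, stop)]


def lid_transfers(valid, stop):
    """LID as recovered in the text: valid and not (stop and pre(stop)).

    Assumption: pre(stop) is 0 in cycle 0.
    """
    previous_stop = [0] + stop[:-1]
    return [bool(v and not (s and p))
            for v, s, p in zip(valid, stop, previous_stop)]


def tokens(data, transfers):
    """Concatenate the data values of the cycles in which a transfer occurs."""
    return "".join(d for d, t in zip(data, transfers) if t)


self_seq = tokens(data, self_transfers(valid, stop))
lid_seq = tokens(data, lid_transfers(valid, stop))
print(self_seq, lid_seq)
```

On this trace the SELF rule selects cycles 1, 4, 5, 9 and the LID rule additionally selects cycle 2, which is where the extra B comes from.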
We emphasize that the limitations of LID implementations
are not inherent to the concept of patient processes. Regarding
latency properties, they do not seem to be more limited than
elastic systems. Still, it turns out that patient processes are not
general enough to model elastic systems as we define them
in Section III. This we prove in Section V where patient
processes and elastic systems are compared as alternative
formalizations of latency-insensitive circuits.
Suhaib et al. [12] revisited and generalized Carloni’s elasti-
cization procedure, validating its correctness by a simulation
method based on model checking.
Lee et al. [9] study causality interfaces (pairwise input-
output dependencies) and are “interested in existence and
uniqueness of the behavior of feedback composition”, but do
not go as far as deriving a combinational loop theorem.
In their work on design of interlock pipelines [6], Jacobson
et al. use a protocol equivalent to SELF, without explicitly
specifying it.
Manohar and Martin discuss “slack elasticity” of asyn-
chronous implementations in [10]. Their slack elasticity con-
ditions relate to the structure of choices in the asynchronous
specification. Unlike [10], in the current paper we deal with
synchronous systems and we take a black box view of their
control—no information about the control flow (and hence on
the structure of choices) is ever used. Instead the connectivity
information corresponding to the system data-flow is used for
elasticization. Conservatively ignoring control flow may lead
to a performance penalty, but simplifies the translation to an
elastic system.
II. CIRCUITS AS STREAM FUNCTIONS
In this section we introduce machines as a mathematical
abstraction of circuits without combinational cycles. For sim-
plicity, this abstraction implicitly assumes that all sequential
elements inside the circuit are initialized. Extending to par-
tially initialized systems appears to be trivial. While there is a
large body of work studying circuits or equivalent objects with
good (e.g. constructive [1]) combinational cycles and their
composition (e.g. [5]), we deliberately restrict consideration
to the fully acyclic objects, since neither logic synthesis nor
timing analysis can properly treat circuits with combinational
cycles.
Most of the effort in this section goes into establishing
modularity conditions guaranteeing that a system obtained as a
network of machines (the feedback construction in particular)
is a machine itself.
A. Streams
A stream over A is an infinite sequence whose elements
belong to the set A. The first element of a stream a is referred
to by a[0], the second by a[1], etc. For example, the equation
a[i] = 3i+ 1 describes the stream (1, 4, 7, . . .).
The set of all streams over A will be denoted A∞. Occasionally
we will need to consider finite sequences too; the set of all,
finite or infinite, sequences over A is denoted Aω.
We will write a ∼k b to indicate that the streams a and
b have a common prefix of length k. The equivalence rela-
tions ∼0,∼1,∼2, . . . are progressively finer and have trivial
intersection. Thus, to prove two sequences a and b are equal,
it suffices to show a ∼k b holds for every k. Note also that
a ∼0 b holds for every a and b.
We will use the equivalence relations ∼k to express prop-
erties of systems and machines viewed as multivariate stream
functions. All these properties will be derived from the fol-
lowing two basic properties of single-variable stream functions
f : A∞ → B∞.
causality: ∀a, b ∈ A∞. ∀k ≥ 0. a ∼k b⇒ f(a) ∼k f(b)
contraction: ∀a, b ∈ A∞. ∀k ≥ 0. a ∼k b⇒ f(a) ∼k+1 f(b)
Informally, f is causal if (for every a) the first k elements of
f(a) are determined by the first k elements of a, and f is
contractive if the first k elements of f(a) are determined by
the first k − 1 elements of a.
Lemma 1: If f : A∞ → A∞ is contractive, then it has a
unique fixpoint.
Remark. One can define the distance d(a, b) between sequences
a and b to be 1/2^k, where k is the length of the
longest common prefix of a and b (and d(a, b) = 0 when a = b).
This gives the sets A∞ and Aω the structure of complete metric
spaces, and Lemma 1 is an instance of the Banach Fixed Point
Theorem. See the review paper
[8] for more details and references about the metric semantics
of systems and [13] for “diadic arithmetic of circuits”. We
choose not to use the metric space terminology in this paper
since all “metric reasoning” we need can be as easily done
with equivalence relations ∼k instead. See [11] for principles
of reasoning with such “converging equivalence relations” in
more general contexts.
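The argument behind Lemma 1 can be illustrated on finite prefixes: since any two streams agree on the empty prefix, applying a contractive f repeatedly pins down one more element of the fixpoint at each step. The sketch below is only an illustration under assumptions; the example function f(a) = (1)#(2a[0], 2a[1], . . .) and the names `f_prefix` and `fixpoint_prefix` are hypothetical.

```python
def f_prefix(prefix):
    """Contractive example on prefixes: f(a) = (1) # (2*a[0], 2*a[1], ...).

    A prefix of length k of a determines a prefix of length k + 1 of f(a).
    """
    return [1] + [2 * x for x in prefix]


def fixpoint_prefix(f, n):
    """Return the first n elements of the unique fixpoint of a contractive f.

    f must map a prefix of length k to a prefix of length k + 1.
    """
    prefix = []  # the empty prefix is shared by all streams
    while len(prefix) < n:
        prefix = f(prefix)  # each application fixes one more element
    return prefix[:n]


fixed = fixpoint_prefix(f_prefix, 5)
print(fixed)
```

Applying `f_prefix` to the computed prefix reproduces it (plus one further element), which is the fixpoint property restricted to prefixes.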
B. Systems
Suppose W is a set of typed wires; all we know about
an individual wire w is a set type(w) associated to it. A
W -behavior is a function σ that associates a stream σ.w ∈
type(w)∞ to each wire w ∈ W . The set of all W -behaviors
will be denoted ⟦W⟧. Slightly abusing the notation, we will
also write ⟦w⟧ for the set type(w)∞. Notice that the equivalence
relations ∼k extend naturally from streams to behaviors:
σ ∼k σ′ iff ∀w ∈W. σ.w ∼k σ′.w
Notice also that a W -behavior σ can be seen as a single
stream (σ[0], σ[1], . . .) of W -states, where a state is an as-
signment of a value in type(w) to each wire w.
Definition 1: A W -system is a subset of ⟦W⟧.
Example. A circuit that at each clock cycle receives an
integer as input and returns the sum of all previously received
inputs is described by the W -system S, where W consists
of two wires u, v of type ℤ, and S consists of all stream
pairs (a, b) ∈ ℤ∞ × ℤ∞ such that b[0] = 0 and b[n] =
a[0]+· · ·+a[n−1] for n > 0. Each stream pair (a, b) represents
a behavior σ such that σ.u = a and σ.v = b.
We will use wires as typed variables in formulas meant to
describe system properties. The formulas are built using ordi-
nary mathematical and logical notation, enhanced with tempo-
ral operators next, always, and eventually, denoted respectively
by ( )+,G,F. As an illustration, the system S in the example
above is characterized by the property v = 0∧G (v+ = v+u).
Also, one has S |= FG (u > 0) ⇒ FG (v > 1000), where |=
is used to denote that a formula is true of a system.
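The example system S and its characterizing formula can be checked on finite prefixes. This is a minimal sketch under assumptions: streams are truncated to finite lists, and the names `running_sum` and `satisfies_spec` are hypothetical.

```python
from itertools import accumulate


def running_sum(u):
    """Finite prefix of v, where v[0] = 0 and v[n] = u[0] + ... + u[n-1]."""
    return [0, *accumulate(u)][:len(u)]


def satisfies_spec(u, v):
    """Check v = 0 and G(v+ = v + u) on the cycles available in the prefix."""
    starts_at_zero = not v or v[0] == 0
    steps_ok = all(v[i + 1] == v[i] + u[i] for i in range(len(v) - 1))
    return starts_at_zero and steps_ok


u = [3, 1, 4, 1, 5]
v = running_sum(u)
print(v, satisfies_spec(u, v))
```

A finite check of this kind can only refute the always-property on the observed cycles; it says nothing about eventual properties such as FG (v > 1000).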
C. Operations on Systems
If W ′ ⊆ W , there is an obvious projection map σ ↦
σ ↓W ′ : ⟦W⟧ → ⟦W ′⟧. These projections are all one needs
for the definition of the following two basic operations on
systems.
Definition 2: (a) If S is a W -system and W ′ ⊆ W , then
hiding W ′ in S produces a (W − W ′)-system hideW ′(S)
defined by
τ ∈ hideW ′(S) iff ∃σ ∈ S. τ = σ ↓ (W −W ′).
(b) The composition of a W1-system S1 and a W2-system S2
is a (W1 ∪W2)-system S1 ⊔ S2 defined by
σ ∈ S1 ⊔ S2 iff σ ↓W1 ∈ S1 ∧ σ ↓W2 ∈ S2.
If W and W ′ are disjoint wire sets, σ ∈ ⟦W⟧, and τ ∈ ⟦W ′⟧,
then there is a unique behavior ϑ ∈ ⟦W ∪ W ′⟧ such that
σ = ϑ ↓W and τ = ϑ ↓W ′. This “product” of behaviors will
be written as ϑ = σ ∗ τ . (If W is the empty set, then ⟦W⟧ has
one element, a “trivial behavior” that is also a multiplicative
unit for the product operation ∗.) We will also use the notation
[u ↦ a, v ↦ b, . . .] for the {u, v, . . .}-behavior σ such that
σ.u = a, σ.v = b, etc.
Hiding and composition suffice to define complex networks
of systems. To model identification of wires, we use simple
connection systems: by definition, Conn(u, v) is the {u, v}-
system consisting of all behaviors σ such that σ.u = σ.v.
Now if S1, . . . ,Sm are given systems and u1, . . . , un,
v1, . . . , vn are some of their wires, the network obtained
from these systems by identifying each wire ui with the
corresponding wire vi (of equal type) is the system
〈S1, . . . ,Sm |u1 = v1, . . . , un = vn〉
defined as hide{u1,...,un,v1,...,vn}(S), where
S = S1 ⊔ · · · ⊔ Sm ⊔ Conn(u1, v1) ⊔ · · · ⊔ Conn(un, vn).
The simplest case (m = n = 1) of networks is the construction
〈S |u = v〉 = hide{u,v}(S ⊔ Conn(u, v)),
used for a feedback definition in Section II-E. A behavior σ
belongs to 〈S |u = v〉 if and only if σ ∗ [u ↦ a, v ↦ a] ∈ S
for some a ∈ ⟦u⟧.
D. Machines
Suppose I and O are disjoint sets of wires, called inputs
and outputs, correspondingly. By definition, an (I,O)-system
is just an (I ∪ O)-system. Consider the following properties
of an (I,O)-system S.
deterministic:
∀ω, ω′ ∈ S. ω ↓ I = ω′ ↓ I ⇒ ω ↓O = ω′ ↓O
functional:
∀σ ∈ ⟦I⟧. ∃!τ ∈ ⟦O⟧. σ ∗ τ ∈ S
causal:
∀ω, ω′ ∈ S.∀k ≥ 0. ω ↓ I ∼k ω′ ↓ I ⇒ ω ↓O ∼k ω′ ↓O
Clearly, functionality implies determinism. Conversely, a
deterministic system is functional if and only if it accepts
all inputs. Note also that causality implies determinism: if
ω ↓ I = ω′ ↓ I , then ω ↓ I ∼k ω′ ↓ I holds for every k, so
ω ↓O ∼k ω′ ↓O holds for every k too, so ω ↓O = ω′ ↓O.
Definition 3: An (I,O)-machine is an (I,O)-system that is
both functional and causal.
A functional system S uniquely determines and is determined
by the function F : ⟦I⟧ → ⟦O⟧ such that F (σ) = τ
holds if and only if σ ∗ τ ∈ S. The causality condition for
such S can also be written as follows:
∀σ, σ′ ∈ ⟦I⟧. ∀k ≥ 0. σ ∼k σ′ ⇒ F (σ) ∼k F (σ′).
The system in the example in Section II-B is a machine if
we regard u as an input wire and v as an output wire. The
same is true of the system Conn(u, v): its associated function
F is the identity function.
E. Feedback on Machines
We will use the term feedback for the system 〈S |u = v〉
as mentioned in Section II-C when S is a machine and the
wires u and v of the same type are an input and output of
S respectively. Our concern now is to understand under what
conditions the feedback produces a machine.
To fix the notation, assume S is an (I,O)-machine given
by F : ⟦I⟧ → ⟦O⟧, with wires u ∈ I , v ∈ O of the same type
A. By the note at the end of Section II-C, we have that for
every σ ∈ ⟦I − {u}⟧ and τ ∈ ⟦O − {v}⟧,
σ ∗ τ ∈ 〈S |u = v〉
if and only if
∃a ∈ A∞. F (σ ∗ [u ↦ a]) = τ ∗ [v ↦ a],
so 〈S |u = v〉 is functional when, for every σ, the function
Fσuv : A∞ → A∞ defined by Fσuv(a) = F (σ ∗ [u ↦ a]).v has a unique
fixpoint. By Lemma 1, this is guaranteed if Fσuv is contractive.
The following definition introduces the key concept of
sequentiality that formalizes the intuitive notion that there is
no combinational dependence of a given output wire on a
given input wire. Sequentiality of the pair (u, v) easily implies
contractivity of Fσuv for all σ.
Definition 4: The pair (u, v) is sequential for S if for every
σ, σ′ ∈ ⟦I⟧ and every k ≥ 0
∧ σ.u ∼k−1 σ′.u
∧ ∀x ∈ I − {u}. (σ.x ∼k σ′.x)
⇒ F (σ).v ∼k F (σ′).v
Lemma 2 (Feedback): If (u, v) is a sequential input-output
pair for a machine S, then the feedback system 〈S |u = v〉 is
a machine too.
Example. Consider the system S with I = {u, v}, O =
{w, z}, specified by equations
w = u ⊕ ((0)#v),    z = v ⊕ v,
where all wires have type ℤ, the symbol ⊕ denotes the
componentwise sum of streams, and # denotes concatenation.
Since z does not depend on u, the pair (u, z) is sequential.
The pair (v, w) is also sequential since to compute a prefix
of w it suffices to know (a prefix of the same size of u and)
a prefix of smaller size of v. The remaining two input-output
pairs (u,w) and (v, z) are not sequential.
To find the machine 〈S | v = w〉, we need to solve the
equation v = u ⊕ ((0)#v) for v. For each u = (a0, a1, a2, . . .),
the equation has a unique solution v = û = (a0, a0 + a1, a0 +
a1 + a2, . . .). Substituting the solution into z = v ⊕ v, we obtain
a description of 〈S | v = w〉 by a single equation that relates its
input and output: z = û ⊕ û. The other feedback 〈S |u = z〉 is
easier to calculate; it is given by equation w = v ⊕ v ⊕ ((0)#v).
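The feedback 〈S | v = w〉 of this example can be computed element by element on a finite prefix, because the delayed term (0)#v only ever needs values of v that are already known. A minimal sketch, assuming finite lists in place of streams; the name `solve_feedback` is hypothetical.

```python
def solve_feedback(u):
    """Solve v = u + ((0) # v) on a finite prefix of u.

    Element i of (0) # v is 0 for i = 0 and v[i - 1] otherwise, so each
    v[i] depends only on u[i] and on the previously computed v[i - 1].
    """
    v = []
    for i, a in enumerate(u):
        delayed = 0 if i == 0 else v[i - 1]
        v.append(a + delayed)
    return v


u = [1, 2, 3, 4]
v = solve_feedback(u)       # the prefix sums of u
z = [x + x for x in v]      # z = v + v, componentwise
print(v, z)
```

The loop terminates with a unique answer precisely because the pair (v, w) is sequential: no element of w needs the element of v at the same position.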
F. Networks of Machines and the Combinational Loop Theo-
rem
Consider a network N = 〈S1, . . . ,Sm |u1 = v1, . . . , un =
vn〉, where S1, . . . ,Sm are machines with disjoint wire sets
and the pairs (u1, v1),. . . ,(un, vn) involve n distinct input
wires ui and n distinct output wires vi. (There is no assump-
tion that ui, vi belong to the same machine Sj .) Our goal is to
understand under what conditions the system N is a machine.
Note that N = 〈S |u1 = v1, . . . , un = vn〉, where S =
S1 ⊔ · · · ⊔ Sm. It is easy to check that an input-output pair
(u, v) of S is sequential if either (1) (u, v) is sequential for
some Si, or (2) u and v belong to different machines. Thus,
the information about sequentiality of input-output pairs of the
“parallel composition” machine S is readily available from the
sequentiality information about the component machines Si,
and our problem boils down to determining when a multiple
feedback operation performed on a single machine results in
a system that is itself a machine.
Simultaneous feedback specified by a set of two or more
input-output pairs of a machine does not necessarily produce
a machine even if all pairs involved are sequential. Indeed,
in the example in Section II-E, we had a system S with
two sequential pairs (u, z) and (v, w), but (u, z) ceases to
be sequential for 〈S | v = w〉: if z and u are related
by z = û ⊕ û, then knowing a prefix of length k of z requires
knowing the prefix of the same length of u; a shorter one
would not suffice.
To ensure that a multiple feedback construction produces a
machine, one needs to show that, in addition to the wire pairs
to be identified, sufficiently many other input-output pairs are
also sequential. A precise formulation for a double feedback
is given by a version of the Bekić Lemma: for the system
〈S |u = w, v = z〉 to be a machine, it suffices that three
pairs of wires be sequential—(u,w), (v, z), and one of (u, z),
(v, w). This non-trivial auxiliary result is needed for the proof
of Theorem 1 below, and is a special case of it.
Given an (I,O)-machine S, let its dependency graph ∆(S)
have the vertex set I ∪ O and directed edges that go from u
to v for each pair (u, v) ∈ I × O that is not sequential. For
a network system N = 〈S1, . . . ,Sm |u1 = v1, . . . , un = vn〉,
its graph ∆(N ) is then defined as the direct sum of graphs
∆(S1), . . . ,∆(Sm) with each vertex ui (1 ≤ i ≤ n) identified
with the corresponding vertex vi (Figure 3).
Theorem 1 (Combinational Loop Theorem): The network
system N is a machine if the graph ∆(N ) is acyclic.
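The acyclicity test of Theorem 1 needs only the sequentiality interfaces of the components. The sketch below builds the dependency graph for the machine S of the example in Section II-E (inputs u, v; outputs w, z; sequential pairs (u, z) and (v, w)) and checks it for two feedback choices. The function names and the dictionary encoding of the wire identifications are assumptions made for illustration.

```python
def dependency_edges(inputs, outputs, sequential):
    """Edges of the dependency graph: all non-sequential input-output pairs."""
    return {(u, v) for u in inputs for v in outputs
            if (u, v) not in sequential}


def is_acyclic(edges, identify):
    """Check acyclicity after merging each input wire with its output wire.

    identify maps an input wire u_i to the output wire v_i it is joined to.
    """
    def rep(wire):
        return identify.get(wire, wire)

    graph = {}
    for u, v in edges:
        graph.setdefault(rep(u), set()).add(rep(v))
        graph.setdefault(rep(v), set())

    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node):
        # Depth-first search; reaching a GREY node means a cycle.
        color[node] = GREY
        for succ in graph[node]:
            if color[succ] == GREY:
                return False
            if color[succ] == WHITE and not visit(succ):
                return False
        color[node] = BLACK
        return True

    return all(color[node] != WHITE or visit(node) for node in graph)


edges = dependency_edges({"u", "v"}, {"w", "z"}, {("u", "z"), ("v", "w")})
single = is_acyclic(edges, {"v": "w"})             # feedback v = w only
double = is_acyclic(edges, {"v": "w", "u": "z"})   # feedback v = w and u = z
print(single, double)
```

For the single feedback the graph is acyclic, so the theorem applies. For the double feedback the merged graph has a cycle, so the theorem gives no guarantee, in line with the observation that (u, z) is no longer sequential once v = w has been closed.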
III. ELASTIC MACHINES
In this section we give the definition of elastic machines.
Its four parts—input-output structure, persistence conditions,
liveness conditions, and the transfer determinism condition—
are covered by Definitions 5-8 below.
A. Input-output Structure, Channels, and Transfer
We assume that the set of wires is partitioned into data,
valid, and stop wires, so that for each data wire X there
exist associated wires validX and stopX of boolean type. (In
actual circuit implementations, validX and stopX need not be
physical wires; it suffices that they be appropriately encoded.)
Definition 5: Let I,O be disjoint sets of data wires. An
[I,O]-system is an (I ′, O′)-machine, where I ′ = I ∪
{validX |X ∈ I} ∪ {stopY |Y ∈ O} and O′ = O ∪
{validY |Y ∈ O} ∪ {stopX |X ∈ I}.
The triples 〈X, validX , stopX〉 (for X ∈ I) and
〈Y, validY , stopY 〉 (for Y ∈ O) are to be thought of as elastic
input and output channels of the system.
Let transferZ be a shorthand for validZ ∧ ¬stopZ and say
that transfer along Z occurs in a state s if s |= transferZ .
Given a behavior σ = (σ[0], σ[1], σ[2], . . .) of an [I,O]-system
and Z ∈ I ∪ O, let σZ be the sequence (perhaps finite!)
obtained from σ.Z = (σ[0].Z, σ[1].Z, σ[2].Z, . . .) by deleting
all entries σ[i].Z such that transfer along Z does not occur
in σ[i]. The transfer behavior σᵀ associated with σ is then
defined by σᵀ.Z = σZ . If all sequences σZ are infinite, then
σᵀ is an (I ∪O)-behavior; in general, however, we only have
σZ ∈ type(Z)ω.
For each wire Z of an [I,O]-system S we introduce
an auxiliary transfer counter variable tctZ of type ℤ. The
counters serve for expressing system properties related to
transfer. By definition, tctZ is equal to the number of states
that precede the current state and in which transfer along Z
has occurred. That is, for every behavior σ of S, we have
σ.tctZ = (t0, t1, . . .), where tk is the number of indices i
such that i < k and transfer along Z occurs in σ[i]. Note that
the sequence σ.tctZ is non-decreasing and begins with t0 = 0.
The notation min tctS , for any subset S of I ∪ O will be
used to denote the smallest of the numbers tctZ , where Z ∈ S.
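The transfer counters can be computed directly from the valid and stop streams of a channel. The following sketch uses the trace of channel Z from Fig. 4 together with a second, hypothetical channel W that transfers in every cycle; the function names are assumptions.

```python
def transfer_counter(valid, stop):
    """tct[k] = number of cycles i < k in which valid[i] and not stop[i]."""
    tct, count = [], 0
    for v, s in zip(valid, stop):
        tct.append(count)       # record the count before the current cycle
        if v and not s:
            count += 1
    return tct


def min_tct(counters):
    """Pointwise minimum over a non-empty list of counter sequences."""
    return [min(column) for column in zip(*counters)]


# Channel Z: valid and stop as in Fig. 4.
valid_z = [0, 1, 1, 1, 1, 1, 0, 0, 1, 1]
stop_z = [0, 0, 1, 1, 0, 0, 0, 1, 1, 0]
tct_z = transfer_counter(valid_z, stop_z)

# Hypothetical channel W that transfers in every cycle.
tct_w = transfer_counter([1] * 10, [0] * 10)

print(tct_z, min_tct([tct_z, tct_w]))
```

Both counter sequences start at 0 and never decrease, as noted above.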
B. Definition of Elasticity
An elastic component, when ready to communicate over
an output channel, must remain ready until the transfer takes
place.
Definition 6: The persistence conditions for an [I,O]-
system S are given by
S |= G (validY ∧ stopY ⇒ (validY )+ ∧ Y + = Y ) (1)
for every Y ∈ O.
The conjunct Y + = Y can be removed from (1) without
affecting the definition of elastic machines (it follows from
other conditions). The most useful consequence of persistence
is the “handshake lemma”:
S |= GF validY ∧ GF¬stopY ⇒ GF transferY
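Condition (1) is a safety property, so a violation shows up on a finite trace. The sketch below checks it on the output-channel trace of Fig. 4; the function name `persistent` is hypothetical, and a finite check can only refute the property, never establish it for an infinite behavior.

```python
def persistent(data, valid, stop):
    """Check (1) on a finite trace.

    Whenever valid and stop both hold in cycle i, cycle i + 1 must be
    valid again and must carry the same data value.
    """
    return all(
        bool(valid[i + 1]) and data[i + 1] == data[i]
        for i in range(len(valid) - 1)
        if valid[i] and stop[i]
    )


# Trace of Fig. 4; None stands for an irrelevant data value.
data = [None, "A", "B", "B", "B", "C", None, None, "D", "D"]
valid = [0, 1, 1, 1, 1, 1, 0, 0, 1, 1]
stop = [0, 0, 1, 1, 0, 0, 0, 1, 1, 0]

ok = persistent(data, valid, stop)
print(ok)
```

In this trace the sender is stalled in cycles 2, 3, and 8, and in each case it repeats the same valid data in the next cycle.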
Liveness of an elastic component is expressed in terms of to-
ken count: if all input channels have seen k transfers and there
is an output channel that has seen fewer, then the communication
on output channels with the minimum amount of transfer must
be eventually offered. The following definition formalizes this,
Fig. 5. Liveness: Only the hungriest channels (shaded) are being served.
The numbers indicate the current token count at each channel.
together with a similar commitment to eventual readiness on
input channels. (See also Figure 5.)
Definition 7: The liveness conditions for an [I,O]-system
are given by
S |= G (min tctO = tctY ∧ min tctI > tctY ⇒ F validY )   (2)
S |= G (min tctI∪O = tctX ⇒ F ¬stopX )   (3)
for every Y ∈ O and every X ∈ I .
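The premises of (2) and (3) depend only on the current token counts, so the obligations they create in a given state can be listed mechanically. This is a sketch under assumptions: the token counts in the example are hypothetical, the function name `obligations` is invented for illustration, and the minimum over an empty set of channels is taken to be infinite.

```python
def obligations(tct_in, tct_out):
    """List the channels on which (2) and (3) demand eventual progress.

    tct_in and tct_out map channel names to current transfer counts.
    Returns (outputs that must eventually be valid,
             inputs that must eventually not be stopped).
    """
    infinity = float("inf")  # assumed value of min over an empty set
    min_in = min(tct_in.values(), default=infinity)
    min_out = min(tct_out.values(), default=infinity)
    min_all = min(min_in, min_out)

    # Premise of (2): Y is a least-served output, all inputs are ahead of it.
    must_offer = {y for y, t in tct_out.items()
                  if t == min_out and min_in > t}
    # Premise of (3): X is least-served among all channels.
    must_accept = {x for x, t in tct_in.items() if t == min_all}
    return must_offer, must_accept


behind = obligations({"a": 1, "b": 2}, {"c": 1, "d": 0, "e": 1})
level = obligations({"a": 1, "b": 1}, {"c": 1, "d": 1, "e": 1})
print(behind, level)
```

In the first state only output d lags behind every input, so only d must eventually be offered. In the second state all counts are equal, so no output is owed, but both inputs must eventually be accepted.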
In practice, elastic components will satisfy simpler (but
stronger) liveness properties; e.g. remove min tctO = tctY
from (2) and replace min tctI∪O = tctX with min tctO ≥
tctX in (3). However, a composition of such components,
while satisfying (2) and (3), may not satisfy the stronger
versions of these conditions.
Consider single-channel [I,O]-systems satisfying the per-
sistence and liveness conditions: an elastic consumer is a
[{Z}, ∅]-system C satisfying (4) below; similarly, an elastic
producer is a [∅, {Z}]-system P satisfying (5) and (6).
C |= GF¬stopZ (4)
P |= G (validZ ∧ stopZ ⇒ (validZ)+) (5)
P |= GF validZ (6)
Let CZ be the {Z, validZ , stopZ}-system characterized by
condition (4)—the largest (in the sense of behavior inclusion)
of the systems satisfying this condition. Similarly, let PZ be
the {Z, validZ , stopZ}-system characterized by properties (5)
and (6). Finally, define the [I,O]-elastic environment to be the
system
$$\mathit{Env}_{I,O} = \bigsqcup_{X \in I} \mathcal{P}_X \;\sqcup\; \bigsqcup_{Y \in O} \mathcal{C}_Y.$$
Note that EnvI,O is only a system; it is not functional and so
is not a machine.
When a system satisfying the persistence and liveness con-
ditions (1-3) is coupled with a matching elastic environment,
the transfer on all data wires never comes to a stall:
Lemma 3 (Liveness): If S satisfies (1-3), then for every
behavior ω of S unionsq EnvI,O, all the component sequences of
the transfer behavior ωᵀ are infinite.
As an immediate consequence of the Liveness Lemma, if S
satisfies (1-3), then
Sᵀ = {ωᵀ |ω ∈ S unionsq EnvI,O}
is a well-defined (I,O)-system.
Definition 8: An [I,O]-system S is an [I,O]-elastic ma-
chine if it satisfies the properties (1-3) and the associated
system Sᵀ is deterministic.
The liveness conditions (2,3) are visibly related to causality
at the transfer level: k transfers on the input channels imply
k transfers on the output channels in the cooperating envi-
ronment. Thus, it is not surprising (even though the proof is
not obvious) that the determinism postulated in Definition 8
suffices to derive the causality of Sᵀ:
Theorem 2: If S is an [I,O]-elastic machine, then Sᵀ is an
(I,O)-machine.
In the situation of Definition 8, we say that S is an ela-
sticization of Sᵀ and that Sᵀ is the transfer machine of S.
IV. ELASTIC NETWORKS
An elastic network N is given by a set of elastic machines
S1, . . . ,Sm with no shared wires, together with a set of chan-
nel pairs (X1, Y1), . . . , (Xn, Yn), where the Xi are n distinct
input channels and the Yi are n distinct output channels. As
a network of standard machines, the elastic network N is
defined by
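$$\mathcal{N} = \langle \mathcal{S}_1, \ldots, \mathcal{S}_m \mid X_i = Y_i,\ \mathit{valid}_{X_i} = \mathit{valid}_{Y_i},\ \mathit{stop}_{X_i} = \mathit{stop}_{Y_i}\ (1 \le i \le n) \rangle$$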
for which we will use the shorter notation
N = 〈〈S1, . . . ,Sm []X1 = Y1, . . . , Xn = Yn〉〉.
We will define a graph that encodes the sequentiality infor-
mation about the network N and prove in Theorem 4 that
acyclicity of that graph implies that N is an elastic machine
and that N ᵀ = 〈Sᵀ1 , . . . ,Sᵀm |X1 = Y1, . . . , Xn = Yn〉.
A. Elastic Feedback
Elastic feedback is a simple case of elastic network:
〈〈S []P = Q〉〉 = 〈S |P = Q, validP = validQ, stopP = stopQ〉.
Definition 9: Suppose S is an elastic machine. An input-
output channel pair (P,Q) will be called sequential for S if
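$$\mathcal{S} \models \mathbf{G}\,\bigl(\min \mathit{tct}_{I \cup O} = \mathit{tct}_Q \;\wedge\; \min \mathit{tct}_{I - \{P\}} > \mathit{tct}_Q \;\Rightarrow\; \mathbf{F}\,\mathit{valid}_Q\bigr). \qquad (7)$$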
Condition (7) is a strengthening of the liveness condition
(2) for channel Q. It expresses a degree of independence of
the output channel Q from the input channel P ; e.g., the first
token at Q need not wait for the arrival of the first token
at P . This independence can be achieved in the system by
storing some tokens inside, between these two channels. Note
that (7) does not guarantee that connecting channels P and Q
would not introduce ordinary combinational cycles. Therefore
the acyclicity condition in the following theorem is required
to ensure (by Theorem 1) that the elastic feedback, viewed as
an ordinary network, is a machine.
Theorem 3: Let S be an elastic machine and F the elastic
feedback system 〈〈S []P = Q〉〉. If the channel pair (P,Q) is
sequential for S, then: (a) the wire pair (P,Q) is sequential for
Sᵀ. If, in addition, ∆(F) is acyclic, then: (b) F is an elastic
machine, and (c) Fᵀ = 〈Sᵀ |P = Q〉.
B. Main Theorems
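Sequentiality of two channel pairs (P,Q), (P ′, Q) of an
elastic machine does not imply their “simultaneous sequentiality”
\[
S \models \mathsf{G}\Bigl( \bigl(\min \mathrm{tct}_{I \cup O} = \mathrm{tct}_Q \;\wedge\; \min \mathrm{tct}_{I - \{P, P'\}} > \mathrm{tct}_Q\bigr) \Rightarrow \mathsf{F}\,\mathrm{valid}_Q \Bigr).
\]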
This deviates from the situation with ordinary machines, where
the analogous property holds and is instrumental in the proof
of the Combinational Loop Theorem.
To justify multiple feedback on elastic machines, we must
therefore postulate that simultaneous sequentiality holds where
required. Specifically, we demand that elastic machines come
with simultaneous sequentiality information: if S is an
[I,O]-elastic machine, then for every Y ∈ O a set δ(Y ) ⊆ I is
given so that
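\[
S \models \mathsf{G}\Bigl( \bigl(\min \mathrm{tct}_{I \cup O} = \mathrm{tct}_Y \;\wedge\; \min \mathrm{tct}_{I - \delta(Y)} > \mathrm{tct}_Y\bigr) \Rightarrow \mathsf{F}\,\mathrm{valid}_Y \Bigr). \tag{8}
\]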
Note that if P ∈ δ(Q), then the pair (P,Q) is sequential, but
the converse is not implied. A function δ : O → 2^I with the
property (8) will be called a sequentiality interface for S.
For an [I,O]-elastic machine S with a sequentiality interface
δ, we define ∆e(S, δ) to be the graph with the vertex
set I ∪ O and directed edges (X,Y ) where X ∉ δ(Y ). By
Theorem 3(a), ∆e(S, δ) contains ∆(Sᵀ) as a subgraph.
Given an elastic network N = 〈〈S1, . . . ,Sm []X1 =
Y1, . . . , Xn = Yn〉〉, where each Si comes equipped with a
sequentiality interface δi, its graph ∆e(N ) is by definition the
direct sum of graphs ∆e(S1, δ1), . . . ,∆e(Sm, δm) with each
vertex Xi (1 ≤ i ≤ n) identified with the corresponding vertex
Yi.
Theorem 4: If the graphs ∆(N ) and ∆e(N ) are acyclic,
then the network system N is an elastic machine, the
corresponding non-elastic system N̄ = 〈Sᵀ1 , . . . ,Sᵀm |X1 =
Y1, . . . , Xn = Yn〉 is a machine, and N ᵀ = N̄.
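The ∆e-acyclicity hypothesis of Theorem 4 can be illustrated with a minimal Python sketch. It assumes that each machine is given as a triple (inputs, outputs, delta), that channel names are globally unique, and that edges run from input channels to output channels; the function names `delta_e_edges`, `network_graph`, and `is_acyclic` are hypothetical. The sketch checks only ∆e(N ); acyclicity of ∆(N ) must be established separately.

```python
from collections import defaultdict

def delta_e_edges(inputs, outputs, delta):
    """Edges (X, Y) with X an input, Y an output, and X not in delta(Y)."""
    return [(x, y) for y in outputs for x in inputs
            if x not in delta.get(y, set())]

def network_graph(machines, connections):
    """Direct sum of the machine graphs, identifying each X_i with Y_i.

    machines:    list of (inputs, outputs, delta) triples (assumed format)
    connections: dict mapping input channel X_i to output channel Y_i
    """
    rep = lambda v: connections.get(v, v)   # Y_i represents the merged vertex
    graph = defaultdict(set)
    for inputs, outputs, delta in machines:
        for x, y in delta_e_edges(inputs, outputs, delta):
            graph[rep(x)].add(rep(y))
            graph[rep(y)]                   # make sure the target vertex exists
    return graph

def is_acyclic(graph):
    """Depth-first search for a back edge."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = defaultdict(int)

    def visit(v):
        color[v] = GREY
        for w in graph.get(v, ()):
            if color[w] == GREY:
                return False                # back edge: cycle found
            if color[w] == WHITE and not visit(w):
                return False
        color[v] = BLACK
        return True

    return all(color[v] != WHITE or visit(v) for v in list(graph))

# Feedback P = Q on a machine with inputs {P, A} and output Q.
with_tokens = ({"P", "A"}, {"Q"}, {"Q": {"P"}})   # P in delta(Q)
no_tokens = ({"P", "A"}, {"Q"}, {"Q": set()})     # P not in delta(Q)
print(is_acyclic(network_graph([with_tokens], {"P": "Q"})))
print(is_acyclic(network_graph([no_tokens], {"P": "Q"})))
```

In the first network the feedback pair is declared sequential, so no edge closes the loop; in the second, identifying P with Q produces a self-loop and the hypothesis fails.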
As in Theorem 3, acyclicity of ∆(N ) is needed to ensure
(by Theorem 1) that N defines a machine. Elasticization
procedures (e.g. [4]) will typically produce elastic components
with enough sequential input-output wire pairs, so that ∆(N )
will be acyclic as soon as ∆e(N ) is acyclic.
Note, however, that cycles in ∆e(N ) need not correspond
to combinational cycles in N seen as an ordinary network,
since empty buffers with sequential elements cutting the
combinational feedbacks may be inserted into N . Even though
non-combinational in the ordinary sense, these cycles contain
no tokens and therefore no progress along them can be made.
Theorem 4 implies that insertion of empty elastic buffers
does not affect the basic functionality of an elastic network,
as illustrated in Figure 2(b).
Definition 10: An empty elastic buffer is an elastic machine
S such that Sᵀ = Conn(X,Y ) for some X,Y .
Theorem 5 (Buffer Insertion Theorem): Suppose B is an
empty elastic buffer with channels X,Y . Let N =
〈〈S1, . . . ,Sm []X1 = Y1, . . . , Xn = Yn〉〉 and M =
〈〈B,S1, . . . ,Sm []X = Y1, X1 = Y,X2 = Y2, . . . , Xn = Yn〉〉.
If ∆(N ), ∆(M), and ∆e(N ) are acyclic, then M is an elastic
machine, and Mᵀ = N ᵀ.
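The intuition behind an empty elastic buffer can be sketched with a toy cycle-level simulation. This is only an illustration under assumptions not taken from the paper: the buffer is modeled as a FIFO of fixed capacity, the producer offers its next token persistently, and the consumer stalls at random. The function name `simulate_buffer` and its parameters are hypothetical, and the sketch does not check properties (1-3) of Definition 8.

```python
import random
from collections import deque

def simulate_buffer(tokens, cycles, capacity=2, seed=0):
    """Return (sent, received): tokens transferred into and out of the buffer."""
    rng = random.Random(seed)
    fifo = deque()                      # starts empty: an *empty* buffer
    sent, received = [], []
    for _ in range(cycles):
        # Handshake signals are computed from the state at the start of the cycle.
        in_valid = len(sent) < len(tokens)   # producer keeps offering its next token
        in_stop = len(fifo) >= capacity      # buffer stalls the producer when full
        out_valid = len(fifo) > 0            # buffer offers its oldest token
        out_stop = rng.random() < 0.5        # consumer stalls at random
        if out_valid and not out_stop:       # transfer on the output channel
            received.append(fifo.popleft())
        if in_valid and not in_stop:         # transfer on the input channel
            tok = tokens[len(sent)]
            fifo.append(tok)
            sent.append(tok)
    return sent, received

sent, received = simulate_buffer(list(range(10)), cycles=50)
# The output transfer stream is always a prefix of the input transfer stream.
print(received == sent[:len(received)])
```

Whatever the stalling pattern, the tokens transferred on the output channel replay those transferred on the input channel in order, which is the behavior of a connection at the transfer level.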
The precise relationship between graphs ∆(M) and ∆(N )
can be easily described. In practice, either both are acyclic
or neither is, as a consequence of sequentiality of sufficiently
many input-output wire pairs of B.
V. ELASTIC VS. PATIENT SYSTEMS
Elastic machines and the patient processes of [2] provide two
formalizations of the intuitive concept of latency-insensitive
circuits. In this section we address their connections and
differences. We begin with an overview of [2], using a minimalistic
approach and terminology that differs from the original. We
believe, however, that Definition 11 below matches the original
definition accurately in its most important aspects.
A. Patient Systems
The notation A∗ denotes the set of finite sequences over A. A
finitary W -system, by definition, is a set of behaviors σ such
that σ.w is a finite sequence for every w ∈W .
A stalling stream over A is a stream over A ∪ {@}. We
will refer to @ as the bubble and to elements of A as tokens.
We will consider only stalling streams that contain finitely
many tokens. If a is such a stream, let ā ∈ A∗ denote the
sequence over A obtained by dropping all bubbles from a.
Clearly, a is determined by ā and the sequence ∂(a) ∈ N∗ of
lengths of bubble sequences between consecutive tokens of a.
For example, if
a = (@,@, 7,@, 4, 5,@,@,@, 8, . . .) (9)
we have ā = (7, 4, 5, 8, . . .) and ∂(a) = (2, 1, 0, 3, . . .). Two
stalling streams a, b are latency equivalent, written a ≈ b,
when ā = b̄. Note that a ≈ ā.
By definition, a stalling W -system is a set of behaviors
σ such that for every w ∈ W , σ.w is a stalling stream over
type(w). Latency equivalence extends to W -behaviors and
W -systems: σ ≈ τ iff σ.w ≈ τ.w holds for every w ∈W ; S ≈ S ′
iff for every σ ∈ S (σ ∈ S ′) there exists τ ∈ S ′ (τ ∈ S) such
that σ ≈ τ .
A stalling W -system S determines a standard finitary W -
system Sᵀ = {σ̄ | σ ∈ S}, where σ̄.w is σ.w with all bubbles
dropped (for all w ∈W ). Clearly, Sᵀ ≈ S.
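The operations on stalling streams introduced above can be sketched in Python on finite prefixes. The names `strip_bubbles`, `gaps`, and `latency_equivalent` are hypothetical, and bubbles are represented by the string "@"; trailing bubbles after the last token of a prefix are ignored by `gaps`.

```python
BUBBLE = "@"

def strip_bubbles(a):
    """Token sequence obtained by dropping all bubbles from a."""
    return [x for x in a if x != BUBBLE]

def gaps(a):
    """Lengths of the bubble runs preceding each token of a."""
    out, run = [], 0
    for x in a:
        if x == BUBBLE:
            run += 1
        else:
            out.append(run)
            run = 0
    return out

def latency_equivalent(a, b):
    """Two stalling streams are latency equivalent if they carry the same tokens."""
    return strip_bubbles(a) == strip_bubbles(b)

# Finite prefix of the stream in (9).
a = ["@", "@", 7, "@", 4, 5, "@", "@", "@", 8]
print(strip_bubbles(a))   # [7, 4, 5, 8]
print(gaps(a))            # [2, 1, 0, 3]
```

The printed values agree with the token sequence and the sequence ∂(a) given for example (9).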
Stalling the k-th token of a by d steps produces a latency
equivalent stream that will be denoted stall(a, k, d). Omitting
the easy definition, we give an example: if a is as in (9), then
stall(a, 1, 3) = (@,@, 7,@,@,@,@, 4, 5,@,@,@, 8, . . .)
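One possible reading of the omitted definition, chosen to be consistent with the example above, is sketched below in Python: tokens are counted from 0 and d bubbles are inserted immediately before the k-th token. The function operates on finite prefixes, and its signature is an assumption made for illustration.

```python
def stall(a, k, d, bubble="@"):
    """Insert d bubbles immediately before the k-th token (0-indexed) of a."""
    out, seen = [], 0
    for x in a:
        if x != bubble:
            if seen == k:
                out.extend([bubble] * d)   # delay this token by d cycles
            seen += 1
        out.append(x)
    return out

a = ["@", "@", 7, "@", 4, 5, "@", "@", "@", 8]
print(stall(a, 1, 3))
# ['@', '@', 7, '@', '@', '@', '@', 4, 5, '@', '@', '@', 8]
```

Stalling shifts the k-th token and everything after it by d positions while leaving the token sequence unchanged, so the result is latency equivalent to the input.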
Definition 11: Let ≺ be a well-founded order¹ on W and
let D > 0. A patient W -system (relative to ≺ and D) is a
stalling system P such that for every σ ∈ P , every u ∈ W ,
and every k ≥ 0 there exists σ′ ∈ P such that
(Pat-1) σ′.u = stall(σ.u, k, 1)
and for every v ≠ u there exists dv ≤ D such that
(Pat-2)
\[
\sigma'.v = \begin{cases} \mathrm{stall}(\sigma.v, k, d_v) & \text{if } u \prec v \\ \mathrm{stall}(\sigma.v, k+1, d_v) & \text{otherwise.} \end{cases}
\]
¹ Introduction of a well-founded ordering of wires is motivated in [2] with
the purpose of modeling combinational dependencies, but such dependencies
in patient systems are not discussed in any detail. Moreover, the ordering of
wires is implicitly assumed to be total in [2], which is somewhat unnatural.
For instance, when constructing a patient adder with inputs u, v and output
w, one has two ordering choices: u ≺1 v ≺1 w and v ≺2 u ≺2 w. It is not
clear that a patient adder in the ≺1-sense will be patient in the ≺2-sense too.
The main results of [2] can now be summarized:
1) a theorem saying that the composition of patient systems
(with the same W , ≺, and D) is a patient system;
2) the definition and analysis of patient buffers, i.e. patient
systems B such that Bᵀ = Connfin(u, v)—the finitary
connection system;
3) a general construction that, for a given finitary system
M without combinational dependencies (model of a
Moore machine), produces a patient system P that is
latency equivalent to M (P ≈ M).
B. Comparison
The formalization given by patient systems is at a higher
level of abstraction. While elastic machines deal explicitly with
handshaking signals between communicating systems, patient
systems communicate purely in the token/bubble language.
Given an elastic (as defined in Section III) [I,O]-system E ,
the corresponding stalling (I ∪ O)-system E@ is obtained by
projecting the finite-transfer behaviors of E to data wires and
replacing data items on each wire with @ at all cycles where
transfer along that wire does not occur. Precisely, let EF be
the subset of E consisting of all behaviors ω such that ωᵀ.Z
is finite for all channels Z.² Then, given ω ∈ EF, we define a
stalling (I ∪O)-behavior ω@ by
\[
(\omega^{@}.Z)[i] = \begin{cases} (\omega.Z)[i] & \text{if } (\omega.\mathrm{valid}_Z)[i] \wedge \neg(\omega.\mathrm{stop}_Z)[i] \\ @ & \text{otherwise,} \end{cases}
\]
and finally we define the stalling system E@ as the set of all
such behaviors ω@. Clearly, the system (E@)ᵀ is the finitary
version of the standard machine Eᵀ.
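The projection from handshake-level behaviors to stalling streams can be sketched for a single channel as follows. The sketch assumes the data, valid, and stop wires of the channel are given as equal-length Python lists covering a finite prefix; the name `to_stalling` is hypothetical.

```python
def to_stalling(data, valid, stop, bubble="@"):
    """Keep the data item on cycles with a transfer (valid and not stop)."""
    return [d if (v and not s) else bubble
            for d, v, s in zip(data, valid, stop)]

# Token 7 is offered on cycle 0, stalled by the receiver, and transferred
# on cycle 1; token 9 is transferred on cycle 2; cycle 3 carries no valid data.
data = [7, 7, 9, 0]
valid = [True, True, True, False]
stop = [True, False, False, False]
print(to_stalling(data, valid, stop))   # ['@', 7, 9, '@']
```

A cycle on which valid and stop are both asserted shows up as a bubble, exactly like a cycle with no valid data, so the stalling stream no longer records when a token was first offered.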
Now we can address some questions pertinent to the com-
parison of patient processes vs. elastic machines.
Are patient processes more general? The answer is “no”
because there exist elastic machines E such that E@ is not
patient. To see this, consider an elastic machine E that starts
offering new valid outputs on channel u only on even cycles.
(The existence of such elastic machines is obvious.) Observe
that σ.u = (@, 7, 9, . . .) is possible for some behavior σ of
E@ (token 7, even though transferred on cycle 1, was first
offered on cycle 0). Then stall(σ.u, 0, 1) = (@,@, 7, 9, . . .)
must also be part of a behavior of E@, by condition (Pat-1)
of Definition 11. Since the channel is occupied by token 7
until its transfer on cycle 2, this implies that token 9 is first
offered on cycle 3, contrary to our assumption.
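The counterexample can be checked mechanically with a small Python sketch. It rests on an assumption about the handshake protocol that is not spelled out in this section: a token can first be offered no earlier than the cycle after the previous transfer on the channel and no later than its own transfer cycle. The name `respects_even_offers` is hypothetical.

```python
def respects_even_offers(stream, bubble="@"):
    """Could every token of the stream have been first offered on an even cycle?"""
    prev = -1                                    # cycle of the previous transfer
    for t, x in enumerate(stream):
        if x == bubble:
            continue
        # Candidate first-offer cycles for this token: prev + 1, ..., t.
        if not any(c % 2 == 0 for c in range(prev + 1, t + 1)):
            return False
        prev = t
    return True

original = ["@", 7, 9]          # a behavior of the machine on channel u
stalled = ["@", "@", 7, 9]      # stall(original, 0, 1), as required by (Pat-1)
print(respects_even_offers(original))   # True
print(respects_even_offers(stalled))    # False
```

In the stalled stream the only candidate cycle for first offering token 9 is cycle 3, which is odd, so the stream cannot belong to a machine that starts offering new tokens only on even cycles.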
The above example can be viewed as an indication that
the condition (Pat-1) is too restrictive. It would be interesting
to see if an appropriate modification of (Pat-1) results in a
definition of patient processes that captures elastic machines.
² One can prove that E is the set of all limits of behaviors of EF, and so E
is determined by EF.
Are elastic machines more general? The answer is an easy
“no” since, for example, the set of all possible stalling W -
behaviors is a patient system in the sense of Definition 11.
However, if one adds to Definition 11 a reasonable require-
ment that a patient system be a machine, the answer is not
immediately clear.
Which formalization is easier to use? Without offering a
definitive answer, we would argue that verifying that a low-
level design (RTL, say) implements an elastic machine would
be easier than verifying that it implements a patient system.
The bottom line is that the conditions for a system to be
an elastic machine are expressible as temporal properties of
suitably constructed infinite-state models. This is not obvious
for the determinism condition for Sᵀ in Definition 8, but
can be done by replacing determinism with causality and
introducing auxiliary variables for sequences of transferred
values over channels. Even though these conditions are not
directly checkable by existing model-checking technology
(e.g., because of the infinite counters involved), there are palpable
opportunities to find manageable stronger conditions that, taken
together, imply elasticity (e.g., postulating a limit on the token
count differences between channels eliminates the need for
infinite counters). On the other hand, the definition of a patient
system, being of the form “for every behavior σ, there exists
a behavior σ′ such that . . . ” appears to us to be intrinsically
more complex. Our only positive conclusion, however, is that
the mechanical checking of either of the definitions is an open
problem deserving further study.
VI. CONCLUSION
We have presented a theory of elastic machines that gives an
easy-to-check condition for the compositional theorem of the
form “an elasticization of a network of ordinary components
is equivalent to the network of components’ elasticizations”.
Verification of a particular implementation is reduced to prov-
ing that conditions of Definition 8 are satisfied for all elastic
components used, and that the graph ∆e(N e) is acyclic for
every network N to which the elasticization is applied. While
the definition of the graphs ∆e may appear complex because
of the sequentiality interfaces involved, it should be noted that
the elasticization procedures, e.g. [4], are reasonably expected
to completely preserve sequentiality: a channel P belongs to
δ(Q) if the wire-pair (P,Q) is sequential in the original non-
elastic machine. This ensures ∆e(N e) = ∆(N ) and so testing
for sequentiality is done at the level of ordinary networks.
Future work will be focused on proving correctness of
particular elasticization methods, on techniques for mechanical
verification of elasticity, and on extending the theory to more
advanced protocols.
Acknowledgments: Luca Carloni clarified some details of [2].
Ken McMillan pointed out several inaccuracies in a previous
version of the paper and further clarified [2] for us. Gerard
Berry, Ching-Tsun Chou, John Harrison, and the anonymous
reviewers provided useful remarks. We are grateful for all the
help we received.
REFERENCES
[1] G. Berry. The Constructive Semantics of Pure Esterel. Draft book,
available at http://www.esterel.org, version 3, July 1999.
[2] L. P. Carloni, K. L. McMillan, and A. L. Sangiovanni-Vincentelli.
Theory of latency-insensitive design. IEEE Transactions on Computer-Aided
Design of Integrated Circuits and Systems, 20(9):1059–1076, September 2001.
[3] L. P. Carloni and A. L. Sangiovanni-Vincentelli. Coping with latency in
SoC design. IEEE Micro, Special Issue on Systems on Chip, 22(5):12,
October 2002.
[4] J. Cortadella, M. Kishinevsky, and B. Grundmann. Synthesis of
synchronous elastic architectures. In Proc. Design Automation Conference
(DAC), July 2006.
[5] S. A. Edwards and E. A. Lee. The semantics and execution of a
synchronous block-diagram language. Sci. Comput. Program., 48(1):21–
42, 2003.
[6] H. M. Jacobson et al. Synchronous interlocked pipelines. In Proc. Int.
Symp. on Advanced Research in Asynchronous Circuits and Systems,
pages 3–12, 2002.
[7] S. Krstić, J. Cortadella, M. Kishinevsky, and J. O’Leary.
Synchronous elastic networks. Available at
www.lsi.upc.edu/~jordicf/gavina/BIB/reports/fmcad06_ext.pdf, 2006.
[8] E. A. Lee and A. Sangiovanni-Vincentelli. A framework for comparing
models of computation. IEEE Transactions on Computer-Aided Design
of Integrated Circuits and Systems, 17(12):1217–1229, 1998.
[9] E. A. Lee, H. Zheng, and Y. Zhou. Causality interfaces
and compositional causality analysis. Invited paper in
Foundations of Interface Technologies (FIT 2005), available at
http://ptolemy.eecs.berkeley.edu/publications.
[10] R. Manohar and A. J. Martin. Slack elasticity in concurrent computing.
In Proc. 4th Int. Conf. on the Mathematics of Program Construction,
volume 1422 of Lecture Notes in Computer Science, pages 272–285,
1998.
[11] J. Matthews. Recursive function definition over coinductive types. In
TPHOLs ’99: Proc. the 12th Int. Conf. on Theorem Proving in Higher
Order Logics, pages 73–90, London, UK, 1999. Springer-Verlag.
[12] S. Suhaib, D. Berner, D. Mathaikutty, J.-P. Talpin, and S. Shukla.
Presentation and formal verification of a family of protocols for latency
insensitive design. Technical Report 2005-02, FERMAT, Virginia Tech,
2005.
[13] J. Vuillemin. On circuits and numbers. IEEE Transactions on Comput-
ers, 43(8):868–879, 1994.
