Single-generation Network Coding for Networks with Delay by Prasad, K. & Rajan, B. Sundar
arXiv:0909.1638v1 [cs.IT] 9 Sep 2009
Single-generation Network Coding for Networks
with Delay
K. Prasad and B. Sundar Rajan
Abstract—A single-source network is said to be memory-free if
all of the internal nodes (those except the source and the sinks)
do not employ memory but merely send linear combinations of
the incoming symbols (received at their incoming edges) on their
outgoing edges. Memory-free networks with delay using network
coding are forced to do inter-generation network coding, as a
result of which the problem of some or all sinks requiring a large
amount of memory for decoding is faced. In this work, we address
this problem by utilizing memory elements at the internal nodes
of the network also, which results in the reduction of the number
of memory elements used at the sinks. We give an algorithm
which employs memory at the nodes to achieve single-generation
network coding. For fixed latency, our algorithm reduces the
total number of memory elements used in the network to achieve
single-generation network coding. We also discuss the advantages
of employing single-generation network coding together with
convolutional network-error correction codes (CNECCs) for
networks with unit-delay and illustrate the performance gain
of CNECCs by using memory at the intermediate nodes using
simulations on an example network under a probabilistic network
error model.
I. INTRODUCTION
Network coding was introduced in [1] as a means of
achieving maximum rate of transmission in wireline networks.
An algebraic formulation of network coding was discussed in
[2] for both instantaneous networks and networks with delays.
Convolutional network-error correcting codes (CNECCs) were
introduced for acyclic instantaneous networks in [3] and for
unit-delay, memory-free networks in [4].
In this work, we consider acyclic, single-source networks
with delays which have a multicast network code in place. The
set of all code symbols generated at the source at any particular
time instant is called a generation. In unit-delay, memory-free
networks, the nodes of the network may receive information
of different generations on their incoming edges at every time
instant and therefore network coding across generations (inter-
generation) is unavoidable in general. However, the sinks
have to employ memory to decode the symbols. If memory
is utilized in the internal nodes also, such inter-generation
network coding can be avoided thus making the decoding
simpler.
We define a single-generation network code as a network
code where all the symbols received at all the sinks are linear
combinations of the symbols belonging to the same generation.
In [5], the technique of adding memory at the nodes to achieve
single-generation network coding was discussed. However this
was done only on a per-node basis without considering the
entire topology or the network code of the network. On the
other hand, we consider the entire network topology and
the network code, which govern the addition of memory
elements at the nodes and the way in which they are rearranged
across the network to reduce the overall memory usage in the
network.
The organization and contributions of this work are as
follows
• After briefly discussing the network setup and the net-
work code for an acyclic network with delays and mem-
ory (Section II), we introduce different methods of adding
memory at a node and analyze how each of them affects
the local and global encoding kernels of the network code
(Section III).
• We also present different memory reduction and distribu-
tion techniques (Section IV).
• We propose an algorithm which uses the memory at
the nodes to achieve single-generation network coding
while reducing the overall memory usage in the network
(Section V).
• We discuss the advantages of employing memory at the
intermediate nodes in tandem with CNECCs in terms of
their encoding/decoding (Section VI).
• We illustrate the performance benefits of using
memory for CNECCs for unit-delay networks using
simulations on an example unit-delay network under a
probabilistic error setting (Section VII).
II. NETWORKS WITH DELAY AND MEMORY
The model for acyclic networks with delays considered in
this paper is as in [2]. An acyclic network can be represented as
an acyclic directed multi-graph (a graph that can have parallel
edges between nodes) G = (V , E) where V is the set of all
vertices and E is the set of all edges in the network.
We assume that every edge in the directed multi-graph
representing the network has unit capacity (can carry at most
one symbol from Fq, the field with q elements). Network
links with capacities greater than unit are modeled as parallel
edges. The network has delays, i.e., every edge in the directed
graph representing the network has a unit delay associated with it,
represented by the parameter z. Such networks are known as
unit-delay networks. Those network links with delays greater
than unit are modeled as serially concatenated edges in the
directed multi-graph. We assume a single-source node s ∈ V
and a set of sinks T . Let n_T be the unicast capacity for a sink
node T ∈ T , i.e., the maximum number of edge-disjoint paths
from s to T . Then
\[ n_{\min} = \min_{T \in \mathcal{T}} n_T \]
is the max-flow min-cut capacity of the multicast connection.
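The quantities n_T and n_min can be computed by counting edge-disjoint paths. The following is a minimal sketch using unit-capacity augmenting paths; the function name `edge_disjoint_paths` and the butterfly edge list are illustrative assumptions and do not come from the paper.

```python
from collections import deque

def edge_disjoint_paths(edges, s, t):
    """Maximum number of edge-disjoint s-t paths in a directed multigraph.

    Every edge has unit capacity; parallel edges are simply listed twice.
    Arc 2k is edge k, arc 2k+1 is its residual (reverse) arc.
    """
    cap, head, adj = [], [], {}
    for u, v in edges:
        adj.setdefault(u, []).append(len(cap)); cap.append(1); head.append(v)
        adj.setdefault(v, []).append(len(cap)); cap.append(0); head.append(u)
    flow = 0
    while True:
        parent = {s: None}              # arc used to reach each visited node
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for a in adj.get(u, []):
                v = head[a]
                if cap[a] > 0 and v not in parent:
                    parent[v] = a
                    queue.append(v)
        if t not in parent:
            return flow
        v = t
        while parent[v] is not None:    # push one unit along the path found
            a = parent[v]
            cap[a] -= 1
            cap[a ^ 1] += 1
            v = head[a ^ 1]             # tail of arc a
        flow += 1

# Hypothetical butterfly network with sinks t1 and t2.
butterfly = [("s", "a"), ("s", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
             ("a", "t1"), ("b", "t2"), ("d", "t1"), ("d", "t2")]
n_T = {T: edge_disjoint_paths(butterfly, "s", T) for T in ("t1", "t2")}
n_min = min(n_T.values())
```

For this edge list both sinks have two edge-disjoint paths from the source, so a 2-dimensional multicast code is the most that can be supported.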
A. Network code for unit-delay, memory-free networks
We follow [2] in describing the network code. For each node
v ∈ V , let the set of all incoming edges be denoted by ΓI(v).
Then |ΓI(v)| = δI(v) is the in-degree of v. Similarly the set
of all outgoing edges is defined by ΓO(v), and the out-degree
of the node v is given by |ΓO(v)| = δO(v).
For any e ∈ E and v ∈ V , let head(e) = v, if v is such
that e ∈ ΓI(v). Similarly, let tail(e) = v, if v is such that
e ∈ ΓO(v). We will assume an ancestral ordering on V and E
of the acyclic graph of the unit-delay, memory-free network.
The network code can be defined by the local kernel
matrices of size δI(v) × δO(v) for each node v ∈ V with
entries from Fq. The global encoding kernels for each edge
can be recursively calculated from these local kernels.
The network transfer matrix, which governs the input-output
relationship in the network, is defined as given in [2] for an
n-dimensional (n ≤ nmin) network code. Towards this end,
the matrices A, K, and B^T (for every sink T ∈ T ) are defined
as follows.
The entries of the n × |E| matrix A are defined as
\[ A_{i,j} = \begin{cases} \alpha_{i,e_j} & \text{if } e_j \in \Gamma_O(s) \\ 0 & \text{otherwise} \end{cases} \]
where α_{i,e_j} ∈ Fq is the local encoding kernel coefficient at
the source coupling input i with edge ej ∈ ΓO(s).
The (i, j)th entry of the |E| × |E| matrix K is K_{e_i,e_j} ∈ Fq,
which is the local kernel coefficient between ei and ej at the
node head(ei) = tail(ej) (if such a node exists), and zero if
head(ei) ≠ tail(ej).
For every sink T ∈ T , the entries of the |E| × n matrix B^T
are defined as
\[ B^T_{i,j} = \begin{cases} \epsilon_{e_j,i} & \text{if } e_j \in \Gamma_I(T) \\ 0 & \text{otherwise} \end{cases} \]
where all ε_{e_j,i} ∈ Fq.
For unit-delay, memory-free networks, we have
\[ F(z) := (I - zK)^{-1} \]
where I is the |E| × |E| identity matrix. Now we have the
following definition.
Definition 1 ([2]): The network transfer matrix, MT (z),
corresponding to a sink node T ∈ T for an n-dimensional
network code, is a full rank (over the field of rational
functions Fq(z)) n × n matrix defined as
\[ M_T(z) := A F(z) B^T = A F_T(z). \]
With an n-dimensional network code, the input and the
output of the network are n-tuples of elements from Fq[[z]],
the formal power series ring over Fq. Definition 1 implies
that if x(z) ∈ Fq^n[[z]] is the input to the unit-delay, memory-
free network, then at any particular sink T ∈ T , we have the
output, y(z) ∈ Fq^n[[z]], to be
\[ y(z) = x(z) M_T(z). \]
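To make Definition 1 concrete, the sketch below evaluates A F(z) B^T for a tiny acyclic network over F2. It relies on the fact that K is nilpotent for an acyclic graph, so (I − zK)^{-1} equals the finite sum I + zK + (zK)^2 + ... Polynomials in z are stored as Python integers (bit k is the coefficient of z^k); this encoding, the helper names, and the three-edge network (e1: s→v, e2: s→T, e3: v→T) are assumptions made for illustration only.

```python
def pmul(a, b):
    """Product of two polynomials over F2 stored as bitmasks."""
    out = 0
    while b:
        if b & 1:
            out ^= a
        a <<= 1
        b >>= 1
    return out

def mat_mul(X, Y):
    """Matrix product with polynomial entries over F2."""
    Z = [[0] * len(Y[0]) for _ in X]
    for i, row in enumerate(X):
        for k, x in enumerate(row):
            for j, y in enumerate(Y[k]):
                Z[i][j] ^= pmul(x, y)
    return Z

def transfer_matrix(A, K, B):
    """M_T(z) = A (I - zK)^{-1} B^T for an acyclic network (K nilpotent)."""
    n = len(K)
    zK = [[pmul(0b10, k) for k in row] for row in K]      # multiply K by z
    F = [[int(i == j) for j in range(n)] for i in range(n)]
    term = F
    for _ in range(n - 1):                                # (zK)^n = 0
        term = mat_mul(term, zK)
        F = [[F[i][j] ^ term[i][j] for j in range(n)] for i in range(n)]
    return mat_mul(mat_mul(A, F), B)

# Hypothetical network: input 1 -> e1 -> e3 -> output 1, input 2 -> e2 -> output 2.
A = [[1, 0, 0], [0, 1, 0]]
K = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]
B = [[0, 0], [0, 1], [1, 0]]
M = transfer_matrix(A, K, B)      # entries: 0b10 is z, 0b1 is 1
```

Here M is diag(z, 1): the two outputs at the sink carry symbols of different generations, which is the situation that memory at the nodes is meant to remove.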
B. Network code for networks with delay and memory
We define the instantaneous counterpart of a unit-delay
network as follows.
Definition 2: Given a unit-delay network G(V , E), the net-
work obtained from G (having the same node set V and the
same edge set E) by removing the delays associated with the
edges is defined as the instantaneous counterpart of G(V , E).
Example 1: Fig. 1 illustrates an example. A modified but-
terfly unit-delay network (top) and its instantaneous counter-
part (bottom) are shown. The global kernels of the incoming
edges to the sinks T1 and T2 corresponding to a 2 dimensional
network code are indicated for both networks.
Fig. 1. The figure corresponding to Example 1 (A unit-delay network and
its instantaneous counterpart).
Let Gm(V , E) be a single-source, acyclic network with every
edge of the network having some delay (a positive integer) and
with memory elements at the nodes available for usage. If none
of the memory elements at the nodes are used, then we can
model Gm as a unit-delay, memory-free network Gu. Let Ginst
be the instantaneous counterpart of Gu. The following lemma
ensures the equivalence of a network code between Ginst and
Gu.
Lemma 1 ( [4] ): Let G′(V , E) be a single-source acyclic,
unit-delay, memory-free network, and G′inst be the instanta-
neous counterpart of G′. Let N be the set of all δI(v)×δO(v)
matrices ∀ v ∈ V , i.e, the set of local encoding kernel matrices
at each node, describing an m-dimensional network code (over
Fq) for G′inst (m ≤ min-cut of the source-sink connections in
G′inst). Then the network code described by N continues to
be an m-dimensional network code (over Fq(z)) for the unit-
delay, memory-free network G′.
If the nodes use memory elements such that inter-generation
network coding is prevented at any particular node of the
network, then this leads to single-generation network coding
in the network.
In Section V we give an algorithm which uses memory
elements at the nodes to achieve single-generation network
coding, i.e., the network transfer matrix MT (z) of every sink
T ∈ T in Gm becomes
\[ M_T(z) = z^{L_T} M_T \tag{1} \]
where LT is some positive integer and MT is the network
transfer matrix of the sink T in Ginst. Clearly, if MT is full
rank (over Fq), so is MT (z) (over Fq(z)).
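Condition (1) is easy to test mechanically. The sketch below does so over F2, where a matrix of the form z^L M_T has every nonzero entry equal to z^L. Polynomials are stored as integers (bit k is the coefficient of z^k); that encoding and the function name are assumptions. The two test matrices are the sink T1 entries of Table I.

```python
def single_generation_delay(M):
    """Return L if M(z) = z^L * (constant 0/1 matrix) over F2, else None."""
    delays = set()
    for row in M:
        for p in row:
            if p == 0:
                continue
            if p & (p - 1):               # more than one power of z in an entry
                return None
            delays.add(p.bit_length() - 1)
    return delays.pop() if len(delays) == 1 else None

def z(k):
    return 1 << k

M_T1_before = [[z(1), z(3)], [0, z(4)]]   # memory-free network
M_T1_after = [[z(4), z(4)], [0, z(4)]]    # z^4 times [[1, 1], [0, 1]]
```

`single_generation_delay(M_T1_before)` reports no common delay, while the matrix after Algorithm 2 gives L = 4.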
III. MEMORY ADDITIONS AT A NODE
For the source node s, let Γ˜I(s) denote the set of n virtual
incoming edges which denote the n inputs. The global kernels
of these edges are therefore the columns of an n× n identity
matrix over Fq, the field over which the network code is
defined. For every non-source node v ∈ V , let Γ˜I(v) = φ.
For a sink T ∈ T , let Γ˜O(T ) denote n virtual outgoing edges
denoting the n outputs at sink T. The global kernels of these
edges are the columns of the network transfer matrix MT (z).
For every non-sink node v ∈ V , let Γ˜O(v) = φ. We then
define the set E˜ as
\[ \tilde{\mathcal{E}} := \mathcal{E} \cup \tilde{\Gamma}_I(s) \cup \Big( \bigcup_{T \in \mathcal{T}} \tilde{\Gamma}_O(T) \Big). \]
The ancestral ordering on E can then be extended to an
ancestral ordering on E˜ .
For any ei, ej ∈ E˜ such that head(ei) = tail(ej) = v ∈ V ,
with memory being used at v, the local kernel Aei,ej (the
kernel coefficient between ei ∈ Γ˜I(s) and ej ∈ ΓO(s) with
s = v), Kei,ej or Bvei,ej (the kernel coefficient between ei
and ej ∈ Γ˜O(v) for some sink node v) can have elements
from Fq(z). We show in Section V that using the memory
elements at the nodes according to Subsection III-A and
Subsection III-B is sufficient to guarantee single-generation
network coding at each node and therefore in the given
network.
A. Adding memory at a node for a pair of an incoming and
an outgoing edge
For any ei, ej ∈ E˜ such that head(ei) = tail(ej) = v ∈ V ,
we define Mei,ej as the number of memory elements utilized
at the node v to delay the symbols coming from the incoming
edge ei (before any network coding is performed at node v
on the symbols from ei) such that the local kernel between ei
and ej is modified in one of the following ways
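\begin{align}
A_{e_i,e_j} &\mapsto z^{M_{e_i,e_j}} A_{e_i,e_j} && \text{if } e_i \in \tilde{\Gamma}_I(s),\ e_j \in \mathcal{E} \tag{2} \\
K_{e_i,e_j} &\mapsto z^{M_{e_i,e_j}} K_{e_i,e_j} && \text{if } e_i, e_j \in \mathcal{E} \tag{3} \\
B^{v}_{e_i,e_j} &\mapsto z^{M_{e_i,e_j}} B^{v}_{e_i,e_j} && \text{if } e_i \in \mathcal{E},\ e_j \in \tilde{\Gamma}_O(v) \tag{4}
\end{align}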
while none of the other local kernels are changed. The matrix
F (z) = (I − zK)−1 is also correspondingly modified.
B. Adding memory at a node for an outgoing edge
For ej ∈ ΓO(v) ∪ Γ˜O(v), we define Mej ,tail(ej ) as the
number of memory elements added at node v to delay the
symbols going into the edge ej after performing network
coding at v. In such a case, the elements of the matrix K
(or of the matrix A, or B^v) are modified according to the
following rule.
\begin{align}
A_{e_i,e_j} &\mapsto z^{M_{e_j,\mathrm{tail}(e_j)}} A_{e_i,e_j} && \forall e_i \in \Gamma_{I,e_j}(v), \text{ if } v = s \tag{5} \\
K_{e_i,e_j} &\mapsto z^{M_{e_j,\mathrm{tail}(e_j)}} K_{e_i,e_j} && \forall e_i \in \Gamma_{I,e_j}(v), \text{ if } v \neq s,\ e_j \in \Gamma_O(v) \tag{6} \\
B^{v}_{e_i,e_j} &\mapsto z^{M_{e_j,\mathrm{tail}(e_j)}} B^{v}_{e_i,e_j} && \forall e_i \in \Gamma_{I,e_j}(v), \text{ if } e_j \in \tilde{\Gamma}_O(v) \tag{7}
\end{align}
where the set ΓI,ej (v) ⊆ ΓI(v) ∪ Γ˜I(v) is defined in (8).
The elements of the matrix F (z) are also
correspondingly modified.
Example 2: Fig. 2 illustrates an example of the memory
additions at a node. The memory elements indicated inside
the box labeled ‘A’ are added at the node for the pair of edges
ei and ej thereby delaying the symbols on ei before network
coding at the node, i.e., Mei,ej = 2. Similarly the memory
element indicated by ‘C’ is added for the pair of edges ei and
ek, i.e., Mei,ek = 1. The memory element indicated by ‘B’
is added for the outgoing edge ej after network coding, i.e.,
Mej ,tail(ej) = 1.
Fig. 2. The figure corresponding to Example 2 (Adding memory at a node).
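The two kinds of memory addition in Subsections III-A and III-B amount to multiplying local kernel entries by powers of z. The sketch below applies the delays of Example 2 to a node's local kernel over F2. The dictionary layout (only nonzero entries are stored), the bitmask encoding of polynomials, and the second incoming edge `el` are assumptions, since the exact kernel of Fig. 2 is not reproduced here.

```python
def delay_pair(kernel, e_in, e_out, m):
    """Rules (2)-(4): m memory elements on e_in before coding towards e_out."""
    new = dict(kernel)
    new[(e_in, e_out)] = kernel[(e_in, e_out)] << m      # multiply by z^m
    return new

def delay_outgoing(kernel, e_out, m):
    """Rules (5)-(7): m memory elements after coding, on outgoing edge e_out."""
    return {(i, j): (p << m if j == e_out else p)
            for (i, j), p in kernel.items()}

# Hypothetical node: ei and el are combined onto ej, and ei is forwarded on ek.
kernel = {("ei", "ej"): 1, ("el", "ej"): 1, ("ei", "ek"): 1}
kernel = delay_pair(kernel, "ei", "ej", 2)       # box 'A': M_{ei,ej} = 2
kernel = delay_pair(kernel, "ei", "ek", 1)       # box 'C': M_{ei,ek} = 1
kernel = delay_outgoing(kernel, "ej", 1)         # box 'B': M_{ej,tail(ej)} = 1
```

After the three steps the entry for (ei, ej) is z^3, the entry for (el, ej) is z, and the entry for (ei, ek) is z.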
IV. MEMORY REDUCTION AND DISTRIBUTION
TECHNIQUES
In this section, we look at techniques to reduce the memory
used at the nodes of the network and the overall memory used
in the network and also to obtain a fairly uniform memory
usage distribution throughout the network.
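We define the maximum number of memory elements added
to delay the symbols coming from an edge ei ∈ E˜ into node
head(ei) = v as
\[ M_{e_i,\mathrm{head}(e_i),\max} := \max_{e_j \in \Gamma_{O,e_i}(v)} M_{e_i,e_j} \tag{9} \]
where ΓO,ei(v) is defined in (10). The sets ΓI,ej (v) and
ΓO,ei(v) are
\[ \Gamma_{I,e_j}(v) := \{ e_i \in \Gamma_I(v) \mid K_{e_i,e_j} \neq 0 \} \cup \{ e_i \in \tilde{\Gamma}_I(v) \mid A_{e_i,e_j} \neq 0 \} \tag{8} \]
\[ \Gamma_{O,e_i}(v) := \{ e_j \in \Gamma_O(v) \mid K_{e_i,e_j} \neq 0 \} \cup \{ e_j \in \tilde{\Gamma}_O(v) \mid B^{v}_{e_i,e_j} \neq 0 \} \tag{10} \]
We define the total number of memory elements used at node
v as
\[ M_v = \sum_{e_i \in \Gamma_I(v) \cup \tilde{\Gamma}_I(v)} M_{e_i,\mathrm{head}(e_i),\max} + \sum_{e_j \in \Gamma_O(v) \cup \tilde{\Gamma}_O(v)} M_{e_j,\mathrm{tail}(e_j)}. \]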
A. Memory reduction in a single node
Consider a node v ∈ V in which memory elements have
been added to delay symbols coming from an edge ei ∈
ΓI(v) ∪ Γ˜I(v).
Then, retaining the Mei,head(ei),max(as defined in (9))
memory elements, all other memory elements placed on ei
can be removed without any change in any local or global
kernels by tapping symbols from the Mei,head(ei),max memory
elements wherever necessary. Doing this for every incoming
edge of v is equivalent to obtaining a minimal encoder (one
with minimum number of memory elements) of the transfer
function (input-output relationship) at node v.
Example 3: Fig. 3 illustrates a particular example of such
a reduction. The figure on the top (all ai ∈ Fq) represents
a node v before memory reduction with Mv = 3, while the
figure on the bottom is the same node after memory reduction
with Mv = 2.
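The saving from tapping a shared delay line can be checked with a few lines of code. In this sketch, `pair_delays` maps each (incoming edge, outgoing edge) pair to Mei,ej. The specific values are an assumption, chosen so that the counts match the 3 and 2 quoted in Example 3; the actual configuration in Fig. 3 may differ.

```python
def memory_without_sharing(pair_delays):
    """Separate delay line per (incoming, outgoing) pair."""
    return sum(pair_delays.values())

def memory_with_tapping(pair_delays):
    """One delay line per incoming edge, of length M_{ei,head(ei),max} as in (9)."""
    longest = {}
    for (e_in, _), m in pair_delays.items():
        longest[e_in] = max(longest.get(e_in, 0), m)
    return sum(longest.values())

# Assumed configuration: ei is delayed by 2 towards ej and by 1 towards ek.
pair_delays = {("ei", "ej"): 2, ("ei", "ek"): 1}
before = memory_without_sharing(pair_delays)
after = memory_with_tapping(pair_delays)
```

The symbol needed by ek is tapped after the first memory element of the two-element line serving ej, so only two elements are needed in total.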
B. Memory reduction between nodes
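For a set of edges E ′ ⊆ E˜ , let VE′ be the set of all nodes
defined as follows
\[ \mathcal{V}_{\mathcal{E}'} = \{ \mathrm{head}(e_j) \mid e_j \in \mathcal{E}' \} \tag{11} \]
We now define Mei,head(ei),min and ME′ as follows.
\[ M_{e_i,\mathrm{head}(e_i),\min} := \min_{e_j \in \Gamma_{O,e_i}(v)} M_{e_i,e_j} \tag{12} \]
\[ M_{\mathcal{E}'} := \min_{e_j \in \mathcal{E}'} M_{e_j,\mathrm{head}(e_j),\min} \tag{13} \]
where ΓO,ei(v) is as defined in (10).
For a node v ∈ V , we define the set of adjacent nodes of v
as the set of nodes
\[ \mathcal{E}_v := \{ v' \mid v' = \mathrm{head}(e_j) \text{ for some } e_j \in \Gamma_O(v) \}. \]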
1) Memory reduction between adjacent nodes: For a node
v ∈ V , and for some Γ′O(v) ⊆ ΓO(v) ∪ Γ˜O(v), let Γ′I(v) ⊆
ΓI(v) ∪ Γ˜I(v) be defined as
\[ \Gamma'_I(v) = \bigcup_{e_j \in \Gamma'_O(v)} \Gamma_{I,e_j}(v) \]
where ΓI,ej (v) is as in (8), i.e., the global kernels of the edges
ej ∈ Γ′O(v) are linear combinations of the global kernels of
the edges in Γ′I(v) only and none else. Also let MΓ′O(v) and
the set VΓ′O(v) ⊆ Ev of nodes be defined for the set of edges
Γ′O(v) as in (13) and (11) respectively.
Fig. 3. The figure corresponding to Example 3 (Memory reduction at a
node).
We define the term Mei,Γ′O(v) as
\[ M_{e_i,\Gamma'_O(v)} = \max \{ 0,\ M_{\Gamma'_O(v)} - M_{e_i,\mathrm{head}(e_i),\max} \} \tag{14} \]
Then, if the condition
\[ \sum_{e_i \in \Gamma'_I(v)} M_{e_i,\Gamma'_O(v)} \le M_{\Gamma'_O(v)} |\Gamma'_O(v)| \tag{15} \]
is satisfied, all of the |Γ′O(v)| MΓ′O(v) memory elements used
at the nodes VΓ′O(v) (to
delay symbols coming from the edges ej ∈ Γ′O(v)) can be
‘absorbed’ into node v by removing all these memory elements
and adding Mei,Γ′O(v) memory elements at node v for every
ei ∈ Γ′I(v) (and thereby used for delaying the symbols coming
from every ei ∈ Γ′I(v)), without using any additional memory
and without changing the global kernels of any outgoing edge
of any node in VΓ′O(v).
This technique of ‘absorption’ of the memory elements from
a set of nodes which are the ‘heads’ of the outgoing edges from
a node v, to the node v itself, is beneficial in terms of reducing
the overall memory usage of the network (to achieve single-
generation network coding) if the condition (15) is satisfied as
a strict inequality.
Example 4: Fig. 4 illustrates an example for memory
reduction between multiple nodes (v1, v2, v3 and v4 here) of a
network. Here MΓ′O(v) = 1, |Γ′O(v)| = 3, and Me1,Γ′O(v) =
Me2,Γ′O(v) = 1. Therefore, three memory elements at nodes
v2, v3 and v4 are ‘absorbed’ into two memory elements at
node v1. The boxes indicate the use of memory elements and
the node to which the memory elements are attached.
Fig. 4. The figure corresponding to Example 4 (Memory reduction between
adjacent nodes).
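The absorption test of (14) and (15) reduces to simple arithmetic. A minimal sketch follows; the function name and argument layout are assumptions, and the values are those of Example 4 with Mei,head(ei),max = 0 for both incoming edges, which is what Me1,Γ′O(v) = Me2,Γ′O(v) = 1 implies through (14).

```python
def absorption_check(m_out_min, n_out, m_in_max):
    """Evaluate condition (15).

    m_out_min : the value M_{Gamma'_O(v)} from (13)
    n_out     : number of edges in Gamma'_O(v)
    m_in_max  : list of M_{ei,head(ei),max} for every ei in Gamma'_I(v)
    Returns (condition holds, memory elements saved by absorbing).
    """
    added = [max(0, m_out_min - m) for m in m_in_max]    # equation (14)
    freed = m_out_min * n_out
    return sum(added) <= freed, freed - sum(added)

ok, saved = absorption_check(m_out_min=1, n_out=3, m_in_max=[0, 0])
```

For Example 4 the condition holds strictly: three elements are removed, two are added, and one element is saved.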
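2) Memory reduction between nodes not necessarily
adjacent: For EI , EO ⊂ E˜ being two sets of edges, we say that
they form a pair [EI , EO] if
\[ \mathcal{E}_I = \bigcup_{e_j \in \mathcal{E}_O} \Gamma_{I,e_j}(\mathrm{tail}(e_j)) \quad \text{and} \quad \mathcal{E}_O = \bigcup_{e_i \in \mathcal{E}_I} \Gamma_{O,e_i}(\mathrm{head}(e_i)). \]
We say that the sets EI , EO form a pair [EI , EO ) if
\[ \mathcal{E}_I = \bigcup_{e_j \in \mathcal{E}_O} \Gamma_{I,e_j}(\mathrm{tail}(e_j)) \quad \text{and} \quad \mathcal{E}_O \subset \bigcup_{e_i \in \mathcal{E}_I} \Gamma_{O,e_i}(\mathrm{head}(e_i)). \]
For a node v, we define the set Pv as follows
\[ P_v := \{ [\Gamma_{I_i}(v), \Gamma_{O_i}(v)] \mid 1 \le i \le s_v \} \]
such that the following conditions are satisfied
\begin{align}
\Gamma_{I_i}(v) \cap \Gamma_{I_j}(v) &= \phi, \quad \forall\ 1 \le i, j \le s_v,\ i \neq j \tag{16} \\
\Gamma_{O_i}(v) \cap \Gamma_{O_j}(v) &= \phi, \quad \forall\ 1 \le i, j \le s_v,\ i \neq j \tag{17}
\end{align}
where sv is the maximum number of sets satisfying conditions
(16) and (17). Algorithm 1 obtains the set Pv for some node v.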
Example 5: Fig. 5 illustrates a node v with the local kernel
matrix over some field Fq. For this node, the set Pv is given
as
Pv = {[ΓI1(v),ΓO1(v)] , [ΓI2(v),ΓO2 (v)]}
where
ΓI1(v) = {e1, e2, e3} ΓO1(v) = {e5}
ΓI2(v) = {e4} ΓO2(v) = {e6, e7, e8} .
Fig. 5. The figure corresponding to Example 5 which gives the set Pv of
the node v.
For a pair of edge-sets [ΓIi(v),ΓOi (v)] ∈ Pv, we define
Si(v), a sequence of pairs of edge-sets, as
\[ S_i(v) := [\mathcal{E}_{i_m}, \mathcal{E}_{i_{m-1}}),\ [\mathcal{E}_{i_{m-1}}, \mathcal{E}_{i_{m-2}}],\ \ldots,\ [\mathcal{E}_{i_2}, \mathcal{E}_{i_1}],\ [\mathcal{E}_{i_1}, \mathcal{E}_{o_1}] \tag{18} \]
where [Ei1 , Eo1 ] = [ΓIi(v),ΓOi(v)] , and m is the maximum
length of the sequence that can be obtained as in
(18) for the edge-set pair [ΓIi(v),ΓOi (v)] .
Let k be an integer such that
\[ |\mathcal{E}_{i_k}| = \min_{1 \le j \le m} |\mathcal{E}_{i_j}|. \]
For the set ΓOi(v), let MΓOi (v) be defined as in (13), and
the set of nodes VΓOi (v) be defined as in (11). Let the set
of nodes VEik be defined as in (11) for the set Eik . Also, let
Meik ,ΓOi (v) be defined as in (14) for the set ΓOi(v) and for
an edge eik ∈ Eik . As in the memory reduction procedure of
adjacent nodes, if
\[ \sum_{e_{i_k} \in \mathcal{E}_{i_k}} M_{e_{i_k},\Gamma_{O_i}(v)} \le M_{\Gamma_{O_i}(v)} |\Gamma_{O_i}(v)| \tag{19} \]
then the |ΓOi(v)| MΓOi(v) memory elements used at the nodes
VΓOi (v) (to delay symbols coming from the edges ej ∈ ΓOi(v))
can be removed without changing the global kernels of the edges of
ΓO(v′), ∀ v′ ∈ VΓOi (v), by adding Meik ,ΓOi (v) memory
elements for each edge eik ∈ Eik at the node head(eik) ∈ VEik .
This technique will save memory if the condition (19) is
satisfied as a strict inequality.

Input: A node v ∈ V with the edge sets ΓI(v) ∪ Γ˜I(v) and ΓO(v) ∪ Γ˜O(v).
Output: The set Pv for the node v.
Let i = 1, Out(v) = ΓO(v) ∪ Γ˜O(v), Pv = φ.
repeat
    Let ΓIi(v) = ΓOi(v) = φ.
    For some ej ∈ Out(v), let ΓIi(v) = ΓI,ej (v).
    repeat
        Let ΓOi(v) = ⋃ ΓO,ei(v), the union being over ei ∈ ΓIi(v).
        Let ΓIi(v) = ⋃ ΓI,ej (v), the union being over ej ∈ ΓOi(v).
    until the sets ΓIi(v) and ΓOi(v) remain unchanged for 2 consecutive iterations;
    Let Pv = Pv ∪ {[ΓIi(v),ΓOi (v)]} .
    Let Out(v) = Out(v)\ΓOi (v) and i = i+ 1.
until Out(v) = φ;
Algorithm 1. Algorithm to obtain the set Pv for a node v.
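Algorithm 1 groups the incoming and outgoing edges of a node into the connected blocks of its local kernel. The following is a rough Python sketch of that procedure. The `support` set (pairs with a nonzero local kernel coefficient) is an assumed encoding, and the values below are chosen to reproduce the sets of Example 5 rather than read from Fig. 5. As a small deviation from the listing, the seed edge is always kept in the output set so that the loop also terminates for an outgoing edge with no contributing input.

```python
def edge_set_pairs(support, outgoing):
    """Return the list of pairs (Gamma_Ii(v), Gamma_Oi(v)) forming P_v.

    support  : set of (e_in, e_out) with nonzero local kernel coefficient
    outgoing : list of outgoing (real or virtual) edges of the node
    """
    remaining = list(outgoing)
    pairs = []
    while remaining:
        seed = remaining[0]
        ins = {i for (i, j) in support if j == seed}
        outs = {seed}
        while True:
            new_outs = outs | {j for (i, j) in support if i in ins}
            new_ins = {i for (i, j) in support if j in new_outs}
            if new_ins == ins and new_outs == outs:       # sets are stable
                break
            ins, outs = new_ins, new_outs
        pairs.append((ins, outs))
        remaining = [e for e in remaining if e not in outs]
    return pairs

# Assumed kernel support consistent with Example 5.
support = {("e1", "e5"), ("e2", "e5"), ("e3", "e5"),
           ("e4", "e6"), ("e4", "e7"), ("e4", "e8")}
P_v = edge_set_pairs(support, ["e5", "e6", "e7", "e8"])
```

With this support the result is the two pairs [{e1, e2, e3}, {e5}] and [{e4}, {e6, e7, e8}].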
Example 6: Fig. 6 illustrates an example for the memory
reduction procedure between non-adjacent nodes. Let
Kei,ej ≠ 0, ∀ 9 ≤ i ≤ 12, 13 ≤ j ≤ 15. In the
example, for the node v3, the set Pv3 and the sequence S1(v3)
corresponding to the only element of Pv3 are given by (20)
and (21).
Now, we have MΓO1(v3) = 1, |ΓO1(v3)| = 3, Eik = {e1}
and Me1,ΓO1(v3) = 1. Therefore, the 3 memory elements used
for the edges in ΓO1(v3) at the nodes v4, v5, and v6 are ‘absorbed’
into a single memory element used at node v1 for edge e1,
thus reducing the memory usage by 2.
Remark 1: The memory reduction procedures of Subsub-
section IV-B1, and Subsubsection IV-B2 can sometimes result
in exactly the same memory reduction event. However, there
could be instances in which only one of the procedures can
achieve memory reduction.
For example, the memory reduction procedure of
Subsubsection IV-B1 cannot reduce memory at node v3 in the
situation shown in Example 6 because for any Γ′O(v) ⊆ ΓO(v),
|Γ′I(v)| > 3 ≥ |Γ′O(v)|, since Γ′I(v) = ΓI(v). However, the
memory reduction procedure of Subsubsection IV-B2 does
work as shown in Fig. 6.
Similarly, in some cases, at a node, the procedure of
Subsubsection IV-B1 can be used to reduce memory usage,
while Subsubsection IV-B2 cannot be applied. This is because
of the fact that, at any node, the procedure of Subsubsection
IV-B2 takes into account only those sets of the form Pv , while
the procedure of Subsubsection IV-B1 takes into account all
possible incoming and outgoing edges. Such a case is seen in
Example 7.
Example 7: Fig. 7 shows the node v of Fig. 5 (Example 5)
in a particular configuration. The memory reduction procedure
of Subsubsection IV-B2 cannot be applied for the set ΓO,2(v)
because MΓO,2(v) = 0.
But M{e6,e7} = 1, and therefore 2 memory elements at node
v1 and v2 can be absorbed into a single memory element at
node v, thereby facilitating memory reduction according to
Subsubsection IV-B1.
Fig. 7. The figure corresponding to Example 7. The box with the incoming
edges e1, e2, e3, and e4 represents the node v of Fig. 5 (Example 5).
\[ P_{v_3} = \{ [\Gamma_{I_1}(v_3) = \{e_9, e_{10}, e_{11}, e_{12}\},\ \Gamma_{O_1}(v_3) = \{e_{13}, e_{14}, e_{15}\}] \} \tag{20} \]
\[ S_1(v_3) = [\{e_1\}, \{e_2, e_3\}),\ [\{e_2, e_3\}, \{e_5, e_6, e_7, e_8\}],\ [\{e_5, e_6, e_7, e_8\}, \Gamma_{I_1}(v_3)],\ [\Gamma_{I_1}(v_3), \Gamma_{O_1}(v_3)] \tag{21} \]
Fig. 6. The figure corresponding to Example 6 (Memory reduction between non-adjacent nodes).
C. Memory distribution
The following technique can be used to distribute memory
elements throughout the network in a somewhat uniform way.
Suppose there exists a node v ∈ V such that for some
ej ∈ ΓO(v) with v′ = head(ej) and for some integer
m ≤ Mej ,head(ej),min,
\[ M_v + m \le M_{v'} - m \tag{22} \]
then the m memory elements at node v′ used to delay symbols
coming from edge ej can be ‘absorbed’ into node v (thereby
using them to delay symbols going into edge ej) without
changing the global kernels of any edge in ΓO(v′).
This technique reduces the number of memory elements
used at node v′ for delaying its incoming symbols while
increasing the number (Mej ,tail(ej)) of memory elements used
at node v for delaying its outgoing symbols.
Example 8: Fig. 8 illustrates an example for memory
distribution between two nodes v1 and v2. In the figure on the
top, m = 1,Mv1 = 0, and Mv2 = 3. Therefore one memory
element from v2 (used to delay symbols coming from ej into
v2) can be ‘absorbed’ into node v1 (and thereby used to delay
symbols going into ej from v1). The boxes indicate the node
to which the memory elements are attached. After distribution,
Mv1 = 1, and Mv2 = 2.
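Condition (22) can be turned into a small helper that decides how many memory elements to shift upstream. The paper only asks for some integer m satisfying (22); picking the largest such m is a choice made for this sketch, and the bound `m_available` (standing in for Mej,head(ej),min) is an assumed value since Example 8 does not state it.

```python
def memory_to_shift(m_v, m_v_next, m_available):
    """Largest m <= m_available with m_v + m <= m_v_next - m, or 0 if none."""
    best = 0
    for m in range(1, m_available + 1):
        if m_v + m <= m_v_next - m:      # condition (22)
            best = m
    return best

# Example 8: M_{v1} = 0 and M_{v2} = 3 before distribution.
m = memory_to_shift(m_v=0, m_v_next=3, m_available=3)
M_v1, M_v2 = 0 + m, 3 - m
```

One element moves from v2 to v1, giving the counts 1 and 2 stated in Example 8. The total is unchanged; only the distribution becomes more even.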
V. SINGLE-GENERATION NETWORK CODING - ALGORITHM
This section presents the main contribution of this paper.
For an edge ei ∈ E , let fei(z) ∈ Fq^n(z) represent the global
kernel of ei. We say that a node v ∈ V\ {s} is a coding node
if the global kernel of at least one of its outgoing edges is an
Fq(z)-linear combination of the global kernels of at least two
of its incoming edges. Otherwise, we call v a forwarding node.
Let Vcod be the set of coding nodes, and Vfwd be the set
of forwarding nodes. Let V^0_cod be the set of all coding nodes
such that there exists no path in the network from any other
coding node to any node in V^0_cod.
Towards proposing an algorithm to enable single-generation
network coding, we make some observations and discuss the
addition of memory elements at the coding nodes to achieve
single-generation network coding.
Fig. 8. The figure corresponding to Example 8 (Memory distribution).
Observation 1: For any v ∈ V^0_cod, the global kernel of any
e ∈ ΓI(v) is of the form
\[ f_e(z) = z^{l_e} f_e \tag{23} \]
for some positive integer le, with fe ∈ Fq^n. If the network is a
unit-delay network and the node v uses no memory, the global
kernel of any ej ∈ ΓO(v) is of the form
\[ f_{e_j}(z) = \sum_{e_i \in \Gamma_I(v)} z K_{e_i,e_j} f_{e_i}(z) = \sum_{e_i \in \Gamma_I(v)} K_{e_i,e_j} z^{l_{e_i}+1} f_{e_i} \tag{24} \]
where lei is a positive integer signifying accumulated delay
from the source to edge ei, and Kei,ej ∈ Fq signifies the local
kernel coefficient between ei and ej . The additional z is to
account for the delay in the unit-delay network.
A. Single-generation processing at the nodes
For every pair of edges ei, ei′ ∈ ΓI,ej (v) (ΓI,ej(v) being
as in (8)) in (24) such that lei < lei′ , we may add Mei,ej =
lei′ − lei memory elements at node v to delay the symbols
coming from ei such that the global kernel of the edge ej
becomes
\[ f_{e_j}(z) = z^{l_{e_j,\max}+1} \sum_{e_i \in \Gamma_I(v)} K_{e_i,e_j} f_{e_i} \tag{25} \]
where lej ,max = max lei , the maximum being over ei ∈ ΓI,ej (v),
and Kei,ej ∈ Fq. Once
this process of using memory at the node v results in the global
kernel of every edge in ΓO(v) being a linear combination
of symbols from the same generation (generations between
different outgoing edges need not be the same), we say that
single-generation processing has been achieved at node v. For
a node T ∈ T , we say that single-generation processing has
been achieved at sink T if the condition (1) is satisfied along
with condition (25) for each ej ∈ ΓO(T ).
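The step from (24) to (25) can be sketched as follows: for each outgoing edge, every contributing incoming edge is delayed up to the slowest one. The function and argument names are assumptions, and the delays in the example are hypothetical.

```python
def equalize_inputs(arrival_delay, inputs_of):
    """Single-generation processing at one node.

    arrival_delay : accumulated delay l_e of every incoming edge
    inputs_of     : for each outgoing edge e_j, the edges in Gamma_{I,e_j}(v)
    Returns (memory per (e_i, e_j) pair, delay exponent of each outgoing edge).
    """
    memory, out_delay = {}, {}
    for e_out, ins in inputs_of.items():
        l_max = max(arrival_delay[e] for e in ins)
        for e_in in ins:
            memory[(e_in, e_out)] = l_max - arrival_delay[e_in]   # M_{ei,ej}
        out_delay[e_out] = l_max + 1      # exponent of z in (25)
    return memory, out_delay

# Hypothetical coding node: e1 arrives with delay 1, e2 with delay 3.
memory, out_delay = equalize_inputs({"e1": 1, "e2": 3}, {"e3": ["e1", "e2"]})
```

Here two memory elements are placed on e1 and none on e2, and the global kernel of e3 becomes z^4 times a vector over Fq.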
Observation 2: We iteratively define the set V^i_cod ⊆ Vcod
as the set of coding nodes which have paths only from
\[ \Big( \bigcup_{j=0}^{i-1} \mathcal{V}^{j}_{cod} \Big) \cup \mathcal{V}_{fwd} \]
where V^0_cod is as defined before. Once memory has been used
to achieve single-generation processing at all nodes in V^{i−1}_cod,
it can be observed that the global kernels of the incoming
and outgoing edges of any node v ∈ V^i_cod satisfy the same
condition as in (23) and (24).
Thus again memory elements can be used at the nodes
of V^i_cod to implement single-generation processing, ultimately
achieving single-generation processing at each coding node of
the network.
B. Algorithm for single-generation network coding
Algorithm 2 is used to achieve
single-generation network coding using memory at the nodes
of the network, while trying to minimize the total number of
memory elements used in the network.
Remark 2: Algorithm 2 assumes that every node has unlim-
ited memory to use and then tries to obtain a configuration that
reduces the number of memory elements used in the network.
However, if the maximum available memory in the nodes is
limited, then the following techniques may be adopted after
running Algorithm 2.
• In line 27 of the algorithm, instead of checking condition
(22) at every pair of nodes connected by some edge, the
actual memory capability of the nodes must be taken into
account and then the distribution procedure of Subsection
IV-C can be run.
• Finally, at every node in which the algorithm demands
more memory elements than what is available, sufficient
memory elements should be removed so that the total
memory used at the node is at most what is available. As
the penalty of removing these memory elements will be
compensated by the sinks, the memory elements that will
be removed at the nodes should ideally be such that the
compensation occurs in the least number of sinks in the
least possible quantity.
Example 9: Fig. 10, Fig. 11, and Fig. 12 represent the
network at various stages of the algorithm applied on a
modified double-butterfly network as shown in Fig. 9. The
modified unit-delay double-butterfly network shown in Fig. 10
has the standard network code over F2. s is the source node,
Ti, i = 1, 2, 3, 4 are the sinks. The dotted lines represent the
virtual input edges at the source and virtual output edges at
the sinks.
Table I shows the network transfer matrices before and after
obtaining single-generation processing using Algorithm 2.
Table I also shows a comparison between the memory require-
ments at the sinks (for decoding) between inter-generation
network coding (i.e., the memory-free case; the numbers shown
are the sum of the row degrees of realizable inverse matrices
in the third column) and single-generation network coding (as
shown in Fig. 12). In the memory-free case, assuming that
Input: A network Gm with delays and unused memory elements
Output: The network Gm with a single-generation network code using memory elements at nodes
 1  foreach v ∈ Vcod in the ancestral order do
 2      Introduce sufficient memory elements at node v accordingly as in Subsection V-A in order to enable single-generation processing at node v.
 3      foreach ei ∈ ΓI(v) ∪ Γ˜I(v) do
 4          Run the memory reduction procedure as in Subsection IV-A.
 5      end
 6  end
 7  Now the global kernel of any edge ej ∈ ΓI(T ) of any sink T is of the form fej(z) = z^{Lej} fej for some positive integer Lej , with fej ∈ Fq^n.
 8  foreach T ∈ T do
 9      Add sufficient memory according to Subsection III-A and Subsection III-B such that single-generation processing is achieved at the sink T.
10  end
11  foreach v ∈ V in the reverse-ancestral order do
12      foreach pair of edge-sets [ΓIi(v),ΓOi(v)] ∈ Pv do
13          if condition (19) is satisfied then
14              Run the memory reduction procedure as in Subsubsection IV-B2.
15          end
16      end
17  end
18  foreach v ∈ V in the reverse-ancestral order do
19      foreach subset Γ′O(v) ⊆ ΓO(v) ∪ Γ˜O(v) do
20          if condition (15) is satisfied then
21              Run the memory reduction procedure as in Subsubsection IV-B1.
22          end
23      end
24  end
25  foreach v ∈ V in the ancestral order do
26      foreach ej ∈ ΓO(v) do
27          if condition (22) is satisfied then
28              Run the memory distribution procedure at v as in Subsection IV-C.
29          end
30      end
31  end
32  foreach v ∈ V in the ancestral order do
33      foreach ej ∈ ΓO(v) ∪ Γ˜O(v) do
34          foreach ei ∈ ΓI,ej (v) do
35              Update the corresponding elements in A, K , and Bv matrices according to (2), (3), and (4) of Subsection III-A upon calculating Mei,ej .
36          end
37          Update the corresponding elements in A, K , and Bv matrices according to (5), (6), and (7) of Subsection III-B upon calculating Mej ,tail(ej).
38      end
39  end
Algorithm 2. Algorithm for using memory at nodes to obtain a single-generation network code
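Lines 1 to 10 of Algorithm 2 can be sketched as a single pass over the edges in ancestral order. This is only an outline under simplifying assumptions: edges leaving the source are taken to have delay 0 (following F(z) = (I − zK)^{-1}), each sink output is assumed to tap exactly one incoming edge, and the later reduction and distribution steps (lines 11 to 31) are not modelled. The network below is a hypothetical modified butterfly, not the network of Fig. 9.

```python
def single_generation_pass(edges, feeds, sinks):
    """Count memory needed after lines 1-10 of Algorithm 2.

    edges : edge names in ancestral order
    feeds : feeds[e] lists the incoming edges combined onto e (none for source edges)
    sinks : sinks[T] lists the edges read by sink T
    Returns (memory at internal nodes, memory per sink, latency L_T per sink).
    """
    delay = {}      # accumulated delay l_e of each edge
    tap = {}        # longest delay line per incoming edge (Subsection IV-A)
    for e in edges:
        ins = feeds.get(e, [])
        if not ins:                       # edge leaving the source
            delay[e] = 0
            continue
        l_max = max(delay[i] for i in ins)
        for i in ins:
            tap[i] = max(tap.get(i, 0), l_max - delay[i])
        delay[e] = l_max + 1              # unit delay of the edge itself
    internal = sum(tap.values())
    sink_memory, latency = {}, {}
    for t, read in sinks.items():
        latency[t] = max(delay[e] for e in read)
        sink_memory[t] = sum(latency[t] - delay[e] for e in read)
    return internal, sink_memory, latency

# Hypothetical network: e1 s->a, e2 s->b, e3 a->c, e4 b->x, e5 x->c,
# e6 a->t1, e7 b->t2, e8 c->d (coding at c), e9 d->t1, e10 d->t2.
edges = ["e1", "e2", "e3", "e4", "e5", "e6", "e7", "e8", "e9", "e10"]
feeds = {"e3": ["e1"], "e4": ["e2"], "e5": ["e4"], "e6": ["e1"],
         "e7": ["e2"], "e8": ["e3", "e5"], "e9": ["e8"], "e10": ["e8"]}
sinks = {"t1": ["e6", "e9"], "t2": ["e7", "e10"]}
internal, sink_memory, latency = single_generation_pass(edges, feeds, sinks)
```

In this example the coding node needs one memory element (on e3), each sink needs three, and both sinks see a latency of 4. The later stages of Algorithm 2 would then try to absorb the sink memory into upstream nodes.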
Fig. 9. Figure corresponding to Example 9. A modified double-butterfly network. The mapping between the incoming and outgoing symbols
(a1, a2, b1, b2, c1, c2 ∈ F2) at the nodes v4, T1, and v9 are shown.
Fig. 10. Figure corresponding to Example 9. After line 10 of Algorithm 2, single-generation network coding has been implemented in the network and all
the sinks see a network transfer matrix as in (1). Each box indicates the presence of memory elements at the associated node. The way sink T1 uses memory
is expanded below. Total memory used at this stage is 20.
Fig. 11. Figure corresponding to Example 9. The network after line 24 of the algorithm. Comparing this figure with Fig. 10, memory reduction according
to Subsubsection IV-B1 has resulted in the ‘absorption’ of memory elements from the nodes v4, T1, v7, v9, and T4. Total memory used in the network now
is 12.
Fig. 12. Figure corresponding to Example 9. The network at the end of Algorithm 2. The 12 memory elements used in Fig. 11 are further distributed
amongst the nodes of the network.
sinks use memory individually to decode, the total number of
memory elements used in the network is 19, and all of them
are used at the sinks. In the single-generation network coded
network as shown in Fig. 12, it can be seen that the total
number of memory elements used in the network is 12, out of
which only 7 are used at the sinks, thereby showing a marked
reduction from the memory-free case. The rest of the memory
elements (numbering 5) are distributed across the nodes of the
network.
C. Comparison with the approach of [5]
We can compare the straightforward approach of [5] and
our approach to obtaining a single-generation network coded
network for the modified unit-delay double-butterfly network
of Fig. 9. According to the technique in [5], the result would
be the network as in Fig. 10, thereby resulting in the use of 20
memory elements to obtain single-generation network coding.
However, our algorithm utilizes the memory reduction and
distribution techniques given in Section IV, and its output
is the network in Fig. 12, which uses 12 memory elements
with a more uniform distribution of memory elements across
the network than in Fig. 10. Although the overall memory usage
is reduced, it remains to be shown whether Algorithm 2
actually obtains a configuration of the network that uses the
minimum number of memory elements needed to obtain
single-generation network coding.
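TABLE I
COMPARING INTER (MEMORY-FREE) AND SINGLE-GENERATION NETWORK CODING (USING MEMORY) FOR THE NETWORK IN FIG. 9

| Sink | Network transfer matrix before Algorithm 2 | Realizable decoding matrix obtained from $M_T^{-1}(z)$ | Network transfer matrix after Algorithm 2 | No. of memory elements used before Algorithm 2 | No. of memory elements used after Algorithm 2 |
|---|---|---|---|---|---|
| T1 | $M_{T_1}(z) = \begin{pmatrix} z & z^3 \\ 0 & z^4 \end{pmatrix}$ | $P_{T_1}(z) = \begin{pmatrix} z^3 & z^2 \\ 0 & 1 \end{pmatrix}$ | $M_{T_1}(z) = z^4 \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ | 3 | 1 |
| T2 | $M_{T_2}(z) = \begin{pmatrix} z^3 & 0 \\ z^4 & z \end{pmatrix}$ | $P_{T_2}(z) = \begin{pmatrix} 1 & 0 \\ z^3 & z^2 \end{pmatrix}$ | $M_{T_2}(z) = z^4 \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$ | 3 | 2 |
| T3 | $M_{T_3}(z) = \begin{pmatrix} z^5 + z^8 & z^5 \\ z^9 & z^6 \end{pmatrix}$ | $P_{T_3}(z) = \begin{pmatrix} z & z^4 \\ 1 & 1 + z^3 \end{pmatrix}$ | $M_{T_3}(z) = z^9 \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$ | 7 | 2 |
| T4 | $M_{T_4}(z) = \begin{pmatrix} z^3 & z^5 + z^8 \\ 0 & z^9 \end{pmatrix}$ | $P_{T_4}(z) = \begin{pmatrix} z^6 & z^2 + z^5 \\ 0 & 1 \end{pmatrix}$ | $M_{T_4}(z) = z^9 \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ | 6 | 2 |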
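The decoding matrices listed in Table I can be checked mechanically. The following Python sketch is not part of the original work; it assumes polynomials over F2 are stored as integers (bit k holds the coefficient of z^k), and the helper names `z`, `poly_mul`, and `mat_mul` are illustrative. It multiplies the memory-free transfer matrices of sinks T1 and T2 by their decoding matrices and confirms that each product is a pure delay times the identity.

```python
# Polynomials over F_2 are stored as Python ints: bit k is the coefficient of z^k.
# All helper names below are illustrative assumptions, not notation from the paper.

def z(k):
    """Return the monomial z^k."""
    return 1 << k

def poly_mul(a, b):
    """Carry-less (XOR) product of two polynomials over F_2."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def mat_mul(A, B):
    """Product of two matrices whose entries are polynomials over F_2."""
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0
            for k in range(inner):
                acc ^= poly_mul(A[i][k], B[k][j])  # addition over F_2 is XOR
            C[i][j] = acc
    return C

# Entries copied from the T1 and T2 rows of Table I (before Algorithm 2).
M_T1 = [[z(1), z(3)], [0, z(4)]]
P_T1 = [[z(3), z(2)], [0, 1]]
M_T2 = [[z(3), 0], [z(4), z(1)]]
P_T2 = [[1, 0], [z(3), z(2)]]

prod_T1 = mat_mul(M_T1, P_T1)  # z^4 times the identity
prod_T2 = mat_mul(M_T2, P_T2)  # z^3 times the identity
print(prod_T1, prod_T2)
```

Under these assumptions the products come out as z^4 I for T1 and z^3 I for T2, so multiplying the received streams by these decoding matrices recovers the source symbols up to a fixed delay.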
VI. IMPACT OF SINGLE-GENERATION NETWORK CODING
ON NETWORK-ERROR CORRECTION
A. Impact on encoding
Construction of a CNECC: For details on the basics of convolutional
codes, we refer the reader to [6]. The construction
of a CNECC [4] for a given acyclic, unit-delay, memory-free
network which corrects error vectors corresponding to
a given set Φ of error patterns (an error pattern is a subset of
E indicating the edges in error) can be summarized as follows.
• Compute the set $W_s$ of error vector reflections given by
$$W_s = \bigcup_{T \in \mathcal{T},\, \rho \in \Phi} \left\{ w F_T(z)\, p_T(z)\, M_T^{-1}(z) \mid w \in \rho \right\}$$
where $w \in \mathbb{F}_q^{|E|}$ is an error vector, and $w \in \rho$ means
that w matches an error pattern ρ. $p_T(z) \in \mathbb{F}_q[z]$ (the
ring of polynomials) is some processing function chosen
such that the processing matrix $p_T(z) M_T^{-1}(z) = P_T(z)$
is a polynomial matrix.
• Let $t_s = \max_{w_s(z) \in W_s} w_H(w_s(z))$. Choose an input
convolutional code $C_s$ with free distance at least $2t_s + 1$
as the CNECC for the given network.
The following lemma gives a bound on ts and therefore the
free distance demanded of the CNECC.
Lemma 2 ([4]): Given an acyclic, unit-delay, memory-free
network G(V, E) with a given error pattern set Φ, let
$T_{delay} - 1$ be the maximum degree of any polynomial in
the F(z) matrix. Let $w_H$ indicate the Hamming weight over
$\mathbb{F}_q$. If r is the maximum number of non-zero coefficients of
the polynomials $p_T(z)$ corresponding to all sinks in $\mathcal{T}$, i.e.,
$r = \max_{T \in \mathcal{T}} w_H(p_T(z))$, then we have
$$t_s \leq r n \left[ (n + 1)(T_{delay} - 1) + 1 \right].$$
Algorithm 2 does not increase the value of $T_{delay}$ in the
matrix F(z), because no additional delay is introduced
on any path between nodes which are at a
distance of $T_{delay}$ edges (the maximum number of edges on
any path between any two nodes) from each other. Also, with
memory being introduced in the nodes according to Algorithm
2, the network transfer matrices at all the sinks are of the form
given in (1). Therefore the processing function at any sink
T is of the form $p_T(z) = z^{L_T}$, i.e., r = 1.
Therefore we have that, for the network with delay and
memory (used to achieve single-generation network coding),
$$t_s \leq n \left[ (n + 1)(T_{delay} - 1) + 1 \right].$$
Thus the bound for $t_s$, and therefore for the
free distance demanded of the CNECC, may be lower
for the unit-delay, single-generation network coded network
than for its unit-delay, memory-free counterpart (namely, when
r > 1 in the memory-free case). However,
a decrease in the actual value of $t_s$ cannot be guaranteed and
has to be computed for every network individually in order
to decide whether the CNECC designed for the unit-delay,
memory-free network will continue to work for the
single-generation network coded unit-delay counterpart.
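To make the effect of r on the bound of Lemma 2 concrete, the following Python sketch evaluates the right-hand side of the bound. The function name `ts_bound` and the numerical values n = 2, Tdelay = 4, and r = 2 are hypothetical, chosen only for illustration; they are not taken from any network in this paper.

```python
def ts_bound(n, t_delay, r=1):
    """Right-hand side of the Lemma 2 bound: r * n * [(n + 1)(T_delay - 1) + 1].

    n, t_delay and r are as in Lemma 2; r = 1 corresponds to the
    single-generation case, where p_T(z) is a single power of z.
    """
    return r * n * ((n + 1) * (t_delay - 1) + 1)

# Hypothetical values, for illustration only.
memory_free_bound = ts_bound(n=2, t_delay=4, r=2)        # 2 * 2 * (3 * 3 + 1)
single_generation_bound = ts_bound(n=2, t_delay=4, r=1)  # 2 * (3 * 3 + 1)
print(memory_free_bound, single_generation_bound)
```

Since the bound is linear in r, setting r = 1 divides it by the memory-free value of r. This only tightens the bound; as noted above, the actual value of ts still has to be computed for each network.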
B. Impact on decoding
Decoding of a CNECC: Let $G_I(z)$ be the generator matrix
of the code $C_s$ thus designed. Then we refer to the code $C_s$ as
the input convolutional code [3]. The effective code seen by a
sink T is generated by the matrix $G_{O,T}(z) = G_I(z) M_T(z)$,
and is known as the output convolutional code [3], $C_{O,T}$,
at sink T. The decoding of the CNECC at any sink T can be
performed either on the trellis of the code $C_s$ or on that of the code
$C_{O,T}$ at that particular sink, according to the free distance of
$C_{O,T}$ ($d_{free}(C_{O,T})$), the catastrophic/non-catastrophic nature
of $G_{O,T}(z)$, and a parameter called $T_{d_{free}}(C_{O,T})$, whose
definition for a rate b/c code C over $\mathbb{F}_q$ is given in [3] as
follows.
$$T_{d_{free}}(C) := \max_{v_{[0,j)} \in S_{d_{free}}} j + 1 \qquad (26)$$
where $S_{d_{free}}$ [3] is defined as follows.
$$S_{d_{free}} := \left\{ v_{[0,j)} \mid w_H\left(v_{[0,j)}\right) < d_{free}(C),\ \sigma_0 = 0,\ \forall\, j > 0 \right\}$$
where
$$v_{[0,j)} := [v_0, v_1, \ldots, v_{j-1}]$$
is a truncated codeword sequence (with $v_i \in \mathbb{F}_q^c$), $\sigma_t$ indicates
the content of the delay elements in the encoder at a time t,
and $w_H$ indicates the Hamming weight over $\mathbb{F}_q$. The set $S_{d_{free}}$
consists of all possible truncated codeword sequences $v_{[0,j)}$
of weight less than $d_{free}(C)$ that start in the zero state. Then,
we have the following proposition.
Proposition 1 ([3]): The minimum Hamming weight trellis
decoding algorithm can correct all error sequences which
have the property that the Hamming weight of the error
sequence in any consecutive $T_{d_{free}}(C)$ segments (a segment
being a collection of c output symbols corresponding to every
b input symbols) is at most $\left\lfloor \frac{d_{free}(C) - 1}{2} \right\rfloor$.
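The condition in Proposition 1 is a sliding-window test on the per-segment error weights. The following Python sketch, under the assumption that the error sequence is summarized as a list of Hamming weights with one entry per segment, checks the condition; the function name `satisfies_proposition_1` and the sample weight lists are hypothetical. The parameters dfree = 5 and Tdfree = 6 are those quoted for Code C2 in Section VII.

```python
def satisfies_proposition_1(segment_weights, d_free, t_dfree):
    """Check the sufficient condition of Proposition 1.

    segment_weights[k] is the Hamming weight of the error in segment k
    (an assumed representation). Returns True if every window of t_dfree
    consecutive segments has total weight at most floor((d_free - 1) / 2).
    """
    limit = (d_free - 1) // 2
    window = sum(segment_weights[:t_dfree])  # weight of the first window
    if window > limit:
        return False
    for k in range(t_dfree, len(segment_weights)):
        # Slide the window by one segment: add the new one, drop the oldest.
        window += segment_weights[k] - segment_weights[k - t_dfree]
        if window > limit:
            return False
    return True

# Hypothetical error-weight sequences, tested with dfree = 5 and Tdfree = 6.
spread_ok = satisfies_proposition_1([2, 0, 0, 0, 0, 0, 1], d_free=5, t_dfree=6)
too_dense = satisfies_proposition_1([2, 0, 0, 0, 0, 1], d_free=5, t_dfree=6)
print(spread_ok, too_dense)
```

The proposition gives a sufficient condition only, so a sequence that fails this test is not necessarily decoded incorrectly.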
With the CNECC in place in a unit-delay, memory-free
network, under certain conditions (see Subsection IV-D of
[4]), a sink has to decode on the trellis of the input con-
volutional code, in which case the sink has to multiply the
incoming n output streams with the processing matrix PT (z),
which may require additional memory elements to implement.
However, with a single-generation network code implemented
using memory elements, part of this processing is done in a
distributed manner in the other nodes of the network, thereby
decreasing the memory requirement at the sinks.
In the forthcoming section, we further observe the advan-
tages that the use of memory in the intermediate nodes offers
in the performance of CNECCs under a probabilistic error
setting.
VII. SIMULATION RESULTS
A. A probabilistic error model
Probabilistic error models have been considered in the context
of random network coding in [7]. We define a probabilistic
error model for a unit-delay network G(V, E) by defining the
probabilities of any set of i (i ≤ |E|) edges of the network
being in error at any given time instant. Across time instants,
we assume that the network errors are i.i.d. according to this
distribution.
$$\text{Prob.}(i \text{ network edges being in error}) = p^i \qquad (27)$$
$$\text{Prob.}(\text{no edges are in error}) = q \qquad (28)$$
where 1 ≤ i ≤ |E|, and p, q ≤ 1 are real numbers indicating
the probability of any single edge error in the network and
the probability of no edges in error respectively, such that
$q + \sum_{i=1}^{|E|} p^i = 1$.
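One way to draw error patterns from the model in (27) and (28) is sketched below in Python. The sketch rests on two assumptions that the text does not settle: that p^i is the probability that exactly i edges are in error, and that the i erroneous edges are then chosen uniformly at random. All function names are hypothetical.

```python
import random

def no_error_probability(p, num_edges):
    """q from the normalization q + sum_{i=1}^{|E|} p^i = 1."""
    return 1.0 - sum(p ** i for i in range(1, num_edges + 1))

def sample_error_pattern(p, num_edges, rng):
    """Draw the set of edges in error at one time instant.

    Assumption: p^i is the probability of exactly i edges in error, and
    the i edges are then picked uniformly (not specified in the text).
    """
    q = no_error_probability(p, num_edges)
    if q < 0:
        raise ValueError("p is too large: the probabilities exceed 1")
    weights = [q] + [p ** i for i in range(1, num_edges + 1)]
    count = rng.choices(range(num_edges + 1), weights=weights, k=1)[0]
    return sorted(rng.sample(range(num_edges), count))

rng = random.Random(0)                   # fixed seed for repeatability
q_half = no_error_probability(0.5, 10)   # |E| = 10 as in Section VII-B
pattern = sample_error_pattern(0.1, 10, rng)
print(q_half, pattern)
```

With |E| = 10, the normalization gives q = 2^-10 at p = 0.5, and q becomes negative for p slightly above 0.5, which matches the range 0 to 0.5 on the horizontal axes of Fig. 14 and Fig. 15.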
B. Simulations on the modified butterfly network
Fig. 13 shows a modified
butterfly network before and after running Algorithm 2. This
network is clearly a part of the modified double-butterfly
network of Fig. 9, and the associated matrices at the sinks
T1 and T2 are given in Table I. With the probability model as
in (27) and (28) with |E| = 10 for this network, we simulate
the performance of 3 input convolutional codes implemented
on this network for both the with-memory and memory-free
cases as in Fig. 13 with the sinks performing hard decision
decoding on the trellis of the input convolutional code.
In the following discussion we refer to sinks T1 and T2 of
Fig. 13 as Sink 1 and Sink 2. The 3 input convolutional codes
and the rationale behind choosing them are given as follows.
• Code C1 is generated by the generator matrix
$G_{I_1}(z) = [\, 1 + z \quad 1 \,]$,
with dfree(C1) = 3 and Tdfree(C1) = 2. This code is
chosen only to illustrate the error correcting capability of
codes with low values of dfree(C) and Tdfree(C).
• Code C2 is generated by the generator matrix
$G_{I_2}(z) = [\, 1 + z^2 \quad 1 + z + z^2 \,]$,
with dfree(C2) = 5 and Tdfree(C2) = 6. This code corrects
all double edge errors in the instantaneous version
(with all edge delays and memories being zero) of Fig.
13 as long as they are separated by 6 network uses.
• Code C3 is generated by the generator matrix
$G_{I_3}(z) = [\, 1 + z + z^4 \quad 1 + z^2 + z^3 + z^4 \,]$,
with dfree(C3) = 7 and Tdfree(C3) = 12. This code
corrects all double edge errors in the unit-delay network
given in Fig. 13 as long as they are separated by 12
network uses.
We note here that the values of Tdfree(C) of the 3 codes
increase with their free distances, i.e., the code with
greater free distance has higher Tdfree(C).
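The free distances quoted for the three codes can be checked numerically. The following Python sketch is an illustration and not the procedure used in the paper: it treats each rate-1/2 binary feedforward encoder as a state graph and runs a shortest-path (Dijkstra) search for the lowest-weight path that leaves the zero state and returns to it. Generator polynomials are given as integers with bit k holding the coefficient of z^k; the function name `free_distance` is hypothetical.

```python
import heapq

def free_distance(generators):
    """Free distance of a binary feedforward convolutional encoder with one input.

    generators: list of ints, bit k of each int is the coefficient of z^k.
    Returns the minimum weight of a path leaving and re-entering state 0.
    """
    memory = max(g.bit_length() for g in generators) - 1
    mask = (1 << memory) - 1

    def step(state, bit):
        # register bit k holds the input from k time instants ago
        register = (state << 1) | bit
        weight = sum(bin(register & g).count("1") & 1 for g in generators)
        return register & mask, weight

    start_state, start_weight = step(0, 1)   # diverge from the zero state
    best = {start_state: start_weight}
    heap = [(start_weight, start_state)]
    while heap:
        dist, state = heapq.heappop(heap)
        if state == 0:
            return dist                      # first return to the zero state
        if dist > best.get(state, float("inf")):
            continue                         # stale heap entry
        for bit in (0, 1):
            nxt, w = step(state, bit)
            if dist + w < best.get(nxt, float("inf")):
                best[nxt] = dist + w
                heapq.heappush(heap, (dist + w, nxt))
    return None

d1 = free_distance([0b11, 0b1])          # C1: [1 + z, 1]
d2 = free_distance([0b101, 0b111])       # C2: [1 + z^2, 1 + z + z^2]
d3 = free_distance([0b10011, 0b11101])   # C3: [1 + z + z^4, 1 + z^2 + z^3 + z^4]
print(d1, d2, d3)
```

The search returns 3, 5, and 7 for C1, C2, and C3, in agreement with the values stated above. It does not compute Tdfree(C), which is taken from [3].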
Fig. 14 and Fig. 15 illustrate the BERs for these 3 codes
for both the with-memory and memory-free case for different
values of the parameter p (the probability of a single edge
error) of (27). Clearly the BER values fall with decreasing p.
The description and explanation of the regions marked
‘dfree dominated region’ and ‘Tdfree dominated region’
(named so according to the dominant parameter in those
regions) are given in [3]. In the following discussion, we
concentrate on the comparison between the performance of
every code in the memory-free and the with-memory case.
Towards that end, we recall from Proposition 1 that both the
Hamming weight of error events and the separation between
any two consecutive error events are important to correct them.
Performance improvement of CNECCs with memory at the
intermediate nodes:
1) With respect to codes C2 and C3, we see that there is
an improvement in performance when memory is used
at the intermediate nodes. This is because of the fact
that the presence of memory elements in the network
results in a clumping-together of error bits at the sinks.
For example, assume that in the network of Fig. 13,
an error occurs in edge s → v1 at time instant t1. We
consider the situation at Sink 2. In the memory-free case,
the effect of this error is felt at different time instants at
the two incoming edges of Sink 2, at t1+1 and at t1+
4. However, with memory elements at the intermediate
nodes, the effects of the edge error now occur at the
same time instant (t1 + 4) in both the incoming edges
of Sink 2. Cumulatively, such errors result
in more error events (each of lower Hamming weight)
in the memory-free case (because the errors are spread
out) and fewer error events (each of comparatively higher
Hamming weight) in the with-memory case (as
a result of clumped errors). However, because Codes
C2 and C3 have enough free distance, the number of
such error events is what dominates the performance.
Therefore Codes C2 and C3 correct more errors in the
with-memory case. The same effect may be observed at
Sink 1 also.
2) With respect to the code C1, there is no observable
change in performance between the memory-free and
with-memory cases. We note that the same effect is
observed with the errors as in the previous case. But
because Tdfree(C1) is small (only 2), the clumping
together of error bits brings little benefit. Therefore
there is no significant improvement in performance.
Fig. 13. A modified butterfly network
[Plot: Probability of single edge error (p) vs BER at Sink 1. Horizontal axis: probability of single edge error (p), ticks from 0 to 0.5. Vertical axis: BER at Sink 1, ticks from 0 to 0.5. Curves: Code 1 (free dist. = 3, Tdfree = 2), Code 2 (Inst.) (free dist. = 5, Tdfree = 6), and Code 3 (free dist. = 7, Tdfree = 12), each with memory and without memory. Marked regions: 'dfree dominated region' and 'Tdfree dominated region'.]
Fig. 14. BER (with and without memory) at Sink 1
[Plot: Probability of single edge error (p) vs BER at Sink 2. Horizontal axis: probability of single edge error (p), ticks from 0 to 0.5. Vertical axis: BER at Sink 2, ticks from 0 to 0.5. Curves: Code 1 (free dist. = 3, Tdfree = 2), Code 2 (Inst.) (free dist. = 5, Tdfree = 6), and Code 3 (free dist. = 7, Tdfree = 12), each with memory and without memory. Marked regions: 'dfree dominated region' and 'Tdfree dominated region'.]
Fig. 15. BER (with and without memory) at Sink 2
3) There is no significant difference in the performance
of any code between the memory-free and the with-
memory case in the ‘dfree dominated region.’ This is
because of the fact that the errors that occur in the
network are already sparse.
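The clumping effect described in item 1 can be illustrated with a small Python sketch. It assumes a simplified view in which a single edge error reaches a sink along several paths, each with its own delay, and in which the memory added by Algorithm 2 pads every such path to the longest delay. This padding model and the function name `affected_instants` are assumptions made for illustration; the delays 1 and 4 are those of the Sink 2 example above.

```python
def affected_instants(error_times, path_delays, equalize=False):
    """Time instants at which a sink sees the effect of edge errors.

    error_times: instants at which the edge is in error.
    path_delays: delays of the paths from that edge to the sink.
    equalize=True models the with-memory case by padding every path to
    the longest delay (a simplifying assumption, not Algorithm 2 itself).
    """
    if equalize:
        longest = max(path_delays)
        path_delays = [longest] * len(path_delays)
    return sorted({t + d for t in error_times for d in path_delays})

t1 = 10  # hypothetical error instant on edge s -> v1
memory_free = affected_instants([t1], [1, 4])               # spread over two instants
with_memory = affected_instants([t1], [1, 4], equalize=True)  # clumped into one instant
print(memory_free, with_memory)
```

In this simplified model the same edge error touches two separate time instants at the sink in the memory-free case and a single instant in the with-memory case, which is the reduction in the number of error events that the discussion above attributes to the use of memory.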
ACKNOWLEDGMENT
This work was supported partly by the DRDO-IISc program
on Advanced Research in Mathematical Engineering through
a research grant to B. S. Rajan.
REFERENCES
[1] R. Ahlswede, N. Cai, R. Li and R. Yeung, “Network Information
Flow”, IEEE Transactions on Information Theory, vol. 46, no. 4, July
2000, pp. 1204-1216.
[2] R. Koetter and M. Medard, “An Algebraic Approach to Network
Coding”, IEEE/ACM Transactions on Networking, vol. 11, no. 5, Oct.
2003, pp. 782-795.
[3] K. Prasad and B. Sundar Rajan, “Convolutional codes for Network-
error correction”, arXiv:0902.4177v3 [cs.IT], August 2009, Available
at: http://arxiv.org/abs/0902.4177. A shortened version of this paper is
to appear in the proceedings of Globecom 2009, Nov. 30 - Dec. 4,
Honolulu, Hawaii, USA.
[4] K. Prasad and B. Sundar Rajan, “Network error correction
for unit-delay, memory-free networks using convolutional
codes”, arXiv:0903.1967v3 [cs.IT], September 2009, Available at:
http://arxiv.org/abs/0903.1967.
[5] X. Wu, C. Zhao and X. You, “Generation-Based Network Coding over
Networks with Delay”, IFIP International Conference on Network and
Parallel Computing, Shanghai, China, Oct. 18-21 2008, pp. 365-368.
[6] R. Johannesson and K. S. Zigangirov, Fundamentals of Convolutional
Coding, John Wiley, 1999.
[7] D. Silva, F. R. Kschischang, and R. Koetter, “Capacity of random
network coding under a probabilistic error model”, 24th Biennial
Symposium on Communications, Kingston, USA, 24-26 June 2008,
pp. 9-12.
