Abstract-A single-source network is said to be memory-free if none of the internal nodes (those other than the source and the sinks) employ memory, but merely send linear combinations of the symbols received on their incoming edges on their outgoing edges. Memory-free networks with delay that use network coding are forced to perform inter-generation network coding, as a result of which some or all sinks may require a large amount of memory for decoding. In this work, we address this problem by utilizing memory elements at the internal nodes of the network as well, which reduces the number of memory elements used at the sinks. We give an algorithm which employs memory at all the nodes of the network to achieve single-generation network coding. For fixed latency, our algorithm reduces the total number of memory elements used in the network to achieve single-generation network coding. We also discuss the advantages of employing single-generation network coding together with convolutional network-error correction codes (CNECCs) for unit-delay networks and illustrate, using simulations on an example network under a probabilistic network error model, the performance gain that CNECCs obtain from memory at the intermediate nodes.
I. INTRODUCTION
Network coding was introduced in [1] as a means of achieving maximum rate of transmission in wireline networks. Convolutional network-error correcting codes (CNECCs) were introduced for acyclic instantaneous networks in [2] and for unit-delay, memory-free networks in [3].
In this work, we consider acyclic, single-source networks with delays, which have a multicast network code in place. The set of all code symbols generated at the source at any particular time instant is called a generation. In unit-delay, memory-free networks, the nodes of the network may receive information of different generations on their incoming edges at every time instant, and therefore network coding across generations (inter-generation network coding) is unavoidable in general. As a consequence, the sinks have to employ memory to decode the symbols. If memory is also utilized at the internal nodes, such inter-generation network coding can be avoided, thus making the decoding simpler.
A single-generation network code is one where all the symbols received at all the sinks are linear combinations of the symbols belonging to the same generation. In [4], the technique of adding memory at the nodes to achieve single-generation network coding was discussed. However, this was done only on a per-node basis, without considering the entire topology or the network code of the network. In contrast, we consider the entire network topology and the network code, which govern the addition of memory elements at the nodes and the way in which they are rearranged across the network to reduce the overall memory usage in the network.
The organization and contributions of this work are as follows:
• After briefly discussing the network setup and the network code for an acyclic network with delays and memory (Section II), we introduce different methods of adding memory at a node and analyze how each of them affects the local and global encoding kernels of the network code (Section III).
• After presenting different memory reduction and distribution techniques (Section IV), we propose an algorithm which uses the memory at the nodes to achieve single-generation network coding while reducing the overall memory usage in the network (Section V).
• We discuss the advantages of employing memory at the intermediate nodes in tandem with CNECCs in terms of their encoding/decoding (Section VI).
• We illustrate the performance benefits of using memory with CNECCs for unit-delay networks through simulations on an example unit-delay network under a probabilistic error setting (Section VI-A).
An expanded version of this paper can be found in [6] with several additional examples and explanations, which have been omitted here due to space considerations.
II. NETWORKS WITH DELAY AND MEMORY
The model for acyclic networks with delays considered in this paper is as in [5] . An acyclic network can be represented as an acyclic directed multi-graph (a graph that can have parallel edges between nodes) G = (V, E), where V is the set of all vertices and E is the set of all edges in the network.
We assume that every edge in the directed multi-graph representing the network has unit capacity (can carry at most one symbol from $\mathbb{F}_q$, the field with $q$ elements). Network links with capacities greater than unit are modeled as parallel edges. The network has delays, i.e., every edge in the directed graph representing the network has a unit delay associated with it, represented by the parameter $z$. Such networks are known as unit-delay networks. Network links with delays greater than unit are modeled as serially concatenated edges in the directed multi-graph. We assume a single source node $s \in V$ and a set of sinks $\mathcal{T}$. Let $n$ be the max-flow min-cut capacity of the multicast connection between $s$ and $\mathcal{T}$.
978-1-4244-6404-3/10/$26.00 ©2010 IEEE
A. Network code for unit-delay, memory-free networks
An $n$-dimensional network code can be described by three matrices (over $\mathbb{F}_q$), $A_{n \times |E|}$, $K_{|E| \times |E|}$, and $B^T_{|E| \times n}$ (for every sink $T \in \mathcal{T}$), the details of which can be found in [5].
Definition 1 ([5]): The network transfer matrix, $M_T(z)$, corresponding to a sink node $T \in \mathcal{T}$ for an $n$-dimensional network code, is a full rank (over the field of rationals $\mathbb{F}_q(z)$) $n \times n$ matrix defined as
$$M_T(z) := A (I - zK)^{-1} B^T = A F(z) B^T,$$
where $F(z) := (I - zK)^{-1}$ and $I$ is the identity matrix of size $|E| \times |E|$.
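The transfer matrix of Definition 1 can be evaluated numerically for a small network. The following is a minimal sketch, not the construction of [5]: it assumes the field is $\mathbb{F}_2$, stores polynomials in $z$ as Python integers (bit $i$ is the coefficient of $z^i$), expands $(I - zK)^{-1}$ as the finite sum $\sum_k (zK)^k$ (valid because $K$ is nilpotent for an acyclic graph), and uses a hypothetical nine-edge butterfly whose edge labels and matrices are chosen only for illustration.

```python
# Polynomials over F_2 in z are stored as ints: bit i is the coefficient of z^i.

def poly_mul(a, b):
    """Carry-less product of two F_2[z] polynomials."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def mat_mul(X, Y):
    """Matrix product with entries in F_2[z]."""
    out = [[0] * len(Y[0]) for _ in X]
    for i, row in enumerate(X):
        for j in range(len(Y[0])):
            acc = 0
            for k, x in enumerate(row):
                acc ^= poly_mul(x, Y[k][j])
            out[i][j] = acc
    return out

def transfer_matrix(A, K, B):
    """Return A (I - zK)^{-1} B, expanding the inverse as sum_k (zK)^k."""
    n = len(K)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    zK = [[k << 1 for k in row] for row in K]  # multiply every entry by z
    F, term = identity, identity
    for _ in range(n):  # (zK)^k vanishes for k >= n on an acyclic graph
        term = mat_mul(term, zK)
        F = [[f ^ t for f, t in zip(fr, tr)] for fr, tr in zip(F, term)]
    return mat_mul(mat_mul(A, F), B)

# Hypothetical butterfly (assumed for illustration): e1:s->a, e2:s->b, e3:a->T1,
# e4:a->c, e5:b->c, e6:b->T2, e7:c->d, e8:d->T1, e9:d->T2 (0-based indices below).
E = 9
K = [[0] * E for _ in range(E)]
for i, j in [(0, 2), (0, 3), (1, 4), (1, 5), (3, 6), (4, 6), (6, 7), (6, 8)]:
    K[i][j] = 1
A = [[int(e == 0) for e in range(E)], [int(e == 1) for e in range(E)]]
B_T1 = [[0, 0] for _ in range(E)]
B_T1[2][0] = 1  # e3 feeds the first output of sink T1
B_T1[7][1] = 1  # e8 feeds the second output of sink T1

M_T1 = transfer_matrix(A, K, B_T1)  # entries are ints encoding powers of z
```

In this sketch the two columns of `M_T1` carry different powers of $z$, which is the inter-generation mixing that the rest of the paper sets out to remove.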
With an $n$-dimensional network code, the input and the output of the network are $n$-tuples of elements from $\mathbb{F}_q[[z]]$, the formal power series ring over $\mathbb{F}_q$. If $x(z) \in \mathbb{F}_q^n[[z]]$ is the input to the unit-delay, memory-free network, then at any particular sink $T \in \mathcal{T}$, we have the output,
$$y(z) = x(z) M_T(z).$$
In Section V we give an algorithm which uses memory elements at the nodes to achieve single-generation network coding, i.e., the network transfer matrix $M_T(z)$ of every sink $T \in \mathcal{T}$ takes the form
$$M_T(z) = z^{L_T} M_T, \tag{1}$$
where $L_T$ is some positive integer and $M_T$ is a full rank $n \times n$ matrix over $\mathbb{F}_q$.
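Whether a given transfer matrix already has the single-generation form $z^{L_T} M_T$ can be checked mechanically. The sketch below is an illustration under stated assumptions: the field is $\mathbb{F}_2$, each entry of the matrix is a polynomial in $z$ stored as a Python integer (bit $i$ is the coefficient of $z^i$), and the function names are hypothetical.

```python
def rank_gf2(rows):
    """Rank over F_2 of a 0/1 matrix given as a list of rows."""
    masks = [sum(bit << j for j, bit in enumerate(r)) for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((m for m in masks if (m >> col) & 1), None)
        if pivot is None:
            continue
        masks.remove(pivot)
        # Clear this column in every remaining row.
        masks = [m ^ pivot if (m >> col) & 1 else m for m in masks]
        rank += 1
    return rank

def single_generation_form(M):
    """Return (L, M0) if M(z) = z^L * M0 with M0 full rank over F_2, else None."""
    nonzero = [x for row in M for x in row if x]
    if not nonzero or any(x & (x - 1) for x in nonzero):
        return None  # zero matrix, or some entry is not a single power of z
    degrees = {x.bit_length() - 1 for x in nonzero}
    if len(degrees) != 1:
        return None  # entries carry different delays (inter-generation mixing)
    L = degrees.pop()
    M0 = [[1 if x else 0 for x in row] for row in M]
    return (L, M0) if rank_gf2(M0) == len(M) else None

mixed = single_generation_form([[0b10, 0b1000], [0, 0b1000]])      # z and z^3
aligned = single_generation_form([[0b1000, 0b1000], [0, 0b1000]])  # z^3 throughout
```

Here `mixed` is `None` because the entries carry two different delays, while `aligned` returns the common delay together with the constant matrix.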
III. MEMORY ADDITIONS AT A NODE
In this section, we discuss the different ways of adding memory at a node. We show in Section V that using the memory elements at the nodes according to Subsection III-A and Subsection III-B is sufficient to guarantee single-generation network coding in the given network. The following list of definitions will be used throughout the paper.
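$\Gamma_I(v)$ : Set of incoming edges at node $v$.
$\Gamma_O(v)$ : Set of outgoing edges at node $v$.
$A_{e_i,e_j}$ : Local kernel between $e_i \in \tilde{\Gamma}_I(s)$ and $e_j \in \Gamma_O(s)$.
$K_{e_i,e_j}$ : Local kernel between $e_i \in \Gamma_I(v)$ and $e_j \in \Gamma_O(v)$.
$B^v_{e_i,e_j}$ : Local kernel between $e_i \in \Gamma_I(v)$ and $e_j \in \tilde{\Gamma}_O(v)$.
$\Gamma_{I,e_j}(v)$ : The set of all edges feeding into $e_j \in \Gamma_O(v) \cup \tilde{\Gamma}_O(v)$.
$\Gamma_{O,e_i}(v)$ : The set of all edges fed by edge $e_i \in \Gamma_I(v) \cup \tilde{\Gamma}_I(v)$.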
Total memory added at node v.
where
The global kernels of the edges of $\tilde{\Gamma}_I(s)$ (and $\tilde{\Gamma}_O(T)$) are the columns of an $n \times n$ identity matrix (and of $M_T(z)$) over $\mathbb{F}_q$, the field over which the network code is defined.
A. Adding memory at a node for a pair of an incoming and an outgoing edge
For any $e_i, e_j \in \tilde{E}$ such that $\mathrm{head}(e_i) = \mathrm{tail}(e_j) = v \in V$, let $M_{e_i,e_j}$ be as defined. The local kernel between $e_i$ and $e_j$ is then modified as
$$X_{e_i,e_j} \rightarrow z^{M_{e_i,e_j}} X_{e_i,e_j},$$
where $X$ may be either $A_{e_i,e_j}$, $K_{e_i,e_j}$ or $B^v_{e_i,e_j}$ according to the node where the memory elements are being added. The matrix $F(z) = (I - zK)^{-1}$ is also correspondingly modified.
B. Adding memory at a node for an outgoing edge
For some
The elements of one of the matrices, K, A, or B v , are then modified as
where $X$ may be either $A_{e_i,e_j}$, $K_{e_i,e_j}$ or $B^v_{e_i,e_j}$ according to the node where the memory elements are being added. The elements of the matrix $F(z)$ are also correspondingly modified.
IV. MEMORY REDUCTION AND DISTRIBUTION TECHNIQUES
In this section, we discuss techniques to reduce the memory used at the nodes of the network and the overall memory used in the network and also to obtain a fairly uniform memory usage distribution throughout the network.
A. Memory reduction in a single node
Consider a node $v \in V$ in which memory elements have been added to delay symbols coming from an edge $e_i \in \Gamma_I(v) \cup \tilde{\Gamma}_I(v)$. Then, retaining the $M_{e_i,\mathrm{head}(e_i),\max}$ memory elements, all other memory elements placed on $e_i$ can be removed without any change in any local or global kernels by tapping symbols from the $M_{e_i,\mathrm{head}(e_i),\max}$ memory elements wherever necessary. Doing this for every incoming edge of $v$ is equivalent to obtaining a minimal encoder (one with the minimum number of memory elements) of the transfer function (input-output relationship) at node $v$.
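The saving from tapping a shared delay line can be illustrated with a short count. This is a sketch under an assumption: $M_{e_i,\mathrm{head}(e_i),\max}$ is taken to be the largest number of memory elements used between $e_i$ and any outgoing edge of the node. The function name and the example delays are hypothetical.

```python
def tapped_memory(delays):
    """Compare memory usage at one node.

    delays maps (incoming_edge, outgoing_edge) -> number of memory elements used
    to delay symbols of incoming_edge on their way to outgoing_edge.
    Returns (separate, shared): the total when every pair has its own delay line,
    and the per-incoming-edge length of one shared, tapped delay line.
    """
    separate = sum(delays.values())
    shared = {}
    for (e_in, _), m in delays.items():
        # Shorter delays are obtained by tapping the longest line on e_in.
        shared[e_in] = max(shared.get(e_in, 0), m)
    return separate, shared

# Assumed example: e1 is delayed by 2 towards e3 and by 1 towards e4.
separate, shared = tapped_memory({("e1", "e3"): 2, ("e1", "e4"): 1, ("e2", "e4"): 0})
total_shared = sum(shared.values())
```

In this example three memory elements shrink to two, since the one-step delay towards `e4` is tapped from the two-element line on `e1`.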
B. Memory reduction between nodes
In this subsection, we discuss memory reduction techniques between different nodes. Towards that end, we define the following terms.
$E_v$ : Set of adjacent nodes of $v$.
$M_E$ ($E \subseteq \tilde{E}$) : Minimum number of memory elements added to delay symbols coming from any $e_j \in E$. $M_E :=$ min
$V_E$ ($E \subseteq \tilde{E}$) : Set of all nodes which receive symbols from the edges in $E$.
1) Memory reduction between adjacent nodes: For a node v ∈ V, and for some
i.e., the global kernels of the edges $e_j \in \Gamma_O(v)$ are linear combinations of the global kernels of the edges in $\Gamma_I(v)$ only and none else. Also let $M_{\Gamma_O(v)}$ and the set $V_{\Gamma_O(v)} \subseteq E_v$ be the values of $M_E$ and $V_E$ for the set of edges $E = \Gamma_O(v)$ according to (5) and (6).
We define the term
If the condition
is satisfied, then all of the $|\Gamma_O(v)|\,M_{\Gamma_O(v)}$ memory elements used at the nodes in $V_{\Gamma_O(v)}$ (to delay symbols coming from the edges $e_j \in \Gamma_O(v)$) can be 'absorbed' into node $v$ by removing all these memory elements and adding $M_{e_i,\Gamma_O(v)}$ memory elements at node $v$ for every $e_i \in \Gamma_I(v)$ (thereby delaying the symbols coming from every $e_i \in \Gamma_I(v)$), without using any additional memory and without changing the global kernels of any outgoing edge of any node in $V_{\Gamma_O(v)}$. For a node $v$, we define the set $P_v$ as
2) Memory reduction between nodes not necessarily adjacent:
are satisfied, where $s_v$ is the maximum number of sets satisfying conditions (9) and (10). Algorithm 1 obtains the set $P_v$ for some node $v$.
Algorithm 1: Algorithm to obtain set $P_v$ for a node $v$.
Input:
A node v ∈ V with the edge sets
, a sequence of pairs of edge-sets as
, and $m$ is the maximum length of the sequence that can be obtained as in (11) for the edge-set pair (v) and the set of nodes $V_{\Gamma_{O_i}(v)}$ be the values of $M_E$ and $V_E$ for the set $E = \Gamma_{O_i}(v)$, according to (5) and (6). Let $V_{E_{i_k}}$ be the set of nodes $V_E$ with $E = E_{i_k}$ in (6). Also, let $M_{e_{i_k},\Gamma_{O_i}(v)}$ be as in (7) for the set $\Gamma_{O_i}(v)$ and for an edge $e_{i_k} \in E_{i_k}$. As in the memory reduction procedure of adjacent nodes, if
C. Memory distribution
The following technique can be used to distribute memory elements throughout the network in a somewhat uniform way. Suppose there exists a node $v \in V$ such that for some $e_j \in \Gamma_O(v)$ with $v' = \mathrm{head}(e_j)$, and for some integer $m \le M_{e_j,\mathrm{head}(e_j),\min}$,
Then, the $m$ memory elements at node $v'$ used to delay symbols coming from edge $e_j$ can be 'absorbed' into node $v$ (thereby using them to delay symbols going into edge $e_j$) without changing the global kernels of any edge in $\Gamma_O(v')$.
V. SINGLE-GENERATION NETWORK CODING -ALGORITHM
This section presents the main contribution of this paper. For an edge $e_i \in E$, let $f_{e_i}(z) \in \mathbb{F}_q^n(z)$ represent the global kernel of $e_i$. We say that a node $v \in V \setminus \{s\}$ is a coding node if the global kernel of at least one of its outgoing edges is an $\mathbb{F}_q(z)$-linear combination of the global kernels of at least two of its incoming edges. Otherwise, we call $v$ a forwarding node.
Let $V_{\mathrm{cod}}$ be the set of coding nodes, and $V_{\mathrm{fwd}}$ be the set of forwarding nodes. Let $V^0_{\mathrm{cod}}$ be the set of all coding nodes such that there exists no path in the network from any other coding node to any node in $V^0_{\mathrm{cod}}$. Towards proposing an algorithm to enable single-generation network coding, we make some observations and discuss the addition of memory elements at the coding nodes to achieve single-generation network coding.
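The node classification above can be sketched in a few lines. The sketch makes a simplifying assumption: a node is treated as a coding node when some outgoing edge has nonzero local kernel coefficients from at least two incoming edges, which stands in for the condition on global kernels. The function name, the data layout, and the example topology are hypothetical.

```python
def classify_nodes(edges, local_kernels, source):
    """edges: edge name -> (tail, head); local_kernels: (e_in, e_out) -> coefficient.

    Returns (coding, forwarding, first_coding), where first_coding holds the coding
    nodes that no other coding node can reach.
    """
    fan_in = {}
    for (e_in, e_out), coeff in local_kernels.items():
        if coeff:
            fan_in.setdefault(e_out, set()).add(e_in)
    nodes = {v for pair in edges.values() for v in pair}
    coding = {edges[e_out][0] for e_out, ins in fan_in.items() if len(ins) >= 2}
    coding.discard(source)
    forwarding = nodes - coding - {source}

    successors = {}
    for tail, head in edges.values():
        successors.setdefault(tail, set()).add(head)

    def reachable_from(start):
        seen, stack = set(), [start]
        while stack:
            for nxt in successors.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    downstream = set()
    for v in coding:
        downstream |= reachable_from(v)  # acyclic graph: v is never its own descendant
    return coding, forwarding, coding - downstream

# Hypothetical butterfly-like topology, for illustration only.
edges = {
    "e1": ("s", "a"), "e2": ("s", "b"), "e3": ("a", "T1"), "e4": ("a", "c"),
    "e5": ("b", "c"), "e6": ("b", "T2"), "e7": ("c", "d"),
    "e8": ("d", "T1"), "e9": ("d", "T2"),
}
local_kernels = {
    ("e1", "e3"): 1, ("e1", "e4"): 1, ("e2", "e5"): 1, ("e2", "e6"): 1,
    ("e4", "e7"): 1, ("e5", "e7"): 1, ("e7", "e8"): 1, ("e7", "e9"): 1,
}
coding, forwarding, first_coding = classify_nodes(edges, local_kernels, "s")
```

Only node `c` combines two incoming edges in this example, so it is both the sole coding node and the sole member of the set playing the role of $V^0_{\mathrm{cod}}$.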
Observation 1: For any $v \in V^0_{\mathrm{cod}}$, the global kernel of any $e \in \Gamma_I(v)$ is of the form $f_e(z) = z^{l_e} f_e$ for some positive integer $l_e$, with $f_e \in \mathbb{F}_q^n$. If the network is a unit-delay network and the node $v$ uses no memory, the global kernel of any $e_j \in \Gamma_O(v)$ is of the form
$$f_{e_j}(z) = z \sum_{e_i \in \Gamma_{I,e_j}(v)} K_{e_i,e_j} z^{l_{e_i}} f_{e_i},$$
where $l_{e_i}$ is a positive integer signifying the accumulated delay from the source to edge $e_i$, and $K_{e_i,e_j} \in \mathbb{F}_q$ signifies the local kernel coefficient between $e_i$ and $e_j$. The additional $z$ is to account for the delay in the unit-delay network.
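As a small worked instance of Observation 1, consider a memory-free node with two incoming edges $e_1$ and $e_2$ feeding $e_j$. The accumulated delays $l_{e_1} = 1$ and $l_{e_2} = 3$ are assumed purely for illustration, and the global kernel is taken to be $z$ times the weighted sum of the incoming kernels, as the observation describes.

```latex
f_{e_j}(z) = z \left( K_{e_1,e_j} z^{1} f_{e_1} + K_{e_2,e_j} z^{3} f_{e_2} \right)
           = K_{e_1,e_j} z^{2} f_{e_1} + K_{e_2,e_j} z^{4} f_{e_2}
```

The two terms carry different powers of $z$, so the symbol on $e_j$ at any time instant combines source symbols from two different generations.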
A. Single-generation processing at the nodes
For every pair of edges $e_i, e_i' \in \Gamma_{I,e_j}(v)$ such that $l_{e_i} < l_{e_i'}$, we may add $M_{e_i,e_j} = l_{e_i'} - l_{e_i}$ memory elements at node $v$ to delay the symbols coming from $e_i$ such that the global kernel of the edge $e_j$ becomes
$$f_{e_j}(z) = z^{l_{e_j,\max} + 1} \sum_{e_i \in \Gamma_{I,e_j}(v)} K_{e_i,e_j} f_{e_i},$$
where $l_{e_j,\max} = \max_{e_i \in \Gamma_{I,e_j}(v)} l_{e_i}$ and $K_{e_i,e_j} \in \mathbb{F}_q$. Once this process of using memory at the node $v$ makes the global kernel of every edge in $\Gamma_O(v)$ a linear combination of symbols from the same generation (generations between different outgoing edges need not be the same), we say that single-generation processing has been achieved at node $v$. For a node $T \in \mathcal{T}$, we say that single-generation processing has been achieved at sink $T$ if the condition (1) is satisfied.
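The delay alignment described above amounts to padding every incoming edge up to the largest accumulated delay. A minimal sketch follows, assuming the memory added on each incoming edge is its gap to the largest delay among the edges feeding the same outgoing edge. The function name and the example delays are hypothetical.

```python
def align_delays(incoming_delays):
    """incoming_delays maps each edge feeding e_j to its accumulated delay l_{e_i}.

    Returns (memory, out_delay): the memory elements to add per incoming edge and
    the exponent of z in the resulting global kernel of e_j.
    """
    l_max = max(incoming_delays.values())
    memory = {e: l_max - l for e, l in incoming_delays.items()}
    return memory, l_max + 1  # one extra unit for the delay of e_j itself

# Assumed example: two incoming edges with accumulated delays 1 and 3.
memory, out_delay = align_delays({"e1": 1, "e2": 3})
```

With these delays, two memory elements on `e1` suffice, and the outgoing kernel becomes a single power of $z$ times a constant combination.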
B. Algorithm for single-generation network coding
Algorithm 2 achieves single-generation network coding using memory by concatenating all the techniques described in the previous sections.
Example 1: Fig. 1 shows a modified unit-delay double-butterfly network before and after running through Algorithm 2 with the standard network code over $\mathbb{F}_2$. Node $s$ is the source, and $T_i$, $i = 1, 2, 3, 4$, are the sinks. The arrows with double arrow-heads represent the virtual input edges at the source and virtual output edges at the sinks. Table I shows the network transfer matrices before and after obtaining single-generation processing using Algorithm 2. Table I also shows a comparison between the memory requirements at the sinks (for decoding) between inter-generation network coding (i.e., the memory-free case; the numbers shown are the sum of the row degrees of realizable inverse matrices in the third column) and single-generation network coding (as shown in Fig. 1). In the memory-free case, assuming that sinks use memory individually to decode, the total number of memory elements used in the network is 19, and all of them are used at the sinks. In the single-generation network coded network as shown in Fig. 1, the total number of memory elements used in the network is 12, out of which only 7 are used at the sinks, a marked reduction from the memory-free case. The remaining 5 memory elements are distributed across the nodes of the network.
Fig. 1. A modified double-butterfly network before and after applying Algorithm 2 (Example 1). The boxes indicate the nodes to which the memory elements belong.
VI. IMPACT OF SINGLE-GENERATION NETWORK CODING ON NETWORK-ERROR CORRECTION
For details on the basics of convolutional codes, we refer the reader to [7]. Details on construction of CNECCs for memory-free networks can be found in [2] (for instantaneous networks) and [3] (for unit-delay networks). It is observed [6] that using memory might reduce the demand on the free distance (compared with the memory-free case) of a CNECC which corrects any errors in a given set of error patterns. Also, the decoding of such CNECCs for unit-delay networks involves the use of memory at the sinks, which is again reduced and distributed amongst all the nodes of the network.
The error correcting capability of a convolutional code $C$ can be expressed as a function of both its free distance and a parameter called $T_{d_{free}}(C)$ [3], defined as follows (with $v_{[0,j)}$ being a codeword sequence truncated from the $j$th instant)
A good code has a larger free distance and a smaller $T_{d_{free}}(C)$.
A. Simulation results
1) A probabilistic error model:
We define a probabilistic error model for errors at any given time instant in a unit delay network G(V, E) as
where p, q ∈ R ≥0 are such that q +
2) Simulations on the modified butterfly network:
The subnetwork (shown in dotted lines) of the network in Fig. 1 is a modified butterfly network before and after running Algorithm 2. With the probability model as in (16) and
3) Performance improvement of CNECCs with memory at the intermediate nodes:
1) With respect to codes $C_2$ and $C_3$, we see that there is an improvement in performance when memory is used at the intermediate nodes. This is because the presence of memory elements in the network results in a clumping-together of error bits at the sinks. Network errors, hence, cumulatively result in more error events (each with a smaller Hamming weight) in the memory-free case and fewer error events (each with a comparatively larger Hamming weight) in the with-memory case. However, because $C_2$ and $C_3$ have sufficient free distance, the number of such error events is what dominates the performance.
2) With respect to the code $C_1$, there is no observable change in performance between the memory-free and with-memory cases. This is because $T_{d_{free}}(C_1)$ $(= 2)$ is small, so the clumping together of error bits provides little benefit.
3) There is no significant difference in the performance of any code between the memory-free and the with-memory case in the '$d_{free}$ dominated region.' This is because the errors that occur in the network are already sparse.
ACKNOWLEDGMENT
This work was supported partly by the DRDO-IISc program on Advanced Research in Mathematical Engineering through a research grant to B. S. Rajan.
