On the Energy Complexity of LDPC Decoder Circuits by Blake, Christopher & Kschischang, Frank R.
On the Energy Complexity of LDPC Decoder Circuits
Christopher Blake and Frank R. Kschischang
Department of Electrical & Computer Engineering
University of Toronto
christopher.blake@mail.utoronto.ca frank@comm.utoronto.ca
Abstract
It is shown that in a sequence of randomly generated bipartite configurations with number of left nodes
approaching infinity, the probability that a particular configuration in the sequence has a minimum bisection width
proportional to the number of vertices in the configuration approaches 1 so long as a sufficient condition on
the node degree distribution is satisfied. This graph theory result implies an almost sure $\Omega(n^2)$ scaling rule for
the energy of capacity-approaching LDPC decoder circuits that directly instantiate their Tanner Graphs and are
generated according to a uniform configuration model, where n is the block length of the code. For a sequence of
circuits that have a full set of check nodes but do not necessarily directly instantiate a Tanner graph, this implies
an $\Omega(n^{1.5})$ scaling rule. In another theorem, it is shown that all (as opposed to almost all) capacity-approaching
LDPC decoding circuits that directly implement their Tanner graphs must have energy that scales as $\Omega(n (\log n)^2)$.
These results further imply scaling rules for the energy of LDPC decoder circuits as a function of gap to capacity.
I. INTRODUCTION
Low density parity check codes are a class of codes first introduced by Gallager in [1]. This paper
finds fundamental lower bounds on the energy of VLSI implementations of capacity-approaching LDPC
decoders. Central to the construction and analysis of LDPC codes is the randomly generated Tanner graph
with a given degree distribution. A widely used method of analysis involves analyzing an ensemble of
LDPC codes whose Tanner graphs are generated according to some distribution. It has been shown that
there exist degree distributions that result in LDPC codes and decoders that can get arbitrarily close to
capacity for an erasure channel [2]. The first main result of this paper is an “almost-sure” scaling rule
for the energy of capacity-approaching LDPC decoders whose Tanner graphs are generated according to
a uniform configuration model. The second main result of this paper is a scaling rule for the energy of
all, as opposed to almost all, capacity-approaching LDPC decoders. What we mean by an “almost sure”
and “sure” scaling rule will be made more precise later in the paper.
To find energy-complexity lower bounds on a class of algorithms, a computation model is needed. We
use a standard circuit model that was first presented by Thompson in [3]. In this model, we consider
the energy of a circuit implementation of an algorithm to be the area of the circuit multiplied by the
number of clock cycles required to execute the algorithm. We will give a more detailed discussion of this
model later in the paper. The authors of [4] used the Thompson model to analyze the energy complexity
of all decoding algorithms by showing that as the target block error probability approaches 0, the total
energy must approach infinity. In [5] the authors showed that any fully-parallel decoding scheme that
asymptotically has block error probability less than $\frac{1}{2}$ must have energy complexity which scales as
$\Omega(n \sqrt{\log n})$. These results, though general, do not suggest the existence of any decoder implementations
that reach these lower bounds. In this paper, we show in particular that the energy of LDPC decoding
schemes that directly implement their Tanner graphs cannot reach the $\Omega(n \sqrt{\log n})$ energy lower bound,
and in fact must have energy that scales at least as $\Omega(n (\log n)^2)$.
Submitted for publication on February 25th, 2015. Presented in part at the 2014 IEEE North American School of Information Theory,
June 18–21, Toronto, Canada.
arXiv:1502.07999v1 [cs.IT] 27 Feb 2015
We begin the paper in Section II with a discussion of the graph theory used in the paper, and we
also discuss some prior work that reaches similar conclusions to our paper. Then, in Section III we
introduce graph theory definitions and the circuit model that we will use. We also present some important
lemmas that will be used in our theorems. Then, in Section IV, after defining some properties of node
degree distributions, we present the main theorem which shows that almost all LDPC Tanner graphs have
minimum bisection width proportional to the number of vertices. We proceed to show how this theorem
allows us to find scaling laws for the energy of directly-implemented LDPC decoders in Section V. The
results presented in these sections are true for almost all LDPC decoders (i.e., for a set of decoders with
probability approaching one), but it is not clear whether there is a set of LDPC decoders of probability
approaching 0 that can approach capacity. Thus, in Section VI we present a theorem that relates the
number of edges and vertices in a graph to the area of its circuit instantiation to show a scaling rule
that is applicable to any LDPC decoding algorithm that approaches capacity. This results in a sure, as
opposed to almost sure, scaling law of $\Omega(n (\log n)^2)$ for the energy per iteration of a
directly-instantiated LDPC decoder.
II. BACKGROUND
A. Related Work on LDPC Scaling Rules
There are some results on fundamental limits on the wiring complexity of LDPC decoders. In particular,
in [6], the authors assume that the average wire length in a VLSI instantiation of a Tanner graph is
proportional to the longest wire in an asymptotic sense, and that the longest wire is proportional to the
diagonal of the circuit upon which the LDPC decoder is laid out. The implication of these assumptions
is an $\Omega(n^2)$ scaling rule for the area of directly-implemented LDPC circuits, which matches the result of
this paper. However, these assumptions are taken as axioms without being fully justified; there certainly
can exist bipartite Tanner graphs that can be instantiated in a circuit without such area. The result of
this paper suggests that, in fact, the $\Omega(n^2)$ scaling rule is justified for almost all VLSI instantiations of
LDPC Tanner graphs as the block length of these LDPC codes grows large, where the Tanner graphs are
generated from a uniform configuration model and a sufficient condition on the node degree distributions
is satisfied. This scaling rule is an implication of the main theoretical contribution of this paper: a result in
random graph theory that we present as Theorem 1. In addition to this, we provide a super-linear energy
scaling rule for all directly-implemented LDPC decoders, even if the Tanner graph of such decoders is
not generated according to the uniform configuration model.
B. Related Work on Graph Theory
In graph theory, there are a number of results that study the minimum bisection width of graphs. Often
this work looks at a graph’s Laplacian, which is a matrix equal to the difference between the graph’s degree
matrix and adjacency matrix. In [7] a graph’s Laplacian is analyzed and it is shown that the second smallest
eigenvalue, $\lambda_2$, can be used to find a lower bound of $\lambda_2 n / 4$ on the graph’s minimum bisection width. In
[8], the authors find some bounds on the bisection width of graphs that are related to this $\lambda_2$ value. The
authors in [9] provide almost sure upper bounds for the bisection width of randomly generated regular
graphs. Our result does not use the second smallest eigenvalue of the Laplacian of a graph to bound
the minimum bisection width. Instead, we use a unique, purely combinatorial approach to reach our almost
sure lower bounds. Furthermore, our analysis is of random bipartite graphs, as opposed to random regular
graphs. As well, our result makes only weak assumptions on the node degree distribution to get our lower
bound, without requiring a degree-regularity assumption. The generality of the result allows us to apply
the theorem to find a scaling rule for the area of almost all capacity-approaching directly-implemented
LDPC decoding circuits.
Fig. 1. Example of two graphs with a minimum bisection labelled. Nodes are represented by circles and edges by lines joining the circles.
A dotted line crosses the edges of each graph that form a minimum bisection.
III. DEFINITIONS AND MAIN LEMMAS
A. Graph Theory Definitions
The main result of our paper involves the minimum bisection width of a graph. The minimum bisection
width is a property of any graph. A bisection is a set of edges that once removed divides the graph into
two subgraphs that have the same number of vertices. A formal definition is given below.
Definition 1. Consider a graph G with vertices V and edges E. Let Es ⊆ E be a subset of the edges.
Then Es bisects G if removal of Es cuts V into unconnected sets V1 and V2 in which ||V1| − |V2|| ≤ 1.
A minimal bisection is a bisection of a graph whose size is minimal over all bisections. The minimum
bisection width is the size of a minimal bisection.
Generally speaking, finding the minimum bisection width of a graph is a difficult problem (it is in fact
NP-Complete [10]). The diagram in Fig. 1 shows minimal bisections of two simple graphs. Associated
with a bisection Es of a graph G are two unconnected graphs G1 = (V1, E1) and G2 = (V2, E2) induced
by the bisection. We will refer to the set of vertices V1 and V2 each as a bisected set of vertices induced
by a bisection or, more compactly, a bisected set of vertices, where the association with the particular
bisection is to be implicit.
Note that in this paper we will often consider dividing the vertices of a subset into two disjoint sets
V1 and V2 in which ||V1| − |V2|| ≤ 1. For convenience of discussion, we call this process dividing the
vertices in half. We make particular note of this to avoid having to distinguish in every case whether
the cardinality of the set of vertices in question is even or odd.
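As an illustrative aside that is not part of the original analysis, Definition 1 can be checked by exhaustive search on very small graphs. The sketch below assumes a graph given as a list of vertex labels and a list of edge pairs; the function name `min_bisection_width` and the two example graphs are hypothetical choices made for illustration. The search is exponential in the number of vertices, consistent with the hardness of the problem noted above.

```python
from itertools import combinations

def min_bisection_width(vertices, edges):
    """Brute-force minimum bisection width of a small (multi)graph.

    vertices: iterable of hashable labels.
    edges: list of (u, v) pairs; repeated pairs act as parallel edges.
    """
    vertices = list(vertices)
    half = len(vertices) // 2  # |V1|; |V2| then differs from it by at most 1
    best = None
    for subset in combinations(vertices, half):
        v1 = set(subset)
        # Edges with exactly one endpoint in V1 must be removed to separate V1 from V2.
        cut = sum((u in v1) != (v in v1) for u, v in edges)
        if best is None or cut < best:
            best = cut
    return best

# A 4-cycle and the complete graph on 4 vertices (illustrative inputs only).
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
k4 = list(combinations(range(4), 2))
width_cycle4 = min_bisection_width(range(4), cycle4)
width_k4 = min_bisection_width(range(4), k4)
```

Taking `half = len(vertices) // 2` covers both the even and the odd case, which mirrors the convention of "dividing the vertices in half" used in the text.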
B. Circuit Model
Central to our discussion is the relation between minimum bisection width of a graph and the area (and
thus energy) of a circuit that implements that graph. Our discussion applies directly to LDPC decoders,
and within our model we must define an LDPC decoder, as well as a more general circuit. In this paper,
the definition of a circuit is adapted from Thompson [3] and is considered to be a mathematical object
consistent with the following circuit axioms. This model was also used in [4] to find bounds on the energy
complexity of encoding and decoding algorithms. We also provide a diagram of an example circuit in
Figure 2.
• A circuit is a collection of nodes and wires laid out on a planar grid of squares. Each grid square
can be empty, can contain a computational node (sometimes referred to more simply as a node), a
wire, or a wire crossing. A circuit also has some special nodes called input nodes and output
nodes. The purpose of a circuit is to compute a function $f : \{0,1\}^n \to \{0,1\}^k$. Such a circuit is said
to have n inputs and k outputs. The computation is divided into τ clock cycles. The inputs into a
computation are to be loaded into the input nodes, and the outputs are to appear in the output nodes
during some set clock cycle of the computation.
• Each grid square has width $\lambda_w$, known as the wire width, and thus has area $\lambda_w^2$. It is in this parameter
that this circuit model subsumes different VLSI implementation techniques. In real circuits, this
parameter may be a value like 14 nanometers. Our concern in this paper is not what this value is,
but rather in providing scaling rules in terms of the VLSI implementation technology used.
Fig. 2. Diagram of a possible VLSI circuit. Grid squares that are fully filled in represent computation nodes and the lines between them
represent wires.
• The computational nodes are the “computing” parts of the circuit. A node has at most 4 bidirectional
wires connected to it, which are used to feed bits into the node and feed out the bits computed
by the node. Each node is capable of computing a fixed function of the bits fed into it by the wires
connected to it during each clock cycle. In particular, a node with f ≤ 4 wires leading into it
can compute any function $g : \{0,1\}^f \to \{0,1\}^f$. However, a computational node is restricted to only
be able to compute the same function at each clock cycle. We note, of course, that the output of a
particular node could change with each clock cycle because, in general, the inputs into the function
could change with each clock cycle.
• The wires are the “communication” part of a circuit. Wires in a circuit are connections between
computational nodes, and are assumed in our model to be bidirectional. At each clock cycle a wire
can carry one bit in each direction. The bits communicated are an output of the function computed
by the computational node to which the wire is connected. A wire can be placed in a grid square in
a way that connects one edge of the grid square to some other edge. Thus, grid squares containing
wires can be connected to form a wire leading from one node to another node.
• An input node is a special node in the circuit. In addition to being able to compute any fixed function
mapping its f ≤ 4 inputs to its f ≤ 4 outputs, this node is also given an input bit into the circuit. In
general, at each clock cycle an input node can have as its input a new input into the function. Thus, we
say that inputs, in general, can be serialized; that is, they can be injected into the circuit at different
clock cycles of the computation. Usually it is assumed that the inputs into an input node are chosen
from the set {0, 1}; however, sometimes (especially for the purpose of lower bounds) we can assume
that the inputs into an input node are chosen from a larger set of values. In [5] it was assumed that
an input node that is attached to f wires can compute any function $g : \{0,1,?\} \times \{0,1\}^4 \to \{0,1\}^4$,
i.e., the node can perform any function of its 4-bit input from the wires connecting to it, as well
as its input, taken from the symbols {0, 1, ?}, where in the case of this assumption ? is considered
an erasure symbol. In our analysis we can assume that an erasure is a valid input as well; however,
this is not a central assumption of this paper and the results apply to inputs being taken from the set
{0, 1}.
• An output node is another special node in a circuit. It is permitted to, like any other node, compute
any function of its inputs, but it is given an additional output. Thus, in the case of an output node
with f ≤ 4 wires leading from it, the output node can perform any function $g : \{0,1\}^f \to \{0,1\}^{f+1}$
where one of the bits in the output is distinguished as an output bit. The output node is required
to hold in its output bit some circuit output during set clock cycles. In a fully parallel computation
the output node is required to hold one output bit of the computation at the end of the computation,
but in general the outputs may be serialized, and one output node can be responsible for outputting
a number of the outputs of the computation, where each output has a specified clock cycle during
which it is to appear.
• A wire crossing in a circuit is a grid square that contains two wires that “cross” each other. An
example of a circuit with computational nodes, wires, and a wire crossing is given in Fig. 2.
• The normalized area of a circuit is the number of grid squares occupied, and it is denoted with
the symbol $\bar{A}$. The number of grid squares occupied with nodes (respectively, wires) is the normalized
area of the nodes (wires) of the circuit, and is denoted $\bar{A}_n$ ($\bar{A}_w$). Thus, the actual area of the circuit is $A = \lambda_w^2 \bar{A}$
and the areas of the nodes and wires are defined similarly by multiplying the normalized value by $\lambda_w^2$, the
area of a unit grid square.
• The energy of a computation is proportional to the product of the area of the circuit and the number
of clock cycles. Real VLSI circuits are made of conducting material laid out essentially flat; thus,
in our model, we say that the capacitance of a circuit is proportional to its area. A circuit works
by, at every clock cycle, charging or discharging its wires. It is thus assumed that the energy of a
computation is proportional to $\frac{1}{2} C V_{dd}^2 \tau$ where $C = C_{\text{unit-area}} A$. Thus, we can denote the energy of a
computation as $E_{\text{comp}} = \xi_{\text{tech}} A_c \tau$, where $A_c$ denotes the area of the circuit and
$\xi_{\text{tech}} = \frac{1}{2} C_{\text{unit-area}} V_{dd}^2$ is a constant that varies depending on
the technology used to implement the circuit. For decoder circuits we often denote the energy of
computation as $E_{\text{dec}}$ where the subscript indicates the type of computation performed by the circuit
under consideration.
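The energy axiom above can be written out as a short calculation. The following Python sketch is only an illustration of the model, not a statement about any real technology: the function name, the parameter names, and the numerical values in the usage lines are all hypothetical.

```python
def computation_energy(normalized_area, wire_width, clock_cycles,
                       cap_per_unit_area, v_dd):
    """Energy under the circuit model: E_comp = xi_tech * A * tau.

    normalized_area: number of occupied grid squares (A-bar in the text).
    wire_width: side length of a grid square (lambda_w).
    clock_cycles: number of clock cycles (tau).
    cap_per_unit_area, v_dd: technology parameters (assumed inputs).
    """
    area = wire_width ** 2 * normalized_area        # A = lambda_w^2 * A-bar
    xi_tech = 0.5 * cap_per_unit_area * v_dd ** 2   # technology-dependent constant
    return xi_tech * area * clock_cycles

# Illustrative numbers only: 100 grid squares, 10 clock cycles.
energy_example = computation_energy(100, 2.0, 10, 1.0, 1.0)
energy_doubled_cycles = computation_energy(100, 2.0, 20, 1.0, 1.0)
```

Because the model is a plain product, doubling either the area or the number of clock cycles doubles the energy, which is why area lower bounds translate directly into energy lower bounds.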
Note that the restriction that each node has at most four inputs and four outputs is somewhat arbitrary;
it is also arbitrary that each node is permitted to compute any function of its inputs all at the same
cost. In real VLSI implementations it may be that an arrangement of transistors can compute some
functions more efficiently than others. However, our model does not consider what gains could be made if
certain functions are cheaper in an energy sense to compute. On the other hand, the model subsumes the
interconnection complexity of the inputs of the function to their outputs. In the field of error control codes,
this interconnection complexity has been shown to be a significant factor in the energy of a computation
in, for example, [11], [12].
C. Relationship Between Circuit Model and Graphs
This paper analytically characterizes a relationship between the energy of LDPC decoders as a function
of block length and gap to capacity. To do so, we must first define what is meant by an LDPC
decoder implemented according to the Thompson VLSI model, which in turn requires understanding
the connection between a circuit and the graph corresponding to a circuit.
Note that a circuit is a collection of nodes connected by wires. Each of the computational nodes of a
circuit can be thought of as the vertices of a graph, G = (V,E). The wires of a circuit correspond to the
edges of a graph. In particular, two vertices v1 and v2 are connected in the graph G by an edge if and only
if there is a wire connecting the two computational nodes that correspond to v1 and v2. Thus, any circuit
can be considered a graph. As well, any graph can be implemented as a circuit (although of course there
may be many ways to implement a particular graph on a circuit). Note that although a circuit, according
to our model, must be planar, since we also allow wire crossings, any graph can be implemented, though
it may be that more complex graphs require far more circuit area.
Note that saying that a circuit has a corresponding graph is a slight abuse of terminology: a graph,
according to common definitions, does not allow for two edges between the same nodes, but obviously
two computational nodes are permitted to have two or more wires connecting them. More precisely, we
mean that a circuit has a corresponding multi-graph. However, for the sake of simplicity we simply call a
circuit’s corresponding multigraph a graph, and we hope that this does not cause confusion for the reader.
Sometimes in our discussion we may want to refer not to a particular node of the circuit (corresponding
to the node of a graph), but rather to the nodes associated with a subcircuit, which leads to the following
definition.
• A subcircuit is a circuit corresponding to a subset of nodes of the graph and the wires connecting
them. In particular, it is the circuit induced by deleting all wires not connecting the nodes of interest
and by deleting all the other nodes in the graph. Any subcircuit has associated with it both internal
wires (the wires connecting the nodes of this circuit) and external wires (the wires leading from
nodes within the subcircuit to nodes outside the subcircuit). Note that the notion of a subcircuit
corresponds to a particular subgraph of the graph of the circuit. In the language of graph theory [13],
we can say that a subcircuit with computational nodes corresponding to some subset V ′ ⊆ V
corresponds to the subgraph induced by the vertices in V ′. Note that any subset of the computational
nodes of a circuit induces a subcircuit and also a subgraph of the circuit’s graph.
D. LDPC Decoders
An LDPC code is a linear code first invented by Gallager in [1]. All linear codes can be specified by a
parity check matrix. Central to the construction of LDPC codes is the Tanner graph of the code corresponding
to a parity check matrix of the code. A Tanner graph is a bipartite graph. Thus, such a graph has two
partite sets, or sets of unconnected vertices which are referred to as the check nodes and the variable
nodes. An (n, k) LDPC code has associated with it a Tanner graph with n variable nodes and at least
n − k check nodes (we say at least because it may be that some of the linear constraints induced by
the check nodes are not linearly independent). The n variable nodes correspond to the n symbols of a
block length n codeword in the LDPC code. A codeword $c \in \{0,1\}^n$ is in the LDPC code generated by
a Tanner graph if, for each check node in the Tanner graph of the code, the mod 2 sum of the values
of the variable nodes to which it is connected is 0. The association of a set of linear constraints with
a Tanner graph leads to natural and very efficient methods of decoding that exploit the sparse nature of
the Tanner graph.
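The membership condition above is easy to state in code. The following Python sketch assumes the Tanner graph is stored as a list `checks`, where `checks[j]` holds the indices of the variable nodes adjacent to check node j; this representation, the function name, and the toy graph are hypothetical and chosen only for illustration.

```python
def is_codeword(word, checks):
    """Return True if every check node sees an even (mod 2) sum.

    word: list of 0/1 values, one per variable node.
    checks: checks[j] lists the variable-node indices adjacent to check node j.
    """
    return all(sum(word[i] for i in neighbours) % 2 == 0
               for neighbours in checks)

# Toy Tanner graph with 4 variable nodes and 2 check nodes (illustrative only).
toy_checks = [[0, 1, 2], [1, 2, 3]]
valid = is_codeword([1, 1, 0, 1], toy_checks)
invalid = is_codeword([1, 0, 0, 0], toy_checks)
```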
An LDPC decoding algorithm associated with a Tanner graph is a message-passing procedure. Each
variable node is thought of conceptually as connected to its check nodes, and each check node
correspondingly to its variable nodes. In general, a variable node has as its inputs a message passed to it
from each of the check nodes to which it is connected, as well as the output of a noisy channel. A variable
node, in general, is able to compute any function of these inputs and pass the outputs of this computation
to its adjacent check nodes. The check nodes are similarly allowed to compute any function of their inputs
(which will be in general the outputs of the variable nodes to which they are connected). An iteration
of an LDPC decoder is one instance of this procedure: the variable nodes computing a function that is
then passed to the check nodes, and then the check nodes computing a function of these messages and
passing the output of these functions back to the variable nodes to which they are connected. A good
LDPC decoding algorithm should choose these functions well, so that, at the end of a certain number of
clock cycles τ , the variable nodes hold within them an estimate of the original input into a noisy channel.
In the most general case, we allow the check and variable nodes to compute different functions of their
inputs during different iterations (i.e., the function they compute in general may vary in time). Gallager
discussed a variety of these message passing procedures in [1].
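To make the message-passing description concrete, the sketch below shows one very simple choice of node functions for an erasure channel: a check node with exactly one erased neighbour recovers that neighbour from the parity of the others. This is a simplified serial schedule written for illustration, under the assumption that erasures are marked with `None`; it is not the decoder analysed in this paper, and the function name and data layout are hypothetical.

```python
def peel_erasures(received, checks, max_iterations=50):
    """Iterative erasure decoding on a Tanner graph.

    received: list with entries 0, 1, or None (None marks an erasure).
    checks: checks[j] lists the variable-node indices adjacent to check node j.
    Returns a new list; erasures that cannot be resolved remain None.
    """
    word = list(received)
    for _ in range(max_iterations):
        progress = False
        for neighbours in checks:
            erased = [i for i in neighbours if word[i] is None]
            if len(erased) == 1:
                # The parity of the known neighbours determines the erased bit.
                known_sum = sum(word[i] for i in neighbours if word[i] is not None)
                word[erased[0]] = known_sum % 2
                progress = True
        if not progress:
            break  # no check node can resolve anything further
    return word

toy_checks = [[0, 1, 2], [1, 2, 3]]
decoded = peel_erasures([None, None, 0, 1], toy_checks)
stuck = peel_erasures([None, None, None, 1], toy_checks)
```

In the first usage line the second check resolves variable 1, after which the first check resolves variable 0; in the second, every check has at least two erased neighbours, so decoding stalls.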
To instantiate an LDPC decoding algorithm in a circuit, we consider two possible paradigms: a
directly-implemented technique, in which the Tanner graph of an LDPC code is directly instantiated in some sense
by the circuit, and a complete-check-node serialized technique, in which the Tanner graph is not necessarily
directly implemented, but there are subcircuits in the circuit corresponding to each check node and an LDPC
message passing procedure is performed.
A directly-instantiated LDPC decoder can be thought of as a circuit that has a graph that is an
implementation of a Tanner graph of the underlying LDPC code. To be precise, we will use terminology
borrowed from graph theory regarding the subdivision of a graph.
Definition 2. Suppose a graph has an edge, e, connecting vertices v1 and v2. Then a subdivision of edge
e in a graph is a process that takes the graph G and forms a new graph G′ with an additional vertex v′
and two additional edges connecting v1 and v2 to v′ by replacing e with two edges. A subdivision of a
graph G is a graph obtained by the successive subdivisions of edges in the graph.
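The subdivision operation of Definition 2 can be sketched on an edge-list representation of a multigraph. The helper below is an illustration only; the function name and the list-of-pairs representation are assumptions, not notation from the paper.

```python
def subdivide_edge(edges, index, new_vertex):
    """Return a new edge list in which edges[index] = (v1, v2) is replaced
    by the two edges (v1, new_vertex) and (new_vertex, v2)."""
    v1, v2 = edges[index]
    return edges[:index] + [(v1, new_vertex), (new_vertex, v2)] + edges[index + 1:]

# Subdivide the first edge of a path on three vertices with a new vertex "x".
path = [(0, 1), (1, 2)]
subdivided = subdivide_edge(path, 0, "x")
```

Each subdivision adds exactly one vertex and one edge, which corresponds in a circuit to placing an extra node partway along a wire.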
If a graph G has a subgraph that is a subdivision of a graph G′, then we say that the graph G contains
graph G′. This leads to an important lemma that will allow us to connect bounds on graph properties of
a Tanner graph to the area of directly-implemented LDPC decoders.
Definition 3. A directly-implemented LDPC decoder is a circuit associated with an LDPC code with a
Tanner graph T . Consider the graph associated with the circuit. Then a circuit is a directly-implemented
LDPC decoder if its graph contains T .
This means that a circuit is a directly-implemented LDPC decoder if there are subcircuits corresponding
to each variable node and edges leading from these “black boxes” that connect to subcircuits that
correspond to the check nodes of the Tanner graph.
Associated with any graph G is a quantity that we will call the minimum area of a circuit implementation
of G, or, to be more concise, the area of G. The area of a graph G is the area of the circuit with
corresponding graph G that occupies the minimum number of grid squares. We denote this quantity as Amin (G).
Lemma 1. If a graph G contains a graph G′, then Amin (G) ≥ Amin (G′).
Remark 1. This is a very intuitive idea. If a graph contains another graph, then naturally one would regard
the original graph as “larger” in some sense than the graph that it contains. This notion will be used to
connect a bound on the area of a circuit implementing the Tanner graph of an LDPC code to a bound on
directly-implemented LDPC decoders.
Proof: Suppose that Amin (G) < Amin (G′). Consider the circuit with minimal area that implements
G. We can use this circuit to construct a circuit for G′ with area less than Amin (G′), resulting in a
contradiction. Since G contains G′, there is a subgraph of G that is a subdivision of G′. Consider the
subcircuit associated with that subgraph. Clearly, this subcircuit has area less than or equal to Amin (G).
Delete those nodes of this subgraph that correspond to subdivisions of edges of G′. On a circuit, this
corresponds to replacing a computational node with merely a wire. This process does not change the area
of this subcircuit, and it results in a circuit for G′ with area less than Amin (G′), a contradiction.
There is a key result attributed to Thompson [3] that relates a graph’s minimum bisection width to the
area of a circuit implementing that graph, presented in the following lemma.
Lemma 2. If a graph has minimum bisection width ω, then the area of a circuit implementing this graph
is lower bounded by
$$A_c \ge \frac{\lambda_w^2 \omega^2}{4}.$$
Proof: See Thompson [3] for a detailed proof.
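As a numerical illustration of Lemma 2 (not part of the original proof), the sketch below evaluates the bound and shows why a bisection width that grows linearly in the number of vertices forces quadratic area. The function name and the choice of bisection width n/10 are hypothetical.

```python
def thompson_area_lower_bound(bisection_width, wire_width=1.0):
    """Lemma 2: A_c >= wire_width^2 * bisection_width^2 / 4."""
    return wire_width ** 2 * bisection_width ** 2 / 4

# Assume, for illustration, a bisection width of n / 10 for each graph size n.
area_bounds = {n: thompson_area_lower_bound(n // 10) for n in (1000, 2000, 4000)}
growth_ratio = area_bounds[2000] / area_bounds[1000]
```

Doubling n multiplies the bound by four, which is the quadratic behaviour behind the $\Omega(n^2)$ scaling rule discussed in this paper.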
Currently, our definition of a directly-implemented LDPC decoder subsumes many practical
implementations of LDPC decoding algorithms, but in practice circuits can be implemented that perform an LDPC
decoding algorithm and do not directly instantiate the Tanner graph of the code. This motivates the
following definition of a more general type of LDPC decoder.
Definition 4. An (n, k) complete-check-node LDPC decoder associated with Tanner graph T is a circuit
with n separate subcircuits each corresponding to a variable node in T and one subcircuit corresponding to
each check node in T . During one iteration a message must be passed from each variable-node subcircuit
to each adjacent check-node subcircuit, and also from each check-node subcircuit to each adjacent variable-
node subcircuit. To be precise, the check-node subcircuits that are adjacent to a variable-node subcircuit
are those check-node subcircuits that correspond to check nodes in T that are adjacent to the variable
node that corresponds to the variable-node subcircuit of interest. The variable-node subcircuits that are
adjacent to a check-node subcircuit are defined similarly.
Note that for such a circuit we do not require that a wire exists in the circuit for each edge in the Tanner
graph. Thus, it is possible that a complete-check-node LDPC decoder can use the same wire multiple
times, but in different clock cycles to communicate information during an iteration.
Our results rely on the evaluation of some limits, which we present as lemmas below.
Lemma 3. Suppose $P(n) = O(n^k)$ for some $k > 0$ and is positive for sufficiently large n, and there is
a sequence $n_1, n_2, \ldots$ that increases without bound. Then:
$$\lim_{i \to \infty} P(n_i) \exp(-n_i f(n_i)) = 0 \quad \text{if} \quad \lim_{n \to \infty} f(n) > 0.$$
Proof: Since $\lim_{n \to \infty} f(n) > 0$ and the sequence $n_i$ increases without bound, then for sufficiently
large i, $f(n_i) > c$ for some $c > 0$ (in particular for any c strictly less than the value of the limit). Then,
for sufficiently large i,
$$P(n_i) \exp(-n_i f(n_i)) \le P(n_i) \exp(-c n_i).$$
Clearly, $\lim_{i \to \infty} P(n_i) \exp(-c n_i) = 0$ and because P(n) is positive for large enough n,
$P(n_i) \exp(-n_i f(n_i)) > 0$ for large enough i. The limit thus follows from the squeeze theorem.
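A quick numerical check can make the limit in Lemma 3 tangible. The sketch below uses the hypothetical choices $P(n) = n^5$ and $f(n) = 0.1 + 1/n$, whose limit 0.1 is positive; these are illustrative assumptions and not functions that appear in the paper.

```python
import math

def damped_polynomial(n, k=5):
    """Evaluate P(n) * exp(-n * f(n)) for P(n) = n**k and f(n) = 0.1 + 1/n."""
    f_n = 0.1 + 1.0 / n
    return n ** k * math.exp(-n * f_n)

# The product may grow at first, but the exponential factor eventually dominates.
damped_values = [damped_polynomial(n) for n in (10, 100, 1000, 10000)]
```

For the largest argument the value underflows to zero in double precision, consistent with the limit being 0.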
Lemma 4. For any two positive integers m and n in which
$$m + n \le Y \qquad (1)$$
for an integer $Y > 0$ where $Z \le Y$ and both $m \le Z$ and $n \le Z$,
$$m! \, n! \le Z! \, (Y - Z)!. \qquad (2)$$
Proof: Since $m + n \le Y$, then $n = Y - m$ surely maximizes the product $m! \, n!$ (regardless of any
additional restriction on n). Consider a possible choice of $m = Z - c$ and $n = Y - Z + c$, for some $c > 0$
in which $Y - Z + c \le Z$. We divide $Z! \, (Y - Z)!$ by the quantity $(Z - c)! \, (Y - Z + c)!$ and show that
this ratio is greater than or equal to 1, meaning that $Z! \, (Y - Z)!$ maximizes the product:
$$\frac{Z! \, (Y - Z)!}{(Z - c)! \, (Y - Z + c)!} = \frac{Z (Z - 1) \cdots (Z - c + 1)}{(Y - Z + c)(Y - Z + c - 1) \cdots (Y - Z + 1)}.$$
Note that the numerator and denominator have precisely c terms. Since $Z \ge Y - Z + c$, each term in
the numerator is greater than or equal to the corresponding term in the denominator, with equality only when
$Y - Z + c = Z$, in which case the product is merely equal to the upper bound in (2).
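Lemma 4 can be sanity-checked by exhaustive enumeration over small parameters. The sketch below reads the hypotheses as requiring $Z \le Y$, so that $(Y - Z)!$ is defined; this reading, the function name, and the tested ranges are assumptions made for illustration, and a finite check of this kind is of course not a proof.

```python
from math import factorial

def lemma4_holds(Y, Z):
    """Check m! n! <= Z! (Y - Z)! for all positive integers m, n with
    m + n <= Y, m <= Z and n <= Z. Assumes Z <= Y."""
    bound = factorial(Z) * factorial(Y - Z)
    return all(
        factorial(m) * factorial(n) <= bound
        for m in range(1, Z + 1)
        for n in range(1, Z + 1)
        if m + n <= Y
    )

# Enumerate a small grid of (Y, Z) pairs with Z <= Y.
all_small_cases_hold = all(
    lemma4_holds(Y, Z) for Z in range(1, 9) for Y in range(Z, 2 * Z + 4)
)
```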
IV. MAIN THEOREM
Our main theorem is fundamentally graph-theoretic in nature and applies to graphs generated according
to a standard uniform random configuration model. We present this theorem in a general form and then
specialize it to create an “almost sure” scaling rule for capacity-approaching LDPC codes.
Consider the set of bipartite graphs $G = (V_L \amalg V_R, E)$ in which $|V_L| = n$, $|V_R| = m$, and with left node
degree sequence $\Lambda = (\lambda_1, \lambda_2, \ldots, \lambda_n) \in \mathbb{N}^n$ and right node degree sequence
$P = (\rho_1, \rho_2, \ldots, \rho_m) \in \mathbb{N}^m$. In other words, for a particular graph in this set, $\lambda_i$ is the degree of $v_i \in V_L$, the ith left node in
the graph, and $\rho_i$ is the degree of $r_i \in V_R$, the ith right node in the graph. Without loss of generality,
assume that the degree sequences are ordered, i.e. that $\lambda_1 \le \lambda_2 \le \ldots \le \lambda_n$ and $\rho_1 \le \rho_2 \le \ldots \le \rho_m$, and
also, without loss of generality, assume $n \ge m$. Denote this set $\mathcal{G}(\Lambda, P)$. Note that the number of edges
in each particular graph in $\mathcal{G}(\Lambda, P)$ is $|E| = \sum_{i=1}^{n} \lambda_i = \sum_{i=1}^{m} \rho_i$.
For convenience of counting, we will consider not the set of graphs with a particular degree sequence,
but rather the set of configurations with this degree sequence. We can associate each node in a graph
with a number of sockets equal to its degree. Then, we can label each socket, so that, for example, the
first node in the left side of the bipartite graph would have sockets labelled $L_{11}, L_{12}, \ldots, L_{1\lambda_1}$, where the
symbol $L_{ij}$ is used to denote the jth socket on the ith left node. Thus, the ith left node would have
$\lambda_i$ sockets labelled $L_{i1}, L_{i2}, \ldots, L_{i\lambda_i}$. Also, the right nodes would have sockets labelled $R_{ij}$, where $R_{ij}$
denotes the jth socket on the ith right node. This node and socket configuration model is a standard way
to consider the set of bipartite graphs that form the Tanner graphs of LDPC ensembles, and in particular
is discussed at length in [14]. A multigraph together with a labelling of the sockets of each node is called
a configuration. Any particular left and right degree sequences Λ and P have associated with them the
set of all configurations with these node degree sequences, and this set is called the configuration space
associated with the degree sequences. Clearly, a configuration is determined by a permutation mapping the
|E| left node sockets to the |E| right node sockets. Note that there are |E|! configurations within the space
of configurations with degree sequences Λ and P . Let the set of configurations with degree sequences Λ
and P be denoted B (Λ, P ). Since a configuration is merely a graph with a labelling of sockets for each
node, graph properties can be extended to describe configurations in the natural way, including minimum
bisection width.
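As an illustration of the uniform configuration model described above, the following minimal sketch draws one configuration by applying a uniformly random permutation to the right sockets and pairing them with the left sockets. The function name `sample_configuration` and the 0-indexed socket labels `(i, j)` are assumptions made for this example, not notation from the paper.

```python
import random

def sample_configuration(left_degrees, right_degrees, seed=None):
    """Draw one configuration uniformly at random.

    Returns a list of (left_socket, right_socket) pairs, where a socket (i, j)
    is the j-th socket on the i-th node (0-indexed, hypothetical labelling).
    """
    if sum(left_degrees) != sum(right_degrees):
        raise ValueError("left and right degree sequences must have equal sums")
    rng = random.Random(seed)
    left_sockets = [(i, j) for i, d in enumerate(left_degrees) for j in range(d)]
    right_sockets = [(i, j) for i, d in enumerate(right_degrees) for j in range(d)]
    # A uniform permutation of the |E| right sockets determines the configuration.
    rng.shuffle(right_sockets)
    return list(zip(left_sockets, right_sockets))

# Example: 8 left nodes of degree 2 and 4 right nodes of degree 4, so |E| = 16.
config = sample_configuration([2] * 8, [4] * 4, seed=0)
```

Because the pairing is a permutation of sockets rather than of nodes, two edges may join the same pair of nodes, which is why a configuration is in general a multigraph.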
Define
\[ B_a = \{ G \in \mathcal{B}(\Lambda, P) : \exists \text{ a bisection } K \subseteq E \text{ such that } |K| = a \} \]
or in other words let $B_a$ be the set of configurations in $\mathcal{B}(\Lambda, P)$ that have a bisection of size a. Note that
$B_a$ does not represent the set of configurations in $\mathcal{B}(\Lambda, P)$ with minimum bisection width a, but rather
the set of configurations with any bisection of size a. Define $B_a^*$ to be the set of all configurations in $\mathcal{B}(\Lambda, P)$
that have a bisection of size a or less, that is,
\[ B_a^* = \bigcup_{i=0}^{a} B_i. \]
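Define
\[ \delta_L(\Lambda) = \frac{1}{n} \sum_{i=\lfloor n/2 \rfloor}^{n} \lambda_i \tag{3} \]
(a function of a particular left degree sequence) and let
\[ \sigma_L(\Lambda) = \frac{|E|}{n} - \delta_L. \]
We define these quantities so that any subset of half the left nodes can have at most $\delta_L n$ “sockets” leading
from these nodes. Similarly, define
\[ \delta_R(P) = \frac{1}{m} \sum_{i=\lfloor m/2 \rfloor}^{m} \rho_i \]
and
\[ \sigma_R(P) = \frac{|E|}{m} - \delta_R. \]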
The quantities δL (Λ) and σL (Λ) are functions of the left degree distribution. As well, δR (P ) and σR (P )
are functions of the right degree distribution. For convenience, we may sometimes denote these quantities
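as $\delta_L$, $\sigma_L$, $\delta_R$ and $\sigma_R$, leaving their dependence on the degree distributions implicit. Thus, it is clear
that the total number of edges in such a configuration is $\delta_L n + \sigma_L n = \delta_R m + \sigma_R m$. Define
\[ \delta(\Lambda, P) = \frac{\max\left(n\,\delta_L(\Lambda),\; m\,\delta_R(P)\right)}{n} \tag{4} \]
and define
\[ \sigma(\Lambda, P) = \frac{|E|}{n} - \delta. \]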
For notational convenience we will abbreviate these two quantities as $\delta$ and $\sigma$, leaving their dependence on
the node degree distribution under discussion implicit. Note that $|E| = \delta n + \sigma n$. These quantities
are defined so that in any subset of half the nodes (that is, $\frac{n+m}{2}$ nodes) of a configuration in $\mathcal{B}(\Lambda, P)$, the minimum of
the number of left sockets and right sockets cannot exceed $\delta n$. This observation will be useful in deriving
the bounds in this section and will be made more formal in Lemma 5.
Consider a given set of nodes $N \subseteq V$ for a bipartite multigraph as defined above, with left degree
sequence $\Lambda$ and right degree sequence $P$. We can divide this set
into two disjoint sets, $N_L$ and $N_R$, where $N_L$ is the set of all those vertices in $N$ that are left nodes, and
$N_R$ the set of all those vertices in $N$ that are right nodes. Let $L(N) = \sum_{v \in N_L} \deg(v)$ and $R(N) = \sum_{v \in N_R} \deg(v)$
be the number of “sockets” attached to the left nodes in $N$ and right nodes in $N$ respectively.
Lemma 5. For any bipartite multigraph $G = (V_L \sqcup V_R, E)$ with left degree sequence $\Lambda$ and right degree
sequence $P$, for any collection $N$ of $\frac{n+m}{2}$ vertices, $\min(L(N), R(N)) \le n\delta$.
Remark 2. We will use this lemma in a counting upper-bounding argument. Specifically, we will count
the number of graph configurations that have a bisection of size a by dividing the vertices that form a
graph into two equally-sized sets. The quantity min (L (N) , R (N)) will be important for our counting
bounds.
Proof: Suppose not. This implies that both $L(N) > n\delta$ and $R(N) > n\delta$. Divide the vertices in
$N$ into the left nodes $N_L$ and right nodes $N_R$. It must be that $|N_L| + |N_R| = \frac{n+m}{2}$. Thus, it must be
that $|N_L| \le \frac{n}{2}$ or $|N_R| \le \frac{m}{2}$ (otherwise their sum would exceed $\frac{m+n}{2}$). Let us consider the case in which
$|N_L| \le \frac{n}{2}$ (the other case leads to an analogous argument). If $|N_L| \le \frac{n}{2}$ and $L(N) > n\delta$, then, in particular,
$L(N) > n\delta_L(\Lambda)$ by the definition of $\delta$ in (4). But $n\delta_L(\Lambda)$, by definition (3), is the sum of the degrees of the highest degree left
nodes. The total degree of a collection of at most half the left nodes cannot exceed this quantity, leading to a contradiction.
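Lemma 5 can be checked exhaustively on small examples. The sketch below is an illustration only: it computes $n\delta$ from (3) and (4), reading the sums literally as running over the 1-indexed positions $\lfloor n/2 \rfloor, \ldots, n$ of the sorted degree sequence, and then tests every subset of $(n+m)/2$ nodes. The function names are hypothetical.

```python
from itertools import combinations

def delta_n(left_degrees, right_degrees):
    """Return n*delta as in (4), with the sums of (3) read literally
    (1-indexed positions floor(n/2)..n of the sorted sequences)."""
    lam, rho = sorted(left_degrees), sorted(right_degrees)
    n, m = len(lam), len(rho)
    n_delta_left = sum(lam[max(n // 2 - 1, 0):])
    m_delta_right = sum(rho[max(m // 2 - 1, 0):])
    return max(n_delta_left, m_delta_right)

def lemma5_holds(left_degrees, right_degrees):
    """Brute-force check that min(L(N), R(N)) <= n*delta for every
    collection N of (n + m) // 2 vertices."""
    n, m = len(left_degrees), len(right_degrees)
    nodes = [("L", d) for d in left_degrees] + [("R", d) for d in right_degrees]
    bound = delta_n(left_degrees, right_degrees)
    for subset in combinations(nodes, (n + m) // 2):
        left_sockets = sum(d for side, d in subset if side == "L")
        right_sockets = sum(d for side, d in subset if side == "R")
        if min(left_sockets, right_sockets) > bound:
            return False
    return True
```

The check is exponential in the number of nodes, so it is only practical for toy degree sequences such as eight left nodes of degree 2 and four right nodes of degree 4.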
Lemma 6. If a configuration $G = (V_L \sqcup V_R, E)$ with degree sequences $P$ and $\Lambda$ is generated according
to the uniform configuration model, then the probability that this configuration is in the set $B_a^*$, and hence
has a bisection of size a or less, when
\[ 0 \le a < \sigma n \tag{5} \]
is upper bounded by
\[ P(B_a^*) \le (a+1)\, n^2 \binom{n}{n/2}^{2} \binom{|E|}{a}^{4} \frac{a!\,(\delta n)!\,(\sigma n - a)!}{(\delta n + \sigma n)!}. \tag{6} \]
Proof: Follows from a straightforward counting upper-bounding technique given in the appendix.
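To get a feel for how quickly the bound in (6) decays, its logarithm can be evaluated numerically. The sketch below is illustrative only: it assumes the reading $(a+1)\, n^2 \binom{n}{n/2}^2 \binom{|E|}{a}^4 \, a!\,(\delta n)!\,(\sigma n - a)! / (\delta n + \sigma n)!$ of the right-hand side, and uses the log-gamma function so that large factorials do not overflow. The function names and the sample parameters are assumptions chosen for the example.

```python
from math import lgamma, log

def log_factorial(x):
    """ln(x!) via the log-gamma function."""
    return lgamma(x + 1.0)

def log_binomial(n, k):
    """ln of the binomial coefficient C(n, k)."""
    return log_factorial(n) - log_factorial(k) - log_factorial(n - k)

def log_lemma6_bound(n, delta, sigma, a):
    """Natural log of the right-hand side of (6); requires 0 <= a < sigma * n."""
    dn, sn = delta * n, sigma * n
    if not 0 <= a < sn:
        raise ValueError("need 0 <= a < sigma * n, as in (5)")
    return (log(a + 1) + 2 * log(n)
            + 2 * log_binomial(n, n / 2)
            + 4 * log_binomial(dn + sn, a)
            + log_factorial(a) + log_factorial(dn) + log_factorial(sn - a)
            - log_factorial(dn + sn))

# Example with delta = sigma = 3, n = 1000 left nodes, bisections of size <= 10.
value = log_lemma6_bound(1000, 3, 3, 10)
```

For these sample values the logarithm is strongly negative, so the bound on the probability of a bisection of size 10 or less is vanishingly small.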
This lemma can be used to prove our main theorem which shows that if a sequence of node-and-
socket configurations is generated uniformly over all such configurations, and the quantities δ and σ
(quantities that could in general change with each element of the sequence) scale according to a particular
condition, then the probability that a configuration in this randomly generated sequence has a small
bisection (proportional to n or less) approaches 0.
Our main theorem concerns sequences of random configurations. Specifically, we concern ourselves
with a sequence of random configurations G1, G2, . . . where each Gi in the sequence is a configuration
generated according to the uniform configuration model, in which the ith configuration is drawn according
to node degree distributions Λi and Pi. Note that the randomness for each element of such a sequence
does not come from the degree distributions: we are assuming that these distributions are fixed. It is the
interconnections between nodes that is random. We specifically concern ourselves with a sequence in
which the number of left nodes n increases without bound. For such a sequence, denote the number of
left nodes of the ith configuration as ni. We will abbreviate the quantities δ (Λi, Pi) and σ (Λi, Pi) with
the symbols δi and σi respectively, where we recall their definitions in (4) and in the display that follows it. When the dependence
on i is clear, the subscript for these symbols may be omitted for convenience.
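Theorem 1. Suppose that there is a sequence of randomly generated bipartite configurations with a series
of degree sequences in which the number of left nodes approaches infinity. If
\[ \lim_{i \to \infty} \left[ 2H\left(\tfrac{1}{2}\right) + \delta_i \ln\left(\frac{\delta_i}{\delta_i + \sigma_i}\right) + \sigma_i \ln\left(\frac{\sigma_i}{\delta_i + \sigma_i}\right) \right] < 0 \tag{7} \]
then there exists some $\beta > 0$ for which
\[ \lim_{i \to \infty} P\left(B^*_{\beta n_i}\right) = 0, \]
and in particular, this occurs for any value of $0 < \beta < \sigma$ that satisfies:
\[ \lim_{i \to \infty} \left[ 2H\left(\tfrac{1}{2}\right) + 4H\left(\frac{\beta}{\delta_i + \sigma_i}\right) + \beta \ln\left(\frac{\beta}{\sigma_i - \beta}\right) + \delta_i \ln\left(\frac{\delta_i}{\delta_i + \sigma_i}\right) + \sigma_i \ln\left(\frac{\sigma_i - \beta}{\delta_i + \sigma_i}\right) \right] < 0. \tag{8} \]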
Remark 3. This theorem says that subject to some condition on the average edge degrees of the configu-
rations, as these configurations get larger the probability that the configuration generated has a bisection
proportional to n or less gets vanishingly small. We will use this result to show that for capacity-
approaching LDPC degree distributions, the minimum bisection width must be large in some sense,
implying that circuit implementations of these LDPC Tanner graphs must grow quickly as well, with high
probability. The condition in (7) recognizes that for a sequence of such graphs, the quantities δ and σ
could change with increasing n. If the condition is satisfied (which we will see for capacity-approaching
LDPC degree sequences it must) then with high probability the graphs do not have a “small” bisection.
Proof: (of Theorem 1) Consider first a specific random configuration in the sequence with block
length n and node degree distributions that result in values for $\delta$ and $\sigma$. We will use the bounds of
Lemma 6 and then apply well known approximations. Firstly, we use the well known bounds that
\[ e \exp\left(n \ln\left(\frac{n}{e}\right)\right) \le n! \le e \exp\left((n+1) \ln\left(\frac{n+1}{e}\right)\right) \]
and that
\[ \binom{n}{k} \le \exp\left(n H\left(\frac{k}{n}\right)\right) \]
where $H(x) = -x \ln x - (1-x) \ln(1-x)$. We use base e as opposed to base 2 in order to conveniently
simplify the expressions that follow. Since $e \exp\left((n+1)\ln\frac{n+1}{e}\right) = (n+1)\exp\left(n \ln\frac{n+1}{e}\right)$, the upper bound on $n!$
contributes only a polynomial factor beyond $\exp\left(n \ln\frac{n+1}{e}\right)$. Applying these bounds appropriately to the bound in Lemma 6, and
grouping the factors that grow at most polynomially in n into an arbitrary polynomial term $P(n)$, we get the following:
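\[
\begin{aligned}
P(B_a^*) \le{} & P(n)\,(a+1) \exp\left(2nH\left(\tfrac{1}{2}\right) + 4nH\left(\frac{a}{|E|}\right)\right) \\
& \times \exp\left(a \ln\left(\frac{a+1}{e}\right) + \delta n \ln\left(\frac{\delta n + 1}{e}\right)\right) \\
& \times \exp\left((\sigma n - a) \ln\left(\frac{\sigma n - a + 1}{e}\right)\right) \\
& \times \exp\left(-(\delta n + \sigma n) \ln\left(\frac{\delta n + \sigma n}{e}\right)\right).
\end{aligned}
\]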
Expanding the last two terms in the exponent gives us:
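\[
\begin{aligned}
P(B_a^*) \le{} & P(n)\,(a+1) \exp\left(2nH\left(\tfrac{1}{2}\right) + 4nH\left(\frac{a}{|E|}\right)\right) \\
& \times \exp\left(a \ln\left(\frac{a+1}{e}\right) + \delta n \ln\left(\frac{\delta n + 1}{e}\right)\right) \\
& \times \exp\left(\sigma n \ln\left(\frac{\sigma n - a + 1}{e}\right) - a \ln\left(\frac{\sigma n - a + 1}{e}\right)\right) \\
& \times \exp\left(-\delta n \ln\left(\frac{\delta n + \sigma n}{e}\right) - \sigma n \ln\left(\frac{\delta n + \sigma n}{e}\right)\right).
\end{aligned}
\]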
Factoring the terms in the exponent with an a term, a δn term, and a σn term gives us:
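\[
\begin{aligned}
P(B_a^*) \le{} & P(n)\,(a+1) \exp\left(2nH\left(\tfrac{1}{2}\right) + 4nH\left(\frac{a}{|E|}\right)\right) \\
& \times \exp\left(a \left(\ln\left(\frac{a+1}{e}\right) - \ln\left(\frac{\sigma n - a + 1}{e}\right)\right)\right) \\
& \times \exp\left(\sigma n \left(\ln\left(\frac{\sigma n - a + 1}{e}\right) - \ln\left(\frac{\delta n + \sigma n}{e}\right)\right)\right) \\
& \times \exp\left(\delta n \left(\ln\left(\frac{\delta n + 1}{e}\right) - \ln\left(\frac{\delta n + \sigma n}{e}\right)\right)\right).
\end{aligned}
\]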
Simplifying the logarithmic expressions in each line gives us:
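\[
\begin{aligned}
P(B_a^*) \le{} & P(n)\,(a+1) \exp\left(2nH\left(\tfrac{1}{2}\right) + 4nH\left(\frac{a}{|E|}\right)\right) \\
& \times \exp\left(a \ln\left(\frac{a+1}{\sigma n - a + 1}\right)\right) \\
& \times \exp\left(\sigma n \ln\left(\frac{\sigma n - a + 1}{\delta n + \sigma n}\right)\right) \\
& \times \exp\left(\delta n \ln\left(\frac{\delta n + 1}{\delta n + \sigma n}\right)\right).
\end{aligned}
\]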
We now let a = βn, which will satisfy the condition specified in (5) for β < σ. Making this substitution
and also using that |E| = δn+ σn to expand the |E| term in the first line of the expression gives us:
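\[
\begin{aligned}
P\left(B^*_{\beta n}\right) \le{} & P(n)\,(\beta n + 1) \exp\left(2nH\left(\tfrac{1}{2}\right) + 4nH\left(\frac{\beta n}{\delta n + \sigma n}\right)\right) \\
& \times \exp\left(\beta n \ln\left(\frac{\beta n + 1}{\sigma n - \beta n + 1}\right)\right) \\
& \times \exp\left(\sigma n \ln\left(\frac{\sigma n - \beta n + 1}{\delta n + \sigma n}\right)\right) \\
& \times \exp\left(\delta n \ln\left(\frac{\delta n + 1}{\delta n + \sigma n}\right)\right).
\end{aligned}
\]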
Simplifying each quotient within the logarithms, and grouping the (βn+ 1) term into our arbitrary
polynomial term:
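\[
\begin{aligned}
P\left(B^*_{\beta n}\right) \le{} & P(n) \exp\left(2nH\left(\tfrac{1}{2}\right) + 4nH\left(\frac{\beta}{\delta + \sigma}\right)\right) \\
& \times \exp\left(\beta n \ln\left(\frac{\beta + \frac{1}{n}}{\sigma - \beta + \frac{1}{n}}\right)\right) \\
& \times \exp\left(\sigma n \ln\left(\frac{\sigma - \beta + \frac{1}{n}}{\delta + \sigma}\right)\right) \\
& \times \exp\left(\delta n \ln\left(\frac{\delta + \frac{1}{n}}{\delta + \sigma}\right)\right).
\end{aligned}
\]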
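By factoring the n term and by applying Lemma 3, we see that the above expression will approach 0 if
\[ \lim_{i \to \infty} \left[ 2H\left(\tfrac{1}{2}\right) + 4H\left(\frac{\beta}{\delta + \sigma}\right) + \beta \ln\left(\frac{\beta + \frac{1}{n}}{\sigma - \beta + \frac{1}{n}}\right) + \sigma \ln\left(\frac{\sigma - \beta + \frac{1}{n}}{\delta + \sigma}\right) + \delta \ln\left(\frac{\delta + \frac{1}{n}}{\delta + \sigma}\right) \right] < 0 \]
where we recall again that the dependence on i in this expression comes from the n terms and the $\delta$ and
$\sigma$ terms (whose dependence on i we have suppressed for notational compactness). This is true if
\[ \lim_{i \to \infty} \left[ 2H\left(\tfrac{1}{2}\right) + 4H\left(\frac{\beta}{\delta + \sigma}\right) + \beta \ln\left(\frac{\beta}{\sigma - \beta}\right) + \sigma \ln\left(\frac{\sigma - \beta}{\delta + \sigma}\right) + \delta \ln\left(\frac{\delta}{\delta + \sigma}\right) \right] < 0. \]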
Also note that this is the condition on $\beta$ given in (8). To derive the condition in (7), we take the limit of this expression as
$\beta$ approaches 0, treating the other terms as constants, giving us:
\[ 2H\left(\tfrac{1}{2}\right) + \sigma \ln\left(\frac{\sigma}{\delta + \sigma}\right) + \delta \ln\left(\frac{\delta}{\delta + \sigma}\right) < 0 \]
where we have applied the easily verifiable facts that $\lim_{x \to 0} H\left(\frac{x}{c}\right) = 0$ and $\lim_{x \to 0} x \ln\left(\frac{x}{\sigma - x}\right) = 0$ to
get rid of the second and third terms in the expression. Thus, if this condition is satisfied, by the definition
of a limit, there exists a sufficiently small $\beta$ for which $\lim_{i \to \infty} P\left(B^*_{\beta n}\right) = 0$.
As we are considering a sequence of configurations, we let $\omega_i$ be the minimum bisection width of the
ith configuration. This theorem has an obvious corollary.
Corollary 1. If there is a sequence of configurations as described in Theorem 1, in which the condition
in (7) is satisfied, then there exists some $\beta > 0$ for which $\lim_{i \to \infty} P(\omega_i \ge \beta n_i) = 1$.
Proof: Note that the event $B_a^*$ is the event that a random configuration has a bisection of size a or
less. The complement of this event is the event that a random configuration has no bisection of size a or
less, that is, the event that its minimum bisection width is strictly greater than a, which in particular implies that it is
at least a. The corollary follows directly from this observation applied with $a = \beta n_i$.
A. Application to a Specific Sequence of Random Configurations
Our result in Theorem 1 can be directly applied to the Tanner graphs of specific sequences of LDPC
codes. For example, consider a regular LDPC ensemble with variable node degree 6 and check node degree
3. A randomly generated Tanner graph with this degree distribution would have
\[ \delta n = \sum_{i=n/2}^{n} \lambda_i = \frac{6n}{2} = 3n \]
and $\sigma n = |E| - 3n = 3n$. In this case we can compute that the condition in (7) evaluates to:
\[ 2H\left(\tfrac{1}{2}\right) + \delta \ln\left(\frac{\delta}{\delta + \sigma}\right) + \sigma \ln\left(\frac{\sigma}{\delta + \sigma}\right) = 2H\left(\tfrac{1}{2}\right) + 3 \ln\left(\frac{3}{3+3}\right) + 3 \ln\left(\frac{3}{3+3}\right) \approx -2.77 \]
which we see is less than 0. Thus, since the condition (7) is satisfied, our theorem implies that
if random Tanner graphs are generated with this degree distribution, then with probability approaching 1 the
minimum bisection width of these graphs will be at least proportional to n.
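The evaluation above can be reproduced numerically. The following sketch is an illustration under the assumption that $\delta$ and $\sigma$ are supplied directly; it computes the left-hand sides of conditions (7) and (8) using the natural-log entropy $H$. The function names `condition7` and `condition8` are chosen for this example only.

```python
from math import log

def entropy(x):
    """Binary entropy in nats, with H(0) = H(1) = 0."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * log(x) - (1 - x) * log(1 - x)

def condition7(delta, sigma):
    """Left-hand side of condition (7) for fixed delta and sigma."""
    total = delta + sigma
    return (2 * entropy(0.5)
            + delta * log(delta / total)
            + sigma * log(sigma / total))

def condition8(beta, delta, sigma):
    """Left-hand side of condition (8); requires 0 < beta < sigma."""
    total = delta + sigma
    return (2 * entropy(0.5) + 4 * entropy(beta / total)
            + beta * log(beta / (sigma - beta))
            + delta * log(delta / total)
            + sigma * log((sigma - beta) / total))

print(condition7(3, 3))  # equals -4 ln 2, roughly -2.77
```

With $\delta = \sigma = 3$ the value of `condition7` is $2\ln 2 - 6\ln 2 = -4\ln 2$, matching the figure quoted in the text, and `condition8` can be evaluated at a candidate $\beta$ to check whether that particular $\beta$ is covered by the theorem.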
V. ALMOST SURE BOUNDS ON CAPACITY APPROACHING LDPC CIRCUITS
We will use the result above to find an “almost sure” scaling rule for the energy of a capacity-
approaching directly-implemented decoding scheme in which the Tanner graph of each decoder is gener-
ated according to a uniform configuration model with a set node degree distribution.
Consider a decoding scheme $C_1, C_2, \ldots$ in which each of the decoders in the scheme is a directly-
implemented LDPC decoder, as in Definition 3. We associate a scheme with a channel that the decoders
are to decode. Let the capacity of that channel be $C$. Let the ith decoder have associated block length
$n_i$. Let the rate associated with the ith decoder be $R_i$. Let the gap to capacity associated with the ith
decoder be $\eta_i = \frac{R_i}{C}$. Let the area of the ith decoder be $A_i$, and the energy of the ith decoder be $E_i$. Let the
minimum bisection width of the Tanner graph of the ith decoder be $\omega_i$. We consider a family of LDPC
decoding schemes in which the Tanner graph of each decoder in the scheme is generated according to
a uniform configuration model. Thus, we say that the Tanner graph of decoder i is generated uniformly
from a family $\mathcal{B}_i(\Lambda, P)$ of configurations. We can thus discuss the probability of the ith decoder having
certain properties. In particular, in the corollary below, we will analyze $P(\omega_i \ge \beta n_i)$, the probability that
the ith decoder has a Tanner graph with minimum bisection width at least $\beta n_i$, and show that this
approaches 1, resulting in an almost sure energy scaling rule for capacity-approaching LDPC decoders.
We let $B^*_{i,a}$ denote the event that the ith decoder has a bisection of size a or less.
Corollary 2. For a family of capacity-approaching directly-implemented LDPC decoding schemes where
the Tanner graph of each decoder is generated according to a uniform configuration model, $\lim_{i \to \infty} P\left(A_i \ge c n_i^2\right) = 1$
for some constant $c > 0$. Similarly, $\lim_{i \to \infty} P\left(A_i \ge \frac{c'}{(1 - \eta_i)^4}\right) = 1$ for a constant $c' > 0$.
Proof: Note that a Tanner graph is in fact a bipartite graph as described in Theorem 1, in which
the block length corresponds to n and the number of checks corresponds to m. For a sequence of LDPC
codes to approach capacity, the result in [15] implies that
\[ \frac{|E|}{n(1 - R)} \ge \Omega\left(\ln\left(\frac{1}{1 - \eta}\right)\right). \]
Thus, as capacity is approached, the number of edges per node must approach infinity, and thus the
quantity $\delta$ must approach infinity. We can thus show that the inequality
\[ 2H\left(\tfrac{1}{2}\right) + \delta \ln\left(\frac{\delta}{\delta + \sigma}\right) + \sigma \ln\left(\frac{\sigma}{\delta + \sigma}\right) < 0 \tag{9} \]
must be satisfied for sufficient closeness to capacity.
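To see this, note that $\delta$ approaches $\infty$ for a capacity-approaching code. For $\sigma$, one of three things happens: either
(a) $\lim_{n \to \infty} \frac{\delta}{\delta + \sigma} < 1$, or (b) $\lim_{n \to \infty} \frac{\delta}{\delta + \sigma} = 1$, or (c) this limit does not exist. Note that this value cannot
exceed 1 because necessarily $\sigma \le \delta$.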
In the case of (c), it must be that the value of σ alternates and no limit can be defined. In this case,
however, we should consider the specific subsequence of decoders in which either (a) or (b) applies. It
will be clear that since for each subsequence the appropriate scaling rule holds, thus it must be true for
the entire sequence.
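In case (a): In the limit, $\ln\left(\frac{\delta}{\delta + \sigma}\right) < 0$ and so $\delta \ln\left(\frac{\delta}{\delta + \sigma}\right) \to -\infty$ as $\delta$ approaches $\infty$. Since
$\sigma \ln\left(\frac{\sigma}{\delta + \sigma}\right) < 0$ in any case (a consequence of $\sigma \le \delta$), in the limit the inequality (9) will be
satisfied.
For case (b), in which $\ln\left(\frac{\sigma}{\delta + \sigma}\right) \to -\infty$, note that $\sigma$ is positive, so $\sigma \ln\left(\frac{\sigma}{\delta + \sigma}\right) \to -\infty$, and thus in
the limit (9) will also be satisfied.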
Note that each Tanner graph in the sequence under consideration is generated according to the uniform
configuration model. Since the sequence is capacity approaching, by the argument above the node degree
distributions satisfy the sufficient condition of Theorem 1. Thus, by applying Corollary 1,
\[ \lim_{i \to \infty} P(\omega_i \ge \beta n_i) = 1. \tag{10} \]
We combine this result with Thompson’s [3] result presented in Lemma 2 that the area of a VLSI
instantiation of a graph with minimum bisection width $\omega$ is lower bounded by $A_c \ge \frac{\lambda_w^2 \omega^2}{4}$. Thus, the event
that $\omega_i \ge \beta n_i$ implies that $A_i \ge \frac{\lambda_w^2 (\beta n_i)^2}{4}$ and thus,
\[ \lim_{i \to \infty} P\left(A_i \ge \frac{\lambda_w^2 (\beta n_i)^2}{4}\right) = 1 \]
as expressed in the corollary statement.
This result can be used to understand how the area of almost all circuits that instantiate random Tanner
graphs of LDPC codes must scale as capacity is approached. It is well known from [16], [17] that, as a
function of fraction of capacity $\eta = \frac{R}{C}$, the minimum block length required for any code scales as:
\[ n \approx \frac{b}{(1 - \eta)^2} \]
for a constant b that depends on the channel statistics and also the target probabilities of error. We are
not concerned with the value of this constant but rather the dependence of this expression on η.
We use this to note that, if $\omega_i \ge \beta n_i$, then, recognizing from Definition 3 that a directly-instantiated
LDPC decoder must contain its Tanner graph, and also applying Lemma 1, which says that the area of a circuit is
at least the minimum area of a circuit instantiation of any graph that the circuit contains, we have
\[ A_c \ge \frac{\lambda_w^2 \beta^2 n^2}{4} \ge \frac{\lambda_w^2 \beta^2}{4} \cdot \frac{b^2}{(1 - \eta)^4} \ge \Omega\left(\frac{1}{(1 - \eta)^4}\right). \]
Combining this observation with the result in (10) results in
\[ \lim_{i \to \infty} P\left(A_i \ge \frac{c'}{(1 - \eta_i)^4}\right) = 1 \]
for a constant $c' > 0$, finishing the proof.
Applicability of this Result
There is a minor detail that needs to be dealt with for this theorem to be truly useful. Our results
assume that a Tanner graph is directly implemented in wires. This is indeed a practical way to create a
decoding circuit. However, according to our configuration model, it is possible that two or more edges
can be drawn between the same two nodes. This type of conflict is usually dealt with by deleting even
multi-edges and replacing odd multi-edges with a single edge (see definition 3.15, the Standard LDPC
Ensemble in [14]). This leads to a potential problem with the applicability of our theorem: what happens
if the edges that we delete form a minimum bisection of the induced graph? In that case it is possible
that the graph we instantiate on the circuit has a lower minimum bisection width than that which we
calculated, and thus could possibly have less area. However, this is resolved by the fact that in the limit
as n approaches infinity for a standard LDPC ensemble, the graph is locally tree-like (Theorem 3.49 in
[14]) with probability approaching 1. This implies that the probability that the number of multi-edges
in a randomly generated configuration is some fraction of n must approach 0 (or else the graph would
not be locally tree-like, contradicting the theorem). Hence, even if we did delete these multi-edges from
the randomly generated configuration, this could at most decrease the minimum bisection width by the
number of deletions, but this number of deletions, with probability approaching 1, cannot grow linearly with n. Hence,
the minimum bisection width must still, with probability approaching 1, grow linearly with n, and our scaling rules
are still applicable.
A. Energy Complexity of Capacity Approaching Complete-Check-Node LDPC Decoders
Below we will consider a sequence of capacity-approaching, complete-check-node serialized decoders.
Recall that these decoders do not directly instantiate their Tanner graph in wires, but they do have
subcircuits corresponding to each check and variable node. In each iteration, possibly over several clock
cycles, messages are to be passed from each variable node subcircuit to their corresponding check node
subcircuit and similarly for the check node subcircuits passing messages to their corresponding variable
node subcircuits. It may be that the same wire is used to transmit different messages during different
clock cycles of the same iteration of the computation. It is thus possible that such a method can decrease
wiring area (by not requiring a wire for each edge of the Tanner graph) at the cost of more clock cycles.
We prove below that such a method still results in a super-linear almost sure lower bound on energy
complexity. To avoid ambiguity, we say that a sequence of decoders for a channel with capacity $C$ with rates
$R_1, R_2, \ldots$ is capacity-approaching if $\lim_{i \to \infty} R_i = C$.
Corollary 3. For a sequence of capacity-approaching, complete-check-node serialized LDPC decoders
whose Tanner graphs are generated according to the uniform configuration model, $\lim_{i \to \infty} P\left(E_i \ge c n_i^{1.5}\right) = 1$
for some $c > 0$. Also, $\lim_{i \to \infty} P\left(E_i \ge \frac{c}{(1 - \eta)^3}\right) = 1$ and $\lim_{i \to \infty} P\left(\frac{E_i}{k} \ge \frac{c}{1 - \eta}\right) = 1$.
Proof: In considering a complete-check-node serialized LDPC decoder, we note that such a decoder
contains a graph with n variable nodes and at least n − k check nodes. We will use arguments similar
to those used by Thompson [3] and Grover [4]. Let the minimum bisection width of the Tanner graph
associated with the ith decoder be $\omega_i$. Suppose that the graph of the circuit implementing this decoder
has minimum bisection width $W_i$ (we use the symbol $W_i$ to distinguish this from the minimum bisection
width $\omega_i$ of the Tanner graph of the ith decoder, recalling that we do not require in this case that the
circuit contains the underlying Tanner graph). In one iteration, the number of bits communicated
across any bisection of the nodes must be at least $\omega_i$. One iteration must be performed, but since the
minimum bisection width of the graph associated with this circuit is $W_i$, this requires that more clock
cycles are used to pass the information between the check and variable nodes. In particular, writing $\tau_i$ for the number of clock cycles,
\[ \tau_i W_i \ge \omega_i. \tag{11} \]
We also know from Lemma 2 that $A_i \ge \frac{\lambda_w^2 W_i^2}{4}$ and so combining with the inequality in (11) gives us:
\[ A_i \tau_i^2 \ge \frac{\lambda_w^2 W_i^2 \tau_i^2}{4} \ge \frac{\lambda_w^2 \omega_i^2}{4}. \tag{12} \]
Trivially, because there are $n_i$ variable node subcircuits in the circuit, $A_i \ge n_i$ and thus combining with
(12) we get
\[ A_i^2 \tau_i^2 \ge \frac{\lambda_w^2 \omega_i^2 n_i}{4} \]
and thus, taking the square root of both sides of this inequality,
\[ A_i \tau_i \ge \frac{\lambda_w \omega_i n_i^{0.5}}{2}. \]
Since energy is proportional to the product of circuit area and number of clock cycles, this implies that
for each decoder in the sequence
\[ E_i \ge \xi_{tech}\, \omega_i n_i^{0.5} \]
for the constant $\xi_{tech}$ that depends on the specific technology used to implement the circuits.
Using the same arguments as Corollary 2 we can show that for a capacity-approaching LDPC scheme
$\lim_{i \to \infty} P(\omega_i \ge \beta n_i) = 1$ for some $\beta > 0$. Following the logic above, the event that $\omega_i \ge \beta n_i$ implies
$E_i \ge \xi_{tech} \beta n_i^{1.5}$, which thus implies $\lim_{i \to \infty} P\left(E_i \ge c n_i^{1.5}\right) = 1$ for some constant $c > 0$. Also, following
the same logic as in Corollary 2, $\lim_{i \to \infty} P\left(E_i \ge \frac{c}{(1 - \eta)^3}\right) = 1$ and $\lim_{i \to \infty} P\left(\frac{E_i}{k} \ge \frac{c}{1 - \eta}\right) = 1$.
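The chain of inequalities in this proof can be summarized in a few lines of code. The sketch below is only an illustration of the scaling: the wire width, the constant $\xi_{tech}$, and the value of $\beta$ are placeholders (assumptions), since the paper does not give numerical values for them.

```python
from math import sqrt

def area_time_lower_bound(omega, n, wire_width=1.0):
    """Lower bound on A_i * tau_i: lambda_w * omega_i * sqrt(n_i) / 2."""
    return wire_width * omega * sqrt(n) / 2.0

def energy_lower_bound(n, beta, xi_tech=1.0):
    """Lower bound xi_tech * beta * n^1.5 on the energy, valid on the
    event that the Tanner graph has minimum bisection width >= beta * n."""
    return xi_tech * beta * n ** 1.5

# Quadrupling the block length multiplies the energy bound by 4^1.5 = 8.
ratio = energy_lower_bound(400, beta=0.1) / energy_lower_bound(100, beta=0.1)
```

The factor of 8 obtained when the block length is quadrupled reflects the $n^{1.5}$ exponent: the bound grows faster than linearly in the block length even though wires may be shared across clock cycles.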
B. Limitations of Result
A goal of this research is to find fundamental bounds on the “energy complexity” of capacity-approaching
decoders as a function of $\eta = \frac{R}{C}$. The result presented here does not quite do this, but it does advise
engineering by suggesting that if n is very large, one can be reasonably sure that the area of a circuit that
instantiates a randomly generated Tanner graph scales as $\Omega(n^2)$. Of course, we have
assumed that this Tanner graph has been generated by going to each socket of the left nodes and randomly
finding a connection to a remaining right socket. This is of course a very natural way to generate a Tanner
graph, and is in fact used in the analysis of LDPC codes [14].
This is not to say, of course, that there don’t exist good LDPC coding schemes with slower scaling laws.
Creating a sequence of LDPC codes that avoids this scaling law with probability greater than 0 would be
possible if the random generation rule for the LDPC graph was somehow altered. For example, perhaps
the variable nodes and check nodes could be placed uniformly scattered through a grid and then the
randomly placed edges, instead of being chosen uniformly over all possible edges, are chosen uniformly
over a choice of edges connecting variable and check nodes that are “close” to each other.
In practice, a Tanner graph is often modified to prevent interconnections between check and variable
nodes that are “too far” apart, since these result in long wires and thus higher energy [18]. Simulation in
a particular case can analyze whether this technique is worth the possible code performance trade-off.
Currently, however, the common technique of generating an LDPC ensemble and analyzing average code
performance does not consider energy complexity as a fundamental parameter to be traded-off with other
code parameters. It seems likely that if “neighbors” of a variable node are restricted to those check nodes
that are spatially close by, an LDPC code could still have good asymptotic performance if block lengths
grow large. An analysis challenge of such a scheme may be to show that asymptotically a Tanner graph
generated from such a distribution is locally tree-like. Furthermore, analysis of the required block length
using such a technique to get good performance would be needed: even if asymptotically such schemes
perform well, it may be that much longer block lengths are required for the same performance. The cost
of possibly larger block length for such a scheme would have to be considered to determine whether it is
worth it to have a slower scaling rule as a function of block length if it comes at a cost of much longer
block length.
Whether or not such a sequence of LDPC codes would give good performance is unclear. However,
in the following section we can use known bounds on the average node degree of an LDPC decoder as
well as bounds on the area of graphs instantiated on a circuit to get scaling rules that are true for all
directly-implemented capacity-approaching LDPC decoders, not just almost all.
VI. BOUNDS FOR ALL LDPC DECODER CIRCUITS
We can find bounds for the energy complexity for all capacity-approaching directly-implemented LDPC
codes (and not just almost all) by using the following Theorem:
Theorem 2. If a circuit contains a graph $G = (V, E)$ that has no loops, according to the standard VLSI
model, the total area of a circuit that contains that graph is bounded as:
\[ A \ge \frac{\lambda_w^2 \left(\sqrt{2} - 1\right)^2}{4} \cdot \frac{|E|^2}{|V|} \]
where we recall that $\lambda_w$ is the wire width in the circuit, and $|E|$ and $|V|$ are the number of edges and
vertices in the graph, respectively.
The proof of this theorem uses a similar approach as used by Grover et al. in [4], in which the Acτ
complexity of circuits is related to the bits communicated within the circuit. The result of this paper,
however, is a bound on the area of a circuit instantiation of a graph as a function of the number of edges
and vertices in the graph. We use a similar nested bisection technique as the Grover et al. paper. The
proof is given in the appendix.
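As a quick numerical illustration of the bound in Theorem 2, the sketch below evaluates $\lambda_w^2 (\sqrt{2} - 1)^2 |E|^2 / (4|V|)$ for given edge and vertex counts. The function name and the default unit wire width are assumptions made for this example.

```python
from math import sqrt

def theorem2_area_bound(num_edges, num_vertices, wire_width=1.0):
    """Area lower bound lambda_w^2 (sqrt(2) - 1)^2 |E|^2 / (4 |V|)."""
    return (wire_width ** 2 * (sqrt(2) - 1) ** 2
            * num_edges ** 2 / (4 * num_vertices))

# Hypothetical Tanner graph: 1000 variable nodes of degree 3 and
# 500 check nodes of degree 6, so |E| = 3000 and |V| = 1500.
bound = theorem2_area_bound(3000, 1500)
```

If the average node degree is held fixed, $|E|$ is proportional to $|V|$ and the bound grows only linearly in the number of vertices; the super-linear scaling in the corollary that follows comes from the growth of the average degree as capacity is approached.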
This result, combined with the results in [15] on the average edge degree as a function of gap to
capacity, results in the following corollary:
Corollary 4. The energy of any directly-instantiated LDPC decoder is asymptotically
lower bounded by:
\[ E_{dec} \ge \Omega\left(\frac{N}{(1 - \eta)^2} \ln^2\left(\frac{1}{1 - \eta}\right)\right) \]
and its average energy per bit decoded scales as
\[ \frac{E_{dec}}{k} \ge \Omega\left(N \ln^2\left(\frac{1}{1 - \eta}\right)\right) \]
where N is the number of iterations required to decode.
Remark 4. Note that the number of iterations N in the above Corollary in general may be a function of
the particular decoding algorithm instantiated and possibly the particular received vector. Our discussion
does not analyze the number of iterations required, so we simply write our scaling rules in terms of this
quantity.
Proof: Sason’s [15] result states that the average parity node degree of the Tanner graph
of a capacity-approaching LDPC code must scale as $\Omega\left(\ln\left(\frac{1}{1 - \eta}\right)\right)$, and the minimum block length of
any code must scale as $\Omega\left(\frac{1}{(1 - \eta)^2}\right)$ [16], [17]. Together, these mean that $|E| \ge \Omega\left[(n - k) \ln\left(\frac{1}{1 - \eta}\right)\right]$. Note also that
the number of nodes in this graph is $|V| = n + (n - k) = 2n - k = O(n)$. Combining these results along
with Theorem 2 results in the scaling laws in the corollary.
We note that this lower bound on directly-implemented Tanner graphs contrasts with the lower bounds
in [5], which show an $\Omega\left(\left(\ln\left(\frac{1}{1 - \eta}\right)\right)^{1/2}\right)$ lower bound for the per bit energy complexity of fully-parallel
decoding algorithms as a function of gap to capacity. This result means that directly-instantiated LDPC
decoders are necessarily asymptotically worse than this lower bound (albeit a lot closer than the $\Omega\left(\frac{1}{(1 - \eta)^2}\right)$
almost sure lower bound of Corollary 2). Of course, it is not known whether the lower bounds of the
paper in [5] are tight, but Corollary 4 proves that directly instantiated LDPC decoders cannot reach these
lower bounds in an asymptotic sense.

Decoder class                                       Lower Bound Scaling Rule Per Bit ($E_{dec}/k$)
Almost all directly instantiated LDPC decoders      $\Omega\left(N \ln^2\left(\frac{1}{1 - \eta}\right)\right)$
Almost all LDPC decoders                            $\Omega\left(\frac{N}{(1 - \eta)^2}\right)$
All LDPC with Tanner Graph Directly Implemented     $\Omega\left(N \ln^2\left(\frac{1}{1 - \eta}\right)\right)$
All Fully-Parallel Decoders [5]                     $\Omega\left(\sqrt{\ln\left(\frac{1}{1 - \eta}\right)}\right)$

TABLE I
SUMMARY OF THE SCALING RULE LOWER BOUNDS DERIVED IN THIS PAPER. WE PRESENT THESE BOUNDS AS A FUNCTION OF $\eta = \frac{R}{C}$.
IN THE FIRST THREE SCALING RULES PRESENTED, N IS THE NUMBER OF ITERATIONS REQUIRED (WHICH IN GENERAL MAY BE A
FUNCTION OF THE ACTUAL LDPC CODE INSTANTIATED, AS WELL AS THE PARTICULAR RECEIVED VECTOR). FOR COMPARISON, WE
ALSO INCLUDE A RESULT ON LOWER BOUNDS FOR ALL FULLY-PARALLEL DECODERS GIVEN IN [5].
VII. CONCLUSION
The main contribution of this paper is graph theoretic in nature. We have shown that subject to a mild
condition on node degree distributions, almost all Tanner graph instantiations have a minimum bisection
width that scales as Ω (n) where n is the number of left nodes. The minimum bisection width of a graph is
related to the area of circuit implementations of these graphs. We have used this result to show that almost
all LDPC decoders that directly instantiate their Tanner graph must have circuit area, and thus energy, that
scales as Ω (n2). We can use this result to provide a scaling rule for the energy complexity of almost all
capacity-approaching LDPC decoders. We have further presented a general theorem on the area of circuits
that instantiate any graph to further bound the area of any LDPC decoder that approaches capacity. These
results are summarized in Table I. Note that our results show that directly-instantiated LDPC codes cannot
reach the lower bounds presented in [5], thus indicating that either the lower bound cited is not tight, or
directly-instantiated LDPC codes are asymptotically not optimal from this energy perspective. It may also be
that both are true, namely that known lower bounds are not tight and LDPC codes are not asymptotically
optimal. This remains an open question.
APPENDIX A
PROOF OF LEMMA 6
Proof: (of Lemma 6) Let the set of configurations in $\mathcal{B}(\Lambda, P)$ having a bisection of size a be denoted by
$B_a$. Then we can say that, according to the uniform configuration model, the probability of the event of
generating a configuration with a bisection of size a is given by:
\[ P(B_a) = \frac{|B_a|}{|E|!} \]
i.e., it is the cardinality of the set of such configurations divided by the total number of configurations
with node degrees $\Lambda$ and $P$.
We will now bound the number of configurations in $\mathcal{B}(\Lambda, P)$ with a bisection of size a, and we will
assume that $a < \sigma n$. To do so, we will define a quadrant configuration, show that the number of quadrant
configurations with a bisection of size a is greater than or equal to $|B_a|$, and then upper bound the number
of quadrant configurations with a bisection of size a or less.
A quadrant configuration of a bipartite configuration $G = (V_L \sqcup V_R, E)$ is an ordered tuple $Q = (G, T_L, T_R, B_L, B_R)$
where the vertices are divided into 4 disjoint sets, the top left vertices ($T_L$), the
top right vertices ($T_R$), the bottom left vertices ($B_L$), and the bottom right vertices ($B_R$), in which
$T_L, B_L \subseteq V_L$, $T_R, B_R \subseteq V_R$ and $\big|\, |T_R \cup T_L| - |B_L \cup B_R| \,\big| \le 1$. Naturally, vertices in $T_L$ are considered
top left vertices, or, interchangeably, top left nodes, and similarly for the other sets of vertices in a quadrant
configuration. Furthermore, vertices in $T_L$ and $T_R$ are considered to be top vertices or top nodes, and
similarly for the bottom vertices.

Fig. 3. An example of a quadrant configuration associated with a degree distribution where all the left nodes have degree 2 and all the
right nodes have degree 4, in which the number of left nodes n = 8 and number of right nodes m = 4. The fully drawn configuration on
the right is a quadrant configuration in $Q_4^{4,2}$. Recall that the superscript denotes that there are i = 4 top left nodes and j = 2 edges leading
from top left nodes to bottom right nodes. The subscript indicates that there are a = 4 edges between top and bottom nodes, and in this
case we see that they cross a dotted line, indicating where the bisection occurs. The diagram on the left shows the drawing of a = 4 edges
crossing between top and bottom nodes. The graph on the right shows a permutation of the remaining sockets in both the top and bottom
nodes.

Note that every bipartite graph has at least one quadrant configuration induced by arbitrarily dividing
the vertices in half, and denoting one half of these vertices top vertices and the other half bottom
vertices. Thus, the set of quadrant configurations with a particular degree distribution is at least as
big as the set of configurations with a particular degree distribution. Because a quadrant configuration
Q = (G, TL, TR, BL, BR) contains a graph G, graph properties can be extended to describe a quadrant
configuration. So, for example, if we say that a quadrant configuration has minimum bisection width a,
we mean precisely that the graph G within the quadrant configuration has minimum bisection width a.
Denote the set of quadrant configurations with set node degree distributions Λ and P in which a is the
number of edges leading from top vertices to bottom vertices as Qa. Note that the dependence of Qa on
a particular node degree distribution is implicit. Observe that every configuration with a bisection of size
a has a corresponding quadrant configuration in Qa created in the natural way by denoting one bisected
set of vertices as the top vertices, and the other the bottom vertices. Thus |Ba| ≤ |Qa|.
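To make the objects in this argument concrete, the following sketch samples a configuration by permuting sockets and finds its minimum bisection width by exhaustive search. It is an illustration only and not part of the proof: the function names `sample_configuration` and `min_bisection_width` are hypothetical, and the degree sequences are chosen to match the small example of Fig. 3 (eight left nodes of degree 2, four right nodes of degree 4). Exhaustive search is exponential in the number of vertices and is feasible here only because the example is tiny.

```python
import itertools
import random


def sample_configuration(left_degrees, right_degrees, rng):
    """Sample a bipartite configuration by matching the left sockets to a
    uniformly random permutation of the right sockets.

    Returns a list of (left_node, right_node) edges; parallel edges may occur.
    """
    left_sockets = [v for v, d in enumerate(left_degrees) for _ in range(d)]
    right_sockets = [u for u, d in enumerate(right_degrees) for _ in range(d)]
    if len(left_sockets) != len(right_sockets):
        raise ValueError("left and right socket counts must match")
    rng.shuffle(right_sockets)
    return list(zip(left_sockets, right_sockets))


def min_bisection_width(edges, n_left, n_right):
    """Brute-force minimum bisection width (assumes n_left + n_right is even)."""
    # Right node r is relabelled n_left + r so all vertices share one index space.
    total = n_left + n_right
    best = len(edges)
    for top in itertools.combinations(range(total), total // 2):
        top_set = set(top)
        # An edge is cut when exactly one of its endpoints is a top vertex.
        cut = sum((l in top_set) != ((n_left + r) in top_set) for l, r in edges)
        best = min(best, cut)
    return best


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
edges = sample_configuration([2] * 8, [4] * 4, rng)
width = min_bisection_width(edges, n_left=8, n_right=4)
```

Each choice of top vertices in the loop corresponds to one quadrant configuration of the sampled graph, which is the counting device used in the remainder of the proof.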
For ease of discussion, we will assume that the total number of nodes $m + n$ in the set of configurations
under discussion is even, so that $\frac{m+n}{2}$ is an integer.
Denote the set of quadrant configurations with a bisection of size $a$ in which there are $i$ top left nodes
and $j$ edges connecting top left vertices to the bottom right by $Q_a^{i,j}$. This of course implies that there are
$\frac{m+n}{2} - i$ top right nodes and $a - j$ edges leading from the bottom left to the top right nodes. We can see
in Figure 3 an example of such an element that we are counting for the case of n = 8 and a = 4, i = 4
and j = 2. Note then that
$$Q_a = \bigcup_{i=0}^{n} \bigcup_{j=0}^{a} Q_a^{i,j}.$$
We bound the size of $Q_a^{i,j}$ by counting all quadrant configurations with a bisection of size $a$ formed by the
edges connecting top nodes to bottom nodes. We have
$$\left|Q_a^{i,j}\right| \le \underbrace{\binom{n}{i}}_{a}\; \underbrace{\binom{m}{\frac{m+n}{2}-i}}_{b}\; \underbrace{\binom{|E|}{j}\binom{|E|}{j}\binom{|E|}{a-j}\binom{|E|}{a-j}}_{c}\; \underbrace{(j)!\,(a-j)!}_{d}\; \underbrace{(\delta n)!\,(\sigma n-a)!}_{e}, \tag{13}$$
where
a. Represents a choice of i top left nodes.
b. Represents a choice of $\frac{m+n}{2} - i$ top right nodes from the m total right nodes.
c. The quantity $\binom{|E|}{j}$ is an upper bound on the number of choices of $j$ sockets, chosen from the top
variable nodes, that will have edges crossing the bisection line, and $\binom{|E|}{j}$ is likewise an upper bound on the
number of choices for the bottom right sockets to which these edges will be connected. For a configuration in $Q_a^{i,j}$
there must also be $a - j$ edges leading from the bottom left to the top right. The quantity $\binom{|E|}{a-j}$ is an upper
bound on the number of choices of sockets in the bottom left that can have edges crossing the middle
bisection, and similarly $\binom{|E|}{a-j}$ is an upper bound on the number of choices for the sockets connected in
the top right.
d. Counts the number of permutations of edges that join the top half to the bottom half (first counting
the $j$ connections from the top left nodes to the bottom right nodes, then the $a - j$ connections from the
bottom left nodes to the top right nodes).
e. This step in the quadrant configuration construction process involves permuting the connections of
the remaining sockets in the top half and the bottom half. At this point it is not clear how
many sockets are in the top half or the bottom half, but we can upper bound the number of
permutations possible. The number of sockets available in the top left vertices must equal the number of
sockets available in the top right vertices (because this is required in order to construct a valid
configuration). By construction, the total number of nodes in the top left and top right is $\frac{m+n}{2}$, and thus the number
of sockets available cannot exceed $\delta n$, by Lemma 5. Suppose the number of sockets available for all the
top left nodes is $M$ and the number of sockets available in the bottom left nodes is $N$. Then there are at most $M!\,N!$
ways to permute these. We also know that $M + N = |E| - a$ (since the total number of sockets available
on one side of the constructed quadrant configuration is $|E|$ and $a$ have been used to cross between top
nodes and bottom nodes), and that $M \le \delta n$ and $N \le \delta n$. Subject to these restrictions, a direct application
of Lemma 4 implies $M!\,N! \le (\delta n)!\,(|E| - \delta n - a)! = (\delta n)!\,(\sigma n - a)!$.
Now, for the sake of simplicity, we will further loosen these bounds by upper bounding each of the
factors a, b, c, and d. Each of these bounds is easily verified:
a. We note that $\binom{n}{i} \le \binom{n}{n/2}$.
b. Since $m \le n$, we have $\binom{m}{\frac{m+n}{2}-i} \le \binom{n}{n/2}$.
c. $\binom{|E|}{j}\binom{|E|}{a-j}\binom{|E|}{j}\binom{|E|}{a-j} \le \binom{|E|}{a}^4$, which is implied by $a \le \sigma n \le \frac{|E|}{2}$.
d. $(j)!\,(a-j)! \le a!$, which follows directly from the observation that $\binom{a}{j} \ge 1$.
Combining these gives us the following bound:
$$\left|Q_a^{i,j}\right| \le \binom{n}{n/2}^2 \binom{|E|}{a}^4 a!\,(\delta n)!\,(\sigma n-a)!$$
We can bound $|Q_a|$ by summing over our upper bound on $|Q_a^{i,j}|$:
$$|Q_a| \le \sum_{i=1}^{n}\sum_{j=1}^{n} \left|Q_a^{i,j}\right| \le n^2 \binom{n}{n/2}^2 \binom{|E|}{a}^4 a!\,(\delta n)!\,(\sigma n-a)! \tag{14}$$
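We of course are not concerned with the probability of a bisection of size $a$, but rather with the probability
of a bisection of size $a$ or less. We denote the set of configurations with a bisection of size $a$ or less by
$B_a^*$, and the set of quadrant configurations with $a$ or fewer edges between top and bottom nodes by $Q_a^*$.
Since $Q_a^* = \bigcup_{i=0}^{a} Q_i$:
$$|Q_a^*| \le \sum_{i=0}^{a} |Q_i|.$$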
We will now show that the expression in (14) is a non-decreasing function of $a$ for $0 < a \le \frac{|E|-1}{2}$. Let
the right side of the expression be denoted $d_a$; it suffices to show that $\frac{d_{a+1}}{d_a}$ is greater than or equal
to 1. It is easy to show that
$$\frac{d_{a+1}}{d_a} = \frac{\binom{|E|}{a+1}^4 (a+1)}{\binom{|E|}{a}^4 (\sigma n-a)}.$$
Expanding the binomial coefficients in the numerator and denominator and simplifying gives us
$$\frac{d_{a+1}}{d_a} = \frac{(|E|-a)^4}{(a+1)^3 (\sigma n-a)}.$$
This quantity will be greater than or equal to 1 if $|E| - a \ge a + 1$ and $|E| - a \ge \sigma n - a$. Note that $a < \sigma n$
(an assumption of our lemma) implies $2a < 2\sigma n \le |E|$. Since $a$ and $|E|$ are both integers, this implies
$2a \le |E| - 1$, from which we can see that the first inequality is satisfied. The second is satisfied by the
fact that $\sigma n \le |E|$. We thus observe that
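$$\begin{aligned}
|B_a^*| &\le |Q_a^*| \\
&\le \sum_{i=0}^{a} |Q_i| \\
&\le \sum_{i=0}^{a} n^2 \binom{n}{n/2}^2 \binom{|E|}{a}^4 a!\,(\delta n)!\,(\sigma n-a)! \\
&\le (a+1)\, n^2 \binom{n}{n/2}^2 \binom{|E|}{a}^4 a!\,(\delta n)!\,(\sigma n-a)!
\end{aligned} \tag{15}$$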
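The monotonicity step used to reach (15) can be confirmed with exact rational arithmetic. The sketch below evaluates only the factors of the right side of (14) that depend on a, since the remaining factors cancel in the ratio, and compares the ratio of consecutive terms with the simplified closed form. The variable `sigma_n` stands for the integer σn, and the sample values are arbitrary assumptions chosen so that σn ≤ |E|/2.

```python
from fractions import Fraction
from math import comb, factorial


def a_dependent_part(num_edges, sigma_n, a):
    """Factors of the right side of (14) that depend on a."""
    return comb(num_edges, a) ** 4 * factorial(a) * factorial(sigma_n - a)


def ratio_closed_form(num_edges, sigma_n, a):
    """Simplified expression for d_{a+1} / d_a."""
    return Fraction((num_edges - a) ** 4, (a + 1) ** 3 * (sigma_n - a))


num_edges, sigma_n = 40, 18  # hypothetical values with sigma_n <= num_edges / 2
ratios = []
for a in range(sigma_n):  # a < sigma_n, as assumed in the lemma
    direct = Fraction(a_dependent_part(num_edges, sigma_n, a + 1),
                      a_dependent_part(num_edges, sigma_n, a))
    # The direct ratio must agree exactly with the simplified form.
    assert direct == ratio_closed_form(num_edges, sigma_n, a)
    ratios.append(direct)

all_non_decreasing = all(r >= 1 for r in ratios)
```

Using `Fraction` avoids floating-point rounding, so the equality between the direct ratio and the closed form is checked exactly rather than approximately.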
We note that the number of possible multi-graphs with our given node degree distribution is at least
$(\delta n + \sigma n)!$. We can now bound the probability of the event $B_a^*$ with:
\begin{align}
P(B_a^*) &\le \frac{|B_a^*|}{(\delta n+\sigma n)!} \tag{16} \\
&\le \frac{(a+1)\, n^2 \binom{n}{n/2}^2 \binom{|E|}{a}^4 a!\,(\delta n)!\,(\sigma n-a)!}{(\delta n+\sigma n)!} \tag{17}
\end{align}
where we have simply applied the upper bound for the size of $B_a^*$ of (15).
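The factorials in (17) overflow quickly, so a numerical look at the bound is easiest in the log domain. The following sketch evaluates the natural logarithm of the right side of (17) with `math.lgamma`. The function names and the constants δ = 2.5, σ = 1.5 are hypothetical; they are not taken from the paper and are not claimed to satisfy the hypotheses of the lemmas, only the identity |E| = δn + σn.

```python
from math import lgamma, log


def log_binom(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def log_bisection_bound(n, delta, sigma, a):
    """Natural log of the right side of (17).

    delta * n and sigma * n are assumed to be integers, with
    |E| = (delta + sigma) * n and n even.
    """
    delta_n = round(delta * n)
    sigma_n = round(sigma * n)
    num_edges = delta_n + sigma_n
    return (log(a + 1) + 2 * log(n)
            + 2 * log_binom(n, n // 2)
            + 4 * log_binom(num_edges, a)
            + lgamma(a + 1)              # log a!
            + lgamma(delta_n + 1)        # log (delta n)!
            + lgamma(sigma_n - a + 1)    # log (sigma n - a)!
            - lgamma(num_edges + 1))     # log (delta n + sigma n)!


# Hypothetical constants, not taken from the paper.
small = log_bisection_bound(n=1000, delta=2.5, sigma=1.5, a=10)
large = log_bisection_bound(n=2000, delta=2.5, sigma=1.5, a=20)
```

For these sample values the logarithm is negative and becomes more negative when n is doubled with a/n held fixed, which is the qualitative behaviour the probability bound is used for.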
APPENDIX B
PROOF OF THEOREM 2
In this section we will prove Theorem 2, which states that if a circuit implements a graph G = (V, E)
that has no loops, according to the standard VLSI model, the total area of that circuit is bounded by:
$$A_c \ge \frac{\lambda_w^2 \left(\sqrt{2}-1\right)^2}{4} \frac{|E|^2}{|V|}$$
where $\lambda_w$ is the wire width in the circuit, and $|E|$ and $|V|$ are the number of edges and vertices in the
graph, respectively.
Proof: (Of Theorem 2) For simplicity we will say the graph has $|V| = 2^k$ vertices. Recall that a
bisection of a graph is a set of edges of that graph whose removal divides the vertices in half. A minimum bisection
of a graph is a bisection that uses the smallest number of edges to bisect the graph. We will perform what
we call nested minimum bisections on the graph. To do this, first, the edges in a minimum bisection of
the graph are removed, and there are $b_{1,1}$ such edges. This divides the graph into two distinct components.
Then, these two components (which are subgraphs of the original graph) are bisected by removing edges
in their respective minimum bisection cut, and so $b_{2,1}$ and $b_{2,2}$ edges are removed. This process continues
for $k$ bisections, and after the $k$th bisection, there are $2^k$ disjoint subgraphs, each with one vertex, and
no edges (because we assume that in these graphs there are no loops). It must be that the total of all the
edges we removed equals the total number of edges in our graph; in other words, it must be that:
$$\sum_{i=1}^{k} \sum_{j=1}^{2^{i-1}} b_{i,j} = |E|. \tag{18}$$
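The nested bisection procedure and the edge-counting identity (18) can be illustrated on a small graph. The sketch below uses exhaustive search for each minimum bisection, which is only practical for very small graphs; the function names are hypothetical, and the 3-dimensional hypercube is an arbitrary example with $2^3$ vertices and no loops.

```python
import itertools


def min_bisection(vertices, edges):
    """Brute-force a minimum bisection of the subgraph induced on `vertices`.

    Returns (top, bottom, cut_size). Assumes len(vertices) is even.
    """
    vertices = sorted(vertices)
    vertex_set = set(vertices)
    induced = [(u, v) for u, v in edges if u in vertex_set and v in vertex_set]
    best = None
    for top in itertools.combinations(vertices, len(vertices) // 2):
        top_set = set(top)
        cut = sum((u in top_set) != (v in top_set) for u, v in induced)
        if best is None or cut < best[2]:
            best = (top_set, vertex_set - top_set, cut)
    return best


def nested_bisection_levels(vertices, edges):
    """Return the per-level totals sum_j b_{i,j} for i = 1, ..., k."""
    level_sums = []
    components = [set(vertices)]
    while len(components[0]) > 1:
        next_components, removed = [], 0
        for comp in components:
            top, bottom, cut = min_bisection(comp, edges)
            removed += cut
            next_components += [top, bottom]
        level_sums.append(removed)
        components = next_components
    return level_sums


# 3-dimensional hypercube: vertices 0..7, edges between labels differing in one bit.
cube_edges = [(u, u ^ (1 << b)) for u in range(8) for b in range(3)
              if u < (u ^ (1 << b))]
levels = nested_bisection_levels(range(8), cube_edges)
```

For the cube, the three levels remove 4 edges each, and the total of 12 equals the number of edges, as (18) requires.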
Recall Thompson's bound from Lemma 2 that says for a circuit implementation of a graph with minimum
bisection width $\omega$, the area of that circuit is lower bounded by:
$$\frac{4A_c}{\lambda_w^2} \ge \omega^2.$$
We can use this result to bound the total area of all of the subgraphs for each level $i = 1, 2, \ldots, k$,
$$\frac{4A_c}{\lambda_w^2} \ge \sum_{j=1}^{2^{i-1}} b_{i,j}^2.$$
Thus,
$$\frac{4A_c}{\lambda_w^2} \ge \max\left( b_{1,1}^2,\; b_{2,1}^2 + b_{2,2}^2,\; \ldots,\; \sum_{j=1}^{2^{k-1}} b_{k,j}^2 \right).$$
Standard convex optimization techniques imply that this expression is minimized when:
$$\begin{aligned}
c_1 &\equiv b_{1,1} \\
c_2 &\equiv b_{2,1} = b_{2,2} \\
&\;\;\vdots \\
c_k &\equiv b_{k,1} = b_{k,2} = \ldots = b_{k,2^{k-1}}
\end{aligned} \tag{19}$$
where for the sake of convenience we have introduced the constants $c_1, c_2, \ldots, c_k$. Furthermore, it can be
shown that
$$b_{1,1}^2 = b_{2,1}^2 + b_{2,2}^2 = \ldots = \sum_{i=1}^{2^{k-1}} b_{k,i}^2 = a \tag{20}$$
for some $a$. Using the definitions of the constants $c_1, c_2, \ldots, c_k$ given in (19), and applying this to the
above equation (20), we obtain
$$c_1^2 = 2c_2^2 = 4c_3^2 = \ldots = 2^{k-1} c_k^2$$
from which we can infer
$$c_2 = \frac{1}{\sqrt{2}}\, c_1, \qquad c_3 = \frac{1}{\sqrt{2}}\, c_2 = \left(\frac{1}{\sqrt{2}}\right)^2 c_1,$$
and, in general,
$$c_i = \left(\frac{1}{\sqrt{2}}\right)^{i-1} c_1.$$
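We then apply this to the constraint in (18) to give us
$$\begin{aligned}
|E| &= \sum_{i=1}^{k} 2^{i-1} \left(\frac{1}{\sqrt{2}}\right)^{i-1} c_1 \\
&= c_1 \sum_{i=1}^{k} \left(\frac{2}{\sqrt{2}}\right)^{i-1} \\
&= c_1 \left(\frac{\sqrt{2}^{\,k} - 1}{\sqrt{2} - 1}\right),
\end{aligned}$$
from which $c_1$ is easily obtained. Now, using that $k = \log_2 |V|$, we have
$\sqrt{2}^{\,k} = \left(2^{1/2}\right)^{\log_2 |V|} = \left(2^{\log_2 |V|}\right)^{1/2} = |V|^{1/2}$. Hence, we conclude that
$$\frac{4A_c}{\lambda_w^2} \ge \left(\left(\frac{\sqrt{2}-1}{\sqrt{|V|}-1}\right) |E|\right)^2 \ge \left(\sqrt{2}-1\right)^2 \frac{|E|^2}{|V|}.$$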
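The closed-form solution used in this proof can be checked numerically. The sketch below builds the balanced cut sizes $c_i = c_1 / \sqrt{2}^{\,i-1}$, confirms that they satisfy the constraint (18) and the equal-squares condition (20), and compares $c_1^2/4$ with the final bound. The function names and the values of k and |E| are hypothetical, and the wire width is normalized to 1.

```python
from math import isclose, sqrt


def balanced_cut_sizes(num_edges, k):
    """Cut sizes c_1..c_k with c_i = c_1 / sqrt(2)^(i-1) that satisfy (18)."""
    c1 = num_edges * (sqrt(2) - 1) / (sqrt(2) ** k - 1)
    return [c1 / sqrt(2) ** (i - 1) for i in range(1, k + 1)]


def area_lower_bound(num_edges, num_vertices, wire_width=1.0):
    """Final bound: lambda_w^2 (sqrt(2) - 1)^2 |E|^2 / (4 |V|)."""
    return (wire_width ** 2 * (sqrt(2) - 1) ** 2 * num_edges ** 2
            / (4 * num_vertices))


k, num_edges = 10, 5000  # hypothetical graph with |V| = 2^k vertices
cuts = balanced_cut_sizes(num_edges, k)

# Constraint (18): level i contributes 2^(i-1) cuts of size c_i.
total_removed = sum(2 ** (i - 1) * c for i, c in enumerate(cuts, start=1))

# Condition (20): the sum of squared cut sizes is the same at every level.
level_squares = [2 ** (i - 1) * c ** 2 for i, c in enumerate(cuts, start=1)]

# Intermediate bound A_c / lambda_w^2 >= c_1^2 / 4, before loosening.
intermediate_bound = cuts[0] ** 2 / 4
```

The intermediate value is never smaller than `area_lower_bound(num_edges, 2 ** k)`, reflecting the last inequality of the proof, in which $\sqrt{|V|} - 1$ is replaced by $\sqrt{|V|}$.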
REFERENCES
[1] R. Gallager, “Low-density parity-check codes,” IRE Trans. Info. Theory, vol. 8, no. 1, pp. 21–28, 1962.
[2] P. Oswald and A. Shokrollahi, “Capacity-achieving sequences for the erasure channel,” IEEE Trans. Info. Theory, vol. 48, no. 12, pp.
3017–3028, Dec. 2002.
[3] C. D. Thompson, “Area-time complexity for VLSI,” in Proceedings of the Eleventh Annual ACM Symposium on Theory of Computing,
ser. STOC ’79. New York, NY, USA: ACM, 1979, pp. 81–88. [Online]. Available: http://doi.acm.org/10.1145/800135.804401
[4] P. Grover, A. Goldsmith, and A. Sahai, “Fundamental limits on the power consumption of encoding and decoding,” in Proc. 2012
IEEE Int. Symp. Info. Theory, 2012, pp. 2716–2720.
[5] C. G. Blake and F. R. Kschischang, “Energy consumption of VLSI decoders,” CoRR, vol. abs/1412.4130, 2014. [Online]. Available:
http://arxiv.org/abs/1412.4130
[6] K. Ganesan, P. Grover, and A. Goldsmith, “How far are LDPC codes from fundamental limits on total power consumption?” in 50th
Ann. Allerton Conf. Commun., Control, and Comput., Monticello, IL, 2012, pp. 671–678.
[7] M. Fiedler, “A property of the eigenvectors of non-negative symmetric matrices and its application to graph theory,” Czechoslovak
Mathematical Journal, vol. 25, pp. 619–633, 1975.
[8] S. Bezrukov, R. Elsasser, B. Monien, R. Preis, and J.-P. Tillich, “New spectral lower bounds on the bisection width of graphs,”
Theoretical Computer Science, vol. 320, pp. 155–174, Mar. 2004.
[9] J. Diaz, M. J. Serna, and N. C. Wormald, “Bounds on the bisection width for random d-regular graphs,” Theoretical Computer Science,
vol. 382, pp. 120–130, 2007.
[10] M. Garey, D. Johnson, and L. Stockmeyer, “Some simplified NP-complete graph problems,” Theoretical Computer Science, vol. 1,
no. 3, pp. 237–267, 1976. [Online]. Available: http://www.sciencedirect.com/science/article/pii/0304397576900591
[11] J. Thorpe, “Design of LDPC graphs for hardware implementation,” in Proceedings of 2002 IEEE International Symposium on
Information Theory, 2002, p. 483.
[12] K. Ganesan, P. Grover, and J. Rabaey, “The power cost of over-designing codes,” in Proc. 2011 IEEE Workshop Signal Proc. Sys.,
2011, pp. 128–133.
[13] D. B. West, Introduction to Graph Theory, 2nd ed. Prentice Hall, 2001.
[14] T. Richardson and R. Urbanke, Modern Coding Theory. New York, NY, USA: Cambridge University Press, 2008.
[15] I. Sason, “On universal properties of capacity-approaching LDPC code ensembles,” IEEE Trans. Info. Theory, vol. 55, no. 7, pp. 1–2,
Jul. 2009.
[16] R. G. Gallager, Information Theory and Reliable Communication. New York, NY, USA: John Wiley & Sons, Inc., 1968.
[17] V. Strassen, “Asymptotische Abschätzungen in Shannons Informationstheorie,” in Trans. 3rd Prague Conf. Info. Theory, Statist. Decision
Functions, Random Proc. Prague: Pub. House Czechoslovak Acad. Sciences, 1962, pp. 689–723.
[18] C. Roth, A. Cevrero, C. Studer, Y. Leblebici, and A. Burg, “Area, throughput, and energy-efficiency trade-offs in the VLSI
implementation of LDPC decoders,” in 2011 IEEE International Symposium on Circuits and Systems (ISCAS), May 2011, pp. 1772–
1775.
