Abstract-This paper presents a survey of iterative decoders of low-density parity-check (LDPC) codes built from unreliable logic gates. We assume that hardware unreliability comes from supply voltage reduction, which causes probabilistic gate failures called timing errors.
I. INTRODUCTION
Replication of electronic devices combined with majority voting, proposed by von Neumann [1] and Winograd [2], is a classical way of increasing the reliability of Boolean function computation and/or storage of information. Although simple to implement, it requires a large quantity of redundant hardware in order to achieve a high level of reliability. Unfortunately, later attempts to reliably compute Boolean functions with unreliable logic gates, using combinatorial schemes with lower redundancy, were not successful. Dobrushin and Ortyukov [3] found Boolean functions for which the construction of such schemes is even impossible.
On the other hand, in 1968 Taylor [4], and later Kuznetsov [5], provided much more promising results on reliable storage of information. Instead of replicating memory cells, Taylor proposed that information be packed as a codeword of a low-density parity-check (LDPC) code and stored in registers, which are periodically refreshed by an error-correction circuit. Taylor proved that a memory built only from unreliable components, with complexity increasing linearly with the quantity of stored information, is capable of preserving all information for an arbitrarily long time period.
P. Ivaniš and S. Brkic are with the University of Belgrade, School of Electrical Engineering, 11000 Belgrade, Serbia (e-mails: predrag.ivanis@etf.rs, srdjan.brkic@etf.rs).
B. Vasić is with the Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ, 85721 USA (e-mail: vasic@ece.arizona.edu).
The significance of Taylor's work was for a long time not fully recognized in the coding theory community. However, four decades after Taylor's original work, ensuring fault tolerance was recognized as one of the most significant challenges in the semiconductor industry [6], [7]. Shortly after, Taylor's work was rediscovered [8]-[14], which led to a surge of interest in decoders of LDPC codes built from unreliable hardware, i.e., noisy decoders. A number of relevant papers were dedicated to density evolution under uncorrelated gate failures [15]-[19], with the rather pessimistic conclusion that arbitrarily low error probability cannot be achieved, even when the code length tends to infinity.
Contrary to the popular belief that hardware failures have a negative impact on decoding, in our previous work we showed that deliberate failure insertion can be beneficial to the decoding process [20]-[22]. We attributed such behaviour to stochastic resonance, a phenomenon well exploited in neural systems, semiconductor and quantum devices.
In another research direction, we investigated iterative decoders under realistic gate failure models, such as correlated gate failures. This paper provides an overview of the current state of our research on the behaviour of hard-decision decoders under a special type of correlated gate failures, inevitable in synchronous CMOS circuits, called timing errors. Following this line of research, we revealed an error-correction potential of noisy decoders that is not visible in the density evolution analysis under uncorrelated failures. In a series of articles we showed the robustness of different error-correction techniques to timing errors: one-step majority logic decoders [23], Gallager B decoders [24], bit-flipping decoders [25], [26], as well as coded memory architectures [27]. In this paper we especially emphasize two major contributions: 1) the discovered positive impact of timing errors on the decoder's correction capability and 2) the ability of noisy decoders to achieve arbitrarily low error probability. The first phenomenon is explained on the example of the Gallager B decoder, while the second contribution involves expander codes and bit-flipping decoders. Furthermore, we provide a lower bound on the number of channel errors that can be corrected by noisy bit-flipping decoders. In addition, we show that expander graph arguments can also be used to provide guarantees of reliable information storage when a noisy bit-flipping decoder is used as a correcting circuit.
The rest of the paper is organized as follows. Section II contains the preliminaries on codes on graphs, followed by the gate failure modeling approaches and the state-of-the-art density evolution analysis. In Section III we discuss Gallager B decoding under timing errors, while the guaranteed error correction of the bit-flipping decoder is presented in Section IV. Section V discusses the application of LDPC codes in memories. A concluding discussion is given in Section VI.
II. PRELIMINARIES

A. Codes and Decoders on Graphs
Consider a graph-based representation of LDPC codes through a bipartite (Tanner) graph G = (V ∪ C, E), where V represents the set of n variable nodes, C is the set of nγ/ρ check nodes, and E is the set of nγ edges. An edge e = (v, c) connects variable node v ∈ V and check node c ∈ C iff $H_{c,v} = 1$, where H denotes the parity-check matrix of the code. All nodes connected to a particular node u are called neighbors and form a set $N_u$. The number of neighbors of a variable node v (check node c) is denoted by $\gamma_v$ ($\rho_c$).
If ∀c ∈ C, $\rho_c = \rho$, we say that the graph is ρ-right-regular, and similarly, if ∀v ∈ V, $\gamma_v = \gamma$, the graph is γ-left-regular. We say that the graph (and the corresponding code) is (γ, ρ)-regular if it is at the same time ρ-right-regular and γ-left-regular. The length of the shortest cycle in the Tanner graph is called the girth and is denoted by g.
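The definitions above can be illustrated with a short sketch that builds the neighbor sets from a parity-check matrix and tests (γ, ρ)-regularity. The helper names and the toy `grid_code` (row and column parity checks on a small grid) are assumptions introduced only for illustration; they are not codes or routines from the surveyed papers.

```python
def tanner_neighbors(H):
    """Return (N_v, N_c): neighbor lists of variable and check nodes.

    H is a parity-check matrix given as a list of rows with 0/1 entries;
    the row index is the check node c, the column index the variable node v.
    """
    m, n = len(H), len(H[0])
    N_c = [[v for v in range(n) if H[c][v] == 1] for c in range(m)]
    N_v = [[c for c in range(m) if H[c][v] == 1] for v in range(n)]
    return N_v, N_c


def regularity(H):
    """Return (gamma, rho) if the graph is (gamma, rho)-regular, else None."""
    N_v, N_c = tanner_neighbors(H)
    var_degrees = {len(nb) for nb in N_v}
    chk_degrees = {len(nb) for nb in N_c}
    if len(var_degrees) == 1 and len(chk_degrees) == 1:
        return var_degrees.pop(), chk_degrees.pop()
    return None


def grid_code(k):
    """Toy example (an assumption, not a code from the paper): k*k bits on a
    grid with one parity check per row and one per column."""
    rows = [[1 if v // k == r else 0 for v in range(k * k)] for r in range(k)]
    cols = [[1 if v % k == c else 0 for v in range(k * k)] for c in range(k)]
    return rows + cols


H = grid_code(3)
N_v, N_c = tanner_neighbors(H)
print(regularity(H))                      # (gamma, rho) of the toy graph
print(len(N_v), len(N_c))                 # n and n*gamma/rho
print(sum(len(nb) for nb in N_v))         # number of edges, n*gamma
```

For the toy graph the counts agree with the expressions in the text: n variable nodes, nγ/ρ check nodes and nγ edges.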
Let $x = (x_1, x_2, \ldots, x_n)$ denote a codeword of an LDPC code, where $x_v \in \{\pm 1\}$ represents the value associated with the variable node v. The vector received from the channel is represented by $y = (y_1, y_2, \ldots, y_n)$, where
An iterative decoder follows the factor-graph approach: nodes of the Tanner graph iteratively exchange messages. A message passed from a check node c to a variable node v during the $\ell$-th iteration is denoted by $\mu^{(\ell)}_{c \to v}$, while a message passed in the opposite direction is denoted by $\nu^{(\ell)}_{v \to c}$. The messages that leave a particular node are calculated using functions implemented locally within the node (node update functions), based on the messages sent by its neighbours during the previous iterations.
During the past decades a variety of decoders have been proposed, with different performance-complexity trade-offs. In the most complex decoders, like the Sum-Product algorithm (SPA), variable node update functions produce floating-point probability estimates, while the practically most significant ones are decoders that exchange quantized messages, called FAIDs (Finite-Alphabet Iterative Decoders), which in some cases even outperform SPA [28]. On the opposite side of the spectrum are hard-decision decoders in which, during a decoding iteration, a node sends/receives a single binary value. Usually, a check-to-variable message reveals whether the parity check is satisfied, while a variable-to-check message contains the sign of the transmitted bit likelihood, denoted by $\Omega_{v,c}$. Thus, $\nu^{(\ell)}_{v \to c} = \operatorname{sgn}(\Omega_{v,c})$,
where sgn(·) denotes the signum function. On the other hand, check-to-variable messages are $\mu$ 2) bit-flipping decoders, for which $S_{v,c}$ takes the following elements
Note that in the case of the bit-flipping decoder, the variable-to-check messages sent from a particular node are all the same. The decoding process terminates when all parity-check equations are satisfied or the maximal number of iterations is reached. It should be observed that operations in the check node c require implementation of $\rho_c$ XOR gates with $\rho_c - 1$ inputs. Similarly, the variable node processor v is composed of $\gamma_v$ majority logic (MAJ) gates with $\gamma_v - 1$ inputs when the Gallager B decoder is considered, or a single $\gamma_v$-input MAJ gate in the case of the bit-flipping decoder.
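As a rough illustration of the hard-decision decoding loop described above, the following sketch implements a textbook parallel bit-flipping decoder with fully reliable gates. It is a minimal sketch under stated assumptions: bits are represented in {0, 1} rather than {±1}, a variable node flips when strictly more than half of its checks are unsatisfied, and the toy `grid_code` is hypothetical; the exact update rules of the decoders surveyed here may differ.

```python
def grid_code(k):
    """Toy parity-check matrix (an assumption, not a code from the paper):
    k*k bits on a grid with one parity check per row and per column."""
    rows = [[1 if v // k == r else 0 for v in range(k * k)] for r in range(k)]
    cols = [[1 if v % k == c else 0 for v in range(k * k)] for c in range(k)]
    return rows + cols


def bit_flipping_decode(H, y, max_iter=20):
    """Parallel bit flipping on hard decisions y (list of 0/1 values).

    Returns (estimate, success); success is True when all parity checks
    are satisfied by the returned estimate.
    """
    m, n = len(H), len(H[0])
    N_c = [[v for v in range(n) if H[c][v]] for c in range(m)]
    N_v = [[c for c in range(m) if H[c][v]] for v in range(n)]
    x = list(y)
    for _ in range(max_iter):
        # Check nodes: XOR of all neighbouring bits (1 = unsatisfied check).
        syndrome = [sum(x[v] for v in N_c[c]) % 2 for c in range(m)]
        if not any(syndrome):
            return x, True
        # Variable nodes: majority vote over all incident checks.
        flips = [v for v in range(n)
                 if 2 * sum(syndrome[c] for c in N_v[v]) > len(N_v[v])]
        if not flips:
            break  # no variable node wants to flip: the decoder is stuck
        for v in flips:
            x[v] ^= 1
    success = all(sum(x[v] for v in N_c[c]) % 2 == 0 for c in range(m))
    return x, success


H = grid_code(3)
single = [0, 0, 0, 0, 1, 0, 0, 0, 0]   # one channel error
double = [0, 0, 0, 1, 1, 0, 0, 0, 0]   # two errors sharing a row check
print(bit_flipping_decode(H, single))
print(bit_flipping_decode(H, double))
```

On this toy graph a single error is corrected in one iteration, while two errors sharing a check leave the decoder stuck, which is a small-scale analogue of the failure patterns discussed later in the paper.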
B. Modeling Logic Gates Failures
Let $f : \{\pm 1\}^m \to \{\pm 1\}$, $m > 1$, be an m-argument node update function, which at decoding iteration $\ell$ produces the message $z^{(\ell)}$ from its m input arguments at time $\ell$. Due to unreliability of the logic gates used to calculate f, the result is not $z^{(\ell)}$ but $z^{(\ell)} e^{(\ell)}$, where $e^{(\ell)} \in \{\pm 1\}$ is the error at time $\ell$.
In the vast majority of the literature on coding-enhanced electronic devices, it is assumed that $e^{(\ell)}$ is a Bernoulli random variable whose statistics neither change with decoding iterations nor depend on the function input arguments. In other words, gate failures are spatially and temporally uncorrelated, which corresponds to the von Neumann failure model [1]. However, this modeling approach is too general and does not always produce practically relevant conclusions. For example, if hardware unreliability comes from supply voltage reduction, failures are highly influenced by gate input arguments, i.e., we say that failures are data-dependent.
Reducing the supply voltage prolongs the digital signal delay, so that the timing constraints established for the nominal (higher) supply voltage are no longer met. If we do not slow down the system clock, the logic gate output will not always switch correctly, leading to so-called timing errors [7], [29], [30]. In other words, we can write $\Pr\{e^{(\ell)} = -1 \mid z^{(\ell)} \neq z^{(\ell-1)}\} = \varepsilon$.
On the other hand, when the gate output does not change during a decoding iteration, we assume that the signal propagation delay is zero, i.e., $\Pr\{e^{(\ell)} = -1 \mid z^{(\ell)} = z^{(\ell-1)}\} = 0$.
It should be noted that, in practice, different gate input arguments (which force the gate output to switch) cause failures with different probabilities [30]. However, guided by Einstein's observation that everything should be made as simple as possible, but no simpler, we sacrifice precision in order to produce more fundamental results. In that spirit, we can assume that ε denotes the highest possible failure probability obtained experimentally, or by circuit-level simulations, for the chosen technological parameters (clock frequency, supply voltage, temperature). Nevertheless, it typically differs from one gate type to another, and we denote by $\varepsilon_{\oplus}$ and $\varepsilon_{\mathrm{MAJ}}$ the failure rates of XOR and MAJ gates, respectively. Note that failure injection can be described by a mapping $\Upsilon : \{\pm 1\}^3 \to \{\pm 1\}$, where the actual gate output $\hat{z}^{(\ell)}$ can be obtained by
In the subsequent sections we show how timing errors affect the performance of hard-decision decoders.
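The difference between the two failure models of this subsection can be made concrete with a small simulation sketch. The function names are hypothetical, and the sketch assumes that a timing error leaves the gate output at the value it held after the previous iteration (including any earlier failure), while a von Neumann failure flips the output regardless of the data.

```python
import random


def timing_error_output(z_new, z_prev, eps, rng):
    """One gate evaluation under the timing-error model, values in {+1, -1}.

    A failure is possible only when the output has to switch; in that case
    the gate keeps its previous value with probability eps.
    """
    if z_new != z_prev and rng.random() < eps:
        return z_prev
    return z_new


def von_neumann_output(z_new, eps, rng):
    """Data-independent failure: the output is flipped with probability eps."""
    return -z_new if rng.random() < eps else z_new


def run_gate(ideal_outputs, eps, rng, initial=+1):
    """Feed a sequence of fault-free outputs through the timing-error model."""
    held = initial
    actual = []
    for z in ideal_outputs:
        held = timing_error_output(z, held, eps, rng)
        actual.append(held)
    return actual


rng = random.Random(1)
stable = [+1, +1, +1, +1]
alternating = [-1, +1, -1, +1]
print(run_gate(stable, 1.0, rng))        # a stable message is never perturbed
print(run_gate(alternating, 1.0, rng))   # a switching message is perturbed
print(run_gate(alternating, 0.0, rng))   # a reliable gate follows the input
```

The sketch reflects the data dependence discussed above: only messages that change their value between iterations are exposed to failures.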
C. Density Evolution and Transient Gate Failures
Density evolution (DE) is recognized as one of the most important mathematical tools used to predict the performance of LDPC codes and message-passing decoders on various communication channels. Let $P_n^{(\ell)}(p)$ denote the bit error probability of an LDPC code of length n after $\ell$ decoding iterations, when the crossover probability of the transmission channel is equal to p. The goal of DE analysis is to find the channel error probability threshold $p^*$ such that for all $p < p^*$, $\lim_{\ell \to \infty} \lim_{n \to \infty} P_n^{(\ell)}(p) = 0$. In other words, DE proves the existence of codes for which, in the limit of asymptotic block length, the Gallager B decoder made of fully reliable components can correct all received errors.
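The threshold behaviour described above can be reproduced numerically for the simplest case. The sketch below uses the classical Gallager A recursion for a (γ, ρ)-regular ensemble on the binary symmetric channel with fully reliable gates; for γ = 3 this rule coincides with the textbook Gallager B decoder with flipping threshold 2. The function names, the bisection search and the stopping tolerance are assumptions made for illustration, and the sketch does not model gate failures.

```python
def gallager_a_error(p0, gamma, rho, iterations=500):
    """Message error probability after the given number of DE iterations."""
    p = p0
    for _ in range(iterations):
        # Probability that an extrinsic check-to-variable message is wrong:
        # an odd number of the other rho - 1 incoming messages is in error.
        q = (1.0 - (1.0 - 2.0 * p) ** (rho - 1)) / 2.0
        # The channel value is wrong and not all gamma - 1 checks correct it,
        # or the channel value is right and all gamma - 1 checks are wrong.
        p = (p0 * (1.0 - (1.0 - q) ** (gamma - 1))
             + (1.0 - p0) * q ** (gamma - 1))
    return p


def threshold(gamma, rho, tol=1e-9, steps=30):
    """Bisection estimate of the largest p0 for which the error vanishes."""
    lo, hi = 0.0, 0.5
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if gallager_a_error(mid, gamma, rho) < tol:
            lo = mid
        else:
            hi = mid
    return lo


print(gallager_a_error(0.03, 3, 6))   # below threshold: tends to zero
print(gallager_a_error(0.05, 3, 6))   # above threshold: stays bounded away
print(threshold(3, 6))                # numerical threshold estimate
```

Because the number of iterations is finite, the bisection slightly underestimates the true threshold; increasing `iterations` tightens the estimate.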
Recently, the DE approach was extended to the case of noisy decoders and transient von Neumann gate failures [15], [16], [19], [32]. For example, the bit error probability achievable by the noisy Gallager B decoder applied to a (γ, ρ)-regular code ensemble can be expressed in the following recursive form [16]
where
In contrast to their reliable counterparts, noisy Gallager B decoders, as shown in [16], do not achieve arbitrarily low error probability, even in the asymptotic case. Instead, we usually speak of η-reliability. Even when $\varepsilon_{\mathrm{MAJ}} = 0$, it follows from Eq. (1) that, for γ > 3,
. This is formalized in the following proposition.
Proposition 1. The bit error probability after $\ell$ decoding iterations, $P^{(\ell)}$
for any LDPC ensemble decoded using the noisy Gallager B decoder, for every channel noise level p and every decoder noise level $\varepsilon_{\oplus} > 0$ and/or $\varepsilon_{\mathrm{MAJ}} > 0$.
The above claim is numerically expressed in Fig. 1 , for several different levels of logic gate unreliability. In Section IV we show that the above proposition does not necessarily hold for the case of timing errors. Indeed, we prove that decoders partially built from unreliable gates can achieve zero error probability. 
III. NOISY GALLAGER B DECODER
We start our discussion with the simple but important observation that the occurrence of timing errors during the first decoding iteration depends on the values of the update functions prior to decoding. In general, we do not have any control over these inherited values, which can dramatically degrade decoder performance. Let ν
The above theorem reveals that decoder performance depends on the order of the transmitted codewords, since it is not possible to adjust the coefficients $A_v$ and $B_v$ based on the transmitted codeword (which is unknown prior to decoding). Incorrect switches in the first iteration add to the channel-induced errors and, in a sense, increase the initial error rate. In [33] we have shown that the most harmful codeword ordering can degrade the frame error rate (FER) by several orders of magnitude. Logic failures during subsequent iterations are rare and can be compensated during the iterative decoding process. In addition, ensuring error-free computation during the first iteration brings another benefit, stated in the following corollary.
Corollary 1. The frame error rate of the Gallager B decoder with a failure-free first decoding iteration is independent of the transmitted codeword.
In [33] we have shown that a decoder satisfying the above corollary can perform as well as its counterpart built from reliable components. A failure-free decoding iteration can be accomplished by increasing the system clock period and letting the signal level stabilize. Furthermore, Dupraz et al. [34] recently provided a rigorous proof that, if the number of iterations and the code length go to infinity, the error rate of the noisy decoder converges to the error rate of the fully reliable decoder. However, asymptotic analysis masks an important property which dominantly influences the performance of finite-length codes: trapping sets [35]. We next formally define trapping sets. A trapping set T is called an (a, b) trapping set, where |T| = a and b is the number of parity-check nodes connected to an odd number of nodes in T.
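The (a, b) labelling of a trapping set can be computed directly from the parity-check matrix. The sketch below is a hypothetical helper that only evaluates the two parameters for a given set of variable nodes; it does not verify that the set actually traps a particular decoder, and the toy `grid_code` is an assumption rather than a code from the paper.

```python
def grid_code(k):
    """Toy parity-check matrix (an assumption, not a code from the paper):
    k*k bits on a grid with one parity check per row and per column."""
    rows = [[1 if v // k == r else 0 for v in range(k * k)] for r in range(k)]
    cols = [[1 if v % k == c else 0 for v in range(k * k)] for c in range(k)]
    return rows + cols


def trapping_set_label(H, T):
    """Return (a, b) for a set T of variable-node indices.

    a is the number of variable nodes in T, b the number of check nodes
    connected to an odd number of nodes in T.
    """
    a = len(T)
    b = sum(1 for row in H if sum(row[v] for v in T) % 2 == 1)
    return a, b


H = grid_code(3)
print(trapping_set_label(H, {3, 4}))         # two nodes sharing a row check
print(trapping_set_label(H, {0, 1, 3, 4}))   # support of a codeword: b = 0
```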
As shown by Vasic et al. [36], revealing the most dominant trapping sets (trapping sets with the lowest cardinalities) for a chosen decoder can lead to the construction of codes with lower error floors. Dzung et al. [37] designed codes free of small trapping sets, especially adjusted to Gallager B decoding, called Latin-Square (LS) codes. On the other hand, other classes, like the well-known Tanner quasi-cyclic (QC) codes [38], exhibit poor error-floor performance, but are easier to construct since they do not require elaborate information related to the decoder's behaviour.
Here we show that, instead of designing a specific code, error-floor performance can be improved by allowing the decoder to work on unreliable hardware. We first observed this surprising effect in the case of von Neumann failures [20], [39], and showed that message perturbations caused by gate failures can help the decoder escape from trapping sets. Furthermore, as the number of iterations tends to infinity, all trapping sets break and the decoding process always converges to a valid codeword. In the limiting case, a simple noisy hard-decision decoder becomes a maximum likelihood (ML) decoder.
However, the practicality of the above approach mainly depends on the ability of the circuit designer to construct logic gates that produce uncorrelated failures with error probabilities in a specific narrow range. On the other hand, timing errors inherently exist in synchronous VLSI circuits that operate in a subthreshold regime. Instead of introducing perturbations in every logic gate in the decoder, timing error injection ensures that only messages that frequently switch their values become candidates for hardware-induced perturbation. Frequent switches of Tanner graph messages are present in a large number of trapping sets. We start with the example of the trapping set presented in Fig. 2.
Consider the subgraph given in Fig. 2(a), induced by a set of variable nodes $V = \{v_1, v_2, \ldots, v_5\}$. It is the well-known (5,3) trapping set, connected to check nodes from the set $C = \{c_1, c_2, \ldots, c_8\}$. Note that white circles correspond to correct variables, while black circles correspond to erroneous variables. Similarly, white squares denote satisfied parity checks, while unsatisfied parity checks are illustrated by black squares. Suppose that variables $v_2$, $v_4$ and $v_5$ are erroneously received from the channel. After the first iteration, a decoder built from reliable hardware will correct all erroneous variables, meanwhile corrupting variables $v_1$ and $v_3$, as illustrated in Fig. 2(b). It can be shown that the second iteration annuls the effect of the previous one, and the decoding process returns to the beginning. In the subsequent iterations the number of erroneous variables will alternate between three and two, never reducing to zero.
Suppose now that decoding is performed by a noisy decoder and that during the third decoding iteration the message sent by variable node $v_4$ to $c_5$ is perturbed due to its inability to meet the timing constraints (Fig. 2(c)). The perturbation prevents corruption of the variable node $v_3$, as illustrated in Fig. 2(d), and in only one more iteration all errors will be corrected.
The full benefits produced by timing errors can be noticed by examining the average FER as a function of logic gate unreliability, which is shown in Fig. 3. As an illustrative example we choose a QC code with |V| = 155, |C| = 64 and code rate R = 0.4, denoted by QC(155,64), and measure the FER by Monte Carlo simulation for the channel crossover probability p = 0.1. The performance of the QC code is compared with that of an LS code with the same construction parameters (number of variable/check nodes, code rate, minimum distance), but which is free of small trapping sets, denoted by LS(155,64). Although the LS code outperforms the QC code when perfectly reliable hardware is used, timing errors compensate for the structural deficiencies of the QC code, improving its performance by an order of magnitude. Thus, the QC code achieves a slightly lower error rate than its LS counterpart. On the other hand, as shown in [33], the LS code is highly resistant to timing errors, but they do not improve its performance. Note also that the QC code retains a low error rate even for high failure rates. This can be attributed to the low number of signal switches that can cause timing errors. Furthermore, switches appear with high probability in messages that are involved in trapping sets, while the majority of (correct) messages outside of a trapping set are stable and, according to our model, will not be perturbed. Recall that timing errors are a consequence of the reduction of the power supply voltage, i.e., voltage scaling, which cannot easily be forced to produce a certain fixed error probability. However, due to the nature of the decoding process, a timing error is a rare event, which simplifies the voltage scaling mechanism.
Another important decoding parameter is the convergence time, i.e., the average number of iterations after which the decoder produces a valid codeword or gets stuck in a trapping set. In both cases, prolonging the decoding further does not change the decoder's performance. The Gallager B decoder made of reliable gates is known to be a fast decoder, and it is able to make the most of its error-correction potential after only 10 to 20 decoding iterations. As shown in Fig. 4, timing errors do not dramatically increase the convergence time, and the noisy Gallager B decoder remains a desirable solution for high-speed applications.
IV. NOISY BIT-FLIPPING DECODER
Here we examine an important characteristic of noisy decoders: their ability to guarantee correction of a certain number of "worst case" channel errors. As discussed in Section II-C, under the von Neumann failure model the decoder cannot provide any guarantees. Even when computation is performed by fully reliable hardware, provable error correction is restricted to only a small number of decoders. Sipser and Spielman, in a now classical paper [40], linked the error-correction capability of the bit-flipping decoder with the expansion property of the Tanner graph. In subsequent work, Burshtein and Miller [41] extended their results to the Gallager B decoder, while Feldman et al. [42] showed that linear programming decoding can also correct a fixed fraction of worst-case channel errors if the underlying graph is a good expander. Here we show, on the example of the noisy bit-flipping decoder, that decoders built partially from unreliable gates can achieve arbitrarily small error probability. We start with the formal definition of the expansion property.
The expansion property allows the code rate and the relative minimum distance to stay constant as the code length increases. In the prominent paper [43], Barg and Zemor proved that expander codes achieve Shannon's capacity on binary symmetric channels. Here we show that the number of errors that can be corrected by the noisy bit-flipping decoder increases linearly with the code length if MAJ gates operate reliably and if, during the first iteration, the number of XOR gate failures does not exceed $C_{\mathrm{XOR}}$.
Theorem 2.
[26] Consider a $(\gamma, \rho, \alpha, (7/8 + \epsilon)\gamma)$ expander, $1/8 \geq \epsilon > 0$. The bit-flipping decoder built from unreliable check nodes can correct any pattern of
errors.
Note that, in addition to the above constraints, we also require that the expansion of the Tanner graph be greater than 7γ/8. In order to overcome worst-case error patterns combined with hardware failures, the code graph needs to expand more: in [40] it was shown that the perfect bit-flipping decoder requires an expansion of only 3γ/4. It is now not hard to extend the above result to probabilistic channels and prove that noisy decoders can achieve arbitrarily small error probability. The following theorem describes a decoder behaviour that is fundamentally different from the result given by Proposition 1.
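For very small graphs, the expansion requirement discussed above can be checked by exhaustive search. The sketch below assumes the common convention that a graph is a (γ, ρ, α, δ) expander when every set S of at most αn variable nodes has at least δ|S| neighboring check nodes; the function name and the toy `grid_code` are hypothetical, and the search is exponential in the set size, so it is meant only as an illustration.

```python
from itertools import combinations


def grid_code(k):
    """Toy parity-check matrix (an assumption, not a code from the paper):
    k*k bits on a grid with one parity check per row and per column."""
    rows = [[1 if v // k == r else 0 for v in range(k * k)] for r in range(k)]
    cols = [[1 if v % k == c else 0 for v in range(k * k)] for c in range(k)]
    return rows + cols


def min_expansion(H, max_size):
    """Smallest ratio |N(S)| / |S| over all non-empty sets S of at most
    max_size variable nodes, where N(S) is the set of neighboring checks."""
    m, n = len(H), len(H[0])
    N_v = [{c for c in range(m) if H[c][v]} for v in range(n)]
    worst = float("inf")
    for size in range(1, max_size + 1):
        for S in combinations(range(n), size):
            neighbors = set().union(*(N_v[v] for v in S))
            worst = min(worst, len(neighbors) / size)
    return worst


H = grid_code(3)
gamma = 2
print(min_expansion(H, 1))                       # single nodes expand by gamma
print(min_expansion(H, 2))                       # pairs may share a check
print(min_expansion(H, 2) > 7 * gamma / 8)       # 7/8 requirement for sets of size <= 2
print(min_expansion(H, 2) > 3 * gamma / 4)       # 3/4 requirement for sets of size <= 2
```

The toy graph is far too small and sparse to meet either requirement for sets of two nodes, which illustrates why good expanders are hard to obtain in short structured codes.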
Theorem 3. Consider a binary symmetric channel with crossover probability $p < 3(3 + 8\epsilon)\alpha/32$. Then, if there exists an expander code $(\gamma, \rho, \alpha, (7/8 + \epsilon)\gamma)$ of length n, $1/8 \geq \epsilon > 0$, the probability of error after decoding, P
As shown above, expander codes behave well asymptotically; however, constructing practical codes that are good expanders is not an easy task. It is known that random bipartite graphs are good expanders with high probability, but they are infeasible for practical applications. Although Capalbo et al. [44] provided an explicit construction of codes with arbitrary expansion (called lossless expanders), high expansion comes at the expense of decoder complexity. Nevertheless, their method proves the existence of codes which are able to achieve zero error probability when decoded by noisy decoders.
Examining the number of variable nodes with a certain expansion property in structured LDPC codes is known to be an NP-complete problem. However, it is possible to link some structural parameters of a code with the minimal graph expansion, as shown by Chilappagari et al. [45]. Namely, they proved that the error-correction capability of a code based on a γ-left-regular graph with γ ≥ 4 increases exponentially with the girth of the graph. We were able to extend their results to the case of noisy decoders, as shown in the following theorem.
Note that we were able to provide lower bounds on the error-correction capability only for codes with γ ≥ 8. The corresponding numerical values are given in the table below. As can be observed, the error-correction capability of the noisy bit-flipping decoder is modest, but its low complexity makes it a desirable solution for coded memories, as described in Section V. Again we stress the importance of ensuring that the first decoding iteration is error-free ($C_{\mathrm{XOR}} = 0$), since failures in the first iteration linearly reduce the correction capability of the decoder.
V. APPLICATION TO MEMORIES
Memories usually occupy a large portion of VLSI devices and are especially vulnerable to manufacturing process variations, voltage drops, or thermal radiation. Thus, different modern volatile/non-volatile memories store information in coded form and are equipped with error-correction circuits. The theoretical fundamentals of coded memories were given by Taylor [4] and Kuznetsov [5] in the late sixties and early seventies, respectively. Taylor was the first to consider memories enhanced by LDPC codes, and provided a framework for the design and analysis of memories built entirely from unreliable components. The memory stability concept plays a central role in Taylor's framework. Let us assume that at time t = 0 a codeword of an LDPC code is written into an unreliable memory. The memory cell content is updated at equidistant time instants τ, 2τ, 3τ, . . . by one iteration of a chosen hard-decision decoder, which is also made from unreliable hardware. Note that even if the information is extracted just after an update cycle, it is still erroneous with a probability lower bounded by the unreliability of the last logic gate used in the decoder. For that reason another hard-decision decoder, built from reliable logic gates, is added to the memory architecture and used only when information is read from the memory. Thus, we declare a memory failure only if the read information cannot be correctly decoded by the additional decoder in a finite number of iterations. We next define memory stability.
Definition 4. Memory complexity is defined as the sum of all memory cells and 2-input Boolean functions used in the memory architecture.
Definition 5. A memory which stores K information bits is stable if the following is satisfied:
i) Memory complexity is limited by θK, where θ is fixed.
ii) For every time instant t > 0 and every δ > 0, the memory failure probability at instant t satisfies $P_f(t) < \delta$.
It should be noted that only memories coded by LDPC codes satisfy the stability condition. Here we examine the memory architecture that employs the noisy bit-flipping decoder, as shown in Fig. 5. Let us assume that the fraction of memory cell failures between two update cycles is bounded by $\alpha_m$. Similarly, let the fraction of MAJ gates that fail during an update cycle be bounded by $\alpha_\gamma$. Then, we can formulate the following theorem.
Theorem 5.
[27] The proposed memory architecture based on a $(\gamma, \rho, \alpha, (7/8 + \epsilon)\gamma)$ expander code can preserve all stored bits for an arbitrarily long time period if $\alpha_m + \alpha_\gamma < 2 (3 + 8\epsilon)\alpha$.
It is not difficult to show that in the asymptotic case the proposed memory satisfies the stability conditions. We now illustrate the behaviour of finite-length memories, assuming that failures of memory cells are uncorrelated (von Neumann failures) and happen with probability $p_m$, while logic gate failures follow the timing error mechanism (Fig. 6). As an illustrative example, we assume that the stored information is encoded by the projective-geometry code denoted by PG(2, $2^3$). Since the code length is finite, after a sufficiently long time the memory content will be permanently corrupted. The strong impact that LDPC coding has on memory reliability is self-explanatory. We only draw the reader's attention to the usefulness of timing errors in preserving the memory content even longer.
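The refresh mechanism of this section can be sketched as a small simulation. This is a minimal sketch under several assumptions that are not taken from the paper: bits are stored as 0/1 values, the all-zero codeword is written at t = 0, memory cells fail independently with probability `p_m` between refreshes, only the XOR gates are subject to timing errors (a failed gate keeps its previous output), MAJ gates are reliable, and the final reliable read-out decoder is omitted. The function names and the toy `grid_code` are hypothetical.

```python
import random


def grid_code(k):
    """Toy parity-check matrix (an assumption, not a code from the paper):
    k*k bits on a grid with one parity check per row and per column."""
    rows = [[1 if v // k == r else 0 for v in range(k * k)] for r in range(k)]
    cols = [[1 if v % k == c else 0 for v in range(k * k)] for c in range(k)]
    return rows + cols


def simulate_memory(H, steps, p_m, eps_xor, seed=0):
    """Return the number of erroneous cells after each refresh cycle."""
    rng = random.Random(seed)
    m, n = len(H), len(H[0])
    N_c = [[v for v in range(n) if H[c][v]] for c in range(m)]
    N_v = [[c for c in range(m) if H[c][v]] for v in range(n)]
    cells = [0] * n            # all-zero codeword written at t = 0
    prev_checks = [0] * m      # XOR outputs held from the previous cycle
    history = []
    for _ in range(steps):
        # Memory-cell failures between two update cycles (uncorrelated).
        cells = [b ^ 1 if rng.random() < p_m else b for b in cells]
        # Unreliable XOR gates: a timing error keeps the previous output.
        checks = []
        for c in range(m):
            z = sum(cells[v] for v in N_c[c]) % 2
            if z != prev_checks[c] and rng.random() < eps_xor:
                z = prev_checks[c]
            checks.append(z)
        prev_checks = checks
        # Reliable MAJ gates: flip a cell if most of its checks are unsatisfied.
        cells = [b ^ 1 if 2 * sum(checks[c] for c in N_v[v]) > len(N_v[v]) else b
                 for v, b in enumerate(cells)]
        history.append(sum(cells))
    return history


H = grid_code(3)
print(simulate_memory(H, 10, p_m=0.0, eps_xor=0.5))    # no cell failures
print(simulate_memory(H, 10, p_m=0.02, eps_xor=0.0))   # reliable refresh logic
print(simulate_memory(H, 10, p_m=0.02, eps_xor=0.1))   # refresh with timing errors
```

When no memory cell fails, the check outputs never switch, so timing errors cannot occur and the stored codeword is preserved regardless of `eps_xor`; this mirrors the data-dependent nature of timing errors discussed earlier.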
VI. CONCLUSION
The paper contains a survey on the decoding of LDPC codes when the logic gates used in the decoder are prone to timing errors. We mainly focused on hard-decision decoders, like the Gallager B and bit-flipping decoders, whose simplicity enables deeper insight into the influence of timing errors on the decoding process. We established the conditions necessary for decoding performance to be independent of the transmitted codeword, and showed that deliberately adding timing errors can help the decoder correct some otherwise uncorrectable error patterns. Timing errors appear as a consequence of supply voltage reduction, and they do not require any additional hardware. Furthermore, as the supply voltage is reduced, the logic gates become more energy-efficient. The same effect is noticed in memories which employ the noisy bit-flipping decoder.
On the other hand, our analysis of the noisy bit-flipping decoder was mostly aimed at proving the existence of noisy decoders that can achieve arbitrarily low error probability. We showed that a simple bit-flipping decoder can correct a fixed fraction of errors even if it is partially built from unreliable hardware.
