Abstract: In this paper, we propose a statistical model to analyze the performance of the verification-based algorithm (VA) for packet-based low-density parity-check (LDPC) codes over the binary symmetric channel (BSC). In contrast to existing analyses of VA in the literature, we take false verification into consideration. For a given ensemble of LDPC codes and given channel parameters, the proposed model provides an efficient way to find the average performance of packet-based LDPC codes with verification-based decoding. Numerical results show that the proposed method closely estimates the frame error rate (FER) of packet-based LDPC codes with verification-based decoding over the BSC for all crossover probabilities of practical interest.
I. INTRODUCTION
Research on low-density parity-check (LDPC) codes [1]-[4] has traditionally focused on small alphabets. With the development of communication networks and high-speed wireless systems, the units of operation are normally blocks of bits organized as packets. As a result, effort has been devoted to both code design and decoding algorithms for packet-based channel codes [5]-[9]. For the packet erasure channel, fountain codes such as LT codes [6] and Raptor codes [7] have been proposed for multicast and content delivery networks. For packet-based LDPC codes, the verification-based decoding approach (VA) has been proposed by Luby and Mitzenmacher [8]. Their decoding algorithm consists of two iterative stages: verification and correction. At the verification stage, if the sum of all the neighboring variable nodes of a check node equals zero, all these variable nodes are verified. At the correction stage, an unverified variable node can be corrected if all the other variable nodes incident to one of its check nodes have been verified. Its updated value is computed as the sum of those other neighboring variable nodes.
Luby and Mitzenmacher also analyzed the performance of VA over q-ary symmetric channels, where q is a large number. In this scenario, the probability that a variable node is falsely verified was assumed to be negligible. In this paper, we propose a different statistical model to investigate the performance of VA over the binary symmetric channel (BSC). The main contribution of the proposed model is that it takes falsely verified variable nodes into consideration, which is one of the dominant factors affecting the performance of VA. Numerical results verify that the proposed model provides a good estimate of the frame error rate (FER) under various channel conditions for an ensemble of LDPC codes at any number of iterations.
This research was supported in part under the Australian Research Council's Discovery Projects funding scheme (project number DP0877616).
The rest of the paper is organized as follows. Section II introduces preliminaries on LDPC codes and verification-based decoding. The proposed statistical model is described in Section III. Numerical results are presented in Section IV. Section V concludes the paper.
II. VERIFICATION-BASED DECODING FOR PACKET-BASED LDPC CODES

A. Packet-based LDPC Codes
For packet-based LDPC codes, we refer to a packet of $l$ bits as a symbol, and the sum operation is a bitwise EXCLUSIVE-OR between packets. An LDPC code can be represented by a bipartite graph consisting of variable nodes and check nodes. The variable nodes represent the symbols of a codeword, and each check node imposes the constraint that the sum of its neighboring variable nodes equals zero, as specified by the parity-check matrix $H$. A variable node and a check node are denoted by $v$ and $c$, respectively. A pair of nodes $\{v, c\}$ represents an edge, which is the path for message passing.
We use $C(n, \lambda, \rho)$ to denote an ensemble of LDPC codes, where $n$ denotes the length of a codeword and $(\lambda, \rho)$ denote the degree distributions. For a regular LDPC code, the degree of the variable nodes and the degree of the check nodes are constant and are denoted by $d_v$ and $d_c$, respectively. For an irregular LDPC code, the degrees of both variable nodes and check nodes are governed by degree distributions, which can be represented by degree distribution polynomials [10].
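As a small illustration of how the degree distributions determine the code rate, the sketch below evaluates the standard design-rate expression $1 - (\sum_i \rho_i / i) / (\sum_j \lambda_j / j)$ for edge-perspective degree distributions. The function name `design_rate` and the dictionary representation are assumptions made for this example only.

```python
from fractions import Fraction

def design_rate(lam, rho):
    """Design rate of an ensemble from edge-perspective degree distributions.

    lam and rho map a node degree to the fraction of edges attached to
    nodes of that degree (hypothetical representation for this sketch).
    """
    int_lam = sum(Fraction(frac) / deg for deg, frac in lam.items())
    int_rho = sum(Fraction(frac) / deg for deg, frac in rho.items())
    return 1 - int_rho / int_lam

# Regular (3,6) ensemble: every edge attaches to a degree-3 variable node
# and to a degree-6 check node.
rate = design_rate({3: 1}, {6: 1})
print(rate)  # 1/2
```

For a regular ensemble this reduces to $1 - d_v/d_c$.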
B. Verification-Based Decoding
In the verification-based decoding algorithm [8], a check node $c$ verifies all its neighboring variable nodes if their current values sum to zero:

$$\sum_{v \in V_c} c_v = 0, \qquad (1)$$

where $V_c$ is the set of neighboring variable nodes linked with check node $c$ and $c_v$ is the current value of variable node $v$.
As for correction, an unverified variable node can be corrected if all the other variable nodes incident to one of its check nodes have been verified. Specifically, let $\Omega$ denote the set of verified neighboring variable nodes of a check node $c$, and let $v_u$ be the only unverified variable node incident to check node $c$. The correction of variable node $v_u$ is achieved as follows:

$$c_{v_u} = \sum_{v \in \Omega} c_v. \qquad (2)$$
In VA, a variable node is in one of two states: verified or unverified. Once a variable node is verified, its value and state are finalized. However, a verified variable node can hold either a correct or an incorrect value. If a variable node is verified and contains the correct value, we call it a correct verification. If a variable node is verified and contains an incorrect value, we call it a false verification. If a variable node is unverified, we call it an un-verification. Since a verified variable node fixes its value and state for the rest of the decoding process, a false verification directly results in an error frame, regardless of the number of iterations. An un-verification at the last iteration also results in an error frame, since unverified codewords are normally discarded and not passed to higher layers of a communication protocol [11].
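To make the notion of false verification concrete, the following sketch uses hypothetical 8-bit packets attached to a single check node. When two packets are hit by the same error pattern, the errors cancel in the bitwise XOR, the check sum is still zero, and all neighbors are verified even though two of them hold incorrect values. The packet values and the helper name `check_sum` are illustrative assumptions.

```python
from functools import reduce
from operator import xor

def check_sum(packets):
    """Bitwise XOR of all packets attached to one check node."""
    return reduce(xor, packets, 0)

# Hypothetical transmitted packets that satisfy the check (XOR is zero).
sent = [0b10110010, 0b01101100, 0b11011110]

# The same bit is flipped in the first two packets, so the errors cancel.
error = 0b00000100
received = [sent[0] ^ error, sent[1] ^ error, sent[2]]

# A single error, by contrast, leaves a non-zero check sum (no verification).
single_error = [sent[0] ^ error, sent[1], sent[2]]

print(check_sum(sent), check_sum(received), check_sum(single_error))
```

With longer packets, two independent error patterns are less likely to coincide, which is why false verification is often neglected for large alphabets but matters for moderate packet sizes.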
In the following, we reformulate the VA as a message passing algorithm.¹ In a message passing algorithm, messages are passed along the edges between nodes, and nodes process the incoming messages to determine the outgoing messages.

¹ We note that the message passing reformulation of VA is from a performance analysis perspective, not from an implementation perspective.
The decoding process proceeds in iterations. Each iteration includes two stages: verification and correction. As shown in Fig. 1, each stage involves cycles of message passing followed by processing at the variable nodes and the check nodes.
At the verification stage, the states of the variable nodes are determined. The cycle of message passing and processing contains four steps, as shown in Fig. 1.
• Step 1. Message passing from variable nodes to check nodes. As shown in Fig. 1(a), each variable node sends a message containing its current value through the edges to its neighboring check nodes.
• Step 2. Processing at the check node. As shown in Fig. 1(b), the check node processes all incoming messages from its neighboring variable nodes by using (1), generating messages that convey the suggested state: verified if (1) holds, unverified otherwise.
• Step 3. Message passing from check nodes to variable nodes. As shown in Fig. 1(c), the check node delivers the verified or unverified messages to its neighboring variable nodes.
• Step 4. Processing at the variable node. As shown in Fig. 1(d), the variable node determines its state based on all incoming messages from adjacent check nodes. If at least one of the received messages carries the verified state, the variable node is deemed verified, thereby finalizing its current value.
At the correction stage, an unverified variable node can change its value and state. There are also four steps at this stage, as shown in Fig. 1.
• Step 1. Message passing from variable nodes to check nodes. As shown in Fig. 1(a), each variable node sends a message containing its current value and state through the edges to its adjacent check nodes.
• Step 2. Processing at the check node. As shown in Fig. 1(b), the check node calculates the proposed value of its one unverified neighboring variable node by using (2) if all other incoming messages from its neighboring variable nodes contain verified states.
• Step 3. Message passing from check nodes to variable nodes. As shown in Fig. 1(c), the check node passes the variable node a message containing the updated value and the verified state if a correction is required. Otherwise, the message confirms the value and state sent in Step 1.
• Step 4. Processing at the variable node. As shown in Fig. 1(d), an unverified variable node changes its value and state as long as it receives one correction message from its neighboring check nodes.
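The two stages above can be sketched as a single decoding iteration over a list of check nodes. This is a minimal illustration, not an implementation from the paper: the graph representation (`checks` as lists of variable indices), the function name `va_iteration`, the rule that the first correction proposal wins, and the toy graph are all assumptions.

```python
from functools import reduce
from operator import xor

def va_iteration(checks, values, verified):
    """Run one VA iteration (verification, then correction) in place."""
    # Verification stage: a check whose neighbors XOR to zero verifies them all.
    for nbrs in checks:
        if reduce(xor, (values[v] for v in nbrs), 0) == 0:
            for v in nbrs:
                verified[v] = True

    # Correction stage: every check works from the same snapshot of states.
    # A check with exactly one unverified neighbor proposes its value.
    proposals = {}
    for nbrs in checks:
        unverified = [v for v in nbrs if not verified[v]]
        if len(unverified) == 1:
            u = unverified[0]
            proposed = reduce(xor, (values[v] for v in nbrs if v != u), 0)
            proposals.setdefault(u, proposed)  # assumption: first proposal wins
    for u, proposed in proposals.items():
        values[u] = proposed
        verified[u] = True

# Toy graph (not a real LDPC code): the codeword [5, 0, 5, 5] satisfies all checks.
checks = [[0, 1, 2], [0, 3], [2, 3]]
values = [5, 8, 5, 5]          # received word: variable 1 is corrupted
verified = [False] * 4
va_iteration(checks, values, verified)
print(values, verified)
```

In this toy run, the two degree-2 checks verify variables 0, 2 and 3, after which the first check corrects variable 1.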
III. FER PERFORMANCE ANALYSIS OF VA
In this section, a statistical model, shown in Fig. 2, is proposed to analyze the message passing algorithm of Section II. The statistical model operates iteratively and includes two stages in each iteration. For the $k$th iteration, the system input is the bit error probability $p_k$ and the system output is the updated bit error probability $p_{k+1}$, which is also used as the input for the next iteration. When $k = 0$, $p_0$ is the initial crossover probability of the BSC. In our analysis, we assume that the values and states of different variable nodes and check nodes are statistically independent. Hence, the bit error probability is the only input-output parameter of the proposed model; it can be used to determine the probabilities of variable nodes being in different values and states, and thereby to estimate the FER.
In Fig. 2, $Q_{fv}^{(k)}$ and $Q_{uv}^{(k)}$ are defined as follows:
• $Q_{fv}^{(k)}$ denotes the probability of a codeword containing at least one variable node with false verification at the $k$th iteration, given that no variable nodes of the codeword are falsely verified in the first $k-1$ iterations.
• $Q_{uv}^{(k)}$ denotes the probability of a codeword containing at least one unverified variable node but no falsely verified variable node at the $k$th iteration, given that no variable nodes of the codeword are falsely verified in the first $k-1$ iterations.
Using $Q_{fv}^{(k)}$, $Q_{uv}^{(k)}$ and the law of total probability [12], the FER after running VA for $m$ iterations is derived as follows:
In deriving (3), we use the fact that the FER can be calculated by accumulating the probability of a frame with false verification (i.e., at least one variable node of the frame is falsely verified) in each iteration and the probability of a frame with un-verification (i.e., at least one variable node of the frame is unverified) after the maximum number of iterations.
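The accumulation described above can be sketched as follows. This is one reading of the verbal description, not a transcription of (3): it assumes that decoding enters the next iteration exactly when the codeword has unverified nodes and no false verification, so that the product of the $Q_{uv}$ terms is the probability of reaching a given iteration. The function name and the numbers in the usage line are hypothetical.

```python
def fer_after_iterations(q_fv, q_uv):
    """Accumulate the FER from per-iteration conditional probabilities.

    q_fv[k]: P(some node falsely verified at iteration k | iteration k reached)
    q_uv[k]: P(unverified nodes remain, none falsely verified | iteration k reached)
    Assumption: iteration k+1 is reached exactly in the q_uv[k] case.
    """
    fer = 0.0
    reach = 1.0  # probability that the current iteration is reached
    for qf, qu in zip(q_fv, q_uv):
        fer += reach * qf   # frame error caused by a false verification
        reach *= qu         # still undecided, continue to the next iteration
    return fer + reach      # frames still unverified after the last iteration

# Hypothetical values for two iterations.
print(fer_after_iterations([0.1, 0.05], [0.5, 0.2]))
```

The first term grows with every iteration, which is consistent with the observation that a false verification cannot be undone by running more iterations.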
To calculate $Q_{fv}^{(k)}$, let $\tilde{q}_{fv}^{(k)}$ denote the probability that a variable node is falsely verified at Step 4 of the verification stage at the $k$th iteration. If a variable node is falsely verified, it fixes the incorrect value and state, which causes an error frame. Since the states and values of the variable nodes of a codeword are assumed statistically independent, $Q_{fv}^{(k)}$ follows from $\tilde{q}_{fv}^{(k)}$ and the codeword length $n$.
Let $Q_{cv}^{(k)}$ denote the probability that all variable nodes of a codeword are correctly verified in the verification stage or corrected in the correction stage at the $k$th iteration, given that the variable nodes of the codeword are not all correctly verified and no variable nodes are falsely verified in the first $k-1$ iterations. $Q_{uv}^{(k)}$ can then be obtained from $Q_{fv}^{(k)}$ and $Q_{cv}^{(k)}$.
At the correction stage, let $\tilde{q}_{cor}^{(k)}$ denote the probability that a variable node is corrected at the $k$th iteration. Let $\check{q}_{cv}^{(k)}$ denote the proportion of correctly verified variable nodes among the variable nodes that are not falsely verified at Step 1 of the correction stage of the $k$th iteration. $Q_{cv}^{(k)}$ can then be expressed in terms of these quantities.
To run the iterative performance analysis of the statistical model, $p_{k+1}$ needs to be found. Let $\tilde{p}_k$ denote the bit error probability among the codewords with no falsely verified variable nodes before the correction stage at the $k$th iteration. $p_{k+1}$ can then be obtained.
From the above discussion, to obtain the FER, we need to calculate $\check{q}_{cv}^{(k)}$, $\tilde{q}_{fv}^{(k)}$, $\tilde{q}_{cor}^{(k)}$ and $\tilde{p}_k$, which are presented in the Appendix.
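The step from a per-node false-verification probability to a codeword-level probability relies only on the independence assumption stated in this section. A minimal sketch of that step, assuming the codeword-level event is "at least one of $n$ independent nodes is falsely verified", is given below; the function name and the per-node value in the usage line are hypothetical, and the form should be read as an illustration rather than as the paper's own expression.

```python
import math

def prob_at_least_one(q_node, n):
    """P(at least one of n independent nodes has the event) = 1 - (1 - q_node)^n.

    Uses log1p/expm1 so that very small q_node values are not lost to rounding.
    Assumes 0 <= q_node < 1.
    """
    return -math.expm1(n * math.log1p(-q_node))

# Hypothetical per-node probability for a length-1008 codeword.
print(prob_at_least_one(1e-6, 1008))
```

Even a per-node probability of the order of $10^{-6}$ yields a codeword-level probability of the order of $10^{-3}$ at this length, which indicates why false verification cannot be neglected in the FER.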
IV. NUMERICAL RESULTS
We take rate-1/2 regular (3,6) LDPC codes, as in [4], as an example. Fifty codes are randomly generated. The codeword length is 1008 and the packet size of a variable node is 32 bits. For comparison, the numerical result of the VA analysis without considering false verification is used as a reference.

Fig. 3. Theoretical FER results considering and not considering false verification, compared with simulation results averaged over 50 random 1008-(3, 6) LDPC codes for 5 iterations. The size of the packet is 32 bits.

Fig. 3 shows the FER performance of VA with 5 iterations. The proposed statistical model provides a good approximation to the simulation results of verification-based decoding for packet-based LDPC codes over the BSC. In contrast, without considering false verification, the theoretical results deviate significantly from the simulation results. For example, when the channel crossover probability is $3 \times 10^{-4}$, the numerical result without considering false verification is smaller than the simulation result by more than three orders of magnitude, while the numerical result considering false verification is of the same order as the simulation result.

Fig. 4 shows the relationship between the FER performance and the number of iterations of VA. The proposed model considering false verification gives a good estimate of the simulation results for any number of iterations. In contrast, the theoretical results not considering false verification deviate significantly from the simulation results, especially as the number of iterations increases. For example, when $p_0 = 10^{-4}$ and the number of iterations is 3, the FER of the proposed method is about $5 \times 10^{-3}$, which is close to the simulation result, while the FER of the theoretical result without considering false verification is less than $10^{-6}$, which is far smaller than the simulation result.
V. CONCLUSIONS
In this paper, we have proposed a statistical model to analyze the verification-based algorithm for packet-based LDPC codes over the BSC. The proposed model takes false verification into consideration and provides a good estimate of the FER at any number of iterations over all channel parameters of practical interest. The numerical results and simulations demonstrate the validity of the proposed statistical model.
APPENDIX
In the following, we consider only the $k$th iteration for simplicity and therefore omit the superscript $k$. To find the probabilities $\tilde{q}_{fv}$, $\tilde{q}_{cv}$, $\tilde{q}_{cor}$ and $\tilde{p}_k$, we need to define the following probabilities and events:
• $\tilde{q}_{cv}$ denotes the probability that a variable node is correctly verified at Step 4 of the verification stage.
• $q_{cv}$ and $q_{fv}$ denote the probabilities that, at Step 1 of the correction stage, messages from variable nodes to check nodes contain correctly verified and falsely verified values, respectively.
• $\hat{q}_{cv}$ denotes the proportion of messages from variable nodes to check nodes that include correctly verified values among the messages that include no falsely verified values at Step 1 of the correction stage.
• $E_{vt}$ is the event that the message from $v$ to an adjacent check node contains a correct value at Step 1 of the verification stage.
• $E_{vf}^{\beta}$ is the event that the message from $v$ to an adjacent check node contains an incorrect value with $\beta$ error bits at Step 1 of the verification stage.
• $E_{ve}^{i}$ is the event that (1) holds for a degree-$i$ check node, which indicates that a verification is achieved at Step 2 of the verification stage.
Using the above definitions, $P(E_{ve}^{i} \mid E_{vt})$ is the conditional probability that (1) holds when the message from a variable node to the degree-$i$ check node includes a correct value. At Step 2 of the verification stage, if (1) holds, a verified message is generated. As a result, the probability that the message from the degree-$i$ check node to a variable node with a correct current value contains a verified state equals $P(E_{ve}^{i} \mid E_{vt})$. Let $\rho_i$ denote the fraction of edges connected to degree-$i$ check nodes. Let $p_{cv}$ denote the probability that the message from a check node to a variable node with a correct current value carries a verified state. It can then be found that:

$$p_{cv} = \sum_i \rho_i \, P(E_{ve}^{i} \mid E_{vt}). \qquad (4)$$

Similarly, $p_{fv}^{\beta}$ denotes the probability that the message from a check node to a variable node with $\beta$ bit errors carries a verified state, which reads as follows:

$$p_{fv}^{\beta} = \sum_i \rho_i \, P(E_{ve}^{i} \mid E_{vf}^{\beta}). \qquad (5)$$
The probabilities $P(E_{ve}^{i} \mid E_{vt})$ and $P(E_{ve}^{i} \mid E_{vf}^{\beta})$ can be calculated by analyzing the error structures of the variable nodes adjacent to a degree-$i$ check node. In the numerical results, the error structures are enumerated for small numbers of error bits by using integer partitions. Large numbers of error bits are not considered here, because they result in very small probabilities that can be ignored in the calculations.
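As a cross-check on such enumerations, the sketch below evaluates the two conditional probabilities in closed form under a simplifying assumption that is ours, not the paper's: every bit of every other neighboring packet is flipped independently with the same probability $p$. It uses the standard fact that an even number of $m$ independent flips occurs with probability $(1 + (1 - 2p)^m)/2$. This is a different technique from the integer-partition enumeration described above, and all function names are hypothetical.

```python
def even_parity_prob(p, m):
    """P(an even number of m independent bits, each flipped w.p. p, are flipped)."""
    return (1 + (1 - 2 * p) ** m) / 2

def p_verify_given_correct(p, l, i):
    """P(check sum is zero | one neighbor is correct) for a degree-i check
    and l-bit packets: every bit position needs even parity among the
    other i - 1 neighbors (assumes i.i.d. bit flips)."""
    return even_parity_prob(p, i - 1) ** l

def p_verify_given_beta_errors(p, l, i, beta):
    """P(check sum is zero | one neighbor has beta error bits): the other
    i - 1 neighbors must reproduce the same error pattern, i.e. odd parity
    in the beta erroneous positions and even parity elsewhere."""
    even = even_parity_prob(p, i - 1)
    return (1 - even) ** beta * even ** (l - beta)

# Regular (3,6) ensemble with 32-bit packets: all edges attach to degree-6
# checks, so the edge average over check degrees reduces to the i = 6 term.
p0 = 3e-4  # crossover probability used in the numerical example of Section IV
p_cv = p_verify_given_correct(p0, l=32, i=6)
p_fv_1 = p_verify_given_beta_errors(p0, l=32, i=6, beta=1)
print(p_cv, p_fv_1)
```

The second probability decays geometrically in `beta`, which agrees with the remark that large numbers of error bits contribute negligibly.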
At Step 4 of the verification stage, $q_{cv}$ can be found using (4) as follows:
where $\lambda_j$ denotes the fraction of edges connected to degree-$j$ variable nodes. Similarly, $q_{fv}$ can be obtained using (5) as:
As a result, $\tilde{q}_{cv}$ and $\tilde{q}_{fv}$ can be obtained as follows:
where $L_j$ is the fraction of degree-$j$ variable nodes among all variable nodes. $\hat{q}_{cv}$ and $\check{q}_{cv}$ can be obtained from $q_{cv}$, $q_{fv}$, $\tilde{q}_{cv}$ and $\tilde{q}_{fv}$ as follows:
$\check{q}_{uv}$ denotes the proportion of variable nodes that are unverified among the variable nodes that are correctly verified or unverified, which can be found as $\check{q}_{uv} = 1 - \check{q}_{cv}$.
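Two bookkeeping steps used above can be sketched in a few lines. The first converts the edge-perspective fractions $\lambda_j$ into the node-perspective fractions $L_j$ using the standard relation $L_j = (\lambda_j / j) / \sum_i (\lambda_i / i)$. The second renormalizes a correct-verification probability over the nodes (or messages) that are not falsely verified, which is our reading of the word "proportion" in the definitions; it is stated here as an assumption rather than as the paper's formula. Function names and numbers are hypothetical.

```python
def node_fractions(lam):
    """Convert edge-perspective fractions {degree j: lambda_j} into
    node-perspective fractions {degree j: L_j}."""
    weights = {j: frac / j for j, frac in lam.items()}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}

def proportion_among_not_false(q_cv, q_fv):
    """Share of correctly verified items among those not falsely verified.
    Assumption: plain renormalization by 1 - q_fv."""
    return q_cv / (1 - q_fv)

# Hypothetical irregular example: half of the edges on degree-2 nodes,
# half on degree-3 nodes.
L = node_fractions({2: 0.5, 3: 0.5})
share_cv = proportion_among_not_false(0.6, 0.25)
share_uv = 1 - share_cv  # complementary unverified proportion
print(L, share_cv, share_uv)
```

For a regular code the conversion is trivial, since all nodes share one degree.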
Let $\tilde{q}_{uv\&vf}$ denote the probability of an unverified variable node with an incorrect current value, which can be found as follows:
At the correction stage, the variable nodes are either unverified or correctly verified. Since the probability $\tilde{q}_{uv\&vf}$ equals $1 - (1 - \tilde{p}_k)^l$, $\tilde{p}_k$ can be derived from (10) as:
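The relation between the packet-level probability and the bit-level probability stated above can be inverted directly. The sketch below shows only this inversion, $p = 1 - (1 - q)^{1/l}$; the packet-level probability itself comes from (10), which is not reproduced here, and the function names are hypothetical.

```python
def packet_error_from_bit_error(p_bit, l):
    """P(an l-bit packet has at least one bit error) for independent bit errors."""
    return 1 - (1 - p_bit) ** l

def bit_error_from_packet_error(q_packet, l):
    """Invert q = 1 - (1 - p)^l to recover the bit error probability p."""
    return 1 - (1 - q_packet) ** (1 / l)

# Round trip for 32-bit packets with a hypothetical bit error probability.
q = packet_error_from_bit_error(0.01, 32)
print(q, bit_error_from_packet_error(q, 32))
```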
Let $p_{cor}$ denote the probability that the message from a check node to an unverified variable node with an incorrect current value includes the correct value and the verified state. It can be obtained as follows:
For a degree-$j$ variable node, an unverified variable node with an incorrect current value can be corrected if it receives at least one message that contains its correct value and the verified state from its neighboring check nodes. Therefore, by using (12), $\tilde{q}_{cor}$ can be found as follows:
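The "at least one correcting message" condition can be sketched as below, assuming that the $j$ incoming messages of a degree-$j$ variable node are independent and each is a correction message with probability $p_{cor}$. The sketch returns only the probability of correction conditioned on the node being unverified with an incorrect value; whether the paper's expression also includes the probability of that condition is not recoverable from the text, so this is an illustration under stated assumptions. The function name and numbers are hypothetical.

```python
def correction_prob(p_cor, node_frac):
    """Average over node degrees of P(at least one of j messages corrects).

    node_frac maps a variable-node degree j to the node-perspective fraction L_j.
    Assumes independent messages, each a correction with probability p_cor.
    """
    return sum(L_j * (1 - (1 - p_cor) ** j) for j, L_j in node_frac.items())

# Regular degree-3 variable nodes with a hypothetical p_cor.
print(correction_prob(0.5, {3: 1.0}))  # 0.875
```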
