Abstract-We consider turbo-structured low-density parity-check (TS-LDPC) codes: structured regular codes whose Tanner graph is composed of two trees connected by an interleaver. TS-LDPC codes with good girth properties are easy to construct: careful design of the interleaver component prevents cycles shorter than any desired length in the Tanner graph. We present algorithms to construct TS-LDPC codes with arbitrary column weight $j \geq 2$ and row weight $k$ and arbitrary girth $g$. We develop a linear-complexity encoding algorithm for one type of TS-LDPC code, the encoding-friendly TS-LDPC (EFTS-LDPC) codes. Simulation results demonstrate that the bit-error rate (BER) performance at low signal-to-noise ratio (SNR) is competitive with that of random LDPC codes of the same size, with better error-floor properties at high SNR.
I. INTRODUCTION
Low-density parity-check (LDPC) codes [1] are being considered in numerous applications, including digital communication systems and magnetic recording channels. Their bit-error rate (BER) performance using iterative decoding is close to the Shannon limit [2].
There are several ways to specify LDPC codes. In this paper, we use interchangeably the code parity-check matrix and its associated bipartite Tanner graph [3]. We adopt the common notation of referring to a code with block length $n$ and column and row weights $j$ and $k$, respectively, as an $(n, j, k)$-LDPC code. The rate of the code is (approximately) given by $1 - j/k$. An important parameter affecting the performance of the code is the length of the shortest cycle in its Tanner graph. This parameter is the girth $g$ of the code. The relation of girth to performance is not completely understood. Codes with four-cycles and projective geometry codes, which may have limited girth, can perform well [4]. However, in general, in codes with small $g$, i.e., with short cycles in the Tanner graph, the decoding algorithm takes longer to converge or may fail to converge to the optimal decoding result. On the other hand, for codes with very large girth $g$, the first decoding iterations correspond to optimal decoding iterations. For moderate to high signal-to-noise ratio (SNR), decoding succeeds with very high probability within these "optimal" iterations when $g$ is large. Therefore, we can actually achieve optimal decoding results in the sense that the probability of symbol error is minimized. Furthermore, [3] shows that a lower bound on the minimum distance of LDPC codes increases exponentially with the girth $g$. Therefore, it is desirable to have LDPC codes with good girth properties.
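As an illustration of the girth defined above, the following minimal sketch computes the length of the shortest cycle in the Tanner graph of a small parity-check matrix by breadth-first search from every node. The function name `tanner_girth` and the list-of-rows matrix format are assumptions made for this example; they are not part of the constructions in this paper, and the search is only practical for short codes.

```python
from collections import deque


def tanner_girth(H):
    """Return the girth of the Tanner graph of H (rows of 0/1), or None if it has no cycle."""
    m, n = len(H), len(H[0])
    # Nodes 0..n-1 are bit nodes; nodes n..n+m-1 are check nodes.
    adj = [[] for _ in range(n + m)]
    for r in range(m):
        for c in range(n):
            if H[r][c]:
                adj[c].append(n + r)
                adj[n + r].append(c)
    best = None
    for root in range(n + m):
        dist = {root: 0}
        parent = {root: -1}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    parent[v] = u
                    queue.append(v)
                elif parent[u] != v:
                    # A non-tree edge closes a cycle through the BFS tree.
                    length = dist[u] + dist[v] + 1
                    if best is None or length < best:
                        best = length
    return best
```

Taking the minimum over all roots is what makes the result exact: a search started from a node on a shortest cycle finds that cycle, and no search can report a value below the girth.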
An important consideration for the practical application of LDPC codes is the regularity and structure of their encoding and decoding implementation. Recently, structured regular LDPC codes have drawn much attention because they facilitate low-complexity encoder and decoder designs. We briefly review several typical designs of such codes.
We discuss here very briefly recent literature on designs of LDPC codes. A more extensive review of LDPC codes is in [5], [6]. Kou, Lin, and Fossorier [7], [8] developed LDPC codes based on finite geometries and incidence structures. Finite-geometry LDPC codes can be designed over a wide variety of block lengths and code rates and achieve good minimum distances. Finite-geometry LDPC codes have girth 6. Another method to construct structured 4-cycle-free regular LDPC codes is based on balanced incomplete block designs (BIBD) [9]-[13]. A BIBD is defined as a collection of equal-size blocks, comprising elements drawn from a set, such that each pair of distinct elements of the set occurs in the same fixed number of blocks. BIBD-based codes are well structured and free of 4-cycles, i.e., their girth is at least 6, but they exhibit a large number of 6-cycles in their Tanner graphs. Finite-geometry codes and BIBD codes are examples of cyclic and quasi-cyclic [14], [15] codes. For these codes the girth is bounded above; for instance, it cannot exceed 12 for codes built from circulant permutation matrices, see Fossorier [16] and our own paper [17]. This prevents the girth of cyclic and quasi-cyclic LDPC codes from growing according to the prediction derived by [1] for random LDPC codes; hence, such codes with very long block lengths perform poorly.
Hu, Eleftheriou, and Arnold propose in [18] a nonalgebraic method named progressive edge-growth (PEG). They present examples of codes with large girth, built by establishing edges between bit and check nodes one edge at a time. PEG optimizes the placement of each new edge on the Tanner graph with the goal of maximizing the local girth. Reference [16] constructed LDPC codes from circulant permutation matrices. It presents conditions for such codes to achieve girth up to 12. References [19], [20] also provide good constructions for structured LDPC codes with large girth.
0018-9448/$25.00 © 2007 IEEE
Fig. 1. Left: The encoder structure for the concatenated LDPC codes in [22]. Right: The parity-check matrix H for the concatenated LDPC codes in [21], [22].
This paper considers a class of structured regular LDPC codes, the turbo-structured LDPC (TS-LDPC) codes, that can be designed with arbitrarily large girth by appropriately choosing the code block length $n$. We describe an algorithm to construct TS-LDPC codes with any desired column and row weights $j$ and $k$ and, hence, with any desired practical rate. TS-LDPC codes are hardware friendly: they are regular and structured, and their parity-check matrix, which can be very large, is determined from a much smaller object, the shift matrix, so that their memory requirements can be negligible. We show by simulation that the BER performance of TS-LDPC codes is competitive with that of random LDPC codes, with better error-floor properties. Finally, we exploit the specific structure of the Tanner graph of a particular type of TS-LDPC code, the encoding-friendly TS-LDPC (EFTS-LDPC) codes, to derive an encoding algorithm with linear complexity.
The Tanner graph of TS-LDPC codes is composed of two trees, an upper tree and a lower tree, that are connected in a turbo-like manner by an interleaver. This turbo structure is exploited to facilitate the systematic design of LDPC codes with large girth and flexible code rates. Turbo structures have been used in [21], [22] to construct LDPC codes. The codes proposed by these authors combine two LDPC codes as component codes on the encoder side. Reference [21] directly borrows the structure of the turbo encoder, replacing the recursive convolutional codes commonly used in turbo codes with a tree code, a specific LDPC code whose associated graph for the code-generating matrix, not for the parity-check matrix, is a tree. In [22], the encoder is a parallel concatenation of two LDPC codes without an interleaver, as shown on the left of Fig. 1. Our TS-LDPC codes stand in sharp contrast with the codes in [21], [22]: TS-LDPC codes are turbo-like from the decoder point of view, see Fig. 2, i.e., from the point of view of the parity-check matrix and its associated Tanner graph, whereas the Tanner graphs of the codes in [21], [22] are not the concatenation of two trees through an interleaver. Fig. 1 on the right shows the parity-check matrices for the codes in [21], [22], and Fig. 3 shows the one for a TS-LDPC code. These matrices bear no resemblance to each other. The upper and lower tree structures of the TS-LDPC codes give rise to the top and bottom diagonal lines in the parity-check matrix, while the cloud of points arises from the interleaver. In contrast, the matrix for the code in [22] has rectangular blocks, the two right ones corresponding to the parity bits of each encoder component.
We comment briefly on the relation between TS-LDPC codes and generalized quasi-cyclic LDPC (GQC-LDPC) codes, a class of LDPC code designs that we described recently in [17], [23]-[25].
In the terminology of TS-LDPC codes, GQC-LDPC codes reduce to an interleaver alone: no upper or lower trees are present, as they are in TS-LDPC codes. At first sight, it seems that the TS-LDPC codes generalize GQC-LDPC codes. On the one hand, they are distinct from GQC-LDPC codes since they add the upper and lower trees to the interleaver in their Tanner graph; on the other hand, they borrow, in a generic sense, ideas from the design of GQC-LDPC codes, including "grouping and shifting" to avoid cycles within the interleaver. In this restricted sense, Algorithm 1 for TS-LDPC codes extends Algorithm 1 for GQC-LDPC codes, see [24]. However, the upper and lower trees in the TS-LDPC codes introduce constraints and degrees of freedom in the design process that are not present for the GQC-LDPC codes. Because the design constraints on the interleaver differ, the details of the construction and the theorems underlying it are quite different from the corresponding results for GQC-LDPC codes.
Both codes can be designed with flexible girth and code rates, but their designs are very different and they represent distinct tradeoffs. Usually, for codes of similar girth, GQC-LDPC codes have smaller block length or, alternatively, for codes with similar block length, GQC-LDPC codes can be designed with larger girth. For similar girth, TS-LDPC codes have shorter diameter, which may lead to faster convergence. Finally, for TS-LDPC codes, we can design faster decoding [26] and faster (linear) encoding algorithms, see Section V.
II. TS-LDPC CODES
As shown in Fig. 2, the Tanner graph of TS-LDPC codes is composed of three components: two height-balanced trees, an upper tree and a lower tree, and an interleaver that connects them. The leaf nodes of the upper tree are bit nodes (circles), whereas the leaf nodes of the lower tree are check nodes (squares). The two trees have the same number of layers, or tiers. We require this number to be even because we start the upper tree from a check node and require its leaf nodes to be bit nodes, and, similarly, we start the lower tree from a bit node and require its leaf nodes to be check nodes. The two trees are "coupled" in a turbo-like manner such that many edges join the leaf nodes of the upper tree to those of the lower tree, see Fig. 2. The structure formed by these edges is named the interleaver. The first tier of the upper tree contains only one check node, its root. To match it, we let the root of the lower tree be a bit node and connect the two roots. Fig. 2 illustrates a TS-LDPC code for one particular choice of tree height and of column and row weights. The rate of the code is $1 - \operatorname{rank}(H)/n$, where $H$ denotes the parity-check matrix and $n$ is the code block length. This rate can be adjusted arbitrarily as long as we can design TS-LDPC codes with appropriate values for the column and row weights. We will present an algorithm to achieve just that.
III. INTERLEAVER DESIGN
This section considers the design of TS-LDPC codes, in particular the design of the interleaver. The interleaver is designed by specifying the rules for connecting the leaf bit nodes in the upper tree to the leaf check nodes in the lower tree. Interleaver designs for turbo codes have been extensively studied [27]-[30]. For example, an $S$-random interleaver [27] guarantees that any two positions within distance $S$ are mapped to two positions at distance greater than $S$ after interleaving. The functions of the interleavers for turbo codes are to avoid low-weight codewords and to decrease the correlation between the extrinsic information and the input data sequence. They are not designed to construct Tanner graphs with large girth. Hence, we cannot directly borrow the existing interleaver designs for turbo codes, say, the $S$-random interleaver, for designing TS-LDPC codes. We need to develop new interleavers that suit the structure of TS-LDPC codes and lead to TS-LDPC codes with large girth.
Section III-A introduces -alternate-decimal indexing that is used in labeling the nodes in the Tanner graph and in determining how to connect the leaf nodes in and . We design these rules to prevent cycles of length smaller than the desired girth . We achieve this by categorizing the cycles in two types: type I and type II cycles. Section III-B specifies the connecting rules to avoid type I cycles. Type II cycles are considered in Section III-C-they are prevented by "grouping and shifting:" we group the leaf nodes and connect leaf nodes in distinct trees whose labels are appropriately shifted. Section III-D discusses the number of groups needed. Based on the discussions in Sections III-A-D, Section III-E shows the detailed algorithm to construct TS-LDPC codes. Finally, Section III-F discusses the advantages provided by the structure of TS-LDPC codes in terms of the memory required to store them.
A. Auxiliary Nodes and -Alternate-Decimal Indexing
Because of the regularity of the code, each leaf node in the upper tree, a bit node of column weight $j$ with one edge to its parent, is by construction to be connected to $j - 1$ leaf nodes in the lower tree. This is a one-to-$(j-1)$ mapping, whereas a standard interleaver is a one-to-one mapping between the elements of two sets of the same size. To obtain a standard interleaver, we introduce "auxiliary nodes" (solid triangles), as shown in Fig. 4, to facilitate the code design. For each leaf node in the upper tree, we add $j - 1$ auxiliary nodes as its children. Similarly, each leaf node in the lower tree, a check node of row weight $k$, has $k - 1$ auxiliary nodes as its descendants.
Due to the tree structure of the upper and lower components of the codes, cycles present in TS-LDPC codes must contain at least four "auxiliary nodes"-two auxiliary nodes of and two auxiliary nodes of . We use this observation to classify cycles into two disjoint categories: Type I cycles contain four and only four auxiliary nodes; type II cycles are all the other cycles in the code and contain six or more auxiliary nodes. We will dispose of each of these two types of cycles separately. Fig. 5 shows on the left a type I cycle and on the right a type II cycle.
For with tiers, there are auxiliary nodes of . We address the interleaver design problem algebraically by indexing all the auxiliary nodes of and . We start by numbering the auxiliary nodes in from to in the following format-the -alternate-decimal format, where and . We need digits in the -alternate-decimal indexing to label all the auxiliary nodes in . These digits are numbered from to , starting from the rightmost one. The odd-numbered digits take values to and the even-numbered digits take values to . We refer to the position of each digit as its coordinate. Similarly, we index all the auxiliary nodes of from to and represent also all these indices in -alternate-decimal format. To be concrete, we provide an example. The index of the auxiliary node in in Fig. 4 , in -alternate-decimal form, is (1) In (1), , , represents the th digit. The coordinate of is . The corresponding value of in decimals is
From here on, when we refer to "indices" of nodes we mean their -alternate-decimal or -alternate-decimal representations.
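The alternate-decimal indexing described above is a mixed-radix representation in which the radix alternates with the coordinate. The sketch below converts between a decimal index and its digit list. The parameter names `odd_radix` and `even_radix` (the number of values allowed at odd-numbered and even-numbered coordinates, counted from the right starting at 1) are assumptions of this example; the values they take for a given code are fixed by the column and row weights and are not restated here.

```python
def to_alternate_digits(value, num_digits, odd_radix, even_radix):
    """Return the digits of value, leftmost first; coordinate 1 is the rightmost digit."""
    digits = []
    for position in range(1, num_digits + 1):
        radix = odd_radix if position % 2 == 1 else even_radix
        value, digit = divmod(value, radix)
        digits.append(digit)
    if value:
        raise ValueError("value does not fit in the given number of digits")
    return digits[::-1]


def from_alternate_digits(digits, odd_radix, even_radix):
    """Inverse of to_alternate_digits: return the decimal value of a digit list."""
    value = 0
    num_digits = len(digits)
    for offset, digit in enumerate(digits):
        position = num_digits - offset  # coordinate counted from the right
        radix = odd_radix if position % 2 == 1 else even_radix
        if not 0 <= digit < radix:
            raise ValueError("digit out of range for its coordinate")
        value = value * radix + digit
    return value
```

For instance, with `odd_radix=3`, `even_radix=2`, and four digits, the 36 indices 0 to 35 map one-to-one onto digit lists, and index 35 is written as `[1, 2, 1, 2]`.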
Before establishing the connection rules, we first prove the following auxiliary lemma. . Since the leftmost coordinate where the digits of and differ from each other is the fourth, then the first common ancestor of the auxiliary nodes and is the root of , located in the first tier of . To find the shortest path in that connects and , we have to go up tiers from to reach its first common ancestor and then go downwards tiers from to . Therefore, the distance through the tree between and is . Similarly, we can prove part (b) of the lemma. This completes the proof.
We now consider the connection rule to avoid type I cycles.
B. Avoiding Short Type I Cycles-Digit-Wise Reversal
We start from a simple interleaver design, digit-wise reversal [31]. For an index in -alternate-decimal form with digits, its digit-wise reversal interchanges the th digit and the th digit for . We represent the digit-wise reversal operator by . For the index in (1), its digit-wise reversal is (2) We state the advantage of the digit-wise reversal interleaver in the following theorem. Fig. 6 , the length of a type I cycle is:
[Footnote: A path of length 3 that contains two auxiliary nodes is actually a path of length 1.]
However, as auxiliary nodes , , , and are imaginary nodes, the type I cycle does not contain them as vertices. For example, the path of length 3 with two auxiliary nodes and shown on the right in Fig. 6 is actually a path of length in the Tanner graph. Therefore, the actual cycle length is , i.e., (4) We relate the distance to the value of their indices. By Lemma 1,  , where is the leftmost coordinate where the digits of and differ from each other. Let represent the height of . After digit-wise reversal, the digits of and at the coordinate become the digits of and at the coordinate , respectively. So, the digits of and at the coordinate are different. According to the connecting rule stated in the theorem, and , so the digits of and at the coordinate are different. Therefore, by Lemma 1 By (4), the length of such a type I cycle is then
From the above analysis, all type I cycles that result from following the connection rule in Theorem 1 are at least of length . This completes the proof.
Theorem 1 states that to increase the length of type I cycles we need simply to increase the number of tiers in the upper and lower trees.
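The mechanism behind this result can be checked numerically on digit lists. The sketch below is an illustration only: the helper names are assumptions of this example, and the digit lists are arbitrary. It reverses two indices and compares the leftmost coordinate at which they differ before and after reversal. If two indices first differ at coordinate $c$ (counted from the right) among $L$ digits, their reversals are guaranteed to differ at coordinate $L + 1 - c$, so nodes that are close in one tree are mapped to nodes that are far apart in the other.

```python
def digit_reverse(digits):
    """Digit-wise reversal: the digit at coordinate i moves to coordinate L + 1 - i."""
    return digits[::-1]


def leftmost_difference(a, b):
    """Coordinate (1 = rightmost) of the leftmost digit where a and b differ; 0 if equal."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return len(a) - offset
    return 0


# Two indices that differ only in the rightmost digit (siblings in one tree).
a = [1, 0, 2, 1]
b = [1, 0, 2, 0]
before = leftmost_difference(a, b)
after = leftmost_difference(digit_reverse(a), digit_reverse(b))
```

Note that when the number of digits is even, reversal exchanges odd and even coordinates, so the reversed index is expressed with the two alternating radices swapped.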
C. Avoiding Type II Cycles-Grouping and Shifting
The connection rule in Theorem 1 prevents short type I cycles. We now consider the connection rule to avoid short type II cycles. To exclude short type II cycles, we propose grouping and shifting.
Shifting We define the shift to be a constant in -alternate-decimal format that is added to the original index to form a new index. We illustrate it with an example. Let , the shift , and represent by the digit-wise addition (with no carry). Then (5) In (5), , where if is even and if is odd. In a similar fashion, we represent the digit-wise subtraction by .
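The digit-wise addition and subtraction without carry can be sketched as follows, each digit being reduced modulo the radix of its own coordinate. As before, `odd_radix` and `even_radix` are names assumed for this example to denote the radices at odd and even coordinates (counted from the right, starting at 1).

```python
def _radix(position, odd_radix, even_radix):
    """Radix of the digit at a coordinate counted from the right, starting at 1."""
    return odd_radix if position % 2 == 1 else even_radix


def digitwise_add(index, shift, odd_radix, even_radix):
    """Add shift to index digit by digit, with no carry between coordinates."""
    n = len(index)
    return [
        (x + s) % _radix(n - i, odd_radix, even_radix)
        for i, (x, s) in enumerate(zip(index, shift))
    ]


def digitwise_sub(index, shift, odd_radix, even_radix):
    """Subtract shift from index digit by digit, with no borrow between coordinates."""
    n = len(index)
    return [
        (x - s) % _radix(n - i, odd_radix, even_radix)
        for i, (x, s) in enumerate(zip(index, shift))
    ]
```

Because no carry propagates, each coordinate behaves as an independent modular counter, and subtraction exactly undoes addition.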
Grouping We divide the auxiliary nodes of into groups of the same size according to their indices. Those auxiliary nodes whose indices have the same leftmost digits are placed in the same group. The auxiliary nodes of can, likewise, be classified into groups based also on whether their indices have the same leftmost digits. We will derive in Section III-D, in particular, Lemma 3, the number of groups that we need for each tree to achieve a given girth , i.e., what is the relationship between and .
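Grouping by common leftmost digits amounts to bucketing the indices by a digit prefix. The sketch below assumes, for illustration only, four-digit indices with radices (2, 3, 2, 3) listed leftmost digit first and a prefix of two digits; the function name and these values are not taken from the paper.

```python
from collections import defaultdict
from itertools import product


def group_by_prefix(indices, prefix_len):
    """Place indices that share the same leftmost prefix_len digits in the same group."""
    groups = defaultdict(list)
    for digits in indices:
        groups[tuple(digits[:prefix_len])].append(tuple(digits))
    return dict(groups)


# Hypothetical example: radices listed leftmost digit first.
radices = (2, 3, 2, 3)
all_indices = list(product(*(range(r) for r in radices)))
groups = group_by_prefix(all_indices, prefix_len=2)
```

With these values there are 36 indices split into 6 groups of 6, which shows the tradeoff discussed in the text: a longer prefix yields more, smaller groups and therefore more shifts to choose.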
After clustering the auxiliary nodes into groups, we further let the shift to be the same when we connect the auxiliary nodes of in the same group to the auxiliary nodes of in the same group. Denote by the shift introduced when we connect the auxiliary nodes of in the th group to the auxiliary nodes of in the th group. For different and , the shifts may be the same or different from each other. The mapping rule for the interleaver is now the following.
Connection rule to avoid type II cycles Connect the auxiliary node indexed by in the th group in to the auxiliary node indexed by in the th group in .
We will show that in fact this rule prevents short type II cycles. But, first, we need to make sure that using this rule does not introduce short type I cycles. This is settled in the next theorem that shows that type I cycles do not depend on the shift .
Theorem 2:
Connecting the auxiliary node indexed by in the th group in to the auxiliary node indexed by in the th group in guarantees that any type I cycle formed is at least of length , where denotes the number of tiers in .
The proof of Theorem 2 is similar to that of Theorem 1 and is omitted here.
By Theorem 2, we are free to choose any shift in the rule because the length of any type I cycle is guaranteed to be greater than or equal to . The freedom to choose the values of is exploited to avoid short type II cycles. This is the subject of Theorem 3 and Section III-C2.
We observe that each type II cycle with edges in the interleaver is associated with shifts
We say that the type II cycle is characterized by the shift sequence . The index labels of the shift sequence characterizing a type II cycle satisfy the following two conditions:
, and , ; (ii) , and and , and .
For example, the two type II cycles of length six on the bottom left and bottom right of Fig. 7 contain four edges in the interleaver. They are characterized by the same shift sequence
The type II cycle shown on the top of Fig. 7 contains six edges in the interleaver; it is characterized by the shift sequence Given a shift sequence that satisfies conditions (i) and (ii) above, we define further its accumulated alternate sum to be
The following theorem helps to eliminate type II cycles with edges in the interleaver.
Theorem 3:
Let be a shift sequence that contains shifts.
is the accumulated alternate sum of and has digits in its -alternate-decimal expansion. If contains at most consecutive digits " " in its -alternate-decimal expansion, then any type II cycle characterized by has length NO less than . Proof: We prove Theorem 3 by proving an equivalent proposition: If there exists a type II cycle with edges in the interleaver and its length is less than , then the associated MUST contain more than consecutive digits " ." Fig. 8 shows a type II cycle with edges in the interleaver. Let this type II cycle contain edges in and edges in . By assumption, the length of this type II cycle is less than . Since the length of the type II cycle is , then (7) Since there are edges in the interleaver, the cycle contains auxiliary nodes, auxiliary nodes in , and auxiliary nodes in . With reference to the plot in Fig. 8, and assuming , then contains more than consecutive zeros. Thus, if a cycle has edges in the interleaver and its length is less than , then its associated contains more than consecutive zeros in its -alternate-decimal representation. This completes the proof.
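The test suggested by Theorem 3 can be sketched in a few lines. This example assumes that the accumulated alternate sum combines the shifts of the sequence with alternating signs, digit by digit and without carry (first shift added, second subtracted, and so on); the exact definition is the one given in the text, and the sign convention here is an assumption. The names `odd_radix` and `even_radix` are likewise assumed, denoting the radices at odd and even coordinates counted from the right.

```python
def radix_at(position, odd_radix, even_radix):
    """Radix of the digit at a coordinate counted from the right, starting at 1."""
    return odd_radix if position % 2 == 1 else even_radix


def accumulated_alternate_sum(shifts, odd_radix, even_radix):
    """Digit-wise s1 - s2 + s3 - ... with no carry (assumed sign convention)."""
    num_digits = len(shifts[0])
    total = [0] * num_digits
    for step, shift in enumerate(shifts):
        sign = 1 if step % 2 == 0 else -1
        total = [
            (t + sign * d) % radix_at(num_digits - i, odd_radix, even_radix)
            for i, (t, d) in enumerate(zip(total, shift))
        ]
    return total


def longest_zero_run(digits):
    """Length of the longest run of consecutive zero digits."""
    best = run = 0
    for d in digits:
        run = run + 1 if d == 0 else 0
        best = max(best, run)
    return best
```

A shift sequence would then be accepted when `longest_zero_run` of its accumulated alternate sum does not exceed the bound stated in the theorem.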
D. Minimum Number of Groups in Each Tree
There is a tradeoff when deciding the number of groups to choose in each tree. On the one hand, to get compact TS-LDPC codes, we prefer a small number of groups in and . To reduce the number of groups, the number of the common leftmost digits of the indices of the auxiliary nodes in the same group should be as small as possible. On the other hand, a smaller number of groups decreases the number of free parameters in the code design. We will see that a specific class of type II cycles constrains how small the minimum number of groups in each tree can be.
Lemma 2:
The cycle shown on the left in Fig. 9 always exists for any possible values of the related shifts and when . ( , , , and are indices of the bit nodes and , respectively.) Proof: As shown on the right in Fig. 9 , we add auxiliary nodes , , , , , , , and to the cycle shown on the left in Fig. 9 . Let , , ,
the indices of the auxiliary nodes , , , , , , , and , respectively. Suppose that the cycle shown on the left in Fig. 9 does not exist for certain values of the shifts and . With reference to the plot on the right in Fig. 9 , the two check nodes and should be different. Similar to (8), we derive that (9) where , , , and . Equation (9) can be simplified as (10) Since and , , , are auxiliary nodes of , , , , respectively, we derive that and are different only in the rightmost digit. Moreover, since , , , are connected to the same check-node group as shown on the right in Fig. 9 , the rightmost digits of , , , are also the same. Hence, by the above reasoning (11) By (11), we derive that (12) Combining (10) and (12), we derive that (13) Since and are connected to the same check node , and are different only in the rightmost digit by definition. Therefore, all the digits of are zeros except for the rightmost one. Since by (13) , should also have all zero digits except for the rightmost digit. Hence, and are different only in the rightmost digit, which implies that the two auxiliary nodes and are children of the same check node. However, as shown on the right in Fig. 9 , is connected to and is connected to . Hence, and are the same check node. This contradicts the assumption that and are different. Therefore, the cycle shown on the left in Fig. 9 always exists for any choices of the shifts and when . This completes the proof. and ; the right plot shows a length cycle that cannot be excluded by choosing appropriate shifts and . To eliminate the cycle shown on the left in Fig. 9 , we introduce more subgroups to divide the two check nodes and into two different groups, as shown on the left in Fig. 11 . Similar to the proof of Lemma 2 and with reference to the right plot in Fig. 11 , we derive that (14) Since by (12) , (14) can be simplified as (15) As analyzed before, has only one nonzero digit-the rightmost one. 
However, this time, we have the freedom to choose appropriate values of the shifts , , , and to make their summation having nonzero digits other than the rightmost one, e.g., the second to the rightmost digit. Hence, by (15) , can be made to have nonzero digits other than the rightmost one, which means that the two check nodes and can be different. It follows that the cycle shown on the left in Fig. 11 can be avoided by carefully choosing shifts , , , and . This shows that, to achieve a desired girth , we should require that the number of groups in which we divide the auxiliary nodes of and be larger than a certain minimum, which is a function of the girth .
We now consider the question of determining the minimum number of groups to achieve the girth and how this relates to the parameter in Section III-C. Let and represent the minimum number of groups needed for and , respectively. Then we have the following lemma. Proof: Let us first look at an example. As analyzed before, the length cycle shown on the left in Fig. 10 cannot be avoided when the two check nodes and are in the same group. To avoid this length cycle, and must be in two different groups, as shown in Fig. 12 . Since connects to the auxiliary node and connects to the auxiliary node in Fig. 12 , and must connect to check nodes in two different groups. Further, since and can be any two auxiliary nodes whose indices are different only in the two rightmost digits, we conclude that any two auxiliary nodes whose two rightmost digits are different should be connected to different groups. Since there are categories of such auxiliary nodes, we need at least groups in . Similarly, to avoid the length cycle shown on the right in Fig. 10 , we need at least groups in . More generally, to avoid the cycle with length as shown on the left in Fig. 9 , we derive the following. When is odd, at least groups in are needed; when is even, at least groups in are necessary. The above relationship can be compactly written as (18) Now we study the number of groups needed for . To avoid the cycle shown on the left in Fig. 13 , we divide the two bit nodes and into different subgroups, as shown on the right in Fig. 13 . Symmetrically, we derive that (19) where is the girth to be achieved and is the minimum number of groups needed for . This completes the proof.
On the other hand, if the indices of the auxiliary nodes in the same group must have leftmost digits in common, and are given by (20) (21) So, to achieve girth , the parameter that determines the number of groups has to satisfy (22) Equation (22) determines the minimum value of the parameter to achieve a girth .
E. Construction of TS-LDPC Codes
We use Theorem 3 to reduce the construction of TS-LDPC codes with a desired girth to the design of a shift matrix that collects appropriate shifts. By suitably choosing these shifts according to Theorem 3, we can avoid all type I and type II cycles shorter than the desired girth. We present next an algorithm that finds the shift matrix for TS-LDPC codes with a given girth. The shift matrix is much smaller than the TS-LDPC code parity-check matrix. The algorithm is greedy: we determine the shifts one at a time, the value of each being strongly dependent on the previously determined shifts. Different initial settings lead to different shift matrices. If the algorithm fails to generate a shift matrix, it is restarted with different initial settings. For a TS-LDPC code with given column weight, row weight, and number of tiers in the upper tree, the number of candidate shift matrices is exponential in the number of groups in the two trees. Large girth may require increasing the number of tiers in the upper and lower trees.
As an illustration, we applied Algorithm 1 to construct a regular LDPC code, with rate and girth . Its structure is given by the matrix shown in Fig. 3 .
We can clearly identify the upper tree, the lower tree, and the interleaver component from the constructed matrix, as labeled in Fig. 3. In this matrix, along the solid lines, there is a single 1 in each row, while along the dashed thicker diagonals there are five 1's in each row, so that per row there are six 1's.
We have the following relation between the column weight , the row weight , the girth , and the code block length of TS-LDPC codes that we construct when the girth is small:
When the girth is large, we need to apply Algorithm 1 to find the code block length required.
Algorithm 1 TS-LDPC codes with girth
Initialization: Set the shift matrix to the empty matrix. Determine its elements row by row, starting from the first entry.
step a: Select a candidate value for the current shift and clear its flag.
for each closed-path length under consideration do
  for all closed paths of that length, through the entries of the shift matrix determined so far, that pass through the current entry do
    Compute the accumulated alternate sum of the shifts at the consecutive corners of the closed path.
    if the sum contains more than the allowed number of consecutive zeros then
      raise the flag and stop the for loop
    else
      keep the flag cleared
    end if
  end for
end for
if the flag is raised then
  discard the current candidate and go back to step a to select another possible value
else
  fill the current entry of the shift matrix with the candidate value
  move to the next element of the shift matrix that is to be determined
  if all the elements of the shift matrix have been properly chosen then go to step b
  else go back to step a
  end if
end if
step b: End. Output the shift matrix.
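The greedy procedure can be sketched in simplified form. The example below is not Algorithm 1 itself: it only checks closed paths with four corners (rectangles in the shift matrix with distinct rows and columns), tries candidates in a fixed order instead of restarting from different initial settings, and assumes an alternating-sign, carry-free accumulated sum. The names `greedy_shift_matrix`, `radices` (listed leftmost digit first), and `max_zero_run` are assumptions of this example.

```python
from itertools import product


def alt_sum(corners, radices):
    """Digit-wise s1 - s2 + s3 - s4 with no carry (assumed sign convention)."""
    total = [0] * len(radices)
    for step, shift in enumerate(corners):
        sign = 1 if step % 2 == 0 else -1
        total = [(t + sign * d) % r for t, d, r in zip(total, shift, radices)]
    return total


def longest_zero_run(digits):
    """Length of the longest run of consecutive zero digits."""
    best = run = 0
    for d in digits:
        run = run + 1 if d == 0 else 0
        best = max(best, run)
    return best


def greedy_shift_matrix(rows, cols, radices, max_zero_run):
    """Fill the matrix entry by entry, row by row; return None if the greedy pass gets stuck."""
    candidates = list(product(*(range(r) for r in radices)))
    S = [[None] * cols for _ in range(rows)]
    for p in range(rows):
        for q in range(cols):
            for cand in candidates:
                S[p][q] = cand
                ok = True
                # Rectangles whose other three corners are already determined.
                for p2 in range(p):
                    for q2 in range(q):
                        corners = [S[p][q], S[p][q2], S[p2][q2], S[p2][q]]
                        if longest_zero_run(alt_sum(corners, radices)) > max_zero_run:
                            ok = False
                if ok:
                    break
            else:
                return None  # every candidate was rejected for this entry
    return S
```

A full implementation would also enumerate the longer closed paths required by the target girth and would restart with a different initial setting when the pass returns `None`.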
F. Efficient Memory Utilization
In general, an -LDPC code is represented by an parity-check matrix . Its efficient storage records only the nonzero column indices in each row, hence, at least indices needs to be stored. In contrast, for a TS-LDPC code we need only to store its small shift generating matrix . For example, for the matrix with girth shown in Fig. 3 , according to (16) and (17), , and so the shift matrix is . Hence, instead of storing the column indices required for generic LDPC codes, TS-LDPC require only storing shifts, reducing the memory by a factor of .
IV. PERFORMANCE EVALUATION
We compare by simulation the BER of the TS-LDPC codes with the BER of randomly constructed LDPC codes that are free of 4-cycles [32] in additive white Gaussian noise (AWGN) channels. The codes are decoded with the sum-product algorithm [33]. We adopt the rate-normalized SNR defined in [32] : SNR , where denotes the code rate. We first compare the performance of TS-LDPC codes and random codes free of 4-cycles with column weight . We consider TS-LDPC codes with girth . Fig. 14 compares the BER performance for two LDPC codes with rate : a random (4-cycle free) and a TS-LDPC code of girth . In the high-SNR region, the TS-LDPC code outperforms the random code: at BER this gain is 1.2 dB. In the low-SNR region, the TS-LDPC code has performance comparable to that of the random code. The slope of the BER curve for the random LDPC code decreases with the SNR in the high-SNR region, implying that for this code the error floor occurs when the BER reaches , which is not the case for the girth TS-LDPC code. For LDPC codes with column weight , we can derive that the minimum distance , where is the girth. For the TS-LDPC code with girth , ; for the random LDPC code that is free of 4-cycles, since the girth is only , . In the high-SNR region, the minimum distance is a dominant factor in determining the code BER performance. Therefore, TS-LDPC codes with girth outperform random codes when the SNR is high. This is in agreement with our simulation studies.
We now study column weight codes. Fig. 15 shows the BER performance for a column weight TS-LDPC code with girth . For comparison, we also show the BER performance of a randomly constructed LDPC code (no -cycles) with column weight (dashed line). Both codes have the same block length and the same code rate . Since there exist many explicitly constructed LDPC codes with high girth, we also incorporate a LDPC code constructed from rectangular integer lattices [20] in our simulations. In particular, the constructed LDPC code is based on the lattice construction of --configurations [20] and has girth . Fig. 15 shows that the BER performance of the TS-LDPC code outperforms that of the random LDPC code at BER while at low SNR, both codes have identical error-correcting performance. The BER performance of the TS-LDPC code is also slightly better than that of the LDPC code constructed from rectangular integer lattices at high SNR.
The construction of TS-LDPC codes is based on interleaving edges between the leaf nodes of the upper and lower trees. Cycles are avoided by controlling the interleaver. The variables in the upper and lower trees are strongly connected only through leaf nodes. During the decoding process, the computation of the likelihood for non-leaf nodes of the upper and lower trees depends on the information from the leaf nodes of the trees. It is therefore interesting to examine the error events of the iterative decoder to see whether errors concentrate in the non-leaf nodes. The experimental evidence in Fig. 16, which plots the BER of leaf and non-leaf nodes separately for the TS-LDPC code constructed above, does not support such a concentration: the BER for non-leaf nodes is slightly better than that for leaf nodes. The figure also shows that the BER for undetected codeword errors is much lower than the BER for non-leaf and leaf nodes, as expected.
We now study TS-LDPC codes with girth . The plot in Fig. 17 shows the BER performance for a column weight TS-LDPC code with girth . For comparison, we also show the BER performance of a randomly constructed LDPC code (no -cycles) with column weight (dashed line). Both codes have the same block length and the same code rate . We also study by simulation three other structured regular LDPC codes: a finite-geometry LDPC code [7]; an LU LDPC code [19]; and a GQC-LDPC code [17], [23]-[25]. The finite-geometry LDPC code we use is an extended code constructed from the TYPE-I 2-D EG-LDPC code. The code constructed has code block length and code rate 0.5. The LU LDPC code constructed has code block length , girth , and code rate . The GQC-LDPC code constructed has code block length , girth , and code rate . Fig. 17 shows that the BER performance of the TS-LDPC code is 0.12 dB better than that of the random LDPC code at BER , while at low SNR, both codes have similar error-correcting performance. Using results from [3], we can compute that the TS-LDPC code has minimum distance . Since this lower bound on the minimum distance derived in [3] is not tight, the actual minimum distance of the TS-LDPC code may be much larger than . Again, in the high-SNR region, the minimum distance is a dominant factor in determining the code BER performance. This explains why the TS-LDPC code with girth has good BER performance in the high-SNR region. We also notice that at moderate-to-high SNR the TS-LDPC code outperforms the GQC-LDPC code, which suggests that TS-LDPC codes have better BER performance than GQC-LDPC codes with the same code parameters. Further, the TS-LDPC code, the finite-geometry LDPC code, and the LU LDPC code have similar BER performance. The finite-geometry LDPC code and the LU LDPC code have good BER performance since they have large minimum distance [7], [19]. Again, we compare the BER performance for non-leaf and leaf nodes for the same TS-LDPC code just discussed. This is shown in Fig. 18, which also plots the undetected error rate.
These simulation results confirm the previous observation, shown in Fig. 16 for the TS-LDPC code, that the BER performance of non-leaf nodes is either comparable to or slightly better than the BER performance of leaf nodes.
V. EFFICIENT ENCODING FOR TS-LDPC CODES
TS-LDPC codes, or more precisely a variant of TS-LDPC codes, can be encoded with linear complexity. We first describe this variant, encoding friendly TS-LDPC (EFTS-LDPC) codes, and then show how EFTS-LDPC codes can be encoded with linear complexity.
A. Encoding Friendly TS-LDPC (EFTS-LDPC) Codes
The Tanner graph of an EFTS-LDPC code still contains an upper tree and a lower tree that are interconnected by an interleaver. There are no restrictions on the upper tree of the EFTS-LDPC code, which can be exactly the same as the upper tree of the standard TS-LDPC codes. The lower tree of the EFTS-LDPC code is restricted so that the degree of its bit nodes is two. In addition, the root of the lower tree is changed from a bit node to a check node. The Tanner graph for an EFTS-LDPC code is shown in Fig. 19. As a result of these changes, EFTS-LDPC codes are slightly irregular. These modifications enable EFTS-LDPC codes to be encoded with linear complexity.
B. Linear-Complexity Encoding of EFTS-LDPC Codes
Since an LDPC code is equivalently represented by its Tanner graph, we explain how to encode EFTS-LDPC codes using their Tanner graph.
To achieve linear-complexity encoding, we first need to remove the root (a check node) of the lower tree from the Tanner graph of the EFTS-LDPC code. The following lemma shows that removing the root of the lower tree does not alter the underlying code structure.
Lemma 4: The parity-check equation denoted by the root of the lower tree is redundant and can be removed from the parity-check matrix without changing the underlying code structure.
Proof: We discuss two cases: the bit node degree of the upper tree is even, and the bit node degree of the upper tree is odd.
1) The Bit Node Degree of the Upper Tree Is Even: Since all the bit nodes in the lower tree have uniform degree two by definition, the degree of every bit node in the Tanner graph is even, which means that each column of the parity-check matrix contains an even number of 1's. Hence, the sum of all the rows of the parity-check matrix in the binary field is a vector of 0's. Therefore, one row of the parity-check matrix is linearly dependent on the remaining rows and can be removed without affecting the code. We choose to remove the row that corresponds to the root of the lower tree. For example, Fig. 20 shows an EFTS-LDPC code. The bit node degree of its upper tree is two, an even number. The root of its lower tree can be removed to generate an equivalent Tanner graph, as shown in Fig. 21.
2) The Bit Node Degree of the Upper Tree Is Odd: By construction, check nodes in the lower tree connect to either leaf nodes of the upper tree or bit nodes of the lower tree. Since each leaf node of the upper tree is connected to exactly one check node in the upper tree and its degree j is an odd number, each leaf node of the upper tree is connected to an even number of check nodes in the lower tree. Further, every bit node in the lower tree is connected to exactly two check nodes in the lower tree by construction. Hence, every bit node is connected to an even number (possibly zero) of check nodes in the lower tree. If we sum up the parity-check equations denoted by the check nodes in the lower tree, the summation in the binary field is a vector of 0's. Therefore, we can remove one of those parity-check equations without changing the underlying code structure. We again choose to remove the parity-check equation denoted by the root of the lower tree. For example, Fig. 22 shows an EFTS-LDPC code in which the bit node degree of the upper tree is odd. The root of the lower tree can be removed from its Tanner graph without changing the code, as shown in Fig. 23. This completes the proof.
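The linear-dependence argument used in the proof can be verified on a small matrix. The Python sketch below is a minimal illustration under stated assumptions: the helper `gf2_rank` and the toy matrix `H_even` are hypothetical and do not come from any code in the paper. It checks that when every column has even weight the rows sum to the all-zero vector over the binary field, so deleting one row leaves the rank, and hence the code, unchanged.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a list of equal-length 0/1 rows."""
    # Pack each row into an integer so that row addition is a single XOR.
    packed = [int("".join(str(bit) for bit in row), 2) for row in rows]
    rank = 0
    while packed:
        pivot = packed.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            packed = [r ^ pivot if r & low else r for r in packed]
    return rank


# Hypothetical toy parity-check matrix in which every column has weight two.
H_even = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
]

# Sum of all rows over GF(2): one entry per column.
row_sum = [sum(column) % 2 for column in zip(*H_even)]

rank_full = gf2_rank(H_even)
rank_without_first_row = gf2_rank(H_even[1:])
```

Equal ranks mean that the removed row was a combination of the remaining rows, which is the sense in which the parity-check equation of the root is redundant.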
After removing the root of the lower tree, we use Algorithm 2 to encode an EFTS-LDPC code. Next, we show that the encoding complexity of Algorithm 2 is linear in the block length. We first study the encoding process for the upper tree, which can be encoded in linear time, as shown in the following lemma.
[Algorithm 2: Encoding algorithm for EFTS-LDPC codes]
Lemma 5: The upper tree can be encoded in linear complexity.
The proof is straightforward and is omitted here. As an example, Fig. 24 shows an upper tree, which we encode as follows:
a. Acquire the values of the information bits.
c. Compute the parity bits from the parity-check equations.
The above encoding process requires only XOR operations. After encoding the upper tree, we notice that all the bit nodes in the lower tree represent parity bits. Since the degree of every bit node in the lower tree is two, the value of a bit node depends only on the values of the bit nodes in the tier below it. In particular, the values of the bit nodes in the bottom tier of the lower tree depend only on the values of the leaf nodes of the upper tree. Hence, we can compute the values of the bits in the lower tree tier by tier, starting from the bottom tier. Each time we obtain the values of all the bits in a given tier, we can then compute the values of all the bits in the tier above it. This process continues until the values of all the bits in the second tier are known (the first tier has been removed by Lemma 4). In this way, we encode all the bits in the lower tree.
We, again, look at an example. Fig. 23 .
We now evaluate the computational complexity of Algorithm 2. Each parity-check equation is used to obtain the value of one parity bit. Solving a parity-check equation for one bit amounts to XORing the remaining bits in that equation, so the number of XOR operations per equation is determined by the number of bits the equation contains. The total number of XOR operations needed to obtain all the parity bits is the sum of these counts over all the parity-check equations, which can be expressed through the number of equations and the average number of bits per equation. For LDPC codes with uniform row weight, the number of bits per equation is a constant that does not depend on the block length, while the number of equations grows linearly with the block length. From the above analysis, the proposed encoding process can be accomplished in linear time.
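The tier-by-tier computation and its operation count can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions and is not Algorithm 2 itself: the function `encode_tiers`, the data layout (each parity-check equation given as one unknown bit plus the list of bits already known), and the toy tree are all hypothetical. Each equation that involves t known bits costs t - 1 XOR operations.

```python
def encode_tiers(known_values, tiers):
    """Solve parity bits tier by tier, from the bottom tier upward.

    known_values: dict mapping bit id -> 0/1 for bits already encoded
                  (here, the leaf bits of the upper tree).
    tiers:        list of tiers ordered bottom to top; each tier is a list of
                  (unknown_bit, known_bits) pairs, one per parity-check equation.
    Returns the completed value dict and the number of XOR operations used.
    """
    values = dict(known_values)
    xor_count = 0
    for tier in tiers:
        for unknown_bit, known_bits in tier:
            # The unknown bit must make the equation sum to zero over GF(2),
            # so it equals the XOR of all the other bits in the equation.
            acc = values[known_bits[0]]
            for bit in known_bits[1:]:
                acc ^= values[bit]
                xor_count += 1
            values[unknown_bit] = acc
    return values, xor_count


# Hypothetical toy example: four leaf bits of the upper tree feed two
# parity bits in the bottom tier, which feed one parity bit in the next tier.
leaf_bits = {"u0": 1, "u1": 0, "u2": 1, "u3": 1}
lower_tiers = [
    [("p0", ["u0", "u1"]), ("p1", ["u2", "u3"])],  # bottom tier
    [("p2", ["p0", "p1"])],                        # tier above
]

codeword_bits, xor_ops = encode_tiers(leaf_bits, lower_tiers)
```

In this toy example every equation contains three bits and costs one XOR, so the total count equals the number of equations. With a fixed row weight, the count therefore scales with the number of parity-check equations, in line with the linear-time conclusion above.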
VI. CONCLUSION
This paper proposes a class of well-structured regular LDPC codes: the turbo-structured (TS-LDPC) codes. We showed through a series of theorems that we can design TS-LDPC codes with arbitrary desired column and row weights j and k, hence with any practical rate, and with arbitrary girth g. TS-LDPC codes can be designed by specifying a shift matrix, a much smaller object than the parity-check matrix, and hence require much less memory to store. TS-LDPC codes with girth have good BER performance, with a lower error floor at high SNR than -cycle-free random codes of equivalent size and rate. We further showed that a variant of TS-LDPC codes, EFTS-LDPC codes, can be encoded efficiently with linear complexity. These characteristics of flexible code rates, arbitrarily large girth, good error floor performance, efficient storage, and efficient encoding make TS-LDPC codes attractive for applications such as digital communication systems and data storage systems.
