Parity-check (PC) polar codes can yield better error-correcting performance than cyclic redundancy check (CRC) aided polar codes under a successive cancellation list (SCL) decoder. However, PC bits cannot detect errors as effectively as a CRC. To overcome this shortcoming, this paper proposes a CRC-aided PC polar coding scheme. The proposed scheme can detect errors before decoding is completed, and it outperforms standard CRC-aided polar codes while offering better error-detecting capability.
I. INTRODUCTION
Polar codes, introduced in [1], have been adopted in the 5th generation wireless communication standard as the control channel coding scheme for the enhanced Mobile Broadband (eMBB) service. Currently, the dominant decoding scheme for polar codes is the cyclic redundancy check (CRC) assisted successive cancellation list (CA-SCL) algorithm [2], [3], which makes polar codes competitive with state-of-the-art codes. To further improve the performance of CA-SCL at high signal-to-noise ratio (SNR), several works have focused on the CRC code design. Zhang et al. searched for the optimal CRC polynomials for the standard CA-SCL decoder in order to eliminate erroneous polar codewords with minimum Hamming weight (MHW) during decoding [4]. Following the same idea, [5] and [6] designed the bits protected by the CRC and the locations of the CRC bits, respectively.
However, all the above schemes have two drawbacks. First, they cannot detect errors during the intermediate decoding process (see footnote 1). Second, they cannot correct decoding errors.
To solve the first problem, [7] proposed multi-CRC polar codes, which can also reduce the decoding delay and memory space. Meanwhile, another partitioning method that reduces the memory requirements of SCL decoding was proposed in [8]. Since these structures suffer from performance degradation, [9] proposed an optimized scheme. However, its design complexity is high and it fails to give a universal concatenation scheme.
Footnote 1: Since the SC or SCL decoder is serial, the decoding delay is large, especially when the code length is very high. A method is therefore needed that detects decoding errors in time, rather than only at the end of decoding.
The associate editor coordinating the review of this manuscript and approving it for publication was Soon Xin Ng.
The second defect can be addressed by parity-check (PC) polar coding, which was first proposed in [10]. Subsequently, [11] designed PC coding based on the polar kernel to improve the performance. In particular, [12] used a single cyclic shift register (CSR) to design a PC polar coding scheme, which has been verified to provide a significant performance gain.
Indeed, PC bits can be regarded as scattered CRC bits that detect errors (see footnote 2). However, they cannot improve the performance while simultaneously maintaining an error-detecting rate similar to that of the standard CA-SCL decoder. That is to say, if PC bits were used to detect errors, the performance would not be improved; if they were used to correct errors, the receiver would not be able to confirm whether the decoding is correct. Since the error-detecting capability of p independent PC bits is inferior to that of a length-p CRC [13], PC bits are mainly used to correct errors. As a result, some techniques that can be used with the CA-SCL decoder to repair a failed decoding attempt, such as the adaptive SCL decoder [14], bit-flip algorithms [15], [20], [21] and the hybrid automatic repeat request (HARQ) mechanism [22], cannot be applied to PC polar coding.
Footnote 2: PC bits are scattered not just in their locations but also in their working mode. PC bits are actually odd-even check bits and work independently of each other. A CRC, in contrast, has a longer code length and its check bits are interrelated, so it performs better at checking errors.
Overall, no existing structure provides decoding performance similar to that of PC polar codes while detecting errors as accurately as standard CRC-concatenated polar codes. To deal with this issue, this paper proposes a CRC-aided PC polar coding scheme that has both error-correcting and error-detecting capability. The main contributions of this paper can be summarized as follows:
1) Based on [12], we propose an enhanced PC coding scheme which can significantly reduce the rate loss.
2) For the enhanced scheme, the suitable number of PC bits is discussed.
3) The locations of the CRC and PC bits are designed. In the proposed scheme, the CRC bits are located in the middle of the information sequence and only protect the bits preceding them. Thus, decoding errors can be detected before the entire decoding is completed.
Simulation results verify that the proposed scheme provides better performance and error-detecting capability than the standard CRC-concatenated scheme, irrespective of the code length or the code rate.
II. PRELIMINARIES
The encoding of a polar code of length $N = 2^n$ can be expressed as
$$c_1^N = u_1^K G_A,$$
where $u_1^K$ and $c_1^N$ respectively denote the sequence of information bits and the codeword, and $G_A$ is the generator matrix consisting of the rows with indices $i \in A$ of the matrix $G = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}^{\otimes n}$, with $A$ the information set and $\otimes$ the Kronecker product. Let $|A|$ be the cardinality of $A$ and $A^c = \{1, 2, \ldots, N\} - A$ be the frozen set. The code rate is $R = K/N$. The encoding process can also be written as $c_1^N = \bar{u}_1^N G$, where $\bar{u}_1^N$ is the input vector of the polar code. In $\bar{u}_1^N$, the bits with indices in $A^c$ are all 0 and the bits with indices in $A$ are information bits.
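The encoding relation above can be illustrated with a minimal Python sketch. It assumes the plain Kronecker-power matrix $G$ defined in this section (no bit-reversal permutation) and 1-based indices in the information set; the function names `polar_transform` and `polar_encode` are hypothetical and chosen only for this illustration.

```python
def polar_transform(u_bar):
    """Multiply a length-N bit list by G = [[1,0],[1,1]]^{(x)n} over GF(2).

    Uses the butterfly structure of the Kronecker power instead of
    building the N x N matrix explicitly.
    """
    n_total = len(u_bar)
    assert n_total > 0 and n_total & (n_total - 1) == 0, "N must be a power of two"
    x = list(u_bar)
    step = 1
    while step < n_total:
        for start in range(0, n_total, 2 * step):
            for j in range(start, start + step):
                x[j] ^= x[j + step]  # upper branch takes the XOR of both branches
        step *= 2
    return x


def polar_encode(info_bits, info_set, n_total):
    """Place info bits on the (1-based) indices in A, zeros on A^c, then transform."""
    u_bar = [0] * n_total
    for bit, idx in zip(info_bits, sorted(info_set)):
        u_bar[idx - 1] = bit
    return polar_transform(u_bar)


if __name__ == "__main__":
    # Toy code with N = 4 and A = {2, 4}; the set is illustrative only.
    print(polar_encode([1, 1], [2, 4], 4))
```

Selecting the rows of $G$ indexed by $A$ and multiplying by $u_1^K$ gives the same codeword as zero-padding to $\bar{u}_1^N$ and multiplying by the full $G$, which is what the sketch does.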
B. PARITY-CHECK POLAR CODING
Of all the existing PC polar coding schemes, we only consider the one in [12], which has two advantages. First, its construction is so simple and universal that there is no need to redesign and store different PC functions for different polar coding schemes. Second, it achieves a satisfactory performance gain compared with the other existing schemes. The process of this PC coding can be summarized as follows:
1) Construct the polar code to obtain $A$, where $|A| = p + K$, with $p = 1, 2, \ldots, N - K$, and $\alpha$ a parameter determined by the list size of the SCL decoder.
2) Determine the locations of the PC bits. Let $A_m$ and $A_s$ be the sets that satisfy
$$A_m = \{ i \in A : w(g_n^{(i)}) = d_m \}, \quad A_s = \{ i \in A : w(g_n^{(i)}) = d_s \},$$
where $d_m$ and $d_s$ are the MHW and the sub-minimum Hamming weight of the polar code, respectively, and $w(g_n^{(i)})$ is the weight of the $i$-th row of $G$. The positions of the PC bits are selected from these two sets in descending order of their reliability, which is obtained from the code construction. Concretely, if $p \le |A_m|$, $p$ indices in $A_m$ are selected. If $p > |A_m|$, all the indices in $A_m$ are selected and the extra $p - |A_m|$ indices are chosen from $A_s$. We use $A_{in}$ to denote the difference set between $A$ and $P$, where $P$ is the set of the selected locations for the PC bits.
3) Implement the PC pre-coding with a length-5 CSR as shown in Fig. 1. Let $r[k]_i$ be the value of the $k$-th register after $i$ cyclic shifts. We use $\bar{u}_1^N$ to denote the PC encoded vector, or equivalently the input vector of the polar code. The registers are initialized as $r[1]_0 = r[2]_0 = \cdots = r[5]_0 = 0$, and a counter $k$ is set to 1. Then, the indices from 1 to $N$ are considered one by one. For each $i \in \{1, 2, \ldots, N\}$, the register is first cyclically shifted to the left, i.e.,
$$r[1]_i = r[2]_{i-1}, \quad r[2]_i = r[3]_{i-1}, \quad \ldots, \quad r[5]_i = r[1]_{i-1}.$$
If $i \in A_{in}$, then successively set
$$\bar{u}_i = u_k, \quad r[1]_i = r[1]_i \oplus \bar{u}_i, \quad k = k + 1.$$
If $i \in P$, then set $\bar{u}_i = r[1]_i$. If $i \in A^c$, then set $\bar{u}_i = 0$.
Example: In the modified PC coding, when $N = 32$, $K = 16$, $A_{in} = \{8, 12, 14, 15, 16, 20, 22, 23, 24, 26, 27, 28, 29, 30, 31, 32\}$ and $P = \{13, 18, 19, 21, 25\}$, there are $|P| = 5$ PC functions.
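The register-based pre-coding can be sketched in Python as follows. This is a minimal illustration under one reading of the procedure: the register is shifted left once per index, an information bit is XORed into the first cell, and a PC bit copies the first cell. The function name `pc_precode` and the `frozen_as_pc` flag (which treats frozen positions as PC bits as well) are hypothetical.

```python
def pc_precode(u, n_total, info_idx, pc_idx, frozen_as_pc=False, reg_len=5):
    """Cyclic-shift-register PC pre-coding (indices are 1-based).

    u            : source bits, one per index in info_idx (ascending order)
    info_idx     : positions carrying source bits (A_in)
    pc_idx       : positions carrying PC bits (P)
    frozen_as_pc : if True, frozen positions also copy the register value
    """
    info_idx, pc_idx = set(info_idx), set(pc_idx)
    reg = [0] * reg_len            # reg[0] plays the role of r[1]
    u_bar = [0] * n_total
    k = 0                          # next source bit to place
    for i in range(1, n_total + 1):
        reg = reg[1:] + reg[:1]    # left cyclic shift
        if i in info_idx:
            u_bar[i - 1] = u[k]
            reg[0] ^= u[k]         # accumulate the information bit
            k += 1
        elif i in pc_idx or frozen_as_pc:
            u_bar[i - 1] = reg[0]  # PC bit copies the accumulated parity
        # otherwise the position is frozen and stays 0
    return u_bar


if __name__ == "__main__":
    a_in = [8, 12, 14, 15, 16, 20, 22, 23, 24, 26, 27, 28, 29, 30, 31, 32]
    p = [13, 18, 19, 21, 25]
    src = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]  # arbitrary test bits
    print(pc_precode(src, 32, a_in, p))
```

Because the register has five cells, the first cell at index $i$ holds the XOR of the information bits at indices $i - 5, i - 10, \ldots$. Under this reading, for the $N = 32$ example the PC bit at index 25 would equal $\bar{u}_{15} \oplus \bar{u}_{20}$ and the PC bit at index 13 would equal $\bar{u}_8$.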
III. CRC-AIDED PC POLAR CODING
A. MODIFIED PC CHECK
In fact, the scheme in [12] can be slightly modified by using all the frozen bits as PC bits. As shown in Algorithm 1, in the modified pre-coding process, when an index $i \in A^c$ is considered, we still set $\bar{u}_i = r[1]_i$. Since the SCL decoder can approach the maximum-likelihood (ML) performance with a practical list size, e.g., $L = 8$, the number of MHW codewords has a great impact on the decoding results [16]. Therefore, even if all the frozen bits are used as PC bits, the index set $P$ selected from $A_m$ is still necessary for parity checking, because checking with frozen bits alone cannot substantially reduce the number of MHW codewords.
In this modified PC coding scheme, there are two kinds of PC bits. The first kind consists of the bits at the frozen indices, which are called frozen parity-check bits (FPCB). The second kind consists of the bits whose positions are selected from $A_m$, i.e., the bits with indices in $P$, which are referred to as information parity-check bits (IPCB). Under the same conditions, the modified PC coding is expected to outperform the original one [12].
B. SUITABLE LENGTH OF PC CHECK
One of the main drawbacks of PC coding is that it cannot detect errors. Intuitively, a CRC can be used to solve this problem. However, too many check bits cause a rate loss that offsets the benefits. Thus, the total number of the two kinds of check bits should be limited. Under this condition, we first study the trade-off between the suitable numbers of the two kinds of check bits.
In the original PC coding scheme in [12] , if the rate matching technique is not used, the number of PC bits is set to
In particular, when the code rate equals 0.5 and $\alpha$ is fixed, the number of IPCB reaches its maximum value $\alpha \log_2 N$. Since $\alpha$ is a variable parameter, [12] did not provide a closed-form solution for the most suitable number of IPCB. Note that if $\alpha$ is set to 1, $p$ fluctuates around $\log_2 N$ depending on the code rate. Since FPCB do not cause any rate loss, we only discuss the number of IPCB in the following experiment.
In Fig. 2 and Fig. 3, we give the frame error rate (FER) of the PC coding scheme under the SCL decoder when $p$ is varied from 1 to $\log_2 N$ in steps of 1. We consider polar codes with length $N \in \{1024, 512, 256\}$ and rate $R \in \{1/2, 2/3\}$, so $|A|$ is set to $K + p$. When $R = 1/2$, the transmitting SNR is 2.4 dB and the list size of the SCL decoder is 8. When $R = 2/3$, the transmitting SNR is 2.6 dB and the list size is 16. Fig. 2 and Fig. 3 consider the original PC polar coding scheme in [12] and the modified PC coding scheme given in Algorithm 1, respectively. In Fig. 2, the performance gets worse as $p$ gets smaller. The opposite occurs in Fig. 3: the performance gets better as $p$ gets smaller. This is mainly because in the modified scheme the FPCB also benefit the decoding performance. In the original PC coding scheme, on the other hand, error correction relies only on the IPCB, so the number of IPCB matters more for the performance than in the modified scheme. That is to say, with the original PC coding, the number of IPCB cannot be further reduced.
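As a numerical sanity check, the sketch below evaluates one candidate expression for the number of PC bits, $p = \log_2 N \, (\alpha - |\alpha (R - 1/2)|^2)$. This expression is an assumption made for illustration: it is consistent with the values quoted in this section (a maximum of $\alpha \log_2 N$ at $R = 1/2$, and $p = 35 \log_2 N / 36$ at $R = 2/3$ with $\alpha = 1$), but it is not confirmed here as the formula used in [12]. Rounding to an integer is omitted, and the function name `num_pc_bits` is hypothetical.

```python
import math


def num_pc_bits(n_total, rate, alpha=1.0):
    """Candidate (assumed) expression for the unrounded number of PC bits."""
    return math.log2(n_total) * (alpha - abs(alpha * (rate - 0.5)) ** 2)


if __name__ == "__main__":
    for n in (256, 512, 1024):
        for r in (1 / 2, 2 / 3):
            print(n, round(r, 3), round(num_pc_bits(n, r), 3))
```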
To make a fairer comparison, in Fig. 4 and Fig. 5 the number of information bits changes with $p$ and only the modified PC coding is adopted. The code lengths of the considered polar codes are $N \in \{256, 512, 1024\}$. In Fig. 4, the number of information bits is set as
$$K = \frac{N}{2} + \log_2 N - p.$$
That is to say, $|A| = K + p$ is fixed to $\frac{N}{2} + \log_2 N$. The transmitting SNR is 2.4 dB. We can observe that the performance of the PC polar code with $p = 1$ is very close to that of the code with $p = \log_2 N$. In Fig. 5, the code rate is higher and set as
The transmitting SNR is 2.6 dB. We can observe that the performance gap between the scheme with $p = \frac{35 \log_2 N}{36}$ and the one with $p = 1$ is still narrow. In Fig. 6, we consider the modified PC scheme with
The transmitting SNR is 3.0 dB. Although the curves are still fairly flat, they are a little steeper than those in Fig. 4 and Fig. 5.
From Fig. 4-6, we can observe that as the code rate increases, the number of IPCB has a greater impact on the error-correcting performance.
Thus, for the modified PC coding scheme, we can conclude that a single IPCB is enough to obtain a satisfactory performance gain if the code rate is not very high (i.e., $R \le 3/4$). Under this condition, if we use a CRC-aided PC code, the number of CRC bits can be selected as
C. THE LOCATION OF CHECK BITS
In this part, we design the locations of the two kinds of check bits. Based on the discussion in the previous part, the number of IPCB can be fixed to 1. We want to maximize the path-metric penalty for the erroneous path. Thus, the IPCB should be located at the maximum index in $A_m$. The CRC bits, in turn, are expected to detect a decoding error as early as possible. Thus, we need to focus on the most likely location of the first erroneous bit. In other words, the positions where the first erroneous bit rarely appears do not need to be protected by the CRC. Recalling that at high SNR the performance of the SCL decoder approaches the ML bound, [6] verified that the main error patterns of the SCL decoder are the input vectors $\bar{u}_1^N$ whose corresponding codewords have the MHW. Meanwhile, [4] also proved that such an input vector $\bar{u}_1^N$ should satisfy
with $i \in A_m$. This implies that at high SNR, if a decoding attempt fails, the first error of the entire frame rarely occurs after the maximum index in $A_m$. At low SNR, most of the first erroneous bits also tend to appear in the first few positions, because the performance then mainly depends on the reliability of the bit-channels and the first several bit-channels are less reliable. That is to say, for any SNR, the CRC can achieve satisfactory detection by only protecting the bits with indices before the maximum index in $A_m$.
According to the above analysis, the locations of the CRC bits can be determined. If $A$ is arranged in ascending order, we have
$$A(1) < A(2) < \cdots < A(|A|).$$
We denote the maximum index of $A_m$ as $A(M)$, with $M \in \{1, 2, \ldots, |A|\}$. Then the set of the positions of the CRC bits is
Note that $A(M)$ is also the location of the PC bit. We have
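The placement rule can be sketched in Python. The sketch assumes that the CRC bits occupy the positions of $A$ immediately preceding $A(M)$, i.e., $A(M - p_{crc}), \ldots, A(M - 1)$; this is an assumption chosen to agree with the statements that the CRC bits sit in front of the IPCB and are checked at index $A(M-1)$. The function name `check_bit_layout` and the example set `mhw_set` are hypothetical.

```python
def check_bit_layout(info_set, mhw_set, n_crc):
    """Split the sorted set A into protected bits, CRC bits, the IPCB and the tail.

    info_set : all non-frozen indices A (1-based)
    mhw_set  : indices A_m whose rows of G have minimum Hamming weight
    n_crc    : number of CRC bits (assumed to sit right before A(M))
    """
    a = sorted(info_set)                  # A(1) < A(2) < ... < A(|A|)
    ipcb = max(mhw_set)                   # A(M): position of the single IPCB
    m = a.index(ipcb) + 1                 # 1-based M
    if m - 1 < n_crc:
        raise ValueError("not enough positions before A(M) for the CRC bits")
    return {
        "protected": a[: m - 1 - n_crc],  # information bits covered by the CRC
        "crc": a[m - 1 - n_crc : m - 1],  # A(M - n_crc), ..., A(M - 1)
        "ipcb": ipcb,
        "tail": a[m:],                    # information bits after the IPCB
    }


if __name__ == "__main__":
    a_set = [8, 12, 14, 15, 16, 20, 22, 23, 24, 26, 27, 28, 29, 30, 31, 32]
    mhw = {12, 20, 22}                    # illustrative only, not a real A_m
    print(check_bit_layout(a_set, mhw, n_crc=3))
```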
D. CRC-AIDED PC CODING SCHEME
The proposed CRC-aided PC coding scheme is summarized in Algorithm 2. The source information vector $u_1^K$ is first encoded by the CRC code. The resulting vector $u_1^{K+p_{crc}}$ is then PC encoded into $\bar{u}_1^N$.
Algorithm 2 CRC-Aided PC Pre-Coding Scheme
Input: A in , M , u K 
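The two-stage pre-coding (CRC first, then PC) can be sketched as follows. This is an illustrative reading rather than a reproduction of Algorithm 2: the CRC is computed over the information bits placed before it and inserted directly in front of the single IPCB, and all frozen bits act as FPCB. The generator polynomial, the set `mhw`, and the names `crc_remainder` and `crc_aided_pc_precode` are assumptions for this example.

```python
def crc_remainder(bits, poly):
    """Remainder of bits * x^deg divided by poly over GF(2).

    poly lists the generator coefficients, highest degree first,
    e.g. [1, 0, 0, 1, 1] for x^4 + x + 1.
    """
    deg = len(poly) - 1
    work = list(bits) + [0] * deg
    for i in range(len(bits)):
        if work[i]:
            for j, c in enumerate(poly):
                work[i + j] ^= c
    return work[-deg:]


def crc_aided_pc_precode(u, n_total, info_set, mhw_set, poly, reg_len=5):
    """CRC-encode the leading bits, then apply register-based PC pre-coding."""
    a = sorted(info_set)                   # A in ascending order
    deg = len(poly) - 1                    # number of CRC bits
    m = a.index(max(mhw_set)) + 1          # A(M) carries the IPCB
    n_head = m - 1 - deg                   # bits protected by the CRC
    assert n_head >= 1 and len(u) == len(a) - deg - 1
    head, tail = list(u[:n_head]), list(u[n_head:])
    u_crc = head + crc_remainder(head, poly) + tail
    data_pos = set(a) - {a[m - 1]}         # everything in A except the IPCB
    reg, u_bar, k = [0] * reg_len, [0] * n_total, 0
    for i in range(1, n_total + 1):
        reg = reg[1:] + reg[:1]            # left cyclic shift
        if i in data_pos:
            u_bar[i - 1] = u_crc[k]
            reg[0] ^= u_crc[k]
            k += 1
        else:                              # IPCB and frozen bits (FPCB)
            u_bar[i - 1] = reg[0]
    return u_bar


if __name__ == "__main__":
    a_set = [8, 12, 13, 14, 15, 16, 18, 19, 20, 21, 22, 23, 24, 25,
             26, 27, 28, 29, 30, 31, 32]
    mhw = {12, 14, 20}                     # illustrative only
    src = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]
    print(crc_aided_pc_precode(src, 32, a_set, mhw, [1, 0, 0, 1, 1]))
```

In this toy configuration the bits at indices 8, 12, 13 and 14 are protected, the four CRC bits sit at indices 15, 16, 18 and 19, and the IPCB is at index 20.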
In Fig. 7, we compare the different CRC-concatenated polar coding schemes and the modified PC coding scheme. Fig. 7(a) shows the standard scheme, where the CRC bits are appended at the tail of the source information vector and protect all the message bits. Fig. 7(b) depicts the scheme in [5], where the CRC bits are appended at the tail of the source vector but only protect the bits with indices in $A_m$. Fig. 7(c) presents the scheme in [6], where the CRC bits are located at the front of the source vector and protect the bits with indices in $A_m$. Fig. 7(d) depicts the structure in [7], where the CRC bits are partitioned into several parts and each part only protects the information bits preceding it. Fig. 7(e) depicts the modified PC coding scheme given in Algorithm 1, where only PC bits are adopted. Finally, Fig. 7(f) gives the structure of the proposed CRC-aided PC polar coding, in which the CRC bits are located in the middle of the source vector and only protect the bits preceding them. From Fig. 7, compared with all the existing CRC-concatenated polar codes, the proposed scheme has the following advantages:
1) The CRC bits are located in the middle of the information vector, so a failed decoding attempt can be detected before the entire decoding process is completed.
2) The detecting efficiency is better, since the protected bits are only part of the entire information vector and contain the first erroneously decoded bit with extremely high probability.
3) The PC bit can improve the performance.
4) When $|A_m|$ is very small compared with $|A|$, the schemes in which the CRC only protects the bits with indices in $A_m$ become ineffective and suffer from an extremely high undetecting rate. In contrast, the proposed scheme is suitable for polar codes with any possible $|A_m|$.
In Fig. 8, we also give the different PC coding schemes. Fig. 8(a) depicts the original PC coding scheme given in [12], where the IPCB are located at the indices in $P$ and the information bits are located at the indices in $A_{in}$. Fig. 8(b) shows the modified PC coding scheme, which is similar to the original one; the only difference is that the modified scheme uses the frozen bits as FPCB. Fig. 8(c) gives the proposed CRC-aided PC coding scheme, where only one IPCB is adopted and the CRC bits are placed in front of the IPCB. Note that the frozen bits are not shown for any of the schemes in Fig. 7.
Compared with the PC coding, the superiority of the proposed scheme is also notable. The proposed scheme can detect the failed decoding attempt so that the re-decoding mechanisms can be applied.
E. DECODING OF CRC-AIDED PC POLAR CODES
Let $L_1^N = [L_1, L_2, \ldots, L_N]$ be the log-likelihood ratio (LLR) vector received from the channel. We use $\hat{u}_i[l]$ to denote the $i$-th decoded bit on the $l$-th path when $u_i$ has just been decoded, with $i = 1, 2, \ldots, N$ and $l = 1, 2, \ldots, L$. Finally, $PM_i[l]$ denotes the path metric (PM) of the $l$-th path when $u_i$ has just been decoded. From [18], $PM_i[l]$ can be expressed as
Obviously, the most reliable path is the one with minimum PM. The decoding process of the proposed scheme is summarized in Algorithm 3.
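For illustration, a commonly used LLR-based path-metric update for SCL decoding adds $\ln(1 + e^{-(1 - 2\hat{u}_i[l]) \lambda_i[l]})$ to the previous metric, where $\lambda_i[l]$ is the decision LLR of bit $i$ on path $l$. Whether [18] states the metric in exactly this form is an assumption here; the symbol $\lambda_i[l]$ and the function names below are introduced only for this sketch. The second function is the usual approximation that penalizes a path by $|\lambda_i[l]|$ when its bit disagrees with the sign of the LLR.

```python
import math


def pm_update(pm_prev, llr, u_hat):
    """Exact update: PM + ln(1 + exp(-(1 - 2*u_hat) * llr)), computed stably."""
    z = -(1 - 2 * u_hat) * llr
    return pm_prev + max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def pm_update_approx(pm_prev, llr, u_hat):
    """Approximate update: add |llr| only if u_hat contradicts the LLR sign."""
    hard = 0 if llr >= 0 else 1
    return pm_prev if u_hat == hard else pm_prev + abs(llr)


if __name__ == "__main__":
    # A path that follows the LLR keeps a small metric; one that contradicts
    # it is penalized by roughly |llr|. Smaller metrics mean more reliable paths.
    print(pm_update(0.0, 4.0, 0), pm_update(0.0, 4.0, 1))
    print(pm_update_approx(0.0, 4.0, 0), pm_update_approx(0.0, 4.0, 1))
```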
Algorithm 3 Decoding of CRC-Aided PC Polar Codes
if all the paths fail the CRC then
Terminate decoding and declare failure.
The path with the minimum PM is output as $\hat{u}_1^N$.
Return $\hat{u}_1^N$;
In Algorithm 3, for any $i \in A_{in}$, the SCL decoder successively decodes $u_i$. In particular, if $i = A(M-1)$, all the current $L$ decoding trajectories $\hat{u}_1^{A(M-1)}$ are checked by the CRC. For the trajectories that cannot pass the CRC, the PM is set to positive infinity, which is equivalent to removing these trajectories from the candidate list. For any $i \notin A_{in}$, $\hat{u}_i[l]$ is obtained by the PC function established by Algorithm 2. If $i = N$, the path with the minimum PM is selected as the output of the decoder.
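The intermediate CRC step can be sketched in Python. The sketch covers only the pruning and early-termination logic, not a full SCL decoder; the predicate `crc_ok` stands in for the actual CRC check on the bits decoded so far, and all names are hypothetical.

```python
import math


def crc_prune(paths, metrics, crc_ok):
    """Set the metric of every path that fails the CRC to +infinity.

    Returns the updated metrics and a flag telling whether all paths
    failed, in which case decoding can be terminated early.
    """
    new_metrics = [pm if crc_ok(p) else math.inf for p, pm in zip(paths, metrics)]
    all_failed = all(math.isinf(pm) for pm in new_metrics)
    return new_metrics, all_failed


def best_path(paths, metrics):
    """Return the path with the minimum path metric."""
    best = min(range(len(paths)), key=metrics.__getitem__)
    return paths[best]


if __name__ == "__main__":
    # Toy stand-in for the CRC: accept paths with even parity.
    even_parity = lambda bits: sum(bits) % 2 == 0
    candidate_paths = [[1, 0, 0], [1, 1, 0], [0, 0, 0]]
    candidate_pms = [0.2, 0.9, 1.4]
    pms, failed = crc_prune(candidate_paths, candidate_pms, even_parity)
    print(pms, failed, best_path(candidate_paths, pms))
```

Setting the metric to infinity rather than deleting the path keeps the list size fixed, which matches the description of the operation as equivalent to removing the trajectory from the candidate list.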
One of the main advantages of this decoder is that the CRC is used not only to detect errors but also to correct them. All the trajectories that can be seen as interference with the correct path, i.e., those trajectories that cannot pass the CRC, are eliminated during intermediate decoding.
Performance comparison of the proposed structure with the scheme in [5], the scheme in [6], the standard CRC-concatenated polar codes and the modified PC polar coding in Algorithm 1.
IV. SIMULATION RESULTS
A. PERFORMANCE COMPARISON
Fig. 9 compares the FER of the proposed structure with the scheme in [5], the scheme in [6], the standard CRC-concatenated polar code and the modified PC polar coding in Algorithm 1. For all the schemes, the total number of check bits is determined by Eq. (4). The code lengths are set to 256, 512 and 1024. The code rate is fixed to 0.5 and the CRC polynomials are selected from [17]. Note that under the polarization weight (PW) [19] code construction, when the code lengths are 256, 512 and 1024, the corresponding $|A_m|$ are 4, 1 and 25, respectively. The list size of the SCL decoder is 8 and $\alpha = 1$. When $N = 1024$, the schemes of [5] and [6] provide FER performance similar to that of the standard CRC-concatenated polar codes. When $N = 256$ or $N = 512$, the performance of these two schemes is inferior to the standard one. This means that when $|A_m|$ is too small compared with $|A|$, these two schemes are ineffective. Meanwhile, the performance of the proposed scheme is very close to that of the modified PC polar coding scheme, irrespective of the code length.
In Fig. 10, we give the performance curves of the proposed scheme, the original PC coding [12], the modified PC coding and the standard CRC-concatenated scheme. The code lengths of the considered polar codes are $N \in \{256, 512, 1024\}$ and the code rate is fixed to 2/3. The total number of check bits is fixed to $\log_2 N$. The list size is 16. The modified PC coding provides the best performance among all the schemes, irrespective of the code length. The performance of the proposed scheme is very close to that of the modified PC coding. All the PC coding schemes perform better than the standard CRC-concatenated scheme.
FIGURE 11. Performance comparison of the proposed structure with the scheme in [5], the scheme in [6], the standard CRC-concatenated polar codes and the modified PC polar coding in Algorithm 1.
In Fig. 11, we give the bit error rate (BER) of the same schemes considered in Fig. 9. The simulation parameters are identical to those used in Fig. 9. The modified PC coding scheme again achieves the best BER, and the BER of the proposed CRC-aided PC polar coding is very close to it.
[12], the modified PC coding in Algorithm 1 and the standard CRC-concatenated polar codes. The length of polar codes is 4096.
In Fig. 12, the length of the adopted polar codes is changed to 4096. The code rate is 0.5 and the list size of the SCL decoder is 8. For each scheme, the total number of check bits is based on Eq. (4). The proposed scheme also outperforms the standard CA-SCL decoder in terms of both FER and BER. This further shows the universality of the proposed scheme.
To determine the limits of the practicality of the proposed method, we give Fig. 13, where the code rate of the polar codes is 3/4. The code length is $N \in \{256, 1024\}$. The list size of the SCL decoder is 8. The proposed scheme cannot provide performance gains as impressive as those at low code rates. This is because at a high code rate the number of FPCB, which can correct decoding errors, is reduced.
Finally, we also consider the behavior of the decoding scheme with coded modulation. In Fig. 14 and Fig. 15, we give the performance curves for high-order modulation with Gray mapping. Fig. 14 adopts 16-QAM and Fig. 15 uses 64-QAM. In Fig. 14, the list size of the SCL decoder is fixed to 8, the code rate is 0.5 and the code length is 1024. In Fig. 15, the list size of the SCL decoder is 32, the code rate is 1/3 and the code length is 1536. The puncturing method proposed in [24] is adopted. For each scheme, the number of check bits is fixed to 8. This means that the information sets of all three schemes are identical, and thus their puncturing patterns are also the same, which ensures the fairness of the comparison. With high-order modulation, the proposed scheme also provides an impressive performance gain over the standard CA-SCL decoder. As in the BPSK case, the modified PC coding provides the best performance among all the considered schemes.
B. DETECTING EFFICIENCY COMPARISON
In Table 1, we give the undetecting rate ($P_{ud}$) of all the CRC-concatenated polar codes depicted in Fig. 7. For the PC polar codes, the undetecting rate can be regarded as 100% since all the check bits are used to correct errors. The length of the considered polar code is 1024. The number of CRC bits is determined by Eq. (4). The CRC polynomial is selected based on [17]. The multi-CRC scheme [7] partitions the information vector into 2 parts for protection. The list size of the SCL decoder is 8 and $\alpha$ is set to 1. The results in Table 1 are based on statistics over 3000 erroneously decoded frames.
Using these parameters, we construct polar codes with the PW algorithm [19]. Note that when the code rate is 1/2, $|A_m|$ is 25, and when the code rate is changed to 2/3, $|A_m|$ is 7. From Table 1, one can find that at low SNR (1 dB), the undetecting rate of the schemes in [5] and [6] is higher than 40%, and that of the multi-CRC scheme is higher than 20%. For both the standard and the proposed scheme, however, the undetecting rates are lower than 15%. At high SNR (2 dB), there is a remarkable improvement for the schemes in [5] and [6], but the proposed scheme is still the best one. This is because, compared with the standard CRC-concatenated scheme, the proposed scheme protects fewer bits and its detecting efficiency is therefore higher. The schemes in [5] and [6] only consider the polar codewords with MHW, and thus they become ineffective when the SNR is low or $|A_m|$ is small. For the multi-CRC scheme, when the total number of CRC bits is fixed, the detecting effect of the partitioned CRC is inferior to that of a single complete CRC.
We can also observe that when |A m | is small compared with |A|, the undetecting rate of schemes in [6] and [5] will be high. However, the proposed scheme can provide satisfactory undetecting rate regardless of |A m |. Overall, the proposed scheme can provide the best detecting effect, irrespective of the code length, code rate or |A m |. This reflects the universality of the proposed scheme.
C. THROUGHPUT OF THE PROPOSED SCHEME
When HARQ or the adaptive SCL decoding technique is used, throughput is an important metric for designing the coding scheme. The throughput of a decoder can be defined as [23]
$$\eta = R(1 - P_{fer}) \qquad (15)$$
where $P_{fer}$ is the FER of the decoder. In Fig. 16, we give the throughput of the proposed coding scheme and the CA-SCL decoder. The SNR (i.e., $E_b/N_0$) is 2 dB. The code length is 1024 and the number of check bits is 8. The list size is 8. If throughput is taken as the metric, the optimal code rate for both the CA-SCL decoder and the proposed coding scheme is 0.65. At higher code rates, the proposed scheme provides better throughput than the CA-SCL decoder.
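The throughput definition in Eq. (15) translates directly into code. In the sketch below, the FER values are hypothetical placeholders used only to show how a throughput-optimal rate would be picked from simulated points; they are not results from this paper.

```python
def throughput(rate, fer):
    """Throughput eta = R * (1 - P_fer), as in Eq. (15)."""
    return rate * (1.0 - fer)


def best_rate(fer_by_rate):
    """Return the code rate that maximizes throughput over the given points."""
    return max(fer_by_rate, key=lambda r: throughput(r, fer_by_rate[r]))


if __name__ == "__main__":
    # Hypothetical (rate -> FER) points, for illustration only.
    example = {0.5: 0.001, 0.6: 0.02, 0.7: 0.3}
    for r, f in example.items():
        print(r, throughput(r, f))
    print("best rate:", best_rate(example))
```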
V. CONCLUSION
In this paper, we proposed a CRC-aided PC polar coding structure. We determined the numbers of the two types of check bits and jointly designed their locations. The proposed scheme can terminate a failed decoding attempt early and offers an impressive performance gain over standard CRC-concatenated polar codes, together with a better error-detecting effect.
