The recently proposed unary error correction (UEC) and Elias gamma error correction (EGEC) codes facilitate the near-capacity joint source and channel coding (JSCC) of symbol values selected from large alphabets at a low complexity. Despite their large alphabet, these codes were only designed for a limited range of symbol value probability distributions. In this paper, we generalize the family of UEC and EGEC codes to the class of Rice and exponential Golomb error correction codes, which have a much wider applicability, including the symbols produced by the H.265 video codec, the letters of the English alphabet, and in fact any arbitrary monotonic unbounded source distribution. Furthermore, the practicality of the proposed codes is enhanced to allow a continuous stream of symbol values to be encoded and decoded using only fixed-length system components. We explore the parameter space to offer beneficial tradeoffs between error correction capability, decoding complexity, and transmission energy, duration, and bandwidth over a wide range of operating conditions. In each case, we show that our codes offer significant performance improvements over the best of several state-of-the-art benchmarkers. In particular, our codes achieve the same error correction capability, transmission energy, transmission duration, and transmission bandwidth as a variable length error-correction code benchmarker, while reducing the decoding complexity by an order of magnitude. In comparison with the best of the other JSCC and separate source and channel coding benchmarkers, our codes consistently offer $E_b/N_0$ gains of between 0.5 and 1.0 dB, which only appear modest because the system operates close to capacity. These improvements are achieved for free, since they do not come at the cost of increasing the transmission energy, transmission duration, transmission bandwidth, or decoding complexity.
$a$, $b$, $c$: Bit or LLR vector lengths.
$C$: DCMC capacity.
$\mathbf{d}$: Stream of symbols at the transmitter.
$\mathbf{x}$, $\mathbf{t}$: Stream of sub-symbols at the transmitter.
$\mathbf{y}$, $\mathbf{z}$, $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$: Bit vectors at the transmitter.
$\hat{\mathbf{d}}$: Stream of decoded symbols.
$\hat{\mathbf{x}}$, $\hat{\mathbf{t}}$: Stream of decoded sub-symbols.
$\tilde{\mathbf{y}}$: Vector of a posteriori LLRs at the receiver.
$\tilde{\mathbf{z}}^a$, $\tilde{\mathbf{u}}^a$, $\tilde{\mathbf{v}}^a$, $\tilde{\mathbf{w}}^a$: Vector of a priori LLRs at the receiver.
$\tilde{\mathbf{z}}^e$, $\tilde{\mathbf{u}}^e$, $\tilde{\mathbf{v}}^e$, $\tilde{\mathbf{w}}^e$: Vector of extrinsic LLRs at the receiver.
$H$: Symbol entropy.
$k_{ExpG}$: Parameter of the ExpG code.
$l_1$, $l_2$: Average codeword length of the unary code or FLC.
$L$: Symbol alphabet cardinality.
$M_{Rice}$: Parameter of the Rice code.
$n_1$, $n_2$: Codeword length of the UEC trellis and CC trellis.
$p_1$: Probability of the value 1 in a zeta distribution.
$R^i_1$, $R^i_2$: Puncturing or doping rate of the UEC or FLC-CC.
$R^o_1$, $R^o_2$: Average coding rate of the UEC or FLC-CC.
$r_1$, $r_2$: Number of states in the UEC trellis and CC trellis.
$s$: Parameter of the zeta function.
$x_{max}$: Maximum value of sub-symbol considered by the FLC decoder.
$\eta$: Effective throughput.
I. INTRODUCTION
The encoding of multimedia information such as video and audio typically results in symbol values that are selected from large or infinite alphabets. For example, the H.265 video encoder represents source video information using transform coefficients and motion vectors [1], which correspond to a large alphabet of symbol values in the range spanning from 1 to around $L = 1000$, as shown in Figure 1a. It may be observed that these symbols obey Zipf's law [2], with low-valued symbols occurring frequently and high-valued symbols occurring infrequently. Owing to this, the occurrence of these symbol values may be modeled by a zeta probability distribution, as was previously observed for the H.264 video encoder in [3]. In order to facilitate the reliable and bandwidth-efficient transmission of multimedia information, both source coding and channel coding are required, where the state-of-the-art has evolved with the contributions listed in Table 1. Shannon [4] postulated that near-capacity operation may be achieved using Separate Source and Channel Coding (SSCC). Here, a near-capacity channel code such as a turbo code [5] or Low-Density Parity-Check (LDPC) code [6] may be combined with a separate near-entropy source code, such as an arithmetic code [7] or Lempel-Ziv code [8]. However, these near-entropy source codes are typically impractical, since they assume that infinite complexity and/or latency can be afforded. For example, both the arithmetic code and Lempel-Ziv code require accurate knowledge of the probability of occurrence of each symbol value at both the transmitter and receiver. This imposes both an excessive complexity and memory requirement when the symbols are selected from an alphabet having a large or infinite cardinality, as is typical for the encoding of multimedia information. In adaptive schemes, the transmitter and receiver learn the symbol value probabilities from the transmitted symbols, but any transmission errors result in de-synchronization between the transmitter and receiver, introducing a large number of decoding errors from that point onwards [9].

FIGURE 1. The probability distribution of (a) letters of the English alphabet and the space character, ordered according to descending probability of occurrence, and (b) symbols output from a H.265 encoder. Also shown is the finite zeta-like distribution of (1) for $p_1 = \{0.1, \ldots, 0.9\}$ and with $L = 27$ and $L = 1000$, respectively.
On the other hand, universal codes can encode source symbol values selected from large and infinite alphabets, without incurring the issues associated with near-entropy source codes. More specifically, a source code is said to be universal if it represents the source symbol vector using a bit vector that is guaranteed to have a finite average length for any monotonic source symbol probability distribution. These universal codes include the Elias Gamma code [13], Elias Omega code [13], Stout code [15], Fibonacci code [18] and the Exponential Golomb (ExpG) code [23]. These codes are capable of operating without any knowledge of the source symbol value probabilities, granting them immunity to the synchronization problems that affect both the arithmetic and Lempel-Ziv codes. However, a non-negligible amount of redundancy typically remains in the encoded bitstreams produced by these source codes, which leads to capacity loss when they are combined with a separate channel code. Motivated by this, JSCC [24] may be employed for exploiting the residual redundancy that remains after source encoding, in order to enhance the attainable error correction capability. A particular example of JSCC is constituted by the classic family of Variable Length Error-Correction (VLEC) codes [25]-[28], although they exhibit a decoding complexity that increases rapidly as the cardinality of the source symbol value alphabet increases, preventing their application to large or infinite alphabets. Against this background, the recently proposed UEC code [3] facilitates the JSCC of source symbol values selected from alphabets having large or infinite cardinalities, while maintaining near-capacity operation and imposing only a modest decoding complexity.

VOLUME 4, 2016
However, the UEC code is only practical for a subset of zeta distributions, which does not include those which best model the H.264 and H.265 symbol distributions. More specifically, since the UEC code is not a universal code [9], it produces bit vectors having an infinite average length for some zeta distributions. Motivated by this deficiency of the UEC code, we previously proposed the class of EGEC codes [22]. Since the EGEC code is a universal code created from the class of Elias Gamma (EG) source codes, it produces bit vectors having a finite average length for any monotonic probability distribution, including all zeta distributions. Despite its finite-length nature, it still produces long bit vectors for some zeta distributions, hence potentially resulting in low coding rates. To circumvent this problem, excessive puncturing may be required if a high coding rate is desired, which potentially leads to increased complexity relative to codes having higher original coding rates, as well as to a degraded error correction performance [29]. As a result, the EGEC code family was only characterized for a limited set of zeta distributions in [22]. These limitations of the UEC and EGEC codes imply that neither of them exhibits general applicability, hence indicating that further generalization is required.
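The infinite average length mentioned above can be illustrated with a short worked equation. This is a sketch for the infinite-cardinality zeta distribution $P(d) = d^{-s}/\zeta(s)$, assuming that the unary codeword representing the symbol value $d$ comprises $d$ bits:

```latex
\begin{align}
l = \sum_{d=1}^{\infty} d \cdot P(d)
  = \frac{1}{\zeta(s)} \sum_{d=1}^{\infty} d^{-(s-1)}
  = \frac{\zeta(s-1)}{\zeta(s)} .
\end{align}
```

Since $\zeta(s-1)$ diverges for $s \le 2$, the average unary codeword length is only finite for $s > 2$, which corresponds to $p_1 = 1/\zeta(s) > 1/\zeta(2) \approx 0.608$.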
Against this background, this paper extends and generalizes the UEC code family of [3] and the EGEC code of [22], so that they become attractive for encoding symbols produced by all zeta distributions, hence granting them general applicability. More specifically, we generalize the EGEC code family by extending its schematic to produce the class of Exponential Golomb Error Correction (ExpGEC) codes, as shown in Figure 2. We also extend the EGEC code to produce the Rice Error Correction (RiceEC) code family, which represents a generalization of the UEC code. We show that the proposed ExpGEC and RiceEC codes offer a better error correction performance over a wider range of source symbol distributions than their benchmarkers. In particular, we consider the full range of zeta distributions for the first time, allowing us to best represent a wider range of source symbols, including the distribution of the English alphabet, which has a cardinality of $L = 27$ when including the space character, as characterized in Figure 1b. Furthermore, we consider the specific distribution of symbols produced by the H.265 video encoder, which we have not previously considered. Motivated by this, we critically appraise the infinite-cardinality zeta distribution of our previous work in the specific scenario of finite-cardinality zeta-like distributions, where symbols can assume values in the range $\{1, \ldots, L\}$. We investigate how a specific choice of the parameter $L$ affects the source symbol distribution, as well as how the parameters of the proposed codes may be best selected for optimizing their error correction performance. For the first time, we also compare the proposed codes and our previous codes to the classic VLEC code, although this benchmarker only has a moderate complexity for low values of $L$. Additionally, we further improve upon our previous work by improving the practicality of the proposed codes.
More specifically, the EGEC scheme of [22] requires five different interleavers, each having different lengths and therefore designs, which change from frame to frame. Not only is this arrangement challenging to implement, but the overall error correction performance is dominated by the worst-case performance, associated with the shortest interleaver lengths. By contrast, our proposed ExpGEC and RiceEC schemes are designed for encoding continuous streams of symbols, while maintaining constant interleaver lengths in every frame, hence significantly improving the associated practicality and performance. Furthermore, we prove that these modifications guarantee synchronization between the transmitter and receiver.

FIGURE 2. The proposed ExpGEC/RiceEC schemes. Here, the buffers facilitate operation on the basis of a stream of source symbols, while maintaining fixed-length designs for the interleavers $\pi_1$-$\pi_5$.
As shown in Figure 3, we commence our discourse by detailing the operation and characteristics of the proposed ExpGEC and RiceEC codes, where Sections II and III consider the operation of the encoders and decoders, respectively. Section IV describes the arrangement proposed for processing streams of symbols, while maintaining fixed interleaver lengths. Section V characterizes the error correction performance of the proposed codes, demonstrating that they are superior to the best of several benchmarkers for a wide variety of source distributions, including the full range of zeta distributions, as well as the H.265 and English alphabet distributions. Finally, we offer our conclusions in Section VI.
II. EXPGEC AND RICEEC ENCODER
In this section, we introduce the ExpGEC and RiceEC encoders, which are shown in Figure 2. We commence in Section II-A by describing how the proposed ExpGEC and RiceEC encoders decompose the source symbols into sub-symbols. Following this, Sections II-B and II-C describe how the sub-symbols are encoded by the UEC encoder and the Fixed Length Code-Convolutional Code (FLC-CC) encoder, respectively. Finally, Section II-D describes how the ExpGEC and RiceEC encoders may be integrated into a transmitter, as shown in Figure 2.
A. SOURCE SYMBOLS AND THEIR DECOMPOSITION INTO SUB-SYMBOLS
As portrayed in Figure 2 
In contrast to the infinite symbol alphabet of our previous work [22], we consider source symbols which are randomly selected from a finite symbol set, as described in Section I. This introduces an additional parameter, namely the cardinality $L$ of the source symbol set. More specifically, each RV $D_i$ adopts a value in the set $\{1, 2, 3, \ldots, L\}$, according to a particular source symbol distribution. Since Figure 1 shows that the H.265 symbols and the letters of the English alphabet obey Zipf's law, we consider a finite-cardinality zeta-like source symbol distribution, where the probability $\Pr(D_i = d) = P(d)$ is given by

$$P(d) = \frac{d^{-s}}{\zeta_L(s)}, \quad d \in \{1, 2, \ldots, L\}, \tag{1}$$

with $\zeta_L(s) = \sum_{d=1}^{L} d^{-s}$ being the finite Riemann zeta-like function. Here, the variable $s > 1$ is related to the probability of an RV $D_i$ adopting the value 1 according to

$$p_1 = \Pr(D_i = 1) = \frac{1}{\zeta_L(s)},$$

which parameterizes the finite zeta-like distribution. The entropy of the source symbols is given by

$$H_D = -\sum_{d=1}^{L} P(d) \log_2 P(d).$$
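The finite zeta-like distribution can be explored numerically. The following is a minimal sketch, assuming the normalization $\sum_{d=1}^{L} d^{-s}$; the function names are hypothetical, and the bisection bounds for recovering $s$ from $p_1$ are illustrative choices rather than values taken from this paper.

```python
import math

def finite_zeta(s, L):
    """Finite zeta-like normaliser: sum of d^(-s) for d = 1..L."""
    return sum(d ** -s for d in range(1, L + 1))

def zeta_pmf(s, L):
    """Return [P(1), ..., P(L)] for the finite zeta-like distribution."""
    z = finite_zeta(s, L)
    return [d ** -s / z for d in range(1, L + 1)]

def entropy_bits(pmf):
    """Entropy in bits per symbol."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

def s_from_p1(p1, L, lo=1e-9, hi=50.0, iters=200):
    """Bisection for s such that P(1) = p1; P(1) grows monotonically with s.
    The search interval [lo, hi] is an assumption of this sketch."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if 1.0 / finite_zeta(mid, L) < p1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

pmf = zeta_pmf(s=2.0, L=27)
print(pmf[0], entropy_bits(pmf))
```

Selecting `L = 27` mirrors the English alphabet example, while `L = 1000` would mirror the H.265 example.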
Note that in the special case of the unary code where $M_{Rice} = 1$, the length of each corresponding codeword becomes equal to the value of the encoded symbol, $l_{Unary}(d_i) = d_i$. When the source symbols obey the finite zeta-like distribution, the average length of the encoded ExpG, Rice, and unary codewords is $l = \sum_{d=1}^{L} P(d)\, l(d)$, where $l(d)$ is the length of the codeword that represents the symbol value $d$.
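As shown using the dashed lines in Table 2, each ExpG and Rice codeword can be viewed as a concatenation of a unary prefix $y_i = \mathrm{Unary}(x_i)$ and Fixed Length Code (FLC) suffix $u_i = \mathrm{FLC}(t_i)$, where $x_i$ and $t_i$ are the sub-symbols derived from a particular symbol $d_i$. For each symbol $d_i$, the value of the sub-symbol $x_i$ is given by

ExpG: $x_i = \lfloor \log_2(d_i + 2^{k_{ExpG}} - 1) \rfloor - k_{ExpG} + 1$,
Rice: $x_i = \lceil d_i / M_{Rice} \rceil$.

Here, the sub-codeword $y_i = \mathrm{Unary}(x_i)$ comprises $(x_i - 1)$ zeros, followed by a single logical one-valued bit. Likewise, the value of the sub-symbol $t_i$ is given by

ExpG: $t_i = d_i + 2^{k_{ExpG}} - 1 - 2^{x_i + k_{ExpG} - 1}$,
Rice: $t_i = (d_i - 1) \bmod M_{Rice}$.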
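Motivated by the observation that each ExpG and Rice codeword comprises a unary prefix and an FLC suffix, the ExpGEC/RiceEC encoder of Figure 2 decomposes the symbol stream $\mathbf{d}$ into the sub-symbol streams $\mathbf{x}$ and $\mathbf{t}$. Each sub-symbol $x_i$ in the stream $\mathbf{x}$ is encoded by the UEC encoder of Figure 2, which is based on the unary code, as described in Section II-B. Meanwhile, each sub-symbol $t_i$ in the stream $\mathbf{t}$ is encoded by the FLC-CC encoder, which is based on the FLC code, as described in Section II-C.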
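The decomposition into sub-symbols can be sketched as follows. This is an illustrative sketch that assumes the conventional order-$k$ ExpG and Rice mappings for symbol values starting from 1, which are consistent with the FLC lengths $x_i + k_{ExpG} - 1$ and $\log_2(M_{Rice})$ quoted in this section; the function names and the example symbol vector are hypothetical.

```python
def split_expg(d, k):
    """Split a symbol d >= 1 into (x, t) for an order-k ExpG code (assumed mapping)."""
    n = d + (1 << k) - 1             # offset so that n >= 2^k
    x = n.bit_length() - k           # floor(log2(n)) + 1 - k
    t = n - (1 << (x + k - 1))       # remove the leading one-valued bit of n
    return x, t

def desplit_expg(x, t, k):
    """Inverse of split_expg."""
    return t + (1 << (x + k - 1)) - (1 << k) + 1

def split_rice(d, m):
    """Split a symbol d >= 1 into (x, t) for a Rice code with parameter m."""
    q, t = divmod(d - 1, m)
    return q + 1, t

def desplit_rice(x, t, m):
    """Inverse of split_rice."""
    return (x - 1) * m + t + 1

d = [4, 1, 7, 5, 14, 5, 1, 1]        # hypothetical symbols
pairs = [split_expg(di, k=1) for di in d]
x = [p[0] for p in pairs]
t = [p[1] for p in pairs]
print(x, t)
```

The hypothetical symbol vector is chosen so that `x` reproduces the sub-symbol sequence [2, 1, 3, 2, 3, 2, 1, 1] used in the worked examples of Section II. Setting `m = 1` in `split_rice` yields `x = d` and `t = 0`, which is the unary special case.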
B. UEC SUB-SYMBOL ENCODER
As shown in Figure 2 , the input of the UEC encoder is provided by the stream of sub-symbols x = [x i ]. This stream may be modeled as a realization of a stream of IID RVs X = [X i ]. In the scenario where the RV D i obeys the finite zeta-like distribution of (1), the probability Pr(X i = x) = P(x) is given by
while the entropy of each RV X i is given by
Each sub-symbol x i in the stream x is encoded by the unary encoder, which outputs the corresponding x i -bit unary sub-codeword y i = Unary(x i ), according to Table 2 . The average length l 1 of these unary sub-codewords is given by
ExpG:
In [3], the unary code was employed for encoding the source symbols $d_i$ directly, but this produces long average codeword lengths when $p_1$ is low, according to (5). However, in the scheme of Figure 2, the unary encoder is used for encoding the sub-symbols $x_i$ instead. Since the sub-symbol probability distributions $P(x)$ of (12) and (13) are skewed towards the most likely symbol value $x = 1$, the UEC code may be used for their encoding without suffering from an excessive average codeword length when $p_1$ is low. As shown in Figure 2, the codewords in the stream produced by the unary encoder are concatenated and partitioned into a succession of bit-vectors $\mathbf{y} = [y_j]_{j=1}^{b}$, having a fixed length of $b$ bits, as will be described in Section IV-A. For example, given the sequence $\mathbf{x} = [2, 1, 3, 2, 3, 2, 1, 1]$ comprising 8 sub-symbols, the unary encoder produces the sequence $\mathbf{y} = 011001010010111$ comprising $b = 15$ bits. The bit vector $\mathbf{y}$ is entered into the trellis encoder of Figure 2, which operates on the basis of the UEC trellis of Figure 4. Here, UEC trellises comprising only $r_1 = 4$ states are adopted, since our previous work [22] showed that this is sufficient for avoiding any significant capacity loss despite its low complexity, as will be characterized for the proposed codes in Section V-B. However, the option to extend the trellis to more states remains open [3], facilitating the elimination of even more capacity loss, at the cost of an increased decoding complexity. Each bit $y_j$ of the vector $\mathbf{y}$ forces the trellis encoder to transition from its previous state $m_{j-1} \in \{1, \ldots, r_1\}$ into its next state $m_j$, according to

$$m_j = \begin{cases} 1 + \mathrm{odd}(m_{j-1}) & \text{if } y_j = 1, \\ \min[m_{j-1} + 2,\; r_1 - \mathrm{odd}(m_{j-1})] & \text{if } y_j = 0. \end{cases}$$
Here, the function odd(·) returns zero if its operand is even, or one if it is odd. Note that each unary codeword comprises a sequence of zero-valued bits which is terminated by a single one-valued bit. The trellis structure of Figure 4 exploits this for maintaining synchronization between the unary-encoded symbols and the path through the trellis. In particular, the final one-valued bit $y_j$ in each unary codeword is guaranteed to trigger a transition either to the state $m_j = 1$ or to $m_j = 2$.
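The unary encoding step of the worked example can be reproduced with a few lines of code. This is a minimal sketch in which the function names are hypothetical, and bit vectors are represented as strings for readability.

```python
def unary_encode(x):
    """Unary codeword of x >= 1: (x - 1) zeros followed by a single one."""
    return "0" * (x - 1) + "1"

def unary_stream(xs):
    """Concatenate the unary codewords of a sequence of sub-symbols."""
    return "".join(unary_encode(x) for x in xs)

y = unary_stream([2, 1, 3, 2, 3, 2, 1, 1])
print(y, len(y))
```

The number of one-valued bits in `y` equals the number of encoded sub-symbols, which is the property that the trellis exploits for maintaining synchronization.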
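In the case where the unary-encoded bit vector $\mathbf{y}$ comprises a complete sequence of unary codewords and the start state $m_0 = 1$ is employed, the path $\mathbf{m} = [m_j]_{j=0}^{b}$ through the trellis may be modeled as a realization of a vector of RVs $\mathbf{M} = [M_j]_{j=0}^{b}$. For the example bit vector $\mathbf{y}$ provided above, the path is $\mathbf{m} = [1, 3, 2, 1, 3, 3, 2, 4, 1, 3, 3, 2, 4, 1, 2, 1]$. Here, the probability of each state being selected, $\Pr(M_j = m \mid M_{j-1} = m') = P(m|m')$, is given by [3, eq. (9)]. The knowledge of these conditional transition probabilities $P(m|m')$ may be exploited to aid the receiver, as described in Section III-B.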
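Depending on the path selected through the UEC trellis, each of the bits in the unary-encoded bit vector $\mathbf{y}$ is encoded using an $n_1$-bit codeword $\mathbf{z}_j$, and these codewords are concatenated to form the $bn_1$-bit vector $\mathbf{z} = [z_k]_{k=1}^{bn_1}$. More specifically, each codeword is selected according to

$$\mathbf{z}_j = \begin{cases} \mathbf{c}_{\lceil m_{j-1}/2 \rceil} & \text{if } y_j = \mathrm{odd}(m_{j-1}), \\ \bar{\mathbf{c}}_{\lceil m_{j-1}/2 \rceil} & \text{if } y_j \neq \mathrm{odd}(m_{j-1}), \end{cases}$$

where $\mathbf{c}_{\lceil m_{j-1}/2 \rceil}$ is a codeword from the set $\mathbf{C}$ of $r_1/2$ codewords and $\bar{\mathbf{c}}_{\lceil m_{j-1}/2 \rceil}$ is its bitwise complement.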
For example, the trellis of Figure 4 uses the set of codewords $\mathbf{C} = [01; 11]$, as well as the complementary set $\bar{\mathbf{C}} = [10; 00]$. In the case of the example path $\mathbf{m}$ through the trellis provided above, the UEC-encoded bit vector $\mathbf{z} = 101110100011010010001101000110$ comprising $bn_1 = 30$ bits is produced, when using the $r_1 = 4$-state, $n_1 = 2$-bit trellis of Figure 4. Since the top and bottom halves of the UEC trellis use complementary codewords, the UEC-encoded bits of $\mathbf{z}$ are guaranteed to have equiprobable binary values. Due to this equiprobability, the average coding rate of the UEC encoder is given by
Here the superscript 'o' is used to indicate this coding rate relates to the outer code of a serial concatenation, namely the UEC code of Figure 2 .
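The UEC trellis encoding of the worked example can be sketched in code. The state-transition and codeword-selection rules used below are assumptions of this sketch: they follow the UEC trellis structure of [3] with the bit polarity adapted to unary codewords that end in a one-valued bit, and they were checked only against the example vectors $\mathbf{y}$ and $\mathbf{z}$ quoted in this section. The function names are hypothetical.

```python
def odd(m):
    """Return 1 if m is odd, 0 if it is even."""
    return m % 2

def uec_trellis_encode(y, codewords=("01", "11"), r=4, m0=1):
    """Encode the bit string y with an r-state UEC trellis.
    Returns the encoded bit string z and the state path [m_0, ..., m_b]."""
    complement = str.maketrans("01", "10")
    m, path, out = m0, [m0], []
    for bit in map(int, y):
        c = codewords[(m + 1) // 2 - 1]               # c_{ceil(m/2)}
        out.append(c if bit == odd(m) else c.translate(complement))
        if bit == 1:                                  # end of a unary codeword
            m = 1 + odd(m)
        else:                                         # inside a unary codeword
            m = min(m + 2, r - odd(m))
        path.append(m)
    return "".join(out), path

z, path = uec_trellis_encode("011001010010111")
print(z)
print(path)
```

Every one-valued input bit returns the encoder to state 1 or 2, so a path through a complete sequence of unary codewords always ends in one of these two states.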
C. FLC-CC SUB-SYMBOL ENCODER
As shown in Figure 2 , the FLC-CC sub-symbol encoder is used for encoding the sub-symbol stream t = [t i ]. Here, the FLC encoder of Figure 2 represents each sub-symbol t i using a codeword u i , which is given by the fixed-point binary representation of t i , having a particular length that may depend on the particular value of the corresponding sub-symbol x i , as described in Section II-B. Motivated by this, we may model the sub-symbol stream t as a realization of a stream of RVs T = [T i ], where each RV T i is dependent on the corresponding RV X i . More specifically, in the case where the RV D i obeys the finite zeta-like distribution of (1), the joint probability Pr(
where $0 \le t < 2^{x-1+k_{ExpG}}$ for the case of the ExpGEC code and $0 \le t < M_{Rice}$ for the case of the RiceEC code. These joint probabilities may be used for obtaining expressions for the corresponding conditional probabilities, which may be exploited in the receiver to aid FLC-CC decoding. In the case of the finite zeta-like distribution, the conditional probability $\Pr(T_i = t \mid X_i = x) = P(t|x)$ is given by
where $0 \le t < 2^{x-1+k_{ExpG}}$ for the case of the ExpGEC code and $0 \le t < M_{Rice}$ for the RiceEC code. Finally, the conditional entropy of the RV $T_i$ is given by
As described in Section II-B, each FLC codeword $\mathbf{u}_i$ has the length $x_i + k_{ExpG} - 1$ in the case of the ExpGEC code, where $x_i$ is the corresponding sub-symbol in the stream $\mathbf{x}$. Owing to this, the FLC-CC encoder requires knowledge of the sub-symbol stream $\mathbf{x}$, as shown in Figure 2. In the case of the RiceEC code, the length of the FLC codewords is fixed at $\log_2(M_{Rice})$ and so the knowledge of $\mathbf{x}$ is not required during FLC-CC encoding in this case. When the symbols of $\mathbf{d}$ obey the finite zeta-like distribution of (1), the average length $l_2$ of the FLC codewords is given by

ExpG: $l_2 = l_1 + k_{ExpG} - 1$,
Rice: $l_2 = \log_2(M_{Rice})$.
Following FLC encoding, the resultant FLC codewords are concatenated together to form a bit-stream, which is then partitioned into bit-vectors $\mathbf{u} = [u_j]_{j=1}^{c}$ comprising $c$ bits, as will be detailed in Section IV-B. In our example of using the $k_{ExpG} = 1$ ExpGEC code to encode the sub-symbol vector $\mathbf{x} = [2, 1, 3, 2, 3, 2, 1, 1]$ and the corresponding sub-symbols of $\mathbf{t}$ according to Table 2, the $c = 16$-bit vector $\mathbf{u} = 0100001011110001$ is obtained. As shown in Figure 2, the FLC-encoded bit vector $\mathbf{u}$ is interleaved in the block $\pi_3$, in order to provide the bit vector $\mathbf{v} = [v_j]_{j=1}^{c}$. This is then encoded by an $r_2$-state, $n_2$-bit non-systematic recursive CC encoder having a design selected from [3, Table II], in order to obtain the bit vector $\mathbf{w} = [w_k]_{k=1}^{cn_2}$. The CC codes of [3, Table II] were shown in our previous work [22] to mitigate capacity loss. More explicitly, while the bits of $\mathbf{u}$ are not guaranteed to have equiprobable values, the non-systematic recursive nature of these CCs guarantees equiprobable values for the bits of $\mathbf{w}$, which mitigates capacity loss [3]. For example, the interleaver $\pi_3 = [6, 14, 11, 10, 3, 1, 5, 8, 13, 16, 15, 2, 4, 7, 12, 9]$ may be employed for transforming the example $c = 16$-bit vector $\mathbf{u}$ provided above into the $c = 16$-bit vector $\mathbf{v} = 0011000001010111$, where $v_j = u_{\pi_3(j)}$. When the $r_2 = 4$-state, $n_2 = 2$-bit CC encoder of [3, Table II] is employed, this bit vector $\mathbf{v}$ is encoded into the $cn_2 = 32$-bit FLC-CC-encoded bit vector $\mathbf{w} = 00001101010000000011100001110110$. The octally represented generator polynomials invoked for this CC encoder are (4, 7), while the octal feedback polynomial is 6. Finally, the average coding rate of the FLC-CC encoder is given by
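The interleaving and CC encoding steps of this example can be reproduced as follows. This is a sketch under stated assumptions: the mapping of the octal polynomials onto shift register taps (most significant bit tapping the current register input) is an assumption that was checked only against the example vectors $\mathbf{v}$ and $\mathbf{w}$ above, the encoder is assumed to start in the all-zero state, and the function names are hypothetical. The vector `u` is recovered here by inverting the permutation.

```python
def interleave(u, pi):
    """v_j = u_{pi(j)}, where pi holds 1-based indices."""
    return "".join(u[p - 1] for p in pi)

def deinterleave(v, pi):
    """Inverse of interleave: places v_j at position pi(j)."""
    u = [None] * len(v)
    for j, p in enumerate(pi):
        u[p - 1] = v[j]
    return "".join(u)

def cc_encode(v):
    """4-state, rate-1/2 recursive non-systematic CC encoder with
    octal feedback polynomial 6 and octal generator polynomials (4, 7)."""
    s1 = s2 = 0                          # shift register, initially all-zero
    out = []
    for bit in map(int, v):
        a = bit ^ s1                     # feedback 6 = 110b: input and s1
        out.append(str(a))               # generator 4 = 100b: a only
        out.append(str(a ^ s1 ^ s2))     # generator 7 = 111b: a, s1 and s2
        s1, s2 = a, s1
    return "".join(out)

pi3 = [6, 14, 11, 10, 3, 1, 5, 8, 13, 16, 15, 2, 4, 7, 12, 9]
v = "0011000001010111"
u = deinterleave(v, pi3)
w = cc_encode(v)
print(u)
print(w)
```

The first 15 bits of the recovered `u` are consistent with FLC codewords having the lengths $x_i + k_{ExpG} - 1$ for $\mathbf{x} = [2, 1, 3, 2, 3, 2, 1, 1]$ and $k_{ExpG} = 1$.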
D. INTEGRATION INTO A TRANSMITTER
The RiceEC and ExpGEC encoders may be integrated into a transmitter by serially concatenating them with an inner code. In the scheme of Figure 2 , the bit vectors z and w provided by the UEC and FLC-CC encoders are interleaved in the blocks π 1 and π 4 , before being encoded by 2-state Unity Rate Coding (URC) encoders, which behave as accumulators.
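The accumulator behavior of the 2-state URC encoder may be sketched as follows. This is a minimal illustration assuming an all-zero initial state; the function name is hypothetical.

```python
def urc_encode(bits):
    """2-state unity-rate accumulator: out_j = in_j XOR out_{j-1}, out_0 = 0."""
    acc, out = 0, []
    for b in bits:
        acc ^= b
        out.append(acc)
    return out

encoded = urc_encode([1, 0, 1, 1, 0])
print(encoded)
```

Since each output bit depends on all previous input bits, the code is recursive, while its unity rate means that no redundancy is added.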
As discussed in Section III-A, these URC codes facilitate iterative decoding in the receiver, enabling near-capacity operation [30]. Following URC encoding, the pair of resultant bit vectors are interleaved and optionally punctured or doped in the blocks $\pi_2$ and $\pi_5$ of Figure 2, before they are multiplexed and QPSK modulated onto the channel using Gray mapping. Here, puncturing or doping may be employed for achieving a particular effective target throughput $\eta$ for the scheme. As we will discuss further in Section V-C, puncturing and doping may also be employed to provide unequal error protection for the UEC and FLC-CC parts of the schemes, in order to facilitate operation at lower channel SNRs. Puncturing is achieved in the blocks $\pi_2$ or $\pi_5$ of Figure 2 by removing the appropriate number of bits from the end of the interleaved bit vector. By contrast, doping is achieved in the blocks $\pi_2$ or $\pi_5$ by duplicating the appropriate number of bits from the end of the interleaved bit vector and concatenating them. The puncturing and doping rates of $\pi_2$ and $\pi_5$ are represented by $R^i_1$ and $R^i_2$, respectively, which quantify the ratio of input bits to output bits. Here, a value of $R^i > 1$ represents puncturing, while $R^i < 1$ represents doping. For example, $R^i = 0.75$ implies that one third of the original bits have been duplicated, while $R^i = 1.25$ implies that one fifth of the original bits have been discarded. Here, the superscript $i$ indicates relevance to the inner code. The transmitter's overall effective throughput in bits per symbol is given by
where M mod is the number of constellation points used by the modulator, which is M mod = 4 in the case of QPSK.
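The puncturing and doping operations may be sketched as follows. This is a minimal sketch, assuming that the rate is the ratio of input bits to output bits and that the affected bits are taken from the end of the vector; the function name is hypothetical, and the sketch only supports doping that at most doubles the vector length.

```python
def puncture_or_dope(bits, rate):
    """Return round(len(bits) / rate) bits.
    rate > 1 punctures by discarding bits from the end of the vector;
    rate < 1 dopes by appending copies of the last bits of the vector."""
    n_in = len(bits)
    n_out = round(n_in / rate)
    if n_out <= n_in:
        return bits[:n_out]
    extra = n_out - n_in
    assert extra <= n_in, "this sketch duplicates each bit at most once"
    return bits + bits[n_in - extra:]

punctured = puncture_or_dope(list(range(20)), 1.25)   # one fifth discarded
doped = puncture_or_dope(list(range(12)), 0.75)       # one third duplicated
print(len(punctured), len(doped))
```

Integer labels are used in place of bit values in this usage example, so that the discarded and duplicated positions are visible in the output.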
III. EXPGEC AND RICEEC DECODER
In this section, we detail the operation of the ExpGEC and RiceEC decoders of Figure 2. Section III-A discusses the integration of the UEC and FLC-CC sub-symbol decoders of the ExpGEC and RiceEC decoders into the receiver of Figure 2. Following this, the operation of the UEC sub-symbol decoder is described in Section III-B, while the operation of the FLC-CC sub-symbol decoder is described in Section III-C.
A. INTEGRATION INTO A RECEIVER
In the receiver of Figure 2, the QPSK demodulator converts each received QPSK symbol into a pair of Logarithmic Likelihood Ratios (LLRs), which pertain to the bits at the transmitter. These LLRs convey the likelihood of the corresponding bits being 1-valued or 0-valued. More specifically, each LLR is defined by $\mathrm{LLR} = \ln \frac{\Pr(\mathrm{bit}=1)}{\Pr(\mathrm{bit}=0)}$, where a large positive-valued LLR represents a high confidence that the bit has the value of logical one, while a large negative-valued LLR represents a high confidence that the bit is logical zero-valued. The LLRs are then demultiplexed and entered into the blocks $\pi_2^{-1}$ and $\pi_5^{-1}$ of Figure 2, which perform de-puncturing or de-doping, as appropriate. Here, de-puncturing is achieved by replacing the LLR of each bit that was removed during puncturing with a zero-valued LLR, which reflects the absence of any knowledge about the value of the corresponding bit. By contrast, de-doping is achieved by removing the LLRs pertaining to the replicas of the duplicated bits and summing them into the LLRs pertaining to the corresponding duplicated bits. Following de-puncturing or de-doping, the resultant LLRs are deinterleaved and forwarded to the URC decoders, which iteratively exchange their vectors of LLRs with the UEC decoder or the FLC-CC decoder, as appropriate. More specifically, the UEC trellis decoder and first URC decoder perform iterative decoding by exchanging the LLR vectors $\tilde{\mathbf{z}}^a$ and $\tilde{\mathbf{z}}^e$, which pertain to the bits in $\mathbf{z}$, as shown in Figure 2. Likewise, the CC decoder and second URC decoder perform iterative decoding by exchanging the LLR vectors $\tilde{\mathbf{w}}^a$ and $\tilde{\mathbf{w}}^e$, which pertain to the bits of $\mathbf{w}$. The UEC trellis decoder, CC decoder and URC decoder of Figure 2 invoke the Logarithmic Bahl-Cocke-Jelinek-Raviv (Log-BCJR) algorithm [5] for converting the input a priori LLR vector into the extrinsic LLR output vector, as may be characterized by their EXtrinsic Information Transfer (EXIT) functions, exemplified in Figure 5. Before initiating the first decoding iteration, the LLR vectors $\tilde{\mathbf{w}}^e$ and $\tilde{\mathbf{z}}^e$ are populated with zero-valued LLRs. The URC decoders then invoke the Log-BCJR algorithm for converting the LLR vectors provided by the demodulator into extrinsic LLR vectors that can be used for iterative decoding. The deinterleavers $\pi_1^{-1}$ and $\pi_4^{-1}$ of Figure 2 convert these extrinsic LLR vectors into the a priori LLR vectors $\tilde{\mathbf{z}}^a = [\tilde{z}^a_k]_{k=1}^{bn_1}$ and $\tilde{\mathbf{w}}^a = [\tilde{w}^a_k]_{k=1}^{cn_2}$, respectively. The UEC decoder and FLC-CC decoder then operate as described in Sections III-B and III-C, in order to generate the extrinsic LLR vectors $\tilde{\mathbf{z}}^e = [\tilde{z}^e_k]_{k=1}^{bn_1}$ and $\tilde{\mathbf{w}}^e = [\tilde{w}^e_k]_{k=1}^{cn_2}$. The interleavers $\pi_1$ and $\pi_4$ of Figure 2 convert these extrinsic LLR vectors into a priori LLR vectors that can be provided to the URC decoders. The iterations between the decoders continue until a fixed complexity limit has been reached or until the error-free decoding of $\hat{\mathbf{x}}$ and $\hat{\mathbf{t}}$ has been achieved, which may be detected using a Cyclic Redundancy Check (CRC) in practice. Note that the iterative UEC decoding must be completed before the iterative FLC-CC decoding can begin, since the latter takes the output $\hat{\mathbf{x}}$ of the former as its input, as will be discussed in Section III-C. Finally, the symbols $\hat{\mathbf{d}}$ are recovered using the de-splitter $S^{-1}$, which operates according to Equations (10) and (11).
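The LLR definition, together with the de-puncturing and de-doping operations, may be sketched as follows. This is an illustrative sketch which assumes that puncturing and doping act on the end of each vector, as described in Section II-D; the function names are hypothetical.

```python
import math

def llr(p_one):
    """LLR = ln(Pr(bit = 1) / Pr(bit = 0)) for 0 < p_one < 1."""
    return math.log(p_one / (1.0 - p_one))

def depuncture(llrs, n_orig):
    """Append zero-valued LLRs for the bits that were discarded at the end."""
    return llrs + [0.0] * (n_orig - len(llrs))

def dedope(llrs, n_orig):
    """Remove the LLRs of the replicas appended at the end and add each
    of them to the LLR of the bit that it duplicates."""
    extra = len(llrs) - n_orig
    out = llrs[:n_orig]
    for i in range(extra):
        out[n_orig - extra + i] += llrs[n_orig + i]
    return out

print(llr(0.9), depuncture([1.0, -2.0], 4), dedope([1.0, 2.0, 3.0, 0.5, -1.0], 3))
```

Summing the LLRs of a bit and its replica combines two independent observations of the same bit, while a zero-valued LLR expresses that both bit values are equally likely.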
B. UEC SUB-SYMBOL DECODER
In this section, we describe the operation of the UEC sub-symbol decoder of Figure 2. Here, the trellis decoder invokes the Log-BCJR algorithm for the trellis of Figure 4 in order to convert the vector of a priori LLRs $\tilde{\mathbf{z}}^a$ into the vector of a posteriori LLRs $\tilde{\mathbf{y}}$, as well as the vector of extrinsic LLRs $\tilde{\mathbf{z}}^e$. This extrinsic LLR vector $\tilde{\mathbf{z}}^e$ is exchanged with the URC decoder in order to facilitate iterative decoding, as described in Section III-A. The performance of the UEC trellis decoder may be enhanced by exploiting the knowledge of the conditional transition probabilities $P(m_j|m_{j-1})$ of the trellis, as discussed in Section II-B. More specifically, the decoder requires the knowledge of the average unary codeword length $l_1$ and of the first $r_1/2 - 1$ values of $P(x)$ in order to compute the transition probabilities $P(m_j|m_{j-1})$ according to [3, eq. (9)]. These conditional transition probabilities $P(m_j|m_{j-1})$ may be exploited during the computation of the a priori transition probabilities $\gamma$, which are employed by the logarithmic version of the BCJR algorithm [17]. Note that if the source distribution is unknown at the receiver, the trellis decoder may initially operate without the transition probabilities $P(m_j|m_{j-1})$, at the cost of a reduced error correction performance [22]. Once a sufficiently high number of frames have been decoded, the value of $l_1$ and the first $r_1/2 - 1$ values of $P(x)$ may be estimated heuristically, and then exploited for improving the error correction performance.
The transformation of $\tilde{\mathbf{z}}^a$ into $\tilde{\mathbf{z}}^e$ may be characterized by the UEC trellis decoder's inverted EXIT function [31], as shown in Figure 5. Note that for both the ExpGEC and RiceEC codes, the area $A^o_1$ beneath the inverted UEC EXIT function may be closely approximated by [3], [32]
Once a sufficiently high number of iterations have been completed between the UEC trellis decoder and the URC decoder, the trellis decoder outputs the a posteriori LLR vector $\tilde{\mathbf{y}} = [\tilde{y}_j]_{j=1}^{b}$, which is forwarded to the unary decoder, as shown in Figure 7a. Note that since each unary codeword contains only a single logical one-valued bit, there are guaranteed to be $a$ one-valued bits in the unary-encoded bit vector $\mathbf{y}$, where $a$ is the number of unary codewords that it comprises. The unary decoder exploits this observation by making logical one-valued hard decisions for the $a$ highest LLR values in the a posteriori LLR vector $\tilde{\mathbf{y}}$, since high LLR values indicate that the corresponding bit in $\mathbf{y}$ is likely to have a logical value of one. Following this, zero-valued hard decisions are selected for the remaining $(b - a)$ LLRs in $\tilde{\mathbf{y}}$. Note that in practice the value of $a$ may be reliably conveyed to the receiver using a small amount of side information. Alternatively, a logical one-valued hard decision may be selected for all positive LLRs in $\tilde{\mathbf{y}}$, while a zero-valued hard decision may be selected for all the other LLRs, hence avoiding the requirement of side information, but degrading the error correction performance. Finally, the unary decoder converts the resulting hard-decision bit vector into the sub-symbols $\hat{\mathbf{x}}$ according to Table 2.
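The hard-decision rule of the unary decoder can be sketched as follows. This is a minimal sketch assuming that the number $a$ of unary codewords is known at the receiver; the function names and the LLR values are hypothetical.

```python
def unary_hard_decisions(llrs, a):
    """Set the a positions having the highest LLRs to one and the rest to zero."""
    order = sorted(range(len(llrs)), key=lambda j: llrs[j], reverse=True)
    bits = [0] * len(llrs)
    for j in order[:a]:
        bits[j] = 1
    return bits

def unary_decode(bits):
    """Convert a bit sequence of complete unary codewords into sub-symbols."""
    xs, run = [], 0
    for b in bits:
        run += 1
        if b == 1:          # a one-valued bit terminates the codeword
            xs.append(run)
            run = 0
    return xs

y = "011001010010111"
llrs = [2.0 if c == "1" else -2.0 for c in y]
llrs[1] = -0.1              # a weakly received one-valued bit
bits = unary_hard_decisions(llrs, a=8)
print(unary_decode(bits))
```

In this usage example, a sign-based decision would misinterpret the weakly received bit, whereas selecting the $a = 8$ highest LLRs recovers all eight sub-symbols.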
C. FLC SUB-SYMBOL DECODER
This section describes the FLC-CC decoder, which iteratively exchanges LLRs with the corresponding URC decoder of Figure 2. More specifically, the CC decoder invokes the Log-BCJR algorithm for processing the a priori LLR vectors $\tilde{\mathbf{w}}^a$ and $\tilde{\mathbf{v}}^a$. As shown in Figure 2, the FLC decoder requires the knowledge of the decoded sub-symbol vector $\hat{\mathbf{x}} = [\hat{x}_i]$, which is provided by the UEC decoder, as described in Section III-B. This allows the SBSD algorithm employed by the FLC decoder to exploit the knowledge of the conditional sub-symbol probabilities $P(t|x)$ [33], where the corresponding decoded sub-symbol $\hat{x}_i$ is employed as a substitute for $x_i$, when decoding the sub-symbol $t_i$. Note that if the source distribution is unknown at the receiver, the SBSD algorithm may be initially operated without the conditional sub-symbol probabilities $P(t|x)$, at the cost of a reduced error correction performance. Once a sufficient number of frames have been decoded, these probabilities may be heuristically estimated and exploited to restore the error correction performance. In the case of the ExpGEC code, the SBSD algorithm also requires the decoded sub-symbols of $\hat{\mathbf{x}}$ in order to determine the number of LLRs of $\tilde{\mathbf{u}}^a$ that should be used to recover each of the sub-symbols of $\hat{\mathbf{t}}$. More specifically, the number of LLRs corresponding to the sub-symbol $t_i$ is given by $l_{FLC}(t_i) = x_i - 1 + k_{ExpG}$, as previously described in Section II-C. For the RiceEC code, by contrast, this length is fixed at $l_{FLC}(t_i) = \log_2(M_{Rice})$. Note that the complexity of the SBSD algorithm increases exponentially with the length $l_{FLC}(t_i)$ of the codeword it is tasked to decode, since it considers each of the codeword's $2^{l_{FLC}(t_i)}$ possible values. Since $l_{FLC}(t_i)$ grows with $x_i$ in the case of the ExpGEC code, we may limit the FLC decoding complexity by only invoking the SBSD algorithm for calculating the extrinsic LLRs of $\tilde{\mathbf{u}}^e$ for the specific symbols satisfying $\hat{x} \le x_{max} = \log_2(d_{max} + 2^{k_{ExpG}}) + 1 - k_{ExpG}$, where we recommend a parameter value of $d_{max} = 18$ for striking an attractive trade-off between the decoding complexity imposed and the error correction capability attained.
This approach also has the benefit of limiting the number of conditional sub-symbol probabilities $P(t_i|x_i)$ that are exploited by the SBSD algorithm and that must therefore be known and stored by the receiver. For this reason, in the case of the RiceEC code, the SBSD algorithm is likewise only applied for the particular symbols whose decoded sub-symbols $\hat{x}_i$ satisfy a corresponding limit. The transformation of $\tilde{\mathbf{w}}^a$ into $\tilde{\mathbf{w}}^e$ may be characterized by the FLC-CC decoder's inverted EXIT function, as shown in Figure 5. Note that the area beneath the inverted FLC-CC EXIT function may be closely approximated by (33) and (34) [22].
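The codeword lengths and the resulting SBSD search-space sizes discussed in this section can be illustrated with a minimal sketch. The function names are hypothetical, $M_{\mathrm{Rice}}$ is assumed to be a power of two, and the threshold is evaluated exactly as the expression is written in the text, without any rounding.

```python
import math

def flc_length_expgec(x_i: int, k_expg: int) -> int:
    """FLC codeword length for the ExpGEC code: x_i - 1 + k_ExpG bits."""
    return x_i - 1 + k_expg

def flc_length_riceec(m_rice: int) -> int:
    """Fixed FLC codeword length log2(M_Rice); M_Rice assumed a power of two."""
    return m_rice.bit_length() - 1

def sbsd_candidates(length: int) -> int:
    """Number of codeword values considered by the SBSD algorithm."""
    return 2 ** length

def x_max(d_max: int, k_expg: int) -> float:
    """Threshold on the unary sub-symbol, as the expression appears in the text."""
    return math.log2(d_max + 2 ** k_expg) + 1 - k_expg

# ExpGEC with k_ExpG = 0: the search space doubles with each increment of x_i.
for x_i in range(1, 7):
    length = flc_length_expgec(x_i, 0)
    print(x_i, length, sbsd_candidates(length))
```

For $d_{\max} = 18$ and $k_{\mathrm{ExpG}} = 0$, the threshold evaluates to a value between 5 and 6, so the largest codeword processed by the SBSD algorithm in this sketch has 4 bits and 16 candidate values.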
IV. FIXED LENGTH INTERLEAVERS
As described in Section II-D, the EGEC scheme of [22] employs five interleavers having specific lengths and therefore particular designs that change from frame to frame, hence limiting the practicality of the EGEC scheme. This may be attributed to the EGEC scheme's partitioning of the sub-symbol streams $\mathbf{x}$ and $\mathbf{t}$ into fixed length vectors, which are encoded using variable length codewords to produce bit vectors having lengths that change from frame to frame. Motivated by this, this section describes a novel approach that allows our proposed ExpGEC and RiceEC schemes of Figure 2 to use a single fixed length design for each of the interleavers $\pi_1$ to $\pi_5$ in every frame, significantly improving their practicality and error correction capability, as discussed in Section I. More specifically, the proposed approach encodes the sub-symbol streams $\mathbf{x}$ and $\mathbf{t}$ into bit streams, which are partitioned into bit vectors having fixed lengths of $b$ and $c$ for the UEC and FLC-CC sub-codes, respectively. Our proposed solution is designed for accommodating unary and FLC codewords that span across frames, as well as for maintaining synchronization between the UEC and FLC-CC sub-codes of the decoder. We commence in Section IV-A by detailing how the UEC sub-code partitions the corresponding bit stream into the fixed length bit vector $\mathbf{y}$, in order to ensure that $\pi_1$ and $\pi_2$ of Figure 2 have fixed lengths. Likewise, Section IV-B describes how the FLC-CC sub-code partitions the corresponding bit stream into the fixed length bit vector $\mathbf{u}$, in order to ensure that $\pi_3$, $\pi_4$ and $\pi_5$ of Figure 2 have fixed lengths. Section IV-C shows how synchronization is maintained between the UEC and FLC-CC sub-codes, while Section IV-D discusses the specific design of the interleavers $\pi_1$ to $\pi_5$.
A. FIXED LENGTH INTERLEAVERS FOR THE UEC SUB-CODE
As described in Section II-B, the unary encoder of Figure 2 converts each sub-symbol $x_i$ in the stream $\mathbf{x}$ into the corresponding unary codeword $\mathbf{y}_i$. These codewords are concatenated to form a bit stream, which is partitioned into fixed length vectors $\mathbf{y} = [y_j]_{j=1}^{b}$, which may cause some unary codewords to be split between consecutive frames. In order to address this, the scheme of Figure 2 employs a buffer to store the last bits in the codeword that do not fit into the current frame, so that they can be concatenated onto the start of the next frame. For example, when unary encoding the sub-symbol vector $\mathbf{x} = [1, 2, 1, 1, 4, 1, 3, 1, 2]$ to form bit vectors comprising $b = 8$ bits, we obtain $\mathbf{y} = [1, 0, 1, 1, 1, 0, 0, 0]$ and $\mathbf{y} = [1, 1, 0, 0, 1, 1, 0, 1]$ for two consecutive frames. Note that the unary codeword corresponding to the fifth sub-symbol of $\mathbf{x}$ is split between the two bit vectors. Owing to this, the first and last bits of the bit vector $\mathbf{y}$ are not guaranteed to be the first and last bits of a unary codeword, which is in contrast to the usual UEC operation. The UEC trellis encoder is capable of accommodating this change in two ways.
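The framing example above can be reproduced with a minimal sketch. The function names are hypothetical, and the unary mapping (a run of $x_i - 1$ zeros terminated by a single one) is inferred from the bit vectors quoted in the example rather than taken from Section II-B.

```python
from typing import Iterable, Iterator, List

def unary_codeword(x: int) -> List[int]:
    """x - 1 zeros followed by a single one, giving a codeword of length x."""
    return [0] * (x - 1) + [1]

def frame_unary_stream(x_stream: Iterable[int], b: int) -> Iterator[List[int]]:
    """Concatenate unary codewords and emit fixed length frames of b bits.

    Bits that do not fit into the current frame stay in the buffer and
    become the start of the next frame.
    """
    buffer: List[int] = []
    for x in x_stream:
        buffer.extend(unary_codeword(x))
        while len(buffer) >= b:
            yield buffer[:b]
            buffer = buffer[b:]
    # Any bits left here wait in the buffer for further sub-symbols.

for frame in frame_unary_stream([1, 2, 1, 1, 4, 1, 3, 1, 2], b=8):
    print(frame)
```

The first frame ends with the first three bits of the codeword for the fifth sub-symbol, and the remaining bit of that codeword opens the second frame.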
In a first method for accommodating codewords split between two consecutive frames, the UEC trellis encoder may carry over its state between successive frames. The transmitter then uses a small amount of additional side information for reliably conveying this state to the receiver. The UEC trellis decoder may use this side information to initialize the end state of one frame, as well as the initial state of the next. In the example above, the state $m = 3$ is sent as side information between the transmitter and receiver. This method maintains all of the UEC code's near-capacity capability, although this is achieved at the cost of requiring additional side information to be transmitted.
In a second method conceived for accommodating codewords split between two consecutive frames, the trellis encoder may be forced to restart the trellis path $\mathbf{m}$ from state 1, when encoding each bit vector $\mathbf{y}$. This does not compromise the near-capacity capability of the UEC code, since synchronization is still maintained between the trellis path and the unary codewords, despite the trellis encoder potentially starting from the middle of a codeword. More specifically, owing to the particular design of the UEC trellis, the trellis path returns to one of the two central states whenever the final bit in a unary codeword is encountered, as described in Section II-B. Continuing the example above, the second frame yields the trellis path $\mathbf{m} = [1, 2, 1, 3, 3, 2, 1, 3, 2]$. Observe that the resultant paths are identical to those that result from the first method, with the only exception of the states corresponding to the end of the split codeword, demonstrating that synchronization has been maintained. Note that this approach causes the end state to be unknown to the receiver, since it may correspond to the middle of a unary codeword. This may be accommodated during the Log-BCJR algorithm by not terminating the UEC trellis at its right-most end, as would usually be done. Furthermore, since the first state of each frame is forced to 1, the transition probabilities $P(m_j|m_{j-1})$ employed by the Log-BCJR algorithm for the first codeword may be slightly inaccurate. These two factors impose some error correction performance loss upon the UEC trellis decoder, although this effect is negligible in practice. Owing to this, the results of Section V will adopt this second method, since it does not require any side information to be sent between the transmitter and receiver.
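The two methods can be compared with a minimal sketch of the trellis path. The transition rule below is an assumption, since it is not stated in this section: a one-valued bit (the end of a unary codeword) returns the path to state 1 or 2, while a zero-valued bit moves it outwards by two states until it saturates. This rule was chosen because it reproduces the nine-state path and the carried-over state $m = 3$ quoted above for $r = 4$ states; the function name is hypothetical.

```python
from typing import List

def uec_trellis_path(y: List[int], r: int = 4, start_state: int = 1) -> List[int]:
    """Trellis path for bit vector y under the assumed transition rule."""
    path = [start_state]
    for bit in y:
        m = path[-1]
        odd = m % 2
        if bit == 1:
            # End of a unary codeword: return to one of the two central states.
            path.append(1 + odd)
        else:
            # Middle of a codeword: move outwards, saturating at the outer states.
            path.append(min(m + 2, r - odd))
    return path

frame_1 = [1, 0, 1, 1, 1, 0, 0, 0]
frame_2 = [1, 1, 0, 0, 1, 1, 0, 1]

first = uec_trellis_path(frame_1)
carried = uec_trellis_path(frame_2, start_state=first[-1])  # first method
restarted = uec_trellis_path(frame_2, start_state=1)        # second method
print(first, carried, restarted)
```

Under this assumed rule, the carried-over and restarted paths of the second frame differ only in their initial state, after which both pass through the same states once the split codeword has ended.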
In the receiver of Figure 2, the UEC trellis decoder and URC decoder perform iterative decoding, in order to obtain the a posteriori LLR vector $\tilde{\mathbf{y}}$, which pertains to the bit vector $\mathbf{y}$ of the transmitter, as described in Section III-B. Mirroring the buffer employed in the transmitter, the receiver employs a buffer to temporarily store LLRs from the end of the vector $\tilde{\mathbf{y}}$ which correspond to a unary codeword that was split between consecutive frames. More specifically, when the UEC trellis decoder generates the LLR vector $\tilde{\mathbf{y}}$ for the next frame, it is concatenated onto the end of any LLRs stored in the buffer, which form part of the first codeword in $\tilde{\mathbf{y}}$. The concatenated LLR vector is then provided to the unary decoder, which generates the vector $\hat{\mathbf{x}}$ comprising $a$ sub-symbols, as described in Section III-B. Following this, any trailing LLRs of $\tilde{\mathbf{y}}$ that did not contribute to a complete unary codeword are placed into the buffer, ready to be concatenated to the beginning of the next LLR vector $\tilde{\mathbf{y}}$. Note that since the number of sub-symbols $a$ in the vector $\hat{\mathbf{x}}$ varies from frame to frame, it must be conveyed to the receiver using a small amount of side information, as previously discussed in Section III-B. More specifically, this side information conveys the number of logical one-valued bits in $\mathbf{y}$, as exploited by the unary decoder. While the proposed scheme of Figure 2 is capable of functioning without this side information, its inclusion guarantees synchronization between the UEC and FLC-CC parts, as will be described in Section IV-C. Using the example given above, if the unary decoder is successful in decoding the LLR vector $\tilde{\mathbf{y}}$ of the first frame, it will output the sub-symbol vector $\hat{\mathbf{x}} = [1, 2, 1, 1]$ and will place the last three LLRs of $\tilde{\mathbf{y}}$ into the buffer. Following this, the unary decoder will output the sub-symbol vector $\hat{\mathbf{x}} = [4, 1, 3, 1, 2]$ if it is successful in decoding the LLR vector $\tilde{\mathbf{y}}$ of the second frame, with no LLRs needing to be placed into the buffer. The decoded sub-symbols of $\hat{\mathbf{x}}$ are buffered until the corresponding sub-symbols of $\hat{\mathbf{t}}$ have also been decoded by the FLC decoder, as will be discussed in Section IV-C.
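The receiver-side buffering can be sketched as follows. This is a simplification under stated assumptions: error-free hard bits stand in for the LLRs of the a posteriori vector, the side information is not modelled, the unary mapping (zeros terminated by a one) is inferred from the example, and the function name is hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple

def decode_unary_frames(frames: Iterable[List[int]]) -> Iterator[Tuple[List[int], int]]:
    """Yield (decoded sub-symbols, number of buffered entries) for each frame."""
    buffer: List[int] = []
    for y in frames:
        bits = buffer + list(y)   # prepend entries held over from the last frame
        x_hat: List[int] = []
        run = 0                   # length of the codeword currently being read
        for bit in bits:
            run += 1
            if bit == 1:          # a one-valued bit terminates a unary codeword
                x_hat.append(run)
                run = 0
        # Trailing entries of an incomplete codeword wait for the next frame.
        buffer = bits[len(bits) - run:]
        yield x_hat, len(buffer)

frames = [[1, 0, 1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 1, 1, 0, 1]]
for x_hat, n_buffered in decode_unary_frames(frames):
    print(x_hat, n_buffered)
```

In this sketch, the number of sub-symbols decoded from each frame equals the number of one-valued bits in that frame, which is the quantity conveyed by the side information.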
B. FIXED LENGTH INTERLEAVERS FOR THE FLC-CC SUB-CODE
In a similar fashion to the UEC encoder, the FLC encoder of Figure 2 converts each sub-symbol $t_i$ in the stream $\mathbf{t}$ into the corresponding FLC codeword $\mathbf{u}_i$, as described in Section II-C. These codewords are concatenated to form a stream of bits, which is then partitioned into fixed length bit vectors $\mathbf{u} = [u_j]_{j=1}^{c}$, which may result in some codewords becoming split between consecutive frames. In order to address this, the scheme of Figure 2 employs a buffer for storing the last bits in the codeword that do not fit into the current frame, so that they can be concatenated onto the start of the next frame. For example, in the case of a RiceEC code associated with $M_{\mathrm{Rice}} = 4$, where each FLC codeword comprises $\log_2(M_{\mathrm{Rice}}) = 2$ bits, when FLC encoding the sub-symbol vector $\mathbf{t} = [1, 3, 2, 0, 2]$ to form bit vectors comprising $c = 5$ bits, we obtain $\mathbf{u} = [0, 1, 1, 1, 1]$ and $\mathbf{u} = [0, 0, 0, 1, 0]$ for two consecutive frames. Note that the FLC codeword corresponding to the third sub-symbol of $\mathbf{t}$ is split across the two bit vectors. In the receiver of Figure 2, the CC decoder iteratively exchanges the LLR vectors $\tilde{\mathbf{w}}^a$ and $\tilde{\mathbf{w}}^e$ with the URC decoder, as well as the LLR vectors $\tilde{\mathbf{v}}^a$ and $\tilde{\mathbf{v}}^e$ with the FLC decoder, as described in Section III-C. More specifically, $\tilde{\mathbf{v}}^e$ is de-interleaved in the block $\pi_3^{-1}$ to obtain $\tilde{\mathbf{u}}^a$, while $\tilde{\mathbf{u}}^e$ is interleaved to obtain $\tilde{\mathbf{v}}^a$. However, the FLC decoder can only generate a sub-symbol of $\hat{\mathbf{t}}$ and a codeword of extrinsic LLRs for $\tilde{\mathbf{u}}^e$ when the a priori LLR vector $\tilde{\mathbf{u}}^a$ contains all of the a priori LLRs for that codeword. Owing to this, a buffer is required for storing a priori LLRs for incomplete codewords that have been split between frames, as shown in Figure 2. When the CC decoder forwards an a priori LLR vector $\tilde{\mathbf{u}}^a$ to the FLC decoder, it is concatenated onto the end of any LLRs stored in the buffer from the previous frame.
In the case of the ExpGEC scheme, the decoded sub-symbols of $\hat{\mathbf{x}}$ are used for determining the length of each of the codewords in $\tilde{\mathbf{u}}^a$, hence allowing the FLC decoder to determine how many excess trailing LLRs of $\tilde{\mathbf{u}}^a$ must be stored in the buffer for each frame. In the case of the RiceEC scheme, since the length of the FLC codewords is fixed at $\log_2(M_{\mathrm{Rice}})$, the FLC decoder does not need any additional information for determining how many excess trailing LLRs to place in the buffer for each frame. The FLC decoder processes the a priori LLRs of $\tilde{\mathbf{u}}^a$ comprising complete codewords, in order to generate corresponding extrinsic LLRs for the vector $\tilde{\mathbf{u}}^e$, which is provided to the CC decoder. The extrinsic LLRs corresponding to the a priori LLRs that were concatenated from the buffer must be removed before $\tilde{\mathbf{u}}^e$ is provided to the CC decoder, since they are not relevant to the current frame. Likewise, zero-valued LLRs must be inserted at the end of $\tilde{\mathbf{u}}^e$ to pad the vector, in correspondence to the a priori LLRs of $\tilde{\mathbf{u}}^a$ that were placed into the buffer, rather than being decoded by the FLC decoder. As a result of this, the FLC decoder and CC decoder may not iteratively exchange LLRs pertaining to every bit in the vector $\mathbf{u}$. This may therefore result in a slight degradation of the FLC decoder's near-capacity capability, in a similar manner to the effect of limiting the FLC decoding complexity using the parameter $x_{\max}$. However, since only a small number of LLRs in the vector $\tilde{\mathbf{u}}^e$ are impacted in this way, this effect is negligible in practice.
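The buffering, trimming and zero-padding steps described above can be sketched for the fixed length RiceEC case. This is only an illustration under stated assumptions: the function names are hypothetical, the codewords are assumed to be 2-bit binary representations with the most significant bit first (the mapping that reproduces the bit vectors quoted in the example), and the SBSD algorithm itself is not implemented, so the a priori LLRs of the complete codewords are simply passed through as a stand-in for the extrinsic LLRs.

```python
from typing import List, Tuple

def flc_encode(t: List[int], l_flc: int) -> List[int]:
    """Map each sub-symbol to an l_flc-bit codeword, most significant bit first."""
    bits: List[int] = []
    for value in t:
        bits.extend((value >> shift) & 1 for shift in range(l_flc - 1, -1, -1))
    return bits

def split_frame(u_a: List[float], carried: List[float],
                l_flc: int) -> Tuple[List[float], List[float]]:
    """Prepend buffered LLRs; return (LLRs of complete codewords, LLRs to buffer)."""
    llrs = carried + u_a
    n_complete = (len(llrs) // l_flc) * l_flc
    return llrs[:n_complete], llrs[n_complete:]

def align_extrinsic(u_e_complete: List[float], n_carried_in: int,
                    n_carried_out: int) -> List[float]:
    """Drop extrinsic LLRs belonging to the previous frame and zero-pad for
    the LLRs that were deferred to the next frame."""
    return u_e_complete[n_carried_in:] + [0.0] * n_carried_out

bits = flc_encode([1, 3, 2, 0, 2], l_flc=2)
print(bits[:5], bits[5:])  # the two frames of c = 5 bits

carried: List[float] = []
for u_a in ([0.5, -1.0, 2.0, 3.0, 4.0], [-5.0, 6.0, 7.0, -8.0, 9.0]):
    complete, next_carried = split_frame(u_a, carried, l_flc=2)
    u_e = align_extrinsic(complete, len(carried), len(next_carried))
    print(u_e)  # always c = 5 entries, matching the fixed interleaver length
    carried = next_carried
```

The vector returned to the CC decoder always has the frame length $c$, which is what allows the interleavers to keep a fixed design. For the ExpGEC case, the fixed `l_flc` would be replaced by per-codeword lengths derived from the decoded unary sub-symbols.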
C. MATCHING SUB-SYMBOLS
As described in Section II, the ExpGEC and RiceEC codes decompose the symbol stream $\mathbf{d}$ into two sub-symbol streams $\mathbf{x}$ and $\mathbf{t}$. It is important to ensure that the two sub-symbol streams remain synchronized at the receiver, even in the presence of transmission errors. More specifically, if errors cause deleted or inserted sub-symbols to appear in $\hat{\mathbf{x}}$ or $\hat{\mathbf{t}}$, then incorrect pairings of sub-symbols will be recombined for generating the reconstructed symbol stream $\hat{\mathbf{d}}$, hence resulting in a high SER for the remainder of the stream. This motivates the approach described in this section for preventing the two sub-symbol streams from becoming de-synchronized. We commence by describing how the transmitter multiplexes the UEC part and FLC-CC part onto the channel, then discuss how the receiver maintains synchronization.
As explained in Section III-C, the FLC decoder requires knowledge of the sub-symbol $\hat{x}_i$ in order to generate the corresponding sub-symbol $\hat{t}_i$. Motivated by this, the transmitter ensures that each sub-symbol $x_i$ is encoded and transmitted in advance of its corresponding sub-symbol $t_i$, so that the reconstructed sub-symbol $\hat{x}_i$ is available for use by the FLC decoder at the appropriate time. Since the number of sub-symbols of $\mathbf{x}$ and $\mathbf{t}$ that are represented by each bit vector of $\mathbf{z}$ and $\mathbf{w}$ may vary from frame to frame, the transmitter has to ensure that a sufficiently high number of sub-symbols of $\mathbf{x}$ have been transmitted before transmitting a bit vector representing a selection of sub-symbols of $\mathbf{t}$. Since the transmitter is capable of maintaining a count of how many sub-symbols of $\mathbf{x}$ it has transmitted at any given time, it can infer how many sub-symbols of $\hat{\mathbf{x}}$ are buffered in the receiver, ready to be used by the FLC decoder. Likewise, the transmitter employs a buffer for storing the sub-symbols of $\mathbf{t}$ until they are ready for transmission, as shown in Figure 2. When the transmitter identifies that the number of sub-symbols of $\hat{\mathbf{x}}$ buffered in the receiver exceeds the number of sub-symbols of $\mathbf{t}$ required to generate the next bit vector $\mathbf{w}$, this bit vector is transmitted to the receiver.
The proposed ExpGEC and RiceEC schemes are designed to ensure that synchronization is maintained between the UEC and FLC-CC decoders. As described above, de-synchronization can occur if the unary decoder or FLC decoder outputs too many or too few sub-symbols for $\hat{\mathbf{x}}$ and $\hat{\mathbf{t}}$, when decoding the LLR vectors $\tilde{\mathbf{y}}$ and $\tilde{\mathbf{u}}^a$, respectively. However, owing to the side information used to indicate to the UEC decoder how many one-valued bits there are in each bit vector $\mathbf{y}$, the unary decoder is guaranteed to output the correct number of sub-symbols for $\hat{\mathbf{x}}$, since each unary codeword contains only a single logical one-valued bit, as described in Section IV-A. For the RiceEC code, the correct number of sub-symbols for $\hat{\mathbf{t}}$ is also guaranteed, since each FLC codeword comprises the same number of $\log_2(M_{\mathrm{Rice}})$ bits. In the case of the ExpGEC code, the properties of the UEC code may be exploited for ensuring synchronization. More specifically, the number of FLC encoded bits of $\mathbf{u}$ associated with each decoded vector $\hat{\mathbf{x}}$ is given by $\sum_{i=1}^{a} l_{\mathrm{FLC}}(t_i) = \sum_{i=1}^{a} (\hat{x}_i - 1 + k_{\mathrm{ExpG}})$, where $a$ is the length of a specific decoded vector of $\hat{\mathbf{x}}$, which can be correctly inferred from the side information, which conveys the number of logical one-valued bits in each vector $\mathbf{y}$. Since the length of each unary encoded codeword is given by $l_{\mathrm{Unary}}(x_i) = x_i$, the previous sum may be expressed as $\sum_{i=1}^{a} l_{\mathrm{Unary}}(\hat{x}_i) + a (k_{\mathrm{ExpG}} - 1)$, where $\sum_{i=1}^{a} l_{\mathrm{Unary}}(\hat{x}_i)$ is the number of LLRs in $\tilde{\mathbf{y}}$ constituting the sub-symbol vector $\hat{\mathbf{x}}$. Since the values of $a$ and the number of LLRs in $\tilde{\mathbf{y}}$ are known to the receiver, the number of LLRs of $\tilde{\mathbf{u}}^a$ associated with a decoded sub-symbol vector of $\hat{\mathbf{x}}$ does not depend on the received value of those sub-symbols. Owing to this, any errors in $\hat{\mathbf{x}}$ cannot cause the sub-symbols $\hat{\mathbf{x}}$ and $\hat{\mathbf{t}}$ to become permanently de-synchronized.
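The synchronization argument above can be checked numerically with a minimal sketch, using the ExpGEC length rule of $x_i - 1 + k_{\mathrm{ExpG}}$ bits per FLC codeword. The function names are hypothetical.

```python
from typing import List

def flc_bits_from_values(x_hat: List[int], k_expg: int) -> int:
    """Total FLC bits computed from the decoded sub-symbol values."""
    return sum(x - 1 + k_expg for x in x_hat)

def flc_bits_from_counts(n_unary_llrs: int, a: int, k_expg: int) -> int:
    """Total FLC bits computed only from quantities known to the receiver:
    the number of unary LLRs and the number of sub-symbols a."""
    return n_unary_llrs + a * (k_expg - 1)

correct = [1, 2, 1, 1]     # a = 4 sub-symbols occupying 5 unary LLRs
corrupted = [2, 1, 1, 1]   # a decoding error with the same a and the same 5 LLRs

for x_hat in (correct, corrupted):
    print(flc_bits_from_values(x_hat, k_expg=1),
          flc_bits_from_counts(sum(x_hat), len(x_hat), k_expg=1))
```

Both vectors give the same total, which illustrates why an error in the values of the decoded sub-symbols shifts codeword boundaries only within the affected vector and cannot propagate to later ones.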
D. INTERLEAVERS
In contrast to the JSCC schemes of our previous work [22], the proposed ExpGEC and RiceEC schemes employ interleavers $\pi_1$ to $\pi_5$ of Figure 2 having fixed lengths, which do not change from frame to frame. Owing to this, these lengths and the corresponding interleaver designs may be hard-coded into the transmitter and receiver. This has the added benefit of avoiding the large memory requirement for storing a large number of interleaver designs, as well as the additional design effort required for ensuring that all the interleavers have desirable distance properties. It also avoids the requirement for using side information to signal which interleaver lengths are used for each frame.
The interleavers $\pi_2$ and $\pi_5$ of Figure 2 are required for evenly distributing the punctured or doped bits throughout the corresponding bit vector, as opposed to the interleavers $\pi_1$, $\pi_3$ and $\pi_4$ of Figure 2, which are required for maintaining desirable distance properties. As a result, we recommend the LTE sub-block interleaver [34] for the interleavers $\pi_2$ and $\pi_5$, in order to evenly distribute the punctured or doped bits. Meanwhile, we recommend S-Random interleavers [35] for the interleavers $\pi_1$, $\pi_3$ and $\pi_4$.
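Since the interleaver lengths are fixed, each design may be generated once offline and hard-coded. As an illustration, the following is a minimal sketch of a common greedy construction of an S-Random interleaver, in which each newly placed index must differ by more than $S$ from each of the $S$ most recently placed indices. It is not claimed to be the design procedure of [35]; the function name, seed and restart limit are assumptions.

```python
import random
from typing import List

def s_random_interleaver(n: int, s: int, seed: int = 0,
                         max_restarts: int = 1000) -> List[int]:
    """Greedy S-Random permutation of range(n); restarts on dead ends."""
    rng = random.Random(seed)
    for _ in range(max_restarts):
        remaining = list(range(n))
        rng.shuffle(remaining)
        perm: List[int] = []
        while remaining:
            for idx, cand in enumerate(remaining):
                # Accept the first candidate far enough from the last s outputs.
                if all(abs(cand - prev) > s for prev in perm[-s:]):
                    perm.append(remaining.pop(idx))
                    break
            else:
                break  # no candidate fits: abandon this attempt and restart
        if len(perm) == n:
            return perm
    raise RuntimeError("no S-Random interleaver found; try a smaller s")

pi = s_random_interleaver(n=64, s=3, seed=1)
print(pi[:8])
```

The construction typically succeeds quickly when $S$ is kept below roughly $\sqrt{n/2}$, and the resulting permutation can be stored as a constant table in both the transmitter and receiver.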
V. PERFORMANCE COMPARISON WITH BENCHMARKERS
In this section, we compare the performance of our proposed RiceEC and ExpGEC schemes with that of several appropriate benchmarkers, which are introduced in Section V-A. In Section V-B, we analyze the near-capacity potential of both the proposed codes and of the benchmarkers. Following this, Section V-C discusses our EXIT chart analysis invoked for selecting the doping or puncturing rates $R^i_1$ and $R^i_2$, which control the unequal error correction of the proposed schemes. Finally, Section V-D compares the performance of the proposed schemes and the benchmarkers.
Table 3 lists the considered parameterizations of the proposed schemes and of the benchmarkers, namely of the Rice-CC, ExpG-CC and VLEC schemes [26]. In analogy with the EG-CC scheme of [22], the Rice-CC and ExpG-CC benchmarkers are both SSCCs, which replace the EG source code of [22] by the Rice and ExpG codes, respectively. Note that the EG-CC scheme of [22] may be viewed as the special case of the ExpG-CC benchmarker, where $k_{\mathrm{ExpG}} = 0$. In these benchmarkers, separate channel coding is provided by a serial concatenation of an $r = 4$-state non-systematic recursive CC of [3, Table II] with an $r = 2$-state URC and Gray-coded QPSK modulation. Note that this combination was shown to minimize (but not eliminate) the capacity loss of SSCC schemes like the Rice-CC and ExpG-CC benchmarkers [3], [22]. A further benchmarker is provided by the JSCC VLEC scheme of [25] and [26], in which a VLEC code is serially concatenated with an $r = 2$-state URC and Gray-coded QPSK modulator. This scheme offers near-capacity operation, but has a rapidly increasing complexity as $L$ increases, as described in Section I.
A. SCENARIOS AND BENCHMARKERS
As shown in Table 3, our comparisons consider several scenarios, including the case where the source symbols represent the 26 letters of the English alphabet and the space character, which have a particular probability distribution, as shown in Figure 1b. Here, we map the letters and the space character to the symbol values of 1 to 27 according to descending probability of occurrence. Figure 1b compares this probability distribution to the finite zeta-like distribution of (1) having a parameter value of $L = 27$ and various values of $p_1$. The entropy of the letters probability distribution is $H_D = 4.1$ bits per symbol, which is equal to that of the finite zeta-like distribution having $p_1 = 0.45$. We also consider the scenario where the source symbols obey the same probability distribution as the transform coefficients and motion vectors produced by an H.265 encoder, as shown in Figure 1a. Note that while the H.265 encoder produces some symbols having values higher than 1000, these have a combined probability of less than $2 \times 10^{-5}$. Motivated by this, Figure 1a also shows the finite zeta-like distribution having $L = 1000$ and various values of $p_1$. The entropy of the H.265 probability distribution is $H_D = 2.4$ bits per symbol, which is equal to that of the finite zeta-like distribution having $p_1 = 0.6$.
As shown in Table 3, we also consider scenarios where the source symbols obey the finite zeta-like probability distribution of (1). Here, we consider the alphabet cardinality of $L = 27$ in order to match that of the English letter source set, as well as the cardinality $L = 1000$, since this is approximately equal to the highest symbol value produced by H.264 [3, Fig. 7] and H.265. This approach allows the performance of the various schemes considered to be compared for both small and large symbol alphabet cardinalities $L$. Furthermore, we parametrize the finite zeta-like probability distributions using $p_1 \in \{0.2, 0.4, 0.6\}$, which spans the range of $p_1$ values that best approximate the English alphabet and the H.265 probability distributions, as discussed above. Note that our previous work of [22] focused only on the complementary parameter value range of $p_1 \ge 0.6$.
As shown in Table 3, the puncturing and doping of the proposed schemes and benchmarkers are specifically parameterized in each scenario for achieving the same effective throughput $\eta$, for the sake of facilitating fair comparisons. More specifically, the effective throughput of $\eta = 0.95$ was selected for the English letters distribution, while the effective throughput of $\eta = 0.85$ was selected for the H.265 distribution. Meanwhile, the target throughputs of $\eta \in \{0.90, 0.83, 0.85\}$ were selected for the scenarios using the finite zeta-like distribution having $p_1 \in \{0.2, 0.4, 0.6\}$, respectively. These target effective throughputs were selected since they are close to the native effective throughputs of all schemes considered, ensuring that none of them were excessively impacted by puncturing or doping.
As described in Section II, the proposed RiceEC and ExpGEC schemes are parameterized by $M_{\mathrm{Rice}}$ and $k_{\mathrm{ExpG}}$, respectively. Table 3 shows the specific values of $M_{\mathrm{Rice}}$ and $k_{\mathrm{ExpG}}$ that were selected in each scenario considered. In each case, we selected the particular parameter values that give the best SER performance, as will be characterized in Section V-D. We also selected $M_{\mathrm{Rice}} = 1$ and $k_{\mathrm{ExpG}} = 0$, which reduce the RiceEC and ExpGEC schemes to the special cases of the UEC and EGEC schemes of our previous work [3], [22]. The latter arrangements served as additional benchmarkers. The only exception to this, however, is the omission of the UEC benchmarker in scenarios where excessive puncturing would be required for achieving the effective target throughput $\eta$. For example, the UEC benchmarker has very small outer coding rates of $R^o_1 = 0.0004$ for the case of a finite zeta-like distribution having $p_1 = 0.2$ and $L = 1000$, as well as $R^o_1 = 0.007$ for $p_1 = 0.4$ and $L = 1000$. This may be expected, since the UEC benchmarker has an average codeword length $l_{\mathrm{Unary}}$ which approaches infinity as the source alphabet cardinality $L$ tends to infinity, for the case of $p_1 < 0.608$. Despite this, the UEC benchmarker has only a moderately small outer coding rate of $R^o_1 = 0.246$ for the case of the finite zeta-like distribution having $p_1 = 0.6$ and $L = 1000$, as well as of $R^o_1 = 0.348$ for the case of the H.265 distribution, as shown in Table 3. Since the Rice-CC and ExpG-CC benchmarkers employ the Rice and ExpG source codes respectively, they are also parameterized by $M_{\mathrm{Rice}}$ and $k_{\mathrm{ExpG}}$. Table 3 shows the values of $M_{\mathrm{Rice}}$ and $k_{\mathrm{ExpG}}$ that were selected for the Rice-CC and ExpG-CC benchmarkers in each of the scenarios considered. In each case, this selection was made by finding the parameterization of $M_{\mathrm{Rice}}$ or $k_{\mathrm{ExpG}}$ that gives the best SER performance, as will be characterized in Section V-D.
Note that the VLEC benchmarker is only considered for scenarios having source symbol alphabets associated with the cardinality of $L = 27$, since it suffers from an excessive trellis complexity for larger values of $L$. The VLEC codewords were designed using the approach of [36], which was parameterized using a block distance $d_b$, a divergence distance $d_d$ and a convergence distance $d_c$. Here, the block distance $d_b$ specifies the minimum Hamming distance required between all pairs of equal length VLEC codewords. Meanwhile, the divergence distance $d_d$ and convergence distance $d_c$ specify the minimum Hamming distances required between all pairs of unequal length codewords, when they are left- and right-aligned, respectively [25], [32]. After a VLEC codebook was designed in this way, each codeword was then doubled in length by concatenating it with its inverse. For example, the codebook {101, 1001, 10001} becomes {101010, 10010110, 1000101110} using this approach. This enables a fair comparison by reducing the VLEC coding rate $R^o$ to become comparable to those of the other schemes, while also ensuring that the VLEC-encoded bits have equiprobable values, which is a necessary condition for avoiding capacity loss [3].
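The codeword doubling step can be reproduced with a minimal sketch, in which the function name is hypothetical and "inverse" is taken to mean the bitwise complement, as implied by the example codebook.

```python
from typing import List

def double_with_inverse(codeword: str) -> str:
    """Append the bitwise complement of a codeword to itself."""
    inverse = "".join("1" if bit == "0" else "0" for bit in codeword)
    return codeword + inverse

codebook: List[str] = ["101", "1001", "10001"]
doubled = [double_with_inverse(cw) for cw in codebook]
print(doubled)

# Every doubled codeword contains as many zeros as ones.
for cw in doubled:
    print(cw.count("0"), cw.count("1"))
```

Because each doubled codeword is balanced, the encoded bit stream contains equal numbers of zeros and ones regardless of the symbol probabilities, while the coding rate is halved.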
The final column of Table 3 quantifies the complexity of each scheme in terms of the number of Add-Compare-Select (ACS) operations performed per bit entered into the QPSK modulator, per decoding iteration. This method of comparing complexity is motivated by the fact that logarithmic implementations of the algorithms used within each of the iterative decoding blocks can be decomposed into only the basic ACS operations. In particular, each max* operation employed by the Log-BCJR algorithm is assumed to be computed using several lookup table operations, requiring a total of 5 ACS operations [37].
B. NEAR CAPACITY ANALYSIS
There are two requirements for an iterative decoding scheme to facilitate near-capacity operation, where reliable communication is achieved using an effective throughput $\eta$ that approaches the Discrete-Input Continuous-Output Memoryless Channel (DCMC) capacity $C$ [38]. Firstly, the area $A^i$ beneath the inner decoder's EXIT function should obey $A^i = C/[R^i \log_2(M_{\mathrm{mod}})]$, where $C$ is the DCMC capacity. As discussed in [31], this condition is satisfied by the URC, regardless of the puncturing or doping rate $R^i$ used for meeting the required effective throughput $\eta$. Secondly, the areas $A^o$ beneath the inverted outer EXIT functions should approach the corresponding outer coding rates $R^o$ [39]. Motivated by this, this section compares the areas $A^o_1$ and $A^o_2$ beneath the inverted EXIT functions of the UEC and FLC-CC sub-codes of the proposed RiceEC and ExpGEC schemes with the corresponding coding rates $R^o_1$ and $R^o_2$, in order to characterize their near-capacity operation. Figure 6 shows the product of the coding rates $R^o$ and inverted EXIT chart areas $A^o$ with the corresponding codeword lengths $n$ for the proposed schemes, as functions of the finite zeta-like distribution parameter $p_1$. Here, we have plotted the products $A^o n$ and $R^o n$ to eliminate the effect of different codeword lengths $n$ on the analysis. Furthermore, the values of $R^o$ and $A^o$ for each of the considered schemes are listed in Table 3 for each scenario considered. For both the RiceEC and ExpGEC schemes, the UEC coding rate $R^o_1$ was obtained using (21), while the FLC-CC coding rate $R^o_2$ was obtained using (30). Likewise, (32) is used to obtain the area $A^o_1$ beneath the inverted UEC EXIT function, for the case of employing $r_1 = 4$ states. Meanwhile, (33) and (34) are used to approximate the area $A^o_2$ beneath the inverted FLC-CC EXIT function. The discrepancy between $A^o n$ and $R^o n$ represents capacity loss, as discussed in [3]. Figure 6 shows that the capacity loss $A^o n - R^o n$ depends on the particular scheme, scenario and parameterization considered.
The capacity loss $A^o_1 n_1 - R^o_1 n_1$ of the UEC sub-code of the proposed RiceEC and ExpGEC schemes depends on the number of states $r_1$ employed by the UEC trellis decoder, as shown in Figure 7a. If more than $r_1 = 4$ states are employed, then the capacity loss shown in Figure 6 will be reduced accordingly, as may also be seen in (21). Figure 7a also shows that this capacity loss reduces as $p_1$ is increased, since this results in the less frequent occurrence of higher sub-symbol values in the stream $\mathbf{x}$, which would benefit from using more than $r_1 = 4$ states in the UEC trellis decoder to exploit knowledge of the corresponding occurrence probabilities. As characterized in Figure 7b, the capacity loss $A^o_2 n_2 - R^o_2 n_2$ of the FLC-CC sub-code is caused by symbols in the stream $\mathbf{d}$ having values that exceed the limit $d_{\max}$, which occur more frequently as $p_1$ is reduced. This explains why employing a value higher than $d_{\max} = 18$ reduces the capacity loss shown in Figure 7b. This can also be seen in (33). Each scheme may be characterized by three $E_b/N_0$ bounds, namely the capacity bound, area bound and tunnel bound. More specifically, the capacity bound is the $E_b/N_0$ value where the DCMC capacity $C$ becomes equal to the effective throughput $\eta$ of the scheme, representing the theoretical minimum $E_b/N_0$ at which reliable communication is possible [31]. The area bound is the $E_b/N_0$ value at which $A^i = A^o$, implying that it is theoretically possible to create an open EXIT tunnel, provided that there is a good match between the shapes of the EXIT curves of the scheme's inner and outer decoders [22]. The discrepancy between the capacity bound and the area bound represents the capacity loss of the particular scheme. Table 3 shows that our best performing ExpGEC and RiceEC schemes have an area bound that is within 0.1 dB of the capacity bound, demonstrating their capability of near-capacity operation.
By contrast, the discrepancies between the area and capacity bounds are significantly higher for the SSCC ExpG-CC and Rice-CC benchmarkers, owing to the corresponding capacity loss. Finally, the tunnel bound is the $E_b/N_0$ value at which an open tunnel is actually created between the EXIT curves of the scheme's inner and outer decoders [40]. Note that for all proposed schemes and for all benchmarkers, the 2-state URC was found to produce lower tunnel bounds than any 4-state or 8-state URCs. Furthermore, the 2-state URC exhibits the additional benefit of having the lowest complexity of these design options.
C. EXIT CHART MATCHING
This section discusses the employment of EXIT charts for designing the parameterization of the proposed ExpGEC and RiceEC schemes, as well as for characterizing their iterative decoding convergence and that of the benchmarkers. This facilitates the rapid characterization of these schemes, considering a wide range of values for the scheme parameters such as $k_{\mathrm{ExpG}}$, $M_{\mathrm{Rice}}$, $R^i_1$, $R^i_2$, $n_1$ and $n_2$, as well as for various combinations of the scenario parameters $p_1$ and $L$, without requiring time-consuming SER simulations. Separate EXIT charts may be used for characterizing the pair of iterative decoding processes corresponding to the UEC and FLC-CC sub-codes of our proposed RiceEC and ExpGEC schemes. As discussed in [31], codeword lengths of $n_1 \ge 2$ and $n_2 \ge 2$ are required in the proposed schemes in order to facilitate iterative decoding convergence to the (1,1) point in the EXIT chart, which is associated with a vanishingly low SER. If both sub-codes correspond to equal inner coding rates of $R^i_1$ and $R^i_2$ and therefore equal amounts of puncturing or doping, then they may be said to adopt equal error protection. However, in this case, the EXIT charts corresponding to the UEC and FLC-CC sub-codes may not form marginally open tunnels at the same channel $E_b/N_0$ value. This results in a range of $E_b/N_0$ values where the EXIT chart of one sub-code has an open tunnel allowing iterative decoding convergence to the (1,1) point and a low SER for the corresponding sub-symbols, while the other sub-code has a closed tunnel leading to a high SER for the corresponding sub-symbols and resulting in a high SER overall. To combat this, unequal error protection [22] may be employed to increase the $E_b/N_0$ value where the first EXIT chart tunnel becomes open, allowing the $E_b/N_0$ value where the second EXIT chart tunnel opens to be reduced to the same value. This enables a low overall SER to be achieved at this lower $E_b/N_0$ value.
For example, Figure 5 provides the EXIT charts for the unequal error protection of a particular parameterization of the RiceEC scheme. In this case, where $E_b/N_0 = 2.8$ dB, both the UEC and FLC-CC sub-codes are associated with marginally open EXIT chart tunnels, facilitating iterative decoding convergence to the (1,1) point and a low overall SER. To be specific, unequal error protection is achieved by carefully selecting the inner puncturing or doping rates $R^i_1$ and $R^i_2$ that are associated with the UEC and FLC-CC sub-codes. Note that when $R^i_1$ or $R^i_2$ is reduced in order to increase the error protection for the corresponding sub-code, the other one of $R^i_1$ or $R^i_2$ must be increased, in order to maintain the same overall throughput $\eta$, according to (31). As an alternative to excessive doping, the error protection of the FLC-CC sub-code may be increased by increasing the codeword length of the $r_2 = 4$-state CC from $n_2 = 2$ to $n_2 = 3$ bits, according to the design of [3, Table II].
Most of the schemes characterized in Table 3 have puncturing or doping rates R i that are close to 1, avoiding the performance degradation that is associated with excessive puncturing or doping. However, some schemes, such as the UEC benchmarker that results from the M Rice = 1 special case of the RiceEC scheme, have puncturing rates as high as R i = 1.7, which partly contributes to a high E b /N 0 tunnel bound. Firstly, excessive puncturing results in EXIT charts with narrower open tunnels, giving a gradual SER improvement with E b /N 0 , rather than a steep turbo cliff. Secondly, excessive puncturing increases the threshold E b /N 0 value at which a marginally open EXIT chart tunnel is created, leading to a greater capacity loss. Furthermore, a punctured code also has a decoding complexity disadvantage, since the punctured bits must be decoded alongside the transmitted bits.
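The complexity remark above can be made concrete with a small sketch of puncturing and depuncturing. It assumes that a rate R i > 1 means that only about 1/R i of the encoded bits are transmitted, and it uses a pseudo-random puncturing pattern. Both the pattern and the helper names are assumptions for illustration, not the design used in the schemes of Table 3.

```python
import random


def make_puncture_mask(n_bits, rate, seed=0):
    """Return a keep/drop mask that keeps about n_bits / rate positions.

    'rate' plays the role of a puncturing rate greater than 1 (assumption).
    The pseudo-random pattern is hypothetical; a real design may differ.
    """
    n_keep = round(n_bits / rate)
    kept = set(random.Random(seed).sample(range(n_bits), n_keep))
    return [i in kept for i in range(n_bits)]


def puncture(bits, mask):
    """Transmitter side: drop the bits at the punctured positions."""
    return [b for b, keep in zip(bits, mask) if keep]


def depuncture(llrs, mask):
    """Receiver side: re-insert zero-valued LLRs at the punctured positions.

    The decoder then processes every position, although only the kept
    positions carry channel information.
    """
    received = iter(llrs)
    return [next(received) if keep else 0.0 for keep in mask]


if __name__ == "__main__":
    mask = make_puncture_mask(1700, 1.7)
    encoded = [i % 2 for i in range(1700)]
    sent = puncture(encoded, mask)
    decoder_input = depuncture([1.5 if b else -1.5 for b in sent], mask)
    print(len(sent), len(decoder_input))
```

Under these assumptions, the decoder input is as long as the unpunctured sequence, so the trellis processing per transmitted bit grows with the puncturing rate.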
D. SER PERFORMANCE
In this section, we compare the SER performance of the proposed RiceEC and ExpGEC schemes to that of the benchmarkers for each of the scenarios listed in Table 3 . The SER performance of these schemes is characterized in Figures 8 and 9 , where a complexity limit of 5000 ACS operations per QPSK input bit is imposed, with the complexity quantified as in Table 3 , where it was characterized per decoding iteration. This complexity limit was chosen in all scenarios considered, since we found that it only marginally impacts the SER performance of whichever scheme converges to its ultimate unlimited performance with the lowest complexity in each scenario. The employment of this criterion for selecting the complexity limit ensures that excessively high-complexity schemes are not favored over those which have a marginally worse SER performance but lower complexity. Due to the considerable complexity of the trellis employed in the VLEC benchmarker, it performs poorly when this complexity limit is imposed, even for the case of L = 27. Owing to this, Figures 8 and 9 characterize the SER of the VLEC benchmarker when the complexity limit is removed, in order to characterize its ultimate performance, although this is only achieved at the cost of potentially excessive complexity. Figures 8 and 9 show that our family of schemes offers the best SER performance in all of the considered scenarios. In the finite zeta-like distribution scenarios having p 1 = 0.6, the best of the proposed schemes offers around 1 dB of gain compared to the best SSCC benchmarker for both considered values of L. Likewise, our schemes offer around 0.5 dB of gain for p 1 = 0.4, as well as around 0.75 dB of gain for the H.265 distribution. However, for p 1 = 0.2 and for the English letters distribution, our schemes offer only a marginal SER performance gain over the Rice-CC benchmarker. Note, however, that this benchmarker suffers from poor performance in other scenarios, which prevents its general applicability.
Note that the gains offered by the proposed schemes are achieved 'for free,' since they are achieved without increasing the required decoding complexity, or transmission-energy, -bandwidth, or -duration.
[Figure caption, beginning lost in extraction] ... Table 3 . Each scheme encodes an average of a = 20000 symbols per frame, which are generated using finite zeta-like probability distributions having different combinations of the parameters L ∈ {27, 1000} and p 1 ∈ {0.2, 0.4, 0.6}. Each scheme uses QPSK modulation for communication over an uncorrelated narrowband Rayleigh fading channel. For each scheme, the adopted value of the parameter k ExpG or M Rice is listed in the legend within brackets. A complexity limit of 5000 ACS operations per bit input into the QPSK modulator is imposed on each scheme, except in the case of the VLEC benchmarker where the ultimate unlimited complexity performance is shown.
The unlimited-complexity VLEC benchmarker offers an SER performance very close to that of our proposed schemes for p 1 ∈ {0.2, 0.4} and L = 27. However, it should be noted that its complexity is more than an order of magnitude greater than that of our proposed schemes. Furthermore, the complexity of the VLEC benchmarker becomes impractical for values of L significantly greater than 27.
[Figure caption, beginning lost in extraction] ... Table 3 when the source symbols obey the probability distribution of letters in the English alphabet, as well as when the symbols obey the probability distribution of an H.265 encoder. Each scheme encodes an average of a = 20000 symbols per frame and uses QPSK modulation for communication over an uncorrelated narrowband Rayleigh fading channel. For each scheme, the parameter k ExpG or M Rice is listed in the legend within brackets. A complexity limit of 5000 ACS operations per bit input into the QPSK modulator is imposed on each scheme, except in the case of the VLEC benchmarker where the ultimate unlimited complexity performance is shown.
At higher values of p 1 , lower values of the parameters k ExpG and M Rice give higher coding rates R o 1 and R o 2 , enabling higher effective throughputs η without requiring excessive puncturing. This facilitates better SER performance than may be achieved using higher values of k ExpG and M Rice at these p 1 values. Conversely, at lower values of p 1 , higher values of these parameters are preferable. For example, in the scenario where we have L = 27 and p 1 = 0.2, the k ExpG = 2 parameterization of the ExpGEC scheme outperforms the k ExpG = 0 parameterization by 0.5 dB. By contrast, when p 1 = 0.6, the k ExpG = 0 parameterization offers a marginal improvement over the k ExpG = 1 parameterization. This is also the case for the RiceEC scheme, where the M Rice = 8 parameterization offers the best performance at p 1 = 0.2 and L = 27, while the M Rice = 1 parameterization offers the best performance at p 1 = 0.6. In the case where L = 1000, the coding rates R o 1 and R o 2 of the RiceEC scheme are lower than those of the ExpGEC scheme, requiring more puncturing to meet the effective target throughput η. Owing to this, the ExpGEC scheme is superior to the RiceEC scheme in these cases. By contrast, in the case where we have L = 27, the coding rates R o 1 and R o 2 of the RiceEC scheme are similar to those of the ExpGEC scheme, with neither scheme requiring any significant puncturing or doping. This allows the RiceEC scheme to provide similar performance to the ExpGEC scheme when L = 27.
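The dependence of the preferred k ExpG and M Rice values on p 1 can be explored with a rough sketch that compares average source codeword lengths. It uses the standard codeword lengths of Rice codes (with M a power of two) and of order-k exponential Golomb codes. The function `zeta_like_pmf` assumes that a finite zeta-like distribution has probabilities proportional to x^(-s) for x = 1, ..., L, with s chosen so that the first symbol has probability p 1 ; this definition is an assumption made for illustration and may differ from the one used for Table 3. The average codeword length is only a proxy for the coding rates R o 1 and R o 2 and does not predict the SER.

```python
def zeta_like_pmf(L, p1):
    """Assumed finite zeta-like pmf: P(x) proportional to x**(-s), x = 1..L.

    The exponent s is found by bisection so that P(1) equals p1.
    Requires 1/L < p1 < 1.
    """
    lo, hi = 0.0, 50.0
    for _ in range(100):
        s = 0.5 * (lo + hi)
        p_first = 1.0 / sum(x ** -s for x in range(1, L + 1))
        if p_first < p1:
            lo = s  # distribution too flat: increase the exponent
        else:
            hi = s
    s = 0.5 * (lo + hi)
    weights = [x ** -s for x in range(1, L + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def rice_length(x, m_rice):
    """Length in bits of the Rice codeword for symbol x >= 1 (m_rice a power of 2)."""
    suffix_bits = m_rice.bit_length() - 1  # log2(m_rice)
    unary_bits = (x - 1) // m_rice + 1     # unary-coded quotient
    return unary_bits + suffix_bits


def expg_length(x, k):
    """Length in bits of the order-k exponential Golomb codeword for symbol x >= 1."""
    value = (x - 1) + (1 << k)
    floor_log2 = value.bit_length() - 1
    return 2 * floor_log2 - k + 1


def average_length(pmf, length_fn, param):
    """Expected codeword length over symbols 1..len(pmf)."""
    return sum(p * length_fn(x, param) for x, p in enumerate(pmf, start=1))


if __name__ == "__main__":
    for p1 in (0.2, 0.4, 0.6):
        pmf = zeta_like_pmf(27, p1)
        expg = {k: round(average_length(pmf, expg_length, k), 3) for k in (0, 1, 2)}
        rice = {m: round(average_length(pmf, rice_length, m), 3) for m in (1, 2, 4, 8)}
        print(p1, expg, rice)
```

With M Rice = 1 the Rice code reduces to the unary code, and with k ExpG = 0 the exponential Golomb code reduces to the Elias gamma code, so the same functions also cover the UEC and EGEC special cases.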
VI. CONCLUSIONS
In this paper, we have extended and generalized the EGEC code of [22] to give the proposed ExpGEC code. Similarly, we have extended the UEC code of [3] to design the proposed RiceEC code. Both of these novel codes facilitate operation over a significantly wider range of source symbol distributions. This paper has focused on the scenario where the cardinality L of the source symbol set is finite, allowing comparison to a VLEC benchmarker. We have shown that the proposed schemes achieve the same SER performance as the VLEC benchmarker, but at an order of magnitude lower complexity when we have L = 27. Furthermore, our proposed schemes maintain a moderate complexity for significantly higher L values, while that of the VLEC benchmarker may become excessive. We have also proposed a technique for increasing the practicality of the proposed schemes, which allows fixed-length designs to be employed for all interleavers. Finally, we have shown across a wide range of application scenarios that our family of proposed schemes outperforms several SSCC benchmarkers by as much as 1 dB, at no cost in terms of decoding complexity, transmission-energy, -bandwidth or -duration.
