Abstract - Conventional two-stage turbo-detected schemes typically suffer from a Bit Error Rate (BER) floor, which prevents them from achieving infinitesimally low BER values, especially when the inner coding stage is of non-recursive nature. We circumvent this deficiency by proposing a three-stage turbo-detected Sphere Packing (SP) aided Space-Time Block Coding (STBC) scheme, referred to as STBC-SP, where a rate-1 recursive inner precoder is employed to avoid having a BER floor. The convergence behaviour of this serially concatenated scheme is investigated with the aid of 3D Extrinsic Information Transfer (EXIT) charts. Furthermore, the capacity of the STBC-SP scheme is shown and an algorithm is proposed for calculating a tighter upper bound on the maximum achievable bandwidth efficiency. The proposed three-stage turbo-detected scheme operates within about 1.0 dB of the capacity and within 0.5 dB of the maximum achievable bandwidth efficiency limit.
INTRODUCTION
The concept of combining orthogonal transmit diversity designs with the principle of sphere packing was introduced by Su et al. in 2003 [1] in order to maximise the achievable coding advantage 1 , where it was demonstrated that the proposed Sphere Packing (SP) aided Space-Time Block Coded (STBC) system, referred to here as STBC-SP, was capable of outperforming the conventional orthogonal design based STBC schemes of [2, 3] .
The ultimate rationale of this paper is to use a novel three-dimensional Extrinsic Information Transfer (EXIT)-chart-based technique to jointly design the two time-slots' STBC signal by near-optimally combining them into an iteratively detected SP symbol.
The turbo principle of [4] was extended to multiple serially concatenated codes in 1998 [5]. The appeal of concatenated coding is that low-complexity iterative detection replaces the potentially more complex optimum decoder, such as that of [6]. In [7], the employment of the turbo principle was considered for iterative soft demapping in the context of bit-interleaved coded modulation (BICM), where a soft demapper was used between the multilevel demodulator and the channel decoder. In [8], a turbo coding scheme was proposed for the multiple-input multiple-output (MIMO) Rayleigh fading channel, where a block code was employed as an outer channel code, while an orthogonal STBC scheme was considered as the inner code. The iterative soft demapping principle of [7] was extended to STBC-SP schemes in [9], where it was demonstrated that turbo-detected STBC-SP schemes provide useful performance improvements over conventionally-modulated orthogonal design based STBC schemes. It was shown in [10] that a recursive inner code is needed in order to maximise the interleaver gain and to avoid the formation of a bit-error rate (BER) floor, when employing iterative decoding. This principle has been adopted by several authors designing serially concatenated schemes, where rate-1 inner codes were employed for designing low complexity turbo codes suitable for bandwidth and power limited systems having stringent BER requirements [11] [12] [13].
* The financial support of the EPSRC, UK and that of the European Union under the Phoenix and Newcom projects as well as that of the Ministry of Higher Education of Saudi Arabia is gratefully acknowledged.
1 The diversity product or coding advantage was defined as the estimated gain over an uncoded system having the same diversity order as the coded system [1].
Recently, studying the convergence behaviour of iterative decoding has attracted considerable attention [14] [15] [16] [17]. In [14], ten Brink proposed the employment of the so-called EXIT characteristics between a concatenated decoder's output and input for describing the flow of extrinsic information through the soft-in/soft-out constituent decoders. The computation of EXIT charts was further simplified in [15] to a time average for scenarios where the PDFs of the communicated information at the input and output of the constituent decoders are both symmetric and ergodic. The concept of EXIT chart analysis has been extended to three-stage concatenated systems in [16, 17].
In this paper, we show that at a spectral efficiency of η = 1 bit/s/Hz, the upper bound of the maximum achievable rate is within 0.5 dB of the capacity, and our proposed three-stage scheme operates within 1.0 dB of the capacity. The rationale of the proposed architecture is explicit: (1) SP modulation maximises the coding advantage of the transmission scheme by jointly designing and detecting the SP symbols hosting the two time-slots' STBC symbols; (2) the inner rate-1 recursive decoder maximises the interleaver gain and hence avoids having a BER floor; and (3) the outer irregular convolutional codes (IRCCs) [15, 18] minimise the area of the EXIT chart's convergence tunnel and hence facilitate near-capacity operation [19].
This paper is organised as follows. In Section 2, a brief description of our three-stage system is presented. Section 3 provides our 3D EXIT chart analysis along with its simplified 2D projections. The capacity of STBC-SP schemes is shown in Section 4, where an upper bound on the maximum achievable rate is also calculated based on the EXIT chart analysis. Our simulation results and discussions are provided in Section 5. Finally, we conclude in Section 6.

1-4244-0353-7/07/$25.00 ©2007 IEEE
This full text paper was peer reviewed at the direction of IEEE Communications Society subject matter experts for publication in the ICC 2007 proceedings. 
SYSTEM OVERVIEW
The schematic of the entire system is shown in Fig. 1, where the transmitted source bits u1 are encoded by the outer channel Encoder I having a rate of RI. The outer channel encoded bits c1 are then interleaved by the first random bit interleaver, and the randomly permuted bits u2 are fed through the rate-1 Encoder II. The coded bits c2 at the output of the rate-1 encoder are interleaved by the second random bit interleaver, producing the permuted bits u3. After channel interleaving, the sphere packing mapper maps each block of B channel-coded bits $\mathbf{b} = b_0, \ldots, b_{B-1} \in \{0, 1\}$ to one of the $L = 2^B$ legitimate four-dimensional sphere packing modulated symbols $s^l \in S$, where
$$S = \{ s^l = [a_{l,1}, a_{l,2}, a_{l,3}, a_{l,4}] \in \mathbb{R}^4 : 0 \le l \le L-1 \}$$
constitutes a set of L legitimate constellation points from the lattice D4 [20] having a total energy of
$$E = \sum_{l=0}^{L-1} \left( |a_{l,1}|^2 + |a_{l,2}|^2 + |a_{l,3}|^2 + |a_{l,4}|^2 \right).$$
The STBC encoder then maps each sphere packing modulated symbol $s^l$ to a space-time signal $C_l$ as [1, 9]:
$$C_l = \sqrt{\frac{2L}{E}} \, G_2(x_{l,1}, x_{l,2}), \tag{1}$$
where $x_{l,1}$ and $x_{l,2}$ are complex-valued symbols constructed from the 4-dimensional real-valued coordinates of the SP symbol $s^l$ in order to maximise the coding advantage of the space-time signal $C_l$ [1], since the lattice D4 has the best minimum Euclidean distance in the four-dimensional real-valued Euclidean space $\mathbb{R}^4$ [20]. Specifically, $x_{l,1}$ and $x_{l,2}$ may be written as
$$\{x_{l,1}, x_{l,2}\} = \{a_{l,1} + j a_{l,2}, \; a_{l,3} + j a_{l,4}\},$$
and $G_2(x_1, x_2)$ is the space-time transmission matrix given by [2]
$$G_2(x_1, x_2) = \begin{bmatrix} x_1 & x_2 \\ -x_2^* & x_1^* \end{bmatrix}, \tag{2}$$
where the rows and columns of Eq. (2) represent the temporal and spatial dimensions, corresponding to two consecutive time slots and two transmit antennas, respectively. In this treatise, we consider a correlated narrowband Rayleigh fading channel associated with a normalised Doppler frequency of $f_D = f_d T_s = 0.1$, where $f_d$ is the Doppler frequency and $T_s$ is the symbol period. The complex-valued fading envelope is assumed to be constant across the transmission period of a space-time coded symbol spanning T = 2 time slots. The complex Additive White Gaussian Noise (AWGN) of $n = n_I + j n_Q$ is also added to the received signal, where $n_I$ and $n_Q$ are two independent zero-mean Gaussian random variables having a variance of $\sigma_n^2 = N_0/2$ per dimension.
As shown in Fig. 1, the received complex-valued symbols are first decoded by the STBC decoder in order to produce the received SP soft-symbols r, where each SP symbol represents a block of B coded bits [9]. Then, iterative demapping/decoding is carried out between the SP demapper, the APP-based soft-in/soft-out (SISO) module II and the APP-based SISO module I, where extrinsic information is exchanged between the three constituent demapper/decoder modules. More specifically, $L_{\cdot,a}(\cdot)$ in Fig. 1 represents the a priori information, expressed in terms of the log-likelihood ratios (LLRs) of the corresponding bits, whereas $L_{\cdot,e}(\cdot)$ represents the extrinsic LLRs of the corresponding bits. The iterative process is performed for a number of consecutive iterations. During the last iteration, only the LLR values $L_{I,e}(u_1)$ of the original uncoded systematic information bits $u_1$ are required, which are passed to a hard decision decoder in order to determine the estimated transmitted source bits $\hat{u}_1$, as shown in Fig. 1.
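As a rough illustration of the SP-symbol-to-space-time mapping described in this section, the following sketch builds the 2x2 space-time signal from one four-dimensional SP symbol. It assumes the coordinate pairing x1 = a1 + j a2, x2 = a3 + j a4 and the standard Alamouti G2 layout of [2] (rows as time slots, columns as transmit antennas); the function names `sp_to_stbc` and `gram`, the `scale` argument and the example symbol are hypothetical and chosen only for illustration.

```python
def sp_to_stbc(a, scale=1.0):
    """Map a 4-D real SP symbol (a1, a2, a3, a4) to a 2x2 space-time matrix.

    Assumption: x1 = a1 + j*a2, x2 = a3 + j*a4, and the Alamouti G2 layout
    with rows = time slots and columns = transmit antennas.
    """
    a1, a2, a3, a4 = a
    x1 = complex(a1, a2)
    x2 = complex(a3, a4)
    return [
        [scale * x1, scale * x2],                            # time slot 1
        [-scale * x2.conjugate(), scale * x1.conjugate()],   # time slot 2
    ]


def gram(c):
    """Return C^H C for a 2x2 complex matrix given as nested lists."""
    return [
        [sum(c[t][i].conjugate() * c[t][j] for t in range(2)) for j in range(2)]
        for i in range(2)
    ]


symbol = (1.0, 0.0, 0.0, 1.0)   # hypothetical D4 point with energy 2
C = sp_to_stbc(symbol)
G = gram(C)                      # orthogonal design: (|x1|^2 + |x2|^2) * identity
```

The Gram matrix being a scaled identity is what allows the STBC decoder to separate the two complex symbols with linear processing before the SP demapper treats them jointly.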
EXIT CHART ANALYSIS
Preliminaries
The main objective of employing EXIT charts [14] is to predict the convergence behaviour of the iterative decoder by examining the evolution of the input/output mutual information exchange between the inner and outer decoders in consecutive iterations. The application of EXIT charts is based on two assumptions, which hold for large interleaver lengths: (1) the a priori LLR values are fairly uncorrelated; (2) the a priori LLR values exhibit a Gaussian PDF. In this section, the approach presented in [17] is adopted in order to provide the EXIT chart analysis of the proposed three-stage system of Fig. 1.
Let I·,a(x), 0 ≤ I·,a(x) ≤ 1, denote the mutual information (MI) between the a priori LLRs L·,a(x) as well as the corresponding bits x and let I·,e(x), 0 ≤ I·,e(x) ≤ 1, denote the MI between the extrinsic LLRs L·,e(x) and the corresponding bits x, where the subscript (·) is used to distinguish the different constituent decoders, i.e. Decoder I, Decoder II and the SP demapper.
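To make the mutual information quantities above more concrete, the following sketch estimates the MI between bits and their LLRs by a time average over known bits. It is one common estimator and not necessarily the exact formulation of [15]. The LLR sign convention L = ln(P(b=0)/P(b=1)), the Gaussian LLR model with mean σ²/2 and standard deviation σ, and all function names are assumptions made for this example.

```python
import math
import random


def log2_one_plus_exp(z):
    """Numerically stable log2(1 + exp(z))."""
    return (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / math.log(2.0)


def mutual_information(bits, llrs):
    """Time-average estimate of I(bit; LLR), assuming L = ln(P(b=0)/P(b=1))."""
    total = 0.0
    for b, llr in zip(bits, llrs):
        x = 1.0 - 2.0 * b            # bit 0 -> +1, bit 1 -> -1
        total += log2_one_plus_exp(-x * llr)
    return 1.0 - total / len(bits)


def gaussian_llrs(bits, sigma, rng):
    """Draw LLRs with mean x*sigma^2/2 and standard deviation sigma."""
    return [(1.0 - 2.0 * b) * sigma ** 2 / 2.0 + sigma * rng.gauss(0.0, 1.0)
            for b in bits]


rng = random.Random(1)
bits = [rng.randint(0, 1) for _ in range(20000)]
mi_by_sigma = {s: mutual_information(bits, gaussian_llrs(bits, s, rng))
               for s in (0.0, 2.0, 10.0)}
```

With σ = 0 the LLRs carry no information and the estimate is zero, while a large σ drives it towards one, which matches the range 0 ≤ I ≤ 1 stated above.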
3D EXIT Charts
As seen from Fig. 1, the input of Decoder II is constituted by the a priori inputs $L_{II,a}(c_2)$ and $L_{II,a}(u_2)$, provided after bit-deinterleaving by the SP demapper and Decoder I, respectively. Therefore, the EXIT characteristic of Decoder II can be described by the following two EXIT functions [14, 17]:
$$I_{II,e}(c_2) = T_{II,c_2}[I_{II,a}(u_2), I_{II,a}(c_2)], \tag{3}$$
$$I_{II,e}(u_2) = T_{II,u_2}[I_{II,a}(u_2), I_{II,a}(c_2)], \tag{4}$$
which are illustrated by the 3D surfaces drawn in dotted lines in Figs. 2 and 3, respectively. On the other hand, the EXIT characteristic of the SP demapper as well as that of Decoder I are each dependent on a single a priori input, namely on $L_{M,a}(u_3)$ and $L_{I,a}(c_1)$, respectively, both of which are provided by the rate-1 Decoder II after appropriately ordering the bits, as seen in Fig. 1. The EXIT characteristic of the SP demapper is also dependent on the $E_b/N_0$ value. Consequently, the corresponding EXIT functions for the SP demapper and Decoder I, respectively, may be written as
$$I_{M,e}(u_3) = T_{M,u_3}[I_{M,a}(u_3), E_b/N_0], \tag{5}$$
$$I_{I,e}(c_1) = T_{I,c_1}[I_{I,a}(c_1)], \tag{6}$$
which are illustrated by the 3D surfaces drawn in solid lines in Figs. 2 and 3, respectively. Eqs. (3) to (6) may be represented with the aid of two 3D EXIT charts. More specifically, the 3D EXIT chart of Fig. 2 can be used to plot Eq. (3) and Eq. (5), which describe the EXIT relation between the SP demapper and Decoder II. Similarly, the 3D EXIT chart of Fig. 3 can be used to describe the EXIT relation between Decoder II and Decoder I by plotting Eq. (4) and Eq. (6).
Figs. 2 and 3 show an example of these 3D EXIT charts, when Encoder I is a half-rate memory-1 recursive systematic convolutional (RSC) code having octally represented generator polynomials of $(G_r, G) = (3, 2)_8$, where $G_r$ is the feedback polynomial, while Encoder II is a simple rate-1 accumulator, described by the pair of octal generator polynomials $(G/G_r) = (2/3)_8$.
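The rate-1 accumulator used as Encoder II can be sketched in a few lines. The sketch assumes that the octal pair (2/3) corresponds to the transfer function 1/(1+D), i.e. each output bit is the running modulo-2 sum of the input bits; the function names are hypothetical, and `deaccumulate` is a hard-decision inverse included only for checking, not the APP-based SISO decoder of the system.

```python
def accumulate(bits):
    """Rate-1 recursive encoder c_k = u_k XOR c_{k-1}, assuming 1/(1+D)."""
    state = 0
    coded = []
    for u in bits:
        state ^= u
        coded.append(state)
    return coded


def deaccumulate(coded):
    """Inverse mapping u_k = c_k XOR c_{k-1} (hard-decision check only)."""
    prev = 0
    bits = []
    for c in coded:
        bits.append(c ^ prev)
        prev = c
    return bits


impulse_response = accumulate([1, 0, 0, 0, 0])   # a single 1 never dies out
example = accumulate([1, 0, 1, 1])
```

The non-terminating impulse response is the recursive property that the inner stage relies on to avoid a BER floor.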
2D EXIT Chart Projections
The 3D EXIT charts of Figs. 2 and 3 are somewhat cumbersome to interpret as well as to plot. Hence in this section we derive their unique and unambiguous 2D representations, which can be interpreted in the usual way.
The intersection of the surfaces in Fig. 2, shown as a thick solid line, portrays the best achievable performance when exchanging mutual information between the SP demapper and the rate-1 Decoder II for different fixed values of $I_{II,a}(u_2)$ spanning the range of [0, 1]. Each $(I_{II,a}(u_2), I_{II,a}(c_2), I_{II,e}(c_2))$ point belonging to the intersection line in Fig. 2 uniquely specifies a 3D point $(I_{II,a}(u_2), I_{II,a}(c_2), I_{II,e}(u_2))$ in Fig. 3, according to the EXIT function of Eq. (4). Therefore, the line corresponding to the $(I_{II,a}(u_2), I_{II,a}(c_2), I_{II,e}(c_2))$ points along the thick line of Fig. 2 is projected to the solid line shown in Fig. 3, while the 2D projection of the solid line in Fig. 3 at $I_{II,a}(c_2) = 0$ onto the plane spanned by the lines $(I_{II,a}(u_2), I_{II,e}(u_2))$ and $(I_{I,e}(c_1), I_{I,a}(c_1))$ is shown in Fig. 4-a, represented by the dotted line at $E_b/N_0 = 2.0$ dB. This projected EXIT curve may be written as
$$I^p_{II,e}(u_2) = T^p_{II,u_2}[I_{II,a}(u_2), E_b/N_0]. \tag{7}$$
Projected 2D EXIT charts of a similar nature will be used throughout the rest of the paper for the sake of describing the convergence behaviour of the three-stage turbo-detected STBC-SP scheme. More details on the related 3D-to-2D EXIT chart projection are provided in [17]. Fig. 4-a shows the 2D-projected EXIT curve of the SP demapper, when operating at $E_b/N_0 = 2.0$ dB and employing Anti-Gray Mapping 2 (AGM-1) scheme, which is described in Appendix A and in Table 1. The figure also shows the 2D-projected EXIT curve of the outer RSC Decoder I and the 2D-projected EXIT curves of the combined SP demapper and the rate-1 Decoder II at different $E_b/N_0$ values, when employing AGM-1 of Table 1. Observe in Fig. 4-a that an open convergence tunnel is taking shape for the three-stage scheme upon increasing the Signal-to-Noise Ratio (SNR) beyond $E_b/N_0 = 2.0$ dB. This implies that, according to the predictions of the 2D EXIT chart seen in Fig. 4-a, the iterative decoding process is expected to converge to the (1.0, 1.0) point and hence an infinitesimally low BER may be attained beyond $E_b/N_0 = 2.0$ dB.
By contrast, for the traditional two-stage turbo-detected STBC-SP scheme, there would be a BER floor preventing it from achieving an infinitesimally low BER due to the non-recursive nature of the SP demapper, which also prevents the intersection of the EXIT curves of the SP demapper and the outer RSC Decoder I from reaching the (1.0, 1.0) point of convergence, regardless of the SNR or the number of iterations. The three-stage scheme of Fig. 1, however, becomes capable of achieving an infinitesimally low BER, as suggested by the EXIT-chart predictions of Fig. 4-a.
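The difference between an inner curve that reaches the (1.0, 1.0) point and one that does not can be illustrated with a toy staircase trajectory. The three EXIT functions below are hypothetical closed-form curves chosen only for this sketch; they are not the measured curves of Fig. 4-a, and the function names are assumptions.

```python
def trajectory(inner, outer, iterations):
    """Staircase trajectory: alternate the inner and outer EXIT functions."""
    points = []
    i_outer = 0.0                      # no a priori information at the start
    for _ in range(iterations):
        i_inner = inner(i_outer)       # extrinsic MI from the inner stage
        i_outer = outer(i_inner)       # extrinsic MI from the outer decoder
        points.append((i_inner, i_outer))
    return points


def outer_curve(i):
    return i ** 2                      # toy outer-decoder EXIT function


def recursive_inner(i):
    return 0.6 + 0.4 * i               # toy curve that reaches 1.0 at i = 1


def non_recursive_inner(i):
    return 0.5 + 0.3 * i               # toy curve that saturates at 0.8


open_tunnel = trajectory(recursive_inner, outer_curve, 200)
stuck = trajectory(non_recursive_inner, outer_curve, 200)
```

In this toy setting the first trajectory climbs towards (1.0, 1.0), whereas the second settles at the intersection of the two curves well below 1.0, which mirrors the BER-floor behaviour attributed to a non-recursive inner stage.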
EXIT Tunnel-Area Minimisation for Near-Capacity Operation
In this section we exploit the well-understood property of conventional 2D EXIT charts that a narrow but open EXIT tunnel represents near-capacity performance. Therefore, we invoke Irregular Convolutional Codes (IRCCs) for the sake of appropriately shaping the EXIT curves by minimising the area within the EXIT tunnel using the procedure of [15, 18]. Let $A_I$ and $\bar{A}_I$ be the areas under the EXIT curve $T_{I,c_1}(i)$ of Eq. (6) and its inverse $T^{-1}_{I,c_1}(i)$, respectively, which are expressed as:
$$A_I = \int_0^1 T_{I,c_1}(i)\,di, \qquad \bar{A}_I = \int_0^1 T^{-1}_{I,c_1}(i)\,di = 1 - A_I. \tag{8}$$
Similarly, the area $A^p_{II}$ is defined under the EXIT curve $T^p_{II,u_2}(i)$ of Eq. (7). It was observed in [15, 21] that for the APP-based outer Decoder I, the area $\bar{A}_I$ may be approximated by $\bar{A}_I \approx R_I$, where the equality $\bar{A}_I = R_I$ was later shown in [19] for the family of Binary Erasure Channels (BECs). The area property of $\bar{A}_I \approx R_I$ implies that the lowest SNR convergence threshold occurs when we have $A^p_{II} = R_I + \epsilon$, where $\epsilon$ is an infinitesimally small number, provided that the following convergence constraints hold [18]:
$$T^p_{II,u_2}(0) > 0, \quad T^p_{II,u_2}(1) = 1, \quad T^p_{II,u_2}(i) > T^{-1}_{I,c_1}(i), \;\; \forall i \in [0, 1). \tag{9}$$
Observe in Fig. 4-a, however, that there is a 'larger-than-necessary' tunnel area between the projected EXIT curve $T^p_{II,u_2}(i)$ and the EXIT curve $T^{-1}_{I,c_1}(i)$ of the outer 1/2-rate RSC code at $E_b/N_0 = 2.0$ dB. This implies that the BER curve is farther from the achievable capacity than necessary, despite the fact that the specific bit-to-SP-symbol mapping scheme of AGM-1 and the 1/2-rate RSC code employed in Fig. 4-a were specifically optimised for convergence at a low $E_b/N_0$ value. More quantitatively, the area $A^p_{II}$ under the projected EXIT curve $T^p_{II,u_2}(i)$ of Fig. 4-a is larger than the outer code rate of $R_I = 0.50$. Therefore, according to Fig. 4-a and to the area property of $\bar{A}_I \approx R_I$, a lower $E_b/N_0$ convergence threshold may be attained, provided that the constraints outlined in Eq. (9) are satisfied and that the EXIT curve of the outer code matches the projected EXIT curve of Fig. 4-a more closely. Hence we invoke IRCCs [15, 18] as outer codes that exhibit flexible EXIT characteristics, which can be optimised to more closely match the 2D-projected EXIT curve $T^p_{II,u_2}(i)$ of Fig. 4-a, rendering the near-capacity code optimisation a simple curve-fitting process.
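The area argument of this section can be checked numerically for any pair of sampled EXIT curves. The sketch below computes the areas with the trapezoidal rule and tests an assumed form of the convergence constraints (a positive starting point, arrival at 1.0, and the inner curve staying above the inverted outer curve on [0, 1)). The two closed-form curves and the function names are hypothetical stand-ins, not the curves of Fig. 4-a.

```python
def area_under(curve, samples=1000):
    """Trapezoidal estimate of the area under curve(i) for i in [0, 1]."""
    step = 1.0 / samples
    values = [curve(k * step) for k in range(samples + 1)]
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))


def tunnel_is_open(inner, outer_inverse, samples=1000):
    """Check inner(0) > 0, inner(1) = 1 and inner(i) > outer_inverse(i) on [0, 1)."""
    if inner(0.0) <= 0.0 or abs(inner(1.0) - 1.0) > 1e-9:
        return False
    return all(inner(k / samples) > outer_inverse(k / samples)
               for k in range(samples))


def inner_curve(i):
    return 0.6 + 0.4 * i               # toy projected inner EXIT curve


def outer_inverse(i):
    return i ** 0.5                    # toy inverted outer EXIT curve


a_inner = area_under(inner_curve)          # close to 0.8
a_outer_inv = area_under(outer_inverse)    # close to 2/3
tunnel_area = a_inner - a_outer_inv        # the gap an IRCC would try to shrink
is_open = tunnel_is_open(inner_curve, outer_inverse)
```

Under the area property, the difference `tunnel_area` indicates how much rate is being given up; shaping the outer curve to reduce it while keeping `is_open` true is the curve-fitting task assigned to the IRCC.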
An IRCC scheme constituted by a set of P = 17 subcodes was constructed in [18] from a systematic 1/2-rate memory-4 mother code defined by the octally represented generator polynomials $(G_r, G) = (23, 35)_8$. Each of the P = 17 subcodes encodes a specific fraction of the uncoded bits determined by the weighting coefficient $\alpha_i$, i = 0, ..., P. Hence the coefficients $\alpha_i$ are optimised with the aid of the iterative algorithm of [15], so that the EXIT curve of the resultant IRCC closely matches the 2D-projected EXIT curve $T^p_{II,u_2}(i)$.
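How the weighting coefficients shape the outer code can be sketched with a toy example. It assumes the usual IRCC relations, which the text above does not spell out: the weights sum to one, and both the overall rate and the overall EXIT curve are the α-weighted sums over the subcodes. The three subcode rates, their closed-form EXIT curves and the weights are hypothetical; the actual scheme uses P = 17 subcodes and optimised coefficients.

```python
def ircc_rate(alphas, rates):
    """Overall code rate as the alpha-weighted sum of the subcode rates."""
    return sum(a * r for a, r in zip(alphas, rates))


def ircc_exit(alphas, subcode_curves, i):
    """Overall EXIT function as the alpha-weighted sum of subcode curves."""
    return sum(a * curve(i) for a, curve in zip(alphas, subcode_curves))


# Hypothetical subcodes: a lower rate gives a higher (stronger) EXIT curve.
rates = [0.25, 0.5, 0.75]
curves = [lambda i: i, lambda i: i ** 2, lambda i: i ** 3]
alphas = [0.25, 0.5, 0.25]

overall_rate = ircc_rate(alphas, rates)
mid_point = ircc_exit(alphas, curves, 0.5)
end_point = ircc_exit(alphas, curves, 1.0)
```

Changing `alphas` while keeping `overall_rate` fixed bends the combined curve without altering the throughput, which is the degree of freedom exploited by the curve-fitting optimisation.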
CAPACITY AND MAXIMUM ACHIEVABLE RATE
The channel capacity and bandwidth efficiency valid for STBC schemes using $N_D$-dimensional so-called L-orthogonal signalling [22] over the Discrete-input Continuous-output Memoryless Channel (DCMC) [23] was derived in [24]. Fig. 5 shows the DCMC bandwidth efficiency $\eta^{STBC\text{-}SP}_{DCMC}$ of the 4-dimensional SP modulation assisted STBC scheme for L = 16, where the Continuous-Input Continuous-Output Memoryless Channel (CCMC) [23] capacity of the MIMO scheme is given by [25]. More specifically, Fig. 5 demonstrates that at a bandwidth efficiency of η = 1 bit/s/Hz, the capacity limit for the DCMC STBC-SP scheme employing $N_t = 2$ transmit and $N_r = 1$ receive antennas is $E_b/N_0 = 0.78$ dB. The EXIT chart analysis of Fig. 4-b predicts that our three-stage system will converge at $E_b/N_0 = 1.5$ dB, i.e. within 0.72 dB of the capacity limit. The dotted curve referring to the maximum achievable rate of the three-stage turbo-detected STBC-SP scheme is discussed next.
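A tighter upper limit on the maximum achievable rate of the system can be calculated based on the area property of $\bar{A}_I \approx R_I$ of EXIT charts, as discussed in Section 3.4. More explicitly, it was shown in Section 3.4 that the outer Decoder I may have a maximum rate of $R_I^{max} = A^p_{II}$ at a specific $E_b/N_0'$ value. Therefore, the maximum achievable bandwidth efficiency may be formulated as a function of the $E_b/N_0$ value as follows
$$\eta_{max}(E_b/N_0) = B \cdot R_{STBC\text{-}SP} \cdot R_I^{max} = \frac{1}{2} B \cdot A^p_{II}(E_b/N_0'), \tag{11}$$
where $B = \log_2(L)$ is the number of bits per SP symbol and $R_{STBC\text{-}SP} = 1/2$, since T = 2 time slots are needed to transmit one SP symbol according to Eqs. (1) and (2). Additionally, $E_b/N_0$ and $E_b/N_0'$ are related as follows
$$E_b/N_0 = E_b/N_0' + 10 \log_{10}\left(\frac{R_o}{A^p_{II}(E_b/N_0')}\right) \; \text{[dB]}, \tag{12}$$
where $R_o$ is the original outer code rate used when generating the 2D-projected EXIT curves of the SP demapper and the rate-1 Decoder II of Eq. (7). More specifically, the maximum achievable bandwidth efficiency of Eq. (11) can be calculated using the following procedure for $E_b/N_0' \in [\rho_{min}, \rho_{max}]$, assuming that $R_o$ is an arbitrary rate and $\epsilon$ is a small constant.
Algorithm 1 (Maximum Achievable Bandwidth Efficiency using EXIT Charts):
Step 1: Let $R_I = R_o$.
Step 2: Let $E_b/N_0' = \rho_{min}$ dB.
Step 3: Calculate $N_0$.
Step 4: Let $I_{M,a}(u_3) = 0$.
Step 5: Activate the SP demapper.
Step 6: Save the resulting $I_{M,e}(u_3)$.
Step 7: Let $I_{M,a}(u_3) = I_{M,a}(u_3) + \epsilon$. If $I_{M,a}(u_3) \le 1.0$, go to Step 5.
Step 8: Calculate the area under the EXIT curve saved in Step 6.
Step 9: Calculate $E_b/N_0$ using Eq. (12).
Step 10: Save $\eta_{max}(E_b/N_0)$ of Eq. (11).
Step 11: Let $E_b/N_0' = E_b/N_0' + \epsilon$ dB. If $E_b/N_0' \le \rho_{max}$, go to Step 3.
Step 12: Output $\eta_{max}(E_b/N_0)$ from Step 10.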
Observe that ρmin and ρmax are adjusted accordingly in order to produce the desired range of the resultant E b /N0 values. Furthermore, the output of Algorithm 1 is independent of the specific choice of Ro, since Eq. (12) would always adjust the E b /N0 values, regardless of Ro. For example, Ro may be set to the desired final RI to be employed in the three-stage system.
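The procedure of Algorithm 1 can be sketched as a sweep over the SNR range. The sketch assumes that the bandwidth efficiency is the product of the bits per SP symbol, the factor 1/T for T = 2 time slots and the area under the EXIT curve, and that the resultant SNR is shifted by 10 log10(Ro/area) dB relative to the SNR at which the curve was generated. The function `toy_demapper_exit` is a hypothetical closed-form stand-in for the SP demapper, so the numbers produced have no bearing on Fig. 5; all names and default values are assumptions.

```python
import math


def toy_demapper_exit(i_a, ebn0_db):
    """Hypothetical demapper EXIT function; NOT the SP demapper of the paper."""
    start = 0.2 + 0.6 / (1.0 + math.exp(-0.4 * ebn0_db))
    return min(1.0, start + 0.2 * i_a)


def max_bandwidth_efficiency(exit_fn, rho_min, rho_max, r_o=0.5,
                             bits_per_symbol=4, time_slots=2,
                             snr_step=0.5, mi_steps=100):
    """Sweep the generating SNR and return (resultant Eb/N0 [dB], eta_max) pairs."""
    results = []
    n_snr = int(round((rho_max - rho_min) / snr_step))
    for k in range(n_snr + 1):
        ebn0_prime = rho_min + k * snr_step                  # Steps 2 and 11
        # Step 3 (computing N0) is implicit: the toy EXIT function takes dB.
        curve = [exit_fn(m / mi_steps, ebn0_prime)           # Steps 4 to 7
                 for m in range(mi_steps + 1)]
        area = (sum(curve) - 0.5 * (curve[0] + curve[-1])) / mi_steps  # Step 8
        ebn0 = ebn0_prime + 10.0 * math.log10(r_o / area)    # Step 9
        eta = bits_per_symbol * area / time_slots            # Step 10
        results.append((ebn0, eta))
    return results                                           # Step 12


curve_points = max_bandwidth_efficiency(toy_demapper_exit,
                                        rho_min=-2.0, rho_max=4.0)
```

For the toy curve at a generating SNR of 0 dB the area is 0.6, giving an efficiency of 1.2 bit/s/Hz at a resultant SNR of roughly -0.79 dB; a real evaluation would replace `toy_demapper_exit` with measured demapper EXIT points.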
The resultant maximum achievable bandwidth efficiency is demonstrated in Fig. 5, which is slightly lower than the bandwidth efficiency calculated according to [24], i.e. we have $\eta_{max} < \eta^{STBC\text{-}SP}_{DCMC}$. Observe that the bandwidth efficiency calculated according to [24] and that calculated using the EXIT charts as well as Eq. (11) were only proven to be equal for the family of BECs [19]. Nonetheless, similar trends have been observed for both AWGN and Inter-Symbol-Interference (ISI) channels [16, 18], when APP-based decoders are used for all decoder blocks [19]. However, the discrepancy between the two bandwidth efficiency curves shown in Fig. 5 is due to the fact that the SP demapper is not an APP-based decoder. Nevertheless, the bandwidth efficiency calculated based on the EXIT charts using Eq. (11) and Algorithm 1 constitutes a tighter bound on the maximum achievable bandwidth efficiency of the system. Fig. 5 shows that at a bandwidth efficiency of η = 1 bit/s/Hz, the maximum achievable bandwidth efficiency limit of the STBC-SP scheme is about $E_b/N_0 = 1.3$ dB, which is within 0.2 dB of the prediction of our EXIT chart analysis seen in Fig. 4-b, where convergence is predicted at $E_b/N_0 = 1.5$ dB.
RESULTS AND DISCUSSIONS
System Parameters
Without loss of generality, we considered a sphere packing modulation scheme associated with L = 16 using two transmit antennas and a single receive antenna in order to demonstrate the performance improvements achieved by the proposed system. The communication channel is described in Section 2, while the outer Encoder I is a half-rate memory-4 IRCC constructed using P = 17 subcodes according to the weighting coefficients of Eq. (10). Encoder II is a simple rate-1 accumulator, described by the pair of octal generator polynomials $(G/G_r) = (2/3)_8$. A three-stage iteration involves the following decoder activation sequence: (SP Demapper - Decoder II - SP Demapper - Decoder II - Decoder I - Decoder II). The overall system throughput is 1 bit/symbol.
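The stated throughput of 1 bit/symbol follows from the parameters above, as the short check below suggests. It assumes that one SP symbol of log2(L) bits occupies T = 2 time slots and that the rate-1 Encoder II does not change the rate; the function name and the `schedule` list are illustrative only.

```python
import math


def throughput(constellation_size, code_rate=1.0, time_slots=2):
    """Information bits per time slot for an SP-aided two-slot STBC."""
    return math.log2(constellation_size) * code_rate / time_slots


coded = throughput(16, code_rate=0.5)   # L = 16 with the half-rate outer code
uncoded = throughput(4)                 # an uncoded L = 4 scheme, for comparison

# Decoder activation order within one three-stage iteration, as listed above.
schedule = ["SP Demapper", "Decoder II", "SP Demapper",
            "Decoder II", "Decoder I", "Decoder II"]
```

Both configurations evaluate to the same throughput, which is why an uncoded L = 4 scheme is a like-for-like benchmark for the half-rate coded L = 16 system.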
Decoding Trajectory and BER Performance
EXIT chart based convergence predictions are usually verified by the actual iterative decoding trajectory. Fig. 4-b shows that the three-stage turbo-detected STBC-SP scheme is expected to converge at $E_b/N_0 = 1.5$ dB, where convergence to the (1.0, 1.0) point requires an excessive number of three-stage iterations. However, convergence to the (1.0, 1.0) point becomes more feasible for $E_b/N_0 > 1.5$ dB. Fig. 6 illustrates the actual decoding trajectory of the three-stage turbo-detected STBC-SP scheme of Fig. 1 at $E_b/N_0 = 1.8$ dB, when using an interleaver depth of $D = 10^6$ bits and 33 three-stage iterations. The zigzag path seen in Fig. 6 represents the actual extrinsic information transfer between the SP demapper and the rate-1 Decoder II on one hand and the outer IRCC Decoder I on the other.
Figure 6: Decoding trajectory of the three-stage turbo-detected STBC-SP scheme of Fig. 1, employing the mapping scheme of Table 1 in combination with the system parameters outlined in Section 5.1 and operating at $E_b/N_0 = 1.8$ dB with an interleaver depth of $D = 10^6$ bits after 33 three-stage iterations.
Fig. 7 compares the performance of the proposed three-stage IRCC-coded STBC-SP scheme employing anti-Gray mapping (AGM-2) against that of an identical-throughput 1 Bit Per Symbol (1BPS) uncoded STBC-SP scheme [1] using L = 4 and against Alamouti's conventional G2-BPSK scheme [2]. The system is also benchmarked against a two-stage RSC-coded STBC-SP scheme [9], when employing the system parameters outlined in Section 5.1 and using an interleaver depth of $D = 10^6$ bits. The two-stage scheme employs only 10 iterations, since the advantage of employing any further iterations diminishes owing to the presence of a BER floor. Explicitly, Fig. 7 demonstrates that a coding advantage of about 22.2 dB was achieved at a BER of $10^{-5}$ after 28 iterations by the three-stage turbo-detected STBC-SP system over both the uncoded STBC-SP [1] and the conventional orthogonal STBC design based [2, 3] schemes for transmission over the correlated Rayleigh fading channel considered. Additionally, a coding advantage of approximately 2.0 dB was attained over the 1BPS-throughput RSC-coded AGM-3 STBC-SP scheme [9] at the expense of an increased decoding complexity due to the employment of the rate-1 decoder and the additional three-stage iterations. According to Fig. 7, the three-stage turbo-detected STBC-SP scheme operates within approximately 1.0 dB of the capacity limit calculated from [24] and within 0.5 dB of the maximum achievable bandwidth efficiency limit of Eq. (11).
Figure 7: Performance comparison of the anti-Gray mapping AGM-2 based IRCC-coded three-stage STBC-SP scheme in conjunction with L = 16 against an identical-throughput 1 bit/symbol (BPS) uncoded STBC-SP scheme using L = 4 and against Alamouti's conventional G2-BPSK scheme as well as against a two-stage RSC-coded STBC-SP scheme, when employing the system parameters outlined in Section 5.1 and using an interleaver depth of $D = 10^6$ bits.
CONCLUSION
We proposed a three-stage serially concatenated turbo-detected STBC-SP scheme that is capable of achieving infinitesimally low BER values, where the performance is not limited by a BER floor, which is routinely encountered in conventional two-stage systems. The convergence behaviour of the three-stage system was analysed with the aid of novel 3D EXIT charts and their 2D projections [16, 17]. With the aid of these 2D projections, an IRCC [15, 18] was constructed for the sake of matching the projected EXIT curve of the SP demapper and the rate-1 inner decoder, leading to a near-capacity performance. The capacity of the STBC-SP scheme was calculated and a procedure was proposed for calculating a tighter upper bound on the maximum achievable bandwidth efficiency of the three-stage system using EXIT chart analysis. Our proposed three-stage scheme operated within about 1.0 dB of the capacity limit and within 0.5 dB of the maximum achievable bandwidth efficiency limit.
Appendix A Anti-Gray Mapping Schemes for Sphere Packing Modulation of Size L = 16
In this appendix, the different anti-Gray mapping (AGM) schemes introduced in this paper for STBC-SP signal sets of size L = 16 are described. There are more than L = 16 legitimate SP symbols in the lattice D4 and hence the required L = 16 SP symbols were chosen according to the minimum energy and highest minimum Euclidean distance (MED) criterion proposed in [9]. All mapping schemes described here use the same 16 optimum constellation points. More specifically, for all mapping schemes, constellation points of the lattice D4 are given for each integer index l = 0, 1, ..., 15. The normalisation factor of these constellation points is $\sqrt{2L/E} = 1$. The constellation points corresponding to each mapping scheme are given in Table 1.
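The unit normalisation factor can be checked with a small enumeration of low-energy D4 points. The sketch assumes the usual definition of D4 as the integer 4-vectors with an even coordinate sum and takes an arbitrary 16 of the minimum-energy points; it does not reproduce the MED-optimised selection of [9] or the bit mappings of Table 1, and the variable and function names are hypothetical.

```python
from itertools import product
import math


def d4_points(max_coord=1):
    """Non-zero integer 4-vectors with an even coordinate sum (lattice D4)."""
    coords = range(-max_coord, max_coord + 1)
    return [p for p in product(coords, repeat=4)
            if sum(p) % 2 == 0 and any(p)]


def energy(p):
    """Squared Euclidean norm of a lattice point."""
    return sum(c * c for c in p)


points = sorted(d4_points(), key=energy)
first_shell = [p for p in points if energy(p) == 2]   # minimum-energy points

L = 16
chosen = first_shell[:L]     # arbitrary subset, NOT the optimised set of [9]
E = sum(energy(p) for p in chosen)
normalisation = math.sqrt(2 * L / E)
min_distance = min(math.dist(p, q) for p in chosen for q in chosen if p != q)
```

Since every minimum-energy point has energy 2, any 16 of them give E = 32 = 2L, which is consistent with the normalisation factor of 1 stated above; the choice among the 24 candidates only affects the distance profile.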
