Abstract: An approach to hybrid digital-analog (HDA) source-channel coding for the communication of analog sources over memoryless Gaussian channels is introduced. The HDA system, which exploits the advantages of both digital and analog systems, generalizes a scheme previously presented by the authors, and can operate for any bandwidth ratio (bandwidth compression and expansion). It is based on vector quantization and features turbo coding in its digital component and linear/nonlinear processing in its analog part. Simulations illustrate that, under both bandwidth compression and expansion modes of operation, the HDA system provides a robust and graceful performance with good reproduction fidelity for a wide range of channel conditions.
threshold (footnote 1). This leveling-off effect is due to the nonrecoverable error introduced by the quantizer.
Note that analog systems do not suffer from these problems to the same extent, in particular concerning the leveling-off effect. On the other hand, in practice, analog systems are generally inferior to digital systems in terms of rate-distortion-capacity performance, particularly at the designed CSNR.
Recently, Mittal and Phamdo [7] proposed a class of hybrid digital-analog (HDA) joint source-channel coding systems. These systems can theoretically achieve the Shannon rate-distortion-capacity limit at the designed CSNR. Furthermore, they do not suffer from the leveling-off effect; the threshold effect is still inherent, though less severe, in the HDA systems [7]. Thus, systems that mix digital and analog techniques can have some of the advantages of digital systems and some of the advantages of analog systems (e.g., [3], [6], [8]-[11]).
In [6], [12], we presented a vector quantization (VQ) based HDA system. This system is valid only for bandwidth ratios larger than one (bandwidth expansion), i.e., when the channel bandwidth is greater than the source bandwidth. In this correspondence, we introduce a generalized version of the scheme in [6], [12]. The new system, originally proposed in [1], can be used for either bandwidth expansion or bandwidth compression. We begin by describing a highly general version of the system, and then we investigate in some detail the performance, under both bandwidth compression and expansion, of one important typical case for the communication of a Gauss-Markov source over a memoryless Gaussian channel. The new scheme has several important features that were not present in the original work [6], [12]. In particular, the system studied in detail incorporates a turbo error-correcting code [13] to improve the performance at low CSNRs, and uses a Karhunen-Loève transform (KLT) to decorrelate the source vector (for the bandwidth compression mode). The new scheme also allows for both linear and nonlinear transformations in the analog part of the HDA system (in [6], [12], only linear transformations are used). Other recent methods which employ a direct source-channel analog mapping or combine digital and analog coding include those in [3], [8], [11], [14]-[19].
In the next section, we provide a general description of the new HDA system. In Section III, we study in detail the system under bandwidth compression for a typical scenario and present simulation results. We also compare the performance of the new scheme with i) a purely analog system, ii) a purely digital system, and iii) the Shannon rate-distortion-capacity limit. We examine the new system under bandwidth expansion in Section IV and evaluate its performance vis-a-vis systems i)-iii) and the scheme studied in [6] . Finally, conclusions are given in Section V.
Some notation used in this correspondence is as follows. Bold-faced characters are used for vectors and matrices. Upper case is used for random entities and lower case for their realizations. The notation $(\mathbf{x})_m$ denotes the $m$th component of vector $\mathbf{x}$.
II. SYSTEM DESCRIPTION
In Fig. 1 , we depict a general version of the proposed HDA system.
The purpose of the system is to convey the $p$-dimensional random source vector $\mathbf{X} \in \mathbb{R}^p$ over a memoryless Gaussian channel, and reproduce it as $\hat{\mathbf{X}}$ at the receiver. The upper part of the transmitter is the digital part, and the lower part the analog part. In the following we describe in detail how the system works.
A. Encoding
The source vector is first fed to a linear invertible preprocessing mapping, defined by the matrix $\mathbf{G}$. The resulting output, $\tilde{\mathbf{X}}$, is then used as input to the first encoder mapping $\varepsilon_1$. In the most general version of the HDA system, $\varepsilon_1$ is a low-delay source or source-channel encoder. Examples include a VQ encoder trained for a noiseless channel, i.e., a source-optimized VQ (SOVQ); a VQ encoder trained for a noisy channel, i.e., a channel-optimized VQ (COVQ); or an SOVQ encoder in tandem with a (short) channel block code. The discrete output $I = \varepsilon_1(\tilde{\mathbf{X}}) \in \mathcal{I}_N$, where $\mathcal{I}_N = \{0, \ldots, N-1\}$ and where we assume $N = 2^L$, is then fed (in its $L$-bit binary form) to the high-delay $(n, k)$ channel encoder $\varepsilon_2$, of rate $r_c = k/n < 1$. One specific such mapping, which we use in this correspondence, is the encoder of a rate $r_c = 1/2$ turbo code [13].
A number $M = k/L > 1$ of consecutive outputs from $\varepsilon_1$ are blocked and encoded by $\varepsilon_2$. Note that for ease of presentation, Fig. 1 does not explicitly illustrate the delay operation associated with the grouping of $M$ consecutive blocks. In the digital part, this operation occurs before the high-delay encoder $\varepsilon_2$; in the analog part, it occurs after the scaling of $\mathbf{Z}$ by $a$ (footnote 2). Thus, the transmission of one $q$-dimensional channel symbol $\mathbf{s}_k$ in the digital part corresponds to $M$ different vectors $a\mathbf{Z}$ in the analog part. This point also applies to Fig. 2.
B. Decoding
At the receiver, the decoder mapping takes the channel output $\mathbf{R}$ and outputs a source vector estimate $\hat{\mathbf{X}}$. In general, however, the complexity of implementing this decoder prohibits its use in practice. Therefore, we propose a suboptimal, but more practical, decoder structure, as illustrated in Fig. 2. As shown in the figure, the received vector $\mathbf{R}$ is fed to a decoder for the high-delay code. The resulting discrete output $J \in \mathcal{I}_N$ is encoded by $\varepsilon_2$ and assigned a channel symbol. The result is then subtracted from the received vector $\mathbf{R}$ and scaled by the constant $b$, forming an estimate, $\hat{\mathbf{Z}}$, of the transmitted analog vector $\mathbf{Z}$. This estimate is then fed to the analog decoder mapping, with output $\hat{\mathbf{E}}$. The purpose of this mapping is to act as decoder for the analog encoder, and $\hat{\mathbf{E}}$ is hence an estimate of the error vector $\mathbf{E}$. The output index $J$ of the high-delay channel decoder is also fed to a decoder for the low-delay source (or source-channel) code. The output of this low-delay decoder is then added to $\hat{\mathbf{E}}$ and the result is fed to the inverse of the preprocessing map, resulting in the source vector estimate $\hat{\mathbf{X}}$.
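The decoding steps above can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions that are not part of the system studied here: the high-delay code is replaced by uncoded BPSK with hard decisions, the analog encoder and decoder are identity maps, the error vector is taken to be the quantization residual of the preprocessed source vector, and all function names are hypothetical.

```python
import numpy as np

def index_to_bpsk(index, num_bits):
    """Map an integer index to +/-1 symbols, least significant bit first."""
    bits = (index >> np.arange(num_bits)) & 1
    return 2.0 * bits - 1.0

def bpsk_to_index(received):
    """Hard decisions; a stand-in (assumption) for the high-delay decoder."""
    bits = (np.asarray(received) > 0).astype(np.int64)
    return int(np.sum(bits << np.arange(bits.size)))

def hda_encode(x, G, codebook, a):
    """Superpose the digital symbols and the scaled analog residual."""
    x_tilde = G @ x                                   # linear preprocessing
    index = int(np.argmin(np.sum((codebook - x_tilde) ** 2, axis=1)))
    e = x_tilde - codebook[index]                     # assumed error vector E
    num_bits = len(codebook).bit_length() - 1         # L, with N = 2**L
    return index_to_bpsk(index, num_bits) + a * e     # identity analog encoder

def hda_decode(r, G, codebook, a, b):
    """Suboptimal decoder: decode, re-encode, subtract, scale, add back."""
    j = bpsk_to_index(r)                              # digital decoding
    s_hat = index_to_bpsk(j, len(r))                  # re-encode the decision
    z_hat = b * (r - s_hat)                           # estimate of Z
    e_hat = z_hat                                     # identity analog decoder
    x_tilde_hat = codebook[j] + e_hat                 # low-delay decoder + E_hat
    return np.linalg.solve(G, x_tilde_hat)            # invert the preprocessing
```

With a noise-free channel, `b = 1/a`, and an analog component small enough not to flip any BPSK decision, `hda_decode(hda_encode(x, ...), ...)` returns `x` up to rounding; with noise, the quality of `e_hat` depends on how reliably the digital part is decoded.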
While Figs. 1 and 2 describe the most general version of the proposed HDA system, we will in the remaining parts investigate some specific instances of the system. In particular, we will first employ the system for bandwidth compression, and then study its performance when used for bandwidth expansion.
III. BANDWIDTH COMPRESSION
In this part, we focus on using the system in Fig. 1, with the decoder in Fig. 2, for bandwidth compression, that is, under the assumption that the "total" source vector dimension $M \cdot p$ is larger than the channel signal space dimension $q$, and consequently the bandwidth ratio $q/(Mp) = r/p < 1$ (channel dimensions/uses per source dimension). For the sake of clarity and concreteness, we describe our system explicitly in terms of a typical example with a bandwidth ratio of $1/2$ (the generalization to systems with an arbitrary bandwidth ratio below one is straightforward). More precisely, we have implemented a system with the following parameters.
The source vector $\mathbf{X}$ is $p = 32$ dimensional, drawn from a zero-mean Gauss-Markov source [...] corresponding to eigenvalue $\lambda_i$, resulting in $\tilde{\mathbf{X}} = \mathbf{G}\mathbf{X}$ having independent components with variances $\lambda_1$ through $\lambda_{32}$. The low-delay encoder $\varepsilon_1$ is an SOVQ encoder, of dimension $p = 32$ and size $L = 8$ bits, trained using the Linde, Buzo, and Gray (LBG) algorithm [20], and the codebook $\{\mathbf{z}_i\}$ is chosen to be identical to the codebook defining $\varepsilon_1$. The high-delay encoder $\varepsilon_2$ is an $(n = 2048, k = 1024)$, rate $r_c = 1/2$, turbo encoder, with generators $(37, 21)$ (punctured to rate $1/2$) and with a random interleaver [13]. The 8-bit output blocks from $\varepsilon_1$ are blocked into one $k = 1024$-bit "superblock" which is fed to $\varepsilon_2$, resulting in a codeword of length $n = 2048$ bits. The output bits from $\varepsilon_2$, corresponding to the index $K$, are mapped directly into binary phase-shift keying (BPSK) symbols (footnote 3), with alphabet $\{\pm 1\}$. Consequently, the channel signal space dimension is $q = 2048$ and $\mathbf{s}_k \in \{\pm 1\}^{2048}$. Since $M = 1024/8 = 128$, one $\varepsilon_2$-codeword represents 128 source vectors, and hence the bandwidth ratio is $q/(Mp) = 2048/(128 \cdot 32) = 1/2$ (channel uses per source dimension).
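The blocking arithmetic above is easy to check mechanically. The following minimal sketch recomputes the number of source vectors per codeword and the bandwidth ratio from the code and quantizer parameters; the function name and argument names are hypothetical, and one BPSK symbol per coded bit is assumed, as in the text.

```python
from fractions import Fraction

def bandwidth_ratio(n_code_bits, k_info_bits, index_bits, source_dim):
    """Return (M, ratio) for a BPSK-mapped (n, k) code and L-bit indices.

    M = k / L source vectors share one codeword, and q = n channel uses
    carry M * p source dimensions (one BPSK symbol per coded bit).
    """
    if k_info_bits % index_bits != 0:
        raise ValueError("k must be a multiple of the index size L")
    vectors_per_codeword = k_info_bits // index_bits
    channel_dim = n_code_bits
    ratio = Fraction(channel_dim, vectors_per_codeword * source_dim)
    return vectors_per_codeword, ratio

# Parameters quoted in the text for the compression example.
M, ratio = bandwidth_ratio(n_code_bits=2048, k_info_bits=1024,
                           index_bits=8, source_dim=32)
```

Calling the same function with `source_dim=8` gives the ratio for an 8-dimensional source with the same code and index size.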
The scaling constant $a$, in the analog part, is chosen so that a fraction, strictly between 0 and 1, of the total input power to the channel is assigned to the analog part. That is, since the power in the digital (BPSK) part is 1, the constant $a$ is solved to satisfy [...]
The high-delay decoder is a turbo decoder for the encoder $\varepsilon_2$, implemented using 10 iterations and given access to the noise variance $\sigma^2$. The low-delay decoder is defined by a [...]
What remains to be specified is the analog encoder-decoder pair. We have investigated two systems, which are described in the following two subsections. Simulation results for bandwidth compression are then provided in Section III-C.
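As a rough illustration of how such a scaling constant could be obtained, the sketch below solves for $a$ under assumptions that are not taken from the text: the analog power per channel use is taken to be $a^2 E\|\mathbf{Z}\|^2 / r$, the digital and analog signals are assumed uncorrelated so that their powers add, and the digital power per channel use is 1. The function and parameter names are hypothetical.

```python
import math

def analog_scale(power_fraction, mean_sq_norm_z, analog_dim, digital_power=1.0):
    """Solve for a so that the analog part carries `power_fraction` of the total.

    Assumed model (not from the text): analog power per channel use is
    a**2 * E||Z||**2 / r, and total power = digital_power + analog power.
    """
    if not 0.0 < power_fraction < 1.0:
        raise ValueError("power_fraction must lie strictly between 0 and 1")
    if mean_sq_norm_z <= 0.0 or analog_dim <= 0:
        raise ValueError("mean_sq_norm_z and analog_dim must be positive")
    # fraction = P_a / (P_d + P_a)  =>  P_a = P_d * fraction / (1 - fraction)
    analog_power = digital_power * power_fraction / (1.0 - power_fraction)
    return math.sqrt(analog_power * analog_dim / mean_sq_norm_z)
```

In practice the mean squared norm of $\mathbf{Z}$ would be estimated from training data after the quantizer has been designed.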
A. Linear Analog Part
The first system employs linear mappings to define the analog encoder and decoder. More precisely, the encoder is the linear mapping that projects $\mathbf{E}$ onto the subspace spanned by the eigenvectors corresponding to the 16 strongest eigenvalues of $\mathbf{R}_{xx}$. Hence, since a KLT is performed on $\mathbf{X}$, the mapping is simply the operation of dropping the 16 low-energy components of $\mathbf{E}$, resulting in the $r = 16$ dimensional output $\mathbf{Z}$. The decoder is defined by the linear mapping of extending $\hat{\mathbf{Z}}$ from 16 to 32 dimensions, by filling in zeros in the 16 low-energy dimensions.
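A minimal NumPy sketch of this linear analog part follows. It assumes a unit-variance Gauss-Markov covariance with entries $c^{|i-j|}$ and uses $c = 0.8$ purely as an illustrative value; the KLT rows are ordered by decreasing eigenvalue so that "dropping the low-energy components" is a simple truncation. All function names are hypothetical.

```python
import numpy as np

def gauss_markov_cov(p, corr):
    """Covariance of a unit-variance Gauss-Markov vector: corr**|i-j|."""
    idx = np.arange(p)
    return corr ** np.abs(idx[:, None] - idx[None, :])

def klt_matrix(cov):
    """Return (G, eigenvalues) with the rows of G sorted by decreasing energy."""
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order].T, eigvals[order]

def analog_encode(e, r):
    """Keep the r strongest components (valid after the KLT ordering above)."""
    return e[:r]

def analog_decode(z_hat, p):
    """Extend back to p dimensions, filling the dropped components with zeros."""
    e_hat = np.zeros(p)
    e_hat[: len(z_hat)] = z_hat
    return e_hat
```

Because the transform is orthogonal, the mean squared error caused by the truncation alone equals the total energy of the dropped components of $\mathbf{E}$.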
B. Nonlinear Analog Part
The second system employs a "discrete approximation" of the optimal, analog, generally nonlinear encoder-decoder mappings that minimize the MSE, $E\|\mathbf{E} - \hat{\mathbf{E}}\|^2$ (so in this case the analog part is not really analog, but "close-to-analog"). The mappings are described as follows.
Footnote 3: Although we only treat the case of BPSK signaling in the digital component of the system, we can clearly accommodate multilevel signaling schemes in general.
For each $m \in \{1, \ldots, 16\}$, the $m$th encoder and decoder are trained to minimize [...]
for a fixed channel noise power $\sigma^2$, a given analog power fraction, a fixed $a$, and under a constraint on the total transmit power. Note that a power constraint is needed in the design, since even if the PAM constellation for $(\mathbf{Z})_m$ and the value of the constant $a$ are fixed, the encoder can still assign different probabilities to different transmitted symbols (note that the PAM symbols have different energy).
In Fig. 3, we illustrate, schematically, the typical structure of the nonlinear compression. The circles mark code vectors in the two-dimensional input space, and the $m$th encoder is defined by a nearest neighbor search among these code vectors to produce the corresponding 256-PAM symbol. How code vectors are mapped to the PAM alphabet can clearly be seen in the figure. The two endpoints of the PAM constellation are mapped to the two endpoints of the "spiral" in Fig. 3, and any two neighboring code vectors correspond to two neighboring PAM points. At the receiver, a nearest neighbor search over the PAM constellation produces the corresponding code vector (circle in the figure) to give a value for $((\hat{\mathbf{E}})_{2m-1}, (\hat{\mathbf{E}})_{2m})$.
The approach we use for the nonlinear analog part, as described, is essentially equivalent to the "BDCE system" studied by Vaishampayan in his Ph.D. dissertation [17] (see, in particular, [17, Secs. 5.4-5.5]). A similar encoder-decoder pair has also been investigated in [14]. We refer the reader to [17] for results on optimal encoder and decoder mappings, how to handle the power constraint, and a design algorithm.
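To make the two-dimensional-to-PAM structure concrete, here is a small sketch of the nearest neighbor encoder and table lookup decoder. It is not the trained design: the code vectors are simply placed on a hand-made two-armed spiral (an assumption standing in for the optimized codebook), the PAM levels are normalized for uniform symbol usage, and every name is hypothetical.

```python
import numpy as np

def spiral_codebook(num_points=256, turns=3.0, radius=3.0):
    """Untrained stand-in codebook: points along a spiral through the origin.

    Consecutive indices are neighbors in the plane, and the two ends of the
    index range sit at the two outer ends of the spiral.
    """
    t = np.linspace(-1.0, 1.0, num_points)
    theta = 2.0 * np.pi * turns * np.abs(t)
    r = radius * t                                   # signed radius
    return np.stack([r * np.cos(theta), r * np.sin(theta)], axis=1)

def pam_levels(num_points=256):
    """Equally spaced PAM levels with unit average power under uniform use."""
    levels = 2.0 * np.arange(num_points) - (num_points - 1)
    return levels / np.sqrt(np.mean(levels ** 2))

def encode_pair(e_pair, codebook):
    """Nearest neighbor search in the 2-D input space; returns the PAM index."""
    return int(np.argmin(np.sum((codebook - e_pair) ** 2, axis=1)))

def decode_symbol(received, levels, codebook):
    """Nearest PAM level, then table lookup of the matching code vector."""
    index = int(np.argmin(np.abs(levels - received)))
    return codebook[index]
```

A transmitted value would be `a * pam_levels()[encode_pair(pair, codebook)]`; small channel noise then moves the decision to a neighboring PAM point, which by construction corresponds to a nearby code vector.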
C. Simulation Results: Bandwidth Compression
Here we evaluate the performance of the described HDA system when used for bandwidth compression. We investigate the system with both linear and nonlinear analog parts. The systems were trained for a fixed relative power level in the analog part and a fixed CSNR (footnote 4), where $\mathrm{CSNR} = 10 \log_{10}(P_{\mathrm{in}}/\sigma^2)$ (in decibels), with $P_{\mathrm{in}}$ denoting the total channel input power per component. In our simulations, motivated by a broadcast scenario, we allow the receiver to have knowledge of the true CSNR and thus to adapt to it as it varies, while the transmitter is kept fixed. We employed 500 000 vectors in the training of the SOVQ encoder-decoder pair and 100 000 vectors in the training of the nonlinear analog encoder-decoder maps. The simulations were run with $M = 128$ and using 1000 "superblocks" (128 000 source vectors). All considered systems have an overall bandwidth ratio of $1/2$ channel uses per source symbol. Figs. 4 and 5 illustrate the performance for the following systems.
• Five linear analog HDA schemes (Fig. 4), evaluated at an analog power level of 1%, 10%, 20%, 30%, and 40%, respectively.
• Four nonlinear analog HDA schemes ( [...] that is, the power in the strong half and the weak half of the dimensions, respectively.
• A purely digital tandem system (Fig. 4) employing solely the digital part of the HDA systems (with the analog part turned off).
• The optimal performance theoretically attainable (OPTA) shown [...] Also, as a reference when judging the purely digital system, the curve labeled OPTA$_2$ (Fig. 4 only) [...]
Footnote 4: Note that, unlike the encoder of the nonlinear analog system, the encoder of the linear analog system does not need any knowledge about the CSNR value.
We observe from the figures that the HDA systems offer a robust and graceful performance over the entire range of CSNRs. We also remark that the performance of the HDA systems at low to medium CSNRs is strongly affected by the power allocated to the analog part, with the analog power level playing a role similar to that of a "rate allocator" between the digital and analog parts. The linear analog HDA systems outperform the purely analog systems for a wide range of CSNRs, depending on the analog power level. The systems with an analog power level of 30% or 20% can be said to provide the best overall performance. The HDA systems also provide substantial improvements over the purely digital system at medium to high CSNRs. A drawback of using a linear analog part, however, is that the performance saturates at an SDR of about 14 dB. This can be counteracted by using the nonlinear maps in the analog part, as can be seen in Fig. 5. They perform very well (with a strictly positive SDR curve slope) in the vicinity of the CSNR at which their encoder was designed; they also provide a smooth degradation/improvement as the true CSNR varies away from the designed CSNR. Indeed, their SDR is within 5 dB of OPTA for a wide range of CSNRs (e.g., for 6 dB $\le$ CSNR $\le$ 45 dB in Fig. 5). Note also that the HDA system with a nonlinear analog part can be made to saturate at an arbitrarily high SDR, by increasing the resolution of the nonlinear maps.
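For readers reproducing curves of this kind, the two quantities on the axes can be computed as sketched below. The CSNR conversion follows the definition given in this section; the SDR is assumed here to be the usual ratio of mean source energy to mean squared reconstruction error, expressed in decibels, and the function names are hypothetical.

```python
import numpy as np

def noise_variance(csnr_db, input_power=1.0):
    """Invert CSNR = 10*log10(P_in / sigma^2) to get the noise variance."""
    return input_power / (10.0 ** (csnr_db / 10.0))

def sdr_db(source, reconstruction):
    """Assumed SDR definition: 10*log10(E||X||^2 / E||X - X_hat||^2).

    Both arguments are arrays of shape (num_vectors, dimension).
    """
    source = np.asarray(source, dtype=float)
    reconstruction = np.asarray(reconstruction, dtype=float)
    signal = np.mean(np.sum(source ** 2, axis=-1))
    distortion = np.mean(np.sum((source - reconstruction) ** 2, axis=-1))
    return 10.0 * np.log10(signal / distortion)
```

In a mismatch experiment of the kind described above, the transmitter parameters would stay fixed while `noise_variance` is re-evaluated at each true CSNR and passed to the receiver.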
IV. BANDWIDTH EXPANSION
Here we study the system in Figs. 1 and 2 when used for bandwidth expansion, that is, with bandwidth ratio $q/(Mp) = r/p > 1$. The precise system we have implemented is specified as follows.
The source vector $\mathbf{X}$ is $p = 8$ dimensional. As in Section III, the vector $\mathbf{X}$ is drawn from a zero-mean Gauss-Markov source with normalized correlation $0.8$. In this section, we do not use linear preprocessing, so $\mathbf{G}$ is the identity matrix. The low-delay encoder $\varepsilon_1$ is again an SOVQ encoder, this time of dimension $p = 8$ and size $L = 8$ bits, and the codebook $\{\mathbf{z}_i\}$ is identical to the codebook defining $\varepsilon_1$.
The high-delay encoder $\varepsilon_2$ is the same $(n = 2048, k = 1024)$, rate $r_c = 1/2$, turbo encoder as used in Section III, and $M = 128$ blocking is again used. The output bits from $\varepsilon_2$ are mapped to $\pm 1$ BPSK symbols, and 2048 bits are transmitted to represent $M = 128$ source vectors. This gives a bandwidth ratio of 2 channel uses per source sample.
A. Linear Analog Part
In the case of bandwidth expansion with a linear analog part, the analog encoder is the linear mapping corresponding to transmitting each component of [...]
The constant scaling in the analog part of the receiver is chosen as $b = 1$ for simplicity (since $b$ can in any case be absorbed into the analog decoder), and the decoder is defined as the linear mapping that computes the component-wise linear minimum MSE estimate of the vector $\mathbf{E}$ based on $\mathbf{R}$, again assuming the digital decoder works without errors. That is [...]
B. Nonlinear Analog Part
We again employ the discrete approximation described in Section III-B, the only essential difference being that the analog encoder is split into eight parts, each of which maps one input dimension into two channel dimensions. That is, the 8-dimensional vector $\mathbf{E}$ is transmitted via one [...] $(\hat{\mathbf{E}})_m$ for $m = 1, \ldots, 8$, as before based on ML decisions and table lookup decoding. As in Section III, the encoder-decoder pair is trained subject to a power constraint on the channel input symbols.
C. Simulation Results: Bandwidth Expansion
Here we evaluate the performance of HDA bandwidth expansion, with linear and nonlinear analog parts. As in Section III-C, the systems were trained for a fixed relative power level in the analog part. The receiver knows the true CSNR and can thus adapt to it, while the transmitter is kept fixed. As before, we employed 500 000 vectors in the training of the SOVQ encoder-decoder pair and 100 000 vectors in the training of the nonlinear analog encoder-decoder maps. The simulations were run with $M = 128$ and using 1000 "superblocks" (128 000 source vectors). All systems in the comparison have an overall bandwidth ratio of 2 channel uses per source dimension.
Figs. 6 and 7 illustrate the performance for the following systems.
• Four linear analog HDA schemes (Fig. 6), evaluated at an analog power level of 1%, 10%, 20%, and 30%, respectively.
• Four nonlinear analog HDA schemes (Fig. 7) trained at an analog power level of 0.3 (in all cases) and for the following CSNRs: 10, 15, 20, 25 dB. The performance was evaluated over a range of different CSNRs, with an analog power level of 30% in all cases.
• A purely analog system (Figs. 6 and 7) employing solely the analog part of the linear analog HDA system (with the digital part turned off).
• A purely digital tandem system (Fig. 6 ) employing solely the digital part of the HDA systems (with the analog part turned off).
• The "HDA-VQ" system presented in [6] (cf. [6, Fig. 4]).
The figures also show the OPTA curves for a bandwidth ratio of 2.
We remark from Figs. 6 and 7 that our bandwidth expansion systems perform analogously to the bandwidth compression systems studied in the previous section (cf. Figs. 4 and 5). Indeed, the gains vis-a-vis the purely analog and digital systems are maintained at medium to high CSNRs. Furthermore, the HDA system is improved at high CSNRs when the linear maps in its analog component are replaced by the nonlinear maps. For example, the HDA system with a linear analog part and an analog power level of 30% has an SDR of 28 dB for CSNR = 20 dB (see Fig. 6), while the HDA system with a nonlinear analog part trained for CSNR = 20 dB provides an SDR of about 33 dB at the same CSNR (see Fig. 7), resulting in a substantial gain. This gain is, however, reduced if there is a mismatch between the true CSNR and the CSNR for which the nonlinear encoder of the analog part is designed; for example, when the true CSNR is 20 dB and the nonlinear encoder's design CSNR is 15 dB, the gain is 3 dB (it is 1 dB for a design CSNR of 25 dB). This indicates that one advantage of the linear analog part is that it does not need to know the CSNR at the encoder and thus is not affected by a CSNR mismatch. The main difference between the bandwidth expansion and compression systems is that the SDR in our bandwidth expansion schemes, with a linear or infinite-resolution (footnote 5) nonlinear analog part, has no leveling-off effect: the slope of the SDR curve is positive for any CSNR. The slope, however, is noticeably less than that of the OPTA curve (slope = 2).
[Figure caption fragment: the curves shown include OPTA for analog input, OPTA for binary input, the "HDA-VQ" system in [6], purely analog, and purely digital.]
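As a rough check on the OPTA slope quoted above, the high-resolution OPTA can be written in closed form under assumptions that are not stated in the text: a stationary unit-variance Gauss-Markov source with correlation $c$, an AWGN channel with linear-scale CSNR $\gamma$, a bandwidth ratio written here as $b$ (all three symbols are introduced only for this sketch), and a rate above the critical rate $\log_2(1+c)$ bits per source sample.

```latex
\begin{align}
R &= \frac{b}{2}\,\log_2(1+\gamma) && \text{(bits per source sample)} \\
D(R) &= (1-c^2)\,2^{-2R} = (1-c^2)\,(1+\gamma)^{-b} \\
\mathrm{SDR}_{\mathrm{OPTA}} &= 10\log_{10}\frac{1}{D(R)}
  = b \cdot 10\log_{10}(1+\gamma) \;-\; 10\log_{10}(1-c^2) && \text{(dB)}
\end{align}
```

At high CSNR, $10\log_{10}(1+\gamma)$ is approximately the CSNR in decibels, so the curve rises by $b$ dB per dB of CSNR, which gives a slope of 2 for $b = 2$. For $c = 0.8$ the constant offset is about 4.4 dB. An OPTA computed for finite block dimension $p$, or at rates below the critical rate, can deviate from this expression.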
With respect to the "HDA-VQ" system of [6], it is first worth pointing out that our system employs superposable coding (as the digital and analog signals are added to each other at the encoder output before transmission over the channel), while the system of [6] does not. In Fig. 6, we observe that our system with the linear analog part provides a better performance at low to medium CSNRs. This can be explained by the turbo channel coding employed in the digital part of our system, which helps combat channel errors in the "waterfall" error region of the turbo code at low to medium CSNRs. On the other hand, the system of [6] does not employ strong channel coding and is hence prone to the significant channel impairment in that CSNR range. However, in the high-CSNR regime, the system of [6] is less susceptible to channel noise and its analog component becomes "cleaner" than our system's since it does not use superposable coding; i.e., unlike our system, it does not need to "filter" out the digital and analog signals from each other at the decoder. Still, as illustrated in Fig. 7, our system with the nonlinear analog part can match or outperform the system of [6] at high CSNRs that lie in the vicinity of the CSNR for which the nonlinear encoder map is designed; e.g., the nonlinear system designed for a CSNR of 25 dB outperforms the scheme of [6] for CSNRs in an interval starting at 23 dB (for finite resolution in the nonlinear analog part, the curve from [6] will cross the new curve at a CSNR of about 35 dB; however, by increasing the resolution, the range over which the new system outperforms the one in [6] can be extended). Finally, it is important to note that the new system is more general than that of [6] as it allows for both expansion and compression modes.
In fact, it subsumes the scheme in [6]; e.g., for a bandwidth ratio of 2, the new system can be converted to the one in [6] if we replace the high-delay channel encoding map $\varepsilon_2$ by a simple rate $r_c = 1/2$ map resulting in 16 BPSK symbols where the 8 bits of index $I$ appear in the first eight positions and zeros are stacked in the last eight positions (the corresponding decoder performs the reverse operation), and if we choose the analog encoder map to produce a vector $\mathbf{Z} \in \mathbb{R}^{16}$ such that the first eight components of $\mathbf{Z}$ are zeros and $\mathbf{E}$ appears within the last eight components.
V. SUMMARY AND CONCLUSION
An HDA source-channel coding system for the reliable communication and reproduction of discrete-time analog-valued sources over AWGN channels is proposed and investigated. The HDA system, which is based on VQ source coding, employs turbo channel coding in its digital component and linear/nonlinear coding in its analog component, before superposing the analog and digital signals for transmission over the channel. As a result, the system accommodates all bandwidth ratios and, unlike the scheme studied in [6] , it can operate in both bandwidth compression and expansion modes. Numerical results show that the HDA system provides a robust and graceful performance for a wide range of channel conditions (medium to high CSNRs), substantially outmatching purely digital and analog coding systems. Under bandwidth compression, the system performs within 5 dB (in SDR) of the OPTA limit for a large CSNR range. The advantages of using linear and nonlinear coding in the analog part of the system are also illustrated: linear coding is simple and does not need the knowledge of the CSNR at the encoder, while nonlinear coding can significantly improve the system performance at high CSNRs.
Future work may include improving the system performance at low CSNRs. An interesting direction is to optimize the performance of the digital component of the system using joint source-channel coding techniques without affecting its performance at high CSNRs. This can be accomplished by leaving the VQ encoder unoptimized and designing a joint source-channel decoder for the VQ-turbo decoder pair according to the methods of [4] , [5] , [21] . A first step in this direction is undertaken in [22] , in the context of image communication without the use of turbo coding, and the digital encoder and decoder are optimized under bandwidth compression. Finally, since the HDA system is general, it can be applied for a variety of source and channel models, including fading channels used in conjunction with multilevel modulation.
