Parallelised max-log-MAP model by Loo, KK
Fig. 3 Measured S21 of bandstop filter with different PET tuning voltages (horizontal axis: frequency, GHz)
Acknowledgment: The authors would like to thank Chunlei Wang of
Texas A&M University for technical assistance.
© IEE 2002
Electronics Letters Online No: 20020665
DOI: 10.1049/el:20020665
Lung-Hwa Hsieh and Kai Chang (Department of Electrical
Engineering, Texas A&M University, College Station, Texas 77843-3128, USA)
E-mail: chang@ee.tamu.edu
25 May 2002
Parallelised max-Log-MAP model
K.K. Loo, K. Salman, T. Alukaidey and S.A. Jimaa
A parallelised max-Log-MAP model (P-max-Log-MAP) that exploits
the sub-word parallelism and very long instruction word architecture
of a microprocessor or a digital signal processor (DSP) is presented.
The proposed model considerably reduces the computational complexity
of the max-Log-MAP algorithm, and therefore facilitates easy
implementation.
Introduction: The max-Log-MAP algorithm [2], a simplified
logarithm-domain form of the maximum a posteriori (MAP) algorithm [1],
relies on a large number of add-compare-select (ACS) operations,
which are its basic and most intensive operations. The ACS is
executed in sequence to compute trellis metrics recursively for each
trellis state, in the forward and backward directions, over the large
volume of data underlying the trellis. It can be shown that the
max-Log-MAP algorithm lends itself to parallelism. In this Letter, the
proposed P-max-Log-MAP fully exploits the sub-word parallelism (SWP) and
very long instruction word (VLIW) architecture of a microprocessor
or a digital signal processor (DSP) to achieve a high level of data
parallelism. SWP allows a single instruction to process several
different data in parallel within a defined data width. In combination
with the VLIW architecture, at least two SWP instructions can be
executed in parallel, i.e. greater data parallelism can be achieved.
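As an illustration of sub-word parallelism, the following minimal sketch emulates in software one instruction that adds eight 8-bit lanes held in a single 64-bit word. The helper names (pack8, unpack8, swp_add8) are hypothetical, and wrap-around (modulo 256) arithmetic is an assumption made for simplicity; a real SWP instruction set may offer saturating variants instead.

```python
MASK64 = (1 << 64) - 1
LOW7 = 0x7F7F7F7F7F7F7F7F   # low seven bits of every 8-bit lane
HIGH1 = 0x8080808080808080  # top bit of every 8-bit lane


def pack8(lanes):
    """Pack eight 8-bit values into one 64-bit word, lane 0 in the lowest byte."""
    assert len(lanes) == 8
    word = 0
    for i, v in enumerate(lanes):
        word |= (v & 0xFF) << (8 * i)
    return word


def unpack8(word):
    """Split a 64-bit word back into eight unsigned 8-bit lanes."""
    return [(word >> (8 * i)) & 0xFF for i in range(8)]


def swp_add8(a, b):
    """Lane-wise modulo-256 addition with no carry crossing a lane boundary."""
    # Add the low seven bits of each lane; the sum always fits inside the lane.
    low = (a & LOW7) + (b & LOW7)
    # Fold in the top bit of each lane with XOR so that nothing carries out.
    return (low ^ ((a ^ b) & HIGH1)) & MASK64
```

A single call to swp_add8 then stands in for eight independent 8-bit additions, which is the property the ACS operations rely on.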
Parallelised max-Log-MAP: Consider a max-Log-MAP algorithm for
decoding 3GPP turbo codes with constraint length K = 4 and
polynomial generator g(D) = 15/13 [3]. We show that the algorithm's
forward metrics, backward metrics and LLR computations can be
highly parallelised. We assume that the forward recursion is executed
first, followed by the backward recursion in the same loop as the
logarithm likelihood ratio (LLR) computation. In addition, we assume a
typical SWP architecture in which the processing unit, either a
microprocessor or a DSP, supports 8-bit ALU operations as a minimum.
With a data width of 64 bits, the SWP is capable of computing eight
ACS operations over eight different 8-bit data sets individually, in
parallel. The computation of eight forward/backward metrics usually
requires 16 add/sub operations and eight max operations. However, using
a single VLIW SWP instruction, the 16 add/sub operations can be
performed in one cycle, half a cycle for each add and sub operation,
and the eight max operations in one cycle. Our model also requires the
data positions within a defined field to be arranged so that the
operands match for computation. For example, an arbitrary vector
$Z = \{Z_0, Z_1, Z_2, Z_3, Z_4, Z_5, Z_6, Z_7\}$ may be arranged to
$Z' = \{Z_4, Z_7, Z_3, Z_6, Z_2, Z_0, Z_5, Z_1\}$, which matches another
arbitrary vector $V' = \{V_4, V_7, V_3, V_6, V_2, V_0, V_5, V_1\}$ for
possible computations. The arrangement can be done by a simple mapping
$Z_{p(i)} \mapsto Z'_i$, where the permutation index, $p$, is related to $Z'$.
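The arrangement and the eight-lane ACS can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the index order 4, 7, 3, 6, 2, 0, 5, 1 is used purely as an example permutation, and plain Python lists stand in for the lanes of an SWP register.

```python
# Illustrative permutation index p: lane i of the arranged vector takes element p[i].
P_EXAMPLE = [4, 7, 3, 6, 2, 0, 5, 1]


def permute(vec, p):
    """Return the arranged vector vec', where vec'[i] = vec[p[i]]."""
    return [vec[j] for j in p]


def acs8(metrics, branch, p):
    """One add-compare-select over eight lanes.

    metrics : eight path metrics
    branch  : eight branch metrics, aligned lane by lane with metrics
    p       : hypothetical permutation that lines up the 'add' results
              with the 'sub' results they compete against
    """
    added = [m + g for m, g in zip(metrics, branch)]    # eight adds
    subbed = [m - g for m, g in zip(metrics, branch)]   # eight subs
    added = permute(added, p)                           # match the positions
    return [max(a, s) for a, s in zip(added, subbed)]   # eight max operations
```

On an SWP machine each list comprehension would correspond to one packed instruction, so the whole update costs a handful of instructions regardless of the number of lanes.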
Forward recursion: The forward metrics $\alpha_{k-1} = \alpha_{0,\ldots,7}$ are initialised
and stored in a register as shown in Fig. 1. The branch metrics, $\gamma_k$,
are retrieved from the memory and arranged according to the trellis
branch transitions. The branch metrics of other polynomial generators
within K = 4 can be arranged according to the new branch transitions
with a corresponding permutation index, $p$. The metrics $\alpha_{k-1} = \alpha_{0,\ldots,7}$
at time index k - 1 are added to and subtracted from the corresponding
branch metrics individually to yield $\alpha_k^1$ and $\alpha_k^0$, which represent the
metrics related to information '1' and '0', respectively. After the
operation, the position of $\alpha_k^1$ is arranged to match $\alpha_k^0$, then a
comparison is applied to select the survivor metrics as the new metrics,
$\alpha_k$, at index k. The $\alpha_k$ are arranged and updated to register $\alpha_{k-1}$,
and a copy is stored in the memory, from where they will be retrieved
when computing the LLR.
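One forward update can be written out in scalar form as a reference for the parallel version. The sketch below is not the authors' implementation: it builds the eight-state trellis by reading g(D) = 15/13 as feedforward polynomial 15 (octal) over feedback polynomial 13 (octal), which is an assumption about the convention, and the names step, forward_step and the gamma[u][parity] layout are hypothetical. Metric normalisation is omitted.

```python
NEG_INF = float("-inf")


def step(state, u):
    """Next state and parity bit of the assumed recursive encoder.

    state holds the three register bits (s1, s2, s3), s1 being the newest.
    """
    s1, s2, s3 = (state >> 2) & 1, (state >> 1) & 1, state & 1
    a = u ^ s2 ^ s3          # feedback 1 + D^2 + D^3 (13 octal)
    parity = a ^ s1 ^ s3     # feedforward 1 + D + D^3 (15 octal)
    return (a << 2) | (s1 << 1) | s2, parity


def forward_step(alpha_prev, gamma):
    """One max-Log-MAP forward update over the eight states.

    alpha_prev : eight forward metrics at time k-1
    gamma      : gamma[u][parity], branch metric for information bit u
    """
    # Candidate metrics for information '0' and '1', indexed by next state.
    cand = {0: [NEG_INF] * 8, 1: [NEG_INF] * 8}
    for s in range(8):
        for u in (0, 1):
            nxt, parity = step(s, u)
            cand[u][nxt] = alpha_prev[s] + gamma[u][parity]
    # Compare-select: keep the survivor for every state.
    return [max(c1, c0) for c1, c0 in zip(cand[1], cand[0])]
```

Each state is entered by exactly one '0' branch and one '1' branch in this trellis, which is why the two candidate sets are each a full eight-element vector that can be compared lane by lane.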
Backward recursion: Fig. 2 shows the computational flow of the
backward recursion, which is identical to that of the forward recursion.
Therefore, the backward metrics can be computed using the forward
recursion method with the data collected from the backward trellis
trace. The backward metrics are computed in the same loop as the
LLR computation. As the new backward metrics are being computed,
the metrics $\beta_k^1$ and $\beta_k^0$ that are related to information '1' and '0' are
kept in the register for LLR computation. The reference 'A' in Fig. 2
marks this situation. The new metrics are updated to $\beta_k$ and are used
for computing the next set of metrics while executing the LLR
computation. To eliminate extra memory, the backward metrics are not
stored in the memory. Next, we discuss the LLR computation, which
completes the backward recursion operations.
ELECTRONICS LETTERS 15th August 2002 Vol. 38 No. 17 971
Fig. 2 Computational flow of backward recursion

P-max-Log-MAP      Latency (cycle)     Memory (bit)
Branch metrics     7 + (N/16) × 8      2N × 8
Forward metrics    8 + (N × 5)         8N × 8

Fig. 3 Computational flow of LLR computation (stage labels: expand to 4 × 16 bits, expand to 2 × 32 bits, 32 bits)
LLR computation: Fig. 3 shows the computational flow of the LLR,
which is a continuation from the reference 'A' in Fig. 2. Both sets of
backward metrics, $\beta_k^1$ and $\beta_k^0$, are added to the forward metrics $\alpha_k$
in matched positions, yielding the a posteriori probability soft values
for information '1' and '0', represented as $\lambda^1_{0,\ldots,7}$ and $\lambda^0_{0,\ldots,7}$,
respectively. The rest of the LLR computation is to find the maximum
values of $\lambda^1_{0,\ldots,7}$ and $\lambda^0_{0,\ldots,7}$ using a unique search procedure as
shown in Fig. 3. Finally, the LLR soft value, $\Lambda_k$, is calculated as the
ratio between the two maximum values, which in the logarithm domain is
their difference, as in (1):
$\Lambda_k = \max[\lambda^1_{0,\ldots,7}] - \max[\lambda^0_{0,\ldots,7}]$    (1)
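The backward update and eqn. (1) can be combined in one loop body, as described above. The following scalar sketch is only a reference model under assumptions: the trellis is built by reading g(D) = 15/13 as feedforward 15 (octal) over feedback 13 (octal), the names step and backward_llr_step are hypothetical, and the stored forward metrics are taken to be those of the state at the start of the branch.

```python
def step(state, u):
    """Next state and parity bit of the assumed recursive encoder."""
    s1, s2, s3 = (state >> 2) & 1, (state >> 1) & 1, state & 1
    a = u ^ s2 ^ s3          # feedback 1 + D^2 + D^3 (13 octal)
    parity = a ^ s1 ^ s3     # feedforward 1 + D + D^3 (15 octal)
    return (a << 2) | (s1 << 1) | s2, parity


def backward_llr_step(beta_next, alpha_prev, gamma):
    """Fused backward update and LLR for one trellis step.

    beta_next  : eight backward metrics at the end of the step
    alpha_prev : eight stored forward metrics at the start of the step
    gamma      : gamma[u][parity], branch metric for information bit u
    Returns (beta_prev, llr).
    """
    # Backward candidates for information '0' and '1', kept for the LLR.
    beta = {0: [0.0] * 8, 1: [0.0] * 8}
    for s in range(8):
        for u in (0, 1):
            nxt, parity = step(s, u)
            beta[u][s] = beta_next[nxt] + gamma[u][parity]
    # Soft values lambda^1 and lambda^0: add forward metrics in matched positions.
    lam1 = [a + b for a, b in zip(alpha_prev, beta[1])]
    lam0 = [a + b for a, b in zip(alpha_prev, beta[0])]
    llr = max(lam1) - max(lam0)                      # eqn. (1)
    # Compare-select gives the backward metrics for the next loop pass.
    beta_prev = [max(b1, b0) for b1, b0 in zip(beta[1], beta[0])]
    return beta_prev, llr
```

Because the candidates beta[1] and beta[0] are consumed immediately, only the survivors need to be carried to the next pass, which mirrors the remark that the backward metrics need not be written to memory.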
Implementation results: The proposed P-max-Log-MAP was implemented
on an Analog Devices TigerSHARC dual-core DSP. The input
data is quantised into a minimum of 8-bit signed integer format.
Besides saving a great deal of memory, this format allows multiple
8-bit operations to be used. For single-core operation of the
TigerSHARC, 16 8-bit data can be processed in parallel by a
single VLIW SWP instruction, as shown in Table 1, which also
depicts the implementation results of the P-max-Log-MAP.
Table 1: Cycle count
Backward metrics   10 + (N × 20)       96N
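The 8-bit signed input format mentioned above can be illustrated with a short sketch. The scale factor, the rounding rule and the function names are assumptions introduced here for illustration; the Letter does not specify how the quantiser is configured.

```python
def quantise8(samples, scale=8.0):
    """Map real-valued soft inputs to signed 8-bit integers with saturation.

    scale is a hypothetical gain: quantisation steps per unit of input.
    """
    out = []
    for x in samples:
        q = int(round(x * scale))
        out.append(max(-128, min(127, q)))   # clip to the signed 8-bit range
    return out


def pack16(values):
    """Pack sixteen signed 8-bit values into one 128-bit integer, lane 0 lowest."""
    assert len(values) == 16
    word = 0
    for i, v in enumerate(values):
        word |= (v & 0xFF) << (8 * i)        # two's complement byte per lane
    return word
```

Sixteen such lanes occupy 128 bits, which is the amount of data a single instruction would have to cover to process 16 8-bit values at once.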
Conclusion: The results show that the proposed model, implementing
the 3GPP turbo decoder, can decode 3G data traffic channels with two
iterations at a rate beyond 2 Mbit/s on the 250 MHz TigerSHARC DSP.
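A rough consistency check of the quoted rate can be made from the latency expressions listed with Table 1. The sketch below rests on several assumptions that are not stated in the Letter: the three latencies are summed per constituent decoder pass, each iteration runs two such passes, interleaving and other overheads are ignored, and a single core is used. The function names are hypothetical.

```python
def cycles_per_pass(n):
    """Sum of the three latency expressions for a block of n bits."""
    branch = 7 + (n / 16) * 8    # branch metrics
    forward = 8 + n * 5          # forward metrics
    backward = 10 + n * 20       # backward metrics with the LLR computation
    return branch + forward + backward


def throughput_bps(n, clock_hz=250e6, iterations=2, passes_per_iteration=2):
    """Decoded bits per second under the assumptions stated above."""
    total_cycles = iterations * passes_per_iteration * cycles_per_pass(n)
    return n * clock_hz / total_cycles
```

For the largest 3GPP turbo block, N = 5114, this estimate comes out at roughly 2.45 Mbit/s, which is in line with the rate beyond 2 Mbit/s reported above.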
© IEE 2002 17 May 2002
Electronics Letters Online No: 20020663
DOI: 10.1049/el:20020663
K.K. Loo, T. Alukaidey and S.A. Jimaa (Department of ECEE,
University of Hertfordshire, AL10 9AB, United Kingdom)
E-mail: k.k.loo@herts.ac.uk
K. Salman (NC A&T State University, USA)
E-mail: ksalman@ncat.edu
References
1 BAHL, L.R., COCKE, J., JELINEK, F., and RAVIV, J.: ‘Optimal decoding of
linear codes for minimising symbol error rate’, IEEE Trans. Inf. Theory,
1974, pp. 284-287
2 ROBERTSON, P., and HOEHER, P.: ‘Optimal and sub-optimal maximum a
posteriori algorithms suitable for turbo decoding’, Eur. Trans.
Telecommun., 1997, 8, (2), pp. 119-125
3 ‘3rd Generation Partnership Project; Technical Specification Group
Radio Access Network; Multiplexing and Channel Coding (FDD)’,
3GPP TS25.212 v3.6.0, July 2001
Pilot symbol initiated sub-optimal sequence
estimation algorithm for QAM signals
Jong-Ho Lee, Seong-Cheol Kim and Jae Choong Han
A sub-optimal decision directed decoding algorithm based on the
expectation-maximisation algorithm is proposed. The algorithm
performs iterative sequence estimation assuming quadrature amplitude
modulation signals in a frequency non-selective fading channel. The
iteration begins using periodically inserted pilot symbols, and it was
observed that the algorithm converges mostly within two iterations.
The bit error rate performances of the proposed algorithm are
evaluated using computer simulation. The results show that the
proposed algorithm performs better than conventional schemes.
Introduction: Applying the M-ary QAM scheme for mobile communications
is attractive due to its inherent spectral efficiency. However,
channel fading introduces amplitude as well as phase distortion. Thus,
a fading compensation technique is required for coherent decoding.
Fading compensation techniques using periodically inserted pilot
symbols in the data stream have been studied in the literature [1]. It
is shown that the technique is simple to implement while providing
good performance. The technique uses pilot symbols to produce
fading estimates, which are used for coherent decoding of QAM
symbols. The performance of the technique can be further improved if
decoded symbols are used for channel estimation via some iterative
algorithm such as the expectation maximisation (EM) algorithm [2]. It
is well known that the EM algorithm provides the maximum likelihood
(ML) estimate under some conditions. The application of the EM
algorithm for a fading channel is introduced in [3]. However,
implementation of the EM algorithm for the QAM scheme is too
complicated. In this Letter, the EM algorithm is investigated for
QAM signals, and a sub-optimal pilot symbol initiated decision
directed decoder is proposed.
