INTRODUCTION
In 1993, Berrou et al. proposed a decoding scheme for parallel concatenated convolutional codes (PCCC) that consists of two soft-input soft-output (SISO) decoders concatenated through an interleaver-deinterleaver structure [1]. Each component decoder is matched to its corresponding encoder, as shown in figure 1. The interleaver allows the low-weight code words produced by a single encoder to be transformed into high-weight code words for the overall encoder. This iterative decoding performs within a few tenths of a dB of the Shannon limit when applied to BPSK transmission over a channel with memoryless noise. The conventional VLSI implementation of a MAP decoder (operating in the log domain) involves complex multiplication, exponential and logarithm computations. Suboptimal variants of MAP, namely Max-Log-MAP, Linear-Log-MAP and Log-MAP [2] [3], are therefore usually used for VLSI implementations. The aim of this paper is not to derive these algorithms rigorously but to identify the critical issues related to a reconfigurable turbo decoder array that is intended to facilitate various Viterbi decoding mappings. Our previous work [4] presented the Viterbi component details for the platform. This paper extends these concepts to the reconfigurable turbo decoder domain. Branch Metrics configuration bits [4]. We use Write-After-Read (WAR) RAMs to implement a two-memory architecture, compared with the three-memory architecture proposed in [5]. We showed in our previous work [4] that in Viterbi mode the write and read operations on these RAMs are performed without wasting any clock cycles, resulting in a dynamic context switch for multi-standard Viterbi mappings and continuous decoding operation in turbo mode. Both Viterbi and turbo decoders use forward and reverse state metric processing. To improve latency, windowed versions of the algorithm are typically employed in VLSI implementations; these are largely known as the sliding window BCJR algorithm [6].
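The suboptimal MAP variants named above differ mainly in how they approximate the log-domain sum ln(e^a + e^b), often written max*(a, b). The following Python sketch contrasts the three options. The function names are hypothetical, and the slope and offset of the linear correction are illustrative constants chosen for this sketch; they are not taken from this paper or from [2] [3].

```python
import math


def max_star_exact(a: float, b: float) -> float:
    """Log-MAP: exact ln(e^a + e^b) using the Jacobian logarithm."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def max_star_maxlog(a: float, b: float) -> float:
    """Max-Log-MAP: drop the correction term entirely."""
    return max(a, b)


def max_star_linear(a: float, b: float,
                    slope: float = 0.25,
                    offset: float = math.log(2.0)) -> float:
    """Linear approximation of the correction term.

    slope and offset are assumed, illustrative values; the correction
    is clamped at zero so it never becomes negative.
    """
    correction = max(0.0, offset - slope * abs(a - b))
    return max(a, b) + correction


if __name__ == "__main__":
    for a, b in [(0.0, 0.0), (1.0, 3.0), (-2.0, 5.0)]:
        print(a, b,
              max_star_exact(a, b),
              max_star_maxlog(a, b),
              max_star_linear(a, b))
```

In hardware terms, the Max-Log form needs only a comparator inside each add-compare-select operation, while the other two add a small correction stage after the comparison.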
The basic effect of windowing is that the equations are applied separately to portions (window lengths, WLs) of the global block of data. In its simplest form the algorithm uses two reverse processors, Reverse Processor Dummy B2 and Reverse Processor B1, in parallel with one forward processor (shown by ACS0-ACS7 in figure 3). In Viterbi mode the write and read controls by the FSM are much simpler and are explained in [4]. For turbo mode they are explained with the help of figure 4 and figure 5.
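The data dependencies of the windowing described above can be sketched as follows. This is a generic sliding-window plan under stated assumptions, not the exact time-slot schedule of figures 4 and 5: for each window, the forward processor covers the window itself, the dummy reverse processor warms up over the following window, and the valid reverse processor then covers the window starting from the dummy result. The names `WindowPlan` and `plan_windows` are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple

Span = Tuple[int, int]  # half-open index range [start, end)


class WindowPlan(NamedTuple):
    index: int
    forward: Span                   # traversed with increasing trellis index
    dummy_reverse: Optional[Span]   # warm-up span, traversed with decreasing index
    reverse: Span                   # valid reverse metrics, decreasing index


def plan_windows(block_len: int, wl: int) -> List[WindowPlan]:
    """Split a block of block_len trellis steps into windows of length wl."""
    if block_len <= 0 or wl <= 0:
        raise ValueError("block_len and wl must be positive")
    plans = []
    for idx, start in enumerate(range(0, block_len, wl)):
        end = min(start + wl, block_len)
        warm_end = min(end + wl, block_len)
        # The last window needs no warm-up: its reverse recursion can
        # start from the known end-of-block condition (assumption).
        dummy = (end, warm_end) if warm_end > end else None
        plans.append(WindowPlan(idx, (start, end), dummy, (start, end)))
    return plans


if __name__ == "__main__":
    for plan in plan_windows(block_len=10, wl=4):
        print(plan)
```

Because the dummy and valid reverse recursions operate on different windows, the two reverse processors can run concurrently with the forward processor, which is what keeps decoding continuous.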
INPUT RAMS:
Input RAMs store input metrics for two window lengths (WLs). In Viterbi mode the same input RAMs store the
Input metrics corresponding to the first window length 0-L are written in RAM1. The last metric is saved in the first memory location and the first metric in the last memory location, as shown in figure 4a. figure 6 with blue arrows. Note that this adjustment keeps the critical path exactly the same; however, the same Processor blocks can now be used for decoders with more than 8 states.
TIME SLOT L-2L (FIGURE 4B)
LLR CALCULATION:
As shown in the figure above, the LLR block requires the values of the forward and backward state metrics and the branch metrics.
Figure 7: LLR Computation Unit.
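How the three inputs combine can be sketched for a single trellis stage. The sketch assumes the Max-Log-MAP approximation and the sign convention LLR = ln P(u=1) - ln P(u=0); neither is confirmed by the text above. The function name `llr_max_log` and the two-state toy trellis in the usage part are hypothetical and chosen only to keep the example short.

```python
from typing import Dict, Iterable, Sequence, Tuple

Transition = Tuple[int, int, int]  # (from_state, input_bit, to_state)


def llr_max_log(alpha: Sequence[float],
                beta_next: Sequence[float],
                gamma: Dict[Tuple[int, int], float],
                transitions: Iterable[Transition]) -> float:
    """Max-Log-MAP LLR for one trellis stage.

    alpha[s]       forward state metric at the start of the stage
    beta_next[s2]  backward state metric at the end of the stage
    gamma[(s, u)]  branch metric for leaving state s with input bit u
    """
    best = {0: float("-inf"), 1: float("-inf")}
    for s, u, s_next in transitions:
        path_metric = alpha[s] + gamma[(s, u)] + beta_next[s_next]
        best[u] = max(best[u], path_metric)
    return best[1] - best[0]


if __name__ == "__main__":
    # Toy two-state trellis (assumption, not the 8-state array of the paper).
    toy_transitions = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
    toy_alpha = [0.0, -1.0]
    toy_beta_next = [0.5, 0.0]
    toy_gamma = {(0, 0): 0.2, (0, 1): -0.2, (1, 0): 0.1, (1, 1): 0.3}
    print(llr_max_log(toy_alpha, toy_beta_next, toy_gamma, toy_transitions))
```

A negative result favours u=0 under the assumed sign convention, and a positive result favours u=1.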
RECONFIGURABLE INTERCONNECT
The reconfiguration topology for Viterbi mappings was explained in our previous work [4]. We figure 6 for these connections). These flexible connections are provided through a multiplexer network, as shown in figure 1. The multiplexer network is therefore a multiplexer bank providing 4x1 and 8x1 multiplexer connections for each BM and FSM branch of the ACS operation of the Forward and Reverse processors. Viterbi blocks in the array are shown in white in figure 1, and these are clocked down using an active clock gating strategy throughout the chip.
INTERLEAVER:
One challenge in the design of turbo decoders is the length of the interleaver. The near-Shannon performance of turbo codes is directly linked to the length of the interleaver. 3GPP defines an interleaver with a length of more than five thousand bits. Interleavers are usually implemented by storing the interleaved address patterns in LUTs or ROMs. This storage amounts to interleaver memories equivalent to the frame length (for example 5114x6 bits for 3GPP). This is a major overhead in area and power, which we addressed in our previous work [7]. We showed performance improvements with an alternative memoryless implementation of the 3GPP S-Random Interleaver.
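To make the storage cost concrete, the sketch below builds a table-based S-random permutation, which is the kind of address pattern that would otherwise be held in a LUT or ROM. It uses a generic greedy construction with restarts and is neither the memoryless implementation of [7] nor the interleaver specified by 3GPP. The names `s_random_interleaver` and `spread_ok` and all parameter values are hypothetical.

```python
import random
from typing import List


def spread_ok(perm: List[int], s: int) -> bool:
    """True if positions at most s apart map to values more than s apart."""
    return all(abs(perm[i] - perm[j]) > s
               for i in range(len(perm))
               for j in range(max(0, i - s), i))


def s_random_interleaver(n: int, s: int, seed: int = 0,
                         max_restarts: int = 1000) -> List[int]:
    """Greedy S-random construction; restarts when it hits a dead end."""
    if n <= 0 or s < 1:
        raise ValueError("n must be positive and s at least 1")
    rng = random.Random(seed)
    for _ in range(max_restarts):
        remaining = list(range(n))
        rng.shuffle(remaining)
        perm: List[int] = []
        while remaining:
            for pos, cand in enumerate(remaining):
                # Compare the candidate with the last s accepted values.
                if all(abs(cand - prev) > s for prev in perm[-s:]):
                    perm.append(remaining.pop(pos))
                    break
            else:
                break  # no candidate fits: abandon this attempt
        if len(perm) == n:
            return perm
    raise RuntimeError("no S-random permutation found; try a smaller s")


if __name__ == "__main__":
    table = s_random_interleaver(n=64, s=3, seed=1)
    print(table[:16], spread_ok(table, 3))
```

The returned list has one entry per frame position, which is why a table-based interleaver grows with the frame length and why a memoryless address generator is attractive for area and power.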
RESULTS
The design is synthesized using Synopsys Design 
