Abstract-This paper presents an efficient synchronizer architecture using a common autocorrelator for Digital Video Broadcasting - Satellite, Second Generation (DVB-S2). To achieve the required performance under the worst channel conditions while using hardware resources efficiently across the functional synchronization blocks, we propose a new common autocorrelator structure. The proposed architecture reduces the number of multipliers by about 92% and the number of adders by about 81% compared with the direct implementation. Moreover, the proposed architecture has been thoroughly verified on a Xilinx™ Virtex IV and with R&S™ SFU (Signaling and Formatting Unit) broadcast test equipment.
I. INTRODUCTION
Recently, satellite communications have been growing for a variety of broadband applications, such as broadcast services for standard definition TV and HDTV, and interactive services including internet access. Satellites are essential for linking users over long distances or in cases where a cable connection is impractical. Digital Video Broadcasting - Satellite, Second Generation (DVB-S2) was standardized by the European Telecommunications Standards Institute (ETSI) in 2005. Compared with Digital Video Broadcasting via Satellite (DVB-S) [1], DVB-S2 achieves about 30% more channel capacity and a more flexible air interface, giving a better trade-off between acceptable quality level and broadcasting coverage in various satellite applications.
To ensure maximum channel efficiency, the synchronization part should operate well at an SNR of around -2.35 dB, where the Low Density Parity Check (LDPC) code can still reach Quasi Error Free (QEF) operation. Based on simulation results, we found that the existing synchronization algorithms [2] cannot satisfy the required performance [3]. To overcome this problem, more sophisticated and efficient algorithms should be employed for frame synchronization and carrier frequency synchronization. To meet the requirement, we use the Differential-Generalized Post Detection Integration (D-GPDI) algorithm for frame synchronization and the Mengali and Morelli (M&M) algorithm for carrier frequency synchronization [4] [5] [6]. Since these algorithms are based on the Maximum Likelihood (ML) method, they contain autocorrelation operations. However, the complexity of the autocorrelators is high, since the autocorrelation operations require complex multiplications over the 26 Start Of Frame (SOF) symbols, and the existing serial architecture [6] may be inefficient in terms of robust algorithm design and hardware complexity. For this reason, to reduce the hardware complexity while achieving suitable performance, we propose an efficient synchronizer architecture in which the synchronization blocks jointly use a common autocorrelator. From the hardware complexity point of view, the proposed architecture eliminates a considerable number of autocorrelators by sharing them. Moreover, the proposed architecture performs its operations in parallel and can rapidly acquire a channel with unknown parameters under the cold start condition. Therefore, this paper should be useful for designing an entire DVB-S2 demodulator with suitable performance and complexity.
The rest of this paper is organized as follows. Section II analyzes various synchronization algorithms. Section III describes the proposed hardware architecture. Section IV presents the implementation results. Finally, conclusions are drawn in Section V.
II. SYNCHRONIZATION ALGORITHMS

Frame Synchronization Algorithms
Typically, frame synchronization for a DVB-S2 system must tolerate a maximum carrier frequency offset of 5 MHz, which corresponds to 20% of the 25 Mbaud symbol rate, at an SNR of -2.35 dB. A general correlator used to locate the SOF symbols is sensitive to the carrier frequency offset, because the offset degrades coherent correlation and integration over the entire SOF. To overcome this difficulty, various Post Detection Integration (PDI) techniques have been proposed and investigated in the literature to alleviate the effect of carrier frequency uncertainty, such as NCPDI, DPDI, GPDI, and D-GPDI [3]. The system model is described here. Assume that the received signal is r(t)=s(t)+n(t), where s(t) is the transmitted signal and n(t) is Additive White Gaussian Noise (AWGN). The transmitted signal s(t) is given in Eq. (1).
Here, $L_{UW}$ is the unique word length in symbols, $L_F$ is the total frame length, $C_k$ is the k-th known symbol, and $d_k$ is the k-th random data symbol. When the signal arrives at the frame synchronizer, it has already passed through the symbol matched filter and been sampled at $(m+\Delta)T_s+\delta$, as expressed in Eq. (3).
$R_p$ is the pulse autocorrelation function and $n'_m$ is the noise component at the output of the matched filter. Given this signal at the frame synchronizer, we can compute the correlation with $a_k$, which is known to both the receiver and the transmitter. The correlation over the received signal samples in the SOF is
where $r_m$ is the received signal and $c_m$ is the SOF autocorrelation coefficient. The NCPDI, n-span DPDI, GPDI, and D-GPDI metrics are defined as
where M is the coherent sum length and L is the PDI length; these parameters should satisfy $M \times L = L_{UW}$. The proposed frame synchronizer uses M=1 and L=26 [3].
In our previous work, we adopted the D-GPDI algorithm after analyzing and simulating each algorithm [4].
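The effect of the carrier frequency offset on coherent correlation, and the robustness of differential PDI to it, can be illustrated with a minimal sketch. The metric definitions below follow the forms commonly given in the PDI literature (coherent partial sums over M symbols, then NCPDI as the sum of squared magnitudes and n-span DPDI as the magnitude of the lag-n differential sum); they are an assumption here, since the corresponding equations are not reproduced in this text. The reference sequence `c` is a hypothetical unit-modulus sequence, not the actual DVB-S2 SOF, and the function names are illustrative only.

```python
import numpy as np

def coherent_partial_sums(r, c, M):
    """Return x_i = sum over M symbols of r * conj(c), for each of the L blocks."""
    r = np.asarray(r, dtype=complex)
    c = np.asarray(c, dtype=complex)
    L = len(c) // M
    prod = r[:L * M] * np.conj(c[:L * M])
    return prod.reshape(L, M).sum(axis=1)

def ncpdi(x):
    """Non-coherent PDI: sum of squared magnitudes of the partial sums."""
    return float(np.sum(np.abs(x) ** 2))

def n_span_dpdi(x, n):
    """n-span differential PDI: |sum_i x_i * conj(x_{i-n})| (assumed form)."""
    return float(np.abs(np.sum(x[n:] * np.conj(x[:-n]))))

# Hypothetical unit-modulus reference sequence of length L_UW = 26
# (a stand-in for the SOF, NOT the real DVB-S2 pattern).
rng = np.random.default_rng(0)
L_UW = 26
c = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, L_UW)))

# Noiseless reception with a frequency offset of 20% of the symbol rate.
f_norm = 0.2
k = np.arange(L_UW)
r = c * np.exp(2j * np.pi * f_norm * k)

x = coherent_partial_sums(r, c, M=1)   # M = 1, L = 26 as in the text
coherent = float(np.abs(np.sum(x)))    # plain coherent correlation
dpdi_1 = n_span_dpdi(x, 1)
dpdi_2 = n_span_dpdi(x, 2)
```

In this noiseless case the coherent correlation collapses because the rotating phasors cancel, whereas the 1-span and 2-span DPDI terms keep nearly the full energy (L-1 and L-2 for unit-modulus symbols), which is the property the PDI family relies on.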
Frequency Synchronization Algorithms
Data-aided (DA) algorithms are normally employed to obtain good performance with short preambles. This paper focuses on the algorithm proposed by Mengali and Morelli (M&M) [5]. The M&M algorithm computes the common sample correlations as follows
where $L_p$ is the pilot symbol length and M is a design parameter not greater than $L_p/2$, which corresponds to the number of autocorrelators. $p_i^n$ is the i-th received pilot symbol of the n-th pilot block, and $c_i$ is the i-th reference pilot symbol.
The M&M [8] algorithm takes the form of
where $l_k$ is a smoothing function given by
The M&M algorithm estimates the frequency offset as a weighted average of $\arg\{R(k)R^*(k-1)\}$, the phase difference between neighboring sample correlations. Since the M&M algorithm is more accurate than the L&W algorithm [5] and its estimation range is larger than that of the other algorithms [4] [5] [6], we employed the M&M algorithm in our previous work [5].
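The weighted-average structure described above can be sketched as follows. The sample correlation R(k) and the smoothing weights use the form commonly cited for the Mengali and Morelli estimator; they are stated here as an assumption, because the paper's own equations are not reproduced in this text. The function names, the all-ones reference sequence, and the test offset are hypothetical. The estimate is normalized to the symbol rate.

```python
import numpy as np

def lag_autocorr(z, k):
    """Sample correlation R(k) = 1/(N-k) * sum_{i=k}^{N-1} z_i * conj(z_{i-k})."""
    n = len(z)
    return np.sum(z[k:] * np.conj(z[:n - k])) / (n - k)

def smoothing_weights(n, m):
    """Smoothing weights for k = 1..m (commonly cited M&M form, assumed here)."""
    k = np.arange(1, m + 1)
    num = 3.0 * ((n - k) * (n - k + 1) - m * (n - m))
    den = m * (4 * m**2 - 6 * m * n + 3 * n**2 - 1)
    return num / den

def mm_frequency_estimate(p, c, m):
    """Frequency offset estimate in cycles per symbol from pilots p and reference c."""
    z = np.asarray(p, dtype=complex) * np.conj(np.asarray(c, dtype=complex))
    R = np.array([lag_autocorr(z, k) for k in range(m + 1)])
    # Phase difference between neighboring sample correlations.
    dphi = np.angle(R[1:] * np.conj(R[:-1]))
    return float(np.sum(smoothing_weights(len(z), m) * dphi) / (2 * np.pi))

# Noiseless check with a hypothetical all-ones reference of length 26
# and M = 13 autocorrelator lags (M <= L_p / 2).
L_p, M = 26, 13
c = np.ones(L_p, dtype=complex)
p = c * np.exp(2j * np.pi * 0.2 * np.arange(L_p))
f_hat = mm_frequency_estimate(p, c, M)
w_sum = float(np.sum(smoothing_weights(L_p, M)))
```

Because the weights sum to one, a noiseless tone whose phase increment per symbol is constant is recovered exactly; with noise, the weights trade off the reliability of the individual lags.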
III. PROPOSED HARDWARE ARCHITECTURE
The D-GPDI algorithm for frame synchronization and the M&M algorithm for frequency synchronization are both based on the ML method, which inevitably requires autocorrelation functions. Because each algorithm needs the correlation with the 26 known SOF symbols, either to detect the SOF or to estimate the frequency offset, the frame and frequency synchronizers can jointly use a common autocorrelator. Accordingly, the proposed architecture creates a synergy in which the blocks exchange correlation values and output results. Fig. 1 depicts the existing hardware architecture [4] [5] [6]. In the existing architecture, the synchronization blocks are arranged in series and operate independently. Noise and a large frequency offset severely degrade frame synchronization performance. The remaining synchronization blocks therefore have a critical drawback: the entire system may stop working properly when frame synchronization operates improperly or fails under low SNR or a large frequency error. Fig. 2 shows the proposed architecture. The blocks share the common autocorrelator and operate jointly in parallel. The frequency error can be estimated as soon as the SOF is detected, and the frequency compensator gradually improves the operating condition of the frame synchronizer by continuously reducing the frequency offset. As a result, the proposed frame synchronizer, using only DPDI and 2-span DPDI, can detect the SOF well without significant performance degradation. In addition, after acquiring the initial carrier frequency offset estimate, the frequency synchronizer only needs to track the residual frequency offset and can considerably decrease the residual frequency error.
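The sharing idea can be illustrated in software. With M=1, the n-span DPDI sum and the lag-n sample correlation used for frequency estimation are the same quantity up to a positive scale factor, so one set of lag sums can feed both blocks. The sketch below is only an illustration under that assumption: the way the DPDI and 2-span DPDI terms are combined (a plain sum), the single-lag frequency estimate (a simplified stand-in for the full M&M weighting), and all names are hypothetical and not taken from the paper's hardware design.

```python
import numpy as np

def shared_lag_sums(r, c, max_lag):
    """Common autocorrelator: S(n) = sum_i x_i * conj(x_{i-n}), with x = r * conj(c)."""
    x = np.asarray(r, dtype=complex) * np.conj(np.asarray(c, dtype=complex))
    n_sym = len(x)
    return np.array([np.sum(x[n:] * np.conj(x[:n_sym - n]))
                     for n in range(max_lag + 1)])

def frame_metric(S):
    """Frame detection metric from DPDI and 2-span DPDI (combination assumed)."""
    return float(np.abs(S[1]) + np.abs(S[2]))

def coarse_frequency(S):
    """Single-lag frequency estimate in cycles per symbol (simplified stand-in)."""
    return float(np.angle(S[1]) / (2 * np.pi))

# Hypothetical all-ones reference of 26 symbols and a noiseless 20% offset.
c = np.ones(26, dtype=complex)
r = c * np.exp(2j * np.pi * 0.2 * np.arange(26))

S = shared_lag_sums(r, c, max_lag=2)   # computed once
metric = frame_metric(S)               # used by the frame synchronizer
f_hat = coarse_frequency(S)            # used by the frequency synchronizer
```

The lag sums are computed once and then read by both consumers, which mirrors the parallel, shared structure described above; a serial design would recompute the same products in each block.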
Existing Architecture
Proposed Architecture
The calculated D-GPDI energy increases rapidly as the SNR decreases, because the noise power increases. The threshold used to detect the SOF can therefore be changed adaptively according to the noise power estimated by the SNR estimator. As a result, the proposed frame synchronization architecture achieves reliable performance even at low SNR and with a large frequency offset. A direct implementation of D-GPDI requires 1,455 multipliers and 1,455 adders [3], which is too complex to implement the full D-GPDI algorithm.
However, the proposed frame synchronizer, using only DPDI and 2-span DPDI, can detect the frame with an appropriate compromise between performance and implementation complexity. In addition, owing to the characteristics of the SOF, all multipliers that calculate the coherent value can be replaced by multiplexers. Fig. 3 shows the performance of the proposed frame synchronizer. The y-axis of Fig. 3 represents the ratio of the D-GPDI value when the SOF is detected to the value when the SOF is not detected. The performance of the frame synchronizer is obtained at the maximum carrier frequency offset, 20% of the symbol rate. As shown in Fig. 3, the proposed frame synchronizer reliably satisfies the performance requirement, and thus it can considerably reduce the complexity of the entire system. Fig. 4 shows the performance of the proposed carrier frequency synchronization. The residual frequency offset of the proposed method is smaller than that in our previous work [6], because the frequency synchronizer compensates the residual frequency error after removing the initial frequency offset. In these simulation results, we found that the residual frequency offset converges to about 100 kHz after 4 frames. After 3 frames, the residual frequency offset of the proposed method is about 50 kHz lower than the convergence value of the residual frequency offset. However, this difference does not affect the convergence value of the residual frequency offset.
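The replacement of multipliers by multiplexers can be sketched as follows. The DVB-S2 SOF is π/2-BPSK modulated, so each reference symbol has the form (a + jb)/√2 with a, b in {+1, -1}. Multiplying a received sample by the conjugate of such a symbol then needs only sign selection and additions. The sketch below drops the common 1/√2 factor, which is an assumption that only rescales the metric; the function name is hypothetical and the code is a behavioral illustration, not the paper's Verilog design.

```python
def derotate_by_sof_symbol(x, y, a, b):
    """Compute (x + jy) * (a - jb) for a, b in {+1, -1} without multiplications.

    Each conditional plays the role of a multiplexer selecting a value or
    its negation; the result is (a*x + b*y, a*y - b*x).
    """
    ax = x if a > 0 else -x
    by = y if b > 0 else -y
    ay = y if a > 0 else -y
    bx = x if b > 0 else -x
    return ax + by, ay - bx

# Example: received sample 3 - 5j, reference symbol signs a = +1, b = +1.
re, im = derotate_by_sof_symbol(3, -5, 1, 1)
```

The same selection logic applies to all 26 SOF symbols, since their signs are known in advance and can be hard-wired.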
Performance Comparisons
IV. IMPLEMENTATION RESULTS
The proposed architecture has been modeled with CoWare™ SPW and implemented in Verilog HDL. The optimized bit widths were found through fixed-point modeling. Logic synthesis was performed using Synplicity™ Synplify Pro 8.0 and Xilinx™ ISE 9.2i. In addition, the proposed architecture has been thoroughly verified on an FPGA board carrying a Xilinx™ Virtex IV LX200, together with R&S™ SFU broadcast test equipment. Table 1 shows the performance comparison between the existing architecture, which does not share a common autocorrelator, and the proposed architecture. Since the direct 
V. CONCLUSIONS
This paper proposed a low-complexity synchronizer architecture using a common autocorrelator for the DVB-S2 system. The proposed architecture provides reliable performance even under the worst conditions of low SNR and large frequency offset. In addition, the proposed architecture, including the frame and frequency synchronizers, reduces the number of multipliers by about 92% and the number of adders by about 81% compared with the direct implementation. Therefore, the proposed synchronizer can reduce hardware resource utilization and power consumption.
Jang-Woong Park
