29 research outputs found

    Application-specific instruction set processor for SoC implementation of modern signal processing algorithms

    CP-Based SBHT-RLS Algorithms for Tracking Channel Estimates in Multicarrier Modulation Systems

    Interference Mitigation for WCDMA using QR Decomposition and a CORDIC-based Reconfigurable Systolic Array

    This paper presents the implementation and performance of QR decomposition based Recursive Least-Squares (QRD-RLS) for interference mitigation in Wideband CDMA (WCDMA). The implementation is carried out on CORSAEngine, a new Software-Defined Radio (SDR) processor developed by NEC Corporation and highly optimized for MIMO-OFDM systems. It is shown how QRD-RLS can be mapped onto its rectangular CORDIC-based reconfigurable systolic array, demonstrating its capability to process WCDMA. In addition, the performance of CORSAEngine is compared with that of other architectures, and it is found to achieve at least 91% of the performance of dedicated hardware in terms of computational density.
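
    As an illustration of the CORDIC operations that such a QRD systolic array relies on, the sketch below shows the two standard CORDIC modes: vectoring (used by a boundary cell to find the rotation that zeroes an element) and rotation (used by an internal cell to apply that rotation to the rest of the row). This is a minimal floating-point sketch, not the CORSAEngine design; the function names are hypothetical, and real hardware would use fixed-point shifts and adds with a precomputed gain constant.

```python
import math

def cordic_vectoring(x, y, iterations=32):
    """Rotate (x, y) onto the positive x-axis with shift-and-add micro-rotations.

    Returns (magnitude, angle_in_radians). Assumes x > 0 so the angle lies
    inside the CORDIC convergence range.
    """
    angle = 0.0
    gain = 1.0
    for i in range(iterations):
        t = 2.0 ** -i                      # a right shift by i bits in hardware
        d = -1.0 if y > 0 else 1.0         # always rotate towards y = 0
        x, y = x - d * y * t, y + d * x * t
        angle -= d * math.atan(t)          # accumulate the angle of the input vector
        gain *= math.sqrt(1.0 + t * t)     # each micro-rotation stretches the vector
    return x / gain, angle

def cordic_rotate(x, y, theta, iterations=32):
    """Rotate (x, y) by theta radians (|theta| below about 1.74) using CORDIC."""
    gain = 1.0
    for i in range(iterations):
        t = 2.0 ** -i
        d = 1.0 if theta >= 0 else -1.0    # drive the residual angle to zero
        x, y = x - d * y * t, y + d * x * t
        theta -= d * math.atan(t)
        gain *= math.sqrt(1.0 + t * t)
    return x / gain, y / gain
```

    For example, `cordic_vectoring(3.0, 4.0)` yields a magnitude close to 5 and the angle of the vector; passing the negated angle to `cordic_rotate` applies the same Givens rotation to another pair of elements.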

    REAL-TIME ADAPTIVE PULSE COMPRESSION ON RECONFIGURABLE, SYSTEM-ON-CHIP (SOC) PLATFORMS

    New radar applications need to run complex algorithms and process large quantities of data to generate useful information for users. This has motivated the search for better processing solutions, including low-power high-performance processors, efficient algorithms, and high-speed interfaces. In this work, a hardware implementation of adaptive pulse compression algorithms for real-time transceiver optimization is presented, based on a System-on-Chip architecture for reconfigurable hardware devices. The study also evaluates the performance of dedicated coprocessors as hardware accelerator units that speed up compute-intensive tasks such as matrix multiplication and matrix inversion, which are essential for solving the covariance matrix. The tradeoffs between latency and hardware utilization are also presented. Moreover, the system architecture takes advantage of the embedded processor, which is interconnected with the logic resources through high-performance buses, to perform floating-point operations, control the processing blocks, and communicate with an external PC through a customized software interface. The overall system functionality is demonstrated and tested for real-time operation using a Ku-band testbed together with a low-cost channel emulator for different types of waveforms.

    MIMO equalization.

    Thesis (M.Sc.Eng.)-University of KwaZulu-Natal, Durban, 2005.

    In recent years, space-time block codes (STBC) for multi-antenna wireless systems have emerged as attractive encoding schemes for wireless communications. These codes provide full diversity gain and achieve good performance with simple receiver structures, without any additional increase in bandwidth or power requirements. When implemented over broadband channels, STBCs can be combined with orthogonal frequency division multiplexing (OFDM) or single carrier frequency domain (SC-FD) transmission schemes to achieve multi-path diversity and to decouple the broadband frequency selective channel into independent flat fading channels. This dissertation focuses on the SC-FD transmission schemes that exploit the STBC structure to provide computationally cost-efficient receivers in terms of equalization and channel estimation. The main contributions in this dissertation are as follows:

    • The original SC-FD STBC receiver that benchmarks STBC in a frequency selective channel is limited to coherent detection, where knowledge of the channel state information (CSI) is assumed at the receiver. We extend this receiver to a multiple access system. Through analysis and simulations we prove that the extended system does not incur any performance penalty. This key result implies that the SC-FD STBC scheme is suitable for multiple-user systems where higher data rates are possible.
    • The problem of channel estimation is considered in a time and frequency selective environment. The existing receiver is based on a recursive least squares (RLS) adaptive algorithm and provides joint equalization and interference suppression. We use a system with perfect CSI to show from simulations how various design parameters for the RLS algorithm can be selected in order to get near perfect CSI performance.
    • The RLS receiver has two modes of operation, viz. training mode and direct decision mode. In training mode, a block of known symbols is used to make the initial estimate. To ensure convergence of the algorithm, a re-training interval must be predefined. This results in an increase in the system overhead. A linear predictor that uses knowledge of the autocorrelation function for a Rayleigh fading channel is developed. The predictor is combined with the adaptive receiver to provide a bandwidth efficient receiver by decreasing the training block size. The simulation results show that the performance penalty for the new system is negligible.
    • Finally, a new QR-based receiver is developed to provide a more robust solution than the RLS adaptive receiver. The simulation results clearly show that the new receiver outperforms the RLS-based receiver at higher Doppler frequencies, where rapid channel variations result in numerical instability of the RLS algorithm. The linear predictor is also added to the new receiver, which results in a more robust and bandwidth efficient receiver.
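
    The training mode described above can be sketched with the textbook exponentially weighted RLS recursion. This is a minimal real-valued sketch for estimating a short channel from known training symbols, not the dissertation's frequency-domain STBC receiver; the function names, the BPSK training sequence, the forgetting factor `lam` and the initialization constant `delta` are all assumptions chosen for illustration.

```python
import random

def rls_update(w, P, x, d, lam=0.99):
    """One RLS step: regressor x, desired sample d, forgetting factor lam."""
    n = len(w)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(x[i] * Px[i] for i in range(n))
    k = [v / denom for v in Px]                         # gain vector
    e = d - sum(w[i] * x[i] for i in range(n))          # a priori error
    w = [w[i] + k[i] * e for i in range(n)]
    # P stays symmetric, so x^T P equals (P x)^T and Px can be reused here.
    P = [[(P[i][j] - k[i] * Px[j]) / lam for j in range(n)] for i in range(n)]
    return w, P, e

def train(taps, num_symbols=200, lam=0.99, delta=1000.0, seed=0):
    """Estimate channel taps from a block of known (hypothetical) BPSK symbols."""
    rng = random.Random(seed)
    n = len(taps)
    w = [0.0] * n
    P = [[delta if i == j else 0.0 for j in range(n)] for i in range(n)]
    x = [0.0] * n
    for _ in range(num_symbols):
        s = rng.choice((-1.0, 1.0))                     # known training symbol
        x = [s] + x[:-1]                                # shift register of symbols
        d = sum(h * xi for h, xi in zip(taps, x))       # noise-free received sample
        w, P, _ = rls_update(w, P, x, d, lam)
    return w
```

    Calling `train([0.5, -0.3])` should return weights close to the two channel taps. The explicit update of `P` is the step whose numerical behaviour motivates QR-based alternatives, which propagate a triangular factor instead.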

    A DSP based system for experiments with systolic arrays

    Synthèse d'architectures parallèles dédiées du filtre de Kalman dans l'environnement MMAlpha [Synthesis of dedicated parallel architectures for the Kalman filter in the MMAlpha environment]

    Formal process for systolic array design using recurrences

    Comparison of logarithmic and floating-point number systems implemented on Xilinx Virtex-II field-programmable gate arrays

    The aim of this thesis is to compare the implementation of parameterisable LNS (logarithmic number system) and floating-point high dynamic range number systems on FPGA. The Virtex/Virtex-II range of FPGAs from Xilinx, the most popular FPGA technology, is used to implement the designs. The study focuses on using the low-level primitives of the technology efficiently, so the design issues in implementing fixed-point operators are considered first. The four basic operations of addition, multiplication, division and square root are considered. Carry-free adders, ripple-carry adders, parallel multipliers, and digit recurrence division and square root are discussed. The floating-point operators use the word format and exceptions described by IEEE Std 754. A dual-path adder implementation is described in detail, as are floating-point multiplier, divider and square root components. Results and comparisons with other works are given. The efficient implementation of function evaluation methods is considered next. An overview of current FPGA methods is given, and a new piecewise polynomial implementation using the Taylor series is presented and compared with other designs in the literature. In the next section the LNS word format, accuracy and exceptions are described, and two new LNS addition/subtraction function approximations are described. The algorithms for performing multiplication, division and powering in the LNS domain are also described and compared with other designs in the open literature. Parameterisable algorithms for converting between the fixed-point domain and the LNS and floating-point domains are described and implementation results given. In the next chapter, MATLAB bit-true software models are given that have exactly the same functionality as the hardware models. The interfaces of the models are given, and a serial communication system for performing low-speed system tests is described. A comparison of the LNS and floating-point number systems in terms of area and delay is given. Different functions implemented in LNS and floating-point arithmetic are also compared and conclusions are drawn. The results show that when the LNS is implemented with a characteristic of 6 bits or fewer it is superior to floating-point. However, for larger characteristic lengths the floating-point system is more efficient, owing to the delay and exponential area increase of the LNS addition operator. For characteristics larger than 6 bits, the LNS is beneficial only for specialist applications that require a high proportion of division, multiplication, square root and powering operations and few additions.
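
    The asymmetry between cheap LNS multiplication and costly LNS addition can be illustrated with a small software model. This is a minimal sketch, not the thesis's bit-true models: it represents only positive values, stores the base-2 logarithm as a Python float rather than a fixed-point word, and omits sign and zero handling. All function names are hypothetical.

```python
import math

def lns_encode(value):
    """Represent a positive real by its base-2 logarithm."""
    return math.log2(value)

def lns_decode(code):
    """Convert an LNS code back to a real number."""
    return 2.0 ** code

def lns_mul(a, b):
    return a + b              # multiplication becomes one addition

def lns_div(a, b):
    return a - b              # division becomes one subtraction

def lns_sqrt(a):
    return a / 2.0            # square root becomes a one-bit shift

def lns_pow(a, exponent):
    return a * exponent       # powering becomes a multiplication

def lns_add(a, b):
    """Addition needs the nonlinear function log2(1 + 2**d) with d <= 0.

    Hardware has to approximate this function with tables or polynomials,
    which is the costly part of an LNS datapath.
    """
    hi, lo = max(a, b), min(a, b)
    return hi + math.log2(1.0 + 2.0 ** (lo - hi))
```

    For example, `lns_decode(lns_mul(lns_encode(3.0), lns_encode(4.0)))` is approximately 12, obtained with a single addition, whereas `lns_add` on the same operands must evaluate the nonlinear term to reach approximately 7.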