
    Overview of Parallel Platforms for Common High Performance Computing

    This paper surveys parallel platforms used for high-performance computing in the signal processing domain. Specifically, it considers methods that exploit multicore central processing units, such as the Message Passing Interface (MPI) and OpenMP. The properties of these programming methods are verified experimentally on a fast Fourier transform (FFT) and a discrete cosine transform (DCT), and are compared with MATLAB's built-in functions and with Texas Instruments digital signal processors (DSPs) based on very long instruction word (VLIW) architectures. New FFT and DCT implementations were proposed and tested. These implementations were compared with CPU-based computing methods and with the Texas Instruments digital signal processing library on C6747 floating-point DSPs. An optimal combination of computing methods for the signal processing domain, together with the implementation of new, fast routines, is proposed as well.

    Non-power-of-Two FFTs: Exploring the Flexibility of the Montium TP

    Coarse-grain reconfigurable architectures, like the Montium TP, have proven to be a very successful approach for low-power and high-performance computation of regular digital signal processing algorithms. This paper presents the implementation of a class of non-power-of-two FFTs to discover the limitations and flexibility of the Montium TP for less regular algorithms. A non-power-of-two FFT is less regular than a traditional power-of-two FFT. The results show the processing time, accuracy, energy consumption and flexibility of the implementation.

    Implementation of a Combined OFDM-Demodulation and WCDMA-Equalization Module

    For a dual-mode baseband receiver for the OFDM Wireless LAN and WCDMA standards, integration of the demodulation and equalization tasks on a dedicated hardware module has been investigated. For OFDM demodulation, an FFT algorithm based on cascaded twiddle factor decomposition has been selected. This type of algorithm combines high spatial and temporal regularity in the FFT data-flow graphs with a minimal number of computations. A frequency-domain algorithm based on a circulant channel approximation has been selected for WCDMA equalization. It has good performance, low hardware complexity and a low number of computations. Its main advantage is the reuse of the FFT kernel, which contributes to the integration of both tasks. The demodulation and equalization module has been described at the register transfer level with the in-house developed Arx language. The core of the module is a pipelined radix-2^3 butterfly combined with a complex multiplier and complex divider. The module has an area of 0.447 mm² in 0.18 µm technology and a power consumption of 10.6 mW. The proposed module compares favorably with solutions reported in the literature.
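    To illustrate why a circulant channel approximation lets the equalizer reuse the FFT kernel, the following is a minimal sketch, not the module described in the abstract. It assumes a simple zero-forcing rule (one complex division per frequency bin), which the abstract does not specify. The function names, the channel taps `h` and the symbols `x` are hypothetical, and a naive DFT stands in for the FFT.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) DFT; a stand-in for the shared FFT kernel."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def circular_convolve(h, x):
    """Model a circulant channel: y = h (*) x with wrap-around indexing."""
    n = len(x)
    return [sum(h[j] * x[(i - j) % n] for j in range(n)) for i in range(n)]

def equalize(y, h):
    """Zero-forcing equalizer (an assumption): X[k] = Y[k] / H[k]."""
    Y = dft(y)
    H = dft(h)
    X = [Yk / Hk for Yk, Hk in zip(Y, H)]  # one complex division per bin
    return dft(X, inverse=True)

# Hypothetical channel taps (zero-padded to the block length) and symbols.
h = [1.0, 0.5, 0.0, 0.0]
x = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]

y = circular_convolve(h, x)   # received block
x_hat = equalize(y, h)        # recovered symbols
```

    Because a circulant matrix is diagonalized by the DFT, equalization reduces to a forward transform, a per-bin division and an inverse transform, which is consistent with the abstract's pairing of an FFT butterfly with a complex multiplier and divider.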

    Parallel Fast Legendre Transform

    We discuss a parallel implementation of a fast algorithm for the discrete polynomial Legendre transform. We give an introduction to the Driscoll-Healy algorithm using polynomial arithmetic and present experimental results on the efficiency and accuracy of our implementation. The algorithms were implemented in ANSI C using the BSPlib communications library. Furthermore, we present a new algorithm for computing the Chebyshev transform of two vectors at the same time.

    On-board demux/demod

    To make satellite channels cost competitive with optical cables, the use of small, inexpensive earth stations with reduced antenna size and high-power amplifier (HPA) power will be needed. This will necessitate the use of high e.i.r.p. and gain-to-noise temperature ratio (G/T) multibeam satellites. For a multibeam satellite, onboard switching is required in order to maintain the needed connectivity between beams. This switching function can be realized by either a radio frequency (RF) or a baseband unit. The baseband switching approach has the additional advantage of decoupling the up-link and down-link, thus enabling rate and format conversion as well as improving the link performance. A baseband switching satellite requires the demultiplexing and demodulation of the up-link carriers before they can be switched to their assigned down-link beams. The principles of operation and the design and implementation issues of such an onboard demultiplexer/demodulator (bulk demodulator), recently built at COMSAT Labs, are discussed.

    On the efficient parallel computation of Legendre transforms

    In this article, we discuss a parallel implementation of efficient algorithms for computation of Legendre polynomial transforms and other orthogonal polynomial transforms. We develop an approach to the Driscoll-Healy algorithm using polynomial arithmetic and present experimental results on the accuracy, efficiency, and scalability of our implementation. The algorithms were implemented in ANSI C using the BSPlib communications library. We also present a new algorithm for computing the cosine transform of two vectors at the same time.
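    The abstract does not describe how two vectors are transformed at once. As a hedged illustration of the general idea only, the sketch below shows the well-known analogous trick for the DFT: two real vectors are packed into one complex vector, transformed once, and separated using conjugate symmetry. This is not the authors' cosine-transform algorithm, and all function names are hypothetical.

```python
import cmath

def dft(x):
    """Naive O(n^2) DFT; a stand-in for a fast transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def dft_two_real(x, y):
    """Transform two real vectors with a single complex DFT.

    Pack z = x + i*y, so Z[k] = X[k] + i*Y[k]. For real inputs,
    X[n-k] = conj(X[k]) and Y[n-k] = conj(Y[k]), which gives
    X[k] = (Z[k] + conj(Z[n-k])) / 2 and
    Y[k] = (Z[k] - conj(Z[n-k])) / (2i).
    """
    n = len(x)
    Z = dft([complex(a, b) for a, b in zip(x, y)])
    X, Y = [], []
    for k in range(n):
        Zc = Z[(n - k) % n].conjugate()
        X.append((Z[k] + Zc) / 2)
        Y.append((Z[k] - Zc) / 2j)
    return X, Y

# Hypothetical real input vectors of equal length.
x = [1.0, 2.0, 0.5, -1.0]
y = [0.0, 3.0, -2.0, 4.0]

X, Y = dft_two_real(x, y)
```

    The packing roughly halves the transform work compared with two separate complex transforms of the same length, at the cost of a linear-time separation step.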