28 research outputs found

    Number theoretic techniques applied to algorithms and architectures for digital signal processing

    Many of the techniques for the computation of a two-dimensional convolution of a small fixed window with a picture are reviewed. It is demonstrated that Winograd's cyclic convolution and Fourier Transform Algorithms, together with Nussbaumer's two-dimensional cyclic convolution algorithms, have a common general form. Many of these algorithms use the theoretical minimum number of general multiplications. A novel implementation of these algorithms is proposed which is based upon one-bit systolic arrays. These systolic arrays are networks of identical cells, with each cell sharing a common control and timing function. Each cell is connected only to its nearest neighbours. These are all attractive features for implementation using Very Large Scale Integration (VLSI). The throughput rate is limited only by the time to perform a one-bit full addition. In order to assess the usefulness of these systolic arrays, a 'cost function' is developed to compare them with more conventional techniques, such as the Cooley-Tukey radix-2 Fast Fourier Transform (FFT). The cost function shows that these systolic arrays offer a good way of implementing the Discrete Fourier Transform for transforms up to about 30 points in length. The cost function is a general tool and allows comparisons to be made between different implementations of the same algorithm and between dissimilar algorithms. Finally, a technique is developed for the derivation of Discrete Cosine Transform (DCT) algorithms from the Winograd Fourier Transform Algorithm. These DCT algorithms may be implemented by modified versions of the systolic arrays proposed earlier, but require only half the number of cells.

    Design of microprocessor-based hardware for number theoretic transform implementation

    Number Theoretic Transforms (NTTs) are defined in a finite ring of integers Z_M, where M is the modulus. All the arithmetic operations are carried out modulo M. NTTs are similar in structure to DFTs, hence fast FFT-type algorithms may be used to compute NTTs efficiently. A major advantage of the NTT is that it can be used to compute error-free convolutions: unlike the FFT, it is not subject to round-off and truncation errors. In 1976 Winograd proposed a set of short-length DFT algorithms using fewer multiplications and approximately the same number of additions as the Cooley-Tukey FFT algorithm. This saving is accomplished at the expense of increased algorithm complexity. These short-length DFT algorithms may be combined to perform longer transforms. The Winograd Fourier Transform Algorithm (WFTA) was implemented on a TMS9900 microprocessor to compute NTTs. Since multiplication modulo M is very time consuming, a special-purpose external hardware modular multiplier was designed, constructed and interfaced with the TMS9900 microprocessor. This external hardware modular multiplier allowed an improvement in the transform execution time. Computation time may be reduced further by employing several microprocessors. Taking advantage of the inherent parallelism of the WFTA, a dedicated parallel microprocessor system was designed and constructed to implement a 15-point WFTA in parallel. Benchmark programs were written to choose a suitable microprocessor for the parallel microprocessor system. A master (host) microprocessor is used to control the parallel microprocessor system and provides an interface to the outside world. An analogue-to-digital (A/D) and a digital-to-analogue (D/A) converter allow real-time digital signal processing.
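
    The error-free property claimed for the NTT can be illustrated with a small example. The sketch below is not the TMS9900/WFTA implementation described in the abstract: it assumes a prime modulus M = 257 and a length N = 8 (both chosen here purely for illustration), uses a naive O(N²) transform rather than a fast algorithm, and all function names are hypothetical.

```python
def find_root_of_unity(n, m):
    """Return an element of multiplicative order exactly n modulo the prime m."""
    for a in range(2, m):
        if pow(a, n, m) == 1 and all(pow(a, k, m) != 1 for k in range(1, n)):
            return a
    raise ValueError("no element of order n; n must divide m - 1")


def ntt(x, root, m):
    """Naive O(N^2) number theoretic transform of x modulo m."""
    n = len(x)
    return [sum(x[j] * pow(root, j * k, m) for j in range(n)) % m
            for k in range(n)]


def intt(X, root, m):
    """Inverse transform: use the inverse root and scale by N^-1 (mod m)."""
    inv_root = pow(root, -1, m)   # modular inverse, Python 3.8+
    inv_n = pow(len(X), -1, m)
    return [(inv_n * v) % m for v in ntt(X, inv_root, m)]


def cyclic_convolve_ntt(a, b, m):
    """Cyclic convolution via pointwise products in the transform domain."""
    root = find_root_of_unity(len(a), m)
    A, B = ntt(a, root, m), ntt(b, root, m)
    return intt([(u * v) % m for u, v in zip(A, B)], root, m)


def cyclic_convolve_direct(a, b):
    """Reference cyclic convolution in ordinary integer arithmetic."""
    n = len(a)
    return [sum(a[j] * b[(k - j) % n] for j in range(n)) for k in range(n)]


M = 257                                  # assumed modulus (prime, 8 divides M - 1)
a = [1, 2, 3, 4, 0, 0, 0, 0]
b = [5, 6, 7, 0, 0, 0, 0, 0]
print(cyclic_convolve_ntt(a, b, M))      # [5, 16, 34, 52, 45, 28, 0, 0]
print(cyclic_convolve_direct(a, b))      # identical, with no rounding involved
```

    The two results agree exactly because every true output value is smaller than M; if an output could reach or exceed M, the NTT result would be correct only modulo M.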

    The design of multiconfiguration axisymmetric optical systems


    Far-field radiation patterns of aperture antennas by the Winograd Fourier transform algorithm

    A more time-efficient algorithm for computing the discrete Fourier transform, the Winograd Fourier transform (WFT), is described. The WFT algorithm is compared with other transform algorithms. Results indicate that antenna analysis is a very successful application of the WFT algorithm. Significant savings in CPU time will improve the computer turnaround time and circumvent the need to resort to weekend runs.

    DFT algorithms for bit-serial GaAs array processor architectures

    Systems and Processes Engineering Corporation (SPEC) has developed an innovative array processor architecture for computing Fourier transforms and other commonly used signal processing algorithms. This architecture is designed to extract the highest possible array performance from state-of-the-art GaAs technology. SPEC's architectural design includes a high performance RISC processor implemented in GaAs, along with a Floating Point Coprocessor and a unique Array Communications Coprocessor, also implemented in GaAs technology. Together, these data processors represent the latest in technology, from both an architectural and an implementation viewpoint. SPEC has examined numerous algorithms and parallel processing architectures to determine the optimum array processor architecture. SPEC has developed an array processor architecture with integral communications ability to provide maximum node connectivity. The Array Communications Coprocessor embeds communications operations directly in the core of the processor architecture. A Floating Point Coprocessor architecture has been defined that utilizes Bit-Serial arithmetic units, operating at very high frequency, to perform floating point operations. These Bit-Serial devices reduce the device integration level and complexity to a level compatible with state-of-the-art GaAs device technology.

    The inherent overlapping in the parallel calculation of the Laplacian

    A new approach for the parallel computation of the Laplacian in the Fourier domain is presented. This numerical problem inherits the intrinsic sequencing involved in the calculation of any multidimensional Fast Fourier Transform (FFT), where blocking communications ensure that its computation is carried out strictly dimension by dimension. Such data dependency vanishes when one considers the Laplacian as the sum of n independent one-dimensional kernels, so that computation and communication can be naturally overlapped with nonblocking communications. Overlapping is demonstrated to be responsible for the speedup figures we obtain when our approach is compared to state-of-the-art parallel multidimensional FFTs. Junta de Castilla León (grant number VA296P18
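
    The decomposition of the Laplacian into independent one-dimensional kernels can be sketched serially as follows. This is only an illustration of the mathematical identity, not the authors' parallel implementation: it assumes a periodic 2-D grid, uses a naive pure-Python DFT instead of an FFT, contains no communication at all, and every function name is hypothetical.

```python
import cmath
import math


def dft(v, inverse=False):
    """Naive O(N^2) discrete Fourier transform (inverse includes the 1/N factor)."""
    n = len(v)
    sign = 1 if inverse else -1
    out = [sum(v[j] * cmath.exp(sign * 2j * math.pi * j * k / n) for j in range(n))
           for k in range(n)]
    return [z / n for z in out] if inverse else out


def second_derivative_1d(v, length):
    """Spectral second derivative of one periodic line of samples."""
    n = len(v)
    # Integer mode numbers in DFT order: 0, 1, ..., n/2 - 1, -n/2, ..., -1.
    modes = [k if k < n / 2 else k - n for k in range(n)]
    spectrum = dft(v)
    scaled = [-(2 * math.pi * m / length) ** 2 * s for m, s in zip(modes, spectrum)]
    return [z.real for z in dft(scaled, inverse=True)]


def laplacian_2d(f, lx, ly):
    """Laplacian as the sum of two independent 1-D kernels (one per dimension)."""
    ny, nx = len(f), len(f[0])
    d2x = [second_derivative_1d(row, lx) for row in f]               # along x
    d2y_t = [second_derivative_1d(list(col), ly) for col in zip(*f)]  # along y
    # d2y_t is indexed [j][i]; neither term needs the other's result.
    return [[d2x[i][j] + d2y_t[j][i] for j in range(nx)] for i in range(ny)]


# Test function f = sin(x) cos(2y) on [0, 2*pi)^2, whose Laplacian is -5 f.
n = 8
L = 2 * math.pi
f = [[math.sin(L * j / n) * math.cos(2 * L * i / n) for j in range(n)]
     for i in range(n)]
lap = laplacian_2d(f, L, L)
max_error = max(abs(lap[i][j] + 5 * f[i][j]) for i in range(n) for j in range(n))
print(max_error)  # close to machine precision
```

    Because the x and y terms are computed from the input independently, either one could in principle proceed while data for the other is still in transit, which is the overlap the abstract refers to.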

    A proposal for the C06 chapter of the NAG Algol 68 library


    Number theoretic transform implementation using microprocessors

    Since 1974 considerable interest has been shown in the literature in the topic of number theoretic transforms. These transforms provide an efficient integer processing technique for convolution. Microprocessors are suited to integer processing, particularly for applications where the required processing load is small. It was therefore a natural step to investigate and tailor the properties of number theoretic transforms to the capabilities of microprocessors, to provide cheap and compact processors using efficient signal processing algorithms. It was found that efficient number theoretic transforms could be defined using the modulus M = 65521, and this is especially convenient for a microprocessor implementation. Relevant aspects of modular arithmetic are investigated. The techniques developed are extended to allow for complex signal processing. In conclusion, it is shown that number theoretic transforms can be used to encode and decode Reed-Solomon error-correcting codes.
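
    The abstract does not say why M = 65521 is convenient, so the following is only a plausible reading based on standard facts: 65521 = 2^16 - 15 fits in a 16-bit word, and for a prime modulus an NTT of length N exists exactly when N divides M - 1. The sketch below checks these properties; the helper names are hypothetical.

```python
def is_prime(n):
    """Trial-division primality test, adequate for 16-bit values."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def factorize(n):
    """Return the prime factorization of n as {prime: exponent}."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def divisors(n):
    """All positive divisors of n (brute force is fine at this size)."""
    return [d for d in range(1, n + 1) if n % d == 0]


M = 65521
print(is_prime(M))              # True
print(factorize(M - 1))         # {2: 4, 3: 2, 5: 1, 7: 1, 13: 1}
lengths = divisors(M - 1)       # candidate transform lengths N
print(len(lengths))             # 120
```

    Under this reading, lengths such as 15 and 16 are available, as is any other divisor of 65520 = 2^4 · 3^2 · 5 · 7 · 13, while a length of 32 is not.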

    Application of bit-slice microprocessors to digital correlation in spread spectrum communication systems

    This thesis describes the application of commercially available microprocessors and other VLSI devices to high-speed real-time digital correlation in spread spectrum and related communication applications. Spread spectrum systems are wide-band secure communication systems that generate a signal of very broad spectral bandwidth, which is therefore hard to detect in noise. They are capable of rejecting intentional or unintentional jamming, and are insensitive to the multipath and fading that affect conventional high frequency systems. The bandwidth of spread spectrum systems must be large to obtain a significant performance improvement. This means that the sequence rate must be fast, and therefore very fast microprocessors are required when they are used to perform spread spectrum correlation. Because microprocessors cannot perform multiplication efficiently, considerable work has been published since 1974 on minimising the number of multiplications required in digital correlation and other signal processing algorithms. These fast techniques are investigated and implemented using general-purpose microprocessors. The restricted-bandwidth problem in microprocessor-based digital correlators is discussed. A new implementation is suggested which uses bit-slice devices to maintain the flexibility of microprocessor-based digital correlation without sacrificing speed. This microprocessor-based system has been found to be efficient in implementing the correlation process at baseband in the digital domain, as well as the post-correlation signal processing (demodulation, detection and tracking), especially for low-rate signals. A charge-coupled device is used to obtain the spectral density function. An all-digital technique is described which is programmable for any binary waveform and can be used for achieving initial acquisition and maintaining synchronisation in spread spectrum communications. Many of the practical implementation problems are discussed. The receiver performance, which is measured in terms of the acquisition time and the bit-error rate, is also presented, and results are obtained which are close to those predicted in the system simulations.