815 research outputs found

    Radix-2 x 2 x 2 algorithm for the 3-D discrete hartley transform

    Get PDF
    The discrete Hartley transform (DHT) has proved to be a valuable tool in digital signal/image processing and communications and has also attracted research interests in many multidimensional applications. Although many fast algorithms have been developed for the calculation of one- and two-dimensional (1-D and 2-D) DHT, the development of multidimensional algorithms in three and more dimensions is still unexplored and has not been given similar attention; hence, the multidimensional Hartley transform is usually calculated through the row-column approach. However, proper multidimensional algorithms can be more efficient than the row-column method and need to be developed. Therefore, it is the aim of this paper to introduce the concept and derivation of the three-dimensional (3-D) radix-2 2X 2X algorithm for fast calculation of the 3-D discrete Hartley transform. The proposed algorithm is based on the principles of the divide-and-conquer approach applied directly in 3-D. It has a simple butterfly structure and has been found to offer significant savings in arithmetic operations compared with the row-column approach based on similar algorithms

    Multiplierless DCT Algorithm for Image Compression Applications

    Get PDF
    This paper presents a novel error-free (infinite-precision) architecture for the fast implementation of 8x8 2-D Discrete Cosine Transform. The architecture uses a new algebraic integer encoding of a 1-D radix-8 DCT that allows the separable computation of a 2-D 8x8 DCT without any intermediate number representation conversions. This is a considerable improvement on previously introduced algebraic integer encoding techniques to compute both DCT and IDCT which eliminates the requirements to approximate the transformation matrix ele- ments by obtaining their exact representations and hence mapping the transcendental functions without any errors. Apart from the multiplication-free nature, this new mapping scheme fits to this algorithm, eliminating any computational or quantization errors and resulting short-word-length and high-speed-design

    VLSI Architectures of Lifting-Based Discrete Wavelet Transform

    Get PDF

    Design and Evaluation of a Scalable Engine for 3D-FFT Computation in an FPGA Cluster

    Get PDF
    The Three Dimensional Fast Fourier Transform (3D-FFT) is commonly used to solve the partial differential equations describing the system evolution in several physical phenomena, such as the motion of viscous fluids described by the Navier–Stokes equations. Simulation of such problems requires the use of a parallel High-Performance Computing architecture since the size of the problem grows with the cube of the FFT size, and the representation of the single point comprises several double precision floating- point complex numbers. Modern High-Performance Computing (HPC) systems are considering the inclusion of FPGAs as components of this computing architecture because they can combine effective hardware acceleration capabilities and dedicated communication facilities. Furthermore, the network topology can be optimized for the specific calculation that the cluster must perform, especially in the case of algorithms limited by the data exchange delay between the processors. In this paper, we explore an HPC design that uses FPGA accelerators to compute the 3DFFT. We devise a scalable FFT engine based on a custom radix-2 double-precision core that is used to implement the Decimation in Frequency version of the Cooley–Tukey FFT algorithm. The FFT engine can be adapted to different technology constraints and networking topologies by adjusting the number of cores and configuration parameters in order to minimize the overall calculation time. We compare the various possible configurations with the technological limits of available hardware. Finally, we evaluate the bandwidth required for continuous FFT execution in the APEnet toroidal mesh network.

    Dimension Reduction Using Quantum Wavelet Transform on a High-Performance Reconfigurable Computer

    Get PDF
    This work is licensed under a Creative Commons Attribution 4.0 International License.The high resolution of multidimensional space-time measurements and enormity of data readout counts in applications such as particle tracking in high-energy physics (HEP) is becoming nowadays a major challenge. In this work, we propose combining dimension reduction techniques with quantum information processing for application in domains that generate large volumes of data such as HEP. More specifically, we propose using quantum wavelet transform (QWT) to reduce the dimensionality of high spatial resolution data. The quantum wavelet transform takes advantage of the principles of quantum mechanics to achieve reductions in computation time while processing exponentially larger amount of information. We develop simpler and optimized emulation architectures than what has been previously reported, to perform quantum wavelet transform on high-resolution data. We also implement the inverse quantum wavelet transform (IQWT) to accurately reconstruct the data without any losses. The algorithms are prototyped on an FPGA-based quantum emulator that supports double-precision floating-point computations. Experimental work has been performed using high-resolution image data on a state-of-the-art multinode high-performance reconfigurable computer. The experimental results show that the proposed concepts represent a feasible approach to reducing dimensionality of high spatial resolution data generated by applications such as particle tracking in high-energy physics

    Digital and Mixed Domain Hardware Reduction Algorithms and Implementations for Massive MIMO

    Get PDF
    Emerging 5G and 6G based wireless communications systems largely rely on multiple-input-multiple-output (MIMO) systems to reduce inherently extensive path losses, facilitate high data rates, and high spatial diversity. Massive MIMO systems used in mmWave and sub-THz applications consists of hundreds perhaps thousands of antenna elements at base stations. Digital beamforming techniques provide the highest flexibility and better degrees of freedom for phased antenna arrays as compared to its analog and hybrid alternatives but has the highest hardware complexity. Conventional digital beamformers at the receiver require a dedicated analog to digital converter (ADC) for every antenna element, leading to ADCs for elements. The number of ADCs is the key deterministic factor for the power consumption of an antenna array system. The digital hardware consists of fast Fourier transform (FFT) cores with a multiplier complexity of (N log2N) for an element system to generate multiple beams. It is required to reduce the mixed and digital hardware complexities in MIMO systems to reduce the cost and the power consumption, while maintaining high performance. The well-known concept has been in use for ADCs to achieve reduced complexities. An extension of the architecture to multi-dimensional domain is explored in this dissertation to implement a single port ADC to replace ADCs in an element system, using the correlation of received signals in the spatial domain. This concept has applications in conventional uniform linear arrays (ULAs) as well as in focal plane array (FPA) receivers. Our analysis has shown that sparsity in the spatio-temporal frequency domain can be exploited to reduce the number of ADCs from N to where . By using the limited field of view of practical antennas, multiple sub-arrays are combined without interferences to achieve a factor of K increment in the information carrying capacity of the ADC systems. Applications of this concept include ULAs and rectangular array systems. Experimental verifications were done for a element, 1.8 - 2.1 GHz wideband array system to sample using ADCs. This dissertation proposes that frequency division multiplexing (FDM) receiver outputs at an intermediate frequency (IF) can pack multiple (M) narrowband channels with a guard band to avoid interferences. The combined output is then sampled using a single wideband ADC and baseband channels are retrieved in the digital domain. Measurement results were obtained by employing a element, 28 GHz antenna array system to combine channels together to achieve a 75% reduction of ADC requirement. Implementation of FFT cores in the digital domain is not always exact because of the finite precision. Therefore, this dissertation explores the possibility of approximating the discrete Fourier transform (DFT) matrix to achieve reduced hardware complexities at an allowable cost of accuracy. A point approximate DFT (ADFT) core was implemented on digital hardware using radix-32 to achieve savings in cost, size, weight and power (C-SWaP) and synthesized for ASIC at 45-nm technology
    • …
    corecore