
    Fast Digital Convolutions using Bit-Shifts

    An exact, one-to-one transform is presented that not only allows digital circular convolutions, but is also free from multiplications and quantisation errors for transform lengths of arbitrary powers of two. The transform is analogous to the Discrete Fourier Transform, with the canonical harmonics replaced by a set of cyclic integers computed using only bit-shifts and additions modulo a prime number. The prime number may be selected to occupy contemporary word sizes or to be very large for cryptographic or data hiding applications. The transform is an extension of the Rader Transforms via Carmichael's Theorem. These properties allow for exact convolutions that are impervious to numerical overflow and that can utilise Fast Fourier Transform algorithms. Comment: 4 pages, 2 figures, submitted to IEEE Signal Processing Letters
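As a rough illustration of the idea in this abstract, the sketch below uses the classical Rader-style special case rather than the paper's extension: modulo the Fermat prime 257, the integer 2 has multiplicative order 16, so every "harmonic" of a length-16 transform is a power of two and can be applied with a bit-shift. The modulus, the length, and all function names are assumptions chosen for the example. The transform itself needs no multiplications; the pointwise product of the two spectra still does.

```python
P = 257   # Fermat prime 2**8 + 1 (assumed for this example)
N = 16    # 2 has multiplicative order 16 modulo 257


def mul_pow2(x, s):
    """Return x * 2**s mod P using a left shift; s may be negative."""
    return (x << (s % N)) % P


def fnt(x, sign=1):
    """Length-N transform whose kernel is 2**(sign*n*k) mod P (naive O(N^2) form)."""
    return [
        sum(mul_pow2(v, sign * n * k) for n, v in enumerate(x)) % P
        for k in range(N)
    ]


def ifnt(X):
    """Inverse transform; 1/N = 2**-4 mod P, so the scaling is also a shift."""
    return [mul_pow2(v, -4) for v in fnt(X, sign=-1)]


def circular_convolve(a, b):
    """Exact circular convolution modulo P via the convolution theorem."""
    A, B = fnt(a), fnt(b)
    return ifnt([(u * v) % P for u, v in zip(A, B)])


if __name__ == "__main__":
    a = [1, 2, 3] + [0] * 13
    b = [4, 5] + [0] * 14
    print(circular_convolve(a, b)[:4])
```

The result is exact modulo P, so it matches the ordinary integer convolution only while every output value stays below 257; a larger prime would be needed for wider data.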

    An Orthogonal 16-point Approximate DCT for Image and Video Compression

    A low-complexity orthogonal multiplierless approximation for the 16-point discrete cosine transform (DCT) was introduced. The proposed method was designed to possess a very low computational cost. A fast algorithm based on matrix factorization was proposed, requiring only 60 additions. The proposed architecture outperforms classical and state-of-the-art algorithms when assessed as a tool for image and video compression. Digital VLSI hardware implementations were also proposed, being physically realized in FPGA technology and implemented in 45 nm up to synthesis and place-route levels. Additionally, the proposed method was embedded into a high efficiency video coding (HEVC) reference software for actual proof-of-concept. Obtained results show negligible video degradation when compared to the Chen DCT algorithm in HEVC. Comment: 18 pages, 7 figures, 6 tables

    OpenDDA: a Novel High-Performance Computational Framework for the Discrete Dipole Approximation

    This work presents a highly optimized computational framework for the Discrete Dipole Approximation, a numerical method for calculating the optical properties associated with a target of arbitrary geometry that is widely used in atmospheric, astrophysical and industrial simulations. Core optimizations include the bit-fielding of integer data and iterative methods that complement a new Discrete Fourier Transform (DFT) kernel, which efficiently calculates the matrix vector products required by these iterative solution schemes. The new kernel performs the requisite 3-D DFTs as ensembles of 1-D transforms, and by doing so, is able to reduce the number of constituent 1-D transforms by 60% and the memory by over 80%. The optimizations also facilitate the use of parallel techniques to further enhance the performance. Complete OpenMP-based shared-memory and MPI-based distributed-memory implementations have been created to take full advantage of the various architectures. Several benchmarks of the new framework indicate extremely favorable performance and scalability. OpenDDA is available following the usual open source regulations from http://www.opendda.org. Comment: 29 pages, 5 figures
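The phrase "3-D DFTs as ensembles of 1-D transforms" relies on the separability of the multidimensional DFT. The sketch below is not OpenDDA code and does not reproduce its savings; it only checks, on a tiny array, that transforming along z, then y, then x gives the same result as the direct triple sum. The function names and the naive O(n^2) 1-D transform are assumptions made for the illustration.

```python
import cmath


def dft1d(seq):
    """Naive 1-D DFT of a sequence of numbers."""
    n = len(seq)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * k * m / n) for m, v in enumerate(seq))
        for k in range(n)
    ]


def dft3d_separable(a):
    """3-D DFT computed as successive batches of 1-D DFTs along z, y, x."""
    nx, ny, nz = len(a), len(a[0]), len(a[0][0])
    out = [[[complex(v) for v in row] for row in plane] for plane in a]
    for i in range(nx):                      # lines along z
        for j in range(ny):
            out[i][j] = dft1d(out[i][j])
    for i in range(nx):                      # lines along y
        for k in range(nz):
            line = dft1d([out[i][j][k] for j in range(ny)])
            for j in range(ny):
                out[i][j][k] = line[j]
    for j in range(ny):                      # lines along x
        for k in range(nz):
            line = dft1d([out[i][j][k] for i in range(nx)])
            for i in range(nx):
                out[i][j][k] = line[i]
    return out


def dft3d_direct(a):
    """Reference 3-D DFT evaluated directly from the triple sum."""
    nx, ny, nz = len(a), len(a[0]), len(a[0][0])
    return [[[
        sum(
            a[i][j][k]
            * cmath.exp(-2j * cmath.pi * (u * i / nx + v * j / ny + w * k / nz))
            for i in range(nx) for j in range(ny) for k in range(nz)
        )
        for w in range(nz)] for v in range(ny)] for u in range(nx)]
```

Because each pass works on independent 1-D lines, a kernel can presumably skip lines known to be all zero, which is one plausible route to the kind of reduction the abstract reports.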

    Improved 8-point Approximate DCT for Image and Video Compression Requiring Only 14 Additions

    The low energy consumption demanded of video processing systems such as HEVC in the multimedia market has led to extensive development of fast algorithms for the efficient approximation of 2-D DCT transforms. The DCT is employed in a multitude of compression standards due to its remarkable energy compaction properties. Multiplier-free approximate DCT transforms have been proposed that offer superior compression performance at very low circuit complexity. Such approximations can be realized in digital VLSI hardware using additions and subtractions only, leading to significant reductions in chip area and power consumption compared to conventional DCTs and integer transforms. In this paper, we introduce a novel 8-point DCT approximation that requires only 14 addition operations and no multiplications. The proposed transform possesses low computational complexity and is compared to state-of-the-art DCT approximations in terms of both algorithm complexity and peak signal-to-noise ratio. The proposed DCT approximation is a candidate for reconfigurable video standards such as HEVC. The proposed transform and several other DCT approximations are mapped to systolic-array digital architectures and physically realized as digital prototype circuits using FPGA technology and mapped to 45 nm CMOS technology. Comment: 30 pages, 7 figures, 5 tables
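To make "multiplier-free approximate DCT" concrete, here is a minimal sketch of one simple rounding-based construction: scale the exact 8-point DCT-II matrix by 2 and round each entry. This is a generic illustration and is not the 14-addition transform proposed in the paper; the function names are assumptions. Every entry of the resulting matrix is -1, 0 or 1, so applying it needs only additions and subtractions, and its rows remain mutually orthogonal.

```python
import math


def dct_matrix(n=8):
    """Exact orthonormal n-point DCT-II matrix as a list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([
            scale * math.cos((2 * m + 1) * k * math.pi / (2 * n))
            for m in range(n)
        ])
    return rows


def rounded_approximation(c):
    """Round 2*C entrywise; for n = 8 every entry lands in {-1, 0, 1}."""
    return [[int(round(2 * v)) for v in row] for row in c]


def gram(t):
    """Return T * T^T, used here to check that the rows are orthogonal."""
    return [[sum(x * y for x, y in zip(r, s)) for s in t] for r in t]


def apply_transform(t, x):
    """Apply T to x; with entries in {-1, 0, 1} this is additions only."""
    return [sum(c * v for c, v in zip(row, x)) for row in t]


if __name__ == "__main__":
    T = rounded_approximation(dct_matrix(8))
    for row in T:
        print(row)
```

Since the Gram matrix is diagonal rather than the identity, an orthonormal version needs a diagonal scaling, which codecs commonly fold into the quantisation step.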

    On the spectra of hypermatrix direct sum and Kronecker products constructions

    Our main result is an elementary derivation of the spectral decomposition of hypermatrices generated by arbitrary combinations of Kronecker products and direct sums of cubic side length

    A Class of DCT Approximations Based on the Feig-Winograd Algorithm

    A new class of matrices based on a parametrization of the Feig-Winograd factorization of the 8-point DCT is proposed. Such parametrization induces a matrix subspace, which unifies a number of existing methods for DCT approximation. By solving a comprehensive multicriteria optimization problem, we identified several new DCT approximations. Obtained solutions were sought to possess the following properties: (i) low multiplierless computational complexity, (ii) orthogonality or near orthogonality, (iii) low complexity invertibility, and (iv) close proximity and performance to the exact DCT. Proposed approximations were submitted to assessment in terms of proximity to the DCT, coding performance, and suitability for image compression. Considering Pareto efficiency, particular new proposed approximations could outperform various existing methods archived in the literature. Comment: 26 pages, 4 figures, 5 tables, fixed arithmetic complexity in Table I

    Lossless Image and Intra-frame Compression with Integer-to-Integer DST

    Video coding standards are primarily designed for efficient lossy compression, but it is also desirable to support efficient lossless compression within video coding standards using small modifications to the lossy coding architecture. A simple approach is to skip transform and quantization, and simply entropy code the prediction residual. However, this approach is inefficient at compression. A more efficient and popular approach is to skip transform and quantization but also process the residual block with DPCM, along the horizontal or vertical direction, prior to entropy coding. This paper explores an alternative approach based on processing the residual block with integer-to-integer (i2i) transforms. I2i transforms can map integer pixels to integer transform coefficients without increasing the dynamic range and can be used for lossless compression. We focus on lossless intra coding and develop novel i2i approximations of the odd type-3 DST (ODST-3). Experimental results with the HEVC reference software show that the developed i2i approximations of the ODST-3 improve lossless intra-frame compression efficiency with respect to HEVC version 2, which uses the popular DPCM method, by an average of 2.7% without a significant effect on computational complexity. Comment: Draft consisting of 16 pages
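The key property of an integer-to-integer transform is exact invertibility despite rounding. The sketch below shows the simplest well-known instance, a two-point lifting step often called the S-transform; it is a generic illustration and not the ODST-3 approximation developed in the paper. Function names are assumptions, and this particular step makes no claim about preserving dynamic range.

```python
def s_forward(a, b):
    """Map an integer pair to (floor average, difference) with a lifting step."""
    d = a - b            # difference, an integer
    s = b + (d >> 1)     # equals floor((a + b) / 2); >> floors for negatives too
    return s, d


def s_inverse(s, d):
    """Undo s_forward exactly by replaying the lifting step in reverse."""
    b = s - (d >> 1)     # the same rounded quantity is subtracted back out
    a = b + d
    return a, b


if __name__ == "__main__":
    print(s_forward(5, 2))
    print(s_inverse(*s_forward(5, 2)))
```

The inverse subtracts exactly the rounded value that the forward step added, so the rounding never loses information; larger i2i transforms are commonly built by chaining steps of this kind.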

    Efficient Quantum Transforms

    Quantum mechanics requires the operation of quantum computers to be unitary, and thus makes it important to have general techniques for developing fast quantum algorithms for computing unitary transforms. A quantum routine for computing a generalized Kronecker product is given. Applications include re-development of the networks for computing the Walsh-Hadamard and the quantum Fourier transform. New networks for two wavelet transforms are given. Quantum computation of Fourier transforms for non-Abelian groups is defined. A slightly relaxed definition is shown to simplify the analysis and the networks that compute the transforms. Efficient networks for computing such transforms for a class of metacyclic groups are introduced. A novel network for computing a Fourier transform for a group used in quantum error-correction is also given. Comment: 30 pages, LaTeX2e, 7 figures included
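The link between Kronecker products and the Walsh-Hadamard transform can be seen in a purely classical matrix setting. The sketch below builds the unnormalised 2^n-point Walsh-Hadamard matrix as an n-fold Kronecker power of the 2x2 Hadamard matrix; it illustrates the ordinary Kronecker product only, not the generalized product or the quantum networks of the paper. Function names are assumptions.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def hadamard(n):
    """Unnormalised 2**n x 2**n Walsh-Hadamard matrix as a Kronecker power."""
    h1 = [[1, 1], [1, -1]]
    h = [[1]]
    for _ in range(n):
        h = kron(h, h1)
    return h


def matvec(m, v):
    """Plain matrix-vector product."""
    return [sum(c * x for c, x in zip(row, v)) for row in m]


if __name__ == "__main__":
    for row in hadamard(2):
        print(row)
```

Dividing by 2^(n/2) makes the matrix orthogonal, which is the normalisation a unitary quantum operation would require.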

    Computing Hasse-Witt matrices of hyperelliptic curves in average polynomial time

    We present an efficient algorithm to compute the Hasse-Witt matrix of a hyperelliptic curve C/Q modulo all primes of good reduction up to a given bound N, based on the average polynomial-time algorithm recently introduced by Harvey. An implementation for hyperelliptic curves of genus 2 and 3 is more than an order of magnitude faster than alternative methods for N = 2^26. Comment: 17 pages

    Fractional integrals and Fourier transforms

    This paper gives a short survey of some basic results related to estimates of fractional integrals and Fourier transforms. It is closely related to our previous survey papers \cite{K1998} and \cite{K2007}. The main methods used in the paper are based on nonincreasing rearrangements. We give alternative proofs of some results. We also note that the paper is based on the mini-course given by the author at Barcelona University in October 2014. Comment: 42 pages