2,276 research outputs found

    Multiplierless 16-point DCT Approximation for Low-complexity Image and Video Coding

    Full text link
    An orthogonal 16-point approximate discrete cosine transform (DCT) is introduced. The proposed transform requires neither multiplications nor bit-shifting operations. A fast algorithm based on matrix factorization is introduced, requiring only 44 additions---the lowest arithmetic cost in literature. To assess the introduced transform, computational complexity, similarity with the exact DCT, and coding performance measures are computed. Classical and state-of-the-art 16-point low-complexity transforms were used in a comparative analysis. In the context of image compression, the proposed approximation was evaluated via PSNR and SSIM measurements, attaining the best cost-benefit ratio among the competitors. For video encoding, the proposed approximation was embedded into a HEVC reference software for direct comparison with the original HEVC standard. Physically realized and tested using FPGA hardware, the proposed transform showed 35% and 37% improvements of area-time and area-time-squared VLSI metrics when compared to the best competing transform in the literature.Comment: 12 pages, 5 figures, 3 table

    A Class of DCT Approximations Based on the Feig-Winograd Algorithm

    Full text link
    A new class of matrices based on a parametrization of the Feig-Winograd factorization of 8-point DCT is proposed. Such parametrization induces a matrix subspace, which unifies a number of existing methods for DCT approximation. By solving a comprehensive multicriteria optimization problem, we identified several new DCT approximations. Obtained solutions were sought to possess the following properties: (i) low multiplierless computational complexity, (ii) orthogonality or near orthogonality, (iii) low complexity invertibility, and (iv) close proximity and performance to the exact DCT. Proposed approximations were submitted to assessment in terms of proximity to the DCT, coding performance, and suitability for image compression. Considering Pareto efficiency, particular new proposed approximations could outperform various existing methods archived in literature.Comment: 26 pages, 4 figures, 5 tables, fixed arithmetic complexity in Table I

    An Orthogonal 16-point Approximate DCT for Image and Video Compression

    Full text link
    A low-complexity orthogonal multiplierless approximation for the 16-point discrete cosine transform (DCT) was introduced. The proposed method was designed to possess a very low computational cost. A fast algorithm based on matrix factorization was proposed requiring only 60~additions. The proposed architecture outperforms classical and state-of-the-art algorithms when assessed as a tool for image and video compression. Digital VLSI hardware implementations were also proposed being physically realized in FPGA technology and implemented in 45 nm up to synthesis and place-route levels. Additionally, the proposed method was embedded into a high efficiency video coding (HEVC) reference software for actual proof-of-concept. Obtained results show negligible video degradation when compared to Chen DCT algorithm in HEVC.Comment: 18 pages, 7 figures, 6 table

    Improved 8-point Approximate DCT for Image and Video Compression Requiring Only 14 Additions

    Full text link
    Video processing systems such as HEVC requiring low energy consumption needed for the multimedia market has lead to extensive development in fast algorithms for the efficient approximation of 2-D DCT transforms. The DCT is employed in a multitude of compression standards due to its remarkable energy compaction properties. Multiplier-free approximate DCT transforms have been proposed that offer superior compression performance at very low circuit complexity. Such approximations can be realized in digital VLSI hardware using additions and subtractions only, leading to significant reductions in chip area and power consumption compared to conventional DCTs and integer transforms. In this paper, we introduce a novel 8-point DCT approximation that requires only 14 addition operations and no multiplications. The proposed transform possesses low computational complexity and is compared to state-of-the-art DCT approximations in terms of both algorithm complexity and peak signal-to-noise ratio. The proposed DCT approximation is a candidate for reconfigurable video standards such as HEVC. The proposed transform and several other DCT approximations are mapped to systolic-array digital architectures and physically realized as digital prototype circuits using FPGA technology and mapped to 45 nm CMOS technology.Comment: 30 pages, 7 figures, 5 table

    A DCT Approximation for Image Compression

    Full text link
    An orthogonal approximation for the 8-point discrete cosine transform (DCT) is introduced. The proposed transformation matrix contains only zeros and ones; multiplications and bit-shift operations are absent. Close spectral behavior relative to the DCT was adopted as design criterion. The proposed algorithm is superior to the signed discrete cosine transform. It could also outperform state-of-the-art algorithms in low and high image compression scenarios, exhibiting at the same time a comparable computational complexity.Comment: 10 pages, 6 figure

    Low-complexity 8-point DCT Approximation Based on Angle Similarity for Image and Video Coding

    Full text link
    The principal component analysis (PCA) is widely used for data decorrelation and dimensionality reduction. However, the use of PCA may be impractical in real-time applications, or in situations were energy and computing constraints are severe. In this context, the discrete cosine transform (DCT) becomes a low-cost alternative to data decorrelation. This paper presents a method to derive computationally efficient approximations to the DCT. The proposed method aims at the minimization of the angle between the rows of the exact DCT matrix and the rows of the approximated transformation matrix. The resulting transformations matrices are orthogonal and have extremely low arithmetic complexity. Considering popular performance measures, one of the proposed transformation matrices outperforms the best competitors in both matrix error and coding capabilities. Practical applications in image and video coding demonstrate the relevance of the proposed transformation. In fact, we show that the proposed approximate DCT can outperform the exact DCT for image encoding under certain compression ratios. The proposed transform and its direct competitors are also physically realized as digital prototype circuits using FPGA technology.Comment: 16 pages, 12 figures, 10 table

    Low-complexity 8-point DCT Approximations Based on Integer Functions

    Full text link
    In this paper, we propose a collection of approximations for the 8-point discrete cosine transform (DCT) based on integer functions. Approximations could be systematically obtained and several existing approximations were identified as particular cases. Obtained approximations were compared with the DCT and assessed in the context of JPEG-like image compression.Comment: 21 pages, 4 figures, corrected typo

    Fast Singular Value Shrinkage with Chebyshev Polynomial Approximation Based on Signal Sparsity

    Full text link
    We propose an approximation method for thresholding of singular values using Chebyshev polynomial approximation (CPA). Many signal processing problems require iterative application of singular value decomposition (SVD) for minimizing the rank of a given data matrix with other cost functions and/or constraints, which is called matrix rank minimization. In matrix rank minimization, singular values of a matrix are shrunk by hard-thresholding, soft-thresholding, or weighted soft-thresholding. However, the computational cost of SVD is generally too expensive to handle high dimensional signals such as images; hence, in this case, matrix rank minimization requires enormous computation time. In this paper, we leverage CPA to (approximately) manipulate singular values without computing singular values and vectors. The thresholding of singular values is expressed by a multiplication of certain matrices, which is derived from a characteristic of CPA. The multiplication is also efficiently computed using the sparsity of signals. As a result, the computational cost is significantly reduced. Experimental results suggest the effectiveness of our method through several image processing applications based on matrix rank minimization with nuclear norm relaxation in terms of computation time and approximation precision.Comment: This is a journal pape

    A Discrete Tchebichef Transform Approximation for Image and Video Coding

    Full text link
    In this paper, we introduce a low-complexity approximation for the discrete Tchebichef transform (DTT). The proposed forward and inverse transforms are multiplication-free and require a reduced number of additions and bit-shifting operations. Numerical compression simulations demonstrate the efficiency of the proposed transform for image and video coding. Furthermore, Xilinx Virtex-6 FPGA based hardware realization shows 44.9% reduction in dynamic power consumption and 64.7% lower area when compared to the literature.Comment: 13 pages, 5 figures, 2 table

    Multi-Beam RF Aperture Using Multiplierless FFT Approximation

    Full text link
    Multiple independent radio frequency (RF) beams find applications in communications, radio astronomy, radar, and microwave imaging. An NN-point FFT applied spatially across an array of receiver antennas provides NN-independent RF beams at N2log2N\frac{N}{2}\log_2N multiplier complexity. Here, a low-complexity multiplierless approximation for the 8-point FFT is presented for RF beamforming, using only 26 additions. The algorithm provides eight beams that closely resemble the antenna array patterns of the traditional FFT-based beamformer albeit without using multipliers. The proposed FFT-like algorithm is useful for low-power RF multi-beam receivers; being synthesized in 45 nm CMOS technology at 1.1 V supply, and verified on-chip using a Xilinx Virtex-6 Lx240T FPGA device. The CMOS simulation and FPGA implementation indicate bandwidths of 588 MHz and 369 MHz, respectively, for each of the independent receive-mode RF beams.Comment: 8 pages, 3 figures, 2 tables, sfg correcte
    corecore