
    A Class of DCT Approximations Based on the Feig-Winograd Algorithm

    A new class of matrices based on a parametrization of the Feig-Winograd factorization of the 8-point DCT is proposed. This parametrization induces a matrix subspace that unifies a number of existing methods for DCT approximation. By solving a comprehensive multicriteria optimization problem, we identified several new DCT approximations. Solutions were sought that possess the following properties: (i) low multiplierless computational complexity, (ii) orthogonality or near orthogonality, (iii) low-complexity invertibility, and (iv) close proximity and performance to the exact DCT. The proposed approximations were assessed in terms of proximity to the DCT, coding performance, and suitability for image compression. Considering Pareto efficiency, particular new approximations could outperform various existing methods archived in the literature. Comment: 26 pages, 4 figures, 5 tables, fixed arithmetic complexity in Table I

    Improved 8-point Approximate DCT for Image and Video Compression Requiring Only 14 Additions

    The multimedia market's need for low-energy video processing systems such as HEVC has led to extensive development of fast algorithms for the efficient approximation of 2-D DCT transforms. The DCT is employed in a multitude of compression standards due to its remarkable energy compaction properties. Multiplier-free approximate DCT transforms have been proposed that offer superior compression performance at very low circuit complexity. Such approximations can be realized in digital VLSI hardware using additions and subtractions only, leading to significant reductions in chip area and power consumption compared to conventional DCTs and integer transforms. In this paper, we introduce a novel 8-point DCT approximation that requires only 14 addition operations and no multiplications. The proposed transform possesses low computational complexity and is compared to state-of-the-art DCT approximations in terms of both algorithm complexity and peak signal-to-noise ratio. The proposed DCT approximation is a candidate for reconfigurable video standards such as HEVC. The proposed transform and several other DCT approximations are mapped to systolic-array digital architectures, physically realized as digital prototype circuits using FPGA technology, and mapped to 45 nm CMOS technology. Comment: 30 pages, 7 figures, 5 tables

    An Orthogonal 16-point Approximate DCT for Image and Video Compression

    A low-complexity orthogonal multiplierless approximation for the 16-point discrete cosine transform (DCT) was introduced. The proposed method was designed to possess a very low computational cost. A fast algorithm based on matrix factorization was proposed, requiring only 60 additions. The proposed architecture outperforms classical and state-of-the-art algorithms when assessed as a tool for image and video compression. Digital VLSI hardware implementations were also proposed, physically realized in FPGA technology and implemented in 45 nm up to synthesis and place-route levels. Additionally, the proposed method was embedded into a high efficiency video coding (HEVC) reference software for actual proof-of-concept. The results show negligible video degradation when compared to the Chen DCT algorithm in HEVC. Comment: 18 pages, 7 figures, 6 tables

    Low-complexity 8-point DCT Approximations Based on Integer Functions

    In this paper, we propose a collection of approximations for the 8-point discrete cosine transform (DCT) based on integer functions. The approximations could be obtained systematically, and several existing approximations were identified as particular cases. The resulting approximations were compared with the DCT and assessed in the context of JPEG-like image compression. Comment: 21 pages, 4 figures, corrected typo

    The discrete fractional random cosine and sine transforms

    Based on the discrete fractional random transform (DFRNT), we present the discrete fractional random cosine and sine transforms (DFRNCT and DFRNST). We demonstrate that the DFRNCT and DFRNST can be regarded as special kinds of DFRNT and thus inherit their mathematical properties from the DFRNT. Numerical results of the DFRNCT and DFRNST for one- and two-dimensional functions are given. Comment: 15 pages, 4 eps figures. LaTeX

    A Discrete Tchebichef Transform Approximation for Image and Video Coding

    In this paper, we introduce a low-complexity approximation for the discrete Tchebichef transform (DTT). The proposed forward and inverse transforms are multiplication-free and require a reduced number of additions and bit-shifting operations. Numerical compression simulations demonstrate the efficiency of the proposed transform for image and video coding. Furthermore, a Xilinx Virtex-6 FPGA-based hardware realization shows a 44.9% reduction in dynamic power consumption and 64.7% lower area when compared to the literature. Comment: 13 pages, 5 figures, 2 tables

    A DCT Approximation for Image Compression

    An orthogonal approximation for the 8-point discrete cosine transform (DCT) is introduced. The proposed transformation matrix contains only zeros and ones; multiplications and bit-shift operations are absent. Close spectral behavior relative to the DCT was adopted as the design criterion. The proposed algorithm is superior to the signed discrete cosine transform. It could also outperform state-of-the-art algorithms in low and high image compression scenarios while exhibiting comparable computational complexity. Comment: 10 pages, 6 figures

    Efficient Computation of the 8-point DCT via Summation by Parts

    This paper introduces a new fast algorithm for the 8-point discrete cosine transform (DCT) based on the summation-by-parts formula. The proposed method converts the DCT matrix into an alternative transformation matrix that can be decomposed into sparse matrices of low multiplicative complexity. The method is capable of scaled and exact DCT computation, and its associated fast algorithm achieves the theoretical minimal multiplicative complexity for the 8-point DCT. Depending on the nature of the input signal, simplifications can be introduced and the overall complexity of the proposed algorithm can be further reduced. Several types of input signal are analyzed: arbitrary, null mean, accumulated, and null mean/accumulated signal. The proposed tool has potential application in harmonic detection, image enhancement, and feature extraction, where the input signal DC level is discarded and/or the signal is required to be integrated. Comment: Fixed Fig. 1 with the block diagram of the proposed architecture. Manuscript contains 13 pages, 4 figures, 2 tables

    Signal Flow Graph Approach to Efficient DST I-IV Algorithms

    In this paper, fast and efficient discrete sine transformation (DST) algorithms are presented based on the factorization of sparse, scaled orthogonal, rotation, rotation-reflection, and butterfly matrices. These algorithms are completely recursive and solely based on DST I-IV. The presented algorithms have low arithmetic cost compared to the known fast DST algorithms. Furthermore, the language of signal flow graph representation of digital structures is used to describe these efficient and recursive DST algorithms, having $(n-1)$-point signal flow graphs for DST-I and $n$-point signal flow graphs for DST II-IV.

    ACDC: A Structured Efficient Linear Layer

    The linear layer is one of the most pervasive modules in deep learning representations. However, it requires $O(N^2)$ parameters and $O(N^2)$ operations. These costs can be prohibitive in mobile applications or prevent scaling in many domains. Here, we introduce a deep, differentiable, fully-connected neural network module composed of diagonal matrices of parameters, $\mathbf{A}$ and $\mathbf{D}$, and the discrete cosine transform $\mathbf{C}$. The core module, structured as $\mathbf{ACDC}^{-1}$, has $O(N)$ parameters and incurs $O(N \log N)$ operations. We present theoretical results showing how deep cascades of ACDC layers approximate linear layers. ACDC is, however, a stand-alone module and can be used in combination with any other types of module. In our experiments, we show that it can indeed be successfully interleaved with ReLU modules in convolutional neural networks for image recognition. Our experiments also study critical factors in the training of these structured modules, including initialization and depth. Finally, this paper also provides a connection between structured linear transforms used in deep learning and the field of Fourier optics, illustrating how ACDC could in principle be implemented with lenses and diffractive elements.
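
    The structure described in the abstract above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `dct_matrix` and `acdc_layer`, the choice of the orthonormal DCT-II for C, and the row-vector convention y = x A C D C^{-1} are all assumptions made for illustration. The sketch applies the two diagonal matrices as elementwise scalings, so the layer holds only 2N parameters.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C (n x n), so that C @ C.T is the identity."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)           # rescale the DC row for orthonormality
    return c

def acdc_layer(x, a, d, c):
    """Apply y = x A C D C^{-1} with A = diag(a) and D = diag(d).

    Assumption: x holds row vectors (batch, n). Because c is orthonormal,
    its inverse is its transpose.
    """
    h = (x * a) @ c        # scale by A, then multiply by C
    return (h * d) @ c.T   # scale by D, then multiply by C^{-1}

# Hypothetical usage: a batch of 3 inputs of size N = 8.
n = 8
rng = np.random.default_rng(0)
c = dct_matrix(n)
a = rng.standard_normal(n)   # N parameters for A
d = rng.standard_normal(n)   # N parameters for D
x = rng.standard_normal((3, n))
y = acdc_layer(x, a, d, c)
```

    This sketch multiplies by a dense matrix, which costs $O(N^2)$ operations per input; reaching the $O(N \log N)$ cost quoted in the abstract would require replacing the two matrix products with a fast DCT routine.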