A Class of DCT Approximations Based on the Feig-Winograd Algorithm
A new class of matrices based on a parametrization of the Feig-Winograd
factorization of 8-point DCT is proposed. Such parametrization induces a matrix
subspace, which unifies a number of existing methods for DCT approximation. By
solving a comprehensive multicriteria optimization problem, we identified
several new DCT approximations. Obtained solutions were sought to possess the
following properties: (i) low multiplierless computational complexity, (ii)
orthogonality or near orthogonality, (iii) low complexity invertibility, and
(iv) close proximity and performance to the exact DCT. Proposed approximations
were submitted to assessment in terms of proximity to the DCT, coding
performance, and suitability for image compression. Considering Pareto
efficiency, particular new proposed approximations could outperform various
existing methods archived in literature.
Comment: 26 pages, 4 figures, 5 tables, fixed arithmetic complexity in Table I
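Properties (i) and (ii) listed in this abstract can be checked numerically for any candidate matrix. The sketch below is illustrative only and does not implement the Feig-Winograd parametrization: as a stand-in, it rounds the scaled entries of the exact 8-point DCT to integers and tests whether T*T^T is diagonal, the condition under which a diagonal rescaling yields an orthogonal transform. The names `dct_matrix`, `rounded_approx`, `gram`, and `is_diagonal` are hypothetical helpers, not taken from the paper.

```python
import math

def dct_matrix(n=8):
    """Exact orthonormal DCT-II matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * j + 1) * k * math.pi / (2 * n))
                     for j in range(n)])
    return rows

def rounded_approx(c, factor=2.0):
    """Stand-in approximation (assumption): round each scaled entry to an integer."""
    return [[int(round(factor * x)) for x in row] for row in c]

def gram(t):
    """Return T * T^T as a list of lists."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in t] for r1 in t]

def is_diagonal(m):
    """True when every off-diagonal entry of the square matrix m is zero."""
    size = len(m)
    return all(m[i][j] == 0 for i in range(size) for j in range(size) if i != j)

T = rounded_approx(dct_matrix(8))
G = gram(T)
entries = sorted({x for row in T for x in row})
print("distinct entries:", entries)          # small integers: no multipliers needed
print("T*T^T is diagonal:", is_diagonal(G))  # diagonal => orthogonalizable by scaling
print("diagonal of T*T^T:", [G[i][i] for i in range(8)])
```

When T*T^T equals a diagonal matrix D, the scaled matrix D^(-1/2)*T is orthogonal, and in image coding the scaling can typically be absorbed into the quantization step.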
Multiplierless 16-point DCT Approximation for Low-complexity Image and Video Coding
An orthogonal 16-point approximate discrete cosine transform (DCT) is
introduced. The proposed transform requires neither multiplications nor
bit-shifting operations. A fast algorithm based on matrix factorization is
introduced, requiring only 44 additions, the lowest arithmetic cost in the
literature. To assess the introduced transform, computational complexity,
similarity with the exact DCT, and coding performance measures are computed.
Classical and state-of-the-art 16-point low-complexity transforms were used in
a comparative analysis. In the context of image compression, the proposed
approximation was evaluated via PSNR and SSIM measurements, attaining the best
cost-benefit ratio among the competitors. For video encoding, the proposed
approximation was embedded into an HEVC reference software for direct comparison
with the original HEVC standard. Physically realized and tested using FPGA
hardware, the proposed transform showed 35% and 37% improvements of area-time
and area-time-squared VLSI metrics when compared to the best competing
transform in the literature.
Comment: 12 pages, 5 figures, 3 tables
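PSNR, one of the two image quality measures named in this abstract, has a standard definition that is easy to state in code. The following is a minimal sketch for 8-bit data stored as flat lists of pixel values; the function name `psnr` and the sample values are assumptions for illustration and are not results from the paper.

```python
import math

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(original) != len(reconstructed):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)

# Hypothetical pixel values, each off by one gray level (MSE = 1).
reference = [10, 20, 30, 40]
decoded = [11, 19, 31, 39]
value = psnr(reference, decoded)
print("PSNR (dB):", value)
```

Higher values indicate a reconstruction closer to the reference; SSIM, the other measure mentioned, is a structural measure and is not covered by this sketch.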
Low-complexity 8-point DCT Approximation Based on Angle Similarity for Image and Video Coding
The principal component analysis (PCA) is widely used for data decorrelation
and dimensionality reduction. However, the use of PCA may be impractical in
real-time applications, or in situations where energy and computing constraints
are severe. In this context, the discrete cosine transform (DCT) becomes a
low-cost alternative to data decorrelation. This paper presents a method to
derive computationally efficient approximations to the DCT. The proposed method
aims at the minimization of the angle between the rows of the exact DCT matrix
and the rows of the approximated transformation matrix. The resulting
transformation matrices are orthogonal and have extremely low arithmetic
complexity. Considering popular performance measures, one of the proposed
transformation matrices outperforms the best competitors in both matrix error
and coding capabilities. Practical applications in image and video coding
demonstrate the relevance of the proposed transformation. In fact, we show that
the proposed approximate DCT can outperform the exact DCT for image encoding
under certain compression ratios. The proposed transform and its direct
competitors are also physically realized as digital prototype circuits using
FPGA technology.
Comment: 16 pages, 12 figures, 10 tables
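The criterion described in this abstract, the angle between each row of the exact DCT matrix and the corresponding row of an approximation, can be evaluated as follows. This is a sketch under stated assumptions: the candidate matrix is a simple rounded version of the exact DCT used as a stand-in, not one of the paper's proposed transforms, and the search over candidates that minimizes the angle is not shown. The helper names are hypothetical.

```python
import math

def dct_matrix(n=8):
    """Exact orthonormal DCT-II matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * j + 1) * k * math.pi / (2 * n))
                     for j in range(n)])
    return rows

def angle(u, v):
    """Angle in radians between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # guard against rounding
    return math.acos(cosine)

C = dct_matrix(8)
# Stand-in candidate (assumption): entries of 2*C rounded to the nearest integer.
T = [[int(round(2.0 * x)) for x in row] for row in C]

angles = [angle(c_row, t_row) for c_row, t_row in zip(C, T)]
for k, a in enumerate(angles):
    print(f"row {k}: {math.degrees(a):.2f} degrees")
```

A smaller angle means the approximate row points in nearly the same direction as the exact one; the angle ignores row scaling, which is why it suits integer-valued candidates.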
Efficient Computation of the 8-point DCT via Summation by Parts
This paper introduces a new fast algorithm for the 8-point discrete cosine
transform (DCT) based on the summation-by-parts formula. The proposed method
converts the DCT matrix into an alternative transformation matrix that can be
decomposed into sparse matrices of low multiplicative complexity. The method is
capable of scaled and exact DCT computation and its associated fast algorithm
achieves the theoretical minimal multiplicative complexity for the 8-point DCT.
Depending on the nature of the input signal, simplifications can be introduced
and the overall complexity of the proposed algorithm can be further reduced.
Several types of input signal are analyzed: arbitrary, null mean, accumulated,
and null mean/accumulated signal. The proposed tool has potential application
in harmonic detection, image enhancement, and feature extraction, where input
signal DC level is discarded and/or the signal is required to be integrated.
Comment: Fixed Fig. 1 with the block diagram of the proposed architecture.
Manuscript contains 13 pages, 4 figures, 2 tables
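The summation-by-parts (Abel) identity that this abstract builds on can be verified numerically for a single unscaled DCT coefficient. The sketch below shows only the generic identity, rewriting the sum over the input signal as a sum over its accumulated version; it is not the paper's sparse factorization or fast algorithm, and the function names and test signal are assumptions.

```python
import math

def dct_coeff_direct(x, k):
    """Unscaled DCT-II coefficient k computed from its definition."""
    n = len(x)
    return sum(x[j] * math.cos((2 * j + 1) * k * math.pi / (2 * n))
               for j in range(n))

def dct_coeff_by_parts(x, k):
    """Same coefficient obtained from the accumulated signal via summation by parts."""
    n = len(x)
    kernel = [math.cos((2 * j + 1) * k * math.pi / (2 * n)) for j in range(n)]
    acc, running = [], 0.0
    for value in x:               # acc[j] = x[0] + ... + x[j]
        running += value
        acc.append(running)
    boundary = kernel[-1] * acc[-1]
    correction = sum((kernel[j + 1] - kernel[j]) * acc[j] for j in range(n - 1))
    return boundary - correction

signal = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]  # arbitrary example input
for k in range(len(signal)):
    print(k, dct_coeff_direct(signal, k), dct_coeff_by_parts(signal, k))
```

For a null mean signal the last accumulated value is zero, so the boundary term vanishes; this is one way to see why such inputs admit a simplification.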
The Arithmetic Cosine Transform: Exact and Approximate Algorithms
In this paper, we introduce a new class of transform method: the
arithmetic cosine transform (ACT). We provide the central mathematical
properties of the ACT, necessary in designing efficient and accurate
implementations of the new transform method. The key mathematical tools used in
the paper come from analytic number theory, in particular the properties of the
Riemann zeta function. Additionally, we demonstrate that an exact signal
interpolation is achievable for any block-length. Approximate calculations were
also considered. The numerical examples provided show the potential of the ACT
for various digital signal processing applications.
Comment: 17 pages, 3 figures
A Multiparametric Class of Low-complexity Transforms for Image and Video Coding
Discrete transforms play an important role in many signal processing
applications, and low-complexity alternatives for classical transforms have become
popular in recent years. Particularly, the discrete cosine transform (DCT) has
proven to be convenient for data compression, being employed in well-known
image and video coding standards such as JPEG, H.264, and the recent high
efficiency video coding (HEVC). In this paper, we introduce a new class of
low-complexity 8-point DCT approximations based on a series of works published
by Bouguezel, Ahmed and Swamy. Also, a multiparametric fast algorithm that
encompasses both known and novel transforms is derived. We select the
best-performing DCT approximations after solving a multicriteria optimization
problem, and submit them to a scaling method for obtaining larger size
transforms. We assess these DCT approximations in both JPEG-like image
compression and video coding experiments. We show that the optimal DCT
approximations present compelling results in terms of coding efficiency and
image quality metrics, and require only a few addition or bit-shifting
operations, being suitable for low-complexity and low-power systems.
Comment: Fixed Figure 1 and typos in the reference list
An Orthogonal 16-point Approximate DCT for Image and Video Compression
A low-complexity orthogonal multiplierless approximation for the 16-point
discrete cosine transform (DCT) was introduced. The proposed method was
designed to possess a very low computational cost. A fast algorithm based on
matrix factorization was proposed, requiring only 60 additions. The proposed
architecture outperforms classical and state-of-the-art algorithms when
assessed as a tool for image and video compression. Digital VLSI hardware
implementations were also proposed, being physically realized in FPGA technology
and implemented in 45 nm up to synthesis and place-route levels. Additionally,
the proposed method was embedded into a high efficiency video coding (HEVC)
reference software for actual proof-of-concept. Obtained results show
negligible video degradation when compared to the Chen DCT algorithm in HEVC.
Comment: 18 pages, 7 figures, 6 tables
An Integer Approximation Method for Discrete Sinusoidal Transforms
Approximate methods have been considered as a means to the evaluation of
discrete transforms. In this work, we propose and analyze a class of integer
transforms for the discrete Fourier, Hartley, and cosine transforms (DFT, DHT,
and DCT), based on simple dyadic rational approximation methods. The introduced
method is general, applicable to several block-lengths, whereas existing
approaches are usually dedicated to specific transform sizes. The suggested
approximate transforms enjoy low multiplicative complexity and the
orthogonality property is achievable via matrix polar decomposition. We show
that the obtained transforms are competitive with archived methods in
literature. New 8-point square wave approximate transforms for the DFT, DHT,
and DCT are also introduced as particular cases of the introduced methodology.
Comment: 13 pages, 5 figures, 8 tables
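The idea of a dyadic rational approximation can be illustrated by rounding each entry of the exact DCT matrix to the nearest multiple of 2^(-b), so that every multiplication reduces to additions and bit shifts. This is a sketch of plain entrywise rounding under that assumption; it is not necessarily the construction used in the paper, and the orthogonalization via matrix polar decomposition mentioned in the abstract is not shown. The names `dyadic_round`, `dyadic_approx`, and `max_abs_error` are hypothetical.

```python
import math
from fractions import Fraction

def dct_matrix(n=8):
    """Exact orthonormal DCT-II matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * j + 1) * k * math.pi / (2 * n))
                     for j in range(n)])
    return rows

def dyadic_round(x, bits):
    """Nearest multiple of 2**-bits to x, returned as an exact Fraction."""
    return Fraction(round(x * 2 ** bits), 2 ** bits)

def dyadic_approx(c, bits):
    """Entrywise dyadic rational approximation of the matrix c."""
    return [[dyadic_round(x, bits) for x in row] for row in c]

def max_abs_error(c, t):
    """Largest absolute entrywise difference between c and t."""
    return max(abs(x - float(y)) for cr, tr in zip(c, t) for x, y in zip(cr, tr))

C = dct_matrix(8)
for bits in (1, 3, 5):
    T = dyadic_approx(C, bits)
    print(f"bits={bits}: max entry error = {max_abs_error(C, T):.5f}")
```

The entrywise error is bounded by 2^(-(b+1)), so each additional bit halves the worst-case deviation at the cost of more shift-and-add terms.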
Improved 8-point Approximate DCT for Image and Video Compression Requiring Only 14 Additions
The multimedia market's need for low-energy video processing systems such as
HEVC has led to extensive development of fast algorithms
for the efficient approximation of 2-D DCT transforms. The DCT is employed in a
multitude of compression standards due to its remarkable energy compaction
properties. Multiplier-free approximate DCT transforms have been proposed that
offer superior compression performance at very low circuit complexity. Such
approximations can be realized in digital VLSI hardware using additions and
subtractions only, leading to significant reductions in chip area and power
consumption compared to conventional DCTs and integer transforms. In this
paper, we introduce a novel 8-point DCT approximation that requires only 14
addition operations and no multiplications. The proposed transform possesses
low computational complexity and is compared to state-of-the-art DCT
approximations in terms of both algorithm complexity and peak signal-to-noise
ratio. The proposed DCT approximation is a candidate for reconfigurable video
standards such as HEVC. The proposed transform and several other DCT
approximations are mapped to systolic-array digital architectures and
physically realized as digital prototype circuits using FPGA technology and
mapped to 45 nm CMOS technology.
Comment: 30 pages, 7 figures, 5 tables
Low-complexity Architecture for AR(1) Inference
In this Letter, we propose a low-complexity estimator for the correlation
coefficient based on the signed process. The introduced
approximation is suitable for implementation in low-power hardware
architectures. Monte Carlo simulations reveal that the proposed estimator
performs comparably to the competing methods in literature with maximum error
in order of . However, the hardware implementation of the introduced
method presents considerable advantages in several relevant metrics, offering
more than 95% reduction in dynamic power and doubling the maximum operating
frequency when compared to the reference method.
Comment: 7 pages, 3 tables, 4 figures
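To make the idea of estimating a correlation coefficient from signs concrete, the sketch below uses the classical arcsine law for zero-mean Gaussian processes, under which the lag-one correlation of the sign sequence equals (2/pi)*arcsin(rho). It assumes a zero-mean Gaussian AR(1) process, as the title suggests, and it is not the estimator proposed in the Letter, whose exact form is not given in this abstract. The function names, seed, and sample size are illustrative assumptions.

```python
import math
import random

def simulate_ar1(rho, n, rng):
    """Generate n samples of a stationary, unit-variance Gaussian AR(1) process."""
    innovation_sd = math.sqrt(1.0 - rho * rho)
    x = rng.gauss(0.0, 1.0)
    samples = []
    for _ in range(n):
        samples.append(x)
        x = rho * x + innovation_sd * rng.gauss(0.0, 1.0)
    return samples

def sign_based_estimate(samples):
    """Estimate rho from sign agreement at lag one using the arcsine law."""
    signs = [1 if v >= 0 else -1 for v in samples]
    # Only +1/-1 products are summed, so no full multiplier is needed here.
    sign_corr = sum(a * b for a, b in zip(signs, signs[1:])) / (len(signs) - 1)
    return math.sin(0.5 * math.pi * sign_corr)

rng = random.Random(0)  # fixed seed so the run is repeatable
true_rho = 0.5
estimate = sign_based_estimate(simulate_ar1(true_rho, 50000, rng))
print("true rho:", true_rho, "sign-based estimate:", estimate)
```

The sign sequence needs only one bit per sample, and its products reduce to a comparison of bits, which is one reason sign-based statistics are attractive for low-power hardware.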