    Low computational complexity variable block size (VBS) partitioning for motion estimation using the Walsh Hadamard transform (WHT)

    Variable Block Size (VBS) motion estimation has been adopted in state-of-the-art video coding standards such as H.264/AVC and VC-1. However, a low-complexity H.264/AVC encoder cannot take advantage of VBS because of its power consumption requirements. In this paper, we present a VBS partitioning algorithm based on a binary motion edge map that requires neither an initial motion estimation nor Rate-Distortion (R-D) optimization for mode selection. The proposed algorithm uses the Walsh Hadamard Transform (WHT) to create the binary edge map, which is more computationally cost-effective than other lightweight segmentation methods typically used to detect the required region.
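
    The abstract does not give the exact decision rule, so the following Python sketch is only a minimal illustration of the idea: take the 2-D Walsh Hadamard transform of small sub-blocks of a frame difference and mark as motion edges the blocks whose AC energy exceeds a threshold. The 4x4 block size, the threshold, and the AC-energy criterion are assumptions, not the method from the paper.

        import numpy as np

        def wht_matrix(n):
            # Hadamard matrix of order n (n a power of two), built recursively.
            H = np.array([[1.0]])
            while H.shape[0] < n:
                H = np.block([[H, H], [H, -H]])
            return H

        def binary_motion_edge_map(diff, block=4, thresh=1000.0):
            # Mark sub-blocks of a frame difference whose WHT AC energy is large.
            # Block size, threshold and the AC-energy rule are illustrative choices.
            H = wht_matrix(block)
            rows, cols = diff.shape[0] // block, diff.shape[1] // block
            edge = np.zeros((rows, cols), dtype=bool)
            for by in range(rows):
                for bx in range(cols):
                    b = diff[by*block:(by+1)*block, bx*block:(bx+1)*block].astype(float)
                    coeff = H @ b @ H.T                       # 2-D Walsh Hadamard transform
                    ac_energy = (coeff ** 2).sum() - coeff[0, 0] ** 2
                    edge[by, bx] = ac_energy > thresh
            return edge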

    Tchebichef Moment Based Hilbert Scan for Image Compression

    Image compression is now essential for applications such as transmission and database storage: vast amounts of information must be compressed while the compression ratio and the quality of the compressed image are improved. For this reason, this paper develops a new algorithm that uses the discrete orthogonal Tchebichef moment combined with a Hilbert curve scan for image compression. The analyzed image is divided into 8×8 sub-blocks and the Tchebichef moment transform is applied to each one; the transformed coefficients of each 8×8 sub-block are then reordered along a Hilbert scan into a linear array, at which point Huffman coding is applied. Experimental results show that the algorithm improves coding efficiency on the one hand, while on the other hand the quality of the reconstructed image is not significantly decreased. Keywords: Huffman Coding, Tchebichef Moment Transforms, Orthogonal Moment Functions, Hilbert, zigzag scan
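
    The reordering step described above lends itself to a compact sketch. The Python code below follows the standard Hilbert curve construction to linearize an 8×8 coefficient block; the exact curve orientation and indexing convention used by the paper are not stated and are therefore assumptions.

        import numpy as np

        def hilbert_d2xy(n, d):
            # Map a distance d along the Hilbert curve to (x, y) on an n x n grid
            # (n a power of two); standard iterative construction.
            x = y = 0
            t = d
            s = 1
            while s < n:
                rx = 1 & (t // 2)
                ry = 1 & (t ^ rx)
                if ry == 0:                      # rotate/flip the quadrant when needed
                    if rx == 1:
                        x, y = s - 1 - x, s - 1 - y
                    x, y = y, x
                x += s * rx
                y += s * ry
                t //= 4
                s *= 2
            return x, y

        def hilbert_scan(block):
            # Reorder an 8x8 coefficient block into a 1-D array along the Hilbert curve.
            n = block.shape[0]
            return np.array([block[y, x] for x, y in (hilbert_d2xy(n, d) for d in range(n * n))])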

    Approximate and timing-speculative hardware design for high-performance and energy-efficient video processing

    Since the end of two-dimensional transistor scaling appeared on the horizon, innovative circuit design paradigms have been on the rise to go beyond well-established, ultraconservative exact computing. Many compute-intensive applications – such as video processing – exhibit intrinsic error resilience and do not necessarily require perfect accuracy in their numerical operations. Approximate computing (AxC) is emerging as a design alternative that improves performance and energy efficiency by trading an application's intrinsic error tolerance for algorithm and circuit efficiency. Exact computing also imposes worst-case timing on the conventional design of hardware accelerators to ensure reliability, leading to an efficiency loss. Conversely, the timing-speculative (TS) hardware design paradigm allows the frequency to be increased, or the voltage decreased, beyond the limits determined by static timing analysis (STA), thereby narrowing the pessimistic safety margins that conventional design methods implement to prevent hardware timing errors. Timing errors should be evaluated through accurate gate-level simulation, but a significant gap remains: how do these timing errors propagate from the underlying hardware all the way up to the overall algorithm behavior, where they may degrade the performance and quality of service of the application at stake?

    This thesis tackles this issue by developing and demonstrating a cross-layer framework capable of investigating both AxC (i.e., approximate arithmetic operators, approximate synthesis, and gate-level pruning) and TS hardware design (i.e., voltage over-scaling, frequency over-clocking, temperature rise, and device aging). The framework can simulate both timing errors and logic errors at the gate level, crossing them dynamically and linking the hardware results to the algorithm level, and vice versa, as the application runs. Existing frameworks investigate AxC and TS techniques at the circuit level (i.e., at the output of the accelerator), agnostic to the ultimate impact at the application level (i.e., where the impact truly manifests), which leads to less effective optimization. Unlike the state of the art, the proposed framework offers a holistic approach to assessing the trade-off of AxC and TS techniques at the application level, maximizing energy efficiency and performance by identifying the maximum approximation levels that still fulfill the required good-enough quality.

    The framework is evaluated with an 8-way SAD (Sum of Absolute Differences) hardware accelerator operating within an HEVC encoder as a case study. Application-level results show that the SAD based on approximate adders achieves savings of up to 45% in energy/operation with an increase of only 1.9% in BD-BR. On the other hand, VOS (Voltage Over-Scaling) applied to the SAD yields savings of up to 16.5% in energy/operation with an increase of around 6% in BD-BR. The framework also reveals that the boost of about 6.96% (at 50°C) to 17.41% (at 75°C with 10-year aging) in the maximum clock frequency achieved with TS hardware design is entirely lost to a processing overhead of 8.06% to 46.96% when an unreliable algorithm is chosen for the block matching algorithm (BMA). We also show that this overhead can be avoided by adopting a reliable BMA.
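
    The abstract does not name the approximate adder architectures that were evaluated, so the sketch below only illustrates the general idea with a lower-part-OR adder (LOA), a common approximate-adder style: the low bits are approximated with a bitwise OR, the high bits are added exactly, and the SAD is accumulated with this adder. The function names, the bit split k, and the word width are assumptions, not the thesis design.

        import numpy as np

        def loa_add(a, b, k=4, width=16):
            # Lower-part OR adder (LOA): the k least-significant bits are approximated
            # with a bitwise OR, the upper bits are added exactly. A common approximate
            # adder style, assumed here; not necessarily the one used in the thesis.
            mask = (1 << k) - 1
            low = (a | b) & mask
            high = (a & ~mask) + (b & ~mask)
            return (high + low) & ((1 << width) - 1)

        def approx_sad(block_a, block_b, k=4):
            # Sum of absolute differences accumulated with the approximate adder.
            sad = 0
            for pa, pb in zip(np.asarray(block_a).ravel(), np.asarray(block_b).ravel()):
                sad = loa_add(sad, abs(int(pa) - int(pb)), k)
            return sad
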
    This thesis also presents approximate DTT (Discrete Tchebichef Transform) hardware proposals that explore transform matrix approximation, truncation, and pruning. The results show that the approximate DTT hardware increases the maximum frequency by up to 64%, reduces circuit area by up to 43.6%, and saves up to 65.4% in power dissipation. The DTT proposal mapped to an FPGA shows an increase of up to 58.9% in maximum frequency and savings of about 28.7% and 32.2% in slices and dynamic power, respectively, compared with the state of the art.
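
    As one hedged illustration of what pruning means at the algorithm level, the sketch below zeroes all DTT coefficients outside a low-order zone of a block; in hardware, the logic that computes the discarded coefficients would simply not be synthesized. The triangular zone shape and the keep parameter are assumptions, not the thesis's pruning scheme.

        import numpy as np

        def zonal_prune(coeffs, keep=4):
            # Keep only the low-order DTT coefficients (p + q < keep) of a block and
            # zero the rest; this models the output of a pruned transform whose
            # discarded coefficients are never computed. The zone shape is assumed.
            pruned = np.zeros_like(coeffs)
            rows, cols = coeffs.shape
            for p in range(rows):
                for q in range(cols):
                    if p + q < keep:
                        pruned[p, q] = coeffs[p, q]
            return pruned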

    Discrete Tchebichef transform and its application to image / video compression

    The discrete Tchebichef transform (DTT) is a novel polynomial-based orthogonal transform. It exhibits interesting properties, such as high energy compaction, optimal decorrelation and direct orthogonality, and hence is expected to produce good transform coding results. Advances in the areas of image and video coding have generated a growing interest in discrete transforms. The demand for high quality with limited use of computational resources and improved cost benefits has led to experimentation with novel transform coding methods. One such experiment is undertaken in this thesis with the DTT. We propose the integer Tchebichef transform (ITT) for 4×4 and 8×8 DTTs. Using the proposed ITT, we also design fast multiplier-free algorithms for 4-point and 8-point DTTs that are superior to the existing algorithms. We perform image compression using the 4×4 and 8×8 DTT. In order to analyze the performance of the DTT, we compare the image compression results of the DTT, the discrete cosine transform (DCT) and the integer cosine transform (ICT). Image quality measures that span both subjective and objective evaluation techniques are computed for the compressed images, and the results are analyzed taking into account the statistical properties of the images for a better understanding of the behavioral trends. Substantial improvement is observed in the quality of DTT-compressed images. The appealing characteristics of the DTT motivate us to take a step further and evaluate the computational benefits of the ITT over the ICT, which is currently used in the H.264/AVC standard. The merits of the DTT as demonstrated in this thesis are its simplicity, good image compression potential and computational efficiency, further enhanced by its low precision requirement.
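
    For readers unfamiliar with the transform, the sketch below builds a floating-point orthonormal DTT matrix by orthonormalizing the monomials sampled at x = 0..N-1 (which yields the discrete Tchebichef polynomials up to sign) and applies it separably to a square block. It does not reproduce the integer, multiplier-free ITT proposed in the thesis; function names and the QR-based construction are assumptions made for illustration.

        import numpy as np

        def dtt_matrix(N=8):
            # Orthonormal N-point discrete Tchebichef transform matrix. The basis is
            # obtained (up to sign) by orthonormalizing 1, x, x^2, ... sampled at
            # x = 0..N-1; row k holds the degree-k Tchebichef polynomial.
            x = np.arange(N, dtype=float)
            V = np.vander(x, N, increasing=True)      # columns: 1, x, x^2, ...
            Q, R = np.linalg.qr(V)
            Q = Q * np.sign(np.diag(R))               # make leading coefficients positive
            return Q.T

        def dtt2(block):
            # Separable 2-D forward DTT of a square block; the inverse is T.T @ C @ T.
            T = dtt_matrix(block.shape[0])
            return T @ block @ T.T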

    Image Compression using ITT and ICT and a Novel Variable Quantization Technique

    The need for novel transform coding techniques promising improved reconstruction and reduced computational complexity in the field of image and data compression is undeniable. Currently, the integer adaptation of the popular discrete cosine transform (DCT) with fixed quantization prevails in the video compression domain due to its ease of computation and adequate performance. However, no single method or technique can provide both the maximum possible compression and the best quality for every type of image. The influence of specific features of the image, such as its structure and content, on the quality of the reconstructed image after decompression cannot be ignored. This thesis intends to utilize this aspect and identify areas where an alternative to the integer DCT (ICT) for image compression can be proposed. There exist polynomial-based orthogonal transforms, such as the discrete Tchebichef transform (DTT), which possess valuable properties like energy compaction but remain underexploited in comparison with the DCT. This thesis examines, in detail, where the DTT stands as a viable alternative to the DCT for lossy image compression based on various image quality parameters. It introduces a multiplier-free fast implementation of the integer DTT (ITT) of size 8×8 that significantly reduces the computational complexity. Normally, images have detail spread across them in a non-homogeneous manner; hence, when an image is divided into blocks, some blocks may contain intricate detail whereas the detail in others may be very sparse. This feature is exploited in this thesis by proposing a technique that adapts the quantization performed during compression to the characteristics of the image block. The novelty of this variable quantization is that it is simple to implement without much computational or transmission overhead. The image compression performance of the ITT and the ICT, using both variable and fixed quantization, is evaluated and compared for a variety of images. Finally, the cases suitable for ITT-based image compression employing variable quantization are identified.
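
    The abstract does not spell out the adaptation rule, so the sketch below is only a minimal illustration of block-adaptive quantization: block variance is used as the activity measure, and the quantization step is scaled between assumed bounds. The direction of the adaptation (a finer step for detailed blocks), the pivot, and the bounds are all assumptions, not the thesis's technique.

        import numpy as np

        def variable_quantize(coeffs, pixels, base_step=16.0, lo=0.5, hi=2.0, pivot=500.0):
            # Quantize a block of transform coefficients with a step adapted to the
            # detail of the corresponding spatial block. The variance-based activity
            # measure, the pivot, the (lo, hi) bounds and the direction of the
            # adaptation are assumptions made for illustration.
            activity = float(np.var(pixels))
            scale = hi - (hi - lo) * min(activity / pivot, 1.0)    # detailed block -> finer step
            step = base_step * scale
            return np.round(coeffs / step).astype(int), step       # step must be known at the decoder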