
    Tchebichef Moment Based Hilbert Scan for Image Compression

    Image compression is now essential for applications such as transmission and database storage, where vast amounts of information must be compressed while both the compression ratio and the quality of the compressed image are improved. To this end, this paper develops a new algorithm that uses the discrete orthogonal Tchebichef moment together with a Hilbert curve for image compression. The analyzed image is divided into 8×8 sub-blocks, the Tchebichef moment transform is applied to each one, and the transformed coefficients of each 8×8 sub-block are then reordered by a Hilbert scan into a linear array, at which point Huffman coding is applied. Experimental results show that this algorithm improves coding efficiency while the quality of the reconstructed image is not significantly degraded. Keywords: Huffman Coding, Tchebichef Moment Transforms, Orthogonal Moment Functions, Hilbert, zigzag scan
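    The reordering step is easy to prototype. Below is a minimal Python sketch of only the Hilbert-scan stage; xy_to_hilbert is the classic iterative curve-index conversion (not taken from the paper), and the block is assumed to already hold the 8×8 Tchebichef-transformed coefficients.

        import numpy as np

        def xy_to_hilbert(n, x, y):
            # Classic iterative (x, y) -> curve-index conversion for an
            # n x n grid, n a power of two (the well-known "xy2d" routine).
            d = 0
            s = n // 2
            while s > 0:
                rx = 1 if x & s else 0
                ry = 1 if y & s else 0
                d += s * s * ((3 * rx) ^ ry)
                if ry == 0:                      # rotate/reflect the quadrant
                    if rx == 1:
                        x, y = n - 1 - x, n - 1 - y
                    x, y = y, x
                s //= 2
            return d

        def hilbert_scan(block):
            # Reorder an 8x8 coefficient block into a 1-D array along the
            # Hilbert curve (this replaces the usual zigzag scan).
            n = block.shape[0]
            out = np.empty(n * n, dtype=block.dtype)
            for y in range(n):
                for x in range(n):
                    out[xy_to_hilbert(n, x, y)] = block[y, x]
            return out

        # e.g. hilbert_scan(np.arange(64).reshape(8, 8))

    Compared with the zigzag scan, coefficients that are adjacent in the output stay spatially adjacent in the block, which tends to lengthen the zero runs handed to the Huffman coder.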

    Approximate and timing-speculative hardware design for high-performance and energy-efficient video processing

    Since the end of 2-D transistor scaling appeared on the horizon, innovative circuit design paradigms have been on the rise to go beyond well-established and ultraconservative exact computing. Many compute-intensive applications, such as video processing, exhibit an intrinsic error resilience and do not necessarily require perfect accuracy in their numerical operations. Approximate computing (AxC) is emerging as a design alternative that improves performance and energy efficiency by trading an application's intrinsic error tolerance for algorithm and circuit efficiency. Exact computing also imposes worst-case timing on the conventional design of hardware accelerators to ensure reliability, leading to an efficiency loss. Conversely, the timing-speculative (TS) hardware design paradigm allows increasing the frequency or decreasing the voltage beyond the limits determined by static timing analysis (STA), thereby narrowing the pessimistic safety margins that conventional design methods implement to prevent hardware timing errors. Timing errors should be evaluated by an accurate gate-level simulation, but a significant gap remains: how do these timing errors propagate from the underlying hardware all the way up to the behavior of the entire algorithm, where they may degrade the performance and quality of service of the application at stake?

    This thesis tackles this issue by developing and demonstrating a cross-layer framework capable of investigating both AxC techniques (approximate arithmetic operators, approximate synthesis, gate-level pruning) and TS hardware design (voltage over-scaling, frequency over-clocking, temperature rise, and device aging). The framework can simulate both timing errors and logic errors at the gate level, crossing them dynamically and linking the hardware result with the algorithm level, and vice versa, during the evolution of the application's runtime. Existing frameworks investigate AxC and TS techniques at the circuit level (i.e., at the output of the accelerator), agnostic to the ultimate impact at the application level (i.e., where the impact truly manifests), leading to suboptimal designs. Unlike the state of the art, the proposed framework offers a holistic approach to assessing the trade-offs of AxC and TS techniques at the application level, maximizing energy efficiency and performance by identifying the maximum approximation levels that still fulfill the required good-enough quality.

    This thesis evaluates the framework with an 8-way SAD (Sum of Absolute Differences) hardware accelerator operating inside an HEVC encoder as a case study. Application-level results show that the SAD based on approximate adders achieves savings of up to 45% in energy per operation with an increase of only 1.9% in BD-BR. On the other hand, VOS (Voltage Over-Scaling) applied to the SAD yields savings of up to 16.5% in energy per operation with an increase of around 6% in BD-BR. The framework also reveals that the boost of about 6.96% (at 50°C) to 17.41% (at 75°C with 10-year aging) in maximum clock frequency achieved with TS hardware design is entirely lost to a processing overhead of 8.06% to 46.96% when an unreliable block matching algorithm (BMA) is chosen; we also show that this overhead can be avoided by adopting a reliable BMA.
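    The abstract does not name the approximate adder designs evaluated, so the sketch below uses a lower-part OR adder (LOA), a common circuit in the AxC literature, to illustrate in software the accuracy/efficiency trade-off the SAD accelerator exploits; the parameter k (the number of approximated low bits) is an assumption for illustration.

        def loa_add(a, b, k=4):
            # Lower-part OR adder (LOA): the k low bits are approximated by a
            # bitwise OR and the high parts are added exactly; in this simplified
            # variant the carry out of the low part is dropped. Larger k means
            # cheaper hardware but larger error.
            low_mask = (1 << k) - 1
            low = (a | b) & low_mask
            high = (a & ~low_mask) + (b & ~low_mask)
            return high | low

        def approx_sad(block_a, block_b, k=4):
            # Sum of absolute differences accumulated with the approximate adder,
            # as a block-matching cost in a motion-estimation loop would use it.
            sad = 0
            for pa, pb in zip(block_a, block_b):
                sad = loa_add(sad, abs(pa - pb), k)
            return sad

    Because SAD only ranks candidate blocks, small errors in the low bits rarely change which block wins, which is exactly the error resilience AxC exploits.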
    This thesis also presents approximate DTT (Discrete Tchebichef Transform) hardware proposals that explore transform matrix approximation, truncation, and pruning. The results show that the approximate DTT hardware increases the maximum frequency by up to 64%, reduces circuit area by up to 43.6%, and saves up to 65.4% in power dissipation. The DTT proposal mapped to an FPGA shows an increase of up to 58.9% in maximum frequency and savings of about 28.7% and 32.2% in slices and dynamic power, respectively, compared with state-of-the-art designs.
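    Of the three techniques mentioned, the sketch below illustrates only coefficient truncation, and it builds the DTT basis numerically (the discrete Tchebichef polynomials are the orthonormal polynomials on {0, ..., N-1} with uniform weight, so QR on a Vandermonde matrix recovers them up to sign) rather than with the hardware-friendly integer matrices a real design would use.

        import numpy as np

        def dtt_matrix(N=8):
            # Orthonormalize the monomial basis 1, x, x^2, ... on {0,...,N-1};
            # the resulting rows are the discrete Tchebichef basis vectors.
            x = np.arange(N, dtype=float)
            V = np.vander(x, N, increasing=True)
            Q, R = np.linalg.qr(V)
            Q *= np.sign(np.diag(R))   # fix signs so leading coefficients are positive
            return Q.T                 # row k is the degree-k basis vector

        def truncated_dtt2d(block, keep=4):
            # 2-D DTT of a square block, keeping only the top-left keep x keep
            # coefficients -- one simple form of the truncation idea.
            T = dtt_matrix(block.shape[0])
            c = T @ block @ T.T
            out = np.zeros_like(c)
            out[:keep, :keep] = c[:keep, :keep]
            return out

    In hardware, dropping the high-order rows and columns removes the multipliers and adders that would compute them, which is where the reported area and power savings come from.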

    Development of Novel Image Compression Algorithms for Portable Multimedia Applications

    Portable multimedia devices such as digital cameras, mobile devices, personal digital assistants (PDAs), etc. have limited memory, battery life, and processing power. Real-time processing and transmission using these devices require image compression algorithms that compress efficiently with reduced complexity. Due to limited resources, it is not always possible to implement the best algorithms inside these devices. In uncompressed form, image data occupies an unreasonably large space; however, it also contains a significant amount of statistical and visual redundancy, so the required storage space can be reduced considerably by compression. In this thesis, several novel low-complexity, embedded image compression algorithms are developed that are especially suitable for low-bit-rate image compression on such devices.

    Despite rapid progress in Internet and multimedia technology, demand for data storage and transmission bandwidth continues to outstrip the capabilities of available technology. Browsing images from image data sets over the Internet with these devices requires fast encoding and decoding together with good rate-distortion performance. With the progressive picture build-up of wavelet-coded images, recent multimedia applications demand good-quality images at the early stages of transmission. This is particularly important if the image is browsed over wireless links, where limited channel capacity, storage, and computation are the deciding parameters. Unfortunately, the performance of the JPEG codec degrades at low bit rates because of the underlying block-based DCT transform. Although wavelet-based codecs provide substantial improvements in progressive picture quality at lower bit rates, these coders still do not fully exploit the attainable coding performance there. It is evident from the statistics of transformed images that very few significant coefficients have magnitudes above the early thresholds. Wavelet-based codecs code a zero for each insignificant subband as they move from the coarsest to the finest subbands, and there can be six to seven bit-plane passes in which many zeros are encoded because many subbands are likely to be insignificant with respect to the early thresholds. Bits indicating the insignificance of a coefficient or subband are required, but they do not code information that reduces the distortion of the reconstructed image; this yields zero reduction in distortion for a nonzero increase in bit rate.

    Another problem associated with wavelet-based coders such as Set Partitioning in Hierarchical Trees (SPIHT), Set Partitioning Embedded Block (SPECK), and Wavelet Block-Tree Coding (WBTC) is their use of auxiliary lists. The list data structures grow rapidly as elements are added, removed, or moved in each bit-plane pass, increasing the dynamic memory requirement of the codec, which is an inefficient feature for hardware implementations. Listless variants of SPIHT and SPECK, e.g., No-List SPIHT (NLS) and Listless SPECK (LSK), were later developed; however, these algorithms have rate-distortion performance similar to the list-based coders. An improved LSK (ILSK) algorithm is proposed in this dissertation that improves the low-bit-rate performance of LSK by encoding far fewer symbols (i.e., zeros) for the several insignificant subbands.
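    A toy example makes the bit-plane argument concrete. This sketch shows only the thresholding idea the paragraph above refers to, not the ILSK algorithm itself; real coders also emit sign and refinement bits and exploit subband structure, all omitted here.

        def significance_passes(coeffs):
            # The threshold T = 2^n halves each pass; a coefficient becomes
            # significant once |c| >= T. Late passes mostly discover small
            # coefficients, so early passes are dominated by zeros.
            n = max(abs(c) for c in coeffs).bit_length() - 1
            significant = set()
            for plane in range(n, -1, -1):
                T = 1 << plane
                newly = [i for i, c in enumerate(coeffs)
                         if i not in significant and abs(c) >= T]
                significant.update(newly)
                print(f"plane {plane} (T = {T:3d}): {len(newly)} newly significant")

        significance_passes([97, -53, 14, -7, 6, 3, -2, 1])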
    Further, ILSK is combined with a block-based transform known as the discrete Tchebichef transform (DTT). The proposed coder is named Hierarchical Listless DTT (HLDTT). DTT is chosen because its energy compaction property is similar to that of the discrete cosine transform (DCT). It is demonstrated that images decoded using HLDTT have better visual quality (i.e., mean structural similarity) than images decoded using DCT-based embedded coders at most bit rates. The ILSK algorithm is also combined with a lifting-based wavelet transform to show its superiority over JPEG2000 at lower rates in terms of peak signal-to-noise ratio (PSNR).

    A fully scalable and random-access-decodable listless algorithm based on lifting-based ILSK is also developed. The proposed algorithm, named Scalable Listless Embedded Block Partitioning (S-LEBP), generates a bit stream that offers increasing signal-to-noise ratio and spatial resolution. These are very useful features for transmitting images over a heterogeneous network that optimally serves each user according to available bandwidth and computing needs. Random-access decoding is a very useful feature for extracting or manipulating a certain area of an image with minimal decoding work. The idea used in ILSK is also extended to encode and decode color images; the proposed algorithm for coding color images is named Color Listless Embedded Block Partitioning (CLEBP). The coding efficiency of CLEBP is compared with Color SPIHT (CSPIHT) and a color variant of the WBTC algorithm; simulation results show that CLEBP exhibits a significant PSNR improvement over the latter two algorithms on various types of images.

    Although many modifications to NLS and LSK have been made, a listless modification of the WBTC algorithm has not been reported in the literature. Therefore, a listless variant of WBTC (named LBTC) is proposed. LBTC not only reduces the memory requirement by 88-89% but also increases the encoding and decoding speed, while preserving the rate-distortion performance. Further, the combinations of DCT with LBTC (named DCT-LBT) and DTT with LBTC (named hierarchical listless DTT, HLBT-DTT) are compared with some state-of-the-art DCT-based embedded coders; the proposed DCT-LBT and HLBT-DTT show significant PSNR improvements over almost all of the embedded coders at most bit rates.

    In some multimedia applications, e.g., digital cameras, camcorders, etc., the images always need to have a fixed, predetermined high quality, and the extra effort required for quality scalability is wasted. Therefore, non-embedded algorithms are best suited for these applications; the proposed algorithms can be made non-embedded by encoding a fixed set of bit planes at a time. Finally, a sparse orthogonal transform matrix is proposed that can be integrated into a JPEG baseline coder. The proposed matrix promises a substantial reduction in hardware complexity with a marginal loss of image quality over a considerable range of bit rates compared with the block-based DCT or integer DCT.
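    The coders above share a common set-partitioning core. Below is a minimal sketch of the SPECK-style quadtree significance test (assuming a square, power-of-two-sized coefficient block); listless variants such as LSK/ILSK/LBTC keep the same tests but replace the recursion and auxiliary lists with fixed per-coefficient state markers, which this sketch does not model.

        import numpy as np

        def speck_quadtree_bits(block, T):
            # An insignificant set costs a single 0 bit; a significant set
            # splits into four quadrants that are tested recursively.
            bits = []

            def visit(s):
                if np.max(np.abs(s)) < T:
                    bits.append(0)           # one symbol covers the whole set
                    return
                bits.append(1)
                h = s.shape[0]
                if h == 1:                   # a significant pixel; sign and
                    return                   # refinement bits omitted here
                m = h // 2
                visit(s[:m, :m])
                visit(s[:m, m:])
                visit(s[m:, :m])
                visit(s[m:, m:])

            visit(np.asarray(block))
            return bits

    Large insignificant regions collapse to a single bit, which is why avoiding repeated zero-coding of insignificant subbands, as ILSK does, pays off at low rates.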

    Image Compression using ITT and ICT and a Novel Variable Quantization Technique

    The need for novel transform coding techniques promising improved reconstruction and reduced computational complexity in the field of image and data compression is undeniable. The integer adaptation of the popular discrete cosine transform (DCT) with fixed quantization is currently prevalent in the video compression domain due to its ease of computation and adequate performance. However, no single method or technique can provide both the maximum possible compression and the best quality for all types of images: the influence of specific features of the image, such as its structure and content, on the quality of the reconstructed image after decompression cannot be ignored. This thesis exploits this aspect to identify areas where an alternative to the integer DCT (ICT) can be proposed for image compression. Polynomial-based orthogonal transforms such as the discrete Tchebichef transform (DTT) possess valuable properties like energy compaction but remain underexploited in comparison with the DCT. This thesis examines in detail where the DTT stands as a viable alternative to the DCT for lossy image compression, based on various image quality parameters, and introduces a multiplier-free fast implementation of an 8×8 integer DTT (ITT) that significantly reduces computational complexity. Normally, detail is spread across an image in a non-homogeneous manner: when the image is divided into blocks, some blocks may contain intricate detail, whereas in others the detail may be very sparse. This thesis exploits this feature by proposing a technique that adapts the quantization performed during compression to the characteristics of each image block. The novelty of this variable quantization is that it is simple to implement without much computational or transmission overhead. The image compression performance of ITT and ICT, using both variable and fixed quantization, is evaluated and compared for a variety of images, and the cases suitable for ITT-based image compression employing variable quantization are identified.
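    The abstract does not give the thesis's block-classification rule, so the sketch below uses pixel variance with an arbitrary threshold as a stand-in; Q_BASE is the standard JPEG luminance table, used here only as a base matrix, and the scale factors are illustrative.

        import numpy as np

        # Standard JPEG luminance quantization table (base matrix only).
        Q_BASE = np.array([
            [16, 11, 10, 16,  24,  40,  51,  61],
            [12, 12, 14, 19,  26,  58,  60,  55],
            [14, 13, 16, 24,  40,  57,  69,  56],
            [14, 17, 22, 29,  51,  87,  80,  62],
            [18, 22, 37, 56,  68, 109, 103,  77],
            [24, 35, 55, 64,  81, 104, 113,  92],
            [49, 64, 78, 87, 103, 121, 120, 101],
            [72, 92, 95, 98, 112, 100, 103,  99]])

        def variable_quantize(coeffs, pixel_block, fine=0.5, coarse=1.5,
                              var_thresh=200.0):
            # Classify the block by a cheap detail measure (pixel variance)
            # and scale the quantization matrix accordingly: detailed blocks
            # are quantized finely, flat blocks coarsely. Only the one-bit
            # class per block needs to accompany the bitstream.
            scale = fine if np.var(pixel_block) >= var_thresh else coarse
            Q = np.maximum(np.round(Q_BASE * scale), 1)
            return np.round(coeffs / Q).astype(int), scale

    Because the decoder can rebuild Q from the base table and the per-block class, the scheme adds almost no transmission overhead, matching the claim above.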