    Strategy of microscopic parallelism for bitplane image coding

    Recent years have seen the rise of a new type of processor that relies heavily on the Single Instruction, Multiple Data (SIMD) architectural principle. The main idea behind SIMD computing is to apply a flow of instructions to multiple pieces of data in parallel and synchronously. This permits the execution of thousands of operations in parallel, achieving higher computational performance than traditional Multiple Instruction, Multiple Data (MIMD) architectures. The level of parallelism required in SIMD computing can only be achieved in image coding systems via microscopic parallel strategies that code multiple coefficients in parallel. Until now, the only way to achieve microscopic parallelism in bitplane coding engines was to execute multiple coding passes in parallel. Such a strategy does not suit SIMD computing well because each thread executes different instructions. This paper introduces the first bitplane coding engine devised for the fine grain of parallelism required in SIMD computing. Its main insight is to allow parallel coefficient processing within a coding pass. Experimental tests show coding performance similar to that of JPEG2000.
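
    As a rough illustration of the difference (and not the paper's engine), the sketch below contrasts a sequential, coefficient-by-coefficient significance test with the same test expressed as lockstep operations over a whole bitplane; the function names and the 8×8 block are invented for the example.

        import numpy as np

        def significance_sequential(coeffs, bitplane):
            # MIMD-style: one coefficient after another, data-dependent control flow.
            out = []
            for c in coeffs.ravel():
                out.append(1 if (abs(int(c)) >> bitplane) & 1 else 0)
            return np.array(out).reshape(coeffs.shape)

        def significance_simd(coeffs, bitplane):
            # SIMD-style: one instruction applied to every coefficient in lockstep.
            return (np.abs(coeffs) >> bitplane) & 1

        coeffs = np.random.randint(-255, 256, size=(8, 8))
        assert np.array_equal(significance_sequential(coeffs, 3),
                              significance_simd(coeffs, 3))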

    Bitplane image coding with parallel coefficient processing

    Image coding systems have traditionally been tailored for multiple instruction, multiple data (MIMD) computing. In general, they partition the (transformed) image into codeblocks that can be coded in the cores of MIMD-based processors. Each core executes a sequential flow of instructions to process the coefficients in the codeblock, independently and asynchronously from the other cores. Bitplane coding is a common strategy to code such data. Most of its mechanisms require sequential processing of the coefficients. Recent years have seen the rise of processing accelerators with enhanced computational performance and power efficiency whose architecture is mainly based on the single instruction, multiple data (SIMD) principle. SIMD computing refers to the execution of the same instruction on multiple data in a lockstep, synchronous way. Unfortunately, current bitplane coding strategies cannot fully profit from such processors due to their inherently sequential coding tasks. This paper presents bitplane coding with parallel coefficient (BPC-PaCo) processing, a coding method that can process many coefficients within a codeblock in parallel and synchronously. To this end, the scanning order, the context formation, the probability model, and the arithmetic coder of the coding engine have been reformulated. The experimental results suggest that the penalization in coding performance of BPC-PaCo with respect to traditional strategies is almost negligible.
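
    To convey the flavour of lockstep coefficient processing, the sketch below forms a context for every coefficient of a codeblock simultaneously instead of visiting coefficients one by one. The context definition here is a generic 4-neighbour significance count, not the one reformulated in the paper.

        import numpy as np

        def contexts_lockstep(significance):
            # Count the significant 4-neighbours of every coefficient at once;
            # every array lane executes the same additions in lockstep.
            s = np.pad(significance, 1)            # zero border around the codeblock
            return (s[:-2, 1:-1] + s[2:, 1:-1]     # up and down neighbours
                    + s[1:-1, :-2] + s[1:-1, 2:])  # left and right neighbours

        sig = (np.random.rand(8, 8) > 0.7).astype(np.int32)
        print(contexts_lockstep(sig))              # one context value per coefficient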

    GPU-oriented architecture for an end-to-end image/video codec based on JPEG2000

    Modern image and video compression standards employ computationally intensive algorithms that provide advanced features to the coding system. Current standards often need to be implemented in hardware or using expensive solutions to meet the real-time requirements of some environments. Contrary to this trend, this paper proposes an end-to-end codec architecture running on inexpensive Graphics Processing Units (GPUs) that is based on, though not compatible with, the JPEG2000 international standard for image and video compression. When executed on a commodity Nvidia GPU, it achieves real-time processing of 12K video. The proposed software architecture utilizes four CUDA kernels that minimize memory transfers, use registers instead of shared memory, and employ a double-buffer strategy to optimize the streaming of data. The analysis of throughput indicates that the proposed codec yields results at least 10× higher on average than those achieved with JPEG2000 implementations devised for CPUs, and approximately 4× higher than those achieved with hardwired solutions of the HEVC/H.265 video compression standard.
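
    The double-buffer idea can be sketched in a few lines: while one buffer is being processed, the other is refilled. The sketch below is a toy under stated assumptions (transfer, process, and stream are invented stand-ins for the host-to-device copies and the CUDA kernels), not the paper's architecture.

        from concurrent.futures import ThreadPoolExecutor

        def transfer(frame_id):                   # stands in for a host-to-device copy
            return f"frame-{frame_id}"

        def process(frame):                       # stands in for the coding kernels
            return f"coded({frame})"

        def stream(n_frames):
            results = []
            with ThreadPoolExecutor(max_workers=1) as io:
                pending = io.submit(transfer, 0)              # fill the first buffer
                for i in range(n_frames):
                    frame = pending.result()                  # buffer ready to process
                    if i + 1 < n_frames:
                        pending = io.submit(transfer, i + 1)  # refill the other buffer...
                    results.append(process(frame))            # ...while this one is coded
            return results

        print(stream(4))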

    Stationary probability model for microscopic parallelism in JPEG2000

    Parallel processing is key to augmenting the throughput of image codecs. Despite numerous efforts to parallelize wavelet-based image coding systems, most attempts fail at the parallelization of the bitplane coding engine, which is the most computationally intensive stage of the coding pipeline. The main reason for this failure is the causality with which current coding strategies are devised, which assumes that coefficients are coded one after another. This work analyzes the mechanisms employed in bitplane coding and proposes alternatives to enhance opportunities for parallelism. We describe a stationary probability model that, without sacrificing the advantages of current approaches, removes the main obstacle to the parallelization of most coding strategies. Experimental tests evaluate the coding performance achieved by the proposed method in the framework of JPEG2000 when coding different types of images. Results indicate that the stationary probability model achieves similar coding performance, with slight increments or decrements depending on the image type and the desired level of parallelism.
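
    To make the obstacle concrete, the toy sketch below compares the ideal code length under a stationary model (probabilities fixed before coding, so symbol estimates are order-independent) with an adaptive one (each estimate depends on all previously coded symbols). The probabilities, bits, and contexts are invented for illustration; the paper's model is derived differently.

        import math

        STATIONARY = {0: 0.95, 1: 0.7, 2: 0.5}  # P(bit = 0) per context, fixed a priori

        def ideal_bits_stationary(bits, ctxs):
            # Order-independent: every symbol can be estimated in parallel.
            return sum(-math.log2(STATIONARY[c] if b == 0 else 1.0 - STATIONARY[c])
                       for b, c in zip(bits, ctxs))

        def ideal_bits_adaptive(bits, ctxs):
            # Order-dependent: each estimate uses all previously coded symbols,
            # which is what forces coefficient-by-coefficient processing.
            zeros = {c: 1 for c in ctxs}
            total = {c: 2 for c in ctxs}
            cost = 0.0
            for b, c in zip(bits, ctxs):
                p0 = zeros[c] / total[c]
                cost += -math.log2(p0 if b == 0 else 1.0 - p0)
                zeros[c] += (b == 0)
                total[c] += 1
            return cost

        bits = [0, 0, 1, 0, 1, 0, 0, 1]
        ctxs = [0, 0, 1, 0, 2, 1, 0, 2]
        print(ideal_bits_stationary(bits, ctxs), ideal_bits_adaptive(bits, ctxs))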

    GPU implementation of bitplane coding with parallel coefficient processing for high performance image compression

    The fast compression of images is a requirement in many applications such as TV production, teleconferencing, or digital cinema. Many of the algorithms employed in current image compression standards are inherently sequential. High-performance implementations of such algorithms often require specialized hardware like field-programmable gate arrays (FPGAs). Graphics Processing Units (GPUs) do not commonly achieve high performance on these algorithms because the algorithms do not exhibit fine-grained parallelism. Our previous work introduced bitplane coding with parallel coefficient processing (BPC-PaCo), a new core algorithm for wavelet-based image coding systems tailored for massively parallel architectures. This paper introduces the first high-performance, GPU-based implementation of BPC-PaCo. A detailed analysis of the algorithm aids its implementation on the GPU. The main insights behind the proposed codec are an efficient thread-to-data mapping, smart memory management, and efficient cooperation mechanisms that enable inter-thread communication. Experimental results indicate that the proposed implementation meets the requirements for real-time high-resolution (4K) digital cinema, yielding speedups of 30× with respect to the fastest implementations of current compression standards. A power-consumption evaluation also shows that our implementation consumes 40× less energy than state-of-the-art methods for equivalent performance.
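
    One of those insights, thread-to-data mapping, can be illustrated with an assumed layout (not necessarily the one used in the paper): assigning interleaved column stripes to lockstep threads, so that at every step the threads touch adjacent coefficients, which corresponds to coalesced memory accesses on a GPU.

        import numpy as np

        def stripes_for_threads(codeblock, n_threads):
            # Interleaved columns: at every lockstep step, the n_threads
            # threads access n_threads adjacent coefficients.
            h, w = codeblock.shape
            assert w % n_threads == 0
            return [codeblock[:, t::n_threads] for t in range(n_threads)]

        cb = np.arange(64).reshape(8, 8)           # a toy 8x8 codeblock
        for t, stripe in enumerate(stripes_for_threads(cb, 4)):
            print(f"thread {t} first row: {stripe[0]}")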

    Entropy-based evaluation of context models for wavelet-transformed images

    Entropy is a measure of the uncertainty of a message. Among other aspects, it serves to determine the minimum coding rate that practical systems may attain. This paper defines an entropy-based measure to evaluate context models employed in wavelet-based image coding. The proposed measure is defined considering the mechanisms utilized by modern coding systems. It establishes the maximum performance achievable with each context model. This helps to determine the adequacy of the model under different coding conditions and serves to predict with high precision the coding rate achieved by practical systems. Experimental results evaluate four well-known context models using different types of images, coding rates, and transform strategies. They reveal that, under specific coding conditions, some widespread context models may not be as adequate as is generally thought. The hints provided by this analysis may help to design simpler and more efficient wavelet-based image codecs.
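
    The textbook quantity behind such a measure is the conditional entropy of the coded bits given their contexts; the sketch below estimates it from observed counts. It illustrates the principle only, not the exact measure defined in the paper.

        import math
        from collections import Counter

        def conditional_entropy(bits, ctxs):
            # H(bit | context): a lower value means the context model leaves
            # less uncertainty about each coded bit.
            joint = Counter(zip(ctxs, bits))
            marginal = Counter(ctxs)
            n = len(bits)
            return sum((cnt / n) * -math.log2(cnt / marginal[c])
                       for (c, b), cnt in joint.items())

        bits = [0, 0, 1, 0, 1, 1, 0, 0]
        ctxs = [0, 0, 1, 0, 1, 1, 0, 1]
        print(f"H(bit | ctx) = {conditional_entropy(bits, ctxs):.3f} bits/symbol")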

    Cell-based 2-step scalar deadzone quantization for high bit-depth hyperspectral image coding

    Remote sensing images often need to be coded and/or transmitted with constrained computational resources. Among other features, such images commonly have high spatial, spectral, and bit-depth resolution, which can make them difficult to handle. This letter introduces an embedded quantization scheme based on two-step scalar deadzone quantization (2SDQ) that enhances the quality of transmitted images when coded with a constrained number of bits. The proposed scheme is devised for use in JPEG2000. It is named cell-based 2SDQ since it uses cells, i.e., small sets of wavelet coefficients within the codeblocks defined by JPEG2000. Cells permit a finer discrimination of the coefficients to which the proposed quantizer is applied. Experimental results indicate that the proposed scheme is especially beneficial for high bit-depth hyperspectral images.
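
    For context, the sketch below implements only the plain scalar deadzone quantizer on which JPEG2000 builds, with a central deadzone twice as wide as the other intervals; the cell-based two-step refinement of the letter is not reproduced here.

        def deadzone_quantize(c, delta):
            # Signed index; magnitudes below delta fall into the deadzone (index 0).
            sign = -1 if c < 0 else 1
            return sign * int(abs(c) / delta)

        def deadzone_dequantize(q, delta, r=0.5):
            # Reconstruct a fraction r of the way into the decoded interval.
            if q == 0:
                return 0.0
            sign = -1 if q < 0 else 1
            return sign * (abs(q) + r) * delta

        print(deadzone_quantize(-13.7, 4.0), deadzone_dequantize(-3, 4.0))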