    A VLSI architecture of JPEG2000 encoder

    Copyright © 2004 IEEE. This paper proposes a VLSI architecture for a JPEG2000 encoder, which functionally consists of two parts: the discrete wavelet transform (DWT) and embedded block coding with optimized truncation (EBCOT). For the DWT, a spatial combinative lifting algorithm (SCLA)-based scheme with both 5/3 reversible and 9/7 irreversible filters is adopted, reducing multiplication computations by 50% and 42%, respectively, compared with the conventional lifting-based implementation (LBI). For EBCOT, a dynamic memory control (DMC) strategy for Tier-1 encoding is adopted to reduce the on-chip wavelet coefficient storage by 60%, and a subband parallel-processing method is employed to speed up the EBCOT context formation (CF) process. An architecture for Tier-2 encoding is presented that reduces the on-chip bitstream buffering from full-tile size down to three-code-block size and largely eliminates the iterations of the rate-distortion (RD) truncation. This work was supported in part by the China National High Technologies Research Program (863) under Grant 2002AA1Z142.
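    For context, the conventional lifting-based implementation (LBI) that the SCLA scheme improves upon can be illustrated with the standard 5/3 reversible lifting steps. The sketch below is a minimal, 1-D, single-level Python illustration of those generic lifting steps, not the paper's SCLA architecture; function and variable names are illustrative only.

    import numpy as np

    def dwt53_forward(x):
        """One level of the generic JPEG2000 5/3 reversible DWT via lifting."""
        x = np.asarray(x, dtype=np.int64)
        s, d = x[0::2].copy(), x[1::2].copy()             # even / odd split

        # Predict step: high-pass (detail) coefficients
        for n in range(len(d)):
            right = s[n + 1] if n + 1 < len(s) else s[n]  # symmetric extension
            d[n] -= (s[n] + right) >> 1

        # Update step: low-pass (approximation) coefficients
        for n in range(len(s)):
            left = d[n - 1] if n > 0 else d[0]            # symmetric extension
            cur = d[n] if n < len(d) else d[-1]
            s[n] += (left + cur + 2) >> 2

        return s, d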

    Sample-Parallel Execution of EBCOT in Fast Mode

    JPEG 2000’s most computationally expensive building block is the Embedded Block Coder with Optimized Truncation (EBCOT). This paper evaluates how encoders targeting a parallel architecture such as a GPU can increase their throughput in use cases where very high data rates are used. The compression efficiency in the less significant bit-planes is then often poor and it is beneficial to enable the Selective Arithmetic Coding Bypass style (fast mode) in order to trade a small loss in compression efficiency for a reduction of the computational complexity. More importantly, this style exposes a more finely grained parallelism that can be exploited to execute the raw coding passes, including bit-stuffing, in a sample-parallel fashion. For a latency- or memory critical application that encodes one frame at a time, EBCOT’s tier-1 is sped up between 1.1x and 2.4x compared to an optimized GPU-based implementation. When a low GPU occupancy has already been addressed by encoding multiple frames in parallel, the throughput can still be improved by 5% for high-entropy images and 27% for low-entropy images. Best results are obtained when enabling the fast mode after the fourth significant bit-plane. For most of the test images the compression rate is within 1% of the original
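    The "fast mode" referenced above is the standard's arithmetic-coder bypass: selected coding passes of the less significant bit-planes emit raw bits instead of MQ-coded decisions, subject to a bit-stuffing rule that prevents marker-like byte sequences from appearing in the stream. The following sequential sketch illustrates that bit-stuffing rule only (the paper executes the raw passes sample-parallel on the GPU); class and method names are assumptions for illustration.

    class RawBitWriter:
        """Packs raw-pass bits MSB-first with JPEG 2000-style bit stuffing."""
        def __init__(self):
            self.out = bytearray()
            self.buf = 0       # bits collected for the current byte
            self.nbits = 0
            self.limit = 8     # drops to 7 right after an 0xFF byte

        def put_bit(self, bit):
            self.buf = (self.buf << 1) | (bit & 1)
            self.nbits += 1
            if self.nbits == self.limit:
                self.out.append(self.buf)
                # After an 0xFF byte, the next byte carries only 7 bits,
                # i.e. a zero bit is stuffed into its MSB
                self.limit = 7 if self.buf == 0xFF else 8
                self.buf, self.nbits = 0, 0

        def flush(self):
            if self.nbits:     # pad the last partial byte (padding choice illustrative)
                self.out.append(self.buf << (self.limit - self.nbits))

    In a sample-parallel setting, a prefix sum over per-sample bit counts is one common way to determine each bit's output position so that such packing can be done concurrently.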

    Low power JPEG2000 5/3 discrete wavelet transform algorithm and architecture

    Hybrid Neural Network Predictive-Wavelet Image Compression System

    This paper considers a novel image compression technique called hybrid predictive-wavelet coding. The proposed technique combines the properties of predictive coding and discrete wavelet coding. In contrast to JPEG2000, the image data values are pre-processed using predictive coding to remove interpixel redundancy. The error values, which are the differences between the original and the predicted values, are then transformed with the discrete wavelet transform. A nonlinear neural network predictor is utilised in the predictive coding stage. The simulation results indicate that the proposed technique can achieve good compressed-image quality at high decomposition levels in comparison to JPEG2000.
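    As a rough illustration of the pipeline described above, the sketch below predicts each pixel, forms the residual, and wavelet-transforms the residuals. A trivial left-neighbour predictor stands in for the paper's nonlinear neural network predictor, and PyWavelets is an assumed dependency; all names are illustrative.

    import numpy as np
    import pywt

    def hybrid_encode(image, levels=4):
        image = image.astype(np.int32)
        predicted = np.zeros_like(image)
        predicted[:, 1:] = image[:, :-1]   # placeholder predictor: left neighbour
        residual = image - predicted       # interpixel redundancy removed

        # Wavelet-transform the prediction errors; the abstract reports the
        # best results at high decomposition levels
        coeffs = pywt.wavedec2(residual, 'bior2.2', level=levels)
        return coeffs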

    Evaluation of GPU/CPU Co-Processing Models for JPEG 2000 Packetization

    With the bottom-line goal of increasing the throughput of a GPU-accelerated JPEG 2000 encoder, this paper evaluates whether the post-compression rate control and packetization routines should be carried out on the CPU or on the GPU. Three co-processing models that differ in how the workload is split between the CPU and GPU are introduced. Both routines are discussed and algorithms for executing them in parallel are presented. Experimental results for compressing a detail-rich UHD sequence to 4 bits/sample indicate speed-ups of 200x for the rate control and 100x for the packetization compared to the single-threaded implementation in the commercial Kakadu library. These two routines, when executed on the CPU, take 4x as long as all remaining coding steps on the GPU and therefore present a bottleneck. Even if the CPU bottleneck could be avoided with multi-threading, it is still beneficial to execute all coding steps on the GPU, as this minimizes the required device-to-host transfers and thereby speeds up the critical path from 17.2 fps to 19.5 fps at 4 bits/sample and to 22.4 fps at 0.16 bits/sample.
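    The post-compression rate control mentioned above is the PCRD-style truncation of each code-block's embedded bit-stream: a distortion-rate slope threshold decides how much of each block to keep, and the threshold is tuned to the overall byte budget. The sequential sketch below is only a simplified illustration of that idea (the paper parallelizes these routines); the data structures are assumed, and each block's pass list is taken to be convex-hull ordered.

    def truncate_block(passes, lmbda):
        """passes: list of (cumulative_bytes, cumulative_distortion_reduction)."""
        kept, prev_b, prev_d = 0, 0, 0.0
        for b, d in passes:
            slope = (d - prev_d) / max(b - prev_b, 1)
            if slope >= lmbda:                 # pass still "worth" its bytes
                kept, prev_b, prev_d = b, b, d
            else:
                break
        return kept

    def find_threshold(blocks, byte_budget, iters=40):
        lo, hi = 0.0, 1e12                     # bisect the slope threshold
        for _ in range(iters):
            mid = (lo + hi) / 2
            total = sum(truncate_block(p, mid) for p in blocks)
            lo, hi = (mid, hi) if total > byte_budget else (lo, mid)
        return hi                              # smallest tested threshold meeting the budget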

    Multiprocessor DSP Implementation of the JPEG 2000 Codec

    The transition to JPEG2000 from other image formats such as standard JPEG offers improved compression and image quality, yet it has not been widely adopted in practice. This is mainly due to the complexity of the JPEG2000 algorithm. Standard JPEG uses the Discrete Cosine Transform (DCT) and Huffman encoding to achieve its compression, whereas JPEG2000 uses the wavelet transform and arithmetic encoding. Due to the wide acceptance of JPEG, there are processors such as Equator Technology's BSP-15 digital signal processor (DSP) that have been designed with features specifically for JPEG applications. For some of the current digital printing applications where JPEG is used, images must be encoded and decoded at rates exceeding 100 pages per minute. A multiprocessor environment consisting of Equator Technology's BSP-15 processors may offer acceptable performance for the JPEG2000 codec. The aim of this work is to design a JPEG2000 codec for the BSP-15 processor and to determine whether this processor is capable of delivering the performance required by high-end digital printers. The features of the BSP-15 that are well suited to the JPEG2000 algorithm are discussed, as well as future improvements that could be incorporated into the architecture. By analyzing the advantages and disadvantages of this processor, the next generation of processors may be able to offer features that allow them to excel at JPEG2000 processing. A multiprocessor DSP implementation of the JPEG2000 codec is the main result of this work. The resulting codec provides more than double the processing throughput of existing JPEG2000 software.

    Data Compression in the Petascale Astronomy Era: a GERLUMPH case study

    As the volume of data grows, astronomers are increasingly faced with choices about what data to keep -- and what to throw away. Recent work evaluating the JPEG2000 (ISO/IEC 15444) standards as a future data format standard in astronomy has shown promising results on observational data. However, there is still a need to evaluate its potential on other types of astronomical data, such as data from numerical simulations. GERLUMPH (the GPU-Enabled High Resolution cosmological MicroLensing parameter survey) represents an example of a data-intensive project in theoretical astrophysics. In the next phase of processing, the ~27 terabyte GERLUMPH dataset is set to grow by a factor of 100 -- well beyond the current storage capabilities of the supercomputing facility on which it resides. In order to minimise bandwidth usage, file transfer time, and storage space, this work evaluates several data compression techniques. Specifically, we investigate off-the-shelf and custom lossless compression algorithms as well as the lossy JPEG2000 compression format. Results of lossless compression algorithms on GERLUMPH data products show small compression ratios (1.35:1 to 4.69:1 of input file size), varying with the nature of the input data. Our results suggest that JPEG2000 could be suitable for other numerical datasets stored as gridded or volumetric data. When approaching lossy data compression, one should keep in mind the intended purposes of the data to be compressed and evaluate the effect of the loss on future analysis. In our case study, lossy compression and a high compression ratio do not significantly compromise the intended use of the data for constraining quasar source profiles from cosmological microlensing.
    Comment: 15 pages, 9 figures, 5 tables. Published in the Special Issue of Astronomy & Computing on the future of astronomical data formats.
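    A minimal way to reproduce the kind of lossless ratio measurement reported above, using only stock Python codecs rather than the paper's exact tool set or data, is sketched below; the synthetic array merely stands in for a GERLUMPH map.

    import bz2, lzma, zlib
    import numpy as np

    def compression_ratios(array):
        raw = np.ascontiguousarray(array).tobytes()
        return {
            'zlib': len(raw) / len(zlib.compress(raw, 9)),
            'bz2':  len(raw) / len(bz2.compress(raw, 9)),
            'lzma': len(raw) / len(lzma.compress(raw)),
        }

    # Synthetic stand-in for a gridded data product
    demo = np.random.poisson(100, size=(1024, 1024)).astype(np.uint16)
    print(compression_ratios(demo))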