
    A Study on Lossless Data Compression Schemes Suited to Decompression on GPUs

    Hiroshima University, Doctor of Engineering

    Efficient Hardware Algorithms for FPGAs

    Hiroshima University, Doctor of Engineering

    DCT Implementation on GPU

    There has been great progress in the field of graphics processors. Since single-core CPU clock speeds have stopped rising, designers have turned to multi-core, parallel processors. Because of their strength in parallel processing, GPUs are becoming more and more attractive for many applications, and with the increasing demand for GPU computing comes a need for system software that drives the GPU to full capacity. GPUs offer a very efficient environment for many image processing applications. This thesis explores the processing power of GPUs for digital image compression using the Discrete Cosine Transform (DCT).
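
    As a rough illustration of the technique this thesis builds on, the sketch below applies a blockwise DCT with uniform quantization, in the spirit of JPEG. The 8x8 block size, the quantization step, and the smooth test block are illustrative assumptions, not parameters taken from the thesis.

```python
# A minimal sketch of DCT-based block compression (illustrative, not the
# thesis's GPU implementation): transform, quantize, and invert one block.
import numpy as np
from scipy.fft import dctn, idctn

def compress_block(block: np.ndarray, q: float = 16.0) -> np.ndarray:
    """Transform an 8x8 pixel block into quantized DCT coefficients."""
    coeffs = dctn(block.astype(np.float64), norm="ortho")
    return np.round(coeffs / q)        # smooth blocks leave mostly zeros

def decompress_block(qcoeffs: np.ndarray, q: float = 16.0) -> np.ndarray:
    """Dequantize and invert the DCT to recover an approximate block."""
    return idctn(qcoeffs * q, norm="ortho")

block = np.add.outer(np.arange(8.0), np.arange(8.0)) * 8  # smooth gradient
qc = compress_block(block)
print(np.count_nonzero(qc), "nonzero coefficients of 64")  # few survive
print(np.abs(block - decompress_block(qc)).max())          # small error
```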

    ON THE COMPRESSION OF DIGITAL HOLOGRAMS

    This thesis investigates the compression of computer-generated transmission holograms through lossless schemes such as the Burrows-Wheeler compression scheme (BWCS). Ever since Gabor’s discovery of holography, much research has been done to improve the recording and viewing of holograms toward more convenient uses such as video viewing. However, the compression of holograms whose recording is performed from virtual scenes has not received much attention. Phase-shift digital holograms, on the other hand, have received more attention due to their practical application in object recognition, imaging, and video sequencing of physical objects. This study examines virtually recorded computer-generated holograms in order to understand the factors affecting their compression. We also investigate the application of lossless compression schemes to holograms with reduced precision for the intensity and phase values. The overall objective is to explore the factors that affect effective compression of virtual holograms. As a result, this work can assist in the design of better compression algorithms for applications such as virtual object simulation, video gaming, and holographic video viewing.
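
    For context, the Burrows-Wheeler transform at the heart of a BWCS-style pipeline can be sketched as below. Real schemes follow the transform with move-to-front and entropy coding stages; this naive rotation sort is for illustration only and is not the thesis's implementation.

```python
# A minimal sketch of the Burrows-Wheeler transform and its inverse.
# The O(n^2 log n) rotation sort is illustrative; production code uses
# suffix arrays. Assumes the input does not contain the sentinel byte.
def bwt(data: bytes, sentinel: bytes = b"\x00") -> bytes:
    """Return the last column of the sorted rotations of data + sentinel."""
    s = data + sentinel                      # sentinel marks the original end
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return bytes(r[-1] for r in rotations)

def ibwt(last: bytes, sentinel: bytes = b"\x00") -> bytes:
    """Invert the transform by repeatedly sorting prepended columns."""
    table = [b""] * len(last)
    for _ in range(len(last)):
        table = sorted(bytes([c]) + row for c, row in zip(last, table))
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]

assert ibwt(bwt(b"banana")) == b"banana"
```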

    GIF image hardware compressors

    The increasing requirements for data transfer and storage are among the crucial questions today. There are several approaches to high-speed data transmission, but each meets only the limited requirements of its narrowly focused target. Data compression offers a solution to the problems of high-speed transfer and low-volume data storage. This paper is devoted to the compression of GIF images using a modified LZW algorithm with a tree-based dictionary, which decreases lookup time and increases compression speed, and in turn opens the way to a hardware compression accelerator in future research.
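
    A minimal sketch of the idea, assuming nothing about the paper's hardware design: keying the LZW dictionary on (prefix code, next byte) pairs mimics a tree-based dictionary, turning each lookup into a single probe instead of a string comparison. The 12-bit code limit is an illustrative assumption.

```python
# A minimal LZW compressor whose dictionary is keyed on (prefix_code, byte)
# pairs: each key is an edge in an implicit trie, so extending the current
# match costs one dictionary probe, not a string comparison.
def lzw_compress(data: bytes, max_codes: int = 4096) -> list:
    dictionary = {(None, b): b for b in range(256)}  # roots: single bytes
    codes, prefix = [], None
    for byte in data:
        node = (prefix, byte)
        if node in dictionary:               # walk down the implicit tree
            prefix = dictionary[node]
        else:
            codes.append(prefix)             # emit code for longest match
            if len(dictionary) < max_codes:  # grow tree with a new child
                dictionary[node] = len(dictionary)
            prefix = byte                    # restart from the new byte
    if prefix is not None:
        codes.append(prefix)
    return codes

print(lzw_compress(b"TOBEORNOTTOBEORTOBEORNOT"))
```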

    OPTIMIZING LEMPEL-ZIV FACTORIZATION FOR THE GPU ARCHITECTURE

    Lossless data compression is used to reduce storage requirements, relieving I/O channels and making better use of bandwidth. The Lempel-Ziv lossless compression algorithms form the basis of many of the most commonly used compression schemes. General-purpose computing on graphics processing units (GPGPU) lets us exploit the massively parallel nature of GPUs for computations other than their original purpose of rendering graphics. Our work targets the use of GPUs for general lossless data compression. Specifically, we developed and ported an algorithm that constructs the Lempel-Ziv factorization directly on the GPU. Our implementation bypasses the sequential nature of LZ factorization and computes the factorization in parallel. By breaking the LZ factorization down into what we call the PLZ, we are able to outperform the fastest serial CPU implementations by up to 24x and perform comparably to a parallel multicore CPU implementation. To achieve these speeds, our implementation produced LZ factorizations that were on average only 0.01 percent larger than the optimal factorization that could be computed sequentially. We also reevaluate the fastest GPU suffix array construction algorithm, which is needed to compute the LZ factorization, and find speedups of up to 5x over the fastest CPU implementations.
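
    For reference, here is a sequential sketch of what is being factorized; the thesis's parallel PLZ decomposition and suffix-array machinery are not reproduced. Each factor is the longest previously occurring prefix of the remaining text, or a fresh literal character.

```python
# A naive O(n^2) sequential Lempel-Ziv factorization, for illustration of
# the problem the GPU implementation parallelizes (not the PLZ algorithm).
def lz_factorize(s: str) -> list:
    factors, i = [], 0
    while i < len(s):
        best_pos, best_len = 0, 0
        for j in range(i):                        # candidate earlier starts
            length = 0
            while i + length < len(s) and s[j + length] == s[i + length]:
                length += 1
            if length > best_len:
                best_pos, best_len = j, length
        if best_len == 0:
            factors.append(s[i])                  # literal: first occurrence
            i += 1
        else:
            factors.append((best_pos, best_len))  # (source offset, length)
            i += best_len
    return factors

print(lz_factorize("abababbb"))  # ['a', 'b', (0, 4), (5, 2)]
```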

    Parallelized Quadtrees for Image Compression in CUDA and MPI

    The quadtree is a data structure that lends itself well to image compression due to its ability to recursively decompose 2-dimensional space. Image compression algorithms that use quadtrees should be simple to parallelize; however, existing quadtree-based image compression algorithms rarely are. An existing program to compress images using quadtrees was upgraded to use GPU acceleration with CUDA but experienced an average slowdown by a factor of 18 to 42. Another parallelization attempt used MPI to process contiguous chunks of an image in parallel and achieved an average speedup by a factor of 1.5 to 3.7 over the unmodified program.
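
    A minimal sketch of the underlying decomposition, with an assumed variance threshold as the split criterion (the paper's own criterion is not specified here): since the four quadrants are independent, each recursive call is a natural unit of parallel work, e.g. one CUDA block or MPI rank per subtree.

```python
# A minimal lossy quadtree decomposition: a square region is stored as its
# mean if its pixel variance is below a threshold, otherwise it is split
# into four quadrants recursively. The threshold is an illustrative choice.
import numpy as np

def quadtree(img: np.ndarray, threshold: float = 100.0):
    """Return a nested tuple: either a leaf mean or four child subtrees."""
    if img.var() <= threshold or min(img.shape) == 1:
        return float(img.mean())                   # leaf: flat region
    h, w = img.shape[0] // 2, img.shape[1] // 2
    return (quadtree(img[:h, :w], threshold),      # top-left
            quadtree(img[:h, w:], threshold),      # top-right
            quadtree(img[h:, :w], threshold),      # bottom-left
            quadtree(img[h:, w:], threshold))      # bottom-right

img = np.zeros((64, 64))
img[16:48, 16:48] = 255                            # one bright square
tree = quadtree(img)
```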

    ISP: An optimal out-of-core image-set processing streaming architecture for parallel heterogeneous systems

    Image population analysis is the class of statistical methods that plays a central role in understanding the development, evolution, and disease of a population. However, these techniques often require excessive computational power and memory, demands that are compounded by a large number of volumetric inputs. Restricted access to supercomputing power limits their influence in general research and practical applications. In this paper we introduce ISP, an Image-Set Processing streaming framework that harnesses the processing power of commodity heterogeneous CPU/GPU systems and attempts to solve this computational problem. In ISP, we introduce specially designed streaming algorithms and data structures that provide an optimal solution for out-of-core multi-image processing problems, in terms of both memory usage and computational efficiency. ISP uses the asynchronous execution mechanism supported by parallel heterogeneous systems to efficiently hide the inherent latency of the out-of-core processing pipeline. Consequently, on computationally intensive problems, the ISP out-of-core solution can achieve the same performance as an in-core solution. We demonstrate the efficiency of the ISP framework on synthetic and real datasets.
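
    The core overlap idea can be sketched with a bounded prefetch buffer. The names below (load_volume, process_volume) are hypothetical stand-ins, not the ISP API; ISP itself additionally relies on asynchronous CPU/GPU execution and specialized streaming data structures.

```python
# A minimal sketch of out-of-core streaming: overlap disk I/O with
# computation so the processor never waits for the full dataset.
# load_volume/process_volume are hypothetical caller-supplied callables.
import threading, queue

def stream_process(paths, load_volume, process_volume, depth=2):
    """Prefetch up to `depth` volumes on a reader thread while computing."""
    buf = queue.Queue(maxsize=depth)        # bounded buffer caps memory use

    def reader():
        for p in paths:
            buf.put(load_volume(p))         # blocks when the buffer is full
        buf.put(None)                       # sentinel: no more input

    threading.Thread(target=reader, daemon=True).start()
    results = []
    while (vol := buf.get()) is not None:   # compute overlaps the next read
        results.append(process_volume(vol))
    return results
```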

    Fast online predictive compression of radio astronomy data

    This report investigates the fast, lossless compression of 32-bit single-precision floating-point values. High-speed compression is critical in the context of the MeerKAT radio telescope currently under construction in South Africa, which will produce data at rates of up to 1 Petabyte every 20 seconds. The compression technique investigated is predictive compression, which has proven successful at achieving high-speed compression in previous research. Several predictive techniques (including polynomial extrapolation) are discussed, along with CPU- and GPU-based parallelization approaches. The implementation achieves throughput rates in excess of 6 GiB/s for compression, and much higher rates for decompression, on a 64-core AMD Opteron machine, with file-size reductions of, on average, 9%. Furthermore, the results of concurrent investigations into block-based parallel Huffman encoding and zero-length encoding are compared to the predictive scheme: the predictive scheme obtains approximately 4%-5% better compression ratios than the zero-length encoder and is 25 times faster than Huffman encoding on an Intel Xeon E5 processor. The scheme may be well suited to the large network bandwidth requirements of the MeerKAT project.
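
    A minimal sketch of the predictive idea, assuming a simple linear (degree-1 polynomial) extrapolation as the predictor: XORing the bit patterns of prediction and sample leaves residuals with long runs of leading zero bits, which a back end such as zero-length encoding can store compactly. The specific predictor and test signal are illustrative, not the report's configuration.

```python
# A minimal sketch of predictive float compression: predict each float32
# sample by linear extrapolation from its two predecessors, then XOR the
# bit patterns of prediction and actual value. Good predictions leave
# residuals full of leading zero bits for the back-end coder to exploit.
import numpy as np

def predictive_residuals(samples: np.ndarray) -> np.ndarray:
    """XOR residuals between float32 samples and a linear prediction."""
    x = samples.astype(np.float32)
    pred = np.empty_like(x)
    pred[:2] = 0.0                        # no history for the first two
    pred[2:] = 2 * x[1:-1] - x[:-2]       # linear extrapolation
    return x.view(np.uint32) ^ pred.view(np.uint32)

t = np.linspace(0, 1, 1000, dtype=np.float32)
res = predictive_residuals(np.sin(2 * np.pi * t))
# Smooth signals leave mostly-zero high bits in the residuals:
print(np.mean((res >> 16) == 0))
```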