
    Compression Methods for Structured Floating-Point Data and their Application in Climate Research

    The use of new technologies, such as GPU boosters, has led to a dramatic increase in the computing power of High-Performance Computing (HPC) centres. This development, coupled with new climate models that can better utilise this computing power thanks to software development and internal design, has moved the bottleneck from solving the differential equations describing Earth’s atmospheric interactions to actually storing the variables. The current approach to solving the storage problem is inadequate: either the number of variables to be stored is limited or the temporal resolution of the output is reduced. If it is subsequently determined that another variable is required which has not been saved, the simulation must be run again. This thesis deals with the development of novel compression algorithms for structured floating-point data such as climate data so that they can be stored in full resolution. Compression is performed by decorrelation and subsequent coding of the data. The decorrelation step eliminates redundant information in the data. During coding, the actual compression takes place and the data is written to disk. A lossy compression algorithm additionally has an approximation step that unifies the data for better coding. The approximation step reduces the complexity of the data for the subsequent coding, e.g. by quantization. This work makes a new scientific contribution to each of the three steps described above. This thesis presents a novel lossy compression method for time-series data using an Auto-Regressive Integrated Moving Average (ARIMA) model to decorrelate the data. In addition, the concept of information spaces and contexts is presented to use information across dimensions for decorrelation. Furthermore, a new coding scheme is described which reduces the weaknesses of the eXclusive-OR (XOR) difference calculation and achieves a better compression factor than current lossless compression methods for floating-point numbers.
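    The XOR difference calculation mentioned above can be illustrated with a short sketch. This is a generic, Gorilla-style illustration rather than the thesis's actual coding scheme: consecutive values of similar magnitude share sign, exponent and leading mantissa bits, so XOR-ing their bit patterns yields residuals with long runs of leading zeros that a coder can store compactly.

```python
import struct

def f64_bits(x: float) -> int:
    # reinterpret a binary64 float as its 64-bit integer pattern
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def xor_residuals(values):
    # XOR each value's bit pattern with its predecessor's;
    # similar consecutive values give residuals with many leading zeros
    prev, out = 0, []
    for v in values:
        bits = f64_bits(v)
        out.append(bits ^ prev)
        prev = bits
    return out

def leading_zeros(x: int, width: int = 64) -> int:
    # number of leading zero bits in a width-bit integer
    return width - x.bit_length()

series = [42.0, 42.5, 42.5, 43.0]
res = xor_residuals(series)
print([leading_zeros(r) for r in res])
```

    Note that an identical consecutive value gives a residual of zero (64 leading zeros), the best case for the coder.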
Finally, a modular framework is introduced that allows the creation of user-defined compression algorithms. The experiments presented in this thesis show that it is possible to increase the information content of lossily compressed time-series data by applying an adaptive compression technique which preserves selected data with higher precision. An analysis of lossless compression for these time-series showed no success. However, the lossy ARIMA compression model proposed here is able to capture all relevant information. The reconstructed data can reproduce the time-series to such an extent that statistically relevant information for the description of climate dynamics is preserved. Experiments indicate that there is a significant dependence of the compression factor on the selected traversal sequence and the underlying data model. The influence of these structural dependencies on prediction-based compression methods is investigated in this thesis. For this purpose, the concept of Information Spaces (IS) is introduced. IS contributes to improving the predictions of the individual predictors by nearly 10% on average. Perhaps more importantly, the standard deviation of compression results is on average 20% lower. Using IS provides better predictions and consistent compression results. Furthermore, it is shown that shifting the prediction and true value leads to a better compression factor with minimal additional computational costs. This allows the use of more resource-efficient prediction algorithms to achieve the same or better compression factor, or higher throughput during compression or decompression. The coding scheme proposed here achieves a better compression factor than current state-of-the-art methods. Finally, this thesis presents a modular framework for the development of compression algorithms.
The framework supports the creation of user-defined predictors and offers functionalities such as the execution of benchmarks, the random subdivision of n-dimensional data, the quality evaluation of predictors, the creation of ensemble predictors and the execution of validity tests for sequential and parallel compression algorithms. This research was initiated because of the needs of climate science, but the application of its contributions is not limited to it. The results of this thesis are of major benefit for developing and improving any compression algorithm for structured floating-point data.
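    The ensemble predictors the framework supports can be sketched as follows. This is a hypothetical minimal example (the predictor names and the per-element selection strategy are assumptions, not the framework's API): each element is predicted by several simple predictors, and the guess with the smallest residual is recorded together with the index of the predictor that produced it.

```python
def last_value(hist):
    # predict the previous value (or 0.0 at the start)
    return hist[-1] if hist else 0.0

def linear(hist):
    # linear extrapolation from the last two points
    if len(hist) >= 2:
        return 2 * hist[-1] - hist[-2]
    return last_value(hist)

def ensemble_residuals(series, predictors):
    # for each element, keep the predictor with the smallest
    # absolute error and emit (predictor index, residual);
    # smaller residuals code more compactly downstream
    hist, out = [], []
    for x in series:
        err, idx, res = min(
            (abs(x - p(hist)), i, x - p(hist))
            for i, p in enumerate(predictors)
        )
        out.append((idx, res))
        hist.append(x)
    return out

data = [1.0, 2.0, 3.0, 5.0, 7.0]
print(ensemble_residuals(data, [last_value, linear]))
```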

    QARV: Quantization-Aware ResNet VAE for Lossy Image Compression

    This paper addresses the problem of lossy image compression, a fundamental problem in image processing and information theory that is involved in many real-world applications. We start by reviewing the framework of variational autoencoders (VAEs), a powerful class of generative probabilistic models that has a deep connection to lossy compression. Based on VAEs, we develop a novel scheme for lossy image compression, which we name quantization-aware ResNet VAE (QARV). Our method incorporates a hierarchical VAE architecture integrated with test-time quantization and quantization-aware training, without which efficient entropy coding would not be possible. In addition, we design the neural network architecture of QARV specifically for fast decoding and propose an adaptive normalization operation for variable-rate compression. Extensive experiments are conducted, and results show that QARV achieves variable-rate compression, high-speed decoding, and a better rate-distortion performance than existing baseline methods. The code of our method is publicly accessible at https://github.com/duanzhiihao/lossy-vae
    Comment: Technical report
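    The interplay of test-time quantization and quantization-aware training can be hinted at with the additive-noise relaxation that is common in learned compression. This is a generic sketch, not QARV's actual mechanism: hard rounding is used at test time, while training substitutes uniform noise so the mapping stays differentiable for gradient-based optimization.

```python
import random

def quantize(x: float, step: float = 1.0) -> float:
    # test time: hard rounding onto the quantization lattice
    return step * round(x / step)

def noisy_quantize(x: float, step: float = 1.0) -> float:
    # training time: additive uniform noise in [-step/2, step/2]
    # stands in for rounding, keeping the operation differentiable
    return x + random.uniform(-step / 2, step / 2)

print(quantize(3.7), quantize(-1.2))
```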

    Lossless compression with latent variable models

    We develop a simple and elegant method for lossless compression using latent variable models, which we call 'bits back with asymmetric numeral systems' (BB-ANS). The method involves interleaving encode and decode steps, and achieves an optimal rate when compressing batches of data. We demonstrate it firstly on the MNIST test set, showing that state-of-the-art lossless compression is possible using a small variational autoencoder (VAE) model. We then make use of a novel empirical insight, that fully convolutional generative models, trained on small images, are able to generalize to images of arbitrary size, and extend BB-ANS to hierarchical latent variable models, enabling state-of-the-art lossless compression of full-size colour images from the ImageNet dataset. We describe 'Craystack', a modular software framework which we have developed for rapid prototyping of compression using deep generative models.
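    The asymmetric numeral systems half of BB-ANS can be illustrated with a toy single-state rANS coder. This sketch omits the bits-back construction and any latent variable model, and the frequency table is an arbitrary example:

```python
# symbol frequencies summing to M = 8 (a power of two in real coders)
freqs = {"a": 5, "b": 2, "c": 1}
M = sum(freqs.values())
starts, acc = {}, 0
for s, f in freqs.items():
    starts[s] = acc  # cumulative start of each symbol's slot
    acc += f

def encode(symbols):
    x = 1  # coder state grows as symbols are pushed
    for s in symbols:
        f, c = freqs[s], starts[s]
        x = (x // f) * M + c + (x % f)
    return x

def decode(x, n):
    out = []
    for _ in range(n):
        r = x % M
        # find the symbol whose slot [start, start + freq) contains r
        s = next(k for k in freqs if starts[k] <= r < starts[k] + freqs[k])
        f, c = freqs[s], starts[s]
        x = f * (x // M) + (r - c)
        out.append(s)
    return out[::-1], x  # symbols pop in reverse encoding order

state = encode("abacab")
msg, _ = decode(state, 6)
print("".join(msg))
```

    Decoding pops symbols last-in-first-out, which is exactly the stack-like property BB-ANS exploits when interleaving its encode and decode steps.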

    Adapted generalized lifting schemes for scalable lossless image coding

    Still image coding occasionally uses linear predictive coding together with multi-resolution decompositions, as may be found in several papers. Those related approaches do not take into account all the information available at the decoder in the prediction stage. In this paper, we introduce an adapted generalized lifting scheme in which the predictor is built upon two filters, allowing it to take advantage of all this available information. With this structure included in a multi-resolution decomposition framework, we study two kinds of adaptation based on least squares estimation, according to different assumptions: either a global or a local second-order stationarity of the image. The efficiency of these decompositions in lossless coding is shown on synthetic images, and their performance is compared with that of well-known codecs (S+P, JPEG-LS, JPEG2000, CALIC) on actual images. Four families of images are distinguished: natural, MRI medical, satellite, and fingerprint textures. On natural and medical images, the performance of our codecs does not exceed that of classical codecs. For satellite images and textures, however, they show a slight but noticeable coding gain (about 0.05 to 0.08 bpp) compared to the other codecs that permit progressive coding in resolution, at the cost of a greater coding time.
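    The lifting structure underlying the paper's scheme can be sketched with the simplest predict/update pair (a Haar-like step). The paper's adapted generalized lifting replaces the fixed predictor below with two least-squares-adapted filters, which this sketch does not attempt:

```python
def lifting_forward(x):
    # one level of predict/update lifting on an even-length 1-D signal:
    # odd samples are predicted from even neighbours (detail),
    # even samples are updated to preserve the local average (approx)
    even, odd = x[0::2], x[1::2]
    detail = [o - e for o, e in zip(odd, even)]          # predict step
    approx = [e + d / 2 for e, d in zip(even, detail)]   # update step
    return approx, detail

def lifting_inverse(approx, detail):
    # undo update, then undo predict: lifting is exactly invertible
    even = [a - d / 2 for a, d in zip(approx, detail)]
    odd = [d + e for d, e in zip(detail, even)]
    out = []
    for e, o in zip(even, odd):
        out.extend([e, o])
    return out

sig = [2.0, 4.0, 6.0, 6.0]
approx, detail = lifting_forward(sig)
print(approx, detail)
```

    Perfect reconstruction holds by construction, which is why lifting is a natural fit for lossless coding: adapting the predictor only shrinks the detail coefficients, it never breaks invertibility.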

    Perceptually-Driven Video Coding with the Daala Video Codec

    Full text link
    The Daala project is a royalty-free video codec that attempts to compete with the best patent-encumbered codecs. Part of our strategy is to replace core tools of traditional video codecs with alternative approaches, many of them designed to take perceptual aspects into account, rather than optimizing for simple metrics like PSNR. This paper documents some of our experiences with these tools, which ones worked and which did not. We evaluate which tools are easy to integrate into a more traditional codec design, and show results in the context of the codec being developed by the Alliance for Open Media.
    Comment: 19 pages, Proceedings of SPIE Workshop on Applications of Digital Image Processing (ADIP), 201