
    Bicriteria data compression

    The advent of massive datasets (and the consequent design of high-performing distributed storage systems) has reignited the interest of the scientific and engineering community in the design of lossless data compressors which achieve an effective compression ratio and very efficient decompression speed. Lempel-Ziv's LZ77 algorithm is the de facto choice in this scenario because of its decompression speed and its flexibility in trading decompression speed against compressed-space efficiency. Each of the existing implementations offers a trade-off between space occupancy and decompression speed, so software engineers have to content themselves with picking the one which comes closest to the requirements of the application at hand. Starting from these premises, and for the first time in the literature, we address in this paper the problem of trading these two resources optimally by introducing the Bicriteria LZ77-Parsing problem, which formalizes in a principled way what data compressors have traditionally approached by means of heuristics. The goal is to determine an LZ77 parsing which minimizes the space occupancy in bits of the compressed file, provided that the decompression time is bounded by a fixed amount (or vice versa). This way, the software engineer can set a space (or time) requirement and then derive the LZ77 parsing which optimizes the decompression speed (or the space occupancy, respectively). We solve this problem efficiently, in O(n log^2 n) time and optimal linear space within a small additive approximation, by proving and deploying some specific structural properties of the weighted graph derived from the possible LZ77 parsings of the input file. A preliminary set of experiments shows that our novel proposal dominates all the highly engineered competitors, hence offering a win-win situation in theory and practice.
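
    As a rough illustration of the graph formulation described in this abstract (not the authors' O(n log^2 n) algorithm), the sketch below treats each candidate LZ77 phrase as a forward edge in a DAG weighted by its bit cost and a modelled decompression-time cost, and approximates the bicriteria optimum via binary search on a Lagrangian multiplier. The `edges` input and both cost models are hypothetical placeholders.

        def lagrangian_parse(n, edges, lam):
            # edges[i]: list of (j, bits, time) tuples, one per candidate LZ77
            # phrase covering text[i:j]. Assumes single-character literal
            # edges exist, so some parsing always reaches position n.
            INF = float("inf")
            dist = [INF] * (n + 1)
            back = [None] * (n + 1)
            dist[0] = 0.0
            for i in range(n):                  # edges only go forward: plain DP
                if dist[i] == INF:
                    continue
                for j, bits, time in edges[i]:
                    d = dist[i] + bits + lam * time
                    if d < dist[j]:
                        dist[j], back[j] = d, (i, bits, time)
            # Walk back from n to recover the parsing and its true costs.
            parsing, tot_bits, tot_time, j = [], 0, 0, n
            while j > 0:
                i, bits, time = back[j]
                parsing.append((i, j))
                tot_bits += bits
                tot_time += time
                j = i
            parsing.reverse()
            return tot_bits, tot_time, parsing

        def bicriteria_parse(n, edges, time_budget, iters=60):
            # Binary-search the Lagrangian multiplier: a larger lam penalizes
            # decompression time more, trading compressed bits for speed.
            # Returns None if no parsing meets the budget.
            lo, hi, best = 0.0, 1e9, None
            for _ in range(iters):
                lam = (lo + hi) / 2.0
                bits, time, parsing = lagrangian_parse(n, edges, lam)
                if time <= time_budget:
                    if best is None or bits < best[0]:
                        best = (bits, parsing)
                    hi = lam                    # feasible: relax the time penalty
                else:
                    lo = lam                    # too slow: penalize time harder
            return best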

    Data compression processor (Patent)

    Data compression processor for monitoring analog signals by a sampling procedure.

    Data compression system

    A data compression system is described in which TV PCM data for each line scan is received as a succession of multibit pixel words. All or selected bits of each word are compressed by computing difference values between successive pixel words and coding the difference values of a selected number of pixel words forming a block into a fundamental sequence (FS). Based on its length and the number of words per block, the FS is either transmitted as the compressed data, or it is used to generate a code FS, or its complement is used to generate a code FS-bar. When the code FS is generated, its length is compared with that of the original block of PCM data, and the code is transmitted only if it is the shorter of the two. Selected bits per pixel word may be compressed while the remaining bits are transmitted directly, or some of them may be omitted altogether.
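
    A minimal sketch of the difference-plus-fundamental-sequence idea described above, assuming the FS code is a unary code of zigzag-mapped differences; the complemented FS-bar branch and the patent's exact comparison rules are omitted, and all function names are illustrative.

        def zigzag(d):
            # Map signed differences to nonnegative integers:
            # 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4
            return (d << 1) if d >= 0 else ((-d << 1) - 1)

        def fundamental_sequence(values):
            # FS code: each value v becomes v zero bits followed by a one bit.
            return "".join("0" * v + "1" for v in values)

        def compress_block(pixels, bits_per_pixel):
            # Difference-code a block of pixel words, FS-encode the mapped
            # differences, and transmit whichever representation is shorter.
            # The first pixel is kept raw as the reference word.
            diffs = [b - a for a, b in zip(pixels, pixels[1:])]
            fs = fundamental_sequence(zigzag(d) for d in diffs)
            raw_len = len(pixels) * bits_per_pixel
            if len(fs) + bits_per_pixel < raw_len:
                return ("FS", pixels[0], fs)
            return ("PCM", pixels)

        # Example: compress_block([12, 13, 13, 15], 8)
        # -> ("FS", 12, "001100001"), i.e. 9 code bits + 8 reference bits < 32.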

    Prioritized Data Compression using Wavelets

    The volume of data and the velocity with which it is being generated by computational experiments on high-performance computing (HPC) systems is quickly outpacing our ability to effectively store this information in its full fidelity. Therefore, it is critically important to identify and study compression methodologies that retain as much information as possible, particularly in the most salient regions of the simulation space. In this paper, we cast this in terms of a general decision-theoretic problem and discuss a wavelet-based compression strategy for its solution. We provide a heuristic argument as justification and illustrate our methodology on several examples. Finally, we discuss how our proposed methodology may be utilized in an HPC environment on large-scale computational experiments.
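
    A hedged illustration of wavelet-based lossy compression using the PyWavelets package. This sketch performs global hard thresholding, keeping only the largest coefficients as a crude proxy for "most salient regions"; it is not the paper's decision-theoretic prioritization, and the parameter choices are arbitrary.

        import numpy as np
        import pywt

        def compress_field(field, wavelet="db4", keep=0.05, level=3):
            # field: 2D numpy array, e.g. one variable of a simulation snapshot.
            # Decompose, zero all but the largest `keep` fraction of wavelet
            # coefficients, then reconstruct the (lossy) field.
            coeffs = pywt.wavedec2(field, wavelet, level=level)
            arr, slices = pywt.coeffs_to_array(coeffs)
            flat = np.abs(arr).ravel()
            k = max(1, int(keep * flat.size))
            thresh = np.partition(flat, flat.size - k)[flat.size - k]
            arr[np.abs(arr) < thresh] = 0.0       # drop low-magnitude detail
            kept = pywt.array_to_coeffs(arr, slices, output_format="wavedec2")
            return pywt.waverec2(kept, wavelet)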

    Science-driven 3D data compression

    Photometric redshift surveys map the distribution of matter in the Universe through the positions and shapes of galaxies with poorly resolved measurements of their radial coordinates. While a tomographic analysis can be used to recover some of the large-scale radial modes present in the data, this approach suffers from a number of practical shortcomings, and the criteria for deciding on a particular binning scheme are commonly blind to the ultimate science goals. We present a method designed to separate and compress the data into a small number of uncorrelated radial modes, circumventing some of the problems of standard tomographic analyses. The method is based on the Karhunen-Loève (KL) transform and is connected to other 3D data compression bases advocated in the literature, such as the Fourier-Bessel decomposition. We apply this method to both weak lensing and galaxy clustering. In the case of galaxy clustering, we show that the resulting optimal basis is closely associated with the Fourier-Bessel basis, and that for certain observables, such as the effects of magnification bias or primordial non-Gaussianity, the bulk of the signal can be compressed into a small number of modes. In the case of weak lensing, we show that the method is able to compress the vast majority of the signal-to-noise into a single mode, and that optimal cosmological constraints can be obtained considering only three uncorrelated KL eigenmodes, considerably simplifying the analysis with respect to a traditional tomographic approach.
    Comment: 14 pages, 11 figures. Comments welcome
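
    A generic sketch of the Karhunen-Loève compression step on simulated data vectors: the KL basis is the eigenbasis of the covariance, and projecting onto the leading eigenmodes yields a few uncorrelated amplitudes. The paper works with noise-whitened power-spectrum observables, which this toy omits; the input shape is an assumption.

        import numpy as np

        def kl_modes(samples, n_modes=3):
            # samples: (n_realizations, n_bins) array, e.g. tomographic
            # data vectors from many realizations of a survey.
            mean = samples.mean(axis=0)
            cov = np.cov(samples, rowvar=False)
            evals, evecs = np.linalg.eigh(cov)      # ascending eigenvalues
            order = np.argsort(evals)[::-1]         # largest variance first
            basis = evecs[:, order[:n_modes]]
            compressed = (samples - mean) @ basis   # uncorrelated KL amplitudes
            return compressed, basis, mean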

    Mixing Strategies in Data Compression

    We propose geometric weighting as a novel method for combining multiple models in data compression. Our results reveal the rationale behind PAQ weighting and generalize it to a non-binary alphabet. Based on a similar technique, we present a new, generic linear mixture technique. All of the novel mixture techniques rely on given weight vectors. We consider the problem of finding optimal weights and show that the weight optimization leads to a strictly convex (and thus well-behaved) optimization problem. Finally, an experimental evaluation compares the two presented mixture techniques for a binary alphabet. The results indicate that geometric weighting is superior to linear weighting.
    Comment: Data Compression Conference (DCC) 2012
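
    A small sketch of geometric weighting for a binary alphabet as described above: each model's probability enters as a factor raised to its weight, which is equivalent to a weighted sum in the logistic domain, as in PAQ-style mixers. The non-binary generalization and the weight optimization are omitted, and the clamping constant is an assumption to keep the logarithms finite.

        import math

        def geometric_mix(probs, weights, eps=1e-6):
            # probs: each model's P(bit = 1); weights: one weight per model.
            # Returns p = prod p_i^w_i / (prod p_i^w_i + prod (1-p_i)^w_i),
            # computed in the log domain for numerical stability.
            probs = [min(max(p, eps), 1.0 - eps) for p in probs]
            log_p1 = sum(w * math.log(p) for p, w in zip(probs, weights))
            log_p0 = sum(w * math.log(1.0 - p) for p, w in zip(probs, weights))
            m = max(log_p1, log_p0)            # stabilize the exponentials
            e1, e0 = math.exp(log_p1 - m), math.exp(log_p0 - m)
            return e1 / (e1 + e0)

        # Example: geometric_mix([0.9, 0.6], [1.0, 1.0]) ~= 0.93, sharper
        # than the linear average 0.75 of the two model probabilities.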

    Data compression development - A comparison of two floating-aperture data compression schemes

    Comparison of two floating-aperture data compression schemes.