Parallel data compression
Data compression schemes remove redundancy from communicated and stored data and increase the effective capacity of communication and storage devices. Parallel algorithms and implementations for textual data compression are surveyed. Related concepts from parallel computation and information theory are briefly discussed. Static and dynamic methods for codeword construction and transmission on various models of parallel computation are described. Included are parallel methods which boost system speed by coding data concurrently, and approaches which employ multiple compression techniques to improve compression ratios. Theoretical and empirical comparisons are reported, and areas for future research are suggested.
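One of the surveyed speed-boosting approaches, coding data concurrently, can be illustrated by block-parallel compression: the input is split into independent chunks that are compressed simultaneously. A minimal sketch in Python (the chunk size and the use of zlib are illustrative assumptions, not details from the survey):

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1 << 20  # 1 MiB blocks; an illustrative choice


def compress_parallel(data: bytes) -> list[bytes]:
    """Compress independent chunks concurrently. Each block is
    self-contained, so blocks can also be decompressed in parallel."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    with ThreadPoolExecutor() as pool:
        return list(pool.map(zlib.compress, chunks))


def decompress_parallel(blocks: list[bytes]) -> bytes:
    with ThreadPoolExecutor() as pool:
        return b"".join(pool.map(zlib.decompress, blocks))
```

Threads suffice here because CPython's zlib releases the GIL while compressing; coding chunks independently trades a small loss in compression ratio (no cross-chunk matches) for near-linear speedup.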
Bicriteria data compression
The advent of massive datasets (and the consequent design of high-performance distributed storage systems) has reignited the interest of the scientific and engineering community in the design of lossless data compressors which achieve an effective compression ratio and very efficient decompression speed. Lempel-Ziv's LZ77 algorithm is the de facto choice in this scenario because of its decompression speed and its flexibility in trading decompression speed against compressed-space efficiency. Each of the existing implementations offers a fixed trade-off between space occupancy and decompression speed, so software engineers must content themselves with picking the one which comes closest to the requirements of the application at hand. Starting from these premises, and for the first time in the literature, this paper addresses the problem of trading these two resources optimally by introducing the Bicriteria LZ77-Parsing problem, which formalizes in a principled way what data compressors have traditionally approached by means of heuristics. The goal is to determine an LZ77 parsing which minimizes the space occupancy in bits of the compressed file, provided that the decompression time is bounded by a fixed amount (or vice versa). This way, the software engineer can set a space (or time) requirement and then derive the LZ77 parsing which optimizes the decompression speed (or the space occupancy, respectively). We solve this problem efficiently, in O(n log^2 n) time and optimal linear space within a small additive approximation, by proving and deploying some specific structural properties of the weighted graph derived from the possible LZ77 parsings of the input file. A preliminary set of experiments shows that our novel proposal dominates all the highly engineered competitors, hence offering a win-win situation in theory and practice.
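To make the two criteria concrete: an LZ77 parsing decomposes the input into copy and literal phrases, and many valid parsings exist for the same string, each with its own bit cost and back-reference locality. The sketch below is a naive greedy parser, not the paper's optimal bicriteria algorithm; the window size and minimum match length are illustrative assumptions.

```python
def lz77_greedy_parse(data: bytes, window: int = 4096):
    """Greedy LZ77 parse: at each position, emit the longest match
    found in the preceding window, otherwise a literal.

    Each alternative choice of phrases changes both the size of the
    parse (space) and the distance of back-references (decompression
    locality and speed), the two resources the bicriteria parser trades.
    """
    phrases = []  # (distance, length) copies or (0, literal_byte)
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        for j in range(max(0, i - window), i):
            k = 0
            while i + k < len(data) and data[j + k] == data[i + k]:
                k += 1  # overlapping matches are allowed in LZ77
            if k > best_len:
                best_len, best_dist = k, i - j
        if best_len >= 3:  # assumed minimum profitable match length
            phrases.append((best_dist, best_len))
            i += best_len
        else:
            phrases.append((0, data[i]))
            i += 1
    return phrases
```

The bicriteria parser instead models the candidate phrases as edges of a weighted graph over the text positions and finds, within the stated time and space bounds, a parsing that minimizes bit cost subject to the decompression-time budget.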
Data compression processor (Patent)
Data compression processor for monitoring analog signals by sampling procedures.
Data compression system
A data compression system is described in which TV PCM data for each line scan are received as a succession of multi-bit pixel words. All or selected bits of each word are compressed by forming difference values between successive pixel words and coding the difference values of a selected number of pixel words forming a block into a fundamental sequence (FS). Based on its length and the number of words per block, the FS is either transmitted as the compressed data, used to generate a code FS, or its complement is used to generate a code FS-bar. When the code FS is generated, its length is compared with that of the original PCM block, and the code is transmitted only if it is the shorter of the two. Selected bits per pixel word may be compressed while the remaining bits are transmitted directly, or some of them may be omitted altogether.
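The heart of the scheme is delta coding followed by a fundamental-sequence code (a unary code: the value n is sent as n zeros and a terminating one), with a fallback to raw PCM whenever the code does not actually save bits. A rough sketch of that length comparison, assuming 8-bit pixel words and a zigzag mapping of signed differences (both illustrative assumptions):

```python
def zigzag(d: int) -> int:
    """Map signed differences to non-negatives: 0,-1,1,-2,2 -> 0,1,2,3,4."""
    return (d << 1) if d >= 0 else -(d << 1) - 1


def fundamental_sequence(values) -> str:
    """Fundamental sequence (unary) code: n -> n zeros then a one."""
    return "".join("0" * v + "1" for v in values)


def encode_block(pixels: list[int], bits_per_pixel: int = 8):
    """Delta-code a block, then keep the FS only if it is shorter than
    the raw PCM block, mirroring the patent's length comparison."""
    diffs = [pixels[0]] + [b - a for a, b in zip(pixels, pixels[1:])]
    fs = fundamental_sequence(zigzag(d) for d in diffs)
    if len(fs) < len(pixels) * bits_per_pixel:
        return ("FS", fs)
    return ("PCM", pixels)  # raw block wins; transmit it unchanged
```

Small differences between neighboring pixels yield short unary codes, which is why the FS tends to beat raw PCM on smooth image lines and lose on noisy ones, exactly the case the fallback handles.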
Prioritized Data Compression using Wavelets
The volume of data, and the velocity with which it is being generated by computational experiments on high-performance computing (HPC) systems, is quickly outpacing our ability to store this information effectively in its full fidelity. It is therefore critically important to identify and study compression methodologies that retain as much information as possible, particularly in the most salient regions of the simulation space. In this paper, we cast this as a general decision-theoretic problem and discuss a wavelet-based compression strategy for its solution. We provide a heuristic argument as justification and illustrate our methodology on several examples. Finally, we discuss how our proposed methodology may be utilized in an HPC environment on large-scale computational experiments.
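Stripped of the decision-theoretic weighting, the underlying compression step is wavelet thresholding: transform the field, keep only the largest-magnitude coefficients, and zero the rest. A minimal sketch using PyWavelets (the Haar wavelet, the 2D field, and the global unweighted threshold are simplifying assumptions; the paper's formulation would bias the keep/drop rule toward salient regions):

```python
import numpy as np
import pywt  # PyWavelets


def compress_topk(field: np.ndarray, keep_fraction: float = 0.05):
    """Keep only the largest-magnitude wavelet coefficients of a 2D field."""
    coeffs = pywt.wavedec2(field, "haar")
    arr, slices = pywt.coeffs_to_array(coeffs)
    k = max(1, int(keep_fraction * arr.size))
    threshold = np.partition(np.abs(arr).ravel(), -k)[-k]
    arr[np.abs(arr) < threshold] = 0.0  # drop low-information coefficients
    return arr, slices


def decompress(arr: np.ndarray, slices) -> np.ndarray:
    coeffs = pywt.array_to_coeffs(arr, slices, output_format="wavedec2")
    return pywt.waverec2(coeffs, "haar")
```

The thresholded array is highly sparse and compresses well with any entropy coder; prioritization amounts to lowering the threshold inside salient regions and raising it elsewhere.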
Science-driven 3D data compression
Photometric redshift surveys map the distribution of matter in the Universe through the positions and shapes of galaxies with poorly resolved measurements of their radial coordinates. While a tomographic analysis can be used to recover some of the large-scale radial modes present in the data, this approach suffers from a number of practical shortcomings, and the criteria for deciding on a particular binning scheme are commonly blind to the ultimate science goals. We present a method designed to separate and compress the data into a small number of uncorrelated radial modes, circumventing some of the problems of standard tomographic analyses. The method is based on the Karhunen-Loève transform (KL) and is connected to other 3D data compression bases advocated in the literature, such as the Fourier-Bessel decomposition. We apply this method to both weak lensing and galaxy clustering. In the case of galaxy clustering, we show that the resulting optimal basis is closely associated with the Fourier-Bessel basis, and that for certain observables, such as the effects of magnification bias or primordial non-Gaussianity, the bulk of the signal can be compressed into a small number of modes. In the case of weak lensing, we show that the method is able to compress the vast majority of the signal-to-noise into a single mode, and that optimal cosmological constraints can be obtained considering only three uncorrelated KL eigenmodes, considerably simplifying the analysis with respect to a traditional tomographic approach.
Comment: 14 pages, 11 figures. Comments welcome.
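Operationally, the KL compression amounts to an eigendecomposition of the covariance of the radially binned data followed by projection onto the leading eigenmodes. A schematic sketch, assuming the data are already arranged as a (samples x bins) array and that noise weighting has been folded into the covariance (simplifications relative to the paper):

```python
import numpy as np


def kl_compress(data: np.ndarray, n_modes: int = 3):
    """Project radially binned data onto its leading KL eigenmodes.

    data: (n_samples, n_bins), e.g. measurements in tomographic bins.
    Returns the mode amplitudes (uncorrelated by construction) and
    the eigenbasis used to compute them.
    """
    cov = np.cov(data, rowvar=False)         # (n_bins, n_bins)
    eigvals, eigvecs = np.linalg.eigh(cov)   # ascending eigenvalues
    leading = np.argsort(eigvals)[::-1][:n_modes]
    basis = eigvecs[:, leading]              # (n_bins, n_modes)
    return data @ basis, basis
```

The default of three modes echoes the abstract's weak-lensing result, where three uncorrelated KL eigenmodes already yield near-optimal cosmological constraints.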
Mixing Strategies in Data Compression
We propose geometric weighting as a novel method to combine multiple models in data compression. Our results reveal the rationale behind PAQ weighting and generalize it to a non-binary alphabet. Based on a similar technique, we present a new, generic linear mixture technique. All of the novel mixture techniques rely on given weight vectors. We consider the problem of finding optimal weights and show that the weight optimization leads to a strictly convex (and thus well-behaved) optimization problem. Finally, an experimental evaluation compares the two presented mixture techniques for a binary alphabet. The results indicate that geometric weighting is superior to linear weighting.
Comment: Data Compression Conference (DCC) 2012.
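Both mixtures combine per-model probability distributions with a shared weight vector: the linear mixture takes a weighted arithmetic mean of the probabilities, while the geometric (PAQ-style) mixture averages them in log space and renormalizes. A minimal sketch for a general alphabet (the weights are taken as given here; finding them is the convex optimization problem the paper studies):

```python
import numpy as np


def linear_mix(probs: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Linear mixture: weighted arithmetic mean of model distributions.

    probs: (n_models, alphabet_size), each row a distribution.
    weights: (n_models,), non-negative, summing to one.
    """
    return weights @ probs


def geometric_mix(probs: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Geometric mixture, generalized to a non-binary alphabet:
    weighted geometric mean of the distributions, renormalized.
    Assumes all probabilities are strictly positive."""
    log_mix = weights @ np.log(probs)    # weighted sum of log-probabilities
    p = np.exp(log_mix - log_mix.max())  # subtract max for stability
    return p / p.sum()
```

For a binary alphabet the geometric mixture reduces to the logistic mixing used in PAQ, which is the connection the abstract refers to.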
Data compression development - A comparison of two floating-aperture data compression schemes
Comparison of two floating-aperture data compression schemes.