7,806 research outputs found
Data Compression in the Petascale Astronomy Era: a GERLUMPH case study
As the volume of data grows, astronomers are increasingly faced with choices
on what data to keep -- and what to throw away. Recent work evaluating the
JPEG2000 (ISO/IEC 15444) standards as a future data format standard in
astronomy has shown promising results on observational data. However, there is
still a need to evaluate its potential on other type of astronomical data, such
as from numerical simulations. GERLUMPH (the GPU-Enabled High Resolution
cosmological MicroLensing parameter survey) represents an example of a data
intensive project in theoretical astrophysics. In the next phase of processing,
the ~27 terabyte GERLUMPH dataset is set to grow by a factor of 100 -- well
beyond the current storage capabilities of the supercomputing facility on which
it resides. In order to minimise bandwidth usage, file transfer time, and
storage space, this work evaluates several data compression techniques.
Specifically, we investigate off-the-shelf and custom lossless compression
algorithms as well as the lossy JPEG2000 compression format. Results of
lossless compression algorithms on GERLUMPH data products show small
compression ratios (1.35:1 to 4.69:1 of input file size) varying with the
nature of the input data. Our results suggest that JPEG2000 could be suitable
for other numerical datasets stored as gridded data or volumetric data. When
approaching lossy data compression, one should keep in mind the intended
purposes of the data to be compressed, and evaluate the effect of the loss on
future analysis. In our case study, lossy compression and a high compression
ratio do not significantly compromise the intended use of the data for
constraining quasar source profiles from cosmological microlensing.Comment: 15 pages, 9 figures, 5 tables. Published in the Special Issue of
Astronomy & Computing on The future of astronomical data format
A Universal Scheme for WynerâZiv Coding of Discrete Sources
We consider the WynerâZiv (WZ) problem of lossy compression where the decompressor observes a noisy version of the source, whose statistics are unknown. A new family of WZ coding algorithms is proposed and their universal optimality is proven. Compression consists of sliding-window processing followed by LempelâZiv (LZ) compression, while the decompressor is based on a modification of the discrete universal denoiser (DUDE) algorithm to take advantage of side information. The new algorithms not only universally attain the fundamental limits, but also suggest a paradigm for practical WZ coding. The effectiveness of our approach is illustrated with experiments on binary images, and English text using a low complexity algorithm motivated by our class of universally optimal WZ codes
Subband coding for image data archiving
The use of subband coding on image data is discussed. An overview of subband coding is given. Advantages of subbanding for browsing and progressive resolution are presented. Implementations for lossless and lossy coding are discussed. Algorithm considerations and simple implementations of subband systems are given
Locally adaptive vector quantization: Data compression with feature preservation
A study of a locally adaptive vector quantization (LAVQ) algorithm for data compression is presented. This algorithm provides high-speed one-pass compression and is fully adaptable to any data source and does not require a priori knowledge of the source statistics. Therefore, LAVQ is a universal data compression algorithm. The basic algorithm and several modifications to improve performance are discussed. These modifications are nonlinear quantization, coarse quantization of the codebook, and lossless compression of the output. Performance of LAVQ on various images using irreversible (lossy) coding is comparable to that of the Linde-Buzo-Gray algorithm, but LAVQ has a much higher speed; thus this algorithm has potential for real-time video compression. Unlike most other image compression algorithms, LAVQ preserves fine detail in images. LAVQ's performance as a lossless data compression algorithm is comparable to that of Lempel-Ziv-based algorithms, but LAVQ uses far less memory during the coding process
- âŠ