Data compression experiments with LANDSAT thematic mapper and Nimbus-7 coastal zone color scanner data
A case study is presented in which an image-segmentation-based compression technique is applied to LANDSAT Thematic Mapper (TM) and Nimbus-7 Coastal Zone Color Scanner (CZCS) data. The compression technique, called Spatially Constrained Clustering (SCC), can be regarded as an adaptive vector quantization approach. SCC can be applied to either single or multiple spectral bands of image data. The segmented image resulting from SCC is encoded in small rectangular blocks, with the codebook varying from block to block. The lossless compression potential (LCP) of sample TM and CZCS images is evaluated. For the TM test image, the LCP is 2.79. For the CZCS test image the LCP is 1.89, although when only a cloud-free section of the image is considered the LCP increases to 3.48. Examples of compressed images are shown at several compression ratios ranging from 4 to 15. In the case of TM data, the compressed data are classified using the Bayes classifier. The results show an improvement in the similarity between the classification results and the ground truth when compressed data are used, indicating that compression is in fact a useful first step in the analysis.
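The per-block, adaptive-codebook idea can be illustrated with a short sketch: split an image into small rectangular blocks, cluster each block's pixel vectors with plain k-means, and store a codebook plus an index map per block. This is only an illustration of adaptive vector quantization; it omits the spatial constraints that distinguish the paper's SCC algorithm, and the block size, codebook size, and size estimate below are illustrative choices, not values from the study.

```python
# Illustrative sketch only: block-wise vector quantization with a codebook
# chosen independently for each block, loosely mirroring the "codebook
# varying from block to block" idea. NOT the paper's SCC algorithm (the
# spatial constraints are omitted); all parameters are arbitrary examples.
import numpy as np

def quantize_block(block, n_codes=8, n_iter=10, seed=0):
    """Cluster the pixel vectors of one block (H x W x bands) with plain
    k-means and return (codebook, index map)."""
    rng = np.random.default_rng(seed)
    h, w, bands = block.shape
    pixels = block.reshape(-1, bands).astype(float)
    n_codes = min(n_codes, len(pixels))
    # Initialise the codebook with randomly chosen pixels from the block.
    codebook = pixels[rng.choice(len(pixels), n_codes, replace=False)]
    for _ in range(n_iter):
        # Assign every pixel vector to its nearest code vector ...
        dists = np.linalg.norm(pixels[:, None, :] - codebook[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # ... then move each code vector to the mean of its assigned pixels.
        for k in range(n_codes):
            members = pixels[labels == k]
            if len(members):
                codebook[k] = members.mean(axis=0)
    return codebook, labels.reshape(h, w)

def encode_image(image, block_size=16, n_codes=8):
    """Encode an image block by block; each block carries its own codebook."""
    h, w, bands = image.shape
    blocks = []
    for y in range(0, h, block_size):
        for x in range(0, w, block_size):
            blocks.append(quantize_block(image[y:y + block_size, x:x + block_size], n_codes))
    # Crude size estimate: index bits per pixel plus per-block codebook overhead,
    # compared against 8 bits per band per pixel for the raw image.
    index_bits = h * w * np.ceil(np.log2(n_codes))
    codebook_bits = len(blocks) * n_codes * bands * 8
    compression_ratio = (h * w * bands * 8) / (index_bits + codebook_bits)
    return blocks, compression_ratio
```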
A Progressive Universal Noiseless Coder
The authors combine pruned tree-structured vector quantization (pruned TSVQ) with Itoh's (1987) universal noiseless coder. By combining pruned TSVQ with universal noiseless coding, they benefit from the "successive approximation" capabilities of TSVQ, thereby allowing progressive transmission of images, while retaining the ability to noiselessly encode images of unknown statistics in a provably asymptotically optimal fashion. Noiseless compression results are comparable to those of Ziv-Lempel and arithmetic coding for both images and finely quantized Gaussian sources.
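The "successive approximation" property of TSVQ comes from encoding each vector as a path through a tree of code vectors, so a truncated bit stream still decodes to a usable coarse reproduction. The sketch below illustrates that idea with a binary tree grown by repeated two-way splits; the pruning step and the universal noiseless back end from the paper are omitted, and all function names and parameters are illustrative assumptions.

```python
# Minimal TSVQ sketch, assuming a balanced binary tree grown by repeated
# two-way splits of the training vectors. Truncating a vector's bit path at
# any depth still decodes to a coarser reproduction, which is the
# "successive approximation" behaviour behind progressive transmission.
import numpy as np

class TSVQNode:
    def __init__(self, centroid):
        self.centroid = centroid   # reproduction vector for this node
        self.children = None       # (left, right) children, or None for a leaf

def build_tsvq(vectors, depth):
    """Grow a binary TSVQ tree of at most `depth` levels over training vectors."""
    node = TSVQNode(vectors.mean(axis=0))
    if depth == 0 or len(vectors) < 2:
        return node
    # Crude 2-means split: start from two perturbed copies of the node centroid.
    c0, c1 = node.centroid * 0.99, node.centroid * 1.01 + 1e-6
    for _ in range(10):
        left = (np.linalg.norm(vectors - c0, axis=1)
                <= np.linalg.norm(vectors - c1, axis=1))
        if left.all() or (~left).all():
            return node            # degenerate split: keep this node as a leaf
        c0, c1 = vectors[left].mean(axis=0), vectors[~left].mean(axis=0)
    node.children = (build_tsvq(vectors[left], depth - 1),
                     build_tsvq(vectors[~left], depth - 1))
    return node

def encode(tree, v):
    """Descend the tree, emitting one bit per level: the progressive code."""
    bits, node = [], tree
    while node.children:
        left, right = node.children
        go_right = np.linalg.norm(v - right.centroid) < np.linalg.norm(v - left.centroid)
        bits.append(int(go_right))
        node = right if go_right else left
    return bits

def decode(tree, bits, depth=None):
    """Reproduce a vector from a (possibly truncated) bit path."""
    node = tree
    for b in bits[:depth]:
        if not node.children:
            break
        node = node.children[b]
    return node.centroid
```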
Data Compression in the Petascale Astronomy Era: a GERLUMPH case study
As the volume of data grows, astronomers are increasingly faced with choices on what data to keep -- and what to throw away. Recent work evaluating the JPEG2000 (ISO/IEC 15444) standards as a future data format standard in astronomy has shown promising results on observational data. However, there is still a need to evaluate its potential on other types of astronomical data, such as data from numerical simulations. GERLUMPH (the GPU-Enabled High Resolution cosmological MicroLensing parameter survey) represents an example of a data-intensive project in theoretical astrophysics. In the next phase of processing, the ~27 terabyte GERLUMPH dataset is set to grow by a factor of 100 -- well beyond the current storage capabilities of the supercomputing facility on which it resides. In order to minimise bandwidth usage, file transfer time, and storage space, this work evaluates several data compression techniques. Specifically, we investigate off-the-shelf and custom lossless compression algorithms as well as the lossy JPEG2000 compression format. Results of lossless compression algorithms on GERLUMPH data products show small compression ratios (1.35:1 to 4.69:1 of input file size) varying with the nature of the input data. Our results suggest that JPEG2000 could be suitable for other numerical datasets stored as gridded or volumetric data. When approaching lossy data compression, one should keep in mind the intended purposes of the data to be compressed and evaluate the effect of the loss on future analysis. In our case study, lossy compression and a high compression ratio do not significantly compromise the intended use of the data for constraining quasar source profiles from cosmological microlensing.
Comment: 15 pages, 9 figures, 5 tables. Published in the Special Issue of Astronomy & Computing on The future of astronomical data format
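Compression ratios of the kind quoted above (e.g. 1.35:1 to 4.69:1) are simply the raw size divided by the compressed size. The sketch below shows one way to measure them for a gridded array using off-the-shelf lossless compressors from the Python standard library; it illustrates the measurement only and is not the evaluation pipeline used for the GERLUMPH data.

```python
# Measure per-file lossless compression ratios for a gridded array using a
# few standard-library compressors. The synthetic test grid is a stand-in,
# not survey data; real ratios depend on the nature of the input.
import bz2
import lzma
import zlib
import numpy as np

def compression_ratios(array):
    """Return {compressor name: ratio}, where ratio = raw bytes / compressed bytes."""
    raw = array.tobytes()
    compressors = {
        "zlib": lambda b: zlib.compress(b, 9),
        "bz2":  lambda b: bz2.compress(b, 9),
        "lzma": lambda b: lzma.compress(b),
    }
    return {name: len(raw) / len(fn(raw)) for name, fn in compressors.items()}

if __name__ == "__main__":
    # Smooth gridded data compresses far better than noisy data, which is why
    # the achievable ratio varies with the nature of the input.
    x = np.linspace(0, 8 * np.pi, 1024)
    smooth = np.sin(x)[:, None] * np.cos(x)[None, :]
    noisy = smooth + 0.01 * np.random.default_rng(0).normal(size=smooth.shape)
    for label, grid in [("smooth", smooth), ("noisy", noisy)]:
        ratios = compression_ratios(grid.astype(np.float32))
        print(label, {k: f"{v:.2f}:1" for k, v in ratios.items()})
```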