Compression and Conditional Emulation of Climate Model Output
Numerical climate model simulations run at high spatial and temporal
resolutions generate massive quantities of data. As our computing capabilities
continue to increase, storing all of the data is not sustainable, and thus it
is important to develop methods for representing the full datasets by smaller
compressed versions. We propose a statistical compression and decompression
algorithm based on storing a set of summary statistics as well as a statistical
model describing the conditional distribution of the full dataset given the
summary statistics. The statistical model can be used to generate realizations
representing the full dataset, along with characterizations of the
uncertainties in the generated data. Thus, the methods are capable of both
compression and conditional emulation of the climate models. Considerable
attention is paid to accurately modeling the original dataset (one year of
daily mean temperature data), particularly with regard to the inherent spatial
nonstationarity in global fields, and to determining the statistics to be
stored so that the variation in the original data can be closely captured
while allowing for fast decompression and conditional emulation on modest
computers.
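The compress-then-conditionally-emulate idea can be illustrated with a deliberately simplified sketch: store only per-location summary statistics (mean and standard deviation of each daily series) and regenerate realizations by sampling from a conditional model. The independent-Gaussian model and the function names below are illustrative assumptions, not the paper's nonstationary spatial model.

```python
import random
import statistics

def compress(field):
    # field: dict mapping location -> list of daily values.
    # Keep only summary statistics (mean, population stdev) per location.
    return {loc: (statistics.mean(v), statistics.pstdev(v))
            for loc, v in field.items()}

def emulate(summary, n_days, seed=None):
    # Draw one realization of the full dataset from the conditional model.
    # Independent Gaussians are a crude stand-in for the paper's spatial model;
    # repeated calls with different seeds characterize the uncertainty.
    rng = random.Random(seed)
    return {loc: [rng.gauss(mu, sd) for _ in range(n_days)]
            for loc, (mu, sd) in summary.items()}
```

The compressed form stores two numbers per location instead of a full daily series, and emulation reproduces the stored mean and spread rather than the exact original values.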
High-throughput variable-to-fixed entropy codec using selective, stochastic code forests
Efficient high-throughput (HT) compression algorithms are paramount to meet the stringent constraints of present and upcoming data storage, processing, and transmission systems. In particular, latency, bandwidth, and energy requirements are critical for those systems. Most HT codecs are designed to maximize compression speed and, secondarily, to minimize compressed lengths. On the other hand, decompression speed is often equally or more critical than compression speed, especially in scenarios where decompression is performed multiple times and/or at critical parts of a system. In this work, an algorithm to design variable-to-fixed (VF) codes is proposed that prioritizes decompression speed. Stationary Markov analysis is employed to generate multiple, jointly optimized codes (denoted code forests). Their average compression efficiency is on par with the state of the art in VF codes, e.g., within 1% of Yamamoto et al.'s algorithm. The proposed code forest structure enables the implementation of highly efficient codecs, with decompression speeds 3.8 times faster than other state-of-the-art HT entropy codecs with equal or better compression ratios for natural data sources. Compared to these HT codecs, the proposed forests yield similar compression efficiency and speeds.
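Variable-to-fixed coding can be illustrated with the classic Tunstall construction, a simpler relative of the paper's code forests (not the proposed algorithm itself): variable-length source strings are parsed into fixed-size indices, so decoding is a single table lookup per code word, which is what makes VF decompression fast.

```python
def build_tunstall(probs, num_codes):
    # probs: dict symbol -> probability for a memoryless source.
    # Grow a parse tree: repeatedly split the most probable leaf into
    # its children until no further split fits within num_codes leaves.
    leaves = dict(probs)
    while len(leaves) + len(probs) - 1 <= num_codes:
        best = max(leaves, key=leaves.get)
        p = leaves.pop(best)
        for s, ps in probs.items():
            leaves[best + s] = p * ps
    return sorted(leaves)  # codebook: prefix-free set of parse strings

def encode(data, codebook):
    # Parse the input into codebook words, emitting a fixed-size index each.
    words = set(codebook)
    index = {w: i for i, w in enumerate(codebook)}
    out, buf = [], ""
    for ch in data:
        buf += ch
        if buf in words:
            out.append(index[buf])
            buf = ""
    return out, buf  # buf is an unparsed tail, if any

def decode(indices, codebook, tail=""):
    # The VF inverse: each fixed-size index is a direct table lookup.
    return "".join(codebook[i] for i in indices) + tail
```

Because the codebook is a complete prefix-free set, the decoder never branches on bit patterns; it only indexes a table, mirroring the decompression-speed argument made in the abstract.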
Implementation of Lossless Preprocessing Technique for Student Record System
The Implementation of Lossless Preprocessing Technique for Student Record System of the Lyceum of the Philippines University mainly aims to apply an effective data compression algorithm to efficiently store data, improve data transmission via web-based infrastructure, and ensure the security of records. The system comprises the following modules: the Smart ID Registration Module, the Client-Side Application Module, and the Information Kiosk Module. For enhanced security, GZIP, also known as the GNU ZIP algorithm, is implemented to compress records before saving them in the central database. It is a lossless data compression utility based on the deflate algorithm, with the format defined in Internet Engineering Task Force RFC 1951: DEFLATE Compressed Data Format Specification version 1.3. This standard references the use of the LZ77 (Lempel-Ziv, 1977) compression algorithm combined with Huffman coding. On the other hand, lossless decompression restores the data by bringing back the removed redundancy, producing an exact replica of the original source data. Results of the software evaluation indicate high acceptability of the overall system performance.
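The GZIP roundtrip described above can be sketched in a few lines of Python using the standard-library gzip module, which implements DEFLATE (LZ77 plus Huffman coding) per RFC 1951/1952. The student-record payload is invented for illustration; the system's actual code is not shown in the abstract.

```python
import gzip

# Hypothetical serialized student record (illustrative payload only).
record = b"student_id=2024-0117;name=Juan Dela Cruz;program=BSIT"

compressed = gzip.compress(record)      # DEFLATE: LZ77 + Huffman coding
restored = gzip.decompress(compressed)  # lossless: exact replica of the source

assert restored == record
```

The final assertion demonstrates the lossless property the abstract emphasizes: decompression yields a byte-for-byte replica of the original record.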