3 research outputs found

    Compression and Conditional Emulation of Climate Model Output

    Numerical climate model simulations run at high spatial and temporal resolutions generate massive quantities of data. As our computing capabilities continue to increase, storing all of the data is not sustainable, and thus it is important to develop methods for representing the full datasets by smaller compressed versions. We propose a statistical compression and decompression algorithm based on storing a set of summary statistics as well as a statistical model describing the conditional distribution of the full dataset given the summary statistics. The statistical model can be used to generate realizations representing the full dataset, along with characterizations of the uncertainties in the generated data. Thus, the methods are capable of both compression and conditional emulation of the climate models. Considerable attention is paid to accurately modeling the original dataset (one year of daily mean temperature data), particularly with regard to the inherent spatial nonstationarity in global fields, and to determining the statistics to be stored, so that the variation in the original data can be closely captured, while allowing for fast decompression and conditional emulation on modest computers.
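
    The core idea (store summary statistics plus a conditional model, then draw realizations instead of reading back the original values) can be illustrated with a deliberately simplified sketch. The snippet below assumes an independent Gaussian model per grid cell and invented array shapes; the paper's actual nonstationary spatial model is not reproduced here.

    # Minimal sketch of statistical compression and conditional emulation.
    # Assumption: an independent Gaussian model per grid cell; the paper's
    # nonstationary spatial model is far richer than this.
    import numpy as np

    def compress(daily_temps):
        """Reduce a (days, cells) array to per-cell summary statistics."""
        return {"mean": daily_temps.mean(axis=0),
                "std": daily_temps.std(axis=0, ddof=1),
                "ndays": daily_temps.shape[0]}

    def emulate(summary, rng=None):
        """Draw one realization consistent with the stored statistics."""
        rng = np.random.default_rng() if rng is None else rng
        return rng.normal(summary["mean"], summary["std"],
                          size=(summary["ndays"], summary["mean"].size))

    # Usage: 365 days x 1000 grid cells of synthetic temperatures.
    original = 15 + 10 * np.random.default_rng(0).standard_normal((365, 1000))
    stats = compress(original)    # two numbers per cell instead of 365
    surrogate = emulate(stats)    # a realization, not the original values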

    High-throughput variable-to-fixed entropy codec using selective, stochastic code forests

    Efficient high-throughput (HT) compression algorithms are paramount to meet the stringent constraints of present and upcoming data storage, processing, and transmission systems. In particular, latency, bandwidth, and energy requirements are critical for those systems. Most HT codecs are designed to maximize compression speed and, secondarily, to minimize compressed lengths. On the other hand, decompression speed is often equally or more critical than compression speed, especially in scenarios where decompression is performed multiple times and/or at critical parts of a system. In this work, an algorithm to design variable-to-fixed (VF) codes is proposed that prioritizes decompression speed. Stationary Markov analysis is employed to generate multiple, jointly optimized codes (denoted code forests). Their average compression efficiency is on par with the state of the art in VF codes, e.g., within 1% of Yamamoto et al.'s algorithm. The proposed code forest structure enables the implementation of highly efficient codecs, with decompression speeds 3.8 times faster than other state-of-the-art HT entropy codecs with equal or better compression ratios for natural data sources. Compared to these HT codecs, the proposed forests yield similar compression efficiency and speeds.
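
    As a rough illustration of variable-to-fixed coding (parsing variable-length source words and emitting fixed-length codewords), the sketch below builds a single Tunstall code for a memoryless binary source. This is not the paper's construction: the proposed method derives jointly optimized code forests from a stationary Markov analysis, and the alphabet and codeword length here are made up for the example.

    # Minimal Tunstall (variable-to-fixed) code sketch for a memoryless source.
    # Illustrative only: the paper builds jointly optimized code forests,
    # which this single-tree construction does not attempt.
    import heapq

    def tunstall(probs, codeword_bits):
        """Build a VF dictionary of at most 2**codeword_bits source words,
        each mapped to a fixed-length codeword index."""
        max_words = 2 ** codeword_bits
        # Min-heap over negative probabilities acts as a max-heap.
        heap = [(-p, sym) for sym, p in probs.items()]
        heapq.heapify(heap)
        # Expand the most probable source word while the dictionary still fits.
        while len(heap) + len(probs) - 1 <= max_words:
            neg_p, word = heapq.heappop(heap)
            for sym, p in probs.items():
                heapq.heappush(heap, (neg_p * p, word + sym))
        return {word: idx for idx, (_, word) in enumerate(sorted(heap))}

    # Usage: a skewed binary source, 3-bit fixed-length codewords.
    table = tunstall({"a": 0.8, "b": 0.2}, codeword_bits=3)
    for word, code in table.items():
        print(f"{word!r} -> {code:03b}")   # 03b matches codeword_bits=3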

    Implementation of Lossless Preprocessing Technique for Student Record System

    The Implementation of Lossless Preprocessing Technique for Student Record System of the Lyceum of the Philippines University mainly aims to apply an effective data compression algorithm to efficiently store data, improve data transmission via web-based infrastructure, and ensure the security of records. The system comprises the following modules: the Smart ID Registration Module, the Client-Side Application Module, and the Information Kiosk Module. For enhanced security, GZIP, also known as the GNU ZIP algorithm, is implemented to compress records before they are saved in the central database. It is a lossless data compression utility based on the deflate algorithm, with the format defined in Internet Engineering Task Force RFC 1951: DEFLATE Compressed Data Format Specification version 1.3. This standard references the use of the LZ77 (Lempel-Ziv, 1977) compression algorithm combined with Huffman coding. The lossless decompression, on the other hand, restores the data by bringing back the removed redundancy and produces an exact replica of the original source data. Results of the software evaluation indicate high acceptability of the overall system performance.
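
    The compress-before-store and decompress-on-read steps can be sketched in a few lines with Python's gzip module, which implements the same DEFLATE format (LZ77 plus Huffman coding) defined in RFC 1951. The record fields below are illustrative, not the system's actual schema, and very small records may not shrink; the gain comes with larger payloads.

    # Minimal sketch: GZIP-compress a student record before storage and
    # restore an exact replica on retrieval (lossless round trip).
    # The field names are illustrative, not the system's actual schema.
    import gzip
    import json

    record = {"student_id": "2024-00123",
              "name": "Juan Dela Cruz",
              "program": "BS Computer Science"}

    # Compress before saving to the central database.
    raw = json.dumps(record).encode("utf-8")
    compressed = gzip.compress(raw)

    # Lossless decompression brings back the removed redundancy exactly.
    restored = json.loads(gzip.decompress(compressed).decode("utf-8"))
    assert restored == record
    print(len(raw), "raw bytes,", len(compressed), "compressed bytes")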