4 research outputs found

    Local recovery in data compression for general sources

    Source coding is concerned with optimally compressing data, so that it can be reconstructed up to a specified distortion from its compressed representation. Usually, in fixed-length compression, a sequence of n symbols (from some alphabet) is encoded to a sequence of k symbols (bits). The decoder produces an estimate of the original sequence of n symbols from the encoded bits. The rate-distortion function characterizes the optimal rate of compression allowing a given distortion in reconstruction as n grows. This function depends on the source probability distribution. In a locally recoverable decoding, only a few compressed bits are accessed to reconstruct a single symbol. In this paper we find the limits of local recovery for rates near the rate-distortion function. For a wide set of source distributions, we show that it is possible to compress within ε of the rate-distortion function such that the local recoverability grows as Ω(log(1/ε)); that is, in order to recover one source symbol, at least Ω(log(1/ε)) bits of the compressed symbols are queried. We also show order-optimal impossibility results. Similar results are provided for lossless source coding.
    National Science Foundation (U.S.) (grant CCF 1318093); United States. Air Force. Office of Scientific Research (FA9550-11-1-0183); National Science Foundation (U.S.) (grant CCF-1319828)
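As a rough illustration of the quantities this abstract refers to, the sketch below evaluates the textbook rate-distortion function of a Bernoulli(p) source under Hamming distortion, R(D) = h(p) - h(D), and the log(1/ε) scale at which the abstract says local recoverability grows. The choice of source, the function names, and the omission of constants are assumptions made for illustration only; none of this is taken from the paper.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0 by convention."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bernoulli_rate_distortion(p: float, d: float) -> float:
    """R(D) in bits per symbol for a Bernoulli(p) source, Hamming distortion.

    Standard closed form: h(p) - h(D) for 0 <= D < min(p, 1 - p), else 0.
    """
    p = min(p, 1 - p)
    if d >= p:
        return 0.0
    return binary_entropy(p) - binary_entropy(d)


def locality_scale(eps: float) -> float:
    """log2(1/eps): the growth order named in the abstract, constants ignored."""
    return math.log2(1.0 / eps)


if __name__ == "__main__":
    # Hypothetical operating point: fair-coin source, 11% allowed distortion.
    rate = bernoulli_rate_distortion(0.5, 0.11)
    for eps in (0.1, 0.01, 0.001):
        print(f"rate {rate + eps:.4f} bits/symbol, log2(1/eps) = {locality_scale(eps):.2f}")
```

Shrinking ε by a factor of ten adds only about 3.3 to log2(1/ε), which is why the logarithmic scaling matters: the gap to R(D) can be made small while the number of bits queried per symbol grows slowly.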

    Malleable coding for updatable cloud caching

    In software-as-a-service applications provisioned through cloud computing, locally cached data are often modified with updates from new versions. In some cases, with each edit, one may want to preserve both the original and new versions. In this paper, we focus on cases in which only the latest version must be preserved. Furthermore, it is desirable for the data not only to be compressed but also to be easily modified during updates, since representing information and modifying the representation both incur cost. We examine whether it is possible to have both compression efficiency and ease of alteration, in order to promote codeword reuse. In other words, we study the feasibility of a malleable and efficient coding scheme. The tradeoff between compression efficiency and malleability cost (the difficulty of synchronizing compressed versions) is measured as the length of a reused prefix portion. The region of achievable rates and malleability is found. Drawing from prior work on common information problems, we show that efficient data compression may not be the best engineering design principle when storing software-as-a-service data. In the general case, goals of efficiency and malleability are fundamentally in conflict.
    This work was supported in part by an NSF Graduate Research Fellowship (LRV), Grant CCR-0325774, and Grant CCF-0729069. This work was presented at the 2011 IEEE International Symposium on Information Theory [1] and the 2014 IEEE International Conference on Cloud Engineering [2]. The associate editor coordinating the review of this paper and approving it for publication was R. Thobaben. (CCR-0325774 - NSF Graduate Research Fellowship; CCF-0729069 - NSF Graduate Research Fellowship)
    Accepted manuscript