Malleable coding for updatable cloud caching
In software-as-a-service applications provisioned through cloud computing, locally cached data are often modified with updates from new versions. In some cases, with each edit, one may want to preserve both the original and new versions. In this paper, we focus on cases in which only the latest version must be preserved. Furthermore, it is desirable for the data not only to be compressed but also to be easily modified during updates, since representing information and modifying the representation both incur cost. We examine whether it is possible to have both compression efficiency and ease of alteration, in order to promote codeword reuse. In other words, we study the feasibility of a malleable and efficient coding scheme. We examine the tradeoff between compression efficiency and malleability cost (the difficulty of synchronizing compressed versions), with malleability cost measured as the length of a reused prefix portion. The region of achievable rates and malleability is found. Drawing from prior work on common information problems, we show that efficient data compression may not be the best engineering design principle when storing software-as-a-service data. In the general case, goals of efficiency and malleability are fundamentally in conflict. This work was supported in part by an NSF Graduate Research Fellowship (LRV), Grant CCR-0325774, and Grant CCF-0729069. This work was presented at the 2011 IEEE International Symposium on Information Theory [1] and the 2014 IEEE International Conference on Cloud Engineering [2]. The associate editor coordinating the review of this paper and approving it for publication was R. Thobaben. Accepted manuscript
On palimpsests in neural memory: an information theory viewpoint
The finite capacity of neural memory and the
reconsolidation phenomenon suggest it is important to be able
to update stored information as in a palimpsest, where new
information overwrites old information. Moreover, changing
information in memory is metabolically costly. In this paper, we
suggest that information-theoretic approaches may inform the
fundamental limits in constructing such a memory system. In
particular, we define malleable coding, which considers not only
representation length but also ease of representation update,
thereby encouraging some form of recycling to convert an old
codeword into a new one. Malleability cost is the difficulty of
synchronizing compressed versions, and malleable codes are of
particular interest when representing information and modifying
the representation are both expensive. We examine the tradeoff
between compression efficiency and malleability cost, under a
malleability metric defined with respect to a string edit distance.
This introduces a metric topology to the compressed domain. We
characterize the exact set of achievable rates and malleability as
the solution of a subgraph isomorphism problem. This is all done
within the optimization approach to biology framework. Accepted manuscript
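To make the idea of an edit-distance malleability metric concrete, here is a minimal sketch. The abstract says only "a string edit distance", so the choice of Levenshtein distance, the function name `edit_distance`, and the example bit strings are all assumptions made for illustration, not the paper's definition.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest insertions, deletions, and substitutions
    needed to turn string a into string b (assumed metric, for illustration)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical old and new codewords stored in memory.
old_codeword = "1011001"
new_codeword = "1010011"
print(edit_distance(old_codeword, new_codeword))
```

Under this reading, a smaller distance between the old and new codewords means less of the stored representation has to be rewritten.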
Malleable Coding with Fixed Reuse
In cloud computing, storage area networks, remote backup storage, and similar
settings, stored data is modified with updates from new versions. Representing
information and modifying the representation are both expensive. Therefore it
is desirable for the data to not only be compressed but to also be easily
modified during updates. A malleable coding scheme considers both compression
efficiency and ease of alteration, promoting codeword reuse. We examine the
trade-off between compression efficiency and malleability cost (the difficulty
of synchronizing compressed versions), measured as the length of a reused prefix
portion. Through a coding theorem, the region of achievable rates and
malleability is expressed as a single-letter optimization. Relationships to
common information problems are also described.
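As a rough illustration of prefix reuse, the sketch below measures how much of an old codeword could be kept when it is replaced by a new one. The helper names `reused_prefix_length` and `reuse_fraction` and the example codewords are hypothetical. The paper's scheme fixes the reused portion by design, whereas this sketch only measures the shared prefix after the fact.

```python
def reused_prefix_length(old: str, new: str) -> int:
    """Number of leading symbols shared by the old and new codewords."""
    n = 0
    for x, y in zip(old, new):
        if x != y:
            break
        n += 1
    return n


def reuse_fraction(old: str, new: str) -> float:
    """Fraction of the new codeword that can be kept from the old one."""
    if not new:
        return 0.0
    return reused_prefix_length(old, new) / len(new)


# Hypothetical compressed versions before and after an update.
old_codeword = "110100"
new_codeword = "110111"
print(reused_prefix_length(old_codeword, new_codeword))
print(reuse_fraction(old_codeword, new_codeword))
```

Only the symbols after the shared prefix need to be rewritten in storage, which is why a longer reused prefix corresponds to a cheaper update.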
More Efficient Algorithms and Analyses for Unequal Letter Cost Prefix-Free Coding
There is a large literature devoted to the problem of finding an optimal
(min-cost) prefix-free code with an unequal letter-cost encoding alphabet of
size. While there is no known polynomial-time algorithm for solving it
optimally, there are many good heuristics that all provide additive errors to
optimal. The additive error in these algorithms usually depends linearly upon
the largest encoding letter size.
This paper was motivated by the problem of finding optimal codes when the
encoding alphabet is infinite. Because the largest letter cost is infinite, the
previous analyses could give infinite error bounds. We provide a new algorithm
that works with infinite encoding alphabets. When restricted to the finite
alphabet case, our algorithm often provides better error bounds than the best
previous ones known. Comment: 29 pages; 9 figures
Evolutionary Approaches to Minimizing Network Coding Resources
We wish to minimize the resources used for network coding while achieving the
desired throughput in a multicast scenario. We employ evolutionary approaches,
based on a genetic algorithm, that avoid the computational complexity that
makes the problem NP-hard. Our experiments show great improvements over the
sub-optimal solutions of prior methods. Our new algorithms improve over our
previously proposed algorithm in three ways. First, whereas the previous
algorithm can be applied only to acyclic networks, our new method works also
with networks with cycles. Second, we enrich the set of components used in the
genetic algorithm, which improves the performance. Third, we develop a novel
distributed framework. Combining distributed random network coding with our
distributed optimization yields a network coding protocol where the resources
used for coding are optimized in the setup phase by running our evolutionary
algorithm at each node of the network. We demonstrate the effectiveness of our
approach by carrying out simulations on a number of different sets of network
topologies. Comment: 9 pages, 6 figures, accepted to the 26th Annual IEEE Conference on
Computer Communications (INFOCOM 2007)
Weak universality in sensory tradeoffs
For many organisms, the number of sensory neurons is largely determined
during development, before strong environmental cues are present. This is
despite the fact that environments can fluctuate drastically both from
generation to generation and within an organism's lifetime. How can organisms
get by by hard-coding the number of sensory neurons? We approach this question
using rate-distortion theory. A combination of simulation and theory suggests
that when environments are large, the rate-distortion function (a proxy for
material costs, timing delays, and energy requirements) depends only on
coarse-grained environmental statistics that are expected to change on
evolutionary, rather than ontogenetic, timescales.
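For reference, the standard textbook definition of the rate-distortion function is given below. The symbols $X$ (environmental state), $\hat{X}$ (sensory representation), $d$ (distortion measure), and $D$ (tolerated distortion) are notation assumed here and may differ from the paper's own.

```latex
R(D) \;=\; \min_{p(\hat{x}\mid x)\,:\; \mathbb{E}\left[d(X,\hat{X})\right] \le D} I(X;\hat{X})
```

Here $I(X;\hat{X})$ is the mutual information between the source and its representation, so $R(D)$ is the smallest rate at which an average distortion of at most $D$ is achievable.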
Source Coding for Quasiarithmetic Penalties
Huffman coding finds a prefix code that minimizes mean codeword length for a
given probability distribution over a finite number of items. Campbell
generalized the Huffman problem to a family of problems in which the goal is to
minimize not mean codeword length but rather a generalized mean known as a
quasiarithmetic or quasilinear mean. Such generalized means have a number of
diverse applications, including applications in queueing. Several
quasiarithmetic-mean problems have novel simple redundancy bounds in terms of a
generalized entropy. A related property involves the existence of optimal
codes: For "well-behaved" cost functions, optimal codes always exist for
(possibly infinite-alphabet) sources having finite generalized entropy. Solving
finite instances of such problems is done by generalizing an algorithm for
finding length-limited binary codes to a new algorithm for finding optimal
binary codes for any quasiarithmetic mean with a convex cost function. This
algorithm can be performed using quadratic time and linear space, and can be
extended to other penalty functions, some of which are solvable with similar
space and time complexity, and others of which are solvable with slightly
greater complexity. This reduces the computational complexity of a problem
involving minimum delay in a queue, allows combinations of previously
considered problems to be optimized, and greatly expands the space of problems
solvable in quadratic time and linear space. The algorithm can be extended for
purposes such as breaking ties among possibly different optimal codes, as with
bottom-merge Huffman coding. Comment: 22 pages, 3 figures, submitted to IEEE Trans. Inform. Theory, revised
per suggestions of reader
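A minimal sketch of the two objects this abstract contrasts: ordinary Huffman coding, which minimizes mean codeword length, and a quasiarithmetic mean of the lengths, defined as the inverse of phi applied to the sum of p_i * phi(l_i). The function names and the example distribution are assumptions for illustration. This is not the paper's algorithm, and Huffman lengths are in general not optimal for a non-linear penalty.

```python
import heapq
import math
from itertools import count


def huffman_lengths(probs):
    """Codeword lengths of a binary Huffman code for the given probabilities."""
    if len(probs) == 1:
        return [1]
    tie = count()  # tie-breaker so the heap never compares the index lists
    heap = [(p, next(tie), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:  # every symbol under the merged node gains one bit
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, next(tie), s1 + s2))
    return lengths


def quasiarithmetic_mean(probs, lengths, phi, phi_inv):
    """Generalized mean phi_inv(sum_i p_i * phi(l_i)) of the codeword lengths."""
    return phi_inv(sum(p * phi(l) for p, l in zip(probs, lengths)))


probs = [0.5, 0.25, 0.125, 0.125]  # hypothetical source distribution
lengths = huffman_lengths(probs)

# The identity cost recovers the ordinary mean codeword length.
mean_length = quasiarithmetic_mean(probs, lengths, lambda x: x, lambda y: y)

# An exponential cost phi(x) = 2**x penalizes long codewords more heavily.
exp_mean = quasiarithmetic_mean(probs, lengths, lambda x: 2.0 ** x, math.log2)

print(lengths, mean_length, exp_mean)
```

Swapping `phi` changes only how the lengths are scored, so the same helper can compare candidate codes under different penalties.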