Cores of Cooperative Games in Information Theory
Cores of cooperative games are ubiquitous in information theory, and arise
most frequently in the characterization of fundamental limits in various
scenarios involving multiple users. Examples include classical settings in
network information theory such as Slepian-Wolf source coding and multiple
access channels, classical settings in statistics such as robust hypothesis
testing, and new settings at the intersection of networking and statistics such
as distributed estimation problems for sensor networks. Cooperative game theory
allows one to understand aspects of all of these problems from a fresh and
unifying perspective that treats users as players in a game, sometimes leading
to new insights. At the heart of these analyses are fundamental dualities that
have been long studied in the context of cooperative games; for information
theoretic purposes, these are dualities between information inequalities on the
one hand and properties of rate, capacity or other resource allocation regions
on the other.

Comment: 12 pages, published at
http://www.hindawi.com/GetArticle.aspx?doi=10.1155/2008/318704 in EURASIP
Journal on Wireless Communications and Networking, Special Issue on "Theory
and Applications in Multiuser/Multiterminal Communications", April 2008
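As a concrete instance of the duality, here is a minimal sketch assuming only the standard Slepian-Wolf setup (notation mine, not quoted from the paper): the conditional-entropy inequalities that cut out the rate region are exactly the blocking constraints that define the core of a cooperative game.

```latex
% Slepian-Wolf: rates (R_1,...,R_n) for sources X_1,...,X_n are achievable
% iff every coalition S of users (complement S^c) satisfies
\[
  \sum_{i \in S} R_i \;\ge\; H\bigl(X_S \mid X_{S^c}\bigr)
  \qquad \forall\, S \subseteq N = \{1, \dots, n\}.
\]
% Cooperative game (N, v) with v(S) = H(X_S | X_{S^c}); its core is
\[
  \mathrm{core}(v) = \Bigl\{ x \in \mathbb{R}^n :
  \sum_{i \in N} x_i = H(X_N), \;
  \sum_{i \in S} x_i \ge v(S) \;\; \forall\, S \subseteq N \Bigr\},
\]
% i.e. the minimal-sum-rate face of the Slepian-Wolf region: rate
% allocations that no coalition can improve on by coding separately.
```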
Improving Table Compression with Combinatorial Optimization
We study the problem of compressing massive tables within the
partition-training paradigm introduced by Buchsbaum et al. [SODA'00], in which
a table is partitioned by an off-line training procedure into disjoint
intervals of columns, each of which is compressed separately by a standard,
on-line compressor like gzip. We provide a new theory that unifies previous
experimental observations on partitioning and heuristic observations on column
permutation, all of which are used to improve compression rates. Based on the
theory, we devise the first on-line training algorithms for table compression,
which can be applied to individual files, not just continuously operating
sources; and also a new, off-line training algorithm, based on a link to the
asymmetric traveling salesman problem, which improves on prior work by
rearranging columns prior to partitioning. We demonstrate these results
experimentally. On various test files, the on-line algorithms provide 35-55%
improvement over gzip with negligible slowdown; the off-line reordering
provides up to 20% further improvement over partitioning alone. We also show
that a variation of the table compression problem is MAX-SNP hard.

Comment: 22 pages, 2 figures, 5 tables, 23 references. Extended abstract
appears in Proc. 13th ACM-SIAM SODA, pp. 213-222, 2002
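To illustrate the partitioning idea, here is an off-line dynamic-programming sketch, not the paper's on-line training algorithm: it finds the cost-optimal split of columns into contiguous intervals by trying every boundary. The table serialization and helper names are assumptions for illustration.

```python
# Sketch: optimal contiguous partition of a table's columns into intervals,
# each compressed separately with zlib (a stand-in for gzip).
import zlib

def compressed_size(table, lo, hi):
    """Bytes needed to compress columns [lo, hi) serialized row by row."""
    blob = "\n".join(",".join(row[lo:hi]) for row in table).encode()
    return len(zlib.compress(blob))

def best_partition(table):
    m = len(table[0])                      # number of columns
    best = [0] + [float("inf")] * m        # best[j]: min size of cols [0, j)
    cut = [0] * (m + 1)                    # back-pointers for the cuts
    for j in range(1, m + 1):
        for i in range(j):
            cost = best[i] + compressed_size(table, i, j)
            if cost < best[j]:
                best[j], cut[j] = cost, i
    bounds, j = [], m                      # recover interval boundaries
    while j > 0:
        bounds.append((cut[j], j))
        j = cut[j]
    return best[m], list(reversed(bounds))

table = [["1", "a", "x"], ["2", "a", "y"], ["3", "b", "x"]]
print(best_partition(table))
```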
Network correlated data gathering with explicit communication: NP-completeness and algorithms
We consider the problem of correlated data gathering in a network with a sink node and a tree-based communication structure, where the goal is to minimize the total transmission cost of transporting the information collected by the nodes to the sink. For source coding of correlated data, we consider a joint entropy-based coding model with explicit communication, where coding is simple but optimizing the transmission structure is difficult. We first formulate the optimization problem in the general case, and then study a network setting where the entropy conditioning at nodes does not depend on the amount of side information, but only on its availability. We prove that even in this simple case, the optimization problem is NP-hard. We propose efficient, scalable, and distributed heuristic approximation algorithms for solving this problem and show by numerical simulations that the total transmission cost can be significantly reduced compared with direct transmission or a shortest path tree. We also present an approximation algorithm that provides a tree transmission structure with total cost within a constant factor of the optimal.
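For intuition about the cost model, here is a minimal sketch of the shortest-path-tree baseline the abstract compares against, assuming each node ships a fixed rate along its shortest path to the sink; the graph, node names, and rates below are hypothetical.

```python
# Each node i ships rate[i] bits to the sink along its shortest path; an
# edge is billed its weight times the bits crossing it, so a node's cost
# is its rate times its shortest-path distance to the sink.
import heapq

def shortest_path_tree(adj, sink):
    """Dijkstra from the sink; returns parent pointers and distances."""
    dist = {sink: 0.0}
    parent = {sink: None}
    pq = [(0.0, sink)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if v not in dist or d + w < dist[v]:
                dist[v] = d + w
                parent[v] = u
                heapq.heappush(pq, (d + w, v))
    return parent, dist

def spt_cost(adj, sink, rate):
    parent, dist = shortest_path_tree(adj, sink)
    # Total cost: each node's rate pays for the full weight of its path.
    return sum(r * dist[u] for u, r in rate.items() if u != sink)

adj = {                       # undirected graph as adjacency lists
    "s": [("a", 1.0), ("b", 2.0)],
    "a": [("s", 1.0), ("b", 1.0)],
    "b": [("s", 2.0), ("a", 1.0)],
}
print(spt_cost(adj, "s", {"a": 4.0, "b": 3.0}))   # rates in bits
```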
Segregating Event Streams and Noise with a Markov Renewal Process Model
DS and MP are supported by EPSRC Leadership Fellowship EP/G007144/1
Decentralized Erasure Codes for Distributed Networked Storage
We consider the problem of constructing an erasure code for storage over a
network when the data sources are distributed. Specifically, we assume that
there are n storage nodes with limited memory and k<n sources generating the
data. We want a data collector, who can appear anywhere in the network, to
query any k storage nodes and be able to retrieve the data. We introduce
Decentralized Erasure Codes, which are linear codes with a specific randomized
structure inspired by network coding on random bipartite graphs. We show that
decentralized erasure codes are optimally sparse, and lead to reduced
communication, storage and computation cost over random linear coding.

Comment: to appear in IEEE Transactions on Information Theory, Special Issue:
Networking and Information Theory
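A toy sketch of the decentralized encode/decode idea, assuming symbols in a prime field and an illustrative fan-out: it follows the general recipe of random linear combinations formed at storage nodes, not the paper's exact construction or tuned parameters.

```python
# Each of k sources routes its symbol to a few storage nodes chosen at
# random; each node keeps one random linear combination of what arrived.
# A collector querying any k nodes solves a k x k system mod P.
import random

P = 2**31 - 1                                   # a Mersenne prime

def encode(data, n, fanout):
    k = len(data)
    coeffs = [[0] * k for _ in range(n)]        # node-by-source coefficients
    stored = [0] * n
    for i, x in enumerate(data):                # each source picks nodes
        for node in random.sample(range(n), fanout):
            c = random.randrange(1, P)
            coeffs[node][i] = c
            stored[node] = (stored[node] + c * x) % P
    return coeffs, stored

def decode(rows, vals):
    """Gauss-Jordan elimination mod P on the system rows * data = vals."""
    k = len(vals)
    A = [row[:] + [v] for row, v in zip(rows, vals)]
    for col in range(k):
        piv = next(r for r in range(col, k) if A[r][col] % P)
        A[col], A[piv] = A[piv], A[col]
        inv = pow(A[col][col], P - 2, P)        # Fermat inverse
        A[col] = [a * inv % P for a in A[col]]
        for r in range(k):
            if r != col and A[r][col]:
                f = A[r][col]
                A[r] = [(a - f * b) % P for a, b in zip(A[r], A[col])]
    return [A[r][k] for r in range(k)]

data = [7, 42, 99]                              # k = 3 source symbols
coeffs, stored = encode(data, n=8, fanout=3)
while True:
    picked = random.sample(range(8), 3)         # query any k nodes
    try:
        print(decode([coeffs[i] for i in picked],
                     [stored[i] for i in picked]))
        break
    except StopIteration:                       # singular subset; re-query
        pass
```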
Traffic-Redundancy Aware Network Design
We consider network design problems for information networks where routers
can replicate data but cannot alter it. This functionality allows the network
to eliminate data-redundancy in traffic, thereby saving on routing costs. We
consider two problems within this framework and design approximation
algorithms.
The first problem we study is the traffic-redundancy aware network design
(RAND) problem. We are given a weighted graph over a single server and many
clients. The server owns a number of different data packets and each client
desires a subset of the packets; the client demand sets form a laminar set
system. Our goal is to connect every client to the source via a single path,
such that the collective cost of the resulting network is minimized. Here the
transportation cost over an edge is its weight times the number of
distinct packets that it carries.
The second problem is a facility location problem that we call RAFL. Here the
goal is to find an assignment from clients to facilities such that the total
cost of routing packets from the facilities to clients (along unshared paths),
plus the total cost of "producing" one copy of each desired packet at each
facility is minimized.
We present a constant factor approximation for the RAFL and an O(log P)
approximation for RAND, where P is the total number of distinct packets. We
remark that P is always at most the number of different demand sets desired or
the number of clients, and is generally much smaller.

Comment: 17 pages. To be published in the proceedings of the Twenty-Third
Annual ACM-SIAM Symposium on Discrete Algorithms
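A small sketch of the RAND cost objective, assuming hypothetical paths and demand sets: each edge is billed its weight times the number of distinct packets it carries, since routers replicate a packet for all downstream clients rather than resending it per client.

```python
# Cost of a given solution: every client reaches the server by a single
# path, and an edge pays weight * |distinct packets crossing it|.
from collections import defaultdict

def rand_cost(paths, demands, weight):
    """paths: client -> list of edges; demands: client -> set of packets."""
    packets_on_edge = defaultdict(set)
    for client, path in paths.items():
        for edge in path:
            packets_on_edge[edge] |= demands[client]   # replication dedupes
    return sum(weight[e] * len(pkts) for e, pkts in packets_on_edge.items())

paths = {
    "c1": [("s", "u"), ("u", "c1")],
    "c2": [("s", "u"), ("u", "c2")],
}
demands = {"c1": {"p1", "p2"}, "c2": {"p1"}}   # laminar: {p1} inside {p1,p2}
weight = {("s", "u"): 3, ("u", "c1"): 1, ("u", "c2"): 2}
print(rand_cost(paths, demands, weight))       # 3*2 + 1*2 + 2*1 = 10
```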