On Source Coding with Coded Side Information for a Binary Source with Binary Side Information
The lossless rate region for the coded side information problem is "solved," but its solution is expressed in terms of an auxiliary random variable. As a result, finding the rate region for any fixed example requires an optimization over a family of allowed auxiliary random variables. While intuitive constructions are easy to come by and optimal solutions are known under some special conditions, proving the optimal solution is surprisingly difficult even for examples as basic as a binary source with binary side information. We derive the optimal auxiliary random variables and corresponding achievable rate regions for a family of problems where both the source and side information are binary. Our solution involves first tightening known bounds on the alphabet size of the auxiliary random variable and then optimizing the auxiliary random variable subject to this constraint. The technique used to tighten the bound on the alphabet size applies to a variety of problems beyond the one studied here.
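For orientation, the single-letter region the abstract refers to is usually written as follows. This is the standard textbook form rather than a formula quoted from the paper; the symbols X (source), Y (side information), U (auxiliary random variable), and the rate names R_X and R_Y are notation assumed here.

```latex
% Coded side information (source X, helper observing Y), standard form.
% (R_X, R_Y) is achievable iff, for some conditional pmf p(u|y) such that
% X -> Y -> U forms a Markov chain,
\begin{align}
  R_X &\ge H(X \mid U), \\
  R_Y &\ge I(Y; U).
\end{align}
% Evaluating the region for a fixed p(x, y) means optimizing over p(u|y),
% with the alphabet of U limited by a cardinality bound.
```

The optimization over p(u|y) is the step the abstract describes as difficult even when X and Y are both binary.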
Providing Private and Fast Data Access for Cloud Systems
Cloud storage and computing systems have become the backbone of many applications such as streaming (Netflix, YouTube), storage (Dropbox, Google Drive), and computing (Amazon Elastic Computing, Microsoft Azure). To address the ever growing demand for storage and computing requirements of these applications, cloud services are typically implemented over a large-scale distributed data storage system. Cloud systems are expected to provide the following two pivotal services for the users: 1) private content access and 2) fast content access. The goal of this thesis is to understand and address some of the challenges that need to be overcome to provide these two services.
The first part of this thesis focuses on private data access in distributed systems. In particular, we contribute to the areas of Private Information Retrieval (PIR) and Private Computation (PC). In the PIR problem, there is a user who wishes to privately retrieve a subset of files belonging to a database stored on a single or multiple remote server(s). In the PC problem, the user wants to privately compute functions of a subset of files in the database. The PIR and PC problems seek the most efficient solutions with the minimum download cost that enable the user to retrieve or compute what it wants privately.
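As a rough illustration of the PIR setting only, the sketch below implements the classic two-server XOR scheme for a replicated database of single bits. It is not one of the schemes developed in the thesis: it assumes two non-colluding servers, ignores side information and download-cost optimality, and every function name is hypothetical.

```python
import secrets


def make_queries(num_records, wanted):
    """Build two index sets whose symmetric difference is exactly {wanted}.

    Each set on its own is a uniformly random subset of the indices, so a
    single server learns nothing about `wanted` (assuming no collusion).
    """
    q1 = {i for i in range(num_records) if secrets.randbits(1)}
    q2 = q1 ^ {wanted}  # toggle membership of the wanted index
    return q1, q2


def server_answer(database, query):
    """A server returns the XOR of the database bits selected by the query."""
    answer = 0
    for i in query:
        answer ^= database[i]
    return answer


def retrieve(database, wanted):
    """User side: XOR the two answers; all bits except `wanted` cancel."""
    q1, q2 = make_queries(len(database), wanted)
    return server_answer(database, q1) ^ server_answer(database, q2)


if __name__ == "__main__":
    db = [1, 0, 1, 1, 0, 0, 1, 0]
    print([retrieve(db, i) for i in range(len(db))] == db)
```

In this toy scheme the user downloads one bit from each server regardless of the database size; the upload (the queries) is what grows with the number of records.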
We establish fundamental bounds on the minimum download cost required for guaranteeing the privacy requirement in some practical and realistic settings of the PIR and PC problems and develop novel and efficient privacy-preserving algorithms for these settings. In particular, we study the single-server and multi-server settings of PIR in which the user initially has a random linear combination of a subset of files in the database as side information, referred to as PIR with coded side information. We also study the multi-server setting of the PC in which the user wants to privately compute multiple linear combinations of a subset of files in the database, referred to as Private Linear Transformation.
The second part of this thesis focuses on fast content access in distributed systems. In particular, we study the use of erasure coding to handle data access requests in distributed storage and computing systems. Service rate region is an important performance metric for coded distributed systems, which expresses the set of all data access request rates that can be simultaneously served by the system. In this context, two classes of problems arise: 1) characterizing the service rate region of a given storage scheme and finding the optimal request allocation, and 2) designing the underlying erasure code to handle a given desired service rate region.
As contributions along the first class of problems, we characterize the service rate region of systems with some common coding schemes such as Simplex codes and Reed-Muller codes by introducing two novel techniques: 1) fractional matching and vertex cover on graph representation of codes, and 2) geometric representations of codes. Moreover, along the second class of code design, we establish some lower bounds on the minimum storage required to handle a desired service rate region for a coded distributed system and, in some regimes, we design efficient storage schemes that provide the desired service rate region while minimizing the storage requirements.
Malleable coding for updatable cloud caching
In software-as-a-service applications provisioned through cloud computing, locally cached data are often modified with updates from new versions. In some cases, with each edit, one may want to preserve both the original and new versions. In this paper, we focus on cases in which only the latest version must be preserved. Furthermore, it is desirable for the data to not only be compressed but to also be easily modified during updates, since representing information and modifying the representation both incur cost. We examine whether it is possible to have both compression efficiency and ease of alteration, in order to promote codeword reuse. In other words, we study the feasibility of a malleable and efficient coding scheme. The tradeoff between compression efficiency and malleability cost (the difficulty of synchronizing compressed versions) is measured as the length of a reused prefix portion. The region of achievable rates and malleability is found. Drawing from prior work on common information problems, we show that efficient data compression may not be the best engineering design principle when storing software-as-a-service data. In the general case, goals of efficiency and malleability are fundamentally in conflict. This work was supported in part by an NSF Graduate Research Fellowship (LRV), Grant CCR-0325774, and Grant CCF-0729069. This work was presented at the 2011 IEEE International Symposium on Information Theory [1] and the 2014 IEEE International Conference on Cloud Engineering [2]. The associate editor coordinating the review of this paper and approving it for publication was R. Thobaben. (CCR-0325774 - NSF Graduate Research Fellowship; CCF-0729069 - NSF Graduate Research Fellowship) Accepted manuscript
On feedback in network source coding
We consider source coding over networks with
unlimited feedback from the sinks to the sources. We first show
examples of networks where the rate region with feedback is
a strict superset of that without feedback. Next, we find an
achievable region for multiterminal lossy source coding with
feedback. Finally, we evaluate this region for the case when one
of the sources is fully known at the decoder and use the result
to show that this region is a strict superset of the best known
achievable region for the problem without feedback.
Orthogonal Multiple Access with Correlated Sources: Feasible Region and Pragmatic Schemes
In this paper, we consider orthogonal multiple access coding schemes, where
correlated sources are encoded in a distributed fashion and transmitted,
through additive white Gaussian noise (AWGN) channels, to an access point (AP).
At the AP, component decoders, associated with the source encoders, iteratively
exchange soft information by taking into account the source correlation. The
first goal of this paper is to investigate the ultimate achievable performance
limits in terms of a multi-dimensional feasible region in the space of channel
parameters, deriving insights on the impact of the number of sources. The
second goal is the design of pragmatic schemes, where the sources use
"off-the-shelf" channel codes. In order to analyze the performance of given
coding schemes, we propose an extrinsic information transfer (EXIT)-based
approach, which allows us to determine the corresponding multi-dimensional
feasible regions. On the basis of the proposed analytical framework, the
performance of pragmatic coded schemes, based on serially concatenated
convolutional codes (SCCCs), is discussed.
On Two-Pair Two-Way Relay Channel with an Intermittently Available Relay
When multiple users share the same resource for physical layer cooperation
such as relay terminals in their vicinities, this shared resource may not be
always available for every user, and it is critical for transmitting terminals
to know whether other users have access to that common resource in order to
better utilize it. Failing to learn this critical piece of information may
cause severe issues in the design of such cooperative systems. In this paper,
we address this problem by investigating a two-pair two-way relay channel with
an intermittently available relay. In the model, each pair of users needs to
exchange their messages within their own pair via the shared relay. The shared
relay, however, is only intermittently available for the users to access. The
accessing activities of different pairs of users are governed by independent
Bernoulli random processes. Our main contribution is the characterization of
the capacity region to within a bounded gap in a symmetric setting, for both
delayed and instantaneous state information at transmitters. An interesting
observation is that the bottleneck for information flow is the quality of state
information (delayed or instantaneous) available at the relay, not that at the
end users. To the best of our knowledge, our work is the first result regarding
how the shared intermittent relay should cooperate with multiple pairs of users
in such a two-way cooperative network. Comment: extended version of ISIT 2015 paper
Broadcast Caching Networks with Two Receivers and Multiple Correlated Sources
The correlation among the content distributed across a cache-aided broadcast
network can be exploited to reduce the delivery load on the shared wireless
link. This paper considers a two-user three-file network with correlated
content, and studies its fundamental limits for the worst-case demand. A class
of achievable schemes based on a two-step source coding approach is proposed.
Library files are first compressed using Gray-Wyner source coding, and then
cached and delivered using a combination of correlation-unaware cache-aided
coded multicast schemes. The second step is interesting in its own right and
considers a multiple-request caching problem, whose solution requires coding in
the placement phase. A lower bound on the optimal peak rate-memory trade-off is
derived, which is used to evaluate the performance of the proposed scheme. It
is shown that for symmetric sources the two-step strategy achieves the lower
bound for large cache capacities, and it is within half of the joint entropy of
two of the sources conditioned on the third source for all other cache sizes. Comment: in Proceedings of Asilomar Conference on Signals, Systems and
Computers, Pacific Grove, California, November 201
Hierarchical Coded Caching
Caching of popular content during off-peak hours is a strategy to reduce
network loads during peak hours. Recent work has shown significant benefits of
designing such caching strategies not only to deliver part of the content
locally, but also to provide coded multicasting opportunities even among users
with different demands. Exploiting both of these gains was shown to be
approximately optimal for caching systems with a single layer of caches.
Motivated by practical scenarios, we consider in this work a hierarchical
content delivery network with two layers of caches. We propose a new caching
scheme that combines two basic approaches. The first approach provides coded
multicasting opportunities within each layer; the second approach provides
coded multicasting opportunities across multiple layers. By striking the right
balance between these two approaches, we show that the proposed scheme achieves
the optimal communication rates to within a constant multiplicative and
additive gap. We further show that there is no tension between the rates in
each of the two layers up to the aforementioned gap. Thus, both layers can
simultaneously operate at approximately the minimum rate. Comment: 31 pages
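The coded multicasting opportunity mentioned in the abstract can be seen in the well-known single-layer toy case with two files, two users, and caches that each hold one file's worth of data. The sketch below covers only that single-layer case, not the hierarchical two-layer scheme proposed in the paper; the function names are hypothetical, and both files are assumed to have the same even length.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split(file_bytes):
    """Split a file into two halves."""
    half = len(file_bytes) // 2
    return file_bytes[:half], file_bytes[half:]


def place(file_a, file_b):
    """Placement phase: each user caches one half of every file."""
    a1, a2 = split(file_a)
    b1, b2 = split(file_b)
    cache1 = {"A1": a1, "B1": b1}
    cache2 = {"A2": a2, "B2": b2}
    return cache1, cache2


def deliver(file_a, file_b):
    """Delivery phase for demands (user 1 wants A, user 2 wants B).

    A single coded message A2 xor B1 serves both users at once.
    """
    _, a2 = split(file_a)
    b1, _ = split(file_b)
    return xor_bytes(a2, b1)


def decode_user1(cache1, coded):
    """User 1 strips its cached B1 from the message to recover A2."""
    return cache1["A1"] + xor_bytes(coded, cache1["B1"])


def decode_user2(cache2, coded):
    """User 2 strips its cached A2 from the message to recover B1."""
    return xor_bytes(coded, cache2["A2"]) + cache2["B2"]


if __name__ == "__main__":
    file_a, file_b = b"AAAAaaaa", b"BBBBbbbb"
    cache1, cache2 = place(file_a, file_b)
    coded = deliver(file_a, file_b)
    print(decode_user1(cache1, coded) == file_a)
    print(decode_user2(cache2, coded) == file_b)
```

The shared link carries half a file here, whereas sending the two missing halves separately would cost a full file; the local caching gain and the coded multicasting gain are both visible in this small case.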