Distributed Storage in Mobile Wireless Networks with Device-to-Device Communication
We consider the use of distributed storage (DS) to reduce the communication
cost of content delivery in wireless networks. Content is stored (cached) in a
number of mobile devices using an erasure correcting code. Users retrieve
content from other devices using device-to-device communication or from the
base station (BS), at the expense of higher communication cost. We address the
repair problem when a device storing data leaves the cell. We introduce a
repair scheduling scheme in which repair is performed periodically and derive analytical
expressions for the overall communication cost of content download and data
repair as a function of the repair interval. The derived expressions are then
used to evaluate the communication cost entailed by DS using several erasure
correcting codes. Our results show that DS can reduce the communication cost
with respect to the case where content is downloaded only from the BS, provided
that repairs are performed frequently enough. If devices storing content arrive
in the cell, the communication cost using DS is further reduced and, for large
enough arrival rate, it is always beneficial. Interestingly, we show that MDS
codes, which do not perform well for classical DS, can yield a low overall
communication cost in wireless DS.
Comment: After final editing for publication in TCO
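To make the storage-and-repair idea in the abstract above concrete, here is a minimal sketch, assuming the simplest possible MDS erasure code: a single XOR parity fragment over k data fragments, so any k of the k+1 stored fragments suffice. The function names (`encode`, `repair`) and the sample data are illustrative assumptions and are not taken from the paper, which evaluates several other erasure correcting codes.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length fragments."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(fragments):
    """Append one XOR parity fragment: a (k+1, k) single-parity MDS code."""
    parity = reduce(xor_bytes, fragments)
    return list(fragments) + [parity]


def repair(stored, lost_index):
    """Rebuild the fragment at lost_index from the k surviving fragments.

    This models a device leaving the cell: its fragment is regenerated
    from the fragments held by the remaining devices.
    """
    survivors = [f for i, f in enumerate(stored) if i != lost_index]
    return reduce(xor_bytes, survivors)


# Hypothetical content split into k = 3 fragments, one per device.
data = [b"abcd", b"efgh", b"ijkl"]
stored = encode(data)          # 4 devices now hold one fragment each
rebuilt = repair(stored, 1)    # the device holding fragment 1 departs
```

In this toy code every repair reads all k surviving fragments, which is the repair cost that the periodic repair analysis in the abstract has to weigh against the cost of downloading from the BS.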
Coding for the Clouds: Coding Techniques for Enabling Security, Locality, and Availability in Distributed Storage Systems
Cloud systems have become the backbone of many applications such as multimedia
streaming, e-commerce, and cluster computing. At the foundation of any cloud architecture
lies a large-scale, distributed, data storage system. To accommodate the massive
amount of data being stored on the cloud, these distributed storage systems (DSS) have
been scaled to contain hundreds to thousands of nodes that are connected through a networking
infrastructure. Such data centers are usually built out of commodity components,
which makes failures the norm rather than the exception.
In order to combat node failures, data is typically stored in a redundant fashion. Due to
the exponential data growth rate, many DSS are beginning to resort to error control coding
over conventional replication methods, as coding offers high storage space efficiency. This
paradigm shift from replication to coding, along with the need to guarantee reliability, efficiency,
and security in DSS, has created a new set of challenges and opportunities, opening
up a new area of research. This thesis addresses several of these challenges and opportunities
by broadly making the following contributions. (i) We design practically amenable,
low-complexity coding schemes that guarantee security of cloud systems, ensure quick
recovery from failures, and provide high availability for retrieving partial information; and
(ii) We analyze fundamental performance limits and optimal trade-offs between the key
performance metrics of these coding schemes.
More specifically, we first consider the problem of achieving information-theoretic
security in DSS against an eavesdropper that can observe a limited number of nodes. We
present a framework that enables design of secure repair-efficient codes through a joint
construction of inner and outer codes. Then, we consider a practically appealing notion
of weakly secure coding, and construct coset codes that can weakly secure a wide class of regenerating codes that reduce the amount of data downloaded during node repair.
Second, we consider the problem of meeting repair locality constraints, which specify
the number of nodes participating in the repair process. We propose a notion of unequal
locality, which enables different locality values for different nodes, ensuring quick recovery
for nodes storing important data. We establish tight upper bounds on the minimum
distance of linear codes with unequal locality, and present optimal code constructions.
Next, we extend the notion of locality from the Hamming metric to the rank and subspace
metrics, with the goal of designing codes for efficient data recovery from special types of
correlated failures in DSS. We construct a family of locally recoverable rank-metric codes
with optimal data recovery properties.
Finally, we consider the problem of providing high availability, which is ensured by
enabling node repair from multiple disjoint subsets of nodes of small size. We study
codes with availability from a queuing-theoretical perspective by analyzing the average
time necessary to download a block of data under the Poisson request arrival model when
each node takes a random amount of time to fetch its contents. We compare the delay
performance of the availability codes with several alternatives such as conventional erasure
codes and replication schemes.
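The notions of locality and availability described above can be illustrated with a small sketch, assuming a 2x2 grid of data blocks protected by row and column XOR parities (a simple product-code layout). This layout, the block values, and the helper name `xor` are assumptions chosen for illustration and are not one of the constructions from the thesis.

```python
def xor(*blocks: bytes) -> bytes:
    """Bytewise XOR of any number of equal-length blocks."""
    out = bytes(len(blocks[0]))
    for blk in blocks:
        out = bytes(x ^ y for x, y in zip(out, blk))
    return out


# Four data blocks arranged as a 2x2 grid:
#   a b | row0
#   c d | row1
#   ---------
#   col0 col1
a, b, c, d = b"\x01\x02", b"\x10\x20", b"\x0a\x0b", b"\xf0\x0f"
row0, row1 = xor(a, b), xor(c, d)
col0, col1 = xor(a, c), xor(b, d)

# Block a has two disjoint repair groups, each of size 2:
from_row = xor(b, row0)   # uses only nodes {b, row0}
from_col = xor(c, col0)   # uses only nodes {c, col0}
```

Each repair touches only two nodes (locality 2), and because the two groups share no node, block `a` can be served or rebuilt from either one while the other is busy (availability 2).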
Global repair bandwidth cost optimization of generalized regenerating codes in clustered distributed storage systems
In clustered distributed storage systems (CDSSs), one of the main design goals is to minimize the transmission cost incurred when repairing failed storage nodes. Generalized regenerating codes (GRCs) have been proposed to balance the intra-cluster and inter-cluster repair bandwidths while guaranteeing data availability. The trade-off performance of GRCs shows that they can reduce storage overhead and inter-cluster repair bandwidth simultaneously. However, in practical big data storage scenarios, GRCs do not effectively handle the heterogeneity of bandwidth costs among different clusters during node failure recovery. This paper proposes an asymmetric bandwidth allocation strategy (ABAS) of GRCs for inter-cluster repair in heterogeneous CDSSs. Furthermore, an upper bound on the achievable capacity of ABAS is derived based on the information flow graph (IFG), and the constraints on storage capacity and intra-cluster repair bandwidth are also elaborated. Then, a metric termed global repair bandwidth cost (GRBC) is defined, which can be minimized with respect to the inter-cluster repair bandwidths by solving a linear programming problem. The numerical results demonstrate that, while maintaining the same data availability and storage overhead, the proposed ABAS of GRCs can effectively reduce the GRBC compared to traditional symmetric bandwidth allocation schemes.
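The benefit of asymmetric over symmetric allocation can be sketched with a toy linear program. The abstract does not state the actual constraints derived from the information flow graph, so the model below is an assumption: minimize the sum of cost_i * bandwidth_i subject to a required total inter-cluster repair bandwidth and a per-cluster cap. For this particular toy LP, filling the cheapest clusters first is optimal, so no solver is needed. All names and numbers are hypothetical.

```python
def asymmetric_allocation(costs, caps, total):
    """Minimize sum(costs[i] * b[i]) s.t. sum(b) >= total, 0 <= b[i] <= caps[i].

    Toy stand-in for the paper's LP: with only a total-bandwidth constraint
    and per-cluster caps, the optimum fills the cheapest clusters first.
    """
    if sum(caps) < total:
        raise ValueError("required bandwidth exceeds total capacity")
    alloc = [0.0] * len(costs)
    remaining = total
    for i in sorted(range(len(costs)), key=lambda j: costs[j]):
        take = min(caps[i], remaining)
        alloc[i] = take
        remaining -= take
        if remaining <= 0:
            break
    return alloc


def total_cost(costs, alloc):
    """Weighted repair bandwidth cost of an allocation."""
    return sum(c * b for c, b in zip(costs, alloc))


# Hypothetical per-unit bandwidth costs and caps for three helper clusters.
costs = [1.0, 2.0, 5.0]
caps = [4.0, 4.0, 4.0]
total = 6.0

asym = asymmetric_allocation(costs, caps, total)
sym = [total / len(costs)] * len(costs)   # symmetric baseline: equal shares

asym_cost = total_cost(costs, asym)
sym_cost = total_cost(costs, sym)
```

With these made-up numbers the asymmetric allocation draws nothing from the most expensive cluster and halves the weighted cost relative to equal shares; the paper's actual savings depend on its IFG-based constraints, which this sketch does not model.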