Rank Minimization over Finite Fields: Fundamental Limits and Coding-Theoretic Interpretations
This paper establishes information-theoretic limits in estimating a finite
field low-rank matrix given random linear measurements of it. These linear
measurements are obtained by taking inner products of the low-rank matrix with
random sensing matrices. Necessary and sufficient conditions on the number of
measurements required are provided. It is shown that these conditions are sharp
and the minimum-rank decoder is asymptotically optimal. The reliability
function of this decoder is also derived by appealing to de Caen's lower bound
on the probability of a union. The sufficient condition also holds when the
sensing matrices are sparse - a scenario that may be amenable to efficient
decoding. More precisely, it is shown that if the n\times n sensing matrices
contain, on average, \Omega(n\log n) entries, the number of measurements
required is the same as that when the sensing matrices are dense and contain
entries drawn uniformly at random from the field. Analogies are drawn between
the above results and rank-metric codes in the coding theory literature. In
fact, we are also strongly motivated by understanding when minimum rank
distance decoding of random rank-metric codes succeeds. To this end, we derive
distance properties of equiprobable and sparse rank-metric codes. These
distance properties provide a precise geometric interpretation of the fact that
the sparse ensemble requires as few measurements as the dense one. Finally, we
provide a non-exhaustive procedure to search for the unknown low-rank matrix.
Comment: Accepted to the IEEE Transactions on Information Theory; Presented at
IEEE International Symposium on Information Theory (ISIT) 201
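The measurement model described in this abstract can be illustrated with a minimal sketch over GF(2). The field size, the helper names (`gf2_rank`, `random_low_rank`, `sensing_matrix`, `measure`) and the `density` parameter are assumptions made for illustration only; the paper treats general finite fields and does not prescribe this code.

```python
import random

def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 rows."""
    packed = [int("".join(map(str, r)), 2) for r in rows]  # one int per row
    rank = 0
    while packed:
        pivot = packed.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            # Clear that bit from every remaining row that has it set.
            packed = [r ^ pivot if r & low else r for r in packed]
    return rank

def random_low_rank(n, r, rng):
    """X = U V over GF(2) with U n-by-r and V r-by-n, so rank(X) <= r."""
    U = [[rng.randrange(2) for _ in range(r)] for _ in range(n)]
    V = [[rng.randrange(2) for _ in range(n)] for _ in range(r)]
    return [[sum(U[i][k] & V[k][j] for k in range(r)) % 2 for j in range(n)]
            for i in range(n)]

def sensing_matrix(n, density, rng):
    """Random n-by-n 0/1 matrix; density=0.5 is uniform, smaller is sparse."""
    return [[1 if rng.random() < density else 0 for _ in range(n)]
            for _ in range(n)]

def measure(H, X):
    """Inner product <H, X> = sum_ij H[i][j] * X[i][j], reduced mod 2."""
    return sum(h & x for hr, xr in zip(H, X) for h, x in zip(hr, xr)) % 2

rng = random.Random(0)
X = random_low_rank(8, 2, rng)
measurements = [measure(sensing_matrix(8, 0.5, rng), X) for _ in range(20)]
```

A minimum-rank decoder would then search for the lowest-rank matrix consistent with all measurements; that search is not attempted here.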
Coding for the Clouds: Coding Techniques for Enabling Security, Locality, and Availability in Distributed Storage Systems
Cloud systems have become the backbone of many applications such as multimedia
streaming, e-commerce, and cluster computing. At the foundation of any cloud architecture
lies a large-scale, distributed, data storage system. To accommodate the massive
amount of data being stored on the cloud, these distributed storage systems (DSS) have
been scaled to contain hundreds to thousands of nodes that are connected through a networking
infrastructure. Such data-centers are usually built out of commodity components,
which make failures the norm rather than the exception.
In order to combat node failures, data is typically stored in a redundant fashion. Due to
the exponential data growth rate, many DSS are beginning to resort to error control coding
over conventional replication methods, as coding offers high storage space efficiency. This
paradigm shift from replication to coding, along with the need to guarantee reliability, efficiency,
and security in DSS, has created a new set of challenges and opportunities, opening
up a new area of research. This thesis addresses several of these challenges and opportunities
by broadly making the following contributions. (i) We design practically amenable,
low-complexity coding schemes that guarantee security of cloud systems, ensure quick
recovery from failures, and provide high availability for retrieving partial information; and
(ii) We analyze fundamental performance limits and optimal trade-offs between the key
performance metrics of these coding schemes.
More specifically, we first consider the problem of achieving information-theoretic
security in DSS against an eavesdropper that can observe a limited number of nodes. We
present a framework that enables design of secure repair-efficient codes through a joint
construction of inner and outer codes. Then, we consider a practically appealing notion
of weakly secure coding, and construct coset codes that can weakly secure a wide class of regenerating codes that reduce the amount of data downloaded during node repair.
Second, we consider the problem of meeting repair locality constraints, which specify
the number of nodes participating in the repair process. We propose a notion of unequal
locality, which enables different locality values for different nodes, ensuring quick recovery
for nodes storing important data. We establish tight upper bounds on the minimum
distance of linear codes with unequal locality, and present optimal code constructions.
Next, we extend the notion of locality from the Hamming metric to the rank and subspace
metrics, with the goal of designing codes for efficient data recovery from special types of
correlated failures in DSS. We construct a family of locally recoverable rank-metric codes
with optimal data recovery properties.
Finally, we consider the problem of providing high availability, which is ensured by
enabling node repair from multiple disjoint subsets of nodes of small size. We study
codes with availability from a queuing-theoretical perspective by analyzing the average
time necessary to download a block of data under the Poisson request arrival model when
each node takes a random amount of time to fetch its contents. We compare the delay
performance of the availability codes with several alternatives such as conventional erasure
codes and replication schemes.
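Repair locality, as described above, limits how many nodes take part in rebuilding a failed node. The toy sketch below is not one of the thesis constructions: it assumes a hypothetical layout in which data blocks are split into small groups, each protected by one XOR parity, so that a lost block is rebuilt from its own group only. The names `xor_blocks`, `encode` and `repair` are illustrative.

```python
def xor_blocks(*blocks):
    """Bytewise XOR of equal-length byte strings."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

def encode(data, group_size):
    """Split data blocks into groups and append one XOR parity per group."""
    groups = []
    for start in range(0, len(data), group_size):
        group = list(data[start:start + group_size])
        groups.append(group + [xor_blocks(*group)])
    return groups

def repair(group, lost):
    """Rebuild group[lost] using only the other blocks of the same group."""
    helpers = [b for j, b in enumerate(group) if j != lost]
    return xor_blocks(*helpers)

data = [b"ab", b"cd", b"ef", b"gh"]
groups = encode(data, group_size=2)   # two groups of 2 data blocks + 1 parity
recovered = repair(groups[0], lost=1)  # contacts 2 nodes, not all 5 others
```

With `group_size=2`, every single failure is repaired by reading two blocks. Giving different groups different sizes would be one simple way to picture the unequal-locality idea, although the thesis constructions are more general.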
Interleaving schemes for multidimensional cluster errors
We present two-dimensional and three-dimensional interleaving techniques for correcting two- and three-dimensional bursts (or clusters) of errors, where a cluster of errors is characterized by its area or volume. Correction of multidimensional error clusters is required in holographic storage, an emerging application of considerable importance. Our main contribution is the construction of efficient two-dimensional and three-dimensional interleaving schemes. The proposed schemes are based on t-interleaved arrays of integers, defined by the property that every connected component of area or volume t consists of distinct integers. In the two-dimensional case, our constructions are optimal: they have the lowest possible interleaving degree. That is, the resulting t-interleaved arrays contain the smallest possible number of distinct integers, hence minimizing the number of codewords required in an interleaving scheme. In general, we observe that the interleaving problem can be interpreted as a graph-coloring problem, and introduce the useful special class of lattice interleavers. We employ a result of Minkowski, dating back to 1904, to establish both upper and lower bounds on the interleaving degree of lattice interleavers in three dimensions. For the case t≡0 mod 6, the upper and lower bounds coincide, and the Minkowski lattice directly yields an optimal lattice interleaver. For t≢0 mod 6, we construct efficient lattice interleavers using approximations of the Minkowski lattice.
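The t-interleaved property can be checked mechanically on small two-dimensional arrays. The sketch below assumes that cells are connected when they share an edge. Under that assumption, two cells lie in a common connected set of t cells exactly when their Manhattan distance is at most t-1, provided the array has at least t cells, so it suffices to compare all such pairs. The function names and the linear labeling (a*i + b*j) mod m are illustrative choices, not the constructions of the paper.

```python
def is_t_interleaved(array, t):
    """True if any two cells at Manhattan distance <= t-1 hold distinct integers."""
    rows, cols = len(array), len(array[0])
    for i in range(rows):
        for j in range(cols):
            for di in range(-(t - 1), t):
                reach = t - 1 - abs(di)  # remaining distance budget along columns
                for dj in range(-reach, reach + 1):
                    k, l = i + di, j + dj
                    if (di, dj) == (0, 0):
                        continue  # a cell is never compared with itself
                    if 0 <= k < rows and 0 <= l < cols:
                        if array[i][j] == array[k][l]:
                            return False
    return True

def lattice_array(rows, cols, a, b, m):
    """Label cell (i, j) with (a*i + b*j) mod m, a simple periodic labeling."""
    return [[(a * i + b * j) % m for j in range(cols)] for i in range(rows)]

checkerboard = lattice_array(6, 6, 1, 1, 2)   # 2 distinct integers
five_labels = lattice_array(10, 10, 1, 2, 5)  # 5 distinct integers
```

Here the checkerboard passes the check for t = 2 but fails for t = 3, while the five-label array passes for t = 3 and fails for t = 4.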
Importance of Symbol Equity in Coded Modulation for Power Line Communications
The use of multiple frequency shift keying modulation with permutation codes
addresses the problem of permanent narrowband noise disturbance in a power line
communications system. In this paper, we extend this coded modulation scheme
based on permutation codes to general codes and introduce an additional new
parameter that more precisely captures a code's performance against permanent
narrowband noise. As a result, we define a new class of codes, namely,
equitable symbol weight codes, which are optimal with respect to this measure.