PIR Array Codes with Optimal Virtual Server Rate
There has been much recent interest in Private Information Retrieval (PIR) in
models where a database is stored across several servers using coding
techniques from distributed storage, rather than being simply replicated. In
particular, a recent breakthrough result of Fazeli, Vardy and Yaakobi
introduces the notion of a PIR code and a PIR array code, and uses this notion
to produce efficient PIR protocols.
In this paper we are interested in designing PIR array codes. We consider the
case when we have $m$ servers, with each server storing a fraction $(1/s)$ of
the bits of the database; here $s$ is a fixed rational number with $s > 1$. A
PIR array code with the $k$-PIR property enables a $k$-server PIR protocol
(with $k \leq m$) to be emulated on $m$ servers, with the overall storage
requirements of the protocol being reduced. The communication complexity of a
PIR protocol reduces as $k$ grows, so the virtual server rate, defined to be
$k/m$, is an important parameter. We study the maximum virtual server rate of a
PIR array code with the $k$-PIR property. We present upper bounds on the
achievable virtual server rate, some constructions, and ideas on how to obtain PIR
array codes with the highest possible virtual server rate. In particular, we
present constructions that asymptotically meet our upper bounds, and the exact
largest virtual server rate is obtained when $1 < s \leq 2$.
A $k$-PIR code (and similarly a $k$-PIR array code) is also a locally
repairable code with symbol availability $k-1$. Such a code ensures $k$
parallel reads for each information symbol. So the virtual server rate is very
closely related to the symbol availability of the code when used as a locally
repairable code. The results of this paper are discussed also in this context,
where subspace codes also have an important role.
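The $k$-PIR property can be checked by brute force on a toy example. The sketch below is illustrative only and is not taken from the paper: it assumes a binary code with one stored cell per server (the simplest array code), takes the virtual server rate to be $k/m$ for $m$ servers, and uses hypothetical helper names. Three servers store $x_1$, $x_2$ and $x_1 + x_2$, so each holds half of a two-bit database.

```python
from itertools import combinations

def in_span(target, vectors):
    """True if target is a GF(2) combination of the given bitmask vectors."""
    reachable = {0}
    for v in vectors:
        reachable |= {u ^ v for u in reachable}
    return target in reachable

def max_disjoint_recovery_sets(columns, target):
    """Largest number of pairwise disjoint server sets that each recover target."""
    m = len(columns)
    recovering = [frozenset(c)
                  for size in range(1, m + 1)
                  for c in combinations(range(m), size)
                  if in_span(target, [columns[j] for j in c])]

    def pack(candidates, used):
        # Exhaustive search; fine for toy sizes only.
        best = 0
        for idx, s in enumerate(candidates):
            if not (s & used):
                best = max(best, 1 + pack(candidates[idx + 1:], used | s))
        return best

    return pack(recovering, frozenset())

def pir_property(columns, num_bits):
    """Largest k such that every information bit has k disjoint recovery sets."""
    return min(max_disjoint_recovery_sets(columns, 1 << i)
               for i in range(num_bits))

# Servers store x1, x2 and x1 + x2 (bit i of the mask selects x_{i+1}).
columns = [0b01, 0b10, 0b11]
k = pir_property(columns, num_bits=2)
m = len(columns)
print(f"k = {k}, m = {m}, virtual server rate = {k}/{m}")
```

In this toy case $x_1$ can be read from server 1 alone or from servers 2 and 3 together, so a 2-server protocol can be emulated on 3 servers.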
Some new constructions of optimal linear codes and alphabet-optimal $(r,\delta)$-locally repairable codes
In distributed storage systems, locally repairable codes (LRCs) are designed
to reduce disk I/O and repair costs by enabling recovery of each code symbol
from a small number of other symbols. To handle multiple node failures,
$(r,\delta)$-LRCs are introduced to enable local recovery in the event of up to
$\delta-1$ failed nodes. Constructing optimal $(r,\delta)$-LRCs has been a
significant research topic over the past decade. In \cite{Luo2022}, Luo
\emph{et al.} proposed a construction of linear codes by using unions of some
projective subspaces within a projective space. Several new classes of Griesmer
codes and distance-optimal codes were constructed, and some of them were proved
to be alphabet-optimal $(r,\delta)$-LRCs.
In this paper, we first modify the method of constructing linear codes in
\cite{Luo2022} by considering a more general situation of intersecting
projective subspaces. This modification enables us to construct good codes with
more flexible parameters. Additionally, we present the conditions for the
constructed linear codes to qualify as Griesmer codes or achieve distance
optimality. Next, we explore the locality of linear codes constructed by
eliminating elements from a complete projective space. The novelty of our work
lies in establishing the locality as , , or -locality,
in contrast to the previous literature that only considered -locality.
Moreover, by combining analysis of code parameters and the C-M like bound for
$(r,\delta)$-LRCs, we construct some alphabet-optimal $(r,\delta)$-LRCs which
may or may not be Griesmer codes. Finally, we investigate the
availability and alphabet-optimality of $(r,\delta)$-LRCs constructed from our
modified framework. Comment: 25 pages
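For readers unfamiliar with the term, a Griesmer code is a linear $[n, k, d]_q$ code whose length meets the Griesmer bound $n \geq \sum_{i=0}^{k-1} \lceil d / q^i \rceil$ with equality. The following is a minimal sketch of that check; the function names are hypothetical and the example codes are standard textbook ones, not codes from the paper.

```python
def griesmer_length(k, d, q):
    """Smallest length allowed by the Griesmer bound for an [n, k, d]_q code."""
    # -(-d // q**i) is integer ceiling division of d by q**i.
    return sum(-(-d // q**i) for i in range(k))

def is_griesmer(n, k, d, q):
    """True if the parameters [n, k, d]_q meet the Griesmer bound with equality."""
    return n == griesmer_length(k, d, q)

# The binary [7, 3, 4] simplex code meets the bound: 4 + 2 + 1 = 7.
print(is_griesmer(7, 3, 4, 2))
# The binary [7, 4, 3] Hamming code also meets it: 3 + 2 + 1 + 1 = 7.
print(is_griesmer(7, 4, 3, 2))
```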
Coding for the Clouds: Coding Techniques for Enabling Security, Locality, and Availability in Distributed Storage Systems
Cloud systems have become the backbone of many applications such as multimedia
streaming, e-commerce, and cluster computing. At the foundation of any cloud architecture
lies a large-scale, distributed, data storage system. To accommodate the massive
amount of data being stored on the cloud, these distributed storage systems (DSS) have
been scaled to contain hundreds to thousands of nodes that are connected through a networking
infrastructure. Such data-centers are usually built out of commodity components,
which makes failures the norm rather than the exception.
In order to combat node failures, data is typically stored in a redundant fashion. Due to
the exponential data growth rate, many DSS are beginning to resort to error control coding
over conventional replication methods, as coding offers high storage space efficiency. This
paradigm shift from replication to coding, along with the need to guarantee reliability, efficiency,
and security in DSS, has created a new set of challenges and opportunities, opening
up a new area of research. This thesis addresses several of these challenges and opportunities
by broadly making the following contributions. (i) We design practically amenable,
low-complexity coding schemes that guarantee security of cloud systems, ensure quick
recovery from failures, and provide high availability for retrieving partial information; and
(ii) We analyze fundamental performance limits and optimal trade-offs between the key
performance metrics of these coding schemes.
More specifically, we first consider the problem of achieving information-theoretic
security in DSS against an eavesdropper that can observe a limited number of nodes. We
present a framework that enables design of secure repair-efficient codes through a joint
construction of inner and outer codes. Then, we consider a practically appealing notion
of weakly secure coding, and construct coset codes that can weakly secure a wide class of regenerating codes that reduce the amount of data downloaded during node repair.
Second, we consider the problem of meeting repair locality constraints, which specify
the number of nodes participating in the repair process. We propose a notion of unequal
locality, which enables different locality values for different nodes, ensuring quick recovery
for nodes storing important data. We establish tight upper bounds on the minimum
distance of linear codes with unequal locality, and present optimal code constructions.
Next, we extend the notion of locality from the Hamming metric to the rank and subspace
metrics, with the goal of designing codes for efficient data recovery from special types of
correlated failures in DSS. We construct a family of locally recoverable rank-metric codes
with optimal data recovery properties.
Finally, we consider the problem of providing high availability, which is ensured by
enabling node repair from multiple disjoint subsets of nodes of small size. We study
codes with availability from a queuing-theoretical perspective by analyzing the average
time necessary to download a block of data under the Poisson request arrival model when
each node takes a random amount of time to fetch its contents. We compare the delay
performance of the availability codes with several alternatives such as conventional erasure
codes and replication schemes.
Codes with efficient erasure correction
Distributed storage systems are becoming increasingly ubiquitous in the emerging era of Internet of Things. Major internet technology companies employ large-scale distributed storage systems to accommodate the massive amounts of data generated and requested by global users. The need for reliable and efficient storage of immense amounts of data calls for new applications and development of classical error-correcting codes.
This dissertation is devoted to a study of codes with efficient erasure correction for distributed storage systems. The efficiency of erasure correction is often assessed by two performance metrics, bandwidth and locality. In this dissertation we address several problems for each of these two metrics. We construct families of codes with optimal communication complexity for erasure correction ("repair bandwidth") for a heterogeneous storage model, and derive several results for the problem of optimal repair of Reed-Solomon codes. We also construct families of cyclic and convolutional codes with locality, extending the range of parameters for which such families were previously known.
A family of optimal locally recoverable codes
A code over a finite alphabet is called locally recoverable (LRC) if every
symbol in the encoding is a function of a small number (at most $r$) other
symbols. We present a family of LRC codes that attain the maximum possible
value of the distance for a given locality parameter and code cardinality. The
codewords are obtained as evaluations of specially constructed polynomials over
a finite field, and reduce to a Reed-Solomon code if the locality parameter $r$
is set to be equal to the code dimension. The size of the code alphabet for
most parameters is only slightly greater than the code length. The recovery
procedure is performed by polynomial interpolation over $r$ points. We also
construct codes with several disjoint recovering sets for every symbol. This
construction enables the system to conduct several independent and simultaneous
recovery processes of a specific symbol by accessing different parts of the
codeword. This property enables high availability of frequently accessed data
("hot data"). Comment: Minor changes. This is the final published version of the paper
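The evaluation-and-interpolation idea can be illustrated on a small instance. The sketch below is an assumption-laden toy, not the paper's general construction: it fixes the field $\mathbb{F}_{13}$, length $n = 9$, dimension $k = 4$, locality $r = 2$, and the polynomial $g(x) = x^3$, which is constant on each of the three groups of evaluation points. All names are hypothetical. Within a group the encoding polynomial behaves like a polynomial of degree less than $r$, so a lost symbol is recovered by interpolating through the other $r = 2$ symbols of its group.

```python
P = 13  # field size (prime), chosen for illustration
# Cosets of the order-3 subgroup {1, 3, 9} of the nonzero elements mod 13.
GROUPS = [(1, 3, 9), (2, 6, 5), (4, 12, 10)]

def encode(msg):
    """Evaluate f(x) = a00 + a01*g(x) + (a10 + a11*g(x))*x at all 9 points."""
    a00, a01, a10, a11 = msg
    codeword = {}
    for group in GROUPS:
        for x in group:
            g = pow(x, 3, P)  # g(x) = x^3 is constant on each group
            codeword[x] = (a00 + a01 * g + (a10 + a11 * g) * x) % P
    return codeword

def recover(codeword, lost):
    """Recover the symbol at point `lost` from the other two in its group."""
    group = next(g for g in GROUPS if lost in g)
    x1, x2 = [x for x in group if x != lost]
    y1, y2 = codeword[x1], codeword[x2]
    # Restricted to one group, f is a line in x; interpolate it mod P.
    # pow(..., -1, P) computes a modular inverse (Python 3.8+).
    slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    return (y1 + slope * (lost - x1)) % P

codeword = encode((1, 2, 3, 4))
repaired = {x: recover(codeword, x) for x in codeword}
print(repaired == codeword)
```

Each repair reads only two symbols, whereas decoding the full message would need at least $k = 4$ of them.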
Optimal Locally Repairable Linear Codes
Linear erasure codes with local repairability are desirable for distributed
data storage systems. An [n, k, d] code having all-symbol (r,
{\delta})-locality, denoted as (r, {\delta})a, is considered optimal if it also
meets the minimum Hamming distance bound. The existing results on the existence
and the construction of optimal (r, {\delta})a codes are limited to only the
special case of {\delta} = 2, and to only two small regions within this special
case, namely, m = 0 or m >= (v+{\delta}-1) > ({\delta}-1), where m = n mod
(r+{\delta}-1) and v = k mod r. This paper investigates the existence
conditions and presents deterministic constructive algorithms for optimal (r,
{\delta})a codes with general r and {\delta}. First, a structure theorem is
derived for general optimal (r, {\delta})a codes which helps illuminate some of
their structure properties. Next, the entire problem space with arbitrary n, k,
r and {\delta} is divided into eight different cases (regions) with regard to
the specific relations of these parameters. For two cases, it is rigorously
proved that no optimal (r, {\delta})a code can exist. For four other cases the
optimal (r, {\delta})a codes are shown to exist, deterministic constructions
are proposed and the lower bound on the required field size for these
algorithms to work is provided. Our new constructive algorithms not only cover
more cases, but for the same cases where previous algorithms exist, the new
constructions require a considerably smaller field, which translates to
potentially lower computational complexity. Our findings substantially enrich
the knowledge on (r, {\delta})a codes, leaving only two cases in which the
existence of optimal codes is yet to be determined. Comment: Under Review
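The abstract does not state the distance bound it refers to. The sketch below assumes it is the usual Singleton-like bound for (r, {\delta})-locality, d <= n - k + 1 - (ceil(k/r) - 1)({\delta} - 1), and also computes the region parameters m = n mod (r+{\delta}-1) and v = k mod r defined above. Function names are hypothetical.

```python
def lrc_distance_bound(n, k, r, delta=2):
    """Assumed Singleton-like upper bound on d for (r, delta)-locality."""
    ceil_k_over_r = -(-k // r)  # integer ceiling of k / r
    return n - k + 1 - (ceil_k_over_r - 1) * (delta - 1)

def region_parameters(n, k, r, delta=2):
    """Return (m, v) with m = n mod (r + delta - 1) and v = k mod r."""
    return n % (r + delta - 1), k % r

# Example: n = 9, k = 4, r = 2, delta = 2 gives the bound d <= 5.
print(lrc_distance_bound(9, 4, 2, 2))
print(region_parameters(9, 4, 2, 2))
```

With {\delta} = 2 the expression reduces to n - k - ceil(k/r) + 2, the familiar bound for single-erasure locality.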