Multiset Combinatorial Batch Codes
Batch codes, first introduced by Ishai, Kushilevitz, Ostrovsky, and Sahai,
mimic a distributed storage of a set of data items on servers, in such
a way that any batch of data items can be retrieved by reading at most some
symbols from each server. Combinatorial batch codes are replication-based
batch codes in which each server stores a subset of the data items.
In this paper, we propose a generalization of combinatorial batch codes,
called multiset combinatorial batch codes (MCBC), in which data items are
stored in servers, such that any multiset request of items, where any
item is requested at most times, can be retrieved by reading at most
items from each server. The setup of this new family of codes is motivated by
recent work on codes which enable high availability and parallel reads in
distributed storage systems. The main problem under this paradigm is to
minimize the number of items stored in the servers, given the values of
, which is denoted by . We first give a necessary and
sufficient condition for the existence of MCBCs. Then, we present several
bounds on and constructions of MCBCs. In particular, we
determine the value of for any , where
is the maximum size of a binary constant weight code of length
, distance four and weight . We also determine the exact value of
when or
Load-Balanced Fractional Repetition Codes
We introduce load-balanced fractional repetition (LBFR) codes, which are a
strengthening of fractional repetition (FR) codes. LBFR codes have the
additional property that multiple node failures can be sequentially repaired by
downloading no more than one block from any other node. This allows for better
use of the network, and can additionally reduce the number of disk reads
necessary to repair multiple nodes. We characterize LBFR codes in terms of
their adjacency graphs, and use this characterization to present explicit
constructions of LBFR codes with storage capacity comparable to existing FR codes.
Surprisingly, in some parameter regimes, our constructions of LBFR codes match
the parameters of the best constructions of FR codes.
Repairable Block Failure Resilient Codes
In large-scale distributed storage systems (DSS) deployed in cloud computing,
correlated failures resulting in simultaneous failure (or, unavailability) of
blocks of nodes are common. In such scenarios, the stored data or the content of
a failed node can only be reconstructed from the available live nodes belonging
to available blocks. To analyze the resilience of the system against such block
failures, this work introduces the framework of Block Failure Resilient (BFR)
codes, wherein the data (e.g., a file in DSS) can be decoded by reading the
same number of codeword symbols (nodes) from each available block of the
underlying codeword. Further, repairable BFR codes are introduced, wherein any
codeword symbol in a failed block can be repaired by contacting remaining
blocks in the system. Motivated by regenerating codes, file size bounds for
repairable BFR codes are derived, the trade-off between per-node storage and repair
bandwidth is analyzed, and BFR-MSR and BFR-MBR points are derived. Explicit
codes achieving these two operating points for a wide set of parameters are
constructed by utilizing combinatorial designs, wherein the codewords of the
underlying outer codes are distributed to BFR codeword symbols according to
projective planes.
Locality and Availability in Distributed Storage
This paper studies the problem of code symbol availability: a code symbol is
said to have -availability if it can be reconstructed from disjoint
groups of other symbols, each of size at most . For example, -replication
supports -availability as each symbol can be read from its other
(disjoint) replicas, i.e., . However, the rate of replication must vanish
like as the availability increases.
This paper shows that it is possible to construct codes that can support a
scaling number of parallel reads while keeping the rate at an arbitrarily
high constant. It further shows that this is possible with the minimum distance
arbitrarily close to the Singleton bound. This paper also presents a bound
demonstrating a trade-off between minimum distance, availability and locality.
Our codes match the aforementioned bound and their construction relies on
combinatorial objects called resolvable designs.
From a practical standpoint, our codes seem useful for distributed storage
applications involving hot data, i.e., the information which is frequently
accessed by multiple processes in parallel. Comment: Submitted to ISIT 201