Increasing Availability in Distributed Storage Systems via Clustering

Gastpar, Michael; Sahraei, Saeid

Authors: Michael Gastpar, Saeid Sahraei
Publication date: 21 August 2018

Abstract

We introduce the Fixed Cluster Repair System (FCRS) as a novel architecture for Distributed Storage Systems (DSS), achieving a small repair bandwidth while guaranteeing a high availability. Specifically, we partition the set of servers in a DSS into $s$ clusters and allow a failed server to choose any cluster other than its own as its repair group. Thereby, we guarantee an availability of $s-1$. We characterize the repair bandwidth vs. storage trade-off for the FCRS under functional repair and show that the minimum repair bandwidth can be improved by an asymptotic multiplicative factor of 2/3 compared to the state-of-the-art coding techniques that guarantee the same availability. We further introduce Cubic Codes designed to minimize the repair bandwidth of the FCRS under the exact repair model. We prove an asymptotic multiplicative improvement of 0.79 in the minimum repair bandwidth compared to the existing exact repair coding techniques that achieve the same availability. We show that Cubic Codes are information-theoretically optimal for the FCRS with 2 and 3 complete clusters. Furthermore, under the repair-by-transfer model, Cubic Codes are optimal irrespective of the number of clusters.
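The clustering idea in the abstract can be illustrated with a minimal Python sketch. It models only the partition of servers into s clusters and the choice of repair groups, not the coding scheme or the repair bandwidth. The function names and the round-robin assignment of servers to clusters are assumptions made for illustration; the abstract does not specify how servers are assigned.

```python
def partition_into_clusters(num_servers, num_clusters):
    """Split server ids 0..num_servers-1 into clusters.

    Round-robin assignment is an illustrative assumption, not taken
    from the paper.
    """
    clusters = [[] for _ in range(num_clusters)]
    for server in range(num_servers):
        clusters[server % num_clusters].append(server)
    return clusters


def repair_groups(failed_server, clusters):
    """Return every cluster that does not contain the failed server."""
    return [cluster for cluster in clusters if failed_server not in cluster]


# Hypothetical example: 12 servers split into s = 4 clusters.
clusters = partition_into_clusters(num_servers=12, num_clusters=4)
groups = repair_groups(failed_server=5, clusters=clusters)

# Each candidate repair group is a whole cluster, so the groups are
# pairwise disjoint and their count is the availability, s - 1.
availability = len(groups)
print(availability)
```

With s = 4 clusters, the failed server has three disjoint candidate repair groups, matching the availability of s-1 stated in the abstract.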

Available Versions

Infoscience - École polytechnique fédérale de Lausanne

oai:infoscience.epfl.ch:256584

Last time updated on 09/07/2019

Crossref

Last time updated on 10/08/2021