CORE: Augmenting Regenerating-Coding-Based Recovery for Single and Concurrent Failures in Distributed Storage Systems
Data availability is critical in distributed storage systems, especially when
node failures are prevalent in real life. A key requirement is to minimize the
amount of data transferred among nodes when recovering the lost or unavailable
data of failed nodes. This paper explores recovery solutions based on
regenerating codes, which are shown to provide fault-tolerant storage and
minimum recovery bandwidth. Existing optimal regenerating codes are designed
for single node failures. We build a system called CORE, which augments
existing optimal regenerating codes to support a general number of failures
including single and concurrent failures. We theoretically show that CORE
achieves the minimum possible recovery bandwidth for most cases. We implement
CORE and evaluate our prototype atop a Hadoop HDFS cluster testbed with up to
20 storage nodes. We demonstrate that our CORE prototype conforms to our
theoretical findings and achieves recovery bandwidth saving when compared to
the conventional recovery approach based on erasure codes.
Comment: 25 pages
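The bandwidth saving mentioned above can be illustrated for the single-failure case. The sketch below is not taken from the CORE paper. It assumes the standard minimum-storage regenerating (MSR) repair formula from the regenerating-codes literature, in which a file of size M is striped over n nodes with any k sufficient for decoding, and a failed node is repaired by contacting d surviving nodes. The function names and the example parameters (n = 20, k = 10) are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def conventional_repair_bandwidth(file_size):
    """Conventional erasure-code repair of one node: download k blocks of
    size file_size/k, i.e. an amount equal to the whole file."""
    return Fraction(file_size)


def msr_repair_bandwidth(file_size, n, k, d=None):
    """Single-failure repair bandwidth at the MSR point (assumed formula):
    d * M / (k * (d - k + 1)), with k <= d <= n - 1 helper nodes."""
    if d is None:
        d = n - 1  # contact every surviving node
    if not (k <= d <= n - 1):
        raise ValueError("need k <= d <= n - 1")
    return Fraction(file_size) * d / (k * (d - k + 1))


if __name__ == "__main__":
    # Hypothetical parameters, not the configuration evaluated in the paper.
    M, n, k = 1000, 20, 10
    conv = conventional_repair_bandwidth(M)
    msr = msr_repair_bandwidth(M, n, k)
    saving = 1 - msr / conv
    print(f"conventional: {conv}, MSR: {msr}, saving: {float(saving):.0%}")
```

With d = k the MSR expression reduces to M, matching conventional repair. Concurrent failures, which are the focus of CORE, are not covered by this sketch.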
Secure Cooperative Regenerating Codes for Distributed Storage Systems
Regenerating codes enable trading off repair bandwidth for storage in
distributed storage systems (DSS). Due to their distributed nature, these
systems are intrinsically susceptible to attacks, and they may also be subject
to multiple simultaneous node failures. Cooperative regenerating codes allow
bandwidth efficient repair of multiple simultaneous node failures. This paper
analyzes storage systems that employ cooperative regenerating codes that are
robust to (passive) eavesdroppers. The analysis is divided into two parts,
studying both minimum bandwidth and minimum storage cooperative regenerating
scenarios. First, the secrecy capacity for minimum bandwidth cooperative
regenerating codes is characterized. Second, for minimum storage cooperative
regenerating codes, a secure file size upper bound and achievability results
are provided. These results establish the secrecy capacity for the minimum
storage scenario for certain special cases. In all scenarios, the achievability
results correspond to exact repair, and secure file size upper bounds are
obtained using min-cut analyses over a suitable secrecy graph representation of
DSS. The main achievability argument is based on an appropriate pre-coding of
the data to eliminate the information leakage to the eavesdropper.
A Delayed Parity Generation Code for Accelerating Data Write in Erasure Coded Storage Systems
We propose delayed parity generation as a method to improve the write speed in erasure-coded storage systems. In the proposed approach, only some of the parities in the erasure code are generated at the time of data write (data commit); the other parities are not generated, transported, or written in the system until the system load is lighter. This allows faster data writes, at the expense of a small sacrifice in the reliability of the data during the short period between the initial data write and the time when the full set of parities is produced. Although the delayed parity generation procedure is expected to be performed during periods of light system load, it is still important to reduce data traffic and disk I/O as much as possible when doing so. For this purpose, we first identify the fundamental limits of this approach through a connection to the well-known multicast network coding problem, then provide an explicit and low-complexity code construction. The problem we consider is closely related to the regenerating code problem. However, our proposed code is much simpler and has a much smaller subpacketization factor than regenerating codes. Our result shows that blindly adopting regenerating codes in this setting is unnecessary and wasteful. Experimental results confirm that, to obtain the improved write speed, the proposed code does not significantly increase the computational burden.