SEARS: Space Efficient And Reliable Storage System in the Cloud
Today's cloud storage services must offer reliable storage and fast data
retrieval for large amounts of data while keeping storage costs low. We present
SEARS, a cloud-based storage system which integrates erasure coding and data
deduplication to support efficient and reliable data storage with fast user
response time. With proper association of data to storage server clusters,
SEARS provides flexible mixing of different configurations, suitable for
real-time and archival applications.
Our prototype implementation of SEARS over Amazon EC2 shows that it
outperforms existing storage systems in storage efficiency and file retrieval
time. For 3 MB files, SEARS delivers retrieval time of s compared to
s with existing systems.
Comment: 4 pages, IEEE LCN 201
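The deduplication half of a system like SEARS can be sketched as a content-addressed chunk store that keeps one copy of each unique chunk. This is a generic illustration, not SEARS's actual implementation: the fixed-size chunking, SHA-256 digests, and in-memory dict are all assumptions made here.

```python
import hashlib

def dedup_store(blobs, chunk_size=4):
    """Store only unique chunks, keyed by their SHA-256 digest."""
    store = {}      # digest -> chunk bytes (each unique chunk stored once)
    manifests = []  # per-blob ordered digest list, used for reconstruction
    for blob in blobs:
        manifest = []
        for i in range(0, len(blob), chunk_size):
            chunk = blob[i:i + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)  # duplicates are not stored again
            manifest.append(digest)
        manifests.append(manifest)
    return store, manifests

def restore(store, manifest):
    """Rebuild a blob by concatenating its chunks in manifest order."""
    return b"".join(store[d] for d in manifest)
```

In a SEARS-like pipeline, erasure coding would then be applied to the unique chunks before dispersal to the storage server clusters.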
CORE: Augmenting Regenerating-Coding-Based Recovery for Single and Concurrent Failures in Distributed Storage Systems
Data availability is critical in distributed storage systems, especially when
node failures are prevalent in real life. A key requirement is to minimize the
amount of data transferred among nodes when recovering the lost or unavailable
data of failed nodes. This paper explores recovery solutions based on
regenerating codes, which are shown to provide fault-tolerant storage and
minimum recovery bandwidth. Existing optimal regenerating codes are designed
for single node failures. We build a system called CORE, which augments
existing optimal regenerating codes to support a general number of failures
including single and concurrent failures. We show theoretically that CORE
achieves the minimum possible recovery bandwidth in most cases. We implement
CORE and evaluate our prototype atop a Hadoop HDFS cluster testbed with up to
20 storage nodes. We demonstrate that our CORE prototype conforms to our
theoretical findings and achieves recovery bandwidth savings compared to
the conventional recovery approach based on erasure codes.
Comment: 25 page
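The bandwidth saving that regenerating codes offer over conventional erasure-code repair can be quantified with the standard single-failure formulas from the regenerating-codes literature (these are the general textbook results, not CORE's specific multi-failure analysis):

```python
def rs_repair_bandwidth(M, k):
    """Conventional repair with an (n, k) erasure code: download k fragments
    of size M/k each, i.e. the whole file of size M, to rebuild one fragment."""
    return M

def msr_repair_bandwidth(M, k, d):
    """Single-failure repair at the minimum-storage regenerating (MSR) point:
    contact d surviving helpers, each sending M / (k * (d - k + 1))."""
    return d * M / (k * (d - k + 1))
```

For example, with a file of M = 36 units, k = 6, and d = 11 helpers, MSR repair moves 11 units versus 36 units for conventional repair.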
Extending DIRAC File Management with Erasure-Coding for efficient storage
The state of the art in Grid style data management is to achieve increased
resilience of data via multiple complete replicas of data files across multiple
storage endpoints. While this is effective, it is not the most space-efficient
approach to resilience, especially when the reliability of individual storage
endpoints is sufficiently high that only a few will be inactive at any point in
time. We report on work performed as part of GridPP\cite{GridPP}, extending the
Dirac File Catalogue and file management interface to allow the placement of
erasure-coded files: each file distributed as N identically-sized chunks of
data striped across a vector of storage endpoints, encoded such that any M
chunks can be lost and the original file can be reconstructed. The tools
developed are transparent to the user and, as well as allowing uploading and
downloading of data to Grid storage, also provide the possibility of
parallelising access across all of the distributed chunks at once, improving
data transfer and IO performance. We expect this approach to be of most
interest to smaller VOs, who have tighter bounds on the storage available to
them, but larger (WLCG) VOs may be interested as their total data increases
during Run 2. We provide an analysis of the costs and benefits of the approach,
along with future development and implementation plans in this area. In
general, the overhead of multiple file transfers is currently the largest
obstacle to the competitiveness of this approach.
Comment: 21st International Conference on Computing for High Energy and
Nuclear Physics (CHEP2015
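The "any M chunks can be lost" property described above comes from Reed-Solomon-style coding. As a self-contained toy, here is the M = 1 special case, a RAID-4-style XOR parity stripe; the function names and padding convention are illustrative, not part of the DIRAC tooling.

```python
from functools import reduce

def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data, n):
    """Split data into n equal data chunks plus one XOR parity chunk.
    The resulting n + 1 chunks tolerate the loss of any single chunk."""
    padded = data + bytes((-len(data)) % n)  # zero-pad to a multiple of n
    size = len(padded) // n
    chunks = [padded[i * size:(i + 1) * size] for i in range(n)]
    parity = reduce(xor_bytes, chunks)
    return chunks + [parity]

def recover(chunks, lost_index):
    """Rebuild the lost chunk: the XOR of all survivors equals it,
    because parity = c0 ^ c1 ^ ... ^ c(n-1)."""
    survivors = [c for i, c in enumerate(chunks) if i != lost_index]
    return reduce(xor_bytes, survivors)
```

Tolerating a general number M of lost chunks requires Reed-Solomon coding over a finite field such as GF(2^8), which is what the erasure-coded placement described above relies on.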
AONT-LT: a Data Protection Scheme for Cloud and Cooperative Storage Systems
We propose a variant of the well-known AONT-RS scheme for dispersed storage
systems. The novelty consists in replacing the Reed-Solomon code with rateless
Luby transform codes. The resulting system, named AONT-LT, is able to improve
the performance by dispersing the data over an arbitrarily large number of
storage nodes while ensuring limited complexity. The proposed solution is
particularly suitable in the case of cooperative storage systems. It is shown
that while the AONT-RS scheme requires the adoption of fragmentation for
achieving widespread distribution, thus penalizing the performance, the new
AONT-LT scheme can exploit variable-length codes, which allow it to achieve
very good performance and scalability.
Comment: 6 pages, 8 figures, to be presented at the 2014 High Performance
Computing & Simulation Conference (HPCS 2014) - Workshop on Security, Privacy
and Performance in Cloud Computin
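The rateless Luby transform coding that AONT-LT substitutes for Reed-Solomon can be sketched with a toy encoder and peeling decoder. This is a generic LT illustration, not the AONT-LT scheme itself, and the uniform degree choice is a simplification: real LT codes draw degrees from the robust soliton distribution.

```python
import random

def lt_encode(symbols, num_packets, seed=0):
    """Produce LT packets: each is (index tuple, XOR of those source symbols).
    Toy uniform degree choice stands in for the robust soliton distribution."""
    rng = random.Random(seed)
    packets = []
    for _ in range(num_packets):
        degree = rng.randint(1, len(symbols))
        idx = tuple(rng.sample(range(len(symbols)), degree))
        val = 0
        for i in idx:
            val ^= symbols[i]
        packets.append((idx, val))
    return packets

def lt_decode(packets, n):
    """Peeling decoder: repeatedly take a degree-1 packet, recover its symbol,
    and XOR that symbol out of every other packet that references it."""
    work = [[set(idx), val] for idx, val in packets]
    out = [None] * n
    changed = True
    while changed:
        changed = False
        for p in work:
            if len(p[0]) == 1:
                i = next(iter(p[0]))
                if out[i] is None:
                    out[i] = p[1]
                    changed = True
                    for q in work:
                        if i in q[0] and len(q[0]) > 1:
                            q[0].discard(i)
                            q[1] ^= out[i]
    return out
```

Because the code is rateless, the encoder can keep emitting packets indefinitely, which is what lets AONT-LT disperse data over an arbitrarily large number of storage nodes; if peeling stalls before all symbols are recovered, the decoder simply waits for more packets (unrecovered entries stay None).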
HFR Code: A Flexible Replication Scheme for Cloud Storage Systems
Fractional repetition (FR) codes are a family of repair-efficient storage
codes that provide exact and uncoded node repair at the minimum bandwidth
regenerating point. The advantageous repair properties are achieved by a
tailor-made two-layer encoding scheme which concatenates an outer
maximum-distance-separable (MDS) code and an inner repetition code. In this
paper, we generalize FR codes and propose the heterogeneous fractional
repetition (HFR) code, which adapts to scenarios where the repetition
degrees of coded packets differ. We provide explicit code
constructions by utilizing group divisible designs, which allow the design of
HFR codes over a large range of parameters. The constructed codes achieve the
system storage capacity under random access repair and have multiple repair
alternatives for node failures. Further, we take advantage of the systematic
feature of MDS codes and present a novel design framework of HFR codes, in
which storage nodes can be wisely partitioned into clusters such that data
reconstruction time can be reduced when contacting nodes in the same cluster.
Comment: Accepted for publication in IET Communications, Jul. 201
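The homogeneous FR codes that HFR generalizes can be illustrated with a classic construction: treat the coded packets as the edges of the complete graph K_n, so each packet is replicated on exactly the two nodes of its edge and a failed node is rebuilt by uncoded transfers from surviving nodes. This is a textbook-style example, not the paper's group-divisible-design HFR construction.

```python
from itertools import combinations

def fr_code_from_complete_graph(n_nodes):
    """FR code on K_n: each coded packet is an edge, stored on exactly the
    2 nodes it joins (repetition degree 2); each node stores n-1 packets."""
    packets = list(combinations(range(n_nodes), 2))
    node_contents = {v: [e for e in packets if v in e] for v in range(n_nodes)}
    return packets, node_contents

def repair(node_contents, failed):
    """Uncoded repair: fetch each lost packet, verbatim, from the single
    surviving node that holds its other copy. Returns (helper, packet) pairs."""
    rebuilt = []
    for pkt in node_contents[failed]:
        helper = pkt[0] if pkt[1] == failed else pkt[1]
        rebuilt.append((helper, pkt))
    return rebuilt
```

Repair here involves no decoding at all, which is the "exact and uncoded node repair" property described in the abstract; the outer MDS code is only invoked when reconstructing the whole file.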