SEARS: Space Efficient And Reliable Storage System in the Cloud
Today's cloud storage services must offer reliable storage and fast data
retrieval for large amounts of data while keeping storage costs low. We present
SEARS, a cloud-based storage system which integrates erasure coding and data
deduplication to support efficient and reliable data storage with fast user
response time. With proper association of data to storage server clusters,
SEARS provides flexible mixing of different configurations, suitable for
real-time and archival applications.
Our prototype implementation of SEARS over Amazon EC2 shows that it
outperforms existing storage systems in storage efficiency and file retrieval
time. For 3 MB files, SEARS delivers retrieval time of s compared to
s with existing systems.
Comment: 4 pages, IEEE LCN 201
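The deduplication half of a system like SEARS can be sketched as a content-addressed chunk store that keeps one copy of each unique chunk. This is a generic illustration, not SEARS's actual implementation: the fixed-size chunking, SHA-256 digests, and in-memory dict are all assumptions made here.

```python
import hashlib

def dedup_store(blobs, chunk_size=4):
    """Store only unique chunks, keyed by their SHA-256 digest."""
    store = {}      # digest -> chunk bytes (each unique chunk stored once)
    manifests = []  # per-blob ordered digest list, used for reconstruction
    for blob in blobs:
        manifest = []
        for i in range(0, len(blob), chunk_size):
            chunk = blob[i:i + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)  # duplicates are not stored again
            manifest.append(digest)
        manifests.append(manifest)
    return store, manifests

def restore(store, manifest):
    """Rebuild a blob by concatenating its chunks in manifest order."""
    return b"".join(store[d] for d in manifest)
```

In a SEARS-like pipeline, erasure coding would then be applied to the unique chunks before dispersal to the storage server clusters.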
CORE: Augmenting Regenerating-Coding-Based Recovery for Single and Concurrent Failures in Distributed Storage Systems
Data availability is critical in distributed storage systems, especially when
node failures are prevalent in real life. A key requirement is to minimize the
amount of data transferred among nodes when recovering the lost or unavailable
data of failed nodes. This paper explores recovery solutions based on
regenerating codes, which are shown to provide fault-tolerant storage and
minimum recovery bandwidth. Existing optimal regenerating codes are designed
for single node failures. We build a system called CORE, which augments
existing optimal regenerating codes to support a general number of failures
including single and concurrent failures. We show theoretically that CORE
achieves the minimum possible recovery bandwidth in most cases. We implement
CORE and evaluate our prototype atop a Hadoop HDFS cluster testbed with up to
20 storage nodes. We demonstrate that our CORE prototype conforms to our
theoretical findings and achieves recovery bandwidth savings compared to
the conventional recovery approach based on erasure codes.
Comment: 25 page
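The bandwidth saving that regenerating codes offer over conventional erasure-code repair can be quantified with the standard single-failure formulas from the regenerating-codes literature (these are the general textbook results, not CORE's specific multi-failure analysis):

```python
def rs_repair_bandwidth(M, k):
    """Conventional repair with an (n, k) erasure code: download k fragments
    of size M/k each, i.e. the whole file of size M, to rebuild one fragment."""
    return M

def msr_repair_bandwidth(M, k, d):
    """Single-failure repair at the minimum-storage regenerating (MSR) point:
    contact d surviving helpers, each sending M / (k * (d - k + 1))."""
    return d * M / (k * (d - k + 1))
```

For example, with a file of M = 36 units, k = 6, and d = 11 helpers, MSR repair moves 11 units versus 36 units for conventional repair.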
Extending DIRAC File Management with Erasure-Coding for efficient storage
The state of the art in Grid style data management is to achieve increased
resilience of data via multiple complete replicas of data files across multiple
storage endpoints. While this is effective, it is not the most space-efficient
approach to resilience, especially when the reliability of individual storage
endpoints is sufficiently high that only a few will be inactive at any point in
time. We report on work performed as part of GridPP\cite{GridPP}, extending the
Dirac File Catalogue and file management interface to allow the placement of
erasure-coded files: each file distributed as N identically-sized chunks of
data striped across a vector of storage endpoints, encoded such that any M
chunks can be lost and the original file can be reconstructed. The tools
developed are transparent to the user and, as well as allowing uploading and
downloading of data to Grid storage, also provide the possibility of
parallelising access across all of the distributed chunks at once, improving
data transfer and IO performance. We expect this approach to be of most
interest to smaller VOs, who have tighter bounds on the storage available to
them, but larger (WLCG) VOs may be interested as their total data increases
during Run 2. We provide an analysis of the costs and benefits of the approach,
along with future development and implementation plans in this area. In
general, the overhead of multiple file transfers is currently the largest
obstacle to the competitiveness of this approach.
Comment: 21st International Conference on Computing for High Energy and
Nuclear Physics (CHEP2015
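The "any M chunks can be lost" property described above comes from Reed-Solomon-style coding. As a self-contained toy, here is the M = 1 special case, a RAID-4-style XOR parity stripe; the function names and padding convention are illustrative, not part of the DIRAC tooling.

```python
from functools import reduce

def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data, n):
    """Split data into n equal data chunks plus one XOR parity chunk.
    The resulting n + 1 chunks tolerate the loss of any single chunk."""
    padded = data + bytes((-len(data)) % n)  # zero-pad to a multiple of n
    size = len(padded) // n
    chunks = [padded[i * size:(i + 1) * size] for i in range(n)]
    parity = reduce(xor_bytes, chunks)
    return chunks + [parity]

def recover(chunks, lost_index):
    """Rebuild the lost chunk: the XOR of all survivors equals it,
    because parity = c0 ^ c1 ^ ... ^ c(n-1)."""
    survivors = [c for i, c in enumerate(chunks) if i != lost_index]
    return reduce(xor_bytes, survivors)
```

Tolerating a general number M of lost chunks requires Reed-Solomon coding over a finite field such as GF(2^8), which is what the erasure-coded placement described above relies on.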
AONT-LT: a Data Protection Scheme for Cloud and Cooperative Storage Systems
We propose a variant of the well-known AONT-RS scheme for dispersed storage
systems. The novelty consists in replacing the Reed-Solomon code with rateless
Luby transform codes. The resulting system, named AONT-LT, is able to improve
the performance by dispersing the data over an arbitrarily large number of
storage nodes while ensuring limited complexity. The proposed solution is
particularly suitable in the case of cooperative storage systems. It is shown
that while the AONT-RS scheme requires the adoption of fragmentation for
achieving widespread distribution, thus penalizing the performance, the new
AONT-LT scheme can exploit variable-length codes, which allow it to achieve
very good performance and scalability.
Comment: 6 pages, 8 figures, to be presented at the 2014 High Performance
Computing & Simulation Conference (HPCS 2014) - Workshop on Security, Privacy
and Performance in Cloud Computin
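The rateless Luby transform coding that AONT-LT substitutes for Reed-Solomon can be sketched with a toy encoder and peeling decoder. This is a generic LT illustration, not the AONT-LT scheme itself, and the uniform degree choice is a simplification: real LT codes draw degrees from the robust soliton distribution.

```python
import random

def lt_encode(symbols, num_packets, seed=0):
    """Produce LT packets: each is (index tuple, XOR of those source symbols).
    Toy uniform degree choice stands in for the robust soliton distribution."""
    rng = random.Random(seed)
    packets = []
    for _ in range(num_packets):
        degree = rng.randint(1, len(symbols))
        idx = tuple(rng.sample(range(len(symbols)), degree))
        val = 0
        for i in idx:
            val ^= symbols[i]
        packets.append((idx, val))
    return packets

def lt_decode(packets, n):
    """Peeling decoder: repeatedly take a degree-1 packet, recover its symbol,
    and XOR that symbol out of every other packet that references it."""
    work = [[set(idx), val] for idx, val in packets]
    out = [None] * n
    changed = True
    while changed:
        changed = False
        for p in work:
            if len(p[0]) == 1:
                i = next(iter(p[0]))
                if out[i] is None:
                    out[i] = p[1]
                    changed = True
                    for q in work:
                        if i in q[0] and len(q[0]) > 1:
                            q[0].discard(i)
                            q[1] ^= out[i]
    return out
```

Because the code is rateless, the encoder can keep emitting packets indefinitely, which is what lets AONT-LT disperse data over an arbitrarily large number of storage nodes; if peeling stalls before all symbols are recovered, the decoder simply waits for more packets (unrecovered entries stay None).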
HFR Code: A Flexible Replication Scheme for Cloud Storage Systems
Fractional repetition (FR) codes are a family of repair-efficient storage
codes that provide exact and uncoded node repair at the minimum bandwidth
regenerating point. The advantageous repair properties are achieved by a
tailor-made two-layer encoding scheme which concatenates an outer
maximum-distance-separable (MDS) code and an inner repetition code. In this
paper, we generalize FR codes and propose the heterogeneous fractional
repetition (HFR) code, which adapts to scenarios where the repetition
degrees of coded packets differ. We provide explicit code
constructions by utilizing group divisible designs, which allow the design of
HFR codes over a large range of parameters. The constructed codes achieve the
system storage capacity under random access repair and have multiple repair
alternatives for node failures. Further, we take advantage of the systematic
feature of MDS codes and present a novel design framework of HFR codes, in
which storage nodes can be wisely partitioned into clusters such that data
reconstruction time can be reduced when contacting nodes in the same cluster.
Comment: Accepted for publication in IET Communications, Jul. 201
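The homogeneous FR codes that HFR generalizes can be illustrated with a classic construction: treat the coded packets as the edges of the complete graph K_n, so each packet is replicated on exactly the two nodes of its edge and a failed node is rebuilt by uncoded transfers from surviving nodes. This is a textbook-style example, not the paper's group-divisible-design HFR construction.

```python
from itertools import combinations

def fr_code_from_complete_graph(n_nodes):
    """FR code on K_n: each coded packet is an edge, stored on exactly the
    2 nodes it joins (repetition degree 2); each node stores n-1 packets."""
    packets = list(combinations(range(n_nodes), 2))
    node_contents = {v: [e for e in packets if v in e] for v in range(n_nodes)}
    return packets, node_contents

def repair(node_contents, failed):
    """Uncoded repair: fetch each lost packet, verbatim, from the single
    surviving node that holds its other copy. Returns (helper, packet) pairs."""
    rebuilt = []
    for pkt in node_contents[failed]:
        helper = pkt[0] if pkt[1] == failed else pkt[1]
        rebuilt.append((helper, pkt))
    return rebuilt
```

Repair here involves no decoding at all, which is the "exact and uncoded node repair" property described in the abstract; the outer MDS code is only invoked when reconstructing the whole file.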