Node Repair for Distributed Storage Systems over Fading Channels
Distributed storage systems and associated storage codes can efficiently
store a large amount of data while ensuring that data is retrievable in case of
node failure. The study of such systems, particularly the design of storage
codes over finite fields, assumes that the physical channel through which the
nodes communicate is error-free. This is not always the case, for example, in a
wireless storage system.
We study the probability that a subpacket is repaired incorrectly during node
repair in a distributed storage system, in which the nodes communicate over an
AWGN or Rayleigh fading channel. The asymptotic probability (as SNR increases)
that a node is repaired incorrectly is shown to be completely determined by the
repair locality of the DSS and the symbol error rate of the wireless channel.
Lastly, we propose some design criteria for physical-layer coding in this
scenario, and use them to compute optimally rotated QAM constellations for use in
wireless distributed storage systems.
Comment: To appear in ISITA 201
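The abstract's claim that the high-SNR repair-failure probability is governed by the repair locality and the channel's symbol error rate can be illustrated numerically. The sketch below assumes QPSK signalling over AWGN and assumes a repair fails whenever any of the r downloaded symbols is received in error; the function names and the failure model are illustrative, not the paper's.

```python
import math

def qfunc(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def qpsk_symbol_error_awgn(snr_db):
    """Approximate QPSK symbol error probability over AWGN (snr = Es/N0)."""
    snr = 10 ** (snr_db / 10.0)
    p_branch = qfunc(math.sqrt(snr))      # error probability per I/Q branch
    return 1 - (1 - p_branch) ** 2        # symbol wrong if either branch errs

def repair_failure_prob(p_symbol, locality):
    """Repair fails if any of the `locality` downloaded symbols is in error."""
    return 1 - (1 - p_symbol) ** locality

for snr_db in (10, 15, 20):
    p_s = qpsk_symbol_error_awgn(snr_db)
    for r in (2, 4, 8):
        print(f"SNR={snr_db:2d} dB, locality r={r}: "
              f"P(bad repair) ~ {repair_failure_prob(p_s, r):.2e}, "
              f"union bound r*p_s = {r * p_s:.2e}")
```

At high SNR the union bound r*p_s dominates, which is consistent with the stated dependence on locality and symbol error rate alone.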
A Repair Framework for Scalar MDS Codes
Several works have developed vector-linear maximum-distance separable (MDS)
storage codes that minimize the total communication cost required to repair a
single coded symbol after an erasure, referred to as repair bandwidth (BW).
Vector codes allow communicating fewer sub-symbols per node, instead of the
entire content. This allows nontrivial savings in repair BW. In sharp
contrast, classic codes, like Reed-Solomon (RS), used in current storage
systems, are deemed to suffer from naive repair, i.e., downloading the entire
stored message to repair one failed node. This mainly happens because they are
scalar-linear. In this work, we present a simple framework that treats scalar
codes as vector-linear. In some cases, this allows significant savings in
repair BW. We show that vectorized scalar codes exhibit properties that
simplify the design of repair schemes. Our framework can be seen as a finite
field analogue of real interference alignment. Using our simplified framework,
we design a scheme that we call clique-repair which provably identifies the
best linear repair strategy for any scalar 2-parity MDS code, under some
conditions on the sub-field chosen for vectorization. We specify optimal repair
schemes for specific (5,3)- and (6,4)-Reed-Solomon (RS) codes. Further, we
present a repair strategy for the RS code currently deployed in the Facebook
Analytics Hadoop cluster that leads to 20% repair BW savings over naive
repair, which is the repair scheme currently used for this code.
Comment: 10 pages; accepted to IEEE JSAC - Distributed Storage 201
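To make the repair-bandwidth comparison concrete, here is a back-of-the-envelope accounting sketch. It assumes a hypothetical (n, k) = (14, 10) Reed-Solomon layout, a made-up block size, and a vectorization that splits each stored symbol into t sub-symbols of which helpers send only a few; the numbers and function names are illustrative and do not reproduce the clique-repair scheme itself.

```python
def naive_repair_bw(k, block_size):
    """Naive RS repair: download k whole blocks and re-encode the lost one."""
    return k * block_size

def vectorized_repair_bw(n, block_size, t, subsymbols_per_helper):
    """Vectorized repair: each of the n-1 surviving nodes sends only
    subsymbols_per_helper of its t sub-symbols."""
    return (n - 1) * block_size * subsymbols_per_helper / t

B = 256 * 2**20                                   # hypothetical 256 MiB blocks
naive = naive_repair_bw(10, B)
vectored = vectorized_repair_bw(14, B, t=8, subsymbols_per_helper=5)
print(f"naive repair:      {naive / 2**30:.2f} GiB")
print(f"vectorized repair: {vectored / 2**30:.2f} GiB "
      f"({100 * (1 - vectored / naive):.0f}% savings)")
```

With these made-up parameters the saving comes to roughly 19%; the abstract reports 20% for the deployed code under the proposed repair strategy.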
Alpha Entanglement Codes: Practical Erasure Codes to Archive Data in Unreliable Environments
Data centres that use consumer-grade disk drives and distributed
peer-to-peer systems are unreliable environments to archive data without enough
redundancy. Most redundancy schemes are not completely effective for providing
high availability, durability and integrity in the long-term. We propose alpha
entanglement codes, a mechanism that creates a virtual layer of highly
interconnected storage devices to propagate redundant information across a
large scale storage system. Our motivation is to design flexible and practical
erasure codes with high fault-tolerance to improve data durability and
availability even in catastrophic scenarios. By flexible and practical, we mean
code settings that can be adapted to future requirements and practical
implementations with reasonable trade-offs between security, resource usage and
performance. The codes have three parameters. Alpha increases storage overhead
linearly but increases the number of possible paths to recover data
exponentially. Two other parameters increase fault-tolerance even further
without the need for additional storage. As a result, an entangled storage
system can provide high
availability, durability and offer additional integrity: it is more difficult
to modify data undetectably. We evaluate how several redundancy schemes perform
in unreliable environments and show that alpha entanglement codes are flexible
and practical codes. Remarkably, they excel at code locality; hence, they
reduce repair costs and become less dependent on storage locations with poor
availability. Our solution outperforms Reed-Solomon codes in many disaster
recovery scenarios.
Comment: The publication has 12 pages and 13 figures. This work was partially
supported by the Swiss National Science Foundation, SNSF Doc.Mobility 162014.
2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and
Networks (DSN)
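The abstract describes propagating redundancy through an interconnected virtual layer without detailing the construction. The following is a minimal sketch of one plausible entanglement chain, assuming each new data block d_i is XORed with the previous parity to produce the next one (p_i = d_i XOR p_{i-1}), so a lost block can be rebuilt from its two neighbouring parities; with alpha independent chains per block there would be alpha such recovery pairs. The structure and names are assumptions for illustration, not the paper's exact code.

```python
import os

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

BLOCK = 16
data = [os.urandom(BLOCK) for _ in range(5)]     # d_1 .. d_5

# Build the parity chain p_0, p_1, ..., p_5 (p_0 is an all-zero seed).
parities = [bytes(BLOCK)]
for d in data:
    parities.append(xor(d, parities[-1]))

# Simulate losing d_3 and repairing it from the two surrounding parities.
lost_index = 2                                   # d_3 (0-based index 2)
recovered = xor(parities[lost_index], parities[lost_index + 1])
assert recovered == data[lost_index]
print("recovered d_3 from p_2 and p_3:", recovered == data[lost_index])
```

Each extra chain adds one parity per data block (linear overhead) while multiplying the ways a block can be reached through surviving parities, which is the trade-off the abstract describes.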
Sparsity Exploiting Erasure Coding for Resilient Storage and Efficient I/O Access in Delta based Versioning Systems
In this paper, we study the problem of reliably storing an archive of
versioned data. Specifically, we focus on systems where the differences
(deltas) between subsequent versions rather than the whole objects are stored -
a typical model for storing versioned data. For reliability, we propose erasure
encoding techniques that exploit the sparsity of information in the deltas
while storing them reliably in a distributed back-end storage system, resulting
in improved I/O read performance to retrieve the whole versioned archive. Along
with the basic techniques, we propose a few optimization heuristics, and
evaluate the techniques' efficacy analytically and with numerical simulations.
Comment: 10 pages, 8 figures
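The premise that deltas between versions are sparse can be shown with a toy example. The sketch below assumes versions differ in a few localized edits and takes the delta to be the bytewise XOR of consecutive versions; the single XOR parity at the end merely hints at protecting deltas jointly and is not the coding scheme proposed in the paper.

```python
import os

v1 = bytearray(os.urandom(4096))
v2 = bytearray(v1)
v2[100:110] = os.urandom(10)       # a small localized edit
v3 = bytearray(v2)
v3[2000:2010] = os.urandom(10)     # another small edit

def delta(a, b):
    """Bytewise XOR difference between two equal-length versions."""
    return bytes(x ^ y for x, y in zip(a, b))

d12, d23 = delta(v1, v2), delta(v2, v3)
for name, d in (("delta v1->v2", d12), ("delta v2->v3", d23)):
    nz = sum(1 for x in d if x)
    print(f"{name}: {nz}/{len(d)} nonzero bytes ({100 * nz / len(d):.2f}%)")

# Toy redundancy: one XOR parity protecting both deltas against a single loss.
parity = delta(d12, d23)
assert delta(parity, d23) == d12   # d12 is recoverable if its copy is lost
```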
Secure Cooperative Regenerating Codes for Distributed Storage Systems
Regenerating codes enable trading off repair bandwidth for storage in
distributed storage systems (DSS). Due to their distributed nature, these
systems are intrinsically susceptible to attacks, and they may also be subject
to multiple simultaneous node failures. Cooperative regenerating codes allow
bandwidth efficient repair of multiple simultaneous node failures. This paper
analyzes storage systems that employ cooperative regenerating codes that are
robust to (passive) eavesdroppers. The analysis is divided into two parts,
studying both minimum bandwidth and minimum storage cooperative regenerating
scenarios. First, the secrecy capacity for minimum bandwidth cooperative
regenerating codes is characterized. Second, for minimum storage cooperative
regenerating codes, a secure file size upper bound and achievability results
are provided. These results establish the secrecy capacity for the minimum
storage scenario for certain special cases. In all scenarios, the achievability
results correspond to exact repair, and secure file size upper bounds are
obtained using min-cut analyses over a suitable secrecy graph representation of
the DSS. The main achievability argument is based on an appropriate pre-coding of
the data to eliminate information leakage to the eavesdropper.
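The pre-coding principle mentioned above can be illustrated in miniature. The sketch assumes a passive eavesdropper who observes a single stored symbol, and works over the small prime field GF(257): the message symbol is masked with a uniformly random key before being spread across nodes, so any single observation is uniform and reveals nothing. This is a generic illustration of secret pre-coding, not the paper's construction.

```python
import random

P = 257                     # a small prime field, for illustration only
m = 123                     # secret message symbol
r = random.randrange(P)     # uniform random key, mixed in during pre-coding

# Two stored symbols: the key itself and the key-masked message.
node_a = r
node_b = (m + r) % P

# A decoder holding both symbols recovers m; either symbol alone is uniform
# over GF(257), so a single-symbol eavesdropper learns nothing about m.
assert (node_b - node_a) % P == m
print("recovered message symbol:", (node_b - node_a) % P)
```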
Hadoop: Distributed Processing of Large Volumes of Data in the Apache Cloud
Nowadays, an enormous volume of data of many different types is generated from a
multitude of sources. Distributed storage and processing systems are the
technological elements that make it possible to capture this avalanche of data
and to extract value from it through various kinds of analysis. Hadoop, which
integrates distributed storage and processing, has become the de facto standard
for applications that need a large storage capacity, even on the order of tens
of PBs. In this work, we study Hadoop, analyse the efficiency of its durability
system, and propose an alternative to it.
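The abstract does not say which alternative durability mechanism is proposed. As context only, the sketch below contrasts the raw storage overhead and tolerated losses of HDFS's default 3-way replication with a generic Reed-Solomon erasure-coded layout; the RS(10, 4) parameters are a common example, not necessarily the work's proposal.

```python
def replication(copies):
    """Plain replication: copies-1 extra copies, survives copies-1 losses."""
    return {"overhead": copies - 1.0, "tolerated_losses": copies - 1}

def reed_solomon(data_blocks, parity_blocks):
    """MDS erasure coding: parity/data overhead, survives parity_blocks losses."""
    return {"overhead": parity_blocks / data_blocks,
            "tolerated_losses": parity_blocks}

print("3x replication:", replication(3))        # 200% overhead, 2 losses
print("RS(10, 4)     :", reed_solomon(10, 4))   # 40% overhead, 4 losses
```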