Locality and Availability in Distributed Storage

Author: Dimakis Alexandros G.
Papailiopoulos Dimitris S.
Rawat Ankit Singh
Vishwanath Sriram
Publication date: 01/01/2014

This paper studies the problem of code symbol availability: a code symbol is said to have $(r, t)$-availability if it can be reconstructed from $t$ disjoint groups of other symbols, each of size at most $r$. For example, 3-replication supports $(1, 2)$-availability, as each symbol can be read from its $t = 2$ other (disjoint) replicas, i.e., $r = 1$. However, the rate of replication must vanish like $\frac{1}{t+1}$ as the availability increases. This paper shows that it is possible to construct codes that can support a scaling number of parallel reads while keeping the rate an arbitrarily high constant. It further shows that this is possible with the minimum distance arbitrarily close to the Singleton bound. This paper also presents a bound demonstrating a trade-off between minimum distance, availability and locality. Our codes match the aforementioned bound, and their construction relies on combinatorial objects called resolvable designs. From a practical standpoint, our codes seem useful for distributed storage applications involving hot data, i.e., information that is frequently accessed by multiple processes in parallel.

Comment: Submitted to ISIT 201
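To make the rate comparison in this abstract concrete, here is a minimal Python sketch. It is not the construction from the paper (which relies on resolvable designs). It assumes a hypothetical toy code: an m x m grid of data bits with one XOR parity bit per row and one per column. Under that assumption, each data bit can be rebuilt from two disjoint groups of size m (the rest of its row plus the row parity, or the rest of its column plus the column parity), so the data symbols have (m, 2)-availability. All function names are illustrative.

```python
from fractions import Fraction
import random


def replication_rate(t):
    """Rate of (t+1)-fold replication, which gives (1, t)-availability."""
    return Fraction(1, t + 1)


def grid_rate(m):
    """Rate of the toy grid code: m*m data bits plus 2*m parity bits."""
    return Fraction(m * m, m * m + 2 * m)


def encode_grid(data, m):
    """Return (row_parity, col_parity) for m*m data bits in row-major order."""
    row_parity = [0] * m
    col_parity = [0] * m
    for i in range(m):
        for j in range(m):
            bit = data[i * m + j]
            row_parity[i] ^= bit
            col_parity[j] ^= bit
    return row_parity, col_parity


def recover_from_row(data, row_parity, m, i, j):
    """Rebuild data bit (i, j) from the other m-1 bits of row i and its parity."""
    value = row_parity[i]
    for jj in range(m):
        if jj != j:
            value ^= data[i * m + jj]
    return value


def recover_from_col(data, col_parity, m, i, j):
    """Rebuild data bit (i, j) from the other m-1 bits of column j and its parity."""
    value = col_parity[j]
    for ii in range(m):
        if ii != i:
            value ^= data[ii * m + j]
    return value


def check_availability(m, seed=0):
    """Check that both disjoint repair groups recover every data bit."""
    rng = random.Random(seed)
    data = [rng.randint(0, 1) for _ in range(m * m)]
    row_parity, col_parity = encode_grid(data, m)
    return all(
        recover_from_row(data, row_parity, m, i, j) == data[i * m + j]
        and recover_from_col(data, col_parity, m, i, j) == data[i * m + j]
        for i in range(m)
        for j in range(m)
    )


if __name__ == "__main__":
    m = 4
    print("both repair groups work:", check_availability(m))
    print("replication rate, t=2:", replication_rate(2))
    print("grid code rate, m=4:", grid_rate(m))
```

For m = 4 the toy code stores 16 data bits in 24 symbols (rate 2/3), while 3-replication reaches the same t = 2 at rate 1/3. The price is a larger repair group (r = 4 instead of r = 1), and in this sketch only the data bits, not the parity bits, have two small disjoint repair groups.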

arXiv.org e-Print Archive

CiteSeerX

Crossref

Alpha Entanglement Codes: Practical Erasure Codes to Archive Data in Unreliable Environments

Author: Estrada-Galiñanes Vero
Felber Pascal
Miller Ethan
Pâris Jehan-François
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 06/10/2018

Data centres that use consumer-grade disk drives and distributed peer-to-peer systems are unreliable environments in which to archive data without enough redundancy. Most redundancy schemes are not completely effective at providing high availability, durability and integrity in the long term. We propose alpha entanglement codes, a mechanism that creates a virtual layer of highly interconnected storage devices to propagate redundant information across a large-scale storage system. Our motivation is to design flexible and practical erasure codes with high fault tolerance to improve data durability and availability even in catastrophic scenarios. By flexible and practical, we mean code settings that can be adapted to future requirements and practical implementations with reasonable trade-offs between security, resource usage and performance. The codes have three parameters. Alpha increases storage overhead linearly but increases the possible paths to recover data exponentially. Two other parameters increase fault tolerance even further without the need for additional storage. As a result, an entangled storage system can provide high availability and durability, and offers additional integrity: it is more difficult to modify data undetectably. We evaluate how several redundancy schemes perform in unreliable environments and show that alpha entanglement codes are flexible and practical codes. Remarkably, they excel at code locality; hence, they reduce repair costs and become less dependent on storage locations with poor availability. Our solution outperforms Reed-Solomon codes in many disaster recovery scenarios.

Comment: The publication has 12 pages and 13 figures. This work was partially supported by Swiss National Science Foundation SNSF Doc.Mobility 162014, 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN

arXiv.org e-Print Archive

Crossref

Locality and Availability with Multiple Erasure Tolerance in Distributed Storage

Author: 소병현
Publication venue: Seoul National University Graduate School (서울대학교 대학원)
Publication date: 01/02/2019

Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, 2019. 2. 이정우.

As the amount of data handled by various systems has grown, distributed storage systems have become increasingly important. In a distributed storage system, network or equipment problems cause faults in the form of node loss, and it is then important to reconstruct the lost node from the surviving nodes. The code used for distributed storage determines the performance of this recovery, and the factors that determine code performance depend on how the system is used. Among them, locality is the number of nodes needed to recover a lost node, and availability is the number of disjoint recovery sets that can recover a lost node. In practical terms, availability allows multiple users to access and read data in parallel at the same time. Locally repairable codes that take availability into account are therefore very useful for distributed storage systems that mainly store hot data.

This thesis proposes a new locally repairable code for distributed storage systems that considers multiple node loss together with availability, and derives an upper bound on its minimum distance. To show that this upper bound is achievable, a code that meets the bound with equality is designed. In particular, because node loss within the recovery sets of the information symbols is also considered, the proposed code tolerates more losses than existing locally repairable codes that consider only availability. The proposed (n, k, r, t, δ)-locally repairable code is therefore better suited to hot data that suffers frequent losses and must be accessed concurrently.

Contents
Chapter 1. Introduction
  Section 1. Research background
  Section 2. Research content
Chapter 2. Background theory
  Section 1. Generator matrix and parity-check matrix
  Section 2. Minimum distance of a code and the Singleton bound
Chapter 3. Locally repairable codes
  Section 1. Locally repairable codes
  Section 2. Locality and availability in distributed storage
Chapter 4. Locality and availability with multiple node loss
  Section 1. Locally repairable codes considering multiple node loss and availability
  Section 2. Upper bound on the minimum distance of (n, k, r, t, δ)-locally repairable codes
Chapter 5. Achievability of the minimum distance upper bound for (n, k, r, t, δ)-locally repairable codes
  Section 1. Design of (n, k, r, t, δ)-locally repairable codes
  Section 2. Achievability of the minimum distance upper bound
Chapter 6. Conclusion
References

Maste

SNU Open Repository and Archive

Coding for the Clouds: Coding Techniques for Enabling Security, Locality, and Availability in Distributed Storage Systems

Author: Kadhe Swanand Ravindra
Publication date: 16/01/2019

Cloud systems have become the backbone of many applications such as multimedia streaming, e-commerce, and cluster computing. At the foundation of any cloud architecture lies a large-scale, distributed, data storage system. To accommodate the massive amount of data being stored on the cloud, these distributed storage systems (DSS) have been scaled to contain hundreds to thousands of nodes that are connected through a networking infrastructure. Such data-centers are usually built out of commodity components, which make failures the norm rather than the exception. In order to combat node failures, data is typically stored in a redundant fashion. Due to the exponential data growth rate, many DSS are beginning to resort to error control coding over conventional replication methods, as coding offers high storage space efficiency. This paradigm shift from replication to coding, along with the need to guarantee reliability, efficiency, and security in DSS, has created a new set of challenges and opportunities, opening up a new area of research. This thesis addresses several of these challenges and opportunities by broadly making the following contributions. (i) We design practically amenable, low-complexity coding schemes that guarantee security of cloud systems, ensure quick recovery from failures, and provide high availability for retrieving partial information; and (ii) We analyze fundamental performance limits and optimal trade-offs between the key performance metrics of these coding schemes. More specifically, we first consider the problem of achieving information-theoretic security in DSS against an eavesdropper that can observe a limited number of nodes. We present a framework that enables design of secure repair-efficient codes through a joint construction of inner and outer codes. 
Then, we consider a practically appealing notion of weakly secure coding, and construct coset codes that can weakly secure a wide class of regenerating codes that reduce the amount of data downloaded during node repair. Second, we consider the problem of meeting repair locality constraints, which specify the number of nodes participating in the repair process. We propose a notion of unequal locality, which enables different locality values for different nodes, ensuring quick recovery for nodes storing important data. We establish tight upper bounds on the minimum distance of linear codes with unequal locality, and present optimal code constructions. Next, we extend the notion of locality from the Hamming metric to the rank and subspace metrics, with the goal of designing codes for efficient data recovery from special types of correlated failures in DSS. We construct a family of locally recoverable rank-metric codes with optimal data recovery properties. Finally, we consider the problem of providing high availability, which is ensured by enabling node repair from multiple disjoint subsets of nodes of small size. We study codes with availability from a queuing-theoretical perspective by analyzing the average time necessary to download a block of data under the Poisson request arrival model when each node takes a random amount of time to fetch its contents. We compare the delay performance of the availability codes with several alternatives such as conventional erasure codes and replication schemes.

Texas A&M Repository

Improving capacity-performance tradeoffs in the storage tier

Author: Villasenor Eric P.
Publication venue: Purdue University (bepress)
Publication date: 01/01/2015

Data-set sizes are growing. New techniques are emerging to organize and analyze these data-sets. A key access pattern is emerging with these new techniques: large sequential file accesses. The trend toward bigger files exists to help amortize the cost of data accesses from the storage layer, as many workloads are recognized to be I/O bound. The storage layer is widely recognized as the slowest layer in the system. This work focuses on the tradeoff one can make with that storage capacity to improve system performance.

Capacity can be leveraged for improved availability or improved performance. This tradeoff is key in the storage layer, as it allows for data loss prevention and bandwidth aggregation. Typically these tradeoffs do not allow much choice with regard to capacity use. This work will leverage replication as the enabling mechanism to improve the capacity-performance tradeoff in the storage tier, while still providing for availability.

This capacity-performance tradeoff can be made at both the local and distributed file system level. I propose two techniques that allow for an improved tradeoff of capacity. The local file system can be employed on scale-out or scale-up infrastructures to improve performance. The distributed file system is targeted at distributed frameworks, such as MapReduce, to improve the cluster performance. The local file system design is MorphStore, and the distributed file system is BoostDFS.

MorphStore is a file system that significantly improves performance when accessing large files by using two innovations. MorphStore combines (a) load-adaptive I/O access scheduling to dynamically optimize throughput (aggregation), and (b) utility-driven replication to best use capacity for performance. Additionally, adaptive-access scheduling can be utilized to optimize scheduling of requests (for throughput) on systems with a large number of storage devices. Replication is utilized to make high-utility files available and then to optimize the throughput of these high-utility files based on system load.

BoostDFS is a distributed file system that allows a better capacity-performance tradeoff via inter-node file replication. BoostDFS is built on the observation that distributed file systems currently use inter-node replication for availability, but provide no mechanism to further improve performance. Replication for availability provides diminishing returns on performance; this is due to saturation of locality. BoostDFS exploits the common by improving I/O performance of these local tasks. This is done via intra-node replication by leveraging MorphStore as the local file system. This technique allows for capacity to be traded for availability as well as performance, with a small capacity overhead under constant availability.

Both MorphStore and BoostDFS utilize replication. Replication allows for both bandwidth aggregation and availability. This work primarily focuses on the performance utility of replication, but does not sacrifice availability in the process. These techniques provide an improved capacity-performance tradeoff while allowing the desired level of availability.

Purdue E-Pubs